Logging metadata

Logging to Neptune in 30 seconds

To start logging to Neptune you need to install the client library and add 3 lines of code.
```python
import neptune.new as neptune

# connect your script to Neptune
run = neptune.init(project='<YOUR_WORKSPACE/YOUR_PROJECT>',
                   api_token='<YOUR_API_TOKEN>')

# Log 'acc' value 0.95
run['acc'] = 0.95
```
The snippet above:
  • creates a Run, specifying the name of a Neptune project and passing your API token so that Neptune can verify you have permission to log to that project.
  • logs a single float value to the 'acc' Field of the Run.
Once you run the snippet (with your credentials), you can open the Neptune UI to see your Run and the logged metadata.

Run structure - namespaces

Runs can be viewed as dictionary-like structures that you define in your code.
They have:
  • namespaces to organize Fields
  • Fields where you log your ML metadata
Whatever hierarchical metadata structure you create will be reflected in the UI.
To create a structured namespace use the / symbol:
```python
run['metrics/f1_score'] = 0.67
run['metrics/test/roc'] = 0.82

for epoch in range(100):
    run["metrics/train/accuracy"].log(acc)
    run["metrics/train/loss"].log(loss)
```
The snippet above:
  • creates Namespaces:
    • 'metrics', 'metrics/test', 'metrics/train'
  • assigns a single value to Fields:
    • 'f1_score', 'roc'
  • logs multiple values to Fields:
    • 'accuracy', 'loss'
The corresponding namespace structure in Neptune is:
  • metrics:
    • f1_score: 0.67
    • test:
      • roc: 0.82
    • train:
      • accuracy: [0.12, 0.63, ... , 0.76]
      • loss: [0.43, 0.21, ..., 0.06]
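Conceptually, each '/'-delimited path addresses a leaf in a nested, dictionary-like tree. A minimal pure-Python sketch of that mapping (an illustration of the idea, not Neptune code):

```python
def assign(tree, path, value):
    """Place a value into a nested dict under a '/'-delimited path,
    creating intermediate namespaces as needed."""
    *namespaces, field = path.split("/")
    for name in namespaces:
        tree = tree.setdefault(name, {})
    tree[field] = value

run = {}
assign(run, "metrics/f1_score", 0.67)
assign(run, "metrics/test/roc", 0.82)
# run == {"metrics": {"f1_score": 0.67, "test": {"roc": 0.82}}}
```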

Essential logging methods

Single value: '='

To log a single value of metadata, like a hyperparameter or an evaluation metric, use an equals '=' assignment:
```python
run["max_epochs"] = 5
run["optimizer"] = "Adam"
```

Dictionary: '='

To log metadata from a Python dictionary, like a training configuration, use an equals '=' assignment. Your Python dictionary will be parsed into a Neptune namespace automatically.
```python
run["parameters"] = {"batch_size": 64,
                     "dropout": 0.2,
                     "optim": {"learning_rate": 0.001,
                               "optimizer": "Adam"},
                     }
```
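Assigning the nested dictionary above is equivalent to assigning each leaf value to its own '/'-delimited field. A pure-Python sketch of how such a dictionary flattens into field paths (an illustration, not Neptune's actual parser):

```python
def flatten(params, prefix=""):
    """Flatten a nested dict into '/'-delimited field paths."""
    fields = {}
    for key, value in params.items():
        path = prefix + key
        if isinstance(value, dict):
            fields.update(flatten(value, path + "/"))
        else:
            fields[path] = value
    return fields

parameters = {"batch_size": 64,
              "dropout": 0.2,
              "optim": {"learning_rate": 0.001,
                        "optimizer": "Adam"}}
fields = flatten(parameters)
# fields["optim/learning_rate"] == 0.001
```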

Series of values: .log()

To log a series of metadata values, like a loss during training or text logs after every iteration, use the .log() method:
```python
for iteration in range(100):
    run["train_accuracy"].log(accuracy)
    run["logs"].log(iteration_config)
```
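The difference between '=' and .log() is that assignment stores a single value in a field, while .log() appends to a growing series. A toy model of that behavior (not Neptune internals):

```python
class Field:
    """Toy model of a Neptune field: '=' overwrites, .log() appends."""
    def __init__(self):
        self.values = []

    def assign(self, value):   # corresponds to run["field"] = value
        self.values = [value]

    def log(self, value):      # corresponds to run["field"].log(value)
        self.values.append(value)

train_accuracy = Field()
for value in [0.12, 0.63, 0.76]:
    train_accuracy.log(value)
# train_accuracy.values == [0.12, 0.63, 0.76]
```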

File, folder, S3 version: .track_files()

To log and version a dataset, model, or any other artifact stored in a file, a folder, or S3-compatible storage, use the .track_files() method:

```python
run["dataset/train"].track_files("./datasets/train/images")
```

Single File or object: .upload()

To log a single file or object like a sample of data or confusion matrix figure, use the .upload() method:
```python
run["data_sample"].upload("sample_data.csv")
```

Series of Files or objects: .log() + File

To log a series of files or objects, like image predictions or model checkpoints after every epoch, use the .log() method together with the File constructor:

```python
from neptune.new.types import File

for iteration in range(100):
    run["model_checkpoints"].log(File("model.ckpt"))
    run["image_preds"].log(File("image_pred.png"))
```
You can display your metadata as interactive HTML or a series of images, or log Python objects as pickles, using the File.as_image(), File.as_html(), and File.as_pickle() methods:

```python
from neptune.new.types import File

run["prediction_example"].upload(File.as_image(numpy_array))
run["results"].upload(File.as_html(df_predictions))
run["pickled_model"].upload(File.as_pickle(trained_model))
```
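As background on what pickling a model means, the standard-library pickle module serializes a Python object to bytes that can be stored as a single file; this sketch does not use Neptune at all:

```python
import pickle

# Any picklable Python object can be serialized to bytes and back.
trained_model = {"weights": [0.1, 0.2, 0.3], "bias": 0.05}
payload = pickle.dumps(trained_model)
restored = pickle.loads(payload)
# restored is an equal copy of trained_model
```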

What is logged automatically?

Neptune logs some model-building metadata automatically by default. You can control what is captured, as described below.

Code

  • Entrypoint script: filename of the script you executed like main.py
  • Files: the content of the script you executed
  • Git: information extracted from the .git directory
    • Dirty: whether the commit was dirty or not
    • Commit message: what was the commit message
    • Commit ID: Git SHA signature
    • Commit author: the author of the last commit
    • Commit date: the date of the last commit
    • Current branch: what is the current git branch
    • Remotes: list of git remotes your local .git is connected to
Important: You can turn off logging of source code by passing an empty list to the source_files argument of neptune.init():

```python
run = neptune.init(..., source_files=[])
```

System information

  • creation_time: the time when the Run was created
  • hostname: your system hostname
  • id: Neptune id of a run ('SHOW-3051' for example)
  • modification_time: when the Run was last modified
  • owner: who has created the Run
  • ping_time: the time of the last interaction with the Neptune client
  • running_time: total time the Run spent in the active mode

Hardware consumption and console logs

  • cpu: total CPU consumption of your machine
  • memory: total memory consumption of your machine
  • gpu: GPU consumption of your machine
  • gpu_memory: GPU memory consumption of your machine
  • stderr: standard error logs from your console
  • stdout: standard output logs from your console
To turn off logging of hardware consumption and console logs use:
```python
run = neptune.init(...,
                   capture_stdout=False,
                   capture_stderr=False,
                   capture_hardware_metrics=False)
```

Logging with integrations

To make logging easier, we created integrations for most of the Python ML libraries including PyTorch, TensorFlow, Keras, and Scikit-Learn.
Integrations give you out-of-the-box utilities that log most of the ML metadata you would normally log in those ML libraries.
For example, to log metadata from TensorFlow / Keras add a NeptuneCallback:
```python
from neptune.new.integrations.tensorflow_keras import NeptuneCallback

neptune_cbk = NeptuneCallback(run=run, base_namespace='metrics')

model.fit(x_train, y_train,
          epochs=5,
          batch_size=64,
          callbacks=[neptune_cbk])
```
Now when you run your script all your metrics and losses will be logged to Neptune.

Logging to an existing Run

Neptune allows logging to existing Runs. This lets you:
  • add new data or visualizations after model training has finished
  • use Neptune in multi-stage training workflows
  • log to one Run from multiple scripts
To update an existing Run, pass its Run ID to neptune.init():

```python
import neptune.new as neptune

run = neptune.init(...,
                   run="SUN-123")
```
Now you can log ML metadata to it as you normally would.

Logging in the offline and debug modes

By default, the Neptune client library logs ML metadata in the asynchronous connection mode: while your script runs, a separate process handles metadata logging. This makes logging faster and more robust.
However, there are situations where you may want to use other connection modes.
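The asynchronous mode can be pictured as a producer-consumer queue: the training loop only enqueues values, while a background worker ships them. A simplified pure-Python sketch of the pattern (not Neptune's actual implementation, which uses a separate process):

```python
import queue
import threading

log_queue = queue.Queue()
shipped = []

def worker():
    # Background consumer: drains the queue, simulating the upload side.
    while True:
        item = log_queue.get()
        if item is None:   # sentinel: stop the worker
            break
        shipped.append(item)

thread = threading.Thread(target=worker, daemon=True)
thread.start()

# The training loop only enqueues; it never blocks on "network" I/O.
for value in [0.43, 0.21, 0.06]:
    log_queue.put(("train/loss", value))

log_queue.put(None)
thread.join()
# shipped now holds every enqueued value, in order
```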

No internet connection: use offline mode

If the machine you are executing your scripts on doesn't have an internet connection you can save ML metadata locally and sync it with Neptune later.
To do that, create a Run in the "offline" connection mode:
```python
import neptune.new as neptune

run = neptune.init(...,
                   mode="offline")
```
Once your internet connection is back, sync the local logs with Neptune by running the following in your console:

```shell
neptune sync
```
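Conceptually, offline mode is store-and-forward: each logging operation is appended to a local buffer, and neptune sync later replays that buffer against the server. A simplified sketch of the pattern (the on-disk format here is invented for illustration; Neptune's real format differs):

```python
import json
import pathlib
import tempfile

offline_dir = pathlib.Path(tempfile.mkdtemp())
buffer_file = offline_dir / "operations.jsonl"

# "Offline" phase: append every logging operation to a local file.
operations = [{"field": "acc", "value": 0.95},
              {"field": "loss", "value": 0.21}]
with buffer_file.open("w") as f:
    for op in operations:
        f.write(json.dumps(op) + "\n")

# "Sync" phase: replay the buffered operations once a connection exists.
replayed = [json.loads(line) for line in buffer_file.read_text().splitlines()]
# replayed matches the original operations, in order
```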

Code debugging: use debug mode

Sometimes, you are just getting the code to work, and don't want to clutter your Neptune project with failed runs. In those situations, use a "debug" connection mode.
Runs created in the "debug" mode don't log any metadata to Neptune servers nor save it locally. When you use the "debug" mode you turn off all the Neptune-related logging.
To do that, create a Run and pass mode="debug" :
```python
import neptune.new as neptune

run = neptune.init(...,
                   mode="debug")
```
