Adding Neptune to your code

  1. In your code, import the Neptune client library:

    import neptune.new as neptune
    
  2. Depending on how you want to organize the metadata in the app, start one or more Neptune objects:

    For metadata relating to a single experiment:

    run = neptune.init_run()  # (1)
    
    1. We recommend saving your API token and project name as environment variables. If needed, you can pass them as arguments when initializing Neptune: neptune.init_run(project="workspace-name/project-name", api_token="Your Neptune API token here")

    Log experiment tracking metadata:

    params = {
        "max_epochs": 10,
        "optimizer": "Adam",
        "dropout": 0.2,
    }
    run["parameters"] = params
    

    When you're done, stop the connection to sync the data:

    run.stop()
    

    View the results in the Runs section.

    Register the model at a high level:

    model = neptune.init_model(key="FOREST")  # (1)
    
    1. If the project key is CLS, this creates a model with the ID CLS-FOREST.

    Log metadata common to all model versions:

    model["signature"].upload("model_signature.json")
    

    When you're done, stop the connection to sync the data:

    model.stop()
    

    See the results in the Models section.

    Capture model version specifics:

    model_version = neptune.init_model_version(model="CLS-FOREST")  # (1)
    
    1. Creates a model version based on the model CLS-FOREST. The first version will have the ID CLS-FOREST-1.

    Log metadata specific to a model version:

    model_version["model/binary"].upload("model.pt")
    

    When you're done, stop the connection to sync the data:

    model_version.stop()
    

    See the results in the Models section.

    For metadata common to the entire project, initialize the project as a Neptune object:

    project = neptune.init_project(name="ml-team/classification")
    

    Log project-level metadata:

    project["dataset/v0.1"].track_files("s3://datasets/images")
    

    When you're done, stop the connection to sync the data:

    project.stop()
    
  3. When you execute the script, Neptune prints a link to the relevant section in the app.

Run example

The following example shows, at a high level, how you can plug Neptune into a typical model training flow.

Start the tracking

In your model training script, import Neptune and start a run:

import neptune.new as neptune

run = neptune.init_run(project="ml-team/classification")  # (1)
  1. You can also set the project name as an environment variable. For instructions, see Setting the project name.
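
As a minimal sketch of the environment-variable approach, the helper below gathers the project name and API token from the environment and fails early with a clear message if either is missing. It assumes the variable names NEPTUNE_PROJECT and NEPTUNE_API_TOKEN; the helper name init_kwargs_from_env is hypothetical and not part of the Neptune client.

```python
import os


def init_kwargs_from_env(env=None):
    """Hypothetical helper: build keyword arguments for neptune.init_run().

    Assumes the NEPTUNE_PROJECT and NEPTUNE_API_TOKEN variable names.
    """
    env = os.environ if env is None else env
    required = ("NEPTUNE_PROJECT", "NEPTUNE_API_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "project": env["NEPTUNE_PROJECT"],
        "api_token": env["NEPTUNE_API_TOKEN"],
    }


# Usage (requires the Neptune client and valid credentials, so not executed here):
# run = neptune.init_run(**init_kwargs_from_env())
```

If both variables are set, calling neptune.init_run() without arguments should be enough; the explicit check is only useful when you want the script to stop before any training work begins.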

Log hyperparameters

Define some hyperparameters to track for the experiment and log them to the run object:

parameters = {
    "dense_units": 128,
    "activation": "relu",
    "dropout": 0.23,
    "learning_rate": 0.15,
    "batch_size": 64,
    "n_epochs": 30,
}
run["model/parameters"] = parameters

You can update or add new entries later in the code:

# Add additional parameters
run["model/parameters/seed"] = RANDOM_SEED

# Update parameters. For example, after triggering early stopping
run["model/parameters/n_epochs"] = epoch
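
To make the path layout concrete, the sketch below shows how a dictionary of parameters corresponds to individual field paths such as model/parameters/seed. It is only an illustration of the naming scheme, not Neptune's implementation; flatten_to_paths is a hypothetical helper and the nested optimizer entry is made up for the example.

```python
def flatten_to_paths(values, prefix):
    """Map a (possibly nested) dict to {"namespace/field": value} pairs."""
    paths = {}
    for key, value in values.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            # Nested dicts become nested namespaces.
            paths.update(flatten_to_paths(value, path))
        else:
            paths[path] = value
    return paths


parameters = {
    "dropout": 0.23,
    "optimizer": {"name": "Adam", "learning_rate": 0.15},
}
fields = flatten_to_paths(parameters, "model/parameters")

for path, value in fields.items():
    print(path, "=", value)
```

Each resulting path is a field you can later overwrite on its own, which is what the update snippet above relies on.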

Log training metrics

Track the training process by logging your training metrics. Use the log() method for a series of values, or one of our ready-made integrations:

for epoch in range(parameters["n_epochs"]):
    [...]  # My training loop

    run["train/epoch/loss"].log(loss)
    run["train/epoch/accuracy"].log(acc)

TensorFlow/Keras:

from neptune.new.integrations.tensorflow_keras import NeptuneCallback

model.fit(
    x_train,
    y_train,
    callbacks=[NeptuneCallback(run=run)],
)

XGBoost:

import xgboost as xgb
from neptune.new.integrations.xgboost import NeptuneCallback

xgb.train(
    params=parameters,
    dtrain=dtrain,
    callbacks=[NeptuneCallback(run=run)],
)

LightGBM:

import lightgbm as lgb
from neptune.new.integrations.lightgbm import NeptuneCallback

gbm = lgb.train(
    parameters,
    lgb_train,
    callbacks=[NeptuneCallback(run=run)],
)

scikit-learn:

import neptune.new.integrations.sklearn as npt_utils

run["cls_summary"] = npt_utils.create_classifier_summary(
    gbc, X_train, X_test, y_train, y_test
)

run["rfr_summary"] = npt_utils.create_regressor_summary(
    rfr, X_train, X_test, y_train, y_test
)

run["kmeans_summary"] = npt_utils.create_kmeans_summary(
    km, X, n_clusters=17
)

You can use Neptune with any machine learning framework. If you use a framework that supports logging (most of them do), you don't need to write the logging code yourself: the Neptune integration tracks the training metrics for you.
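
For a framework without a ready-made integration, a small callback of your own can play the same role. The following is a rough sketch under stated assumptions: EpochLogger and its on_epoch_end method are hypothetical names, and the _Series class with the defaultdict is only a stand-in that mimics the run["path"].log(value) pattern so the example runs offline.

```python
from collections import defaultdict


class EpochLogger:
    """Hypothetical callback: logs each metric to '<base>/<name>' via .log()."""

    def __init__(self, run, base_namespace="train/epoch"):
        self.run = run
        self.base_namespace = base_namespace

    def on_epoch_end(self, metrics):
        for name, value in metrics.items():
            self.run[f"{self.base_namespace}/{name}"].log(value)


class _Series:
    """Stand-in for a series field; it only collects the logged values."""

    def __init__(self):
        self.values = []

    def log(self, value):
        self.values.append(value)


# Stand-in for a run object; in real code, use neptune.init_run() instead.
fake_run = defaultdict(_Series)

logger = EpochLogger(fake_run)
for loss, acc in [(0.9, 0.55), (0.6, 0.71), (0.4, 0.8)]:
    logger.on_epoch_end({"loss": loss, "accuracy": acc})

print(fake_run["train/epoch/loss"].values)
```

The callback only depends on the object supporting item access and a log() method, so swapping the stand-in for a real run should not require changes to the class itself.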

Related

Integrations

Log evaluation results

Assign the metrics to a namespace and field of your choice:

run["evaluation/accuracy"] = eval_acc
run["evaluation/loss"] = eval_loss

With the snippet above, both evaluation metrics are stored in the same evaluation namespace.

You can log plots and charts with the upload() method.

A plot object is converted to an image file, but you can also upload images from the local disk.

import matplotlib.pyplot as plt
from scikitplot.metrics import plot_roc, plot_precision_recall

fig, ax = plt.subplots()
plot_roc(y_test, y_pred_proba, ax=ax)

run["evaluation/ROC"].upload(fig)

fig, ax = plt.subplots()
plot_precision_recall(y_test, y_pred_proba, ax=ax)

run["evaluation/precision-recall"].upload(fig)

Or, to upload image files from the local disk:

run["evaluation/ROC"].upload("roc.png")
run["evaluation/precision-recall"].upload("prec-recall.jpg")

The following snippet logs sample predictions by using the FileSeries type to log a series of labeled images:

for image, predicted_label, probabilities in sample_predictions:

    # Build a multi-line description listing the probability of each class
    description = "\n".join(
        [f"class {label}: {prob}" for label, prob in probabilities]
    )

    run["evaluation/predictions"].log(
        image,
        name=predicted_label,
        description=description,
    )
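
The loop above expects each sample to carry its class probabilities as (label, probability) pairs. As a small sketch of that data shape and the description text it produces, with made-up labels and values and a hypothetical helper name:

```python
def format_description(probabilities):
    """Join (label, probability) pairs into one line per class."""
    return "\n".join(f"class {label}: {prob}" for label, prob in probabilities)


# Hypothetical sample; if your model returns a dict, pass probs.items() instead.
sample_probabilities = [("cat", 0.9), ("dog", 0.1)]
description = format_description(sample_probabilities)

print(description)
# class cat: 0.9
# class dog: 0.1
```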

You can upload tabular data as a pandas DataFrame and inspect it as a neat table in the app:

import pandas as pd
from neptune.new.types import File

df = pd.DataFrame(
    data={
        "y_test": y_test,
        "y_pred": y_pred,
        "y_pred_probability": y_pred_proba.max(axis=1),
    }
)

run["evaluation/predictions"].upload(File.as_html(df))

You can also upload the data as a CSV file, which you can preview as an interactive table in the app.
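
A minimal sketch of the CSV route, using only the standard library to write the file. The rows are made-up values, and the field name evaluation/predictions_csv is a hypothetical choice; the upload call is left as a comment because it needs an active run.

```python
import csv
import os
import tempfile

# Made-up prediction rows, mirroring the DataFrame columns used above.
rows = [
    {"y_test": 1, "y_pred": 1, "y_pred_probability": 0.93},
    {"y_test": 0, "y_pred": 1, "y_pred_probability": 0.58},
]

csv_path = os.path.join(tempfile.mkdtemp(), "predictions.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)

# With an active run (not executed here):
# run["evaluation/predictions_csv"].upload(csv_path)
```

If the data is already in a pandas DataFrame, df.to_csv(csv_path, index=False) produces an equivalent file.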

Upload relevant files

You can upload any binary file (such as a model file) from disk using the upload() method.

If your model is saved as multiple files, you can upload a whole folder as a FileSet with upload_files().

import torch

torch.save(net.state_dict(), "model.pt")

run["model/saved_model"].upload("model.pt")

Or, to upload a pickled Python object:

from neptune.new.types import File

run["model/pickled_model"].upload(File.as_pickle(model_object))

Instead of uploading entire files, you can track their metadata only.

run["dataset/train"].track_files("./datasets/train/images")

For details, see Tracking artifacts.

Tips

  • To organize the model metadata in the Models section, instead of just logging to a run object, you can create a model object and log the data there. For more, see Model registry overview.

Explore results

Once you're done logging, end the run with the stop() method:

run.stop()

Next, run your script and follow the link to explore your metadata in Neptune.