Add Neptune to your code#
In your code, import the Neptune client library:
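Assuming the Neptune Python client (neptune 1.x):
import neptune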
Depending on how you want to organize the metadata in the app, start one or more Neptune objects:
For metadata relating to a single experiment:
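A sketch, assuming the client's init_run() function:
run = neptune.init_run()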
We recommend saving your API token and project name as environment variables.
If needed, you can pass them as arguments when initializing Neptune:
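For example (the project name and token below are placeholders; the corresponding environment variables are NEPTUNE_PROJECT and NEPTUNE_API_TOKEN):
run = neptune.init_run(
    project="workspace-name/project-name",  # placeholder
    api_token="YourNeptuneApiToken",  # placeholder
)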
Log experiment tracking metadata:
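For example (field names and values are illustrative):
run["algorithm"] = "ConvNet"
run["params"] = {
    "batch_size": 64,
    "dropout": 0.2,
    "learning_rate": 0.001,
}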
When you're done, stop the connection to sync the data:
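run.stop()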
View the results in the Runs section.
Register the model on a high level:
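A sketch, assuming the client's init_model() function and an illustrative project-unique key:
model = neptune.init_model(key="FOREST")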
- If the project key is CLS, this creates a model with the ID CLS-FOREST.
Log metadata common to all model versions:
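For example (field names and file paths are illustrative):
model["signature"].upload("model_signature.json")
model["dataset/train"] = "s3://datasets/train.csv"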
When you're done, stop the connection to sync the data:
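model.stop()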
See the results in the Models section.
Capture model version specifics:
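A sketch, assuming the client's init_model_version() function and the model ID from the previous step:
model_version = neptune.init_model_version(model="CLS-FOREST")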
- Creates a model version based on the model CLS-FOREST. It will have the ID CLS-FOREST-1.
Log metadata specific to a model version:
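For example (field names, file path, and score are illustrative):
model_version["model/binary"].upload("model.pt")
model_version["validation/acc"] = 0.97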
When you're done, stop the connection to sync the data:
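model_version.stop()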
See the results in the Models section.
For metadata common to the entire project, initialize the project as a Neptune object:
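A sketch, assuming the client's init_project() function and a placeholder project name:
project = neptune.init_project(project="workspace-name/project-name")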
Log project-level metadata:
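For example (field names, text, and file path are illustrative):
project["general/brief"] = "Forest cover type classification"
project["general/data_analysis"].upload("data_analysis.ipynb")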
When you're done, stop the connection to sync the data:
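project.stop()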
After executing the script, Neptune prints a link to the relevant section in the app.
Run example#
The following example shows, at a high level, how you can plug Neptune into a typical model training flow.
Start the tracking#
In your model training script, import Neptune and start a run:
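A sketch, assuming the client's init_run() function and a placeholder project name:
import neptune

run = neptune.init_run(project="workspace-name/project-name")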
- You can also set the project name as an environment variable. For instructions, see Set the project name.
Log hyperparameters#
Define some hyperparameters to track for the experiment and log them to the run
object:
parameters = {
"dense_units": 128,
"activation": "relu",
"dropout": 0.23,
"learning_rate": 0.15,
"batch_size": 64,
"n_epochs": 30,
}
run["model/parameters"] = parameters
You can update or add new entries later in the code:
# Add additional parameters
run["model/parameters/seed"] = RANDOM_SEED
# Update parameters. For example, after triggering early stopping
run["model/parameters/n_epochs"] = epoch
Log training metrics#
Track the training process by logging your training metrics. Use the append()
method for a series of values, or one of our ready-made integrations:
import neptune.integrations.sklearn as npt_utils
run["cls_summary"] = npt_utils.create_classifier_summary(
gbc, X_train, X_test, y_train, y_test
)
run["rfr_summary"] = npt_utils.create_regressor_summary(
rfr, X_train, X_test, y_train, y_test
)
run["kmeans_summary"] = npt_utils.create_kmeans_summary(
km, X, n_clusters=17
)
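If you log the training metrics yourself, you can build up a series by calling append() inside the training loop. A sketch (train_one_epoch() is a hypothetical helper; field names are illustrative):
for epoch in range(parameters["n_epochs"]):
    loss, acc = train_one_epoch()  # hypothetical training step returning the metrics
    run["train/loss"].append(loss)
    run["train/accuracy"].append(acc)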
You can use Neptune with any machine learning framework. If you use a framework that Neptune integrates with (most popular ones are supported), you don't need to write the logging code yourself: the integration takes care of tracking all the training metrics.
Log evaluation results#
Assign the metrics to a namespace and field of your choice:
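For example (eval_acc and eval_f1 are hypothetical variables holding your scores):
run["evaluation/accuracy"] = eval_acc
run["evaluation/f1_score"] = eval_f1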
Using the snippet above, both evaluation metrics will be stored in the same evaluation
namespace.
You can log plots and charts with the upload()
method.
A plot object is converted to an image file, but you can also upload images from the local disk.
import matplotlib.pyplot as plt
from scikitplot.metrics import plot_roc, plot_precision_recall
fig, ax = plt.subplots()
plot_roc(y_test, y_pred_proba, ax=ax)
run["evaluation/ROC"].upload(fig)
fig, ax = plt.subplots()
plot_precision_recall(y_test, y_pred_proba, ax=ax)
run["evaluation/precision-recall"].upload(fig)
The following snippet logs sample predictions by using the FileSeries
type to log a series of labeled images:
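A sketch, assuming File.as_image() and a description argument for file series; sampled_predictions is a hypothetical iterable of (image, label, probability) tuples:
from neptune.types import File

for image, predicted_label, probability in sampled_predictions:
    description = f"label: {predicted_label} | probability: {probability:.2f}"
    run["evaluation/sample_predictions"].append(
        File.as_image(image),
        description=description,
    )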
You can upload tabular data as a pandas DataFrame and inspect it as a neat table in the app:
import pandas as pd
from neptune.types import File
df = pd.DataFrame(
data={
"y_test": y_test,
"y_pred": y_pred,
"y_pred_probability": y_pred_proba.max(axis=1),
}
)
run["evaluation/predictions"].upload(File.as_html(df))
You can also just upload data as CSV, which you can preview as an interactive table.
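For example (the file path is illustrative):
run["evaluation/predictions_csv"].upload("predictions.csv")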
Upload relevant files#
You can upload any binary file (such as a model file) from disk using the upload() method.
If your model is saved as multiple files, you can upload a whole folder as a FileSet with upload_files().
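For example (field names and paths are illustrative):
run["model/weights"].upload("model.pt")
# Upload the contents of a folder as a single FileSet
run["model/checkpoints"].upload_files("checkpoints/")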
Instead of uploading entire files, you can track their metadata only.
For details, see Tracking artifacts.
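A sketch, assuming the track_files() method for artifact tracking (the path is illustrative):
run["datasets/train"].track_files("data/train/")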
Tips
- To organize the model metadata in the Models section, instead of just logging to a run object, you can create a model object and log the data there. For more, see Model registry overview.
Explore results#
Once you're done logging, end the run with the stop()
method:
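run.stop()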
Next, run your script and follow the link to explore your metadata in Neptune.