Core concepts

Core concepts in Neptune

Neptune Client Library (API): suite of libraries to help you log, query, and download model building metadata. You can use the Neptune client for Python or R, or one of the many integrations with ML tools.

Neptune Web Interface (UI): visual interface where you can display, filter, compare, and organize ML runs and model building metadata.

Workspace: space inside Neptune where you can manage Projects, people, and plans. When you create an account, you are assigned an individual workspace.

Project: space inside a Workspace where you put all the ML runs and metadata generated while working on a single ML model. You can assign different people from your Workspace to different Projects.

Run: a namespace inside a Project where you log model building metadata. Typically, you create a Run every time you execute a script that does model training, re-training, or inference.

Field: a place inside the Run namespace where you log various types of model building metadata.

Metadata: metrics and losses, hyperparameters, figures, image predictions, hardware consumption logs, code snapshots, and any other information you need to feel in control of your model building.

Client library (Neptune API)

Neptune client is an open-source Python library that helps you log model building metadata and query it afterwards.

It is designed to be:

  • lightweight: easy to connect to your workflow and integrate with other tools in your MLOps tech stack,

  • flexible: capable of logging any kind of ML metadata in any structure you need,

  • straightforward: you define what you want to log for each ML run.

Neptune API in 30 seconds

Step 1: Install neptune-client library

pip:
pip install neptune-client

conda:
conda install -c conda-forge neptune-client

Step 2: Add logging into your model training script

train.py

import neptune.new as neptune

run = neptune.init(
    project='#',    # your project
    api_token='#',  # your api token
)

# Track metadata and hyperparameters of your run
run["JIRA"] = "NPT-952"
params = {
    "batch_size": 64,
    "dropout": 0.2,
    "optimizer": {"learning_rate": 0.001, "optimizer": "Adam"},
}
run["parameters"] = params

# Track the training process by logging your training metrics
for epoch in range(100):
    run["train/accuracy"].log(epoch * 0.6)
    run["train/loss"].log(epoch * 0.4)

# Log the final results
run["f1_score"] = 0.66
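Hard-coding the API token in train.py makes it easy to leak through version control. Below is a minimal sketch of one alternative: collecting the settings from environment variables. The helper name neptune_settings is hypothetical, and the variable names NEPTUNE_PROJECT and NEPTUNE_API_TOKEN are an assumption here, since this page does not mention them; check the client documentation before relying on them.

```python
import os
from typing import Dict, Mapping


def neptune_settings(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect project and token from the environment (hypothetical helper)."""
    # Variable names are assumed, not taken from this page.
    required = ("NEPTUNE_PROJECT", "NEPTUNE_API_TOKEN")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "project": environ["NEPTUNE_PROJECT"],
        "api_token": environ["NEPTUNE_API_TOKEN"],
    }


# Demo with an explicit mapping so the snippet runs without real credentials.
settings = neptune_settings(
    {"NEPTUNE_PROJECT": "my_workspace/my_project", "NEPTUNE_API_TOKEN": "example-token"}
)
print(settings["project"])
```

With the variables exported in your shell, train.py could then call neptune.init(**neptune_settings()) instead of embedding the values.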

Step 3: Run model training

python train.py

Step 4: Query whatever model building metadata you need

import neptune.new as neptune

run = neptune.init(
    project='#',    # your project
    api_token='#',  # your api token
    run='#',        # id of the run you want to access
)
params = run["parameters"].fetch()

Web Interface (Neptune UI)

Neptune comes with a Web Interface designed for working with ML model building metadata.

It makes it easy to display, filter, compare, and organize ML runs and model building metadata.

There are a few essential components of the Neptune UI:

  • Projects: where you can create and manage all your projects

  • Runs Table: where you can compare, filter, and organize your runs

  • Single run: where you can explore and display rich metadata

  • People: where you can deal with user access control (ACL)

  • Subscription: where you can manage your plan and see current usage

Neptune UI Tour

Examples of Neptune UI in action

| Description | Example in Neptune |
| --- | --- |
| Runs table | See in the UI |
| Comparison of runs | See in the UI |
| Logged learning curves | See in the UI |
| Logged hardware consumption | See in the UI |
| Logged interactive charts | See in the UI |
| Logged images | See in the UI |
| Custom Dashboard | See in the UI |
| Projects | See in the UI |

Workspace

Workspace is a space inside Neptune where you can manage projects, people, and plans.

There are two types of workspaces:

  • Individual

    • When you create an account, you are assigned an individual workspace named after your username,

    • You can create an unlimited number of public and private projects,

    • You cannot invite other people to this workspace.

  • Team

    • Comes in handy when you want to collaborate on projects with other people,

    • You can invite as many people as you want to your team workspace,

    • Team workspace is paid, but you can try it for free. For details, check our pricing page,

    • For academic purposes or Kaggle, you can use the team workspace for free. Apply here.

You can be a member of multiple workspaces. For example, you can have:

  • your individual workspace,

  • a team workspace at work,

  • a team workspace at the university.

Project

A Project is a collection of Runs created by a user (or users) assigned to the Project. Typically you should create a Project per machine learning task to make it easy to compare runs that are connected to building a single ML model.

There are two types of projects:

  • Private projects: only people assigned to the project can see it,

  • Public projects: freely available to view by everyone who has access to the Internet.

You can assign different people from your Workspace to different Projects.

To log to a Project via the Neptune API, you need the full project name in the format workspace/project:

run = neptune.init(project='my_workspace/my_project')

See how to find it in the Neptune UI and set it in your logging code.
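A malformed project name is an easy mistake to make, so it can help to check the workspace/project format before starting a run. The sketch below uses a hypothetical helper, split_project_name, that is not part of the Neptune client; it only validates the string shape described above.

```python
from typing import Tuple


def split_project_name(full_name: str) -> Tuple[str, str]:
    """Split 'workspace/project' into its two parts (hypothetical helper)."""
    workspace, sep, project = full_name.partition("/")
    # Reject names with no slash, an empty part, or more than one slash.
    if not sep or not workspace or not project or "/" in project:
        raise ValueError(f"expected 'workspace/project', got {full_name!r}")
    return workspace, project


workspace, project = split_project_name("my_workspace/my_project")
print(workspace, project)
```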

Run

A Run is a namespace inside a Project where you log model building metadata. Typically, you create a run every time you execute a script that does model training, re-training, or inference.

It has a dictionary-like structure with Fields to which any type of ML metadata can be logged.

You can create a nested structure in your Runs to organize your parameters, metrics, or other metadata.

You start a Run with neptune.init():

import neptune.new as neptune
run = neptune.init(project='#', api_token='#') # your credentials

Log model building metadata to various fields of the run:

run["JIRA"] = "NPT-952"
run["parameters"] = {
    "batch_size": 64,
    "dropout": 0.2,
    "optimizer": {"learning_rate": 0.001, "optimizer": "Adam"},
}

for epoch in range(100):
    run["train/accuracy"].log(epoch * 0.6)
    run["train/loss"].log(epoch * 0.4)

# Log the final results
run["f1_score"] = 0.66

Explore your Run in the UI:

Run in the Neptune UI

Access a Run programmatically (after it has finished):

run = neptune.init(
    project="#",
    api_token="#",  # pass credentials
    run="SUN-123",  # pass a run ID
)
# update with new data
run["test_accuracy"] = 0.42
# fetch metadata
run["parameters/batch_size"].fetch()

The run stops when the script finishes or when you explicitly call:

run.stop()
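In long-lived sessions such as notebooks, the script may not finish for a while, so an explicit stop can be useful even when training fails. The following is a minimal sketch of that pattern using try/finally. FakeRun and train_and_log are hypothetical stand-ins so the snippet runs without credentials; only run.stop() and the item assignment mirror calls shown on this page.

```python
class FakeRun:
    """Offline stand-in for a Neptune run (hypothetical, for illustration)."""

    def __init__(self):
        self.fields = {}
        self.stopped = False

    def __setitem__(self, path, value):
        self.fields[path] = value

    def stop(self):
        self.stopped = True


def train_and_log(run, fail=False):
    """Log a result and always stop the run, even if training raises."""
    try:
        if fail:
            raise RuntimeError("training crashed")
        run["f1_score"] = 0.66
    finally:
        run.stop()


ok_run = FakeRun()
train_and_log(ok_run)

crashed_run = FakeRun()
try:
    train_and_log(crashed_run, fail=True)
except RuntimeError:
    pass  # the run was still stopped by the finally block
```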

Field

A Field of a Run is a namespace to which various types of model building metadata can be logged.

A Run field can be flat or nested:

# flat
run["f1_score"] = 0.66
run["model"].upload("model.pkl")
# nested
run["parameters/batch_size"] = 64
run["parameters/optimizer/learning_rate"] = 0.001
run["parameters/optimizer/optimizer"] = "Adam"

You can also assign values to a Field of a Run with a dictionary:

run["parameters"] = {
    "batch_size": 64,
    "optimizer": {"learning_rate": 0.001, "optimizer": "Adam"},
}
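To see how a dictionary assignment relates to the nested paths shown earlier, the sketch below flattens a nested dictionary into slash-separated field paths. It is an illustration of the naming scheme only, not Neptune's implementation, and flatten_fields is a hypothetical helper.

```python
from typing import Any, Dict


def flatten_fields(metadata: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Map a nested dict to {'a/b/c': value} paths (hypothetical helper)."""
    flat: Dict[str, Any] = {}
    for key, value in metadata.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested namespaces, extending the path.
            flat.update(flatten_fields(value, path))
        else:
            flat[path] = value
    return flat


params = {
    "batch_size": 64,
    "optimizer": {"learning_rate": 0.001, "optimizer": "Adam"},
}
flat = flatten_fields(params, prefix="parameters")
for path, value in flat.items():
    print(path, "=", value)
```

The resulting paths match the nested form used above, for example parameters/optimizer/learning_rate.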

Neptune Field types and logging methods

Model building metadata are displayed in the Neptune UI based on the type you choose to assign to your Field.

Depending on what metadata you want to log and how you wish to display it, use different Neptune field types and methods:

| Metadata | Field Type | Logging Method | Example |
| --- | --- | --- | --- |
| Single metric | Float | =, assign() | run["f1_score"] = 0.66 |
| Series of loss values | FloatSeries | .log() | for epoch in range(100): run["loss"].log(loss_value) |
| Hyperparameters | Float, String | =, assign() | run["learning_rate"] = 0.001; run["optimizer"] = "Adam" |
| Image predictions | FileSeries | .log() | for epoch in range(100): run["preds"].log(image) |
| Charts | File | .upload(), .as_html(), .as_image() | run["results"].upload(File.as_image(fig)) |
| Model weights | File | .upload() | run["model"].upload("model.pkl") |
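The practical difference between a single-value field and a series field is that assignment replaces the stored value while .log() appends to it. A toy in-memory model of that behaviour is sketched below; ToyRun is a hypothetical class written for illustration and does not reflect how the Neptune client stores data.

```python
from collections import defaultdict


class ToyRun:
    """In-memory model of assign vs. log semantics (hypothetical)."""

    def __init__(self):
        self.values = {}                 # single-value fields, e.g. Float, String
        self.series = defaultdict(list)  # series fields, e.g. FloatSeries

    def assign(self, path, value):
        # A later assignment replaces the earlier value.
        self.values[path] = value

    def log(self, path, value):
        # Each call appends one more point to the series.
        self.series[path].append(value)


toy = ToyRun()
toy.assign("f1_score", 0.5)
toy.assign("f1_score", 0.66)
for loss_value in (0.9, 0.5, 0.3):
    toy.log("loss", loss_value)

print(toy.values["f1_score"], toy.series["loss"])
```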

Query and download metadata from the Field

You can access information from the Field of a Run programmatically with .fetch() and .download() methods.

| Field type | Method | What it does | Example |
| --- | --- | --- | --- |
| Float, String | .fetch() | get value from the field | run["train/epoch"].fetch() |
| File, FileSeries | .download() | download file from the field | run["predictions/img2"].download() |
| FloatSeries | .fetch_values() | get all values from a series field | run["train/loss"].fetch_values() |
| FileSeries | .download_last() | download the last file from a series field | run["predictions"].download_last() |

Metadata

When we talk about metadata in the context of Neptune we usually mean model building metadata.

There are three main types of ML metadata that you can log and display in Neptune.

Experiment and model training metadata

  • Metrics

  • Hyperparameters

  • Learning curves

  • Training code and configuration files

  • Predictions (images, tables, etc)

  • Diagnostic charts (Confusion matrix, ROC curve, etc)

  • Console logs

  • Hardware logs

  • And more

Artifact (datasets, predictions, and models) metadata

  • Paths to the dataset or model (s3 bucket, filesystem)

  • Dataset hash

  • Dataset/prediction preview (head of the table, snapshot of the image folder)

  • Description

  • Feature column names (for tabular data)

  • Who created/modified

  • When last modified

  • Size of the dataset

  • And more

(Trained) Model metadata

  • Model binary or location to your model asset

  • Dataset versions

  • Links to recorded model training runs and experiments

  • Who trained the model

  • Model descriptions and notes

  • Links to observability dashboards (Grafana)

  • And more

Most model building metadata types can be easily logged and nicely displayed in the Neptune UI.

Have a look at the complete list of metadata types you can log.

What's next?