Migrating to neptune.new

This migration guide is meant to give you a quick start on migrating your existing code to the new Python API and on leveraging the new way of tracking and logging metadata.

Project migration

The new Python API and revamped user interface, under the hood, require a changed data structure. Over the following weeks, we will be migrating existing projects to that new structure, but you can already try it out, as all new projects are created using the new structure.

The legacy Python API will continue to work after the migration, so you don’t need to change a single line of code. In the background, we quietly do our magic and make sure things work for you. The only thing you need to do is update the Neptune client library to at least version 0.9.1.

The new Python API is only compatible with the new data structure so it will only be available for your project once it’s migrated. Similarly, the improved web interface also requires the new data structure. You can already try it out with a new project, and it will be available for your existing projects once they are migrated.

At some point in the future, we plan to make the new Python API the default one with a release of version 1.0 of the client library. However, we will be supporting the current Python API for a long time so that you can make the switch at a convenient moment. It’s worth the switch though, it’s quite awesome!

Hierarchical structure

Runs can be viewed as nested dictionary-like structures that you can define in your code. Thanks to this you can easily organize your metadata in a way that is most convenient for you.

The hierarchical structure that you apply to your metadata will be reflected later in the UI.

A run's structure consists of fields that are organized into namespaces. A field's path is a combination of its namespaces and its name: if you store the value 0.8 in a Float field named momentum inside the params namespace, its path will be params/momentum. You can organize any type of metadata this way - images, parameters, metrics, scores, model checkpoints, CSV files, etc. Let's look at the following code:

import neptune.new as neptune

run = neptune.init(project='my_workspace/my_project')

run['about/JIRA'] = 'NPT-952'

run['parameters/batch_size'] = 5
run['parameters/algorithm'] = 'ConvNet'

for epoch in range(100):
    acc_value = ...
    loss_value = ...
    run['train/accuracy'].log(acc_value)
    run['train/loss'].log(loss_value)

run['trained_model'].upload('model.pt')

The resulting structure of the run will be the following:

'about':
    'JIRA': String
'parameters':
    'batch_size': Float
    'algorithm': String
'train':
    'accuracy': FloatSeries
    'loss': FloatSeries
'trained_model': File

Batch assign

You can assign multiple values to multiple fields in a batch by using a dictionary. You can use this method to quickly log all run parameters:

import neptune.new as neptune
run = neptune.init()
# Assign multiple fields from a dictionary
params = {'max_epochs': 10, 'optimizer': 'Adam'}
run['parameters'] = params
# Dictionaries can be nested
params = {'train': {'max_epochs': 10}}
run['parameters'] = params
# This will save value 10 under path "parameters/train/max_epochs"
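The comment above describes the key behavior: a nested dictionary is flattened into slash-separated field paths. As a rough illustration of that mapping (plain Python, purely illustrative - not Neptune's actual implementation):

```python
def flatten(params, prefix=''):
    """Flatten a nested dict into {slash/separated/path: value} pairs,
    mirroring how a batch assign maps onto individual field paths."""
    fields = {}
    for key, value in params.items():
        path = f'{prefix}/{key}' if prefix else key
        if isinstance(value, dict):
            fields.update(flatten(value, path))
        else:
            fields[path] = value
    return fields

params = {'train': {'max_epochs': 10}}
print(flatten(params, 'parameters'))
# {'parameters/train/max_epochs': 10}
```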

Initialization

Initialization got a bit simpler. You can replace your current code, which probably looks like this:

# Legacy API
import neptune
neptune.init(project_qualified_name='my_workspace/my_project')
neptune.create_experiment(tags=['resnet'])

With the following:

# neptune.new API
import neptune.new as neptune
run = neptune.init(project='my_workspace/my_project', tags=['resnet'])

The names of the environment variables didn't change. Instead of specifying the project name or API token in the code, you can always provide them by setting the NEPTUNE_PROJECT and NEPTUNE_API_TOKEN variables.

Parameters

With the legacy API, you had to pass parameters when creating an experiment and it was not possible to change them afterward. In addition, nested dictionaries were not fully supported.

# Legacy API
import neptune
PARAMS = {'epoch_nr': 100,
          'lr': 0.005,
          'use_nesterov': True}
neptune.init(project_qualified_name='my_workspace/my_project')
neptune.create_experiment(params=PARAMS)

With the neptune.new API, it's up to you when and where you specify parameters, and you can also update them later:

# neptune.new API
import neptune.new as neptune
PARAMS = {'epoch_nr': 100,
          'lr': 0.005,
          'use_nesterov': True}
run = neptune.init(project='my_workspace/my_project')
run['my_params'] = PARAMS
# You can also specify parameters one by one
run['my_params/batch_size'] = 64
# Update lr value
run['my_params/lr'] = 0.007

The artificial distinction between parameters and properties is also gone, and you can log and access them in one unified way.

Interacting with files (Artifacts)

You are no longer bound to store files only in the artifacts folder. Whether it's a model checkpoint, a custom interactive visualization, or an audio file, you can track it in the same hierarchical structure as the rest of the metadata:

Legacy API:      neptune.log_artifact('model_viz.png')
neptune.new API: run['model/viz'].upload('model_viz.png')

Legacy API:      neptune.log_artifact('model.pt')
neptune.new API: run['trained_model'].upload('model.pt')

Legacy API:      neptune.download_artifact('model.pt')
neptune.new API: run['trained_model'].download()

Note that in the legacy API, artifacts mimicked a file system very closely: whatever you uploaded was saved under the same name, with the same extension, etc.

The mental model behind the new Python API is more database-like. A field has a path, and under it we store some content - a Float, a String, a series of Strings, or a File. In this model, the extension is part of the content. For example, if you upload a .pt file under the path 'model/last', the file will be visible as 'last.pt' in the UI.
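To make that model concrete, here is a tiny sketch (plain Python, purely illustrative - not how Neptune is implemented) of how a field path and the uploaded content's extension combine into the name shown in the UI:

```python
import os

def display_name(field_path, uploaded_filename):
    """Illustrative only: the UI shows the last path segment
    plus the extension taken from the uploaded file's content."""
    leaf = field_path.rsplit('/', 1)[-1]
    ext = os.path.splitext(uploaded_filename)[1]  # includes the dot, e.g. '.pt'
    return leaf + ext

print(display_name('model/last', 'model.pt'))
# 'last.pt'
```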

When it's unambiguous, we implicitly convert an object to a File, so there is no need for explicit conversion. For example, for Matplotlib charts you can write run['conf_matrix'].upload(plt_fig) instead of run['conf_matrix'].upload(File.as_image(plt_fig)).

neptune-contrib

We've integrated the file-related functionality of neptune-contrib into the core library (in fact, there is no more neptune-contrib - see Integrations). The conversion methods are available as File factory methods:

Interactive charts (Altair, Bokeh, Plotly, Matplotlib)

Legacy API:
from neptunecontrib.api import log_chart
log_chart('int_chart', chart)

neptune.new API:
from neptune.new.types import File
run['int_chart'].upload(File.as_html(chart))

Pandas DataFrame

Legacy API:
from neptunecontrib.api import log_table
log_table('pred_df', df)

neptune.new API:
from neptune.new.types import File
run['pred_df'].upload(File.as_html(df))

Audio & video

We've expanded the range of files that are natively supported in the Neptune UI, so for audio and video files you no longer need to use conversion methods:

Legacy API:
from neptunecontrib.api import log_audio
log_audio('sample.mp3')

neptune.new API:
run['sample'].upload('sample.mp3')

Legacy API:
from neptunecontrib.api import log_video
log_video('sample.mp4')

neptune.new API:
run['sample'].upload('sample.mp4')

Pickled objects

Legacy API:
from neptunecontrib.api import log_pickle
log_pickle('model.pkl', model)

neptune.new API:
from neptune.new.types import File
run['model'].upload(File.as_pickle(model))

HTML strings

Legacy API:
from neptunecontrib.api import log_html
log_html('custom_viz', html_string)

neptune.new API:
from neptune.new.types import File
run['custom_viz'].upload(File.from_content(html_string, extension='html'))

Scores and metrics

Logging metrics is quite similar, except that you can now organize them in a hierarchical structure:

Legacy API: neptune.log_metric('acc', 0.97)
New API:    run['acc'].log(0.97)

Legacy API: neptune.log_metric('train_acc', 0.97)
New API:    run['train/acc'].log(0.97)

Legacy API: neptune.log_metric('loss', 0.8)
New API:    run['loss'].log(0.8)

To log scores you don't need to use Series fields anymore as you can track single values anytime, anywhere:

Legacy API: neptune.log_metric('final_accuracy', 0.8)
New API:    run['final_accuracy'] = 0.8

Text and image series

Similar changes apply to text and image series:

Legacy API: neptune.log_text('train_log', custom_log_msg)
New API:    run['train/log'].log(custom_log_msg)

Legacy API: neptune.log_image('misclassified', filepath)
New API:    run['misclassified'].log(File(filepath))

Legacy API: neptune.log_image('pred_dist', hist_chart)
New API:    run['pred_dist'].log(hist_chart)

To add a single image that you want to view alongside the rest of the metrics, you no longer need to use Series fields. Since you control whether fields are grouped in the same namespace, you can upload it as a single File field.

Legacy API: neptune.log_image('train_hist', hist_chart)
New API:    run['train/hist'].upload(hist_chart)

Integrations

We've re-designed how our integrations work so that:

  • They're more tightly integrated with our base library, and the API is more unified.

  • You can update the version of each integration separately in case of dependency conflicts.

  • There is a smaller number of dependencies if you are not using all integrations.

The new Python API no longer uses the neptune-contrib library - each integration now has two parts:

  • Boilerplate code for ease of use in the main library: import neptune.new.integrations.framework_name

  • The actual integration, which can be installed separately (pip install neptune-framework-name) or as an extra together with neptune-client (pip install "neptune-client[framework-name]")

Existing integrations from neptune-contrib are still fully functional. You can use them with projects using either the previous structure or the new structure. However, integrations from neptune-contrib use the legacy Python API, while the new integrations have been rewritten to take full advantage of the new Python API and achieve better metadata organization.

File-related neptune-contrib functionalities are now part of the core library. Read more here.

You can read in detail about each integration in the Integrations section.

We are still re-writing some of the integrations using the new Python API and they should be available in the next few weeks. In the meantime, you can use the previous version of the integration built using the legacy Python API.

Let's look at how this works in the case of the TensorFlow/Keras integration.

Installation

Legacy API:
pip install neptune-contrib

neptune.new API:
pip install "neptune-client[tensorflow-keras]"
or
pip install neptune-tensorflow-keras

Legacy API usage

import neptune
neptune.init(project_qualified_name='my_workspace/my_project')

from neptunecontrib.monitoring.keras import NeptuneMonitor

model.fit(x_train, y_train,
          epochs=5,
          batch_size=64,
          callbacks=[NeptuneMonitor()])

neptune.new API usage

import neptune.new as neptune
run = neptune.init(project='my_workspace/my_project')

from neptune.new.integrations.tensorflow_keras import NeptuneCallback

model.fit(x_train, y_train,
          epochs=5,
          batch_size=64,
          callbacks=[NeptuneCallback(run=run)])

Tags

Interaction with tags is quite similar - tags are stored as a StringSet field under the sys/tags path.

Code using legacy API:

# Legacy API
import neptune
neptune.init(project_qualified_name='my_workspace/my_project')
neptune.create_experiment(params=PARAMS,
                          tags=['maskRCNN'])
neptune.append_tag('prod_v2.0.1')
neptune.append_tags('finetune', 'keras')

Code using neptune.new API:

# neptune.new API
import neptune.new as neptune
run = neptune.init(project='my_workspace/my_project',
                   tags=['maskRCNN'])
run["sys/tags"].add('prod_v2.0.1')
run["sys/tags"].add(['finetune', 'keras'])
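Because sys/tags is a StringSet field, it has set semantics: adding the same tag twice has no effect. A rough illustration of that behavior using a plain Python set (purely illustrative, not Neptune's actual implementation):

```python
# Illustrative model of StringSet semantics using a built-in set.
tags = set(['maskRCNN'])             # tags passed at init
tags.add('prod_v2.0.1')              # adding a single string
tags.update(['finetune', 'keras'])   # adding a list of strings
tags.add('keras')                    # duplicates are ignored
print(sorted(tags))
# ['finetune', 'keras', 'maskRCNN', 'prod_v2.0.1']
```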