This migration guide gives you a quick start on migrating your existing code to the new Python API and on leveraging the new way of tracking and logging metadata.
Under the hood, the new Python API and the revamped user interface require a changed data structure. Over the following weeks, we will be migrating existing projects to that new structure, but you can already try it out, as all new projects are created using the new structure.
The legacy Python API will continue to work after the migration, so you don't need to change a single line of code. In the background, we quietly do our magic and make sure things work for you. However, the new Python API is only compatible with the new data structure, so it will only be available for your project once it's migrated. Similarly, the improved web interface also requires the new data structure. You can already try it out with a new project, and it will be available for your existing projects once they are migrated.
At some point in the future, we plan to make the new Python API the default one with the release of v1.0 of the client library. However, we will be supporting the current Python API for a long time, so you can make the switch at a convenient moment. It's worth the switch, though - it's quite awesome!
Runs can be viewed as nested dictionary-like structures that you define in your code. Thanks to this, you can easily organize your metadata in whatever way is most convenient for you.
The hierarchical structure that you apply to your metadata will be reflected later in the UI.
A run's structure consists of fields that are organized into namespaces. A field's path is a combination of its namespaces and its name - if you store the value 0.8 in a `Float` field named `momentum` in the `params` namespace, its path will be `params/momentum`. You can organize any type of metadata this way - images, parameters, metrics, scores, model checkpoints, CSV files, etc. Let's look at the following code:
```python
import neptune.new as neptune

run = neptune.init(project='my_workspace/my_project')

run['about/JIRA'] = 'NPT-952'

run['parameters/batch_size'] = 5
run['parameters/algorithm'] = 'ConvNet'

for epoch in range(100):
    acc_value = ...
    loss_value = ...
    run['train/accuracy'].log(acc_value)
    run['train/loss'].log(loss_value)

run['trained_model'].upload('model.pt')
```
The resulting structure of the run will be the following:
```
'about':
    'JIRA': String
'parameters':
    'batch_size': Float
    'algorithm': String
'train':
    'accuracy': FloatSeries
    'loss': FloatSeries
'trained_model': File
```
You can assign multiple values to multiple fields in a batch by using a dictionary. You can use this method to quickly log all run parameters:
```python
import neptune.new as neptune

run = neptune.init()

# Assign multiple fields from a dictionary
params = {'max_epochs': 10, 'optimizer': 'Adam'}
run['parameters'] = params

# Dictionaries can be nested
params = {'train': {'max_epochs': 10}}
run['parameters'] = params
# This will save value 10 under path "parameters/train/max_epochs"
```
Initialization got a bit simpler. You can replace your current code, which probably looks like this:
```python
# Legacy API
import neptune

neptune.init(project_qualified_name='my_workspace/my_project')

neptune.create_experiment(tags=['resnet'])
```
With the following:
```python
# neptune.new API
import neptune.new as neptune

run = neptune.init(project='my_workspace/my_project', tags=['resnet'])
```
The names of the environment variables didn't change. Instead of specifying the project name or API token in the code, you can always provide them by setting the `NEPTUNE_PROJECT` and `NEPTUNE_API_TOKEN` variables.
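For example, a minimal sketch of relying on the environment instead of arguments (the token value is a placeholder):

```python
import os

# Typically set in your shell or CI configuration; shown here only for illustration
os.environ['NEPTUNE_PROJECT'] = 'my_workspace/my_project'
os.environ['NEPTUNE_API_TOKEN'] = '<your-api-token>'

import neptune.new as neptune

# No project or api_token arguments needed - both are picked up from the environment
run = neptune.init()
```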
With the legacy API, you had to pass parameters when creating an experiment and it was not possible to change them afterward. In addition, nested dictionaries were not fully supported.
```python
# Legacy API
import neptune

PARAMS = {'epoch_nr': 100,
          'lr': 0.005,
          'use_nesterov': True}

neptune.init(project_qualified_name='my_workspace/my_project')
neptune.create_experiment(params=PARAMS)
```
With the neptune.new API, it's up to you when and where you want to specify parameters. Now you can also update them later:
```python
# neptune.new API
import neptune.new as neptune

PARAMS = {'epoch_nr': 100,
          'lr': 0.005,
          'use_nesterov': True}

run = neptune.init(project='my_workspace/my_project')

run['my_params'] = PARAMS

# You can also specify parameters one by one
run['my_params/batch_size'] = 64

# Update lr value
run['my_params/lr'] = 0.007
```
The artificial distinction between parameters and properties is also gone, and you can log and access them in one unified way.
You are no longer bound to store files only in the artifacts folder. Whether it's a model checkpoint, a custom interactive visualization, or an audio file, you can track it in the same hierarchical structure as the rest of the metadata:
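For example, a minimal side-by-side sketch (file names and field paths are illustrative):

```python
# Legacy API - files could only go to the artifacts folder
import neptune

neptune.init(project_qualified_name='my_workspace/my_project')
neptune.create_experiment()
neptune.log_artifact('model.pt')

# neptune.new API - files can live anywhere in the run's hierarchy
import neptune.new as neptune

run = neptune.init(project='my_workspace/my_project')
run['model/last'].upload('model.pt')
```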
Note that in the legacy API, artifacts worked like a file system and mimicked it very closely: whatever you uploaded was saved under the same name, with the same extension, and so on.
The mental model behind the new Python API is more database-like. We have a field (with a path), and under it we store some content - a `Float`, a `String`, a series of `String`, or a `File`. In this model, the extension is part of the content. For example, if you upload a `.pt` file under the path `'model/last'`, the downloaded file will be called `'last.pt'`.
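A minimal sketch of this behavior (assuming a local file model.pt and the run from the earlier examples):

```python
# The field's path is 'model/last'; the .pt extension is part of the content
run['model/last'].upload('model.pt')

run.wait()  # make sure the upload has finished

# Downloading the field later produces a file named 'last.pt'
run['model/last'].download()
```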
When it's unambiguous, we implicitly convert an object to a `File`, so there is no need for explicit conversion. E.g., for Matplotlib charts, you can write `run['conf_matrix'].upload(plt_fig)` instead of `run['conf_matrix'].upload(File.as_image(plt_fig))`.
We've integrated the neptune-contrib file-related functionalities into the core library (in fact, there is no more neptune-contrib - see Integrations). The conversion methods are available as `File` factory methods:
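A minimal sketch of the factory methods (assuming plotly_fig is a Plotly figure, plt_fig a Matplotlib figure, and pipeline any picklable object):

```python
from neptune.new.types import File

# Interactive chart, rendered as HTML in the UI
run['charts/roc'].upload(File.as_html(plotly_fig))

# Static image
run['charts/conf_matrix'].upload(File.as_image(plt_fig))

# Arbitrary Python object, stored as a pickle
run['model/pipeline'].upload(File.as_pickle(pipeline))
```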
We've expanded the range of files that are natively supported in the Neptune UI, so for audio and video files you no longer need to use conversion methods:
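A minimal sketch (file names and field paths are illustrative):

```python
# Audio and video are rendered natively in the UI,
# so a plain upload is enough - no conversion methods needed
run['data/sample'].upload('sample.mp3')
run['data/recording'].upload('recording.mp4')
```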
Logging metrics is quite similar, except that you can now organize them in a hierarchical structure:
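A minimal side-by-side sketch (metric names are illustrative):

```python
# Legacy API - flat metric names
neptune.log_metric('train_acc', 0.85)
neptune.log_metric('val_acc', 0.82)

# neptune.new API - metrics organized into namespaces
run['train/acc'].log(0.85)
run['val/acc'].log(0.82)
```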
To log scores, you don't need to use Series fields anymore, as you can track single values anytime, anywhere:
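A minimal side-by-side sketch (field names are illustrative):

```python
# Legacy API - a score had to be logged as a metric, i.e. a series
neptune.log_metric('test_f1', 0.87)

# neptune.new API - a single value can be assigned directly, at any point
run['test/f1'] = 0.87
```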
Similar changes apply to text and image series:
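A minimal side-by-side sketch (assuming im is an image, e.g. a Matplotlib figure):

```python
from neptune.new.types import File

# Legacy API
neptune.log_text('tokens', 'some text')
neptune.log_image('distribution', im)

# neptune.new API - series organized into namespaces
run['train/tokens'].log('some text')
run['train/distribution'].log(File.as_image(im))
```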
To add a single image that you want to view with the rest of the metrics, you no longer need to use Series fields. Since you control whether fields are grouped in the same namespace, you can upload the image as a single `File` field.
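A minimal side-by-side sketch (assuming fig is a Matplotlib figure):

```python
# Legacy API - even a single image went into an image series
neptune.log_image('confusion_matrix', fig)

# neptune.new API - a single image is just a File field,
# implicitly converted from the Matplotlib figure
run['confusion_matrix'].upload(fig)
```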
We've re-designed how our integrations work so that:
They're more tightly integrated with our base library, and the API is more unified.
You can update the version of each integration separately in case of dependency conflicts.
There are fewer dependencies if you are not using all integrations.
There is no longer a neptune-contrib library for the new Python API - each integration now has two parts:
Boilerplate code for ease of use in the main library: `import neptune.new.integrations.framework_name`
The actual integration, which can be installed separately (`pip install neptune-framework-name`) or as an extra together with neptune-client (`pip install "neptune-client[framework-name]"`).
Existing integrations from neptune-contrib are still fully functional. You can use them with projects using either the previous structure or the new structure. However, integrations from neptune-contrib use the legacy Python API, while the new integrations have been re-written to take full advantage of the possibilities provided by the new Python API and achieve better metadata organization.
File-related neptune-contrib functionalities are now part of the core library. Read more here.
You can read in detail about each integration in the Integrations section.
We are still re-writing some of the integrations using the new Python API and they should be available in the next few weeks. In the meantime, you can use the previous version of the integration built using the legacy Python API.
Let's look at what this looks like in the case of the TensorFlow/Keras integration.
Legacy API:

```python
import neptune

neptune.init(project_qualified_name='my_workspace/my_project')
neptune.create_experiment()

from neptunecontrib.monitoring.keras import NeptuneMonitor

model.fit(x_train, y_train,
          epochs=5,
          batch_size=64,
          callbacks=[NeptuneMonitor()])
```

neptune.new API:

```python
import neptune.new as neptune

run = neptune.init(project='my_workspace/my_project')

from neptune.new.integrations.tensorflow_keras import NeptuneCallback

model.fit(x_train, y_train,
          epochs=5,
          batch_size=64,
          callbacks=[NeptuneCallback(run=run)])
```
Interaction with tags is quite similar - the tags are stored as a `StringSet` field under the `sys/tags` path.
Code using legacy API:
```python
# Legacy API
import neptune

neptune.init(project_qualified_name='my_workspace/my_project')

neptune.create_experiment(params=PARAMS,
                          tags=['maskRCNN'])

neptune.append_tag('prod_v2.0.1')
neptune.append_tags('finetune', 'keras')
```
Code using neptune.new API:
```python
# neptune.new API
import neptune.new as neptune

run = neptune.init(project='my_workspace/my_project',
                   tags=['maskRCNN'])

run["sys/tags"].add('prod_v2.0.1')
run["sys/tags"].add(['finetune', 'keras'])
```