Downloading experiment data

Almost all data logged to a project (experiments and notebooks) can be downloaded to your local machine. You may want to do this for a variety of reasons:

  • Build custom analyses or visualizations using experiment data.

  • Use a saved model checkpoint elsewhere.

  • Get the source code of an experiment and run it again.

  • Build a report that uses data across projects.

  • Archive an old project.

There are three ways to download data from Neptune:

  1. Programmatically, using neptune-client: for example, downloading the experiment dashboard as a pandas DataFrame. See Downloading programmatically below.

  2. Directly from the UI: for example, downloading a notebook checkpoint or the experiments dashboard as a CSV file. See Downloading from Neptune UI below.

  3. From the JupyterLab interface: see the downloading checkpoint documentation.

Downloading programmatically

You can download experiment data programmatically.

The snippet below shows how to download the experiment dashboard as a pandas DataFrame using get_leaderboard(). It is a good representation of the overall idea behind downloading data from Neptune.

import neptune

# Get project
project = neptune.init('my_workspace/my_project')

# Download experiments dashboard as pandas DataFrame
data = project.get_leaderboard()
data.head()

data is a pandas DataFrame in which each row is an experiment and the columns hold the system properties, metrics, text logs, parameters and properties of these experiments. For metrics, the latest value is returned.

Example downloaded dashboard data

Downloading from the project

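At the project level, there are two main actions: fetching a list of Experiment objects and downloading the experiment dashboard as a DataFrame. Both are described below.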

Note

Check Project for other related methods.

Fetch a list of Experiment objects

Let’s fetch a list of Experiment objects that match some criteria.

import neptune

# Get project
project = neptune.init('my_workspace/my_project')

# Get list of experiment objects created by 'sophia'
sophia_experiments = project.get_experiments(owner='sophia')

# Get another list of experiment objects that have the 'cycleLR' tag assigned
cycleLR_experiments = project.get_experiments(tag='cycleLR')

First, get the correct project, then run get_experiments() with the appropriate parameters. sophia_experiments and cycleLR_experiments are lists of neptune.experiments.Experiment objects. You can use them either to download data from experiments or to update them:

  • For updating check this guide.

  • For downloading continue reading this page.

Download experiment dashboard as DataFrame

Let’s download the filtered experiments dashboard view as a Pandas DataFrame, using get_leaderboard().

import neptune

# Get project
project = neptune.init('my_workspace/my_project')

# Get dashboard with experiments contributed by 'sophia'
sophia_df = project.get_leaderboard(owner='sophia')

# Get another dashboard with experiments tagged 'cycleLR'
cycleLR_df = project.get_leaderboard(tag='cycleLR')

First, get the correct project, then run get_leaderboard() with the appropriate parameters. sophia_df and cycleLR_df are pandas DataFrames in which each row is an experiment and the columns hold the system properties, metrics, text logs, parameters and properties of these experiments. For metrics, the latest value is returned.

Note that prefixes are added to metrics, parameters and properties:

  • channel_ for metrics and text logs, for example: channel_epoch/accuracy

  • parameter_ for example: parameter_optimizer

  • property_ for example: property_test_images_version

Example dataframe will look like this:

Example downloaded dashboard data
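Because every metric, parameter and property column carries a prefix, the dashboard DataFrame can be split into groups by column name. The sketch below illustrates this on a small stand-in DataFrame with made-up values; the helper columns_with_prefix() is a hypothetical name, not part of neptune-client.

```python
import pandas as pd

# Stand-in for the DataFrame returned by get_leaderboard(); values are made up.
leaderboard = pd.DataFrame({
    "id": ["SHOW-1", "SHOW-2"],
    "owner": ["sophia", "sophia"],
    "channel_epoch/accuracy": [0.91, 0.87],
    "parameter_optimizer": ["Adam", "SGD"],
    "property_test_images_version": ["v1", "v2"],
})


def columns_with_prefix(df, prefix):
    """Return the columns whose names start with `prefix`, with the prefix removed."""
    selected = df.loc[:, df.columns.str.startswith(prefix)]
    return selected.rename(columns=lambda name: name[len(prefix):])


metrics = columns_with_prefix(leaderboard, "channel_")
parameters = columns_with_prefix(leaderboard, "parameter_")
print(list(metrics.columns))
print(list(parameters.columns))
```

Stripping the prefix is optional; keeping it avoids name clashes when a metric and a parameter share a name.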

Note

To download only experiments that you want, you can filter them by id, state, owner, tag and min_running_time. Check get_leaderboard() documentation for details.
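Several of these filters can be combined in one call. The following is a minimal sketch, assuming the filters are passed as keyword arguments with the names listed above; the helper name fetch_finished_runs(), the 'succeeded' state value and the unit of min_running_time (seconds) are assumptions, so check the get_leaderboard() documentation before relying on them.

```python
def fetch_finished_runs(project, owner, tag, min_seconds):
    """Download dashboard rows for one owner and tag that ran at least `min_seconds`.

    `project` is the object returned by neptune.init(). The 'succeeded' state
    value and the unit of min_running_time are assumptions made for this sketch.
    """
    return project.get_leaderboard(
        state="succeeded",
        owner=owner,
        tag=tag,
        min_running_time=min_seconds,
    )


# Example call (needs a real project object):
# df = fetch_finished_runs(project, owner='sophia', tag='cycleLR', min_seconds=600)
```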

Downloading from the experiment

On this level you can use all methods that get/download data from the neptune.experiments.Experiment object. Three types of data are especially useful: metrics, artifacts and source code.

The first step in all cases is to get the Experiment object.

import neptune

# Get project
project = neptune.init('my_workspace/my_project')

# Get experiment object for appropriate experiment, here 'SHOW-2066'
my_exp = project.get_experiments(id='SHOW-2066')[0]

Have a look at this section about updating experiments to learn more about it.

Here, my_exp is a neptune.experiments.Experiment object that is used in the following sections on downloading metrics, artifacts and source code.
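Indexing the result with [0] raises a bare IndexError when no experiment matches the id. A small wrapper can make that failure easier to read. This is only a sketch: get_experiment_by_id() is a hypothetical helper, and it assumes get_experiments() returns an empty list when nothing matches.

```python
def get_experiment_by_id(project, experiment_id):
    """Return the single experiment with the given id, or raise a clear error."""
    matches = project.get_experiments(id=experiment_id)
    if not matches:
        raise LookupError(
            "No experiment with id {!r} in this project".format(experiment_id)
        )
    return matches[0]


# Example call (needs a real project object):
# my_exp = get_experiment_by_id(project, 'SHOW-2066')
```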

Metrics

You can download metrics data as a pandas DataFrame.

# 'my_exp' is the Experiment object
data = my_exp.get_numeric_channels_values('epoch/accuracy', 'epoch/loss')

get_numeric_channels_values() accepts one or more metric names as separate arguments. data is a pandas DataFrame with the metrics data.

You can also use get_logs() to see the names of all logs (metrics, text, images) in the experiment.

# 'my_exp' is the Experiment object
print(my_exp.get_logs().keys())

Result looks like this:

Example logs names printed in notebook

Note

It’s a good idea to fetch together metrics that share a common temporal pattern (such as iteration or batch/epoch number). That way, each row of the returned DataFrame holds metrics from the same moment in the experiment. For example, combine epoch metrics into one DataFrame and batch metrics into another.
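One way to follow this advice is to group metric names by how often they are logged and fetch each group separately. The helper below is a hypothetical sketch, and 'batch/loss' is an invented metric name used only for illustration.

```python
def fetch_metrics_by_cadence(experiment, metrics_by_cadence):
    """Fetch one DataFrame per logging cadence.

    `metrics_by_cadence` maps a label such as 'epoch' or 'batch' to the list
    of metric names logged at that cadence. Both the function and the mapping
    are hypothetical helpers for this sketch.
    """
    return {
        cadence: experiment.get_numeric_channels_values(*names)
        for cadence, names in metrics_by_cadence.items()
    }


# Example call (needs a real Experiment object):
# frames = fetch_metrics_by_cadence(my_exp, {
#     'epoch': ['epoch/accuracy', 'epoch/loss'],
#     'batch': ['batch/loss'],  # invented name
# })
# frames['epoch'] and frames['batch'] are then separate DataFrames.
```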

Files

Download files from the experiment. Any file that is logged to the artifacts section can be downloaded.

There are two methods for this: download_artifact(), used below for a single file, and download_artifacts(), used below for a whole directory:

# Download csv file
my_exp.download_artifact('aux_data/preds_test.csv', 'data/')

# Download all model checkpoints to the cwd
my_exp.download_artifacts('model_checkpoints/')
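Once a CSV artifact is on disk, it can be loaded for analysis. The sketch below assumes the file is saved in the destination directory under its base name (so 'aux_data/preds_test.csv' downloaded to 'data/' ends up at 'data/preds_test.csv'); that location is an assumption, and load_downloaded_csv() is a hypothetical helper.

```python
import os

import pandas as pd


def load_downloaded_csv(artifact_path, destination_dir="."):
    """Read a CSV artifact after it has been downloaded.

    Assumes the file was saved into `destination_dir` under its base name.
    """
    local_path = os.path.join(destination_dir, os.path.basename(artifact_path))
    return pd.read_csv(local_path)


# Example call, after my_exp.download_artifact('aux_data/preds_test.csv', 'data/'):
# preds = load_downloaded_csv('aux_data/preds_test.csv', 'data/')
```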

Source code

Download the source code used in the experiment as a ZIP archive.

# Download all sources to the cwd
my_exp.download_sources()
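Since the sources arrive as a ZIP archive, they need to be unpacked before they can be run again. Here is a minimal sketch using the standard library; the archive file name is not stated on this page, so the path is passed in explicitly, and extract_sources() is a hypothetical helper.

```python
import zipfile


def extract_sources(archive_path, target_dir):
    """Unpack a downloaded source archive and return the names of its files."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Example call; replace the first argument with the path of the downloaded ZIP:
# files = extract_sources(path_to_downloaded_zip, 'sources/')
```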

Note

You can also download source directly from the UI: here is how.

More options

Besides the metrics, artifacts and source code covered above, other methods are available as well. The Experiment reference lists all methods that download data.

Combining downloading methods

You can combine several download methods to build custom visualizations or analyses. The example below uses get_experiments(), get_numeric_channels_values() and the seaborn library to overlay a metric from multiple experiments on the same plot.

Get a list of Experiment objects.

import neptune

# Set project
project = neptune.init('my_workspace/my_project')

# Get list of experiments
experiments = project.get_experiments(owner='...', tag='...')

Download metrics data from all experiments in the list, by using get_numeric_channels_values()

import pandas as pd

# Collect one DataFrame of metrics per experiment, tagged with the experiment id
frames = []
for experiment in experiments:
    df = experiment.get_numeric_channels_values('epoch_accuracy', 'epoch_loss', 'learning_rate')
    df.insert(loc=0, column='id', value=experiment.id)
    frames.append(df)
# DataFrame.append() was removed in pandas 2.0, so concatenate instead
metrics_df = pd.concat(frames, sort=True)

metrics_df will look like this:

Metrics dataframe

Make seaborn plot

import seaborn as sns

# Prepare dataframe
metrics_df.sort_values(by='epoch_accuracy', ascending=False, inplace=True)
...

# Make seaborn plot
g = sns.relplot(x='x', y='epoch_accuracy', data=metrics_df)

The result will look like this:

Metrics plotted in single chart
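Besides plotting, the combined DataFrame also lends itself to a per-experiment summary. The sketch below runs on a small stand-in for metrics_df with made-up ids and values; the summary column names best_accuracy and lowest_loss are chosen only for illustration.

```python
import pandas as pd

# Stand-in for metrics_df built above; ids and values are made up.
metrics_df = pd.DataFrame({
    "id": ["EXP-1", "EXP-1", "EXP-2", "EXP-2"],
    "x": [0, 1, 0, 1],
    "epoch_accuracy": [0.70, 0.82, 0.65, 0.90],
    "epoch_loss": [0.9, 0.5, 1.1, 0.4],
})

# Best accuracy and lowest loss reached by each experiment, best run first
summary = (
    metrics_df.groupby("id")
    .agg(
        best_accuracy=("epoch_accuracy", "max"),
        lowest_loss=("epoch_loss", "min"),
    )
    .sort_values("best_accuracy", ascending=False)
)
print(summary)
```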

Downloading from Neptune UI

You can download experiment data directly from the UI to your local machine. Check downloading from the UI documentation page for details.

Downloading from Jupyter Notebook

You can download a notebook checkpoint directly from Neptune to the Jupyter or JupyterLab interface. See the downloading checkpoint documentation for details.