Downloading experiment data¶
Almost all data that is logged to the project (experiments and notebooks) can be downloaded to the local machine. You may want to do it for a variety of reasons:
Build custom analysis or visualizations using experiment data.
Use saved model checkpoint elsewhere.
Get source code of experiment and run it again.
Build report that uses data across projects.
Archive old project.
There are three ways to download data from Neptune:
Programmatically, by using neptune-client: for example downloading experiment dashboard as pandas DataFrame. Check download programmatically below.
Directly from the UI: for example downloading notebook checkpoint or experiments dashboard as csv. Check downloading from Neptune UI below.
From the JupyterLab interface: Check downloading checkpoint documentation.
You can download experiment data programmatically.
The snippet below shows how to download the experiment dashboard as a pandas DataFrame. It is a good illustration of the overall idea behind downloading data from Neptune.
To download experiments dashboard data as a pandas DataFrame, use get_leaderboard():
    import neptune

    # Get project
    project = neptune.init('my_workspace/my_project')

    # Download experiments dashboard as pandas DataFrame
    data = project.get_leaderboard()
    data.head()
data is a pandas DataFrame, where each row is an experiment and columns represent all system properties, metrics and text logs, parameters and properties in these experiments. For metrics, the latest value is returned.
Downloading from the project¶
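On the level of the project, you can do two major actions:
Fetch a list of Experiment objects that match some criteria.
Download the experiments dashboard as a pandas DataFrame.
See the Project reference for other related methods.
Let’s fetch a list of Experiment objects that match some criteria.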
    import neptune

    # Get project
    project = neptune.init('my_workspace/my_project')

    # Get list of experiment objects created by 'sophia'
    sophia_experiments = project.get_experiments(owner='sophia')

    # Get another list of experiment objects that have 'cycleLR' assigned
    cycleLR_experiments = project.get_experiments(tag='cycleLR')
First, get the correct project, then simply run get_experiments() with the appropriate parameters.
sophia_experiments and cycleLR_experiments are lists of neptune.experiments.Experiment objects. You can use them either to download data from experiments or to update them:
For updating check this guide.
For downloading continue reading this page.
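As a minimal sketch of working with such a list, the helper below maps each experiment's id to its tags. It relies only on the id attribute and the get_tags() method that appear on this page; the helper name tags_by_experiment is made up for illustration and is not part of neptune-client.

```python
def tags_by_experiment(experiments):
    """Map experiment id -> sorted list of tags for a list of Experiment objects.

    'tags_by_experiment' is a hypothetical helper, not a neptune-client function.
    """
    return {exp.id: sorted(exp.get_tags()) for exp in experiments}


# Usage, assuming 'project' was obtained with neptune.init(...):
# print(tags_by_experiment(project.get_experiments(tag='cycleLR')))
```

Because the helper only reads data, it is safe to run against any list returned by get_experiments().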
Download experiment dashboard as DataFrame¶
Let’s download the filtered experiments dashboard view as a pandas DataFrame, using get_leaderboard():
    import neptune

    # Get project
    project = neptune.init('my_workspace/my_project')

    # Get dashboard with experiments contributed by 'sophia'
    sophia_df = project.get_leaderboard(owner='sophia')

    # Get another dashboard with experiments tagged 'cycleLR'
    cycleLR_df = project.get_leaderboard(tag='cycleLR')
First, get the correct project, then simply run get_leaderboard() with the appropriate parameters.
sophia_df and cycleLR_df are pandas DataFrames where each row is an experiment and columns represent all system properties, metrics and text logs, parameters and properties in these experiments. For metrics, the latest value is returned.
Note that prefixes are added to the names of metrics, parameters and properties. For example, channel_ is prepended to the names of metrics and text logs.
Example dataframe will look like this:
To download only the experiments that you want, pass filters such as owner or tag to get_leaderboard(). See the get_leaderboard() documentation for details.
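To work with only the metric columns of such a DataFrame, one option is to select columns by the channel_ prefix. The sketch below uses a small hand-made DataFrame in place of real get_leaderboard() output; the column names and values in it are hypothetical, and metric_columns is an illustrative helper rather than a Neptune function.

```python
import pandas as pd


def metric_columns(leaderboard, prefix='channel_'):
    """Return only the columns starting with `prefix`, with the prefix stripped."""
    selected = [col for col in leaderboard.columns if col.startswith(prefix)]
    renamed = {col: col[len(prefix):] for col in selected}
    return leaderboard[selected].rename(columns=renamed)


# Hand-made stand-in for a leaderboard; real column names depend on your project.
example = pd.DataFrame({
    'id': ['SHOW-1', 'SHOW-2'],
    'channel_epoch_loss': [0.31, 0.27],
    'channel_epoch_accuracy': [0.88, 0.91],
})
metrics_only = metric_columns(example)
print(metrics_only.columns.tolist())
```

With a real project, the same call would be metric_columns(project.get_leaderboard(owner='sophia')).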
Downloading from the experiment¶
On this level you can use all methods that get or download data from the neptune.experiments.Experiment object. Three types of data are especially useful: metrics, artifacts and source code.
The first step in all cases is to get the experiment object.
    import neptune

    # Get project
    project = neptune.init('my_workspace/my_project')

    # get_experiments() returns a list, so take the single match for 'SHOW-2066'
    my_exp = project.get_experiments(id='SHOW-2066')[0]
my_exp is a neptune.experiments.Experiment object that will be used in the following sections about downloading metrics, artifacts and source code. Have a look at the section about updating experiments to learn more about it.
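Since get_experiments() returns a list, it can help to check that exactly one experiment matched before using the result. The helper below is a sketch of that check; get_single_experiment is a hypothetical name, and the only Neptune call it makes is get_experiments(id=...) as shown above.

```python
def get_single_experiment(project, experiment_id):
    """Return the one experiment with the given id, or raise if it is not unique.

    'get_single_experiment' is an illustrative helper, not part of neptune-client.
    """
    matches = project.get_experiments(id=experiment_id)
    if len(matches) != 1:
        raise LookupError(
            'Expected exactly one experiment with id {!r}, found {}'.format(
                experiment_id, len(matches)))
    return matches[0]


# Usage, assuming 'project' was obtained with neptune.init(...):
# my_exp = get_single_experiment(project, 'SHOW-2066')
```

Raising an explicit error makes a mistyped id easier to spot than a failure in a later call.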
You can download metrics data as pandas DataFrame.
    # 'my_exp' is an Experiment object
    data = my_exp.get_numeric_channels_values('epoch/accuracy', 'epoch/loss')
get_numeric_channels_values() accepts comma separated metric names.
data is a pandas DataFrame with metrics data.
You can also use get_logs() to see the names of all logs (metrics, text, images) in the experiment.
    # 'my_exp' is an Experiment object
    print(my_exp.get_logs().keys())
Result looks like this:
It’s a good idea to fetch together metrics that share a common temporal pattern (such as iteration, batch or epoch number). That way, each row of the returned DataFrame holds metrics from the same moment in the experiment. For example, combine epoch metrics into one DataFrame and batch metrics into another.
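One way to follow this advice is to group metric names by a shared prefix and fetch each group separately. The sketch below assumes names follow the 'epoch/accuracy' convention used in the example above, with the part before '/' indicating the temporal pattern; that convention, and the helper name fetch_by_prefix, are assumptions made for illustration.

```python
from collections import defaultdict


def fetch_by_prefix(experiment, metric_names, sep='/'):
    """Fetch metrics in groups that share a name prefix such as 'epoch' or 'batch'.

    Returns a dict mapping prefix -> result of get_numeric_channels_values().
    A name without the separator forms a group of its own.
    """
    groups = defaultdict(list)
    for name in metric_names:
        prefix = name.split(sep, 1)[0]
        groups[prefix].append(name)
    return {
        prefix: experiment.get_numeric_channels_values(*names)
        for prefix, names in groups.items()
    }


# Usage, assuming 'my_exp' is an Experiment object:
# frames = fetch_by_prefix(my_exp, my_exp.get_logs().keys())
```

Note that get_logs() also lists text and image logs, so in practice the names may need filtering down to numeric metrics first.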
Download files from the experiment. Any file that is logged to the artifacts section can be downloaded.
Notice that there are two methods for this:
download_artifact(): single file download.
download_artifacts(): multiple files download as a ZIP archive.
    # Download csv file
    my_exp.download_artifact('aux_data/preds_test.csv', 'data/')

    # Download all model checkpoints to the cwd
    my_exp.download_artifacts('model_checkpoints/')
Download the source code used in the experiment as a ZIP archive.
    # Download all sources to the cwd
    my_exp.download_sources()
You can also download source directly from the UI: here is how.
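Both download_artifacts() and download_sources() produce ZIP archives, so a common next step is to unpack them. The sketch below uses only the Python standard library. The name of the downloaded archive is not stated on this page, so the path is left as a parameter; extract_archive is an illustrative helper, not a Neptune function.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Unpack a downloaded ZIP archive and return the sorted member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())


# Usage, where 'path/to/archive.zip' stands for whatever file the download produced:
# extract_archive('path/to/archive.zip', 'unpacked/')
```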
Besides metrics, artifacts and scripts covered above, you can use other methods as well. Here is a full list of methods that download data:
get_hardware_utilization(): Gets GPU, CPU and memory utilization data.
get_logs(): Gets all log names with their most recent values for this experiment.
get_numeric_channels_values(): Gets values of specified metrics (numeric logs).
get_parameters(): Gets parameters for this experiment.
get_properties(): Gets user-defined properties for this experiment.
get_system_properties(): Gets experiment properties.
get_tags(): Gets the tags associated with this experiment.
download_artifact(): Download an artifact (file) from the experiment storage.
download_artifacts(): Download a directory or a single file from experiment’s artifacts as a ZIP archive.
download_sources(): Download a directory or a single file from experiment’s sources as a ZIP archive.
get_pickle(): Downloads a pickled artifact (file) from Neptune and returns it as a Python object.
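Several of the methods listed above can be combined to keep a local record of an experiment, for instance before archiving a project. The sketch below writes parameters, properties, tags and system properties to a JSON file. The helper name snapshot_metadata and the file layout are assumptions for illustration; the exact types returned by each method are not described on this page, so default=str is used as a fallback for values that JSON cannot encode.

```python
import json


def snapshot_metadata(experiment, path):
    """Write parameters, properties, tags and system properties to a JSON file.

    'snapshot_metadata' is a hypothetical helper, not part of neptune-client.
    """
    snapshot = {
        'parameters': experiment.get_parameters(),
        'properties': experiment.get_properties(),
        'tags': list(experiment.get_tags()),
        'system_properties': experiment.get_system_properties(),
    }
    with open(path, 'w') as handle:
        # default=str is a fallback for values json cannot encode directly
        json.dump(snapshot, handle, indent=2, sort_keys=True, default=str)
    return snapshot


# Usage, assuming 'my_exp' is an Experiment object:
# snapshot_metadata(my_exp, 'SHOW-2066_metadata.json')
```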
Combining downloading methods¶
You can combine several downloading options to build custom visualizations or analyses. The example below shows how to use get_numeric_channels_values() and the seaborn library to overlay a metric from multiple experiments on the same plot.
Get list of experiments
    import neptune

    # Set project
    project = neptune.init('my_workspace/my_project')

    # Get list of experiments
    experiments = project.get_experiments(owner='...', tag='...')
Download metrics data from all experiments in the list, by using get_numeric_channels_values()
    import pandas as pd

    # Collect metrics from each experiment, tagging every row with the experiment id
    frames = []
    for experiment in experiments:
        df = experiment.get_numeric_channels_values('epoch_accuracy', 'epoch_loss', 'learning_rate')
        df.insert(loc=0, column='id', value=experiment.id)
        frames.append(df)

    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
    metrics_df = pd.concat(frames, sort=True)
metrics_df will look like this:
Make seaborn plot
    import seaborn as sns

    # Prepare dataframe
    metrics_df.sort_values(by='epoch_accuracy', ascending=False, inplace=True)
    ...

    # Make seaborn plot
    g = sns.relplot(x='x', y='epoch_accuracy', data=metrics_df)
The result will look like this:
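Besides plotting, the combined DataFrame can also be summarized per experiment. The sketch below finds the best epoch_accuracy for each experiment id. It runs on a small hand-made DataFrame whose values are invented for illustration, and best_per_experiment is a hypothetical helper name.

```python
import pandas as pd


def best_per_experiment(metrics_df, metric='epoch_accuracy'):
    """Return the highest value of `metric` for each experiment id, best first."""
    best = metrics_df.groupby('id')[metric].max()
    return best.sort_values(ascending=False)


# Hand-made stand-in for metrics_df; the values are invented for illustration.
example = pd.DataFrame({
    'id': ['SHOW-1', 'SHOW-1', 'SHOW-2', 'SHOW-2'],
    'x': [0, 1, 0, 1],
    'epoch_accuracy': [0.70, 0.85, 0.75, 0.90],
})
print(best_per_experiment(example))
```

The same function can be applied to the metrics_df built above, as long as it contains the 'id' column and the chosen metric.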
Downloading from Neptune UI¶
You can download experiment data directly from the UI to your local machine. Check downloading from the UI documentation page for details.