Almost all data that is logged to the project (runs and notebooks) can be downloaded to the local machine. You may want to do it for a variety of reasons:
Build custom analyses or visualizations using run data.
Use a saved model checkpoint elsewhere.
Get the source code of a run and execute it again.
Build a report that uses data across projects.
Archive an old project.
There are three ways to download data from Neptune:
Programmatically, by using neptune-client: for example, downloading the runs dashboard as a pandas DataFrame. See the section on downloading programmatically below.
Directly from the UI: for example, downloading a notebook checkpoint or the runs dashboard as a CSV file. See the section on downloading from the Neptune UI below.
From the JupyterLab interface: see the downloading checkpoint documentation.
You can download a run's data programmatically.
To download the runs table as a pandas DataFrame, use fetch_runs_table() followed by to_pandas():
import neptune.new as neptune

my_project = neptune.get_project("my_workspace/my_project")

# Get dashboard with runs contributed by 'sophia'
sophia_df = my_project.fetch_runs_table(owner='sophia').to_pandas()

# Get another dashboard with runs tagged 'cycleLR'
cycleLR_df = my_project.fetch_runs_table(tag='cycleLR').to_pandas()
cycleLR_df.head()
sophia_df and cycleLR_df are pandas DataFrames in which each row is a run and the columns hold tracked metadata such as metrics, text logs, and parameters. For metrics, the latest value is returned.
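Once the runs table is a DataFrame, ordinary pandas operations apply to it. The sketch below uses a small hand-built DataFrame as a stand-in for the fetched table so that it runs without a Neptune connection. The column names (sys/id, train/accuracy, parameters/batch_size), the values, and the top_runs helper are all made up for illustration; real column names depend on what your runs logged.

```python
import pandas as pd

# Hand-built stand-in for my_project.fetch_runs_table(...).to_pandas().
# Column names and values are hypothetical.
runs_df = pd.DataFrame(
    {
        "sys/id": ["EX-1", "EX-2", "EX-3"],
        "train/accuracy": [0.81, 0.93, 0.88],
        "parameters/batch_size": [32, 64, 32],
    }
)


def top_runs(df, metric_column, n=2):
    """Return the n rows with the highest value in metric_column."""
    return df.sort_values(metric_column, ascending=False).head(n)


best = top_runs(runs_df, "train/accuracy")
print(best[["sys/id", "train/accuracy"]])
```

The same helper should work on sophia_df or cycleLR_df, provided the chosen metric column exists in the table.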
You can fetch a run's data in much the same way as you track it. Consider the following example:
run["parameters/batch_size"].get()  # returns single value e.g. 5
run["train/accuracy"].get_last()  # return the last accuracy value
run["trained_model"].download()  # downloads model file to the current directory
Run fields can be grouped into namespaces to organize them by type, use case, phase, and so on. You can access a namespace the same way you would access any other field, by specifying its path:
# Instead of using full path
batch_size = run["parameters/batch_size"].get()

# You can access a namespace and then a specific field within that namespace
parameters = run["parameters"]
batch_size = parameters["batch_size"].get()
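To make the path semantics concrete, here is a toy model of how a namespace handle and a full path can address the same field. ToyRun and ToyField are hypothetical stand-ins written for this illustration only; they are not Neptune's implementation and only mimic the access pattern shown above.

```python
class ToyField:
    """Stand-in for a field or namespace handle: a path into a flat store."""

    def __init__(self, store, path):
        self._store = store
        self._path = path

    def __getitem__(self, sub_path):
        # Indexing a namespace extends the current path with "/".
        return ToyField(self._store, f"{self._path}/{sub_path}")

    def get(self):
        return self._store[self._path]


class ToyRun:
    """Stand-in for a run: maps full field paths to values."""

    def __init__(self, store):
        self._store = store

    def __getitem__(self, path):
        return ToyField(self._store, path)


toy_run = ToyRun({"parameters/batch_size": 64, "parameters/lr": 0.001})

# Full path and namespace-then-field resolve to the same stored value.
full_path_value = toy_run["parameters/batch_size"].get()
parameters = toy_run["parameters"]
namespaced_value = parameters["batch_size"].get()
```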
You can access most atom fields by using the .get() method:
# Numerical fields
batch_size = run["parameters/batch_size"].get()

# Strings
username = run["sys/owner"].get()

# Datetime
last_updated = run["sys/modification_time"].get()
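When several atom fields are needed at once, for example to build a report row, they can be gathered into a plain dictionary. The sketch below assumes only the run[path].get() call shape shown above; collect_fields is a hypothetical helper, and StubRun and StubField are offline stand-ins so the example runs without a Neptune connection.

```python
def collect_fields(run, paths):
    """Read several atom fields into a dict keyed by field path."""
    return {path: run[path].get() for path in paths}


class StubField:
    """Offline stand-in for an atom field (hypothetical)."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


class StubRun:
    """Offline stand-in for a run object (hypothetical)."""

    def __init__(self, values):
        self._values = values

    def __getitem__(self, path):
        return StubField(self._values[path])


stub_run = StubRun({"parameters/batch_size": 64, "sys/owner": "sophia"})
summary = collect_fields(stub_run, ["parameters/batch_size", "sys/owner"])
```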
Download files stored in File fields by using the .download() method:
# Download example_image to the current directory
run["data/example_image.png"].download()

# Download model to the specified directory
run["trained_model.h5"].download(destination_path)
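When downloading to a specific directory, it can help to create that directory first. The sketch below wraps the run[path].download(destination) call shown above in a hypothetical helper, download_to_dir. StubRun and StubField are offline stand-ins that write a placeholder file, so the example runs without a Neptune connection; with a real run, pass the run object instead.

```python
import tempfile
from pathlib import Path


def download_to_dir(run, field_path, destination_dir):
    """Create destination_dir if needed, then download the field into it."""
    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)
    # Same call shape as in the example above.
    run[field_path].download(str(destination))
    return destination


class StubField:
    """Offline stand-in for a File field (hypothetical)."""

    def __init__(self, name):
        self._name = name

    def download(self, destination):
        (Path(destination) / self._name).write_text("placeholder")


class StubRun:
    """Offline stand-in for a run object (hypothetical)."""

    def __getitem__(self, path):
        return StubField(Path(path).name)


with tempfile.TemporaryDirectory() as tmp:
    target = download_to_dir(
        StubRun(), "data/example_image.png", Path(tmp) / "artifacts" / "images"
    )
    downloaded = sorted(p.name for p in target.iterdir())
```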
You can access numerical and text Series by using the .get_last() method:
# Accessing FloatSeries
final_loss = run["train/loss"].get_last()

# Accessing StringSeries
last_stderr_line = run["sys/stderr"].get_last()
Tags fields can be accessed by using the .get() method, which returns a set of strings:
run_tags = run["sys/tags"].get()
if "exploration" in run_tags:
    print_analysis()
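Because the tags come back as a set, checks involving several tags reduce to ordinary set operations. A minimal sketch, using a literal set as a stand-in for the value returned by run["sys/tags"].get(); has_all_tags is a hypothetical helper, not part of neptune-client.

```python
def has_all_tags(run_tags, required_tags):
    """Return True if every required tag is present in run_tags."""
    return set(required_tags) <= set(run_tags)


# Stand-in for run["sys/tags"].get(); the tag names are illustrative.
run_tags = {"exploration", "cycleLR"}

is_exploration_cycle = has_all_tags(run_tags, ["exploration", "cycleLR"])
is_baseline = has_all_tags(run_tags, ["baseline"])
```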
You can download all files stored in a Files field by using the .download() method:
# Download files to the current directory
run["misclasified_samples"].download()

# You can also download to a specific path
run["source_code/files"].download(path_to_dir)
You can download run data directly from the UI to your local machine. See the downloading from the UI documentation page for details.
You can download a notebook checkpoint directly from Neptune to the Jupyter or JupyterLab interface. See the downloading checkpoint documentation for details.