download_files()
Downloads files from the specified Neptune experiments or runs.
Parameters
files
– Which files to download, specified using the file fetching methods:
- For files logged as a series, use fetch_series() to specify the content containing the files and pass the output to the files argument.
- For individually assigned files, use the output of fetch_experiments_table().
You can also pass a reference to a single File object or an Iterable of them.
destination
– Path to where the files should be downloaded. Can be relative or absolute. If None, the files are downloaded to the current working directory (CWD).
The Neptune project isn't specified directly in the function call, because the project is encoded in the files argument.
For how to set the project manually, see the example.
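To make the destination rules concrete, here is a minimal sketch of how a destination value could be turned into an absolute directory. The helper name resolve_destination is hypothetical and not part of neptune_query. It only mirrors the documented behavior (None means the current working directory, relative paths are allowed) and does not reproduce Neptune's own code.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_destination(destination: Optional[Union[str, Path]] = None) -> Path:
    """Hypothetical helper: return the absolute directory a destination refers to."""
    if destination is None:
        # Documented default: the current working directory (CWD).
        return Path.cwd()
    # A relative path is interpreted against the CWD; an absolute path is kept.
    return Path(destination).resolve()


default_dir = resolve_destination()
relative_dir = resolve_destination("downloads")
```

Resolving the path up front is one way to see exactly where files would land before starting a download.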
Returns
pandas.DataFrame
– A DataFrame mapping experiments/runs and attributes to the paths of downloaded files. The DataFrame has a MultiIndex with:
- Index: ["experiment", "step"] for experiments or ["run", "step"] for runs.
  - For individually assigned files, the step is NaN.
- Columns: Single-level index with attribute names.
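As a rough illustration of the structure described above, the following sketch builds a small stand-in DataFrame by hand and reads paths from it with standard pandas indexing. It doesn't call Neptune, and the variable names and paths are made up for the example. The real return value may differ in details such as dtypes.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by download_files().
# Nothing is downloaded here; the paths are illustrative only.
index = pd.MultiIndex.from_tuples(
    [("seabird-4", 1.0), ("seabird-4", 2.0), ("seabird-5", 1.0)],
    names=["experiment", "step"],
)
downloaded = pd.DataFrame(
    {
        "predictions": [
            "downloads/seabird-4/predictions/step_1_000000.png",
            "downloads/seabird-4/predictions/step_2_000000.png",
            "downloads/seabird-5/predictions/step_1_000000.png",
        ]
    },
    index=index,
)

# All paths of one experiment, as a Series indexed by step.
seabird_4_paths = downloaded.xs("seabird-4", level="experiment")["predictions"]

# The path for a single (experiment, step) pair and one attribute.
single_path = downloaded.loc[("seabird-4", 2.0), "predictions"]

# A flat list of every downloaded path for one attribute, skipping missing values.
all_paths = downloaded["predictions"].dropna().tolist()
```

For runs, the same pattern would apply with "run" in place of "experiment" as the first index level.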
Raises
NeptuneUserError
– If files don't have associated identifiers:
- experiment names, if fetching from experiments
- run IDs, if fetching from runs
This indicates that the wrong API was used. Make sure to import and use the methods from the correct module:
- When targeting experiments, use the neptune_query module to fetch experiments and download files.
- When targeting runs, use the neptune_query.runs module to fetch runs and download files.
Constructing the destination path
Files are downloaded to the following directory:
<destination>/<experiment_name>/<attribute_path>/<file_name>
Note that:
- The directory specified with the destination parameter requires write permissions.
- If the experiment name or an attribute path includes slashes (/), each element that follows a slash is treated as a subdirectory.
- The directory and subdirectories are automatically created if they don't already exist.
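The layout above can be sketched with pathlib. The helper expected_download_path is hypothetical and only mirrors the documented pattern. It is not Neptune's implementation, and the experiment name and attribute path in the usage lines are invented to show how slashes turn into subdirectories.

```python
from pathlib import Path


def expected_download_path(
    destination: str,
    experiment_name: str,
    attribute_path: str,
    file_name: str,
) -> Path:
    """Hypothetical helper: build <destination>/<experiment_name>/<attribute_path>/<file_name>."""
    # Each slash-separated element of the experiment name or attribute path
    # becomes its own directory level.
    return Path(destination).joinpath(
        *experiment_name.split("/"),
        *attribute_path.split("/"),
        file_name,
    )


flat = expected_download_path(
    "downloads", "seabird-4", "predictions", "step_1_000000.png"
)
nested = expected_download_path(
    "downloads", "team/seabird-4", "predictions/val", "step_1_000000.png"
)
print(flat.as_posix())    # downloads/seabird-4/predictions/step_1_000000.png
print(nested.as_posix())  # downloads/team/seabird-4/predictions/val/step_1_000000.png
```

A helper like this can be handy for predicting where a file will land, for example to check whether it already exists before downloading.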
Example
Specify files from a given step range of a series:
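import neptune_query as nq

# Select the series files logged under "predictions/" for steps 1 to 3
interesting_files = nq.fetch_series(
    project="team-alpha/project-x",
    experiments=["seabird-4", "seabird-5"],
    attributes=r"^predictions/",
    step_range=(1.0, 3.0),
)

# No destination is given, so files go to the current working directory
nq.download_files(files=interesting_files)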
attribute                                                                       predictions
experiment  step
seabird-4   1.0   /home/sigurd/project-x/downloads/seabird-4/predictions/step_1_000000.png
            2.0   /home/sigurd/project-x/downloads/seabird-4/predictions/step_2_000000.png
            3.0   /home/sigurd/project-x/downloads/seabird-4/predictions/step_3_000000.png
seabird-5   1.0   /home/sigurd/project-x/downloads/seabird-5/predictions/step_1_000000.png
            2.0   /home/sigurd/project-x/downloads/seabird-5/predictions/step_2_000000.png
            3.0   /home/sigurd/project-x/downloads/seabird-5/predictions/step_3_000000.png
Specify individually assigned files:
interesting_files = nq.fetch_experiments_table(
    project="team-alpha/project-x",
    experiments=["seabird-4", "seabird-5"],
    attributes=r"sample | labels",
)

nq.download_files(files=interesting_files)
attribute data sample labels
experiment step
seabird-4 NaN .../downloads/seabird-4/sample.csv .../downloads/seabird-4/labels.json
seabird-5 NaN .../downloads/seabird-5/sample.csv .../downloads/seabird-5/labels.json
Download from runs
To target individual runs by ID instead of experiment name, import the runs API:
import neptune_query.runs as nq_runs
Then call the corresponding querying method and replace the experiments parameter with runs:
interesting_files = nq_runs.fetch_runs_table(
    project="team-alpha/project-x",
    runs=["kittiwake-i2ufv"],  # run ID
    attributes=r"sample | labels",
)

nq_runs.download_files(files=interesting_files)