Project

A Project object represents a Neptune project. You can use it to retrieve information about the runs, models, and model versions within that project. You can also store and retrieve project-level metadata, such as information about datasets, links to documentation, and key project metrics.

[] (field lookup)

You can access any project's field through a dict-like field lookup project[field_path].
This way you can track metadata:

project["general/brief"] = URL_TO_PROJECT_BRIEF
project["general/data_analysis"].upload("data_analysis.ipynb")

project["dataset/v0.1"].track_files("s3://datasets/images")
project.wait()
project["dataset/latest"] = project["dataset/v0.1"].fetch()

You can also fetch already tracked metadata, for example to keep a single source of truth when starting a new run:

run = neptune.init_run()
run["dataset"] = project["dataset/latest"].fetch()
project["dataset/latest"].fetch()

Returns

The returned type depends on the field's type and on whether a field is stored under the given path:
Field | Returns
The field exists. | An object matching the field's type
The field does not exist. | Handler object
The path is a namespace, e.g. train when a field train/acc exists. | Namespace handler object

=

Convenience alias for .assign().

.assign()

Assign values to multiple fields from a dictionary. You can use this method to log multiple pieces of information with one command.
Parameters
value
(dict) - A dictionary with values to assign, where keys become the paths of the fields. The dictionary can be nested, in which case each path is the combination of all keys leading to the value.
wait
(Boolean, optional, default is False) - If True, the client first waits to send all tracked metadata to the server. This makes the call synchronous; see the Connection modes guide.

Examples

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")

# Assign multiple fields from a dictionary
general_info = {"brief": URL_TO_PROJECT_BRIEF, "deadline": "2049-06-30"}
project["general"] = general_info

# You can always explicitly log parameters one by one
project["general/brief"] = URL_TO_PROJECT_BRIEF
project["general/deadline"] = "2049-06-30"

# Dictionaries can be nested
general_info = {"brief": {"url": URL_TO_PROJECT_BRIEF}}
project["general"] = general_info
# This will log the url under path "general/brief/url"
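
To make the path-combination rule for nested dictionaries concrete, the following sketch flattens a nested dictionary into slash-separated paths in plain Python. It only mimics the documented behavior for illustration: flatten_to_paths is a hypothetical helper, not part of the Neptune API, and the URL is a placeholder.

```python
def flatten_to_paths(values, prefix=""):
    """Return a flat {path: value} dict built from a possibly nested dict."""
    flat = {}
    for key, value in values.items():
        # Join the parent path and the current key with a slash
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # A nested dict acts as a namespace: recurse into it
            flat.update(flatten_to_paths(value, path))
        else:
            flat[path] = value
    return flat


# Placeholder values, assumed for illustration only
general_info = {
    "brief": {"url": "https://example.com/brief"},
    "deadline": "2049-06-30",
}
paths = flatten_to_paths(general_info, prefix="general")
for path, value in paths.items():
    print(path, "=", value)
```

Each printed path corresponds to the field that an assignment such as project["general"] = general_info would be expected to create.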

.print_structure()

Pretty prints the structure of the project's metadata. Paths are ordered lexicographically and the whole structure is neatly colored.
See also: .get_structure().

.get_structure()

Returns a project's metadata structure in the form of a dictionary.
This method can be used to traverse the project's metadata structure programmatically when using Neptune in automated workflows.
The returned object is a shallow copy of an internal structure. Modifying it may cause tracking to malfunction.

Returns

dict with the project's metadata structure.
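
As a rough illustration of programmatic traversal, the sketch below walks a nested dictionary read-only and lists the path of every leaf. It assumes that namespaces appear as nested dictionaries and that any non-dict value is a field; sample_structure is a hand-written stand-in for the result of project.get_structure(), and iter_field_paths is a hypothetical helper.

```python
def iter_field_paths(structure, prefix=""):
    """Yield the slash-separated path of every leaf in a nested dict."""
    for name in sorted(structure):
        node = structure[name]
        path = f"{prefix}/{name}" if prefix else name
        if isinstance(node, dict):
            # Assumed: a nested dict is a namespace, so descend into it
            yield from iter_field_paths(node, path)
        else:
            # Assumed: anything else is a field
            yield path


# Stand-in for project.get_structure(); the leaf values are placeholders
sample_structure = {
    "general": {"brief": "<field>", "deadline": "<field>"},
    "dataset": {"v0.1": "<field>", "latest": "<field>"},
}
field_paths = list(iter_field_paths(sample_structure))
print(field_paths)
```

The helper never writes to the dictionary it receives, which matters here because the structure returned by .get_structure() should not be modified.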

del

Completely removes the field or whole namespace stored under the path, along with all data associated with it. See also .pop().

Examples

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")

# Delete a field with path "datasets/v0.4"
del project["datasets/v0.4"]

# You can also delete a whole namespace
del project["datasets"]

.pop()

Completely removes the field or whole namespace stored under the path, along with all data associated with it. See also del.
Parameters
path
(str) - Path of the field or namespace to be removed.
wait
(Boolean, optional, default is False) - If True, the client first waits to send all tracked metadata to the server. This makes the call synchronous; see the Connection modes guide.

Examples

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")

# Delete a field along with its data
project.pop("datasets/v0.4")

# .pop() can be invoked directly on fields and namespaces

# The following line
project.pop("datasets/v0.4")
# is equivalent to this line
project["datasets/v0.4"].pop()
# or this line
project["datasets"].pop("v0.4")

# You can also delete a whole namespace in one call
project["datasets"].pop()

.exists()

Checks if there is any field or namespace under the specified path.
Note that this method checks the local representation of the project. The field may have been created by another process (use .sync() to synchronize the local representation), or the metadata may not have reached the Neptune servers yet, in which case it cannot be fetched (use .wait() to wait for all tracking calls to finish).
Parameters
path
(str) - The path to check for the existence of a field or a namespace.

Examples

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")

# If an old dataset exists, remove it
if project.exists("dataset/v0.4"):
    del project["dataset/v0.4"]

When working in the asynchronous (default) mode, remember that metadata you track may not be immediately available to fetch from the server, even if it appears in the local representation. To avoid this, call .wait() before fetching.

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")

project["general/brief"] = URL_TO_PROJECT_BRIEF

# The path exists in the local representation
if project.exists("general/brief"):
    # However, the tracking call may not have reached Neptune servers yet
    project["general/brief"].fetch()  # Error - the field does not exist
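
One way to avoid the failure shown above is to call .wait() between the local existence check and the fetch. The helper below is a minimal sketch of that pattern; fetch_if_tracked is a hypothetical name, and the function relies only on the .exists(), .wait(), and field .fetch() calls described on this page, so it works with any object that offers them.

```python
def fetch_if_tracked(project, path):
    """Return the value stored at `path`, or None if the path is not tracked locally.

    `project` is expected to behave like a Neptune Project object.
    """
    # .exists() only inspects the local representation
    if not project.exists(path):
        return None
    # Block until all pending tracking calls have finished
    project.wait()
    # The field should now be available on the server side
    return project[path].fetch()
```

A call could then look like brief = fetch_if_tracked(project, "general/brief"). Note that waiting makes this step synchronous, so it is best used where the fetched value is actually needed.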

.stop()

Stops the connection to the project and kills the synchronization thread. .stop() is called automatically when the script that initialized the connection finishes or when execution exits the Neptune context. When using Neptune with Jupyter notebooks, it is good practice to stop the connection manually, because otherwise it is stopped only when the Jupyter kernel stops.
Parameters
seconds
(int or float, optional, default is None) - The method waits for the specified time for all tracking calls to finish before stopping the connection. If None, it waits for all tracking calls to finish.

Example

If you are initializing the connection from a script you don't need to call .stop():

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")

[...]  # Your code

# If you are executing a Python script, .stop()
# is automatically called at the end for every Neptune object

If you are initializing multiple connections from one script, it is good practice to .stop() the connections you no longer need. You can also use context managers, in which case Neptune automatically calls .stop() when execution exits the Project context:

import neptune.new as neptune

# If you are initializing multiple connections from the same script,
# stop each connection manually once it is no longer needed
for project_name in projects:
    project = neptune.init_project(name=project_name)
    [...]  # Your code
    project.stop()

# You can also use the with statement and a context manager
for project_name in projects:
    with neptune.init_project(name=project_name) as project:
        [...]  # Your code
        # .stop() is automatically called
        # when code execution exits the with statement

If you are using Jupyter notebooks for connecting to a project you need to manually invoke .stop() once the connection is not needed.
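
In a notebook cell, one possible way to make sure the connection is always closed, even when the cell raises an exception, is a try/finally wrapper. The sketch below assumes only the .stop(seconds=...) call documented above; run_and_stop and the work callable are hypothetical names introduced for illustration.

```python
def run_and_stop(project, work, seconds=None):
    """Call work(project) and always stop the connection afterwards.

    `project` is expected to behave like a Neptune Project object;
    `work` is any callable that takes the project as its only argument.
    """
    try:
        return work(project)
    finally:
        # Runs whether work() returned normally or raised an exception
        project.stop(seconds=seconds)
```

For example, run_and_stop(project, lambda p: p["metrics"].fetch(), seconds=30) would fetch the metrics and then wait at most 30 seconds for pending tracking calls before stopping the connection.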

.fetch()

Fetch values of all non-File Atom fields as a dictionary.
The result will preserve the hierarchical structure of the project's metadata but will contain only non-File Atom fields.

Returns

dict containing the values of all non-File Atom fields.

Examples

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")

# Fetch all the project metrics
project_metrics = project["metrics"].fetch()

.get_url()

Returns a URL of the project in Neptune.

Returns

str with the URL of the project in Neptune

.fetch_runs_table()

Retrieve runs matching the specified criteria.
All parameters are optional, each of them specifies a single criterion. Only runs matching all of the criteria will be returned.
Due to a technical limitation, only the first 10,000 runs matching the criteria are fetched.
Parameters
id
(str or list of str, optional, default is None) - A run's ID like "SAN-1" or a list of IDs like ["SAN-1", "SAN-2"]. Matching any element of the list is sufficient to pass the criterion.
state
(str or list of str, optional, default is None) - A run's state like "active" or a list of states like ["inactive", "active"]. Possible values: "inactive", "active". Matching any element of the list is sufficient to pass the criterion.
owner
(str or list of str, optional, default is None) - Username of the run's owner (the user who created the tracked run) like "josh" or a list of owners like ["frederic", "josh"]. Matching any element of the list is sufficient to pass the criterion.
tag
(str or list of str, optional, default is None) - A run's tag like "lightGBM" or a list of tags like ["pytorch", "cycleLR"]. Only runs that have all specified tags match this criterion.

Returns

A RunsTable object containing runs matching the specified criteria. Use .to_pandas() to convert it to a pandas DataFrame.

Examples

import neptune.new as neptune

# Fetch project 'jackie/sandbox'
project = neptune.get_project(name='jackie/sandbox')

# Fetch all runs' metadata as a pandas DataFrame
runs_table_df = project.fetch_runs_table().to_pandas()

# Sort runs by creation time, newest first
runs_table_df = runs_table_df.sort_values(by='sys/creation_time', ascending=False)

# Extract the ID of the most recent run
last_run_id = runs_table_df['sys/id'].values[0]

# You can also filter the runs table by state, owner, tag, or a combination of these

# Fetch only inactive runs
runs_table_df = project.fetch_runs_table(state='inactive').to_pandas()

# Fetch only runs created by CI service
runs_table_df = project.fetch_runs_table(owner='my_company_ci_service').to_pandas()

# Fetch only runs that have both 'Exploration' and 'Optuna' tag
runs_table_df = project.fetch_runs_table(tag=['Exploration', 'Optuna']).to_pandas()

# You can combine conditions. Runs satisfying all conditions will be fetched
runs_table_df = project.fetch_runs_table(state='inactive', tag='Exploration').to_pandas()
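
The matching rules described for the parameters (any element for id, state, and owner; all elements for tag; all criteria combined) can be spelled out in plain Python. The sketch below is only an illustration of those semantics on hand-written sample data, not how Neptune implements the query; matches, _as_list, and the run dictionaries are hypothetical.

```python
def _as_list(value):
    """Wrap a single string in a list so str and list inputs are handled alike."""
    return [value] if isinstance(value, str) else list(value)


def matches(run, id=None, state=None, owner=None, tag=None):
    """Return True if `run` (a plain dict) passes every criterion that is given."""
    # id, state, owner: matching ANY listed value is sufficient
    if id is not None and run["id"] not in _as_list(id):
        return False
    if state is not None and run["state"] not in _as_list(state):
        return False
    if owner is not None and run["owner"] not in _as_list(owner):
        return False
    # tag: the run must carry ALL listed tags
    if tag is not None and not set(_as_list(tag)) <= set(run["tags"]):
        return False
    return True


# Sample data, assumed for illustration only
runs = [
    {"id": "SAN-1", "state": "inactive", "owner": "josh", "tags": ["pytorch", "cycleLR"]},
    {"id": "SAN-2", "state": "active", "owner": "frederic", "tags": ["pytorch"]},
    {"id": "SAN-3", "state": "inactive", "owner": "josh", "tags": ["lightGBM"]},
]

both_tags = [r["id"] for r in runs if matches(r, state="inactive", tag=["pytorch", "cycleLR"])]
either_owner = [r["id"] for r in runs if matches(r, owner=["frederic", "josh"], tag="pytorch")]
print(both_tags)
print(either_owner)
```

The parameter names mirror those of .fetch_runs_table() for readability, which is why id shadows the Python built-in inside the helper.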

.wait()

Wait for all the tracking calls to finish.
Parameters
disk_only
(Boolean, optional, default is False) - If True, the process only waits for data to be saved locally from memory, and does not wait for it to reach Neptune servers.

.sync()

Synchronizes the local representation of the project with Neptune servers.
Parameters
wait
(Boolean, optional, default is True) - If True, the client first waits to send all tracked metadata to the server. This makes the call synchronous; see the Connection modes guide.

RunsTable

An interim object containing the fetched runs' metadata. To access the data, convert it to a pandas DataFrame by invoking .to_pandas().

.to_pandas()

Converts RunsTable data to a Pandas DataFrame object.

Returns

RunsTable data in the form of pandas.DataFrame.

Examples

import neptune.new as neptune

# Fetch project 'jackie/sandbox'
project = neptune.get_project(name='jackie/sandbox')

# Fetch all runs' metadata as a pandas DataFrame
runs_table_df = project.fetch_runs_table().to_pandas()

# Sort runs by creation time, newest first
runs_table_df = runs_table_df.sort_values(by='sys/creation_time', ascending=False)

# Extract the ID of the most recent run
last_run_id = runs_table_df['sys/id'].values[0]

# You can also filter the runs table by state, owner, tag, or a combination of these

# Fetch only inactive runs
runs_table_df = project.fetch_runs_table(state='inactive').to_pandas()

# Fetch only runs created by CI service
runs_table_df = project.fetch_runs_table(owner='my_company_ci_service').to_pandas()

# Fetch only runs that have both 'Exploration' and 'Optuna' tag
runs_table_df = project.fetch_runs_table(tag=['Exploration', 'Optuna']).to_pandas()

# You can combine conditions. Runs satisfying all conditions will be fetched
runs_table_df = project.fetch_runs_table(state='inactive', tag='Exploration').to_pandas()