Project

A Project object represents a Neptune project. You can use it to retrieve information about the Runs, Models, and Model versions within that project. You can also store and retrieve metadata at the project level, such as information about datasets, links to documentation, and key project metrics.
The project object follows the same logic as other Neptune objects: if you assign a new value to an existing field, the new value overwrites the previous one.
In a given project, you always initialize and work with the same project object, so if your team is collaborating on project metadata, take care not to accidentally overwrite each other's entries.
Tip: Recall that the log() method appends the logged value to a series. It works for text strings as well as numerical values.
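The difference between assigning and logging can be pictured with a toy, in-memory model. The sketch below is not Neptune's implementation: ToyFields and its methods are hypothetical names, used only to illustrate the semantics described above, where assigning to an existing path replaces the value and logging appends to a series.

```python
class ToyFields:
    """Hypothetical in-memory stand-in for a metadata container (not Neptune's API)."""

    def __init__(self):
        self._fields = {}

    def assign(self, path, value):
        # Assigning to an existing path replaces the previous value.
        self._fields[path] = value

    def log(self, path, value):
        # Logging appends the value to the series stored under the path.
        self._fields.setdefault(path, []).append(value)

    def fetch(self, path):
        return self._fields[path]


fields = ToyFields()
fields.assign("general/deadline", "2049-06-30")
fields.assign("general/deadline", "2049-12-31")  # overwrites the first value
fields.log("notes", "kickoff done")
fields.log("notes", "dataset frozen")  # appended to the series
```

In this model, the deadline field holds only the second value, while the notes series keeps both entries in the order they were logged.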

[] (field lookup)

You can access any of the project's fields through a dict-like field lookup: project[field_path].
This way you can track metadata:
project["general/brief"] = URL_TO_PROJECT_BRIEF
project["general/data_analysis"].upload("data_analysis.ipynb")
project["dataset/v0.1"].track_files("s3://datasets/images")
project.wait()
project["dataset/latest"] = project["dataset/v0.1"].fetch()
You can also fetch metadata that is already tracked, for example to have a single source of truth when starting a new run:
run = neptune.init_run()
run["dataset"] = project["dataset/latest"].fetch()
project["dataset/latest"].fetch()

Returns

The returned type depends on the field's type and whether a field is stored under the given path
Field
Returns
The field exists.
The field does not exist
Handler object
The path is a namespace e.g. train when a field train/acc exists.

=

Convenience alias for .assign().

.assign()

Assign values to multiple fields from a dictionary. You can use this method to log multiple pieces of information with one command.
Parameters
value
(dict) - A dictionary with values to assign, where keys become the paths of the fields.
The dictionary can be nested, in which case the path is a combination of all the keys.
wait
(Boolean, optional, default is False) - If True, the client first waits to send all tracked metadata to the server. This makes the call synchronous. See the Connection modes guide.
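How nested keys combine into a field path can be sketched in plain Python. The helper below is hypothetical and is not part of the Neptune client. It assumes only what is stated above: keys are joined with "/" to form the path. The URL is a placeholder.

```python
def flatten_to_paths(values, prefix=""):
    """Hypothetical helper: map a nested dict to {"a/b/c": value} pairs."""
    flat = {}
    for key, value in values.items():
        path = f"{prefix}/{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Nested dictionaries contribute their keys to the path.
            flat.update(flatten_to_paths(value, path))
        else:
            flat[path] = value
    return flat


general_info = {
    "brief": {"url": "https://example.com/brief"},  # placeholder URL
    "deadline": "2049-06-30",
}
paths = flatten_to_paths(general_info, prefix="general")
```

Here paths contains the keys "general/brief/url" and "general/deadline", which mirrors the paths you would get by assigning the same dictionary to project["general"].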

Examples

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")
# Assign multiple fields from a dictionary
general_info = {"brief": URL_TO_PROJECT_BRIEF, "deadline": "2049-06-30"}
project["general"] = general_info
# You can always explicitly log parameters one by one
project["general/brief"] = URL_TO_PROJECT_BRIEF
project["general/deadline"] = "2049-06-30"
# Dictionaries can be nested
general_info = {"brief": {"url": URL_TO_PROJECT_BRIEF}}
project["general"] = general_info
# This will log the url under path "general/brief/url"

.print_structure()

Pretty prints the structure of the project's metadata. Paths are ordered lexicographically and the whole structure is neatly colored.
See also: .get_structure().

.get_structure()

Returns the project's metadata structure in the form of a dictionary.
This method can be used to traverse the project's metadata structure programmatically when using Neptune in automated workflows.
The returned object is a shallow copy of an internal structure. Modifying it may cause tracking to malfunction.

Returns

dict with the project's metadata structure.
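Why modifying a shallow copy is risky can be shown with plain dictionaries. The sketch below uses a hypothetical nested dict as a stand-in for the internal structure. The actual values held in the dictionary returned by .get_structure() are not described in this reference, so treat this purely as an illustration of shallow versus deep copying in Python.

```python
import copy

# Hypothetical stand-in for an internal nested structure.
internal = {
    "general": {"brief": "placeholder"},
    "dataset": {"latest": "placeholder"},
}

# A shallow copy creates a new top-level dict, but nested dicts are shared.
shallow = copy.copy(internal)
shallow["general"]["brief"] = "changed"  # also changes `internal`

# A deep copy duplicates the nested dicts, so edits stay local.
independent = copy.deepcopy(internal)
independent["dataset"]["latest"] = "changed"  # `internal` is untouched
```

If you need to manipulate the structure rather than just read it, working on a deep copy is one way to avoid touching shared nested objects.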

del

Completely removes the field or whole namespace stored under the path, along with all data associated with it. See also .pop().

Examples

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")
# Delete a field with path "datasets/v0.4"
del project["datasets/v0.4"]
# You can also delete a whole namespace
del project["datasets"]

.pop()

Completely removes the field or whole namespace stored under the path, along with all data associated with it. See also del.
Parameters
path
(str) - Path of the field or namespace to be removed.
wait
(Boolean, optional, default is False) - If True, the client first waits to send all tracked metadata to the server. This makes the call synchronous. See the Connection modes guide.

Examples

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")
# Delete a field along with its data
project.pop("datasets/v0.4")
# .pop() can be invoked directly on fields and namespaces
# The following line
project.pop("datasets/v0.4")
# is equivalent to this line
project["datasets/v0.4"].pop()
# or this line
project["datasets"].pop("v0.4")
# You can also delete a whole namespace in one call
project["datasets"].pop()

.exists()

Checks if there is any field or namespace under the specified path.
Note that this method checks the local representation of the project. The field may have been created by another process (use .sync() to synchronize the local representation), or the metadata may not have reached the Neptune servers yet, in which case it cannot be fetched (use .wait() to wait for all tracking calls to finish).
Parameters
path
(str) - The path to check for the existence of a field or a namespace.

Examples

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")
# If an old dataset exists, remove it
if project.exists("dataset/v0.4"):
    del project["dataset/v0.4"]
When working in the asynchronous (default) mode, remember that metadata you track may not be immediately available to fetch from the server, even if it appears in the local representation. To make sure it is available, call .wait() before fetching.
import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")
project["general/brief"] = URL_TO_PROJECT_BRIEF
# The path exists in the local representation
if project.exists("general/brief"):
    # However, the tracking call may not have reached the Neptune servers yet
    project["general/brief"].fetch()  # May fail: the field may not exist on the server yet

.stop()

Stops the connection to the project and kills the synchronization thread. .stop() is called automatically when the script that initialized the connection finishes or when exiting the Neptune context. When using Neptune with Jupyter notebooks, it's good practice to stop the connection manually, because otherwise it is stopped only when the Jupyter kernel stops.
Parameters
seconds
(int or float, optional, default is None) - The method waits for the specified time for all tracking calls to finish before stopping the connection. If None, it waits until all tracking calls have finished.

Example

If you are initializing the connection from a script you don't need to call .stop():
import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")
[...] # Your code
# When you execute a Python script, .stop()
# is called automatically at the end for every Neptune object
If you are initializing multiple connections from one script, it is good practice to .stop() the connections you no longer need. You can also use context managers: Neptune automatically calls .stop() when exiting the Project context:
import neptune.new as neptune
# If you are initializing multiple connections from the same script,
# stop each connection manually once it is no longer needed
for project_name in projects:
    project = neptune.init_project(name=project_name)
    [...]  # Your code
    project.stop()
# You can also use a with statement and context manager
for project_name in projects:
    with neptune.init_project(name=project_name) as project:
        [...]  # Your code
        # .stop() is automatically called
        # when code execution exits the with statement
If you are using Jupyter notebooks to connect to a project, you need to invoke .stop() manually once the connection is no longer needed.

.fetch()

Fetch values of all non-File Atom fields as a dictionary.
The result will preserve the hierarchical structure of the project's metadata but will contain only non-File Atom fields.

Returns

dict containing the values of all non-File Atom fields.

Examples

import neptune.new as neptune
project = neptune.init_project(name="MY_WORKSPACE/MY_PROJECT")
# Fetch all the project metrics
project_metrics = project["metrics"].fetch()

.get_url()

Returns a URL of the project in Neptune.

Returns

str with the URL of the project in Neptune

.fetch_models_table()

Retrieve the models of the project, up to a maximum of 10 000.

Returns

A Table object containing Model objects. Use to_pandas() to convert it to a pandas DataFrame.

Examples

import neptune.new as neptune
# Fetch project "jackie/sandbox"
project = neptune.get_project(name="jackie/sandbox")
# Fetch the metadata of all models as a pandas DataFrame
models_table_df = project.fetch_models_table().to_pandas()
# Sort model objects by size
models_table_df = models_table_df.sort_values(by="sys/size")
# Sort models by creation time, newest first
models_table_df = models_table_df.sort_values(
    by="sys/creation_time",
    ascending=False,
)
# Extract the ID of the most recently created model
last_model_id = models_table_df["sys/id"].values[0]

.fetch_runs_table()

Retrieve runs matching the specified criteria.
All parameters are optional. Each specifies a single criterion. Only runs matching all of the criteria are returned, up to a maximum of 10 000 runs.
Parameters
id
(str or list of str, optional, default is None) - A run's id like "SAN-1" or list of ids like ["SAN-1", "SAN-2"]. Matching any element of the list is sufficient to pass the criterion.
state
(str or list of str, optional, default is None) - A run's state like "active" or list of states like ["inactive", "active"]. Possible values: "inactive", "active".
Matching any element of the list is sufficient to pass the criterion.
owner
(str or list of str, optional, default is None) - Username of the run's owner (the user who created the tracked run is an owner) like "josh" or a list of owners like ["frederic", "josh"].
Matching any element of the list is sufficient to pass the criterion.
tag
(str or list of str, optional, default is None) - A run's tag like "lightGBM" or list of tags like ["pytorch", "cycleLR"]. Only runs that have all specified tags will match this criterion.
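The way these criteria combine can be modeled in a few lines of plain Python. The sketch below is a hypothetical, client-side illustration and not how Neptune evaluates the query. It assumes runs are plain dicts with id, state, owner, and tags keys, and encodes the rules stated above: for id, state, and owner, matching any listed value is enough; for tag, all listed tags must be present; and a run must pass every criterion that is given.

```python
def _as_list(value):
    # Accept a single string or a list of strings, as the parameters above do.
    return [value] if isinstance(value, str) else list(value)


def matches(run, id=None, state=None, owner=None, tag=None):
    """Hypothetical model of the criteria; `run` is a plain dict.

    The parameter name `id` mirrors the documented parameter and
    shadows the built-in only inside this function.
    """
    if id is not None and run["id"] not in _as_list(id):
        return False
    if state is not None and run["state"] not in _as_list(state):
        return False
    if owner is not None and run["owner"] not in _as_list(owner):
        return False
    # Tags are the exception: every requested tag must be present.
    if tag is not None and not set(_as_list(tag)) <= set(run["tags"]):
        return False
    return True


runs = [
    {"id": "SAN-1", "state": "active", "owner": "josh", "tags": ["pytorch", "cycleLR"]},
    {"id": "SAN-2", "state": "inactive", "owner": "frederic", "tags": ["pytorch"]},
]
selected = [
    run["id"]
    for run in runs
    if matches(run, owner=["frederic", "josh"], tag=["pytorch", "cycleLR"])
]
```

Both runs pass the owner criterion, but only SAN-1 carries both tags, so it is the only run selected.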

Returns

A Table object containing runs matching the specified criteria. Use to_pandas() to convert it to a pandas DataFrame.

Examples

import neptune.new as neptune
# Fetch project 'jackie/sandbox'
project = neptune.get_project(name='jackie/sandbox')
# Fetch the metadata of all runs as a pandas DataFrame
runs_table_df = project.fetch_runs_table().to_pandas()
# Sort runs by creation time, newest first
runs_table_df = runs_table_df.sort_values(by='sys/creation_time', ascending=False)
# Extract the ID of the most recently created run
last_run_id = runs_table_df['sys/id'].values[0]
# You can also filter the runs table by state, owner, tag, or a combination of these
# Fetch only inactive runs
runs_table_df = project.fetch_runs_table(state='inactive').to_pandas()
# Fetch only runs created by CI service
runs_table_df = project.fetch_runs_table(owner='my_company_ci_service').to_pandas()
# Fetch only runs that have both 'Exploration' and 'Optuna' tag
runs_table_df = project.fetch_runs_table(tag=['Exploration', 'Optuna']).to_pandas()
# You can combine conditions. Runs satisfying all conditions will be fetched
runs_table_df = project.fetch_runs_table(state='inactive', tag='Exploration').to_pandas()

.wait()

Wait for all the tracking calls to finish.
Parameters
disk_only
(Boolean, optional, default is False) - If True, the process only waits for data to be saved locally from memory, and does not wait for it to reach the Neptune servers.

.sync()

Synchronizes the local representation of the project with the Neptune servers.
Parameters
wait
(Boolean, optional, default is True) - If True, the client first waits to send all tracked metadata to the server. This makes the call synchronous. See the Connection modes guide.

Table

An interim object containing the metadata of fetched objects. To access the data, convert it to a pandas DataFrame by invoking to_pandas().

.to_pandas()

Converts Table data to a pandas DataFrame object.

Returns

Table data in the form of pandas.DataFrame.

Examples

import neptune.new as neptune
# Fetch project 'jackie/sandbox'
project = neptune.get_project(name='jackie/sandbox')
# Fetch the metadata of all runs as a pandas DataFrame
runs_table_df = project.fetch_runs_table().to_pandas()
# Sort runs by creation time, newest first
runs_table_df = runs_table_df.sort_values(by='sys/creation_time', ascending=False)
# Extract the ID of the most recently created run
last_run_id = runs_table_df['sys/id'].values[0]
# You can also filter the runs table by state, owner, tag, or a combination of these
# Fetch only inactive runs
runs_table_df = project.fetch_runs_table(state='inactive').to_pandas()
# Fetch only runs created by CI service
runs_table_df = project.fetch_runs_table(owner='my_company_ci_service').to_pandas()
# Fetch only runs that have both 'Exploration' and 'Optuna' tag
runs_table_df = project.fetch_runs_table(tag=['Exploration', 'Optuna']).to_pandas()
# You can combine conditions. Runs satisfying all conditions will be fetched
runs_table_df = project.fetch_runs_table(state='inactive', tag='Exploration').to_pandas()