Model#

Representation of all metadata about a model.

Initialization#

Initialize with the init_model() function or the class constructor.

import neptune

model = neptune.init_model(key="MODEL_KEY")
from neptune import Model

model = Model(key="MODEL_KEY")
If Neptune can't find your project name or API token

As a best practice, you should save your Neptune API token and project name as environment variables:

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"

Alternatively, you can pass the information when using a function that takes api_token and project as arguments:

run = neptune.init_run( # (1)!
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8",  # your token here
    project="ml-team/classification",  # your full project name here
)
  1. Also works for init_model(), init_model_version(), init_project(), and integrations that create Neptune runs under the hood, such as NeptuneLogger or NeptuneCallback.

  2. API token: In the bottom-left corner, expand the user menu and select Get my API token.

  3. Project name: You can copy the path from the project details (Edit project details).

If you haven't registered, you can log anonymously to a public project:

api_token=neptune.ANONYMOUS_API_TOKEN
project="common/quickstarts"

Make sure not to publish sensitive data through your code!
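
Before initializing a model, it can help to check that both environment variables are actually set. The following is a minimal sketch; the helper name missing_neptune_vars is hypothetical and not part of the Neptune API. Only the variable names NEPTUNE_API_TOKEN and NEPTUNE_PROJECT come from this page.

```python
import os

REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_vars(environ=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; not part of the Neptune API.
    """
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


# Example with an explicit mapping instead of the real environment
example_env = {"NEPTUNE_PROJECT": "ml-team/classification"}
print(missing_neptune_vars(example_env))  # ['NEPTUNE_API_TOKEN']
```

Calling missing_neptune_vars() without arguments checks the real process environment, so the token itself never needs to appear in source code.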

You can use the model object to:

  • Store and retrieve general metadata about a machine learning model. This can be the model signature, validation datasets, or anything else that is supposed to be common to all versions of the model.
  • List the created versions of that model.

Parameters

Name      Type Default     Description
with_id str, optional None The Neptune identifier of an existing model to resume, such as "CLS-PRE". The identifier is stored in the object's sys/id field. If omitted or None is passed, a new model is created.
name str, optional "Untitled" A custom name for the model. You can use it as a human-readable ID and add it to the models table as a column (sys/name).
key str, optional None Key for the model. Required when creating a new model.
  • Used together with the project key to form the model identifier.
  • Must be uppercase, alphanumerical, and unique within the project.
project str, optional None Name of a project in the form workspace-name/project-name. If None, the value of the NEPTUNE_PROJECT environment variable is used.
api_token str, optional None Your Neptune API token (or a service account's API token). If None, the value of the NEPTUNE_API_TOKEN environment variable is used.

To keep your token secure, avoid placing it in source code. Instead, save it as an environment variable.

mode str, optional async Connection mode in which the logging will work. Possible values are async, sync, read-only, and debug.

If you leave it out, the value of the NEPTUNE_MODE environment variable is used. If that's not set, the default async is used.

flush_period float, optional 5 (seconds) In asynchronous (default) connection mode, how often Neptune should trigger disk flushing.
proxies dict, optional None Argument passed to HTTP calls made via the Requests library. For details on proxies, see the Requests documentation.
async_lag_callback NeptuneObjectCallback, optional None Custom callback function which is called if the lag between a queued operation and its synchronization with the server exceeds the duration defined by async_lag_threshold. The callback should take a Model object as the argument and can contain any custom code, such as calling stop() on the object.

Note: Instead of using this argument, you can use Neptune's default callback by setting the NEPTUNE_ENABLE_DEFAULT_ASYNC_LAG_CALLBACK environment variable to TRUE.

async_lag_threshold float, optional 1800.0 (seconds) Duration between the queueing and synchronization of an operation. If a lag callback (default callback enabled via environment variable or custom callback passed to the async_lag_callback argument) is enabled, the callback is called when this duration is exceeded.
async_no_progress_callback NeptuneObjectCallback, optional None Custom callback function which is called if there has been no synchronization progress whatsoever for the duration defined by async_no_progress_threshold. The callback should take a Model object as the argument and can contain any custom code, such as calling stop() on the object.

Note: Instead of using this argument, you can use Neptune's default callback by setting the NEPTUNE_ENABLE_DEFAULT_ASYNC_NO_PROGRESS_CALLBACK environment variable to TRUE.

async_no_progress_threshold float, optional 300.0 (seconds) For how long there has been no synchronization progress. If a no-progress callback (default callback enabled via environment variable or custom callback passed to the async_no_progress_callback argument) is enabled, the callback is called when this duration is exceeded.
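
A custom lag callback could look like the following sketch. The function name stop_on_lag and the FakeModel stand-in are hypothetical. The stand-in exists only so that the snippet runs without a Neptune connection, and it imitates nothing beyond stop().

```python
def stop_on_lag(model):
    """Custom callback: receives the Model object and stops it."""
    print("Synchronization is lagging; stopping the connection.")
    model.stop()


class FakeModel:
    """Stand-in for a Model object; not part of the Neptune API."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


# Simulate Neptune invoking the callback with the model object
fake = FakeModel()
stop_on_lag(fake)
print(fake.stopped)  # True
```

With a real model, the function would be passed as async_lag_callback=stop_on_lag, optionally together with async_lag_threshold. The same function shape applies to async_no_progress_callback.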

Returns

Model object that is used to manage the model and log metadata to it.

Example

Initialize existing model with identifier CLS-PRE
from neptune import Model

model = Model(with_id="CLS-PRE") # (1)!
  1. Initially created with the key PRE (Model(key="PRE")) in a project with the key CLS.
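
The relation between the project key, the model key, and the resulting identifier can be sketched as below. The helper model_identifier is hypothetical, and the pattern reads "uppercase, alphanumerical" as the characters A-Z and 0-9 only, which is an assumption. Neptune itself performs the authoritative validation, including the uniqueness check.

```python
import re

# Assumption: "uppercase, alphanumerical" means A-Z and 0-9 only
KEY_PATTERN = re.compile(r"[A-Z0-9]+")


def model_identifier(project_key, model_key):
    """Combine a project key and a model key into an identifier."""
    if not KEY_PATTERN.fullmatch(model_key):
        raise ValueError(f"Invalid model key: {model_key!r}")
    return f"{project_key}-{model_key}"


print(model_identifier("CLS", "PRE"))  # CLS-PRE
```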

Field lookup: []#

You can access a field of a model through a dict-like field lookup: model[field_path].

This way, you can

  • store metadata:

    model["model/signature"].upload("model_signature.json")
    
    model["validation/data/v0.1"].track_files("s3://datasets/validation")
    model.wait()
    model["validation/data/latest"] = model["validation/data/v0.1"].fetch()
    
  • fetch already logged metadata, for example, to have a single source of truth when evaluating a new model version:

    Fetch model's validation dataset for a new model version
    model_id = model["sys/id"].fetch()
    model_version = neptune.init_model_version(model=model_id) # (1)!
    
    validation_latest = model["validation/data/latest"].fetch() # (2)!
    model_version["validation/data"] = validation_latest # (3)!
    
    1. Create a new model version based on the previously initialized model
    2. Get a previously logged dataset reference from the "parent model"
    3. Assign the dataset reference to the model version's metadata

Returns

The returned type depends on the field type and whether a field is stored under the given path.

Path Example Returns
Field exists - The returned type matches the type of the field
Field does not exist - Handler object
Path is namespace and has field Path: "model" (field "model/signature" exists) Namespace handler object

Examples

Create new model
import neptune

model = neptune.init_model(key="PRE")
model["size/limit"] = 50
Connect to the model later
import neptune

model = neptune.init_model(with_id="CLS-PRE")

# Update the value of the existing field
model["size/limit"] = 100
Log other metadata
# Create new Series fields
model["train/logs"].append("Model registry, day 1:")

# Continue logging to existing Series fields
model["train/logs"].append("A model is born")

# If you access a namespace handler, you can interact with it like an object
info_ns = model["model_info"]
info_ns["size_units"] = "MB"  # Stores "MB" under path "model_info/size_units"

Assignment: =#

Convenience alias for assign().


assign()#

Assign values to multiple fields from a dictionary. You can use this method to store multiple pieces of metadata with a single command.

Parameters

Name Type Default Description
value dict None A dictionary with values to assign, where keys (str) become the paths of the fields. The dictionary can be nested, in which case the path will be a combination of all keys.
wait Boolean, optional False By default, logged metadata is sent to the server in the background. With this option set to True, Neptune first sends all data before executing the call. See Connection modes.

Example

import neptune

model = neptune.init_model(key="PRE")

# Assign multiple fields from a dictionary
model_info = {"size_limit": 50.0, "size_units": "MB"}
model["model"] = model_info

# You can also store metadata piece by piece
model["model/size_limit"] = 50.0
model["model/size_units"] = "MB"

# Dictionaries can be nested
model_info = {"size": {"limit": 50.0}}
model["model"] = model_info
# This will store the number 50.0 under path "model/size/limit"
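
How nested keys combine into field paths can be illustrated with a plain-Python sketch. The flatten function is hypothetical and only mimics the path-building rule described above; it is not how Neptune implements assign().

```python
def flatten(values, prefix=""):
    """Map a nested dictionary to {field_path: value} pairs."""
    paths = {}
    for key, value in values.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the namespace and extend the path
            paths.update(flatten(value, path))
        else:
            paths[path] = value
    return paths


model_info = {"size": {"limit": 50.0, "units": "MB"}}
print(flatten(model_info, prefix="model"))
# {'model/size/limit': 50.0, 'model/size/units': 'MB'}
```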

del#

Completely removes the field or namespace and all associated metadata stored under the path.

See also: pop().

Examples

import neptune

model = neptune.init_model(with_id="CLS-PRE")

# Delete the field with the path "datasets/v0.4"
del model["datasets/v0.4"]

# You can also delete the whole namespace
del model["datasets"]

exists()#

Checks if there is a field or namespace under the specified path.

Info

This method checks the local representation of the model, so its result can differ from what is stored on the server: a field created by another process may be missing locally, and a locally logged field may not be possible to fetch until the metadata has reached the Neptune servers. In this case you can:

  • Call sync() on the model object to synchronize the local representation with the server.
  • Call wait() on the model object to wait for all logging calls to finish.

Parameters

Name Type Default Description
path str - Path to check for the existence of a field or namespace

Examples

import neptune

model = neptune.init_model(with_id="CLS-PRE")

# If an old dataset exists, remove it
if model.exists("dataset/v0.4"):
    del model["dataset/v0.4"]

Info

When working in asynchronous (default) mode, the metadata you track may not be immediately available to fetch from the server, even if it appears in the local representation.

To work around this, you can call wait() on the model object.

import neptune

model = neptune.init_model(with_id="CLS-PRE")

model["model/signature"].upload("model_signature.json")

# The path exists in the local representation
if model.exists("model/signature"):
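    # However, the tracking call may not have reached Neptune servers yet
    model["model/signature"].download()  # Error: the field does not exist

# Wait for all logging calls to finish; after this, the field can be fetched
model.wait()
model["model/signature"].download()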

fetch()#

Fetches the values of all single-value fields (that are not of type File) as a dictionary.

The result preserves the hierarchical structure of the model metadata.

Returns

dict containing the values of all non-File single-value fields.

Example

import neptune

model = neptune.init_model(with_id="CLS-PRE", mode="read-only")

# Fetch all the model metrics
model_metrics = model["metrics"].fetch()

fetch_model_versions_table()#

List the versions of the model.

Parameters

Name Type Default Description
query str, optional None NQL query string. Example: "model_size:float >= 100MB".
columns list[str], optional None

Names of columns to include in the table, as a list of field names.

The Neptune ID ("sys/id") is included automatically.

If None, all the columns of the model versions table are included (up to a maximum of 10 000).

limit int, optional None How many entries to return at most. If None, all entries are returned.
sort_by str, optional "sys/creation_time" Name of the field to sort the results by. The field must represent a simple type (string, float, datetime, integer, or Boolean).
ascending bool, optional False Whether to sort the entries in ascending order of the sorting column values.
progress_bar bool or Type[ProgressBarCallback], optional None Set to False to disable the download progress bar, or pass a type of ProgressBarCallback to use your own progress bar. If set to None or True, the default tqdm-based progress bar will be used.

Returns

An interim Table object containing ModelVersion objects.

Use to_pandas() to convert it to a pandas DataFrame.

Examples

Initialize an existing model in read-only mode
>>> import neptune
>>> model = neptune.init_model(with_id="NLU-FOREST", mode="read-only")
[neptune] [info   ] Neptune initialized...
Fetch list of all versions of the model as a pandas DataFrame
>>> versions_df = model.fetch_model_versions_table().to_pandas()
Fetching table...: 100 [00:03, 31.35/s]
>>> print(versions_df)
          sys/creation_time        sys/id sys/model_id  ...  test/acc val/acc
0  2023-08-24T13:55:30.052Z  NLU-FOREST-3   NLU-FOREST  ...      0.45    0.04
1  2023-08-24T13:55:18.777Z  NLU-FOREST-2   NLU-FOREST  ...      0.84    0.67
2   2023-08-24T13:54:32.75Z  NLU-FOREST-1   NLU-FOREST  ...      0.41    0.34
Include only accuracy fields as columns and sort by "test/acc"
>>> filtered_versions_df = model.fetch_model_versions_table(
...     columns=["test/acc", "val/acc"],
...     sort_by="test/acc",
... ).to_pandas()
Fetching table...: 100 [00:00, 146.80/s]
>>> print(filtered_versions_df)
           sys/id   test/acc  val/acc
0   NLU-FOREST-12       0.94     0.88
1   NLU-FOREST-11       0.75     0.63
2   NLU-FOREST-10       0.73     0.59
3   NLU-FOREST-9        0.64     0.65

To filter the model versions by a custom field and condition, you can pass an NQL string to the query argument.

Fetch model versions with particular dataset hash, include stage and test accuracy, sort by model size
model_versions_df = model.fetch_model_versions_table(
    query="`data_version`:artifact = 9a113b799082e5fd628be178bedd52837bac24e91f",
    columns=["sys/stage", "model_size", "test/acc"],
    sort_by="model_size",
).to_pandas()

For the syntax and examples, see the Neptune Query Language (NQL) reference.


get_structure()#

Returns the metadata structure of a Model object in the form of a dictionary.

This method can be used to traverse the metadata structure programmatically when using Neptune in automated workflows.

See also: print_structure().

The returned object is a shallow copy of the internal structure. Any modifications to it may result in tracking malfunction.

Returns

dict with the model metadata structure.

Example

>>> import neptune
>>> model = neptune.init_model(with_id="CLS-PRE", mode="read-only")
>>> model.get_structure()
{'model': {'signature': <neptune.attributes.atoms.file.File object at 0x000001C8EF87DD50>, 'size_limit': <neptune.attributes.atoms.float.Float object at 0x000001C8EF87DE40>, 'size_units': <neptune.attributes.atoms.string.String object at 0x000001C8EF87DEA0>}, ... }
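
Traversing such a dictionary programmatically might look like the sketch below. The function iter_field_paths is hypothetical, and the structure dictionary is a stand-in whose leaves are plain Python values rather than Neptune field objects. With a real model, the dictionary would come from get_structure() and the type names would be those of the field classes.

```python
def iter_field_paths(structure, prefix=""):
    """Yield (path, type name) for every leaf of a nested dictionary."""
    for key in sorted(structure):
        value = structure[key]
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # A nested dictionary corresponds to a namespace
            yield from iter_field_paths(value, path)
        else:
            yield path, type(value).__name__


# Stand-in for the dictionary returned by get_structure()
structure = {
    "model": {"size_units": "MB", "size_limit": 50.0},
    "sys": {"id": "CLS-PRE"},
}

for path, type_name in iter_field_paths(structure):
    print(path, type_name)
```

Because the returned object is a shallow copy of the internal structure, the sketch only reads from the dictionary and never modifies it.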

get_url()#

Returns a direct link to the model in Neptune. The same link is printed in the console once the model object has been initialized.

Returns

str with the URL of the model in Neptune.

Example

>>> import neptune
>>> model = neptune.init_model(with_id="CLS-PRE", mode="read-only")
>>> model.get_url()
https://app.neptune.ai/ml-team/classification/m/CLS-PRE

pop()#

Completely removes the field or namespace and all associated metadata stored under the path.

See also: del.

Parameters

Name Type Default Description
path str - Path of the field or namespace to be removed.
wait Boolean, optional False By default, logged metadata is sent to the server in the background. With this option set to True, Neptune first sends all data before executing the call. See Connection modes.

Examples

import neptune

model = neptune.init_model(with_id="CLS-PRE")

# Delete a field along with its data
model.pop("datasets/v0.4")

You can invoke pop() directly on fields and namespaces.

# The following line
model.pop("datasets/v0.4")
# is equivalent to this line
model["datasets/v0.4"].pop()
# and this line
model["datasets"].pop("v0.4")

# You can also batch-delete the whole namespace
model["datasets"].pop()

print_structure()#

Pretty-prints the structure of the model metadata. Paths are ordered lexicographically and the structure is colored.

See also: get_structure().

Example

>>> import neptune
>>> model = neptune.init_model(with_id="CLS-PRE", mode="read-only")
>>> model.print_structure()
'model':
    'signature': File
    'size_limit': Float
    'size_units': String
'sys':
    'creation_time': Datetime
    'id': String
    'modification_time': Datetime
    'monitoring_time': Integer
    'name': String
    'owner': String
    'ping_time': Datetime
    'running_time': Float
    'size': Float
    'state': RunState
    'tags': StringSet
    'trashed': Boolean
'validation':
    'dataset':
        'v0.1': Artifact

stop()#

Stops the connection to Neptune and synchronizes all data.

When using context managers, Neptune automatically calls stop() when exiting the Model context.

Warning

Always call stop() in interactive environments, such as a Python interpreter or Jupyter notebook. The connection to Neptune is not stopped when the cell has finished executing, but rather when the entire notebook stops.

If you're running a script, the connection is stopped automatically when the script finishes executing. However, it's a best practice to call stop() when the connection is no longer needed.

Parameters

Name Type Default Description
seconds int or float, optional None Wait for the specified time for all logging calls to finish before stopping the connection. If None, wait for all logging calls to finish.

Examples

If you initialize the connection from a Python script, Neptune stops it automatically when the script finishes executing.

import neptune

model = neptune.init_model(key="PRE")

[...] # Your code

# stop() is automatically called at the end for every Neptune object

Using with statement and context manager:

for model_identifier in models:
    with neptune.init_model(with_id=model_identifier) as model:
        [...] # Your code
        # stop() is automatically called
        # when code execution exits the with statement
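
In interactive sessions where a with statement is impractical, one option is a try/finally block, so that stop() runs even if the logging code fails. This is a general Python idiom rather than a pattern prescribed by Neptune. The FakeModel class below is a hypothetical stand-in that lets the sketch run without a connection.

```python
class FakeModel:
    """Stand-in for a Neptune object; only stop() is imitated."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


# With Neptune, this would be: model = neptune.init_model(with_id="CLS-PRE")
model = FakeModel()
try:
    raise RuntimeError("simulated failure while logging")
except RuntimeError as error:
    print(f"Handled: {error}")
finally:
    # Runs whether or not the code above raised an exception
    model.stop()

print(model.stopped)  # True
```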

sync()#

Synchronizes the local representation of the model with Neptune servers.

Parameters

Name Type Default Description
wait Boolean, optional False By default, logged metadata is sent to the server in the background. With this option set to True, Neptune first sends all data before executing the call. See Connection modes.

wait()#

Wait for all the logging calls to finish.

Parameters

Name Type Default Description
disk_only Boolean, optional False If True, the process will wait only for the data to be saved locally from memory, but will not wait for it to reach Neptune servers.

Table.to_pandas()#

The Table object is an interim object containing the metadata of fetched objects. To access the data, you need to convert it to a pandas DataFrame by invoking to_pandas().

Returns

Tabular data in the pandas.DataFrame format.

Example

import neptune

model = neptune.init_model(with_id="CLS-PRE", mode="read-only")

# Fetch list of all versions of the model as a pandas DataFrame
model_versions_df = model.fetch_model_versions_table().to_pandas()