
Catalyst integration guide#


Custom dashboard displaying metadata logged with Catalyst

Neptune compatibility note

This integration has not yet been updated for neptune 1.x and requires using neptune-client <1.0.0.

Catalyst is a PyTorch framework for Deep Learning R&D. Neptune support is implemented as a logger in Catalyst. With the Neptune logger, you can automatically track:

  • Metrics
  • Hyperparameters
  • Images
  • Artifacts (such as videos, audio, model checkpoints, and files)
  • Hardware consumption metrics
  • stdout and stderr streams
  • Training code and Git information


Quickstart#

Tip

This section is for Catalyst users who are already familiar with loggers, such as the CSV logger or the TensorBoard logger.

NeptuneLogger is part of the Catalyst library. To start logging, create a Neptune logger and pass it to the runner:

  1. Create the logger:

    import neptune.new as neptune # (1)!
    from catalyst import dl
    
    neptune_logger = dl.NeptuneLogger(
        api_token=neptune.ANONYMOUS_API_TOKEN, # (2)!
        project="common/catalyst-integration", # (3)!
        tags=["pretraining", "retina"],  # optional
    )
    
    1. Only needed to access the anonymous API token. If you're logging to your own project, you can omit the import.

    2. The api_token argument is included to enable anonymous logging.

      Once you've registered, leave the token out of your script and instead save it as an environment variable.

    3. Projects in the common workspace are public and can be used for testing.

      To log to your own workspace, pass the full name of your Neptune project: workspace-name/project-name. For example, project="ml-team/classification".

      You can copy the name from the project details (Details & privacy).

    There are further ways to customize the behavior of the logger. For details, see the Catalyst API reference.

  2. Pass the logger to the runner:

    # Pass to SupervisedRunner
    my_runner = dl.SupervisedRunner()
    
    my_runner.train(
        loggers={"neptune": neptune_logger},
        ...
    )
    
    # Pass to custom Runner
    class CustomRunner(dl.IRunner):
        ...
        def get_loggers(self):
            return {
                "console": dl.ConsoleLogger(),
                "neptune": neptune_logger
            }
    
    runner = CustomRunner().run()
    

The Neptune logger setup is complete and you can run your scripts without additional changes.

Your metadata will be logged in the Neptune project for further analysis, comparison, and collaboration.

Full walkthrough#

This guide walks you through connecting NeptuneLogger to your machine learning scripts and using it in your experimentation.

Before you start#

  • Sign up at neptune.ai/register.
  • Create a project for storing your metadata.
  • Have Catalyst installed.
  • Have a version of neptune-client older than 1.0.0 installed (for example, 0.16.18):

    pip install -U "neptune-client<1.0.0"
    
    Passing your Neptune credentials

    Once you've registered and created a project, set your Neptune API token and full project name to the NEPTUNE_API_TOKEN and NEPTUNE_PROJECT environment variables, respectively.

    export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM.4kl0jvYh3Kb8...6Lc"
    

    To find your API token: In the bottom-left corner of the Neptune app, expand the user menu and select Get my API token.

    export NEPTUNE_PROJECT="ml-team/classification"
    

    Your full project name has the form workspace-name/project-name. You can copy it from the project settings: Click the menu in the top-right → Details & privacy.

    On Windows, navigate to Settings → Edit the system environment variables, or enter the following in Command Prompt: setx SOME_NEPTUNE_VARIABLE "some-value"


    Although it's not recommended, especially for the API token, you can also pass your credentials in the code when initializing Neptune.

    run = neptune.init_run(
        project="ml-team/classification",  # your full project name here
        api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh...3Kb8",  # your API token here
    )
    

    For more help, see Set Neptune credentials.
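
A missing or misspelled environment variable is easy to overlook until the run fails to start. The following is a minimal sketch of a pre-flight check, assuming only the two variable names and the workspace-name/project-name form described above. The helper name check_neptune_env is hypothetical and not part of the Neptune client.

```python
import os


def check_neptune_env(environ=None):
    """Return a list of problems found in the Neptune environment variables.

    Hypothetical helper, not part of neptune-client. Pass a dict to check
    something other than the real process environment.
    """
    env = os.environ if environ is None else environ
    problems = []

    token = env.get("NEPTUNE_API_TOKEN", "")
    if not token.strip():
        problems.append("NEPTUNE_API_TOKEN is not set")

    project = env.get("NEPTUNE_PROJECT", "")
    if not project.strip():
        problems.append("NEPTUNE_PROJECT is not set")
    else:
        # The full project name has the form workspace-name/project-name.
        parts = project.split("/")
        if len(parts) != 2 or not all(part.strip() for part in parts):
            problems.append(
                "NEPTUNE_PROJECT should look like workspace-name/project-name"
            )

    return problems
```

Calling check_neptune_env() at the top of a training script and printing the returned list gives an early hint about what is missing. The check only looks at the shape of the values; it does not verify that the token or project is valid.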

If you'd rather follow the guide without any setup, you can run the example in Colab.

Adding NeptuneLogger to the Catalyst script#

Catalyst logs all metadata through a unified interface: loggers.

You can learn more about Catalyst loggers in the Catalyst docs.
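
To make the idea of a named set of loggers concrete, here is a rough sketch of the pattern: one metrics dictionary is forwarded to every logger in a name-to-logger mapping. This is not Catalyst's implementation. The classes PrintLogger and MemoryLogger, the function log_to_all, and the simplified log_metrics(metrics, scope) signature are all assumptions made for illustration.

```python
class PrintLogger:
    """Hypothetical stand-in for a console-style logger."""

    def log_metrics(self, metrics, scope):
        print(f"[{scope}] {metrics}")


class MemoryLogger:
    """Hypothetical stand-in that keeps every record it receives."""

    def __init__(self):
        self.records = []

    def log_metrics(self, metrics, scope):
        # Copy the dict so later changes by the caller don't alter the record.
        self.records.append((scope, dict(metrics)))


def log_to_all(loggers, metrics, scope):
    """Forward one metrics dict to every logger in a name -> logger mapping."""
    for logger in loggers.values():
        logger.log_metrics(metrics, scope)


memory = MemoryLogger()
loggers = {"console": PrintLogger(), "memory": memory}
log_to_all(loggers, {"loss": 0.25}, scope="epoch")
```

The dictionary passed to the runner in the steps below plays the same role as the loggers mapping here: adding the Neptune logger under its own key leaves any other loggers untouched.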

To start logging, create a Neptune logger and pass it to the runner:

  1. Create a NeptuneLogger instance.

    from catalyst import dl
    import neptune.new as neptune
    
    neptune_logger = dl.NeptuneLogger() # (1)!
    
    1. If you haven't set up your credentials, you can log anonymously:

      neptune_logger = dl.NeptuneLogger(
          api_token=neptune.ANONYMOUS_API_TOKEN,
          project="common/catalyst-integration",
      )
      

    There are further ways to customize the behavior of the logger. For details, see the Catalyst API reference.

  2. Pass neptune_logger to the runner.

    The example below uses SupervisedRunner:

    # Create runner
    my_runner = dl.SupervisedRunner()
    my_runner.train(loggers={"neptune": neptune_logger}, ...)
    

    The Neptune logger is now ready.

  3. Once you're done logging, stop tracking the run.

    neptune_logger.run.stop()
    
  4. Run your script:

    python main.py
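
In a longer script or a notebook, you may want the run to be stopped even when training fails partway. A minimal sketch of one way to do that, assuming only the neptune_logger.run.stop() call from step 3: the helper train_and_stop is hypothetical, and the demo uses a stand-in object instead of a real NeptuneLogger so that it runs without credentials.

```python
from types import SimpleNamespace


def train_and_stop(train_fn, neptune_logger):
    """Call train_fn(), then stop the Neptune run even if training raised."""
    try:
        return train_fn()
    finally:
        # Same call as in step 3; the finally block runs on success and on error.
        neptune_logger.run.stop()


# Dry run with a stand-in object (not a real NeptuneLogger).
calls = []
fake_logger = SimpleNamespace(
    run=SimpleNamespace(stop=lambda: calls.append("stopped"))
)
result = train_and_stop(lambda: "trained", fake_logger)
```

In a real script, train_fn would wrap the training call, for example lambda: my_runner.train(loggers={"neptune": neptune_logger}, ...), and neptune_logger would be the instance created in step 1.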
    
If Neptune can't find your project name or API token

As a best practice, you should save your Neptune API token and project name as environment variables:

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"

Alternatively, you can pass the information when using a function that takes api_token and project as arguments:

run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8", # (1)!
    project="ml-team/classification", # (2)!
)
  1. In the bottom-left corner, expand the user menu and select Get my API token.
  2. You can copy the path from the project details (Details & privacy).

If you haven't registered, you can log anonymously to a public project:

api_token=neptune.ANONYMOUS_API_TOKEN
project="common/quickstarts"

Make sure not to publish sensitive data through your code!

Analyzing the logged metadata in Neptune#

Your metadata will be logged in the given Neptune project for analysis, comparison, and collaboration.

To open the run, click the Neptune link that appears in the console output.

Example link: https://app.neptune.ai/o/common/org/catalyst-integration/e/CATALYST-38823

You can also open the project and look for your run in the Runs section.
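
If you collect run links from console output, it can be handy to split them into workspace, project, and run ID. The sketch below assumes the path layout /o/<workspace>/org/<project>/e/<run-id>, which is inferred from the example link only and may not hold for every Neptune URL. The function name parse_run_url is hypothetical.

```python
from urllib.parse import urlparse


def parse_run_url(url):
    """Split a run link into workspace, project, and run ID.

    Assumes the path layout /o/<workspace>/org/<project>/e/<run-id>,
    inferred from the example link in this guide.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    markers_ok = (
        len(segments) >= 6
        and (segments[0], segments[2], segments[4]) == ("o", "org", "e")
    )
    if not markers_ok:
        raise ValueError(f"Unexpected run URL layout: {url}")
    return {
        "workspace": segments[1],
        "project": segments[3],
        "run_id": segments[5],
    }


info = parse_run_url(
    "https://app.neptune.ai/o/common/org/catalyst-integration/e/CATALYST-38823"
)
print(f"{info['workspace']}/{info['project']} -> {info['run_id']}")
```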

Logging best model#

After training, use the log_artifact() method to log a model checkpoint.

my_runner.log_artifact(
    path_to_artifact="./checkpoints/model.best.pth",
    tag="best_model",
    scope="experiment",
)
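
If training stops early, the checkpoint file may not exist yet. As a hedged sketch, the call above can be wrapped in a small guard that checks for the file first. The helper log_best_checkpoint is hypothetical; it only assumes that the runner exposes log_artifact() with the three keyword arguments shown above.

```python
from pathlib import Path


def log_best_checkpoint(runner, checkpoint_path="./checkpoints/model.best.pth"):
    """Log the checkpoint if the file exists; return True when it was logged."""
    path = Path(checkpoint_path)
    if not path.is_file():
        print(f"Checkpoint not found, nothing logged: {path}")
        return False

    # Same keyword arguments as the log_artifact() call shown above.
    runner.log_artifact(
        path_to_artifact=str(path),
        tag="best_model",
        scope="experiment",
    )
    return True
```

With the runner from the earlier steps, the call would be log_best_checkpoint(my_runner).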

Manually logging metadata#

If you have other types of metadata that are not covered in this guide, you can still log them using the Neptune client library.

When you initialize the run, you get a run object, to which you can assign different types of metadata in a structure of your own choosing.

import neptune.new as neptune  # import path for neptune-client <1.0.0

# Create a new Neptune run
run = neptune.init_run()

# Log metrics inside loops
for epoch in range(n_epochs):
    # Your training loop

    run["train/epoch/loss"].append(loss)  # Each append() call appends a value
    run["train/epoch/accuracy"].append(acc)

# Track artifact versions and metadata
run["train/images"].track_files("./datasets/images")

# Upload entire files
run["test/preds"].upload("path/to/test_preds.csv")

# Log text or other metadata, in a structure of your choosing
run["tokenizer"] = "regexp_tokenize"
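
The slash-separated field paths used above group into nested namespaces. The following plain-Python sketch shows the resulting structure; it does not use the Neptune client, the function nest_fields is hypothetical, and the metric values are made up for illustration.

```python
def nest_fields(flat_fields):
    """Turn {"a/b/c": value} into {"a": {"b": {"c": value}}}."""
    tree = {}
    for path, value in flat_fields.items():
        node = tree
        # Everything before the last "/" is a namespace; the last part is the field.
        *namespaces, field = path.split("/")
        for name in namespaces:
            node = node.setdefault(name, {})
        node[field] = value
    return tree


# Field paths from the snippet above, with made-up values.
fields = {
    "train/epoch/loss": [0.9, 0.5],
    "train/epoch/accuracy": [0.6, 0.8],
    "test/preds": "path/to/test_preds.csv",
    "tokenizer": "regexp_tokenize",
}
tree = nest_fields(fields)
```

Here tree["train"]["epoch"] holds both metric series side by side, which is why choosing a consistent prefix such as train/ or test/ keeps related fields together.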