Working with Catalyst#

Custom dashboard displaying metadata logged with Catalyst

Catalyst is a PyTorch framework for Deep Learning R&D. It focuses on reproducibility, rapid experimentation, and codebase reuse.

Neptune support is implemented as a logger in Catalyst. With the Neptune logger, you can automatically track:

  • Metrics
  • Hyperparameters
  • Images
  • Artifacts (such as videos, audio, model checkpoints, and files)
  • Hardware consumption metrics
  • stdout and stderr streams
  • Training code and Git information

Quickstart#

Tip

This section is for Catalyst users who are already familiar with loggers, such as the CSV logger or the TensorBoard logger.

NeptuneLogger is part of the Catalyst library. To start logging, create a Neptune logger and pass it to the runner:

  1. Create the logger:

    import neptune.new as neptune
    from catalyst import dl
    
    neptune_logger = dl.NeptuneLogger(
        api_token=neptune.ANONYMOUS_API_TOKEN,  # (1)
        project="common/catalyst-integration",  # (2)
        tags=["pretraining", "retina"],  # (optional)
    )
    
    1. The api_token argument is included to enable anonymous logging. Once you register, you should leave the token out of your script and instead save it as an environment variable.
    2. Projects in the common workspace are public and can be used for testing. To log to your own workspace, pass the full name of your Neptune project in the form workspace-name/project-name. For example, "ml-team/classification". To copy the name, open the project settings → Properties.

There are further ways to customize the behavior of the logger. For details, see the Catalyst API reference.

  2. Pass the logger to the runner:

    # You can pass it to the SupervisedRunner
    my_runner = dl.SupervisedRunner()
    
    my_runner.train(
        loggers={"neptune": neptune_logger},
        ...
    )
    
    # You can also return it from get_loggers() in a custom runner
    class CustomRunner(dl.IRunner):
        ...
        def get_loggers(self):
            return {
                "console": dl.ConsoleLogger(),
                "neptune": neptune_logger
            }
    
    runner = CustomRunner().run()
    

The Neptune logger setup is complete and you can run your scripts without additional changes.

Your metadata will be logged in the Neptune project for further analysis, comparison, and collaboration.

Catalyst logging example#

This guide walks you through connecting NeptuneLogger to your machine-learning scripts and using it in your experimentation.

Before you start#

Tip

If you'd rather follow the guide without any setup, you can run the example in Colab.

Adding NeptuneLogger to the Catalyst script#

Catalyst logs all metadata through a unified logger interface, so adding Neptune means adding one more logger to the runner.

You can learn more about loggers in the Catalyst docs.

To start logging, create a Neptune logger and pass it to the runner:

  1. Create a NeptuneLogger instance.

    from catalyst import dl
    
    # Create NeptuneLogger instance
    neptune_logger = dl.NeptuneLogger()  # (1)
    
    1. If you haven't set up your credentials, you can log anonymously: dl.NeptuneLogger(api_token=neptune.ANONYMOUS_API_TOKEN, project="common/catalyst-integration")

    There are further ways to customize the behavior of the logger. For details, see the Catalyst API reference.

  2. Pass neptune_logger to the runner.

    The example below uses SupervisedRunner:

    from catalyst import dl
    
    # Create runner
    my_runner = dl.SupervisedRunner()
    
    my_runner.train(
        loggers={"neptune": neptune_logger},
        ...
    )
    

    The Neptune logger is now ready.

  3. Run your script:

    python main.py
    
If Neptune can't find your project name or API token

As a best practice, you should save your Neptune API token and project name as environment variables. However, you can also pass them as arguments when you're using a function that takes api_token and project as parameters:

  • api_token="Your Neptune API token here"
    • Find and copy your API token by clicking your avatar and selecting Get my API token.
  • project="workspace-name/project-name"
    • Find and copy your project name in the project settings → Properties.

For example:

Neptune client library:

model_version = neptune.init_model_version(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8",
    project="ml-team/named-entity-recognition",
    model= ...
)

Neptune integration:

neptune_logger = dl.NeptuneLogger(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8",
    project="ml-team/named-entity-recognition",
)
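
If you follow the recommended practice of keeping credentials in environment variables, a small pre-flight check can fail early with a readable message before training starts. The following is a minimal sketch, assuming the variables are named NEPTUNE_API_TOKEN and NEPTUNE_PROJECT; the helper name neptune_credentials is hypothetical and not part of Neptune or Catalyst.

```python
import os
from typing import Dict, Mapping, Optional


def neptune_credentials(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read and sanity-check Neptune credentials (hypothetical helper).

    Returns a dict with the keys "api_token" and "project".
    """
    env = os.environ if env is None else env
    token = env.get("NEPTUNE_API_TOKEN")
    project = env.get("NEPTUNE_PROJECT")

    # Report every missing variable at once rather than one at a time.
    missing = [
        name
        for name, value in (("NEPTUNE_API_TOKEN", token), ("NEPTUNE_PROJECT", project))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # A full project name has exactly two non-empty parts: workspace/project.
    parts = project.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Expected 'workspace-name/project-name', got {project!r}")

    return {"api_token": token, "project": project}
```

The returned dict has the same keys as the api_token and project arguments shown above, so it could be unpacked into a call such as dl.NeptuneLogger(**neptune_credentials()).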

Analyzing the logged metadata in Neptune#

Your metadata will be logged in the given Neptune project for analysis, comparison, and collaboration.

To open the run, click the Neptune link that appears in the console output.

Example link: https://app.neptune.ai/common/catalyst-integration/e/CATALYST-1486

You can also open the project and look for your run in the Runs tab.
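
If you want to record which run a script produced, you can split the printed link into its parts. This is a rough sketch that assumes run links always follow the shape of the example link (workspace, project, the literal segment "e", then the run ID); that shape is inferred from the single example here, and parse_run_url and RunLocation are hypothetical names.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class RunLocation(NamedTuple):
    workspace: str
    project: str
    run_id: str


def parse_run_url(url: str) -> RunLocation:
    """Split a run link into workspace, project, and run ID."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Assumed path shape: <workspace>/<project>/e/<run-id>
    if len(parts) < 4 or parts[2] != "e":
        raise ValueError(f"Unrecognized run URL: {url!r}")
    workspace, project, _, run_id = parts[:4]
    return RunLocation(workspace, project, run_id)


loc = parse_run_url(
    "https://app.neptune.ai/common/catalyst-integration/e/CATALYST-1486"
)
print(f"{loc.workspace}/{loc.project}", loc.run_id)
# prints: common/catalyst-integration CATALYST-1486
```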

Manually logging metadata#

If you have other types of metadata that are not covered in this guide, you can still log them using the Neptune client library (neptune-client).

When you initialize the run, you get a run object, to which you can assign different types of metadata in a structure of your own choosing.

import neptune.new as neptune

# Create a new Neptune run
run = neptune.init_run()

# Log metrics or other values inside loops
for epoch in range(n_epochs):
    ...  # Your training loop

    run["train/epoch/loss"].log(loss)  # Each log() appends a value
    run["train/epoch/accuracy"].log(acc)

# Upload files
run["test/preds"].upload("path/to/test_preds.csv")

# Track and version artifacts
run["train/images"].track_files("./datasets/images")

# Record numbers or text
run["tokenizer"] = "regexp_tokenize"
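
When several metrics share a namespace, the repeated log() calls can be wrapped in a small helper. The sketch below assumes only what the snippet above shows: that indexing the run with a path returns a field with a log() method that appends a value. The function name log_metrics is hypothetical and not part of the Neptune client library.

```python
from typing import Any, Mapping


def log_metrics(run: Any, namespace: str, metrics: Mapping[str, float]) -> None:
    """Append each metric to its own series under the given namespace."""
    prefix = namespace.strip("/")
    for name, value in metrics.items():
        # For example, run["train/epoch/loss"].log(0.42) appends 0.42 to that series.
        run[f"{prefix}/{name}"].log(float(value))
```

Inside the training loop, a call such as log_metrics(run, "train/epoch", {"loss": loss, "accuracy": acc}) would then replace the two separate log() lines.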