
PyTorch Ignite integration guide#


Custom dashboard displaying metadata logged with PyTorch Ignite

This guide walks you through keeping track of your model training metadata when using PyTorch Ignite, such as:

  • Training metrics
  • Model checkpoints
  • Training code and Git information

See example in Neptune · Code examples

Before you start#

Tip

If you'd rather follow the guide without any setup, you can run the example in Colab .

Installing the integration#

The integration is implemented as part of the Ignite framework, so you don't need to install anything else.

To install Neptune:

pip install -U neptune
How do I save my credentials as environment variables?

Set your Neptune API token and full project name to the NEPTUNE_API_TOKEN and NEPTUNE_PROJECT environment variables, respectively.

Linux:

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM.4kl0jvYh3Kb8...ifQ=="
export NEPTUNE_PROJECT="ml-team/classification"

macOS:

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM.4kl0jvYh3Kb8...ifQ=="
export NEPTUNE_PROJECT="ml-team/classification"

Windows:

setx NEPTUNE_API_TOKEN "h0dHBzOi8aHR0cHM.4kl0jvYh3Kb8...ifQ=="
setx NEPTUNE_PROJECT "ml-team/classification"

You can also navigate to Settings → Edit the system environment variables and add the variables there.

Jupyter Notebook:

%env NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM.4kl0jvYh3Kb8...ifQ=="
%env NEPTUNE_PROJECT="ml-team/classification"

To find your credentials:

  • API token: In the bottom-left corner of the Neptune app, expand your user menu and select Get your API token. If you need the token of a service account, go to the workspace or project settings and enter the Service accounts settings.
  • Project name: Your full project name has the form workspace-name/project-name. You can copy it from the project menu (Details & privacy).

If you're working in Google Colab, you can set your credentials with the os and getpass libraries:

import os
from getpass import getpass
os.environ["NEPTUNE_API_TOKEN"] = getpass("Enter your Neptune API token: ")
os.environ["NEPTUNE_PROJECT"] = "workspace-name/project-name"

Logging example#

  1. Create a NeptuneLogger instance:

    from ignite.contrib.handlers.neptune_logger import *
    
    neptune_logger = NeptuneLogger() # (1)!
    
    1. If you haven't set up your credentials, you can log anonymously:

      from neptune import ANONYMOUS_API_TOKEN
      
      NeptuneLogger(
          api_token=ANONYMOUS_API_TOKEN,
          project="common/pytorch-ignite-integration"
      )
      

    To open the run in the app, click the Neptune link in the console output.

    If Neptune can't find your project name or API token

    As a best practice, you should save your Neptune API token and project name as environment variables:

    export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8"
    
    export NEPTUNE_PROJECT="ml-team/classification"
    

    Alternatively, you can pass the information when using a function that takes api_token and project as arguments:

    run = neptune.init_run(
        api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8", # (1)!
        project="ml-team/classification", # (2)!
    )
    
    1. In the bottom-left corner, expand the user menu and select Get your API token.
    2. You can copy the path from the project details (Details & privacy).

    If you haven't registered, you can log anonymously to a public project:

    api_token=neptune.ANONYMOUS_API_TOKEN
    project="common/quickstarts"
    

    Make sure not to publish sensitive data through your code!

  2. Attach the logger to the trainer to log training loss at each iteration:

    neptune_logger.attach_output_handler(
        trainer,
        event_name=Events.ITERATION_COMPLETED,
        tag="training",
        output_transform=lambda loss: {"loss": loss},
    )
    
  3. Run the trainer:

    trainer.run(train_loader, max_epochs=epochs)
    

    To open the run and watch your model training live, click the Neptune link that appears in the console output.

    Example link: https://app.neptune.ai/o/neptune-ai/org/pytorch-ignite-integration/e/PYTOR-30

  4. When you're done, stop the logger:

    neptune_logger.close()
    
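The `output_transform` callable in step 2 is what turns the trainer's raw output into named metrics: the handler calls it on `engine.state.output` at each triggering event and logs each entry of the returned dict under the given tag. A minimal pure-Python sketch of that flow (the `log_metrics` helper is hypothetical, for illustration only; `NeptuneLogger` performs the real logging):

```python
# The same transform used in step 2: raw loss -> dict of named metrics.
def output_transform(loss):
    return {"loss": loss}

def log_metrics(tag, output, transform):
    """Hypothetical helper mimicking how the output handler namespaces
    each transformed metric under the handler's tag."""
    return {f"{tag}/{name}": value for name, value in transform(output).items()}

metrics = log_metrics("training", 0.42, output_transform)
print(metrics)  # {'training/loss': 0.42}
```

Returning a dict lets a single handler log several values at once, e.g. `lambda out: {"loss": out[0], "accuracy": out[1]}` for an engine whose output is a tuple.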


More options#

Logging additional metadata after training#

You can access the Neptune run through the .experiment attribute of the NeptuneLogger object:

torch.save(model.state_dict(), "model.pth")
neptune_logger.experiment["model"].upload("model.pth")

When you're done, stop the logger:

neptune_logger.close()

Attaching handlers to the logger#

To attach the logger to the evaluator on the training dataset and log NLL and accuracy metrics after each epoch:

neptune_logger.attach_output_handler(
    train_evaluator,
    event_name=Events.EPOCH_COMPLETED,
    tag="training",
    metric_names=["nll", "accuracy"],
    global_step_transform=global_step_from_engine(trainer), # (1)!
)
  1. Takes the epoch of the trainer instead of train_evaluator.

To attach the logger to the evaluator on the validation dataset and log NLL and accuracy metrics after each epoch:

neptune_logger.attach_output_handler(
    evaluator,
    event_name=Events.EPOCH_COMPLETED,
    tag="validation",
    metric_names=["nll", "accuracy"],
    global_step_transform=global_step_from_engine(trainer), # (1)!
)
  1. Takes the epoch of the trainer instead of evaluator.
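The `global_step_transform` argument is what keeps both evaluators' charts aligned to the trainer's timeline: without it, each evaluator would log against its own (always restarting) step counter. A simplified pure-Python mimic of Ignite's `global_step_from_engine` (not the real implementation) shows the idea:

```python
from types import SimpleNamespace

def global_step_from_engine(engine):
    """Simplified sketch: returns a callable that ignores the engine it is
    attached to and reads the step counter from the captured `engine`
    (the trainer) instead."""
    event_to_attr = {"EPOCH_COMPLETED": "epoch", "ITERATION_COMPLETED": "iteration"}

    def wrapper(_attached_engine, event_name):
        return getattr(engine.state, event_to_attr[event_name])

    return wrapper

# Fake engine states standing in for real Ignite engines.
trainer = SimpleNamespace(state=SimpleNamespace(epoch=7, iteration=350))
evaluator = SimpleNamespace(state=SimpleNamespace(epoch=1, iteration=50))

step_fn = global_step_from_engine(trainer)
# The evaluator's own epoch (1) is ignored; the trainer's epoch is used.
print(step_fn(evaluator, "EPOCH_COMPLETED"))  # 7
```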

Logging parameters of the optimizer, such as learning rate at each iteration:

neptune_logger.attach_opt_params_handler(
    trainer,
    event_name=Events.ITERATION_STARTED,
    optimizer=optimizer,
    param_name="lr"  # optional
)
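The handler reads the value from the optimizer's `param_groups`, the list of dicts where PyTorch optimizers keep per-group hyperparameters. A sketch with a stand-in optimizer of the same shape (the `read_opt_param` helper is hypothetical, for illustration only):

```python
# A stand-in with the same structure as torch.optim optimizers:
# param_groups is a list of dicts, one per parameter group.
optimizer = {"param_groups": [{"lr": 0.01}, {"lr": 0.001}]}

def read_opt_param(optimizer, param_name="lr"):
    """Hypothetical helper mirroring how the handler collects one value
    per parameter group for the requested hyperparameter."""
    return {
        f"{param_name}/group_{i}": group[param_name]
        for i, group in enumerate(optimizer["param_groups"])
    }

print(read_opt_param(optimizer))  # {'lr/group_0': 0.01, 'lr/group_1': 0.001}
```

Because the handler fires on `Events.ITERATION_STARTED`, the logged values track learning-rate schedulers step by step.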

Logging the weight norm of the model after each iteration:

neptune_logger.attach(
    trainer,
    event_name=Events.ITERATION_COMPLETED,
    log_handler=WeightsScalarHandler(model),  # logs the norm of each weight tensor
)

Saving model checkpoints to Neptune#

You can use NeptuneSaver to log model checkpoints during training.

from ignite.contrib.handlers.neptune_logger import NeptuneSaver
from ignite.handlers import Checkpoint


def score_function(engine):
    return engine.state.metrics["accuracy"]


to_save = {"model": model}
handler = Checkpoint(
    to_save,
    NeptuneSaver(neptune_logger),
    n_saved=2,
    filename_prefix="best",
    score_function=score_function,
    score_name="validation_accuracy",
    global_step_transform=global_step_from_engine(trainer),
)
validation_evaluator.add_event_handler(Events.COMPLETED, handler)

Windows note

Checkpoint saving is not supported on Windows.
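The `score_function` above is how `Checkpoint` decides what to keep: it is called on the evaluator after each triggering event, and only the `n_saved` highest-scoring checkpoints survive. A quick mimic of that selection with fake engine states (for illustration; `Checkpoint` handles this internally):

```python
from types import SimpleNamespace

def score_function(engine):
    # Same function as in the example above: higher accuracy = better checkpoint.
    return engine.state.metrics["accuracy"]

# Fake evaluator states, one per epoch, standing in for real Ignite engines.
epochs = [
    SimpleNamespace(state=SimpleNamespace(metrics={"accuracy": acc}))
    for acc in (0.71, 0.85, 0.79, 0.90)
]

scores = [score_function(engine) for engine in epochs]
# With n_saved=2, only the two best-scoring checkpoints are retained.
best_two = sorted(scores, reverse=True)[:2]
print(best_two)  # [0.9, 0.85]
```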

Using logger as context manager#

You can also use the logger as a context manager.

from ignite.contrib.handlers.neptune_logger import *

with NeptuneLogger() as neptune_logger:
    trainer = Engine(update_fn)
    # Attach the logger to the trainer to log training loss at each iteration
    neptune_logger.attach_output_handler(
        trainer,
        event_name=Events.ITERATION_COMPLETED,
        tag="training",
        output_transform=lambda loss: {"loss": loss},
    )
