
Working with PyTorch#


Info

This page is about pure PyTorch integration (support for the torch package).

Neptune also integrates with several other libraries from the PyTorch ecosystem, such as PyTorch Lightning and PyTorch Ignite.

Custom dashboard displaying metadata logged with PyTorch

This guide walks you through keeping track of your model-training metadata when using PyTorch, such as:

  • Model configuration
  • Hyperparameters
  • Losses and metrics
  • Training code and Git information
  • Images and predictions
  • Artifacts (model weights, dataset version)
  • Tensors:
    • 2D and 3D tensors as images
    • 1D tensors as series

See example in Neptune · Code examples

Before you start#

Tip

If you'd rather follow the guide without any setup, you can run the example in Colab.
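
At a minimum, you need the neptune client library and PyTorch installed, for example with pip install -U neptune torch torchvision, and your Neptune credentials set up (or log anonymously, as shown in the next section).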

PyTorch logging example#

  1. Create a Neptune run:

    import torch
    import torch.nn as nn
    import torch.optim as optim
    import neptune
    from neptune.utils import stringify_unsupported
    from torchvision import datasets, transforms
    
    run = neptune.init_run()  # (1)!
    
    1. If you haven't set up your credentials, you can log anonymously: neptune.init_run(api_token=neptune.ANONYMOUS_API_TOKEN, project="common/pytorch-integration")

    To open the run in the app, click the Neptune link in the console output.

  2. Log the configuration and hyperparameters.

    Assign your variables and parameters to namespaces and fields of your choice:

    run["config/dataset/path"] = data_dir
    run["config/dataset/transforms"] = stringify_unsupported(data_tfms)  # dict() object
    run["config/dataset/size"] = dataset_size  # dict() object
    run["config/model"] = type(model).__name__
    run["config/criterion"] = type(criterion).__name__
    run["config/optimizer"] = type(optimizer).__name__
    run["config/params"] = stringify_unsupported(hparams)  # dict() object
    

    In the example snippet, all the metadata will be organized under the config namespace of the run.
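
    For reference, here is a minimal sketch of the objects the snippet above refers to. These names and values are assumptions chosen to make the example self-contained; substitute your own dataset, model, and hyperparameters:

    # Hypothetical values for the names used in the snippet above
    data_dir = "data/CIFAR10"
    data_tfms = {"train": transforms.Compose([transforms.ToTensor()])}
    dataset_size = {"train": 50000, "val": 10000}
    hparams = {"lr": 1e-2, "batch_size": 128, "epochs": 2}
    epochs = hparams["epochs"]

    # A deliberately tiny model, loss, optimizer, and data loader
    model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=hparams["lr"])
    trainloader = torch.utils.data.DataLoader(
        datasets.CIFAR10(data_dir, download=True, transform=data_tfms["train"]),
        batch_size=hparams["batch_size"],
    )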

  3. Log losses and metrics.

    To log a series of values, use the append() method in your training loop:

    for epoch in range(epochs):

        for i, (x, y) in enumerate(trainloader, 0):
            optimizer.zero_grad()
            outputs = model(x)
            loss = criterion(outputs, y)

            # Log batch loss
            run["training/batch/loss"].append(loss.item())

            # Log batch accuracy
            acc = (outputs.argmax(dim=1) == y).float().mean().item()
            run["training/batch/acc"].append(acc)

            loss.backward()
            optimizer.step()
    
  4. Run your script as you normally would.

    To open the run and watch your model training live, click the Neptune link that appears in the console output.

    Example link: https://app.neptune.ai/o/common/org/pytorch-integration/e/PYTOR1-66/

Stop the run when done

Once you are done logging, you should stop the Neptune run. You need to do this manually when logging from a Jupyter notebook or other interactive environment:

run.stop()

If you're running a script, the connection is stopped automatically when the script finishes executing. In notebooks, however, the connection to Neptune is not stopped when the cell has finished executing, but rather when the entire notebook stops.

See example in Neptune 

More options#

Logging model architecture and weights#

To help with reproducibility and testing, it may be useful to store the model architecture and best weight file.

  • To have all the metadata in a single place, you can log model metadata to the same run you created earlier, as shown in the snippet below.
  • To manage your model metadata separately, you can use the Neptune model registry, as described in the steps that follow.
run["model_arch"].upload("./model_arch.txt")
run["model_weights"].upload("./model.pth")

Initialize a ModelVersion object.

To use the model registry, first create a Model object that serves as an umbrella for all the versions. You can then create and manage each model version separately.

# Create a new model and give it a key
model = neptune.init_model(key="PRETRAINED")
# If you like, log some generic metadata that should be common to
# all the model versions
model["signature"].upload("model_signature.json")

# Create a specific version of that model by passing the model ID
model_id = model["sys/id"].fetch()
model_version = neptune.init_model_version(model=model_id)

# Log metadata to the model version, just like you would for runs
model_version["arch"].upload(f"./model_arch.txt")
model_version["weights"].upload(f"./model.pth")

The model metadata will now be displayed in the Models tab of the app.
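
As with runs, stop the model objects once you're done logging to them, especially in notebooks and other interactive environments:

model.stop()
model_version.stop()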

Logging images and predictions#

To visually inspect the model predictions, it can be helpful to plot images and predictions, especially in computer vision tasks.

You can log torch tensors and have them displayed as images in the Neptune app:

Log image with predictions
from neptune.types import File

img = torch.rand(30, 30, 3)  # a 3D tensor in HWC format, values in [0, 1]
description = f"Predicted: {pred}, Ground truth: {gt}"  # pred and gt come from your evaluation step
run["torch_tensor"].append(File.as_image(img), description=description)