Neptune-PyTorch Integration

Note

Neptune integrates with both pure PyTorch and many libraries from the PyTorch Ecosystem. You may want to check out the following integrations:

For pure PyTorch integration, read on.

What will you get with this integration?

PyTorch is an open source deep learning framework commonly used for building neural network models. Neptune helps with keeping track of model training metadata.

With the Neptune + PyTorch integration you can:

  • log hyperparameters

  • see learning curves for losses and metrics during training

  • see hardware consumption and stdout/stderr output during training

  • log torch tensors as images

  • log training code and git commit information

  • log model weights

Tip

You can log many other types of experiment metadata, such as interactive charts, video, audio, and more. See the full list.

Note

This integration is tested with torch==1.7.0, neptune-client==0.4.132.

Where to start?

To get started with this integration, follow the quickstart below. You can also skip the basics and take a look at how to log model weights and prediction images in the more options section.

If you want to try things out and focus only on the code you can either:

Quickstart

This quickstart will show you how to:

  • Install the necessary Neptune package

  • Connect Neptune to your PyTorch model training code and create the first experiment

  • Log metrics, training scripts and .git info to Neptune

  • Explore learning curves in the Neptune UI

Before you start

You need Python 3.x and the following libraries installed:

pip install --quiet torch neptune-client

You also need minimal familiarity with torch. Have a look at this PyTorch guide to get started.
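The snippets below assume that a `model`, an `optimizer`, and a `train_loader` already exist in your script. If you want to follow along without your own training code, here is a minimal, hypothetical setup (a tiny classifier on random MNIST-shaped data; the class and variable names are assumptions for illustration, not part of the integration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

class Net(nn.Module):
    """A minimal classifier that returns log-probabilities,
    as expected by F.nll_loss in the training loop below."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(28 * 28, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)  # flatten each image to a vector
        return F.log_softmax(self.fc(x), dim=1)

model = Net()
optimizer = optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

# dummy data shaped like MNIST so the loop runs end to end
images = torch.randn(64, 1, 28, 28)
targets = torch.randint(0, 10, (64,))
train_loader = DataLoader(TensorDataset(images, targets), batch_size=8)
```

With this in place, every snippet in the quickstart can be run as-is.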

Step 1: Initialize Neptune

Add the following snippet at the top of your script.

import neptune

neptune.init(api_token='ANONYMOUS', project_qualified_name='shared/pytorch-integration')

Tip

You can also use your personal API token. Read more about how to securely set the Neptune API token.
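A common pattern for keeping the token out of source control is to read it from an environment variable. A minimal sketch, assuming you export a variable named NEPTUNE_API_TOKEN yourself before running the script:

```python
import os

# Read the API token from the environment so the secret never
# appears in your code; fall back to the anonymous token.
api_token = os.getenv('NEPTUNE_API_TOKEN', 'ANONYMOUS')

# then pass it to init, e.g.:
# neptune.init(api_token=api_token,
#              project_qualified_name='shared/pytorch-integration')
```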

Step 2: Create an experiment

Run the code below to create a Neptune experiment:

neptune.create_experiment('pytorch-quickstart')

This also creates a link to the experiment. Open the link in a new tab. The charts will currently be empty, but keep the window open. You will be able to see live metrics once logging starts.

When you create an experiment, Neptune looks for a .git directory in your project and saves the last commit information.

Note

If you are using .py scripts for training, Neptune will also log your training script automatically.

Step 3: Add logging into your training loop

Log your loss after every batch by calling log_metric() inside the training loop.

for batch_idx, (data, target) in enumerate(train_loader):
    optimizer.zero_grad()
    outputs = model(data)
    loss = F.nll_loss(outputs, target)

    # log loss (use .item() to pass a plain Python number, not a tensor)
    neptune.log_metric('batch_loss', loss.item())

    loss.backward()
    optimizer.step()
    if batch_idx == 100:
        break

Note

You can log epoch-level metrics and losses by calling log_metric() once per epoch.
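For example, you could collect the per-batch losses during an epoch and log their average once the epoch finishes. A minimal sketch, where epoch_average and losses_this_epoch are hypothetical names:

```python
def epoch_average(batch_losses):
    """Collapse per-batch loss values into one epoch-level number."""
    return sum(batch_losses) / len(batch_losses)

# at the end of each epoch:
# neptune.log_metric('epoch_loss', epoch_average(losses_this_epoch))
```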

Step 4: Run your training script

Run your script as you normally would:

python train.py

Step 5: Monitor your PyTorch training in Neptune

Now you can switch to the Neptune tab that you opened earlier and watch the training live!

PyTorch learning curve charts

More Options

Log hardware consumption and stderr/stdout

Neptune can automatically log your CPU and GPU consumption during training, as well as stderr and stdout from your console. To enable this, install psutil:

pip install psutil
PyTorch hardware consumption charts

Log hyperparameters

You can log training and model hyperparameters. To do that, just pass the parameter dictionary to the create_experiment() method:

PARAMS = {'lr': 0.005,
          'momentum': 0.9,
          'iterations': 100}

optimizer = optim.SGD(model.parameters(),
                      lr=PARAMS['lr'],
                      momentum=PARAMS['momentum'])

# log params
neptune.create_experiment('pytorch-advanced', params=PARAMS)
PyTorch hyperparameter logging

Log model weights

You can log model weights to Neptune both during and after training.

To do that, just call the log_artifact() method on the saved model file.

torch.save(model.state_dict(), 'model_dict.ckpt')

# log model
neptune.log_artifact('model_dict.ckpt')
PyTorch checkpoints logging
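The same call works for periodic checkpoints inside the training loop. A minimal sketch, where save_checkpoint, CHECKPOINT_EVERY, and the path template are hypothetical names chosen for illustration:

```python
import torch

CHECKPOINT_EVERY = 5  # hypothetical interval, in epochs

def save_checkpoint(model, epoch, path_template='model_epoch_{}.ckpt'):
    """Save the model's state dict to disk and return the file path,
    so the path can be passed straight to log_artifact()."""
    path = path_template.format(epoch)
    torch.save(model.state_dict(), path)
    return path

# inside the epoch loop:
# if epoch % CHECKPOINT_EVERY == 0:
#     neptune.log_artifact(save_checkpoint(model, epoch))
```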

Log image predictions

You can log tensors as images to Neptune, along with additional descriptions.

for batch_idx, (data, target) in enumerate(train_loader):

    optimizer.zero_grad()
    outputs = model(data)
    loss = F.nll_loss(outputs, target)

    loss.backward()
    optimizer.step()

    # log loss (use .item() to pass a plain Python number, not a tensor)
    neptune.log_metric('batch_loss', loss.item())

    # log predicted images
    if batch_idx % 50 == 1:
        for image, prediction in zip(data, outputs):
            description = '\n'.join(['class {}: {}'.format(i, pred)
                                     for i, pred in enumerate(F.softmax(prediction, dim=0))])
            neptune.log_image('predictions',
                              image.squeeze(),
                              description=description)

    if batch_idx == 100:
        break
PyTorch logging images


Remember that you can try it out with zero setup:

How to ask for help?

Please visit the Getting help page. Everything regarding support is there.

Other integrations you may like

Here are other integrations with libraries from the PyTorch ecosystem:

You may also like these two integrations: