PyTorch Lightning

What will you get with this integration?

PyTorch Lightning is a lightweight PyTorch wrapper for high-performance AI research. With Neptune integration you can:

  • monitor model training live,

  • log training, validation, and testing metrics, and visualize them in the Neptune UI,

  • log hyperparameters,

  • monitor hardware usage,

  • log any additional metrics,

  • log performance charts and images,

  • save model checkpoints.

Installation

Before you start, make sure that:

Install neptune-client[pytorch-lightning]

Depending on your operating system, open a terminal or CMD and run this command:

pip install 'neptune-client[pytorch-lightning]'

For more help see installing neptune-client.
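To confirm that the installation worked, you can check that the required modules are importable. The snippet below is a minimal sketch, not part of the integration: `missing_modules` is a hypothetical helper name, and it assumes the packages expose the top-level modules `neptune` and `pytorch_lightning`, as the import statements in this guide suggest.

```python
from importlib.util import find_spec


def missing_modules(names=("neptune", "pytorch_lightning")):
    """Return the top-level modules from `names` that cannot be found.

    Hypothetical helper: it only checks that each module can be located,
    it does not import it or verify its version.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print(f"Missing modules: {', '.join(missing)}")
    else:
        print("All required modules were found.")
```

If the script reports a missing module, rerun the pip command above in the same Python environment that you use for training.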

Quickstart

Create NeptuneLogger

from neptune.new.integrations.pytorch_lightning import NeptuneLogger

neptune_logger = NeptuneLogger(
    api_key='<YOUR_API_TOKEN>',
    project='<YOUR_WORKSPACE/YOUR_PROJECT>',
    name='lightning-run',  # Optional
)

Pass your Neptune Project name and API token to NeptuneLogger.
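If you prefer not to hard-code the token in your script, one option is to read both values from environment variables. The following is a sketch under stated assumptions: `neptune_logger_kwargs` and `create_logger` are hypothetical helper names, and the variable names `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` are assumed here; adjust them if your setup uses different ones.

```python
import os


def neptune_logger_kwargs(name="lightning-run"):
    """Collect NeptuneLogger arguments from environment variables.

    Hypothetical helper; the environment variable names are assumptions.
    """
    api_key = os.environ.get("NEPTUNE_API_TOKEN")
    project = os.environ.get("NEPTUNE_PROJECT")
    if not api_key or not project:
        raise RuntimeError(
            "Set NEPTUNE_API_TOKEN and NEPTUNE_PROJECT before creating the logger"
        )
    return {"api_key": api_key, "project": project, "name": name}


def create_logger():
    # Imported lazily so the helper above works without Neptune installed.
    from neptune.new.integrations.pytorch_lightning import NeptuneLogger

    return NeptuneLogger(**neptune_logger_kwargs())
```

Failing early with a clear error message tends to be easier to debug than an authentication failure in the middle of a training script.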

Pass neptune_logger to Trainer

Pass the neptune_logger instance to the Lightning Trainer to log model training metadata to Neptune:

from pytorch_lightning import Trainer
trainer = Trainer(max_epochs=10, logger=neptune_logger)

Run model training

Pass your LightningModule and training DataLoader to the Trainer by calling .fit():

trainer.fit(model, train_loader)

Explore Results

You just learned how to start logging PyTorch Lightning model training runs to Neptune using NeptuneLogger.

Use logger inside your lightning Module class

You can log images, model checkpoints, and other ML metadata from inside your training and evaluation steps.

To do that, access the Neptune Run through self.logger.experiment:

from neptune.new.types import File
from pytorch_lightning import LightningModule


class LitModel(LightningModule):
    def training_step(self, batch, batch_idx):
        # log metrics
        acc = ...
        self.logger.experiment['train/acc'].log(acc)

        # log images
        img = ...
        self.logger.experiment['train/misclassified_images'].log(File.as_image(img))

    def any_lightning_module_function_or_hook(self):
        # log model checkpoint
        ...
        self.logger.experiment['checkpoints/epoch37'].upload('epoch=37.ckpt')

        # generic recipe
        metadata = ...
        self.logger.experiment['your/metadata/structure'].log(metadata)

You can log other model-building metadata like metrics, images, video, audio, interactive visualizations, and more. See What can you log and display?
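The checkpoint example above uploads the file 'epoch=37.ckpt' under the field 'checkpoints/epoch37'. If you upload many checkpoints, you may want to derive the field path from the file name instead of typing it by hand. The helper below is a sketch; `checkpoint_field` is a hypothetical name, and the naming scheme simply mirrors the example above.

```python
from pathlib import Path


def checkpoint_field(ckpt_path, namespace="checkpoints"):
    """Map a checkpoint file name to a field path for the Neptune Run.

    Hypothetical helper: drops the directory and the extension, then
    removes '=' characters, so 'epoch=37.ckpt' becomes 'checkpoints/epoch37'.
    """
    stem = Path(ckpt_path).stem
    return f"{namespace}/{stem.replace('=', '')}"


# Intended use inside a LightningModule hook (not executed here):
#     path = 'epoch=37.ckpt'
#     self.logger.experiment[checkpoint_field(path)].upload(path)
```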

Log after training is finished

If you want to log objects after the training is finished, use close_after_fit=False. You will then need to explicitly stop the logger after your logging is complete, using neptune_logger.experiment.stop().

import matplotlib.pyplot as plt
from neptune.new.types import File
from scikitplot.metrics import plot_confusion_matrix

neptune_logger = NeptuneLogger(...,
                               close_after_fit=False)
trainer = Trainer(logger=neptune_logger)
trainer.fit(model)

# Log confusion matrix after training
# (y_true and y_pred are your test labels and model predictions)
fig, ax = plt.subplots(figsize=(16, 12))
plot_confusion_matrix(y_true, y_pred, ax=ax)
neptune_logger.experiment['test/confusion_matrix'].upload(File.as_image(fig))

# Stop logging
neptune_logger.experiment.stop()
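Because the Run must be stopped explicitly in this mode, an exception raised during post-training logging could leave it open. One way to guard against that is a small context manager that always calls stop(). This is a sketch, not part of the Neptune API: `stopping` is a hypothetical name, and it only assumes that the object it receives has a stop() method, as neptune_logger.experiment does in the example above.

```python
from contextlib import contextmanager


@contextmanager
def stopping(run):
    """Yield `run` and call run.stop() on exit, even if the body raises.

    Hypothetical helper; works with any object that has a stop() method.
    """
    try:
        yield run
    finally:
        run.stop()


# Intended use after trainer.fit(model) (not executed here):
#     with stopping(neptune_logger.experiment) as run:
#         run['test/confusion_matrix'].upload(File.as_image(fig))
```

With this pattern, the explicit call to neptune_logger.experiment.stop() at the end of the script is no longer needed, since the context manager performs it.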

Pass additional parameters to NeptuneLogger

You can also pass keyword arguments, such as tags and description, to describe the Run in greater detail:

neptune_logger = NeptuneLogger(
    project='common/new-pytorch-lightning-integration',
    name='lightning-run',
    description='mlp quick run with pytorch-lightning',
    tags=['mlp', 'quick-run'],
)
trainer = Trainer(max_epochs=3, logger=neptune_logger)

For more information about the Neptune Run, see Core Concepts.

What's next?