
Working with Lightning#


Custom dashboard displaying metadata logged with PyTorch Lightning

Lightning is a lightweight PyTorch wrapper for high-performance AI research.

With the Neptune integration, you can automatically:

  • Monitor model training live,
  • Log training, validation, and testing metrics and visualize them in the Neptune app
  • Log hyperparameters
  • Monitor hardware consumption
  • Log performance charts and images
  • Save model checkpoints
  • Track training code and Git commit information



Quickstart#

Tip

This section is for PyTorch Lightning users who are already familiar with loggers, such as the CSV or TensorBoard logger.

NeptuneLogger is part of the Lightning library. To start logging, create a Neptune logger and pass it to the trainer:

  1. Create the logger:

    from pytorch_lightning import Trainer
    from pytorch_lightning.loggers import NeptuneLogger
    
    # Create NeptuneLogger instance
    from neptune.new import ANONYMOUS_API_TOKEN
    neptune_logger = NeptuneLogger(
        api_key=ANONYMOUS_API_TOKEN,  # (1)
        project="common/pytorch-lightning-integration",  # (2)
        tags=["training", "resnet"],  # optional
    )
    
    1. The api_key argument is included to enable anonymous logging. Once you register, leave the token out of your script and save it as an environment variable instead.
    2. Projects in the common workspace are public and can be used for testing. To log to your own workspace, pass the full name of your Neptune project: workspace-name/project-name. For example, "ml-team/classification". To copy it, navigate to the project settings → Properties.

    There are further ways to customize the behavior of the logger. For details, see the Lightning API reference.

  2. Pass the logger to the trainer:

    trainer = Trainer(max_epochs=10, logger=neptune_logger)
    
  3. Run the trainer:

    trainer.fit(my_model, my_dataloader)
    

The Neptune logger setup is complete and you can run your scripts without additional changes.

Your metadata will be logged in the Neptune project for further analysis, comparison, and collaboration.

Lightning logging example#

This guide walks you through connecting NeptuneLogger to your machine-learning scripts and analyzing some logged metadata.

Before you start#

Tip

If you'd rather follow the guide without any setup, you can run the example in Colab.

Adding NeptuneLogger to the Lightning script#

Lightning has a unified way of logging metadata: loggers.

You can learn more about logger support in the Lightning docs.

To start logging, create a Neptune logger and pass it to the trainer:

  1. Create a NeptuneLogger instance:

    from pytorch_lightning.loggers import NeptuneLogger
    
    # Create NeptuneLogger instance
    neptune_logger = NeptuneLogger()  # (1)
    
    1. If you haven't set up your credentials, you can log anonymously: NeptuneLogger(api_token=ANONYMOUS_API_TOKEN, project="common/pytorch-lightning-integration"), with ANONYMOUS_API_TOKEN imported from neptune.new.
    Changing the metadata folder name

    By default, the metadata is logged under a namespace called training.

    To change the namespace, modify the prefix argument of the constructor:

    neptune_logger = NeptuneLogger(
        prefix="my_namespace",  # your custom namespace
    )
    

    Once the Neptune logger is created, a link appears in the console output.

    Click the link to open the run in Neptune. You'll see the metadata appear as it gets logged.

    Example link: https://app.neptune.ai/o/common/org/pytorch-lightning-integration/e/PTL-18

  2. Pass neptune_logger to the trainer:

    from pytorch_lightning import Trainer
    
    trainer = Trainer(
        logger=neptune_logger,
        max_epochs=250,
    )
    

    The Neptune logger is now ready to be used.

  3. Pass your LightningModule and DataLoader instances to the fit() method of the trainer:

    model = My_LightningModule()
    train_loader = My_DataLoader()
    
    trainer.fit(model, train_loader)
    
  4. Run your script:

    python main.py
    
If Neptune can't find your project name or API token

As a best practice, you should save your Neptune API token and project name as environment variables. However, you can also pass them as arguments when you're using a function that takes api_token and project as parameters:

  • api_token="Your Neptune API token here"
    • Find and copy your API token by clicking your avatar and selecting Get my API token.
  • project="workspace-name/project-name"
    • Find and copy your project name in the project settings → Properties.

For example:

Neptune client library
model_version = neptune.init_model_version(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8",
    project="ml-team/named-entity-recognition",
    model= ...
)
Neptune integration
neptune_logger = NeptuneLogger(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8",
    project="ml-team/named-entity-recognition",
)
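
As a sketch of the environment-variable approach: Neptune reads the NEPTUNE_API_TOKEN and NEPTUNE_PROJECT variables when no explicit credentials are passed. Normally you would export them in your shell profile; the snippet below only illustrates the variable names, with placeholder values:

```python
import os

# The two environment variables Neptune reads when no explicit
# credentials are passed (values here are placeholders):
os.environ["NEPTUNE_API_TOKEN"] = "Your Neptune API token here"
os.environ["NEPTUNE_PROJECT"] = "workspace-name/project-name"

# With these set, NeptuneLogger() needs no api_token/project arguments
print(os.environ["NEPTUNE_PROJECT"])
```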

Analyzing the logged metadata in Neptune#

Your metadata will be logged in the given Neptune project for analysis, comparison, and collaboration.

To browse the metadata, follow the Neptune link in the console output.

You can also open the project and look for your run in the Runs tab.

Viewing the metadata#

To view the metadata from your Lightning run:

  1. In the left pane of the run view, select All metadata.
  2. Click training (or the name of your custom namespace, if you specified a different prefix when creating the logger).

Metrics are logged as nested dictionary-like structures defined in the LightningModule. For instructions, see the Specifying the metrics structure section.

Charts#

In the left pane of the run view, select Charts to display all the metrics at once.

Tip

Create a custom dashboard to display various types of metadata in one view.

More options#

You can configure the Neptune logger in various ways to address custom logging needs.

In the following sections, we describe some common use cases.

Related

For the full NeptuneLogger API reference, see the Lightning docs.

Specifying the metrics structure#

Metrics are logged as nested dictionary-like structures defined in the LightningModule.

You can specify the structure with self.log("path/to/metric", value).

Example
from pytorch_lightning import LightningModule

class MNISTModel(LightningModule):
    def training_step(self, batch, batch_idx):
        loss = ...
        self.log("metrics/batch/loss", loss, prog_bar=False)

        acc = ...
        self.log("metrics/batch/acc", acc)

    def training_epoch_end(self, outputs):
        loss = ...
        acc = ...
        self.log("metrics/epoch/loss", loss)
        self.log("metrics/epoch/acc", acc)
Result
training
    |—— metrics
        |—— batch
            |—— loss
            |—— acc
        |—— epoch
            |—— loss
            |—— acc
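
The mapping from slash-separated paths to the nested result above can be pictured in plain Python. This is only an illustration of the path convention, not Neptune's actual implementation:

```python
def to_nested(flat):
    """Illustrate how slash-separated metric paths form nested namespaces."""
    tree = {}
    for path, value in flat.items():
        node = tree
        *parents, leaf = path.split("/")
        for part in parents:
            node = node.setdefault(part, {})  # descend, creating namespaces
        node[leaf] = value
    return tree

logged = {
    "metrics/batch/loss": 0.42,
    "metrics/batch/acc": 0.91,
    "metrics/epoch/loss": 0.35,
    "metrics/epoch/acc": 0.93,
}
nested = to_nested(logged)
print(nested["metrics"]["batch"])
```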

Using the logger methods anywhere in LightningModule#

You can use the default logging methods with the Neptune logger:

  • self.log()
  • log_metrics()
  • log_hyperparams()

To log custom metadata – such as images, CSV files, or interactive charts – you can access the Neptune run directly with the self.logger.experiment attribute.

from pytorch_lightning import LightningModule

from neptune.new.types import File


class LitModel(LightningModule):
    def training_step(self, batch, batch_idx):
        # log metrics
        loss = ...
        self.log("train/loss", loss)  # standard log method

    def any_lightning_module_function_or_hook(self):
        # Log images, using the Neptune client library
        img = ...
        self.logger.experiment["train/misclassified_imgs"].log(File.as_image(img))

        # Generic recipe, using the Neptune client library
        metadata = ...
        self.logger.experiment["your/metadata/structure"].log(metadata)  # (1)
  1. You can define your own folder structure here, depending on how you want to organize your metadata.

As another example, the code below yields the following result in Neptune: two series of values (acc and loss) logged under the namespace training/val.

Example
import pytorch_lightning as pl
from sklearn.metrics import accuracy_score

class LitModel(pl.LightningModule):
    def validation_epoch_end(self, outputs):
        loss = ...
        y_true = ...
        y_pred = ...
        acc = accuracy_score(y_true, y_pred)
        self.log("val/loss", loss)
        self.log("val/acc", acc)

See result in Neptune 

For more, see What you can log and display.

Logging after fitting or testing is finished#

You can use the created Neptune logger outside of the Trainer context, which lets you log objects after the fitting or testing methods are finished.

This way, you're not restricted to the LightningModule class – you can log from any method or class in your project code.

Example
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import NeptuneLogger

# Create logger
neptune_logger = NeptuneLogger()

trainer = Trainer(logger=neptune_logger)
model = ...
datamodule = ...

# Run fit and test
trainer.fit(model, datamodule=datamodule)
trainer.test(model, datamodule=datamodule)

##############################################
# Log additional metadata after fit and test #
##############################################

# Log confusion matrix as image
import matplotlib.pyplot as plt
from neptune.new.types import File
from scikitplot.metrics import plot_confusion_matrix  # scikit-plot package

fig, ax = plt.subplots(figsize=(16, 12))
plot_confusion_matrix(y_true, y_pred, ax=ax)
neptune_logger.experiment["test/confusion_matrix"].upload(File.as_image(fig))

# Generic recipe
metadata = ...
neptune_logger.experiment["your/metadata/structure"].log(metadata)

Passing any Neptune init parameter to the logger#

The Neptune logger accepts keyword arguments, which you can use to supply more detailed information about your run:

Example
from pytorch_lightning.loggers import NeptuneLogger
neptune_logger = NeptuneLogger(
    project="ml-team/classification",
    name="lightning-run",
    description="mlp quick run with pytorch-lightning",
    tags=["mlp", "quick-run"],
)

For the full list of arguments, see API reference → neptune.init_run().

Logging model checkpoints#

If you have ModelCheckpoint configured, the Neptune logger automatically logs model checkpoints. Model weights are logged in the model/checkpoints namespace of the Neptune run.

Info

You can disable this option by setting log_model_checkpoints to False when you create the NeptuneLogger instance:

from pytorch_lightning.loggers import NeptuneLogger
neptune_logger = NeptuneLogger(
    log_model_checkpoints=False
)

Logging model summary#

You can log the model summary, as generated by the ModelSummary utility from Lightning.

The summary is logged in the model/summary namespace of the Neptune run.

from pytorch_lightning.loggers import NeptuneLogger
neptune_logger = NeptuneLogger()
model = ...  # LightningModule

# Log model summary
neptune_logger.log_model_summary(model=model, max_depth=-1)

Logging best model score and path#

If you have ModelCheckpoint configured, the Neptune logger automatically logs the best_model_path and best_model_score values.

They are logged in the model namespace of the Neptune run.

Logging gradients#

If you configure the trainer to track gradient norms, those norms are automatically logged to Neptune.

You can inspect them as interactive charts in Neptune.

from pytorch_lightning import Trainer
from pytorch_lightning.loggers import NeptuneLogger

# Create NeptuneLogger
neptune_logger = NeptuneLogger()
trainer = Trainer(
    logger=neptune_logger,
    log_every_n_steps=50,
    track_grad_norm=2,  # track gradient norm
)

Logging hyperparameters#

You can log hyperparameters by using the standard log_hyperparams() method from the Lightning logger.

from pytorch_lightning.loggers import NeptuneLogger

neptune_logger = NeptuneLogger()

PARAMS = ...  # dict or argparse.Namespace
neptune_logger.log_hyperparams(params=PARAMS)

You can display any logged parameters:

  • In the runs table, by adding them as columns.
  • In custom dashboards.
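
log_hyperparams() accepts a plain dict as well as an argparse.Namespace. A minimal sketch of preparing either form, with hypothetical parameter names:

```python
import argparse

# Hypothetical hyperparameters; log_hyperparams() accepts a dict or a Namespace
parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)
parser.add_argument("--batch_size", type=int, default=32)
params = parser.parse_args([])  # empty list: fall back to the defaults

# Either form works: neptune_logger.log_hyperparams(params=params),
# or the equivalent dict:
print(vars(params))
```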

Manually logging metadata#

If you have other types of metadata that are not covered in this guide, you can still log them using the Neptune client library (neptune-client).

When you initialize the run, you get a run object, to which you can assign different types of metadata in a structure of your own choosing.

import neptune.new as neptune

# Create a new Neptune run
run = neptune.init_run()

# Log metrics or other values inside loops
for epoch in range(n_epochs):
    ...  # Your training loop

    run["train/epoch/loss"].log(loss)  # Each log() appends a value
    run["train/epoch/accuracy"].log(acc)

# Upload files
run["test/preds"].upload("path/to/test_preds.csv")

# Track and version artifacts
run["train/images"].track_files("./datasets/images")

# Record numbers or text
run["tokenizer"] = "regexp_tokenize"
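
The append semantics of log() can be pictured with a toy stand-in: each call adds one step to a series stored under the field's path. This is an illustration only, not the Neptune API:

```python
class ToySeries:
    """Toy stand-in illustrating run["field"].log(value) append semantics."""

    def __init__(self):
        self.values = []

    def log(self, value):
        self.values.append(value)  # each log() appends one step to the series

run = {"train/epoch/loss": ToySeries()}
for loss in (0.9, 0.7, 0.5):
    run["train/epoch/loss"].log(loss)

print(run["train/epoch/loss"].values)  # the accumulated series
```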