App version: 3.20251006

Correlate metrics and logs

When a metric indicates a problem, the logs might help you investigate the root cause of the issue. With Neptune, you can build custom views where you can see and analyze the correlation between the metrics and logs from the same time range or at the same step.

This tutorial explains how to:

- Capture logs with Neptune
- View logs in the Neptune app
- Correlate metrics to logs and hardware usage

Before you start

Configure your Neptune API token and project. For details, see Get started.

Log metrics

To log numerical series to Neptune, use the log_metrics() function. For details, see Metrics.
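As a minimal sketch of that call, the helper below wraps log_metrics() for one training step. The helper name log_training_metrics is hypothetical. The run argument is assumed to be a neptune_scale Run created as in the example script later in this tutorial, and the data and step arguments mirror the ones used there.

```python
def log_training_metrics(run, step, accuracy, loss):
    """Log one step's accuracy and loss through a Run-like object.

    `run` is assumed to expose log_metrics(data=..., step=...), as the
    neptune_scale Run in this tutorial's example script does.
    """
    run.log_metrics(
        data={"accuracy": float(accuracy), "loss": float(loss)},
        step=step,
    )
```

Passing the run in as an argument keeps the helper free of credentials and project settings, so the same function works with any run you create.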

Capture logs

You can log your own custom messages to Neptune as StringSeries attributes. Each message is timestamped and associated with a step value, which makes it easier to track progress during training. For example, you can log the following information:

- Error and warning messages
- Progress updates, configuration changes
- Custom debugging information

To send custom messages or logs to Neptune, log a series of strings with the log_string_series() function.

Example script
from random import random
from neptune_scale import Run


def hello_neptune():
    run = Run(
        api_token="eyJhcGlfYWRkcmVz...In0=",      # not needed if using environment variable
        project="workspace-name/project-name",    # not needed if using environment variable
        experiment_name="tutorial-metrics-and-logs",
    )

    run.log_string_series(
        data={"status": "Starting training"},
        step=0,
    )

    num_steps = 20
    offset = random() / 5

    for step in range(1, num_steps):
        # Your training loop
        acc = 1 - 2**-step - random() / (step + 1) - offset
        loss = 2**-step + random() / (step + 1) + offset

        if step % (num_steps // 2) == 0:  # Add a simulated error
            run.log_string_series(
                data={"status": f"Step = {step}, Loss = NaN"},
                step=step,
            )
        elif step % 1 == 0:  # Always true; raise the modulus to log status less often
            run.log_string_series(
                data={"status": f"Step = {step}, All metrics logged"},
                step=step,
            )

        # Log metrics as usual
        run.log_metrics(
            data={"accuracy": acc, "loss": loss},
            step=step,
        )

    # Log the final message at a new step, after the last step used in the loop
    run.log_string_series(
        data={"status": "Training complete!"},
        step=num_steps,
    )

    run.close()


if __name__ == "__main__":
    hello_neptune()

For details, see Log a series of strings.
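The example script simulates an error at a fixed step. In a real training loop you would more likely derive the message from the metric itself. The following is a minimal sketch under that assumption: status_message and log_status are hypothetical helper names, and run is assumed to be a Run object like the one in the script above.

```python
import math


def status_message(step, loss):
    """Build the status line for one step, flagging a NaN or infinite loss."""
    if not math.isfinite(loss):
        return f"Step = {step}, Loss = {loss} (non-finite)"
    return f"Step = {step}, All metrics logged"


def log_status(run, step, loss):
    """Send the status line to the "status" string series of a Run-like object."""
    message = status_message(step, loss)
    # Same call shape as in the example script above.
    run.log_string_series(data={"status": message}, step=step)
    return message
```

Because the message is logged with the same step value as the metrics, the logs widget and the charts can later be lined up on that step.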

Other options

In addition to logging custom messages, Neptune offers the following options to track runtime information:

- Neptune logs the standard streams stderr and stdout automatically.
- You can capture logs with the Python Logger. For details, see Log Python Logger output.
- You can monitor hardware usage and log the results to Neptune with the neptune_hardware_monitoring.py utility script.

Build visualizations

To build custom visualizations in the Neptune app:

1. Create a dashboard or report.

   For a detailed comparison between them, see Gather and share insights.
2. To visualize the logged metadata, use widgets:
   - For metrics that you want to analyze, create chart widgets.
   - For the custom messages, create logs widgets.
3. Align the training charts to the logs.
   - To use relative time:
     - For chart widgets, set the X-axis series to relative time.
     - For logs widgets, set the time scale to relative time.
   - To use steps:
     - For chart widgets, set the X-axis to step.
     - For logs widgets, enable the Display steps option.
4. From the runs table, select a run whose metadata you want to see and compare.

Note that to view the contents of the logs widget, you must select only one run at a time. For details, see Select runs to compare.
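To illustrate what aligning on relative time means, the sketch below converts absolute timestamps into seconds elapsed since the earliest one. It assumes relative time is measured from the first timestamp in the data; the reference point that the Neptune app uses may differ. The function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timezone


def to_relative_seconds(timestamps):
    """Convert absolute timestamps to seconds elapsed since the earliest one."""
    if not timestamps:
        return []
    start = min(timestamps)
    return [(ts - start).total_seconds() for ts in timestamps]


# Hypothetical timestamps for three log entries of one run
entries = [
    datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    datetime(2025, 1, 1, 12, 0, 1, 500000, tzinfo=timezone.utc),
    datetime(2025, 1, 1, 12, 1, 0, tzinfo=timezone.utc),
]
relative = to_relative_seconds(entries)
```

Once metrics and messages are expressed on the same relative axis, entries from the same moment in the run line up regardless of when the run started.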

Analyze the correlation

Once you configure your dashboard or report, you can analyze the correlation between the metrics and logs.

For example, you can debug spikes in training charts by correlating them to the error messages logged at the same step or in the same time range. You can also check whether these effects appear in other metrics and decide whether to abandon or fork a training run.
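The same per-step correlation can be sketched offline. The example below is not a Neptune feature: it assumes you already have the loss values and the logged messages as plain Python dictionaries keyed by step, and all names, sample values, and the threshold are hypothetical.

```python
def find_spikes(loss_by_step, threshold):
    """Return steps where the loss rose by more than `threshold`
    compared with the previous logged step."""
    steps = sorted(loss_by_step)
    return [
        cur
        for prev, cur in zip(steps, steps[1:])
        if loss_by_step[cur] - loss_by_step[prev] > threshold
    ]


def messages_near(spike_steps, messages_by_step, window=0):
    """Map each spike step to the messages logged within `window` steps of it."""
    return {
        spike: [
            msg
            for step, msg in sorted(messages_by_step.items())
            if abs(step - spike) <= window
        ]
        for spike in spike_steps
    }


# Hypothetical data for one run
loss = {1: 0.9, 2: 0.5, 3: 0.3, 4: 1.2, 5: 0.25}
messages = {
    3: "Step = 3, All metrics logged",
    4: "Step = 4, Learning rate changed",
    5: "Step = 5, All metrics logged",
}

spikes = find_spikes(loss, threshold=0.5)
context = messages_near(spikes, messages, window=0)
```

Widening window includes messages from neighboring steps, which helps when the cause of a spike was logged slightly before the metric reacted.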

To zoom in on a chart, click and drag over its area. The zoom applies to all charts that share the same X-axis series in your current view.
