Resume run

You can update a run even after it has finished. This lets you add new data or visualizations to a previously closed run and makes multi-stage training convenient.

Why would you want to update an existing run?

Updating an existing run comes in handy in several situations:

  • You want to add metrics or visualizations to a closed run.

  • You finished model training and closed the run earlier, but now you want to continue training from where it stopped. You can repeat this cycle (resume the run, then log more data) as many times as you need. See the simple example below for details.

How do you resume a run?

Call neptune.init() with the run parameter set to the existing run's ID. You can then continue working with the run's metadata as if you had just started a brand-new run.

The following snippet illustrates the idea:

import neptune
# SUN-123 is the ID of the run you want to resume
run = neptune.init(run="SUN-123")
# download the snapshot of model weights to disk
run["train/model_weights"].download()
# 450 is the epoch from which you want to resume training
checkpoint = 450
# continue training as usual
for epoch in range(checkpoint, 1000):
    run["train/accuracy"].log(0.75)
    # ...

You can retrieve a run and log more data to it multiple times.

Run with MSE metric (ID: SHOW-35)
Same run as above, but with more data logged.
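
The resume-and-log-more cycle can be sketched without a Neptune account by using a small in-memory stand-in for the run object. FakeRun, FakeSeries, and continue_training are hypothetical names introduced only for this illustration and are not part of the Neptune API; the stand-in mimics just the run["path"].log(value) pattern used on this page. With Neptune, you would pass the object returned by neptune.init(run="SUN-123") instead.

```python
from collections import defaultdict


class FakeSeries:
    """Hypothetical stand-in for a series field; keeps logged values in a list."""

    def __init__(self):
        self.values = []

    def log(self, value):
        self.values.append(value)


class FakeRun:
    """Hypothetical stand-in for a run; maps field paths to series."""

    def __init__(self):
        self._fields = defaultdict(FakeSeries)

    def __getitem__(self, path):
        return self._fields[path]


def continue_training(run, start_epoch, end_epoch):
    """Log one placeholder accuracy value per epoch in [start_epoch, end_epoch)."""
    for epoch in range(start_epoch, end_epoch):
        # Placeholder metric; a real loop would train and evaluate here.
        run["train/accuracy"].log(0.75)
    # Return the epoch to resume from in the next session.
    return end_epoch


run = FakeRun()  # assumption: with Neptune, run = neptune.init(run="SUN-123")
checkpoint = continue_training(run, 0, 450)            # first session
checkpoint = continue_training(run, checkpoint, 1000)  # later session, after resuming
print(len(run["train/accuracy"].values))
```

Returning the last epoch from each session keeps the checkpoint explicit, so the next session can start exactly where the previous one stopped.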

Each time you resume a run, the Python file from which the run was started is uploaded (see example).