Forking runs in Neptune
Forking means creating a run that inherits metadata from an existing experiment run. This way, you can restart an experiment from a particular step.
To create a fork run, specify the parent run and step to fork from:
from neptune_scale import Run

run = Run(
    experiment_name="swim-further",
    fork_run_id="likable-barracuda",  # parent run ID
    fork_step=100,
)
To specify the run ID, use the sys/custom_run_id attribute.
It's the run identifier that you set manually via the run_id argument of the Run constructor. If not provided, it's auto-generated.
Inheritance
A child run inherits the following from its parent:
- By default, configs and other single values logged with log_configs(). To disable inheritance of configs and other single values, set the inherit_configs argument of the Run constructor to False.
- All series attributes, up to and including the fork step. This includes metrics as well as series of strings, files, and histograms.
When the child run continues logging values to metrics, points logged at or after the fork step become part of the experiment lineage:
run = Run(
    experiment_name="swim-further",
    fork_run_id="likable-barracuda",  # parent run ID
    fork_step=100,
)

for step in epoch:
    # your training loop
    run.log_metrics(
        data={"loss": 0.11, "acc": 0.81},
        step=101,  # also OK: 100 or 100.1
    )
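To make the idea of a lineage concrete, the following is a minimal sketch in plain Python. It does not call the Neptune API. The function name lineage_series and the sample values are hypothetical, and the merge logic is an assumption based on the inheritance rule described above (parent points up to and including the fork step, followed by the child's own points).

```python
from typing import Dict


def lineage_series(
    parent: Dict[float, float],
    child: Dict[float, float],
    fork_step: float,
) -> Dict[float, float]:
    """Hypothetical model of a lineage view for one metric."""
    # Inherited part: the parent's points up to and including the fork step
    inherited = {step: value for step, value in parent.items() if step <= fork_step}
    # Add the points that the child logged directly
    merged = {**inherited, **child}
    # Return the points ordered by step
    return dict(sorted(merged.items()))


# The parent kept logging after step 100, but the child was forked at step 100
parent_loss = {98: 0.15, 99: 0.14, 100: 0.13, 101: 0.125, 102: 0.12}
child_loss = {101: 0.11, 102: 0.10}

lineage = lineage_series(parent_loss, child_loss, fork_step=100)
print(list(lineage))  # steps in the lineage, in order
```

In this model, the parent's points after the fork step are not part of the child's lineage. Only the child's own values appear for steps 101 and 102.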
In the Neptune app, you can then view the experiment lineage of runs as one continuous graph. Compare the graphs showing individual run metrics and the unified experiment graph:
Inherited attributes
| Metadata | Attribute type or path | How you log it |
|---|---|---|
| Metrics | FloatSeries | log_metrics() |
| Text logs | StringSeries | log_string_series() |
| Files | File | assign_files() |
| File series | FileSeries | log_files() |
| Histograms | HistogramSeries | log_histograms() |
| Tags | StringSet | add_tags() or via tags management in the app |
| Description | sys/description | Run.log_configs(data={"sys/description": "..."}) or in the app |
| Configs* | Float, String, Boolean, Integer, Datetime | log_configs() |
* Inherited by default when using the log_configs() function of the neptune-scale API.
Parent selection
If a run and its child are created in quick succession, the requested parent might not yet be available on the server.
If the requested parent can't be retrieved, Neptune instead sets the parent to the experiment's latest head run.
The requested and actual parent runs are logged in the sys/forking/requested_parent and sys/forking/parent attributes, respectively.
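The fallback can be pictured with a small plain-Python model. This is only an illustration of the behavior described above, not Neptune's implementation: the function select_parent, its parameters, and the run names are hypothetical. Only the two attribute paths come from this page.

```python
from typing import Dict, Set


def select_parent(
    requested_parent: str,
    available_runs: Set[str],
    experiment_head: str,
) -> Dict[str, str]:
    """Hypothetical model: fall back to the head run if the parent is unavailable."""
    if requested_parent in available_runs:
        actual_parent = requested_parent
    else:
        # The requested parent can't be retrieved yet: use the experiment's head run
        actual_parent = experiment_head
    return {
        "sys/forking/requested_parent": requested_parent,
        "sys/forking/parent": actual_parent,
    }


# "run-b" was created moments ago and isn't visible on the server yet
forking_info = select_parent(
    requested_parent="run-b",
    available_runs={"run-a"},
    experiment_head="run-a",
)
print(forking_info["sys/forking/parent"])  # run-a
```

Comparing the two attributes after creating a fork shows whether the fallback was applied.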
Forking at an earlier step than the parent's fork step
When creating a fork run at a step that is earlier than the parent's fork step, Neptune instead forks the run that actually contains that step in its range of directly logged steps. This is done to preserve the monotonicity of fork steps in the lineage.
Example:
- Run r1 is created. It contains steps 0-20.
- Run r2 is forked from r1 at step 10. It inherits steps 0-10 and directly logs steps 11-30.
- Run r3 is forked from r2 at step 20. It inherits steps 0-20 and directly logs steps 21-40.
- Run r4 is forked from r3 at step 15. Because r3 was itself forked at a later step, the parent of r4 is set to r2 instead of r3.
The same applies when Neptune selects a parent based on the experiment name: the head of the experiment is the initial candidate, but if the fork step comes before the parent's fork step, Neptune goes down the lineage to find a valid parent.
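The walk down the lineage can be sketched in plain Python using the r1 to r4 example above. This is a simplified model and not Neptune's actual code: the RunInfo class and the resolve_parent function are hypothetical. How a fork step exactly equal to the candidate's own fork step is handled is an assumption here (the candidate is kept).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RunInfo:
    run_id: str
    parent_id: Optional[str] = None     # None for a run that wasn't forked
    fork_step: Optional[float] = None   # step at which this run was forked


def resolve_parent(runs: Dict[str, RunInfo], requested_id: str, fork_step: float) -> str:
    """Hypothetical model: find the run that directly logged the fork step."""
    candidate = runs[requested_id]
    # While the requested step precedes the candidate's own fork step,
    # the step was inherited, so move to the candidate's parent.
    while (
        candidate.parent_id is not None
        and candidate.fork_step is not None
        and fork_step < candidate.fork_step
    ):
        candidate = runs[candidate.parent_id]
    return candidate.run_id


runs = {
    "r1": RunInfo("r1"),
    "r2": RunInfo("r2", parent_id="r1", fork_step=10),
    "r3": RunInfo("r3", parent_id="r2", fork_step=20),
}

# r4 requests r3 as its parent at step 15, which r3 inherited from r2
print(resolve_parent(runs, "r3", 15))  # r2
```

Because each hop only moves to a run with an earlier fork step, the fork steps along the resulting lineage stay monotonic.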
Log points before the fork step
For an inherited series, a child run can log points for pre-fork steps, as long as they come after the parent's last logged step:
ParentMaxStep < ChildStep ≤ ForkStep → OK
ChildStep ≤ ParentMaxStep ≤ ForkStep → Error
When viewing the child run's metrics and the mode in the table toolbar is set to Runs:
- Show inherited datapoints enabled: Points inherited from the parent are displayed for steps before the fork point. Otherwise, points logged directly by the child run are displayed.
- Show inherited datapoints disabled: Only points directly logged by the child run itself are displayed.
If a series attribute doesn't exist in the parent run, the child run can log points starting from any step:
run = Run(
    ...,
    fork_step=100,
)

for step in some_range:
    run.log_metrics(
        data={"new_metric": 0.123},  # "new_metric" doesn't exist in the parent run
        step=1,  # OK
    )
Such points are passed on to further descendant runs.
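The step rules in this section can be summarized as a small check in plain Python. This is a sketch for reasoning about which steps are accepted, not part of the Neptune API: the function can_log_step is hypothetical, and parent_max_step is assumed to be the parent's last logged step for the series at or before the fork step, or None if the series doesn't exist in the parent.

```python
from typing import Optional


def can_log_step(
    child_step: float,
    fork_step: float,
    parent_max_step: Optional[float],
) -> bool:
    """Hypothetical check of whether a child run may log a point at child_step."""
    if parent_max_step is None:
        # The series doesn't exist in the parent: any step is allowed
        return True
    if child_step > fork_step:
        # Steps after the fork step belong to the child
        return True
    # Pre-fork steps must come after the parent's last logged step
    return parent_max_step < child_step


# The parent logged "loss" up to step 90, and the child was forked at step 100
print(can_log_step(95, fork_step=100, parent_max_step=90))   # True
print(can_log_step(90, fork_step=100, parent_max_step=90))   # False
print(can_log_step(1, fork_step=100, parent_max_step=None))  # True
```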
Get experiment URL
To get a link to the experiment in the Neptune web app:
run = Run(experiment_name=...)
run.get_experiment_url()
For details, see Construct Neptune URLs.