# Log from multiple scripts
Whether you're logging from multiple scripts, processes, or parts of a pipeline, you can log all generated metadata to a single run. You have a couple of options for this:
- Pass the `Run` object between functions: you can use the `Run` object as a parameter in functions that you import from other scripts.
- Create a custom identifier for the run and use it to access the same run from multiple locations. You can also export the custom run ID as an environment variable (`NEPTUNE_CUSTOM_RUN_ID`).
## Why log to the same run from multiple scripts?
There are two major use cases that this functionality supports:
- Tracking runs in a distributed setup: In this case, the same script is run from different processes, but you would ideally track all of the metadata in the same Neptune run. For a tutorial, see Tracking distributed training jobs with Neptune.
- Logging multiple steps of a pipeline to the same run: For example, you have a pipeline with separate scripts for preprocessing, training, and evaluation, and you want to log all of them to the same run. For a tutorial, see Logging in a sequential pipeline.
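
For the pipeline case, a minimal sketch assuming the neptune Python client: each step passes the same custom run ID (an illustrative value here; the same effect can be achieved by exporting it as `NEPTUNE_CUSTOM_RUN_ID`), so all steps log to one run.

```python
import neptune

CUSTOM_ID = "pipeline-2024-06-01"  # illustrative; share it across all pipeline steps

# preprocess.py
run = neptune.init_run(custom_run_id=CUSTOM_ID)
run["preprocessing/num_rows"] = 10_000
run.stop()

# train.py -- run later, possibly from a different process
run = neptune.init_run(custom_run_id=CUSTOM_ID)
run["training/epochs"] = 5
run.stop()
```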
## Best practices
### How do I avoid accidentally overwriting a metric from multiple scripts?
This requires the same safeguards as logging from a single script.
Keep in mind that fields containing a single value (or a single file) are always overwritten by the most recently logged entry, while fields containing a series of values have new entries appended.
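
To illustrate the difference, a short sketch assuming the neptune Python client and illustrative field names:

```python
import neptune

run = neptune.init_run()

# Single-value field: each assignment overwrites the previous value
run["params/learning_rate"] = 0.01
run["params/learning_rate"] = 0.001  # the field now holds only 0.001

# Series field: each append adds a new entry to the series
for epoch in range(3):
    run["train/loss"].append(1.0 / (epoch + 1))

run.stop()
```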
### How does Neptune distinguish metrics logged from different scripts?
Neptune does not. As such, it's even more important to ensure that you do not accidentally overwrite or update metadata already logged to a run.
If a metric is logged under the same field name (`run["field_name"] = some_metric`), Neptune assumes that the intention is to overwrite that field with the new value. To keep metrics from different scripts apart, log them to different field names inside the run.
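
One way to keep metrics apart is to give each script its own namespace inside the run. A sketch, assuming the neptune Python client, with namespace names chosen for illustration:

```python
import neptune

# Both scripts open the same run through a shared custom run ID
run = neptune.init_run(custom_run_id="pipeline-2024-06-01")

# The preprocessing script logs under its own namespace...
run["preprocessing/duration_seconds"] = 42.7

# ...while the training script logs under another, so neither overwrites the other
run["training/duration_seconds"] = 318.5

run.stop()
```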