Pipelines

Neptune works well with pipelining libraries such as Kubeflow, providing tracking and visualization for each step of the process.

You just need to make sure that all the steps (scripts) track data to the same run. To do this, set the environment variable NEPTUNE_CUSTOM_RUN_ID to a unique ID: the Neptune client library treats all scripts started with the same NEPTUNE_CUSTOM_RUN_ID value as one run.

export NEPTUNE_CUSTOM_RUN_ID="SOME RANDOM ID"
# or, better, generate one (use md5sum instead of md5 on Linux)
export NEPTUNE_CUSTOM_RUN_ID=$(date | md5)
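If the pipeline is launched from a Python entry point, you can generate the ID programmatically instead. A minimal sketch using only the standard library (the variable name is illustrative; any sufficiently unique string works):

```python
import os
import uuid

# Generate a unique ID once, before launching the pipeline steps.
# Child processes inherit the environment, so every step script
# started afterwards will see the same NEPTUNE_CUSTOM_RUN_ID.
custom_run_id = uuid.uuid4().hex
os.environ["NEPTUNE_CUSTOM_RUN_ID"] = custom_run_id
```

Because the value is set in the parent process's environment, steps spawned with `subprocess` (or an orchestrator that forwards the environment) are grouped into one run without any further coordination.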

On top of that, you might want to use namespaces to organize the tracked metadata by pipeline step. For example:

# in preparation script
run["preparation/input_dataset"].upload("test.csv")
# in training script
run["train/accuracy"].log(0.96)
# in validation script
run["validation/accuracy"] = 0.89