Neptune works well with pipelining libraries such as KubeFlow and provides tracking and visualization across the whole process.
You just need to make sure that all the steps (scripts) log metadata to the same run. You can do this by setting the
NEPTUNE_CUSTOM_RUN_ID environment variable to a unique ID: the Neptune client library treats all scripts started with
the same NEPTUNE_CUSTOM_RUN_ID value as a single run.
```shell
export NEPTUNE_CUSTOM_RUN_ID="SOME RANDOM ID"

# or even better
export NEPTUNE_CUSTOM_RUN_ID=`date | md5`
```
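To see how the grouping mechanism works end to end, here is a minimal sketch. The step commands below are hypothetical stand-ins for your real pipeline scripts (not part of Neptune); they only demonstrate that every process launched from the same shell inherits the same ID:

```shell
# Generate one unique ID for this pipeline execution (md5sum is the
# GNU/Linux equivalent of macOS md5).
export NEPTUNE_CUSTOM_RUN_ID=$(date +%s%N | md5sum | cut -c1-32)

# Stand-ins for real step scripts (e.g. prepare.py, train.py): each child
# process sees the same NEPTUNE_CUSTOM_RUN_ID, so the Neptune client would
# group their metadata into a single run.
python3 -c 'import os; print("preparation sees:", os.environ["NEPTUNE_CUSTOM_RUN_ID"])'
python3 -c 'import os; print("training sees:", os.environ["NEPTUNE_CUSTOM_RUN_ID"])'
```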
On top of that, you might want to use namespaces to organize the tracked metadata by pipeline step, e.g.:
```python
# in preparation script
run["preparation/input_dataset"].upload("test.csv")

# in training script
run["train/accuracy"].log(0.96)

# in validation script
run["validation/accuracy"] = 0.89