How to organize ML experimentation: a step-by-step guide

Introduction

This guide will show you how to:

  • Keep track of code, data, environment and parameters

  • Log results like evaluation metrics and model files

  • Find runs in the dashboard with tags

  • Organize runs in a dashboard view and save it for later

Before you start

Make sure you meet the following prerequisites before starting:

You can run this how-to on Google Colab with zero setup.

Just click on the Run in Google Colab link at the top of the page.

Step 1: Create a basic training script

As an example, I'll use a script that trains a scikit-learn model on the wine dataset.

You don't have to use scikit-learn to track your training runs with Neptune; I'm using it as an easy-to-follow example. There are links to integrations with other ML frameworks and useful articles in the text.

Create a file train.py and copy the script below.

train.py
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from joblib import dump

data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(data.data,
                                                    data.target,
                                                    test_size=0.4,
                                                    random_state=1234)

params = {'n_estimators': 10,
          'max_depth': 3,
          'min_samples_leaf': 1,
          'min_samples_split': 2,
          'max_features': 3,
          }

clf = RandomForestClassifier(**params)
clf.fit(X_train, y_train)

y_train_pred = clf.predict_proba(X_train)
y_test_pred = clf.predict_proba(X_test)

train_f1 = f1_score(y_train, y_train_pred.argmax(axis=1), average='macro')
test_f1 = f1_score(y_test, y_test_pred.argmax(axis=1), average='macro')
print(f'Train f1: {train_f1} | Test f1: {test_f1}')

dump(clf, 'model.pkl')

Run the training script to make sure that it works correctly.

python train.py
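Note that predict_proba returns one row of class probabilities per sample, so the script's argmax step converts those back to predicted class indices before computing f1_score. A minimal sketch with made-up probabilities:

```python
import numpy as np

# Each row holds class probabilities for one sample; argmax over axis=1
# picks the most likely class index, which f1_score compares to the labels.
proba = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
preds = proba.argmax(axis=1)
print(preds.tolist())  # → [0, 2]
```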

Step 2: Connect Neptune to your script

At the top of your script add:

import neptune.new as neptune

run = neptune.init(project='common/quickstarts',
                   api_token='ANONYMOUS')

This creates a new “run” in Neptune to which you can log metadata.

You need to tell Neptune who you are and where you want to log things. To do that you specify:

  • project=my_workspace/my_project: your workspace name and project name,

  • api_token=YOUR_API_TOKEN: your Neptune API token.

If you configured your Neptune API token as described in the docs, you can skip the api_token argument.

Step 3. Add parameter, code and environment tracking

Add parameter tracking:

run['parameters'] = params

You can add code and environment tracking at run creation:

run = neptune.init(source_files=['*.py', 'requirements.txt'])

You can log source code to Neptune with every run. It can save you if you forget to commit your code changes to git.

To do it, pass a list of files or glob-style patterns (like '*.py') to the source_files argument.
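Neptune does the file matching for you, but a small sketch with Python's fnmatch shows what a pattern like '*.py' would pick up (the file names below are hypothetical):

```python
from fnmatch import fnmatch

# Hypothetical project files; '*.py' matches the Python sources,
# mirroring source_files=['*.py', 'requirements.txt'].
files = ['train.py', 'utils.py', 'requirements.txt', 'data/raw.csv']
matched = [f for f in files if fnmatch(f, '*.py')]
print(matched)  # → ['train.py', 'utils.py']
```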

If you start a run from a directory that is part of a git repository, Neptune will automatically find the .git directory and log some information from it:

  • whether the repo has uncommitted changes (dirty flag),

  • commit information (id, message, author, date),

  • branch,

  • remote address to your run,

  • git checkout command with commit.

Putting it all together, your neptune.init call should look like this:

import neptune.new as neptune

run = neptune.init(project='common/quickstarts',
                   api_token='ANONYMOUS',
                   source_files=['*.py', 'requirements.txt'])

run['parameters'] = params

Step 4. Add tags to organize things

Runs can be viewed as dictionary-like structures, called namespaces, that you define in your code. You can apply a hierarchical structure to your metadata, and it will be reflected in the UI as well. Thanks to this, you can easily organize your metadata in the way you find most convenient.
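To illustrate the dictionary-like structure, here is a small Neptune-independent sketch of how slash-separated paths form a nested hierarchy (the metric values are made up):

```python
# Slash-separated paths, as used when logging to a run, form a hierarchy
# much like nested dictionaries.
flat = {'train/f1': 0.98, 'test/f1': 0.93, 'parameters/max_depth': 3}

nested = {}
for path, value in flat.items():
    node = nested
    *parents, leaf = path.split('/')
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value

print(nested['train']['f1'])  # → 0.98
print(sorted(nested))         # → ['parameters', 'test', 'train']
```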

There is one special namespace: the system namespace, denoted sys. You can use it to add a name and tags to the run.

Pass a list of strings to the sys/tags field:

run["sys/tags"].add(['run-organization', 'me']) # organize things

It will help you find runs later, especially if you try a lot of ideas.
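One hypothetical convention: derive extra tags from the parameter values, so runs can be filtered by configuration later. With a Neptune run active you would pass the result to run["sys/tags"].add(tags):

```python
# Hypothetical: build tags from the experiment setup. Sorting the items
# keeps the tag order stable across runs.
params = {'n_estimators': 10, 'max_depth': 3}
tags = ['run-organization'] + [f'{k}={v}' for k, v in sorted(params.items())]
print(tags)  # → ['run-organization', 'max_depth=3', 'n_estimators=10']
```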

Step 5. Add logging of train and evaluation metrics

run['train/f1'] = train_f1
run['test/f1'] = test_f1

Log all the scores you care about in the same way as above. There can be as many as you like.

You can log multiple values to the same metric:

loss = ...
run['train/loss'].log(loss)

When you do that, a chart will be created automatically.
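For instance, a training loop might call .log() once per epoch; every call appends one point to the series. A sketch with a stand-in loss (the Neptune call is commented out so it runs without an active run):

```python
losses = []
for epoch in range(3):
    loss = 1.0 / (epoch + 1)       # stand-in for a real training loss
    losses.append(loss)
    # run['train/loss'].log(loss)  # each call adds one point to the chart
print(losses)  # → [1.0, 0.5, 0.3333333333333333]
```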

Step 6. Add logging of model files

run["model"].upload('model.pkl')

Log your model with the .upload() method. Just pass the path to the file you want to log to Neptune.

Step 7. Make a few runs with different parameters

Let's execute some runs with different model configurations.

Change the parameters in the params dictionary:

params = {'n_estimators': 10,
          'max_depth': 3,
          'min_samples_leaf': 1,
          'min_samples_split': 2,
          'max_features': 3,
          }

Execute a run:

python train.py
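If you want to be systematic about it, you can generate the configurations as a grid; each combination would then be one execution of train.py. A sketch with a small hypothetical grid over two of the parameters:

```python
from itertools import product

# Hypothetical grid; the remaining parameter values stay fixed.
grid = {'n_estimators': [10, 50], 'max_depth': [3, 5]}
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

print(len(configs))  # → 4
print(configs[0])    # → {'n_estimators': 10, 'max_depth': 3}
```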

Step 8. Go to Neptune UI

Click on one of the links created when you ran the script, or go directly to the app.

If you are logging things to the public project common/quickstarts, you can just follow this link.

Step 9. See that everything got logged

Go to one of the runs you made and see that you logged things correctly:

  • Click on the run link or one of the rows in the runs table in the UI,

  • Go to Parameters section to see your parameters

  • Go to Monitoring to see hardware utilization charts

  • Go to All metadata to review all logged metadata

Step 10. Filter runs by tag

Go to the runs table and filter by the run-organization tag.

Neptune should filter all those runs for you.

Step 11. Choose parameter and metric columns you want to see

Use the Add column button to choose the columns for the runs table:

  • Click on Add column,

  • Type the name of the metadata you are interested in, for example test/f1,

  • Click on test/f1 to add it.

You can also use the suggested columns, which show you the columns whose values differ between the selected runs.

Just click on the "+" to add one to your runs table.

Step 12. Save the view of runs table

You can save the current view of the runs table for later:

  • Click on Save as new

Both the columns and the row filters will be saved as a view.

Create and save multiple views of the runs table for different use cases or run groups.

What’s next

Now that you know how to keep track of runs and organize them, you can:
