Tracking hyperparameter optimization jobs with Neptune#
When optimizing or tuning hyperparameters, you can use Neptune to track the metadata from the study as well as each trial.
In this guide, we'll show you how to configure Neptune for your HPO job in two ways:
- By logging the metadata from all trials to the same Neptune run
- By creating a separate Neptune run for each trial
Integration tip
Neptune integrates directly with Optuna, a hyperparameter optimization framework.
For a detailed guide, see Optuna integration guide.
Before you start#
- Sign up at neptune.ai/register.
- Create a project for storing your metadata.
- Install Neptune:
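For example, with pip (or with conda, using the conda-forge channel as described below):

```bash
pip install neptune
# or, via conda:
# conda install -c conda-forge neptune
```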
Installing through Anaconda Navigator
To find neptune, you may need to update your channels and index.
- In the Navigator, select Environments.
- In the package view, click Channels.
- Click Add..., enter conda-forge, and click Update channels.
- In the package view, click Update index... and wait until the update is complete. This can take several minutes.
- You should now be able to search for neptune.
Note: The displayed version may be outdated. The latest version of the package will be installed.
Note: On Bioconda, there is a "neptune" package available which is not the neptune.ai client library. Make sure to specify the "conda-forge" channel when installing neptune.ai.
Passing your Neptune credentials
Once you've registered and created a project, set your Neptune API token and full project name to the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables, respectively.

To find your API token: In the bottom-left corner of the Neptune app, expand the user menu and select Get my API token.

To find your project: Your full project name has the form `workspace-name/project-name`. To copy the name, click the menu in the top-right corner and select Edit project details.
Although it's not recommended, especially for the API token, you can also pass your credentials in the code when initializing Neptune:
```python
run = neptune.init_run(
    project="ml-team/classification",  # your full project name here
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh...3Kb8",  # your API token here
)
```
For more help, see Set Neptune credentials.
To follow the guide, you'll additionally need to have torch, torchvision, and tqdm installed:
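For example, with pip:

```bash
pip install torch torchvision tqdm
```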
Setting up the training script#
In this example, we'll set up a model training script with PyTorch.
- Import the needed libraries:
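The snippets in this guide assume roughly the following imports (the exact set depends on which steps you follow):

```python
from functools import reduce

import neptune
import torch
import torch.nn as nn
import torch.optim as optim
from neptune.utils import stringify_unsupported
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from tqdm.auto import trange
```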
- Define hyperparameters and the search space:
```python
parameters = {
    "batch_size": 128,
    "epochs": 1,
    "input_size": (3, 32, 32),
    "n_classes": 10,
    "dataset_size": 1000,
    "model_filename": "basemodel",
    "device": torch.device("cuda:0" if torch.cuda.is_available() else "cpu"),
}

input_size = reduce(lambda x, y: x * y, parameters["input_size"])
learning_rates = [1e-4, 1e-3, 1e-2]  # learning rate choices
```
- Set up the model:
```python
class BaseModel(nn.Module):
    def __init__(self, input_size, hidden_dim, n_classes):
        super(BaseModel, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(input_size, hidden_dim * 2),
            nn.ReLU(),
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, n_classes),
        )
        self.input_size = input_size

    def forward(self, input):
        x = input.view(-1, self.input_size)
        return self.main(x)
```
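The training loops later in this guide also reference a model instance and a loss criterion. A minimal sketch, assuming the flattened input size is reused as the hidden dimension:

```python
# Instantiate the model and loss criterion used in the training loops below.
# Reusing input_size as the hidden dimension is an assumption for this sketch.
model = BaseModel(
    input_size,
    input_size,
    parameters["n_classes"],
).to(parameters["device"])
criterion = nn.CrossEntropyLoss()
```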
- Set up datasets:
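As a sketch, you can generate a synthetic dataset with torchvision's FakeData, matching the sizes defined in parameters. Any dataset that yields 3×32×32 images with 10 classes works just as well.

```python
data_tfms = {
    "train": transforms.Compose(
        [
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
        ]
    )
}

# Synthetic image dataset with the shape and size defined in `parameters`
trainset = datasets.FakeData(
    size=parameters["dataset_size"],
    image_size=parameters["input_size"],
    num_classes=parameters["n_classes"],
    transform=data_tfms["train"],
)

# DataLoader consumed by the training loops below
trainloader = DataLoader(
    trainset,
    batch_size=parameters["batch_size"],
    shuffle=True,
    num_workers=0,
)
```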
Next, set up the training loop depending on your approach:
Logging all trials to the same run#
In this approach, we'll create a global Neptune run for logging metadata (such as metrics) across the trials.
- Initialize a Neptune run:
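A minimal sketch, assuming your credentials are saved as environment variables:

```python
import neptune

run = neptune.init_run(
    tags=["sweep-level"],  # (1)!
)
```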
- To identify a run that contains metadata from multiple trials.
If you haven't saved your credentials as environment variables, you can pass them as arguments when initializing Neptune:
```python
neptune.init_run(
    project="workspace-name/project-name",
    api_token="YourNeptuneApiTokenHere",
    tags=["sweep-level"],
)
```
How do I save my credentials as environment variables?
Set your Neptune API token and full project name to the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables, respectively. For example:
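On Linux and macOS (the values shown are placeholders):

```bash
export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh...3Kb8"
export NEPTUNE_PROJECT="workspace-name/project-name"
```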
- On Windows, the command is `set` instead of `export`.
Finding your credentials:
- API token: In the bottom-left corner of the Neptune app, expand your user menu and select Get your API token.
- Project: Your full project name has the form `workspace-name/project-name`. To copy the name, click the menu in the top-right corner and select Edit project details.
If you're working in Colab, you can set your credentials with the os and getpass libraries:
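For example (the prompt text and project name are placeholders):

```python
import os
from getpass import getpass

os.environ["NEPTUNE_API_TOKEN"] = getpass("Enter your Neptune API token: ")
os.environ["NEPTUNE_PROJECT"] = "workspace-name/project-name"  # replace with your own
```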
- Set up the training loop:
```python
for (i, lr) in enumerate(learning_rates):
    # Log hyperparameters
    run[f"trials/{i}/parms"] = stringify_unsupported(parameters)
    run[f"trials/{i}/parms/lr"] = lr

    optimizer = optim.SGD(model.parameters(), lr=lr)

    for _ in trange(parameters["epochs"]):
        for (x, y) in trainloader:
            x, y = x.to(parameters["device"]), y.to(parameters["device"])
            optimizer.zero_grad()
            outputs = model.forward(x)
            loss = criterion(outputs, y)
            _, preds = torch.max(outputs, 1)
            acc = (torch.sum(preds == y.data)) / len(x)

            # Log metrics
            run[f"trials/{i}/training/batch/loss"].append(loss)
            run[f"trials/{i}/training/batch/acc"].append(acc)

            loss.backward()
            optimizer.step()
```
- To stop the connection to Neptune and sync all data, call the `stop()` method:
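```python
run.stop()
```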
Analyzing results in Neptune#
When browsing the metadata of the sweep-level run, you can see a namespace called trials. It contains the metadata (parms and training metrics) logged for each trial.
Logging each trial to a separate run#
In this approach, we'll create local Neptune runs that log metadata from each trial separately.
After setting up the training script, add the following training loop:
```python
for (i, lr) in enumerate(learning_rates):
    # Create a new run
    run = neptune.init_run(
        name=f"trial-{i}",
        tags=["trial-level"],  # (1)!
    )

    # Log hyperparameters
    run["parms"] = stringify_unsupported(parameters)
    run["parms/lr"] = lr

    optimizer = optim.SGD(model.parameters(), lr=lr)

    for _ in trange(parameters["epochs"]):
        for (x, y) in trainloader:
            x, y = x.to(parameters["device"]), y.to(parameters["device"])
            optimizer.zero_grad()
            outputs = model.forward(x)
            loss = criterion(outputs, y)
            _, preds = torch.max(outputs, 1)
            acc = (torch.sum(preds == y.data)) / len(x)

            # Log metrics
            run["training/batch/loss"].append(loss)
            run["training/batch/acc"].append(acc)

            loss.backward()
            optimizer.step()

    # Important - stop each run inside the loop
    run.stop()
```
- To indicate that the run only contains results from a single trial.
Analyzing results in Neptune#
Click on the run to browse the metadata.
You can see that the run only contains metadata (parms and training metrics) from a single trial.
Related
- API reference ≫ append()
- What you can log and display
- Integrations ≫ Optuna integration guide