Tracking hyperparameter optimization jobs with Neptune#
When running an HPO job, you can use Neptune to track the metadata from the study as well as each trial.
In this guide, we'll show you how to configure Neptune for your HPO job in two ways:
- By logging the metadata from all trials to the same Neptune run
- By creating a separate Neptune run for each trial
Integration tip
Neptune integrates directly with Optuna, a hyperparameter optimization framework.
For a detailed guide, see Working with Optuna.
Before you start#
- Set up Neptune.
To follow the guide, you'll additionally need to have `torch`, `torchvision`, and `tqdm` installed:
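```sh
pip install torch torchvision tqdm
```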
Setting up the training script#
In this example, we'll set up a model training script with PyTorch.
- Import the needed libraries:
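A minimal set of imports covering the Neptune client, model, data loading, and progress bar might look as follows (a sketch; trim it to what you actually use):

```python
import neptune
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from tqdm.auto import trange
```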
- Define hyperparameters and the search space:
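For example, with placeholder values (adjust them to your setup; the keys below are assumed by the training loops shown later in this guide):

```python
parameters = {
    "batch_size": 128,
    "epochs": 2,
    "input_size": 28 * 28,  # flattened image size (assumes the MNIST data below)
    "hidden_dim": 64,
    "n_classes": 10,
    "device": "cuda" if torch.cuda.is_available() else "cpu",
}

# Search space: the learning rates to sweep over
learning_rates = [1e-4, 1e-3, 1e-2]
```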
- Set up the model:
```python
class BaseModel(nn.Module):
    def __init__(self, input_size, hidden_dim, n_classes):
        super(BaseModel, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(input_size, hidden_dim * 2),
            nn.ReLU(),
            nn.Linear(hidden_dim * 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, n_classes),
        )
        self.input_size = input_size

    def forward(self, input):
        x = input.view(-1, self.input_size)
        return self.main(x)
```
- Set up datasets:
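As a sketch, the loader below uses torchvision's MNIST dataset (an assumption; any classification dataset works, as long as `input_size` matches the flattened input dimension):

```python
transform = transforms.ToTensor()

trainset = datasets.MNIST(
    root="data", train=True, download=True, transform=transform
)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=parameters["batch_size"], shuffle=True
)
```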
Next, set up the training loop depending on your approach:
Logging all trials to the same run#
In this approach, we'll create a global Neptune run for logging metadata (such as metrics) across the trials.
- Initialize a Neptune run:
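If your credentials are saved as environment variables, initializing the run is as simple as:

```python
run = neptune.init_run(tags=["sweep-level"])  # (1)!
```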
- To identify a run that contains metadata from multiple trials.
If you haven't saved your credentials as environment variables, you can pass them as arguments when initializing Neptune:
```python
run = neptune.init_run(
    project="workspace-name/project-name",
    api_token="YourNeptuneApiTokenHere",
    tags=["sweep-level"],
)
```
How do I save my credentials as environment variables?
Set your Neptune API token and full project name to the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables, respectively. For example (on Windows, the command is `set` instead of `export`):
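```sh
export NEPTUNE_API_TOKEN="YourNeptuneApiTokenHere"
export NEPTUNE_PROJECT="workspace-name/project-name"
```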
Finding your credentials:
- API token: In the top-right corner of the Neptune app, click your avatar and select Get your API token.
- Project: Your full project name has the form `workspace-name/project-name`. To copy the name, navigate to your project → Settings → Properties.
If you're working in Colab, you can set your credentials with the `os` and `getpass` libraries:
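```python
import os
from getpass import getpass

os.environ["NEPTUNE_API_TOKEN"] = getpass("Enter your Neptune API token: ")
os.environ["NEPTUNE_PROJECT"] = "workspace-name/project-name"
```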
- Set up the training loop:
```python
for (i, lr) in enumerate(learning_rates):
    # Log hyperparameters under a per-trial namespace
    run[f"trials/{i}/params"] = parameters
    run[f"trials/{i}/params/lr"] = lr

    # Create a fresh model, loss, and optimizer for each trial,
    # so trials don't share state
    model = BaseModel(
        parameters["input_size"], parameters["hidden_dim"], parameters["n_classes"]
    ).to(parameters["device"])
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=lr)

    for _ in trange(parameters["epochs"]):
        for (x, y) in trainloader:
            x, y = x.to(parameters["device"]), y.to(parameters["device"])
            optimizer.zero_grad()
            outputs = model(x)
            loss = criterion(outputs, y)
            _, preds = torch.max(outputs, 1)
            acc = torch.sum(preds == y.data) / len(x)

            # Log metrics under the same per-trial namespace
            run[f"trials/{i}/training/batch/loss"].append(loss)
            run[f"trials/{i}/training/batch/acc"].append(acc)

            loss.backward()
            optimizer.step()
```
- To stop the connection to Neptune and sync all data, call the `stop()` method:
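```python
run.stop()
```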
Note for interactive sessions
Always call `stop()` in interactive environments, such as a Python interpreter or Jupyter notebook. The connection to Neptune is not stopped when the cell finishes executing, but only when the entire notebook stops.
If you're running a script, the connection is stopped automatically when the script finishes executing. However, it's a best practice to call `stop()` once the connection is no longer needed.
Analyzing results in Neptune#
When browsing the metadata of the sweep-level run, you can see a namespace called `trials`. It contains the metadata (`params` and training metrics) logged for each trial.
Logging each trial to a separate run#
In this approach, we'll create local Neptune runs that log metadata from each trial separately.
After Setting up the training script, add the following training loop:
```python
for (i, lr) in enumerate(learning_rates):
    # Create a new run for each trial
    run = neptune.init_run(
        name=f"trial-{i}",
        tags=["trial-level"],  # (1)!
    )

    # Log hyperparameters
    run["params"] = parameters
    run["params/lr"] = lr

    # Create a fresh model, loss, and optimizer for each trial,
    # so trials don't share state
    model = BaseModel(
        parameters["input_size"], parameters["hidden_dim"], parameters["n_classes"]
    ).to(parameters["device"])
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=lr)

    for _ in trange(parameters["epochs"]):
        for (x, y) in trainloader:
            x, y = x.to(parameters["device"]), y.to(parameters["device"])
            optimizer.zero_grad()
            outputs = model(x)
            loss = criterion(outputs, y)
            _, preds = torch.max(outputs, 1)
            acc = torch.sum(preds == y.data) / len(x)

            # Log metrics
            run["training/batch/loss"].append(loss)
            run["training/batch/acc"].append(acc)

            loss.backward()
            optimizer.step()

    # Important: stop each run inside the loop
    run.stop()
```
- To indicate that the run only contains results from a single trial.
Analyzing results in Neptune#
Click on the run to browse the metadata.
You can see that the run only contains metadata (params and training metrics) from a single trial.
Related
- API reference ≫ append()
- What you can log and display
- Integrations ≫ Working with Optuna