Optuna integration guide#
Optuna is an open source hyperparameter optimization framework to automate hyperparameter search. With the Neptune-Optuna integration, you can:
- Log and monitor the Optuna hyperparameter sweep live:
    - Values and params for each trial
    - Best values and params for the study
    - Hardware consumption and console logs
    - Interactive plots from the `optuna.visualization` module
    - Parameter distributions for each trial
    - The `Study` object itself (for `InMemoryStorage`), or the database location for studies with database storage
- Load the study directly from an existing Neptune run
Before you start#
- Sign up at neptune.ai/register.
- Create a project for storing your metadata.
- Have Optuna installed.
Installing the integration#
To use your preinstalled version of Neptune together with the integration:
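For example (assuming the integration ships as the `neptune-optuna` package on PyPI):

```sh
pip install -U neptune-optuna
```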
To install both Neptune and the integration:
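For example (assuming the `optuna` extra is available for the `neptune` package):

```sh
pip install -U "neptune[optuna]"
```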
Passing your Neptune credentials
Once you've registered and created a project, set your Neptune API token and full project name to the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables, respectively.
To find your API token: In the bottom-left corner of the Neptune app, expand the user menu and select Get my API token.
Your full project name has the form `workspace-name/project-name`. You can copy it from the project settings: Click the menu in the top-right → Details & privacy.
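For example, on Linux or macOS (using the placeholder values from the snippet below):

```sh
export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh...3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"
```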
On Windows, navigate to Settings → Edit the system environment variables, or enter the following in Command Prompt: `setx SOME_NEPTUNE_VARIABLE 'some-value'`
Although not recommended, especially for the API token, you can also pass your credentials in the code when initializing Neptune:
```python
run = neptune.init_run(
    project="ml-team/classification",  # your full project name here
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh...3Kb8",  # your API token here
)
```
For more help, see Set Neptune credentials.
If you'd rather follow the guide without any setup, you can run the example in Colab.
This integration is not supported on conda.
Optuna logging example#
This example shows how to use `NeptuneCallback` to log Optuna visualizations, `Study` objects, and other metadata. For how to customize the `NeptuneCallback`, see the More options section.
1. Define your objective, for example:
    ```python
    import lightgbm as lgb
    import optuna
    from sklearn.datasets import load_breast_cancer
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split


    def objective(trial):
        data, target = load_breast_cancer(return_X_y=True)
        train_x, test_x, train_y, test_y = train_test_split(
            data, target, test_size=0.25
        )
        dtrain = lgb.Dataset(train_x, label=train_y)

        param = {
            "verbose": -1,
            "objective": "binary",
            "metric": "binary_logloss",
            "num_leaves": trial.suggest_int("num_leaves", 2, 256),
            "feature_fraction": trial.suggest_float(
                "feature_fraction", 0.2, 1.0, step=0.1
            ),
            "bagging_fraction": trial.suggest_float(
                "bagging_fraction", 0.2, 1.0, step=0.1
            ),
            "min_child_samples": trial.suggest_int("min_child_samples", 3, 100),
        }

        gbm = lgb.train(param, dtrain)
        preds = gbm.predict(test_x)
        return roc_auc_score(test_y, preds)
    ```
2. Import Neptune and create a run:
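    A minimal sketch of this step, assuming your credentials are set as environment variables (`npt_utils` is the alias used in the rest of this guide):

    ```python
    import neptune
    import neptune.integrations.optuna as npt_utils

    run = neptune.init_run()
    ```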
    If you haven't set up your credentials, you can log anonymously:
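    For example, with Neptune's anonymous token (the public example project name is an assumption):

    ```python
    run = neptune.init_run(
        api_token=neptune.ANONYMOUS_API_TOKEN,
        project="common/optuna-integration",
    )
    ```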
3. Initialize the Neptune callback:
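    For example, passing the run created in the previous step:

    ```python
    neptune_callback = npt_utils.NeptuneCallback(run)
    ```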
    By default, the callback logs all the plots from the `optuna.visualization` module, details of all trials, and the `Study` object itself. For how to customize the `NeptuneCallback` further, see More options.
4. Run the Optuna parameter sweep with the callback.
    Pass the callback to `study.optimize()`:

    ```python
    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=10, callbacks=[neptune_callback])
    ```

    Now, when you run your hyperparameter sweep, all the metadata will be logged to Neptune.
5. To stop the connection to Neptune and sync all data, call the `stop()` method:
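    For example:

    ```python
    run.stop()
    ```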
6. To watch the optimization live, view the run in the Neptune app. To open the run, click the Neptune link that appears in the console output.
    Sample output

    ```
    [neptune] [info ] Neptune initialized. Open in the app:
    https://app.neptune.ai/workspace/project/e/RUN-1
    ```
Analyzing results in Neptune#
In the Runs section, you can see all your logged runs as a table.
Click on a run to inspect the logged metadata.
Viewing the visualizations#
The visualizations are logged as HTML objects. You can view them in All metadata → visualizations.
To display the visualizations according to your liking, create a custom dashboard and add widgets for fields in the `visualizations` namespace.
Filtering the runs by study or trial#
In the experiments table, you can filter the runs by level (trial or study) with the help of the tags applied to the runs. This way you can compare trials individually or get the high-level picture of the sweep.
To filter the runs by tag:
1. In the search box above the table, start typing "tags" to find the `sys/tags` field.
2. Choose the condition for the tags. For example, to see study-level runs, set the query to Tags + one of + study.
3. Select Done.
You can also use these filters to find all trials that belong to the same study and select them for comparison.
Grouping runs by study#
To find your current study or compare studies between each other, use the group-by function:
1. Near the Experiments tab, switch to group mode.
2. Click the Group button to change the grouping.
3. Enter the name of the field to group the runs by. For example, to group by study, type and select the "study_name" field.
4. To see all trials of a group in a separate table view, click Show all.
More options#
Customizing which plots to log and how often#
By default, `NeptuneCallback` creates and logs all of the plots from the `optuna.visualization` module. This can add overhead to your Optuna sweep, as creating those visualizations takes time.
You can customize which plots you create and log and how often that happens with the following arguments:
- `plots_update_freq`:
    - Pass an integer `k` to update the plots every `k` trials.
    - Pass `"never"` to not log any plots.
- `log_plot_contour`, `log_plot_slice`, and other `log_{OPTUNA_PLOT_FUNCTION}` arguments: If you pass `False`, the plots will not be created or logged.
```python
objective = ...
run = ...

# Create a Neptune callback for Optuna
neptune_callback = npt_utils.NeptuneCallback(
    run,
    plots_update_freq=10,  # create/log plots every 10 trials
    log_plot_slice=False,  # do not create/log plot_slice
    log_plot_contour=False,  # do not create/log plot_contour
)

# Pass the callback to the optimize() method
study = optuna.create_study(direction="maximize")
study.optimize(
    objective,
    n_trials=50,
    callbacks=[neptune_callback],
)

# Stop logging to the run
run.stop()
```
Disabling logging of all trials#
If you want to disable logging of all trials, you can pass `log_all_trials=False` to either the `NeptuneCallback()` constructor or the `log_study_metadata()` function.
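A minimal sketch, assuming the `log_all_trials` flag described above:

```python
neptune_callback = npt_utils.NeptuneCallback(run, log_all_trials=False)
```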
Logging charts and study object after sweep#
To log study metadata after the Study is finished, use the `log_study_metadata()` function. This is generally faster than using the `NeptuneCallback`, as it doesn't log the data live. When called, it logs the same metadata as the callback and accepts the same flags for customization.
```python
objective = ...
run = ...

# Run the Optuna sweep (without the live callback)
study = optuna.create_study()
study.optimize(objective, n_trials=10)

# Log Optuna charts and study object after the sweep is complete
npt_utils.log_study_metadata(study, run)

# Stop logging
run.stop()
```
Loading the study from an existing Neptune run#
If you've logged an Optuna study to Neptune, you can load the study directly from the Neptune run with the `load_study_from_run()` function and continue working with it.
```python
# Initialize an existing Neptune run
run = neptune.init_run(
    with_id="NEP1-26233",  # The run ID goes here
)

# Load the Optuna study from the Neptune run
study = npt_utils.load_study_from_run(run)

# Continue logging to the same run
study.optimize(objective, n_trials=5)
```
How do I find the ID?
The Neptune ID is a unique identifier for the run. The Experiments tab displays it in the leftmost column.
In the run structure, the ID is stored in the system namespace (`sys`).
- If the run is active, you can obtain its ID with `run["sys/id"].fetch()`. For example:
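    A minimal sketch:

    ```python
    run = neptune.init_run()
    print(run["sys/id"].fetch())  # for example, "NEP1-26233"
    ```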
- If you set a custom run ID, it's stored in the `sys/custom_run_id` field:
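    A minimal sketch (the custom ID value is hypothetical):

    ```python
    run = neptune.init_run(custom_run_id="my-custom-id")
    print(run["sys/custom_run_id"].fetch())  # "my-custom-id"
    ```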
You can log and load an Optuna study both for `InMemoryStorage` and database storage.
Logging each trial as a separate Neptune run#
You can log trial-level metadata, such as learning curves or diagnostic charts, to a separate run for each trial.
To find and explore all the runs for the hyperparameter sweep later, you can create a trial-level run inside the objective function and a study-level run outside, then connect the two using the study name.
1. Create an objective function that logs each trial to Neptune as a run.

    Inside the objective function, you need to:

    - Create a trial-level Neptune run
    - Log the study name and a "trial" tag to distinguish between the study-level and trial-level runs
    - Log parameters and scores to the trial-level run
    - Stop the trial-level run
    ```python
    def objective_with_logging(trial):
        param = {
            "num_leaves": trial.suggest_int("num_leaves", 2, 256),
            "feature_fraction": trial.suggest_float(
                "feature_fraction", 0.2, 1.0, step=0.1
            ),
            "bagging_fraction": trial.suggest_float(
                "bagging_fraction", 0.2, 1.0, step=0.1
            ),
            "min_child_samples": trial.suggest_int("min_child_samples", 3, 100),
        }

        # Create a trial-level run
        run_trial_level = neptune.init_run(tags=["trial"])  # (1)!

        # Log study name and trial number to the trial-level run
        run_trial_level["study/study_name"] = study.study_name
        run_trial_level["trial/number"] = trial.number

        # Log parameters of the trial-level run
        run_trial_level["trial/parameters"] = param

        # Run training and calculate the score for this parameter configuration
        score = ...

        # Log the score of the trial-level run
        run_trial_level["trial/score"] = score

        # Stop the trial-level run
        run_trial_level.stop()

        return score
    ```
    1. If you haven't set up your credentials, you can log anonymously:
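        For example, with Neptune's anonymous token (the public example project name is an assumption):

        ```python
        run_trial_level = neptune.init_run(
            api_token=neptune.ANONYMOUS_API_TOKEN,
            project="common/optuna-integration",
            tags=["trial"],
        )
        ```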
    The sweep will take longer, as each trial-level run is stopped inside the objective function and needs to finish logging metadata to Neptune before the next trial starts.
2. Create an Optuna study:
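    For example (the optimization direction is an assumption):

    ```python
    study = optuna.create_study(direction="maximize")
    ```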
3. Create a study-level Neptune run:
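    A minimal sketch; the variable name `run_study_level` is illustrative, and the "study" tag is what distinguishes this run from the trial-level runs later:

    ```python
    run_study_level = neptune.init_run(tags=["study"])
    ```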
    If you haven't set up your credentials, you can log anonymously:
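    For example (the public example project name is an assumption):

    ```python
    run_study_level = neptune.init_run(
        api_token=neptune.ANONYMOUS_API_TOKEN,
        project="common/optuna-integration",
        tags=["study"],
    )
    ```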
4. Create a study-level Neptune callback:
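    For example, passing the study-level run:

    ```python
    neptune_callback = npt_utils.NeptuneCallback(run_study_level)
    ```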
5. Pass the callback to the `study.optimize()` method and run the parameter sweep:
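    A minimal sketch (the number of trials is an assumption):

    ```python
    study.optimize(
        objective_with_logging, n_trials=20, callbacks=[neptune_callback]
    )
    ```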
6. To stop the connection and synchronize the data, call the `stop()` method:
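    For example, for the study-level run (the trial-level runs are stopped inside the objective function):

    ```python
    run_study_level.stop()
    ```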
Navigate to the Neptune app to see your parameter sweep.
- All the trial-level runs are tagged as `trial`
- The study-level run is tagged as `study`
- All trials within a study have the same `study/study_name` field

To compare trials within each study, group the runs by `study/study_name`:
1. Near the Experiments tab, switch to group mode.
2. Click the Group button to change the grouping.
3. Type "study/study_name" and select the field.
4. To see all trials under a study in a new view, click Show all after expanding the group.
Logging distributed hyperparameter sweeps to a single run#
You can log metadata from a distributed Optuna study to a single Neptune run by making use of the `custom_run_id` parameter.
1. Create Optuna storage.

    On the command line or in a terminal app, such as Command Prompt:

    ```sh
    optuna create-study \
        --study-name "distributed-example" \
        --storage "mysql://root@localhost/example"
    ```

    For more information about distributed hyperparameter sweeps, see the Optuna documentation.
2. Create a Neptune run with a custom sweep ID.

    Create an ID for the sweep and pass it to `custom_run_id`:
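    A minimal sketch; generating the shared ID with `uuid` is an assumption:

    ```python
    import uuid

    # Every worker must use the same sweep ID
    sweep_id = str(uuid.uuid4())

    run = neptune.init_run(custom_run_id=sweep_id)
    ```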
    Note

    If your setup allows passing environment variables to worker nodes, you should:
    1. Pass the `NEPTUNE_CUSTOM_RUN_ID` environment variable to the computational node:
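        For example, on Linux or macOS (the ID value is hypothetical):

        ```sh
        export NEPTUNE_CUSTOM_RUN_ID="my-unique-sweep-id"
        ```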
Then create a Neptune run without specifying the
custom_run_id
(as it will be picked up from the environment):
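        ```python
        run = neptune.init_run()
        ```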
3. Create a Neptune callback and pass it to a loaded Optuna study:
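    A minimal sketch, loading the study created on the command line (the study name and storage URL match step 1):

    ```python
    neptune_callback = npt_utils.NeptuneCallback(run)

    study = optuna.load_study(
        study_name="distributed-example",
        storage="mysql://root@localhost/example",
    )
    ```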
4. Run the distributed study from multiple nodes or processes.

    Each process (Process 1, Process 2, and so on) runs the same optimization with the shared callback:
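    A minimal sketch (the number of trials per process is an assumption):

    ```python
    study.optimize(objective, n_trials=100, callbacks=[neptune_callback])
    ```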
5. View the distributed Optuna study in Neptune.

    Navigate to the Neptune app to see all trials from the distributed Optuna study, logged to a single Neptune run. The custom run ID (stored in the system namespace, `sys/custom_run_id`) is the sweep ID you chose.
Logging multiple study objectives#
To log one or more study objectives, you can pass a list of objective names to the `target_names` argument of `NeptuneCallback`:
```python
neptune_callback = npt_utils.NeptuneCallback(
    run,  # existing Neptune run
    target_names=["FLOPS", "accuracy"],
)
```
Then log the multi-objective study metadata to Neptune by passing the callback to the Optuna study:
```python
study = optuna.create_study(directions=["minimize", "maximize"])
study.optimize(objective, n_trials=5, callbacks=[neptune_callback])
```
Tip
You can also pass the `target_names` to the `log_study_metadata()` function. See Logging charts and study object after sweep.
Manually logging metadata#
If you have other types of metadata that are not covered in this guide, you can still log them using the Neptune client library.
When you initialize the run, you get a `run` object, to which you can assign different types of metadata in a structure of your own choosing.
```python
import neptune

# Create a new Neptune run
run = neptune.init_run()

# Log metrics inside loops
for epoch in range(n_epochs):
    # Your training loop
    run["train/epoch/loss"].append(loss)  # Each append() call appends a value
    run["train/epoch/accuracy"].append(acc)

# Track artifact versions and metadata
run["train/images"].track_files("./datasets/images")

# Upload entire files
run["test/preds"].upload("path/to/test_preds.csv")

# Log text or other metadata, in a structure of your choosing
run["tokenizer"] = "regexp_tokenize"
```