Optuna

What will you get with this integration?

Optuna is an open-source hyperparameter optimization framework to automate hyperparameter search.

With Neptune-Optuna integration, you can:

  • log and monitor the Optuna hyperparameter sweep live:

    • values and params for each Trial

    • best values and params for the Study

    • hardware consumption and console logs

    • interactive plots from the optuna.visualization module

    • parameter distributions for each Trial

    • the Study object itself (for 'InMemoryStorage') or the database location (for Studies with database storage)

  • load the Study directly from the existing Neptune Run

  • and more.

Installation

To install the Neptune-Optuna integration, go to your console and run:

pip install neptune-client[optuna]

The examples on this page were tested with optuna==2.8.0 and neptune-client[optuna]==0.9.12.

Quickstart

This quickstart will show you how to:

  • Connect Neptune to your Optuna hyperparameter tuning code and create the first Run

  • Use NeptuneCallback to log values, parameters, Optuna visualizations, and Study to the Neptune Run

  • Explore logged metadata in the Neptune UI.

Before you start

To follow this quickstart, you need to have the neptune-client[optuna] package installed (see Installation above).

You also need minimal familiarity with Optuna. Have a look at the Optuna tutorial to get started.

Step 1: Create a Neptune run

import neptune.new as neptune

run = neptune.init(api_token='<your_api_token>',
                   project='<your_project_name>')  # your credentials

Note: You can use api_token='ANONYMOUS' and project='common/optuna-integration' to explore without having to create a Neptune account.

Executing this snippet will give you a link like: https://app.neptune.ai/o/common/org/optuna-integration/e/NEP1-370 with common/optuna-integration replaced by your_workspace/your_project_name, and NEP1-370 replaced by your Run ID.

Click the link to open the Run in the Neptune UI. For now it is empty, but keep the tab with the Run open to see what happens next.

Step 2: Initialize the NeptuneCallback

import neptune.new.integrations.optuna as optuna_utils
neptune_callback = optuna_utils.NeptuneCallback(run)

By default, NeptuneCallback logs all the plots from the optuna.visualization module and the Study object itself after every trial. To see how to customize the NeptuneCallback, jump to Customize which plots you want to log and how often.

Step 3: Run Optuna parameter sweep with the NeptuneCallback

Pass the neptune_callback to study.optimize()

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100, callbacks=[neptune_callback])

Now, when you run your hyperparameter sweep, all the metadata will be logged to Neptune.
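The snippet above assumes that an objective function is already defined. If you do not have one yet, a minimal toy objective (purely illustrative, not part of the integration) could look like this:

import optuna

def objective(trial):
    # toy objective: a quadratic maximized at x = 2
    x = trial.suggest_float('x', -10, 10)
    return -(x - 2) ** 2

Replace it with your own training and evaluation code; the callback does not depend on what the objective does internally.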

See the Optuna Study in Neptune

Switch to the tab with Neptune Run opened to watch the optimization live!

See this example in Neptune.

More Options

Customize which plots you want to log and how often

By default, NeptuneCallback creates and logs all of the plots from the optuna.visualization module, which adds overhead to your Optuna sweep because creating those visualizations takes time.

You can customize which plots you create and log and how often that happens with the following arguments:

  • plots_update_freq: pass an integer k to update plots every k trials, or 'never' to not log any plots

  • log_plot_contour, log_plot_slice, and other log_{OPTUNA_PLOT_FUNCTION} arguments: pass False, and the corresponding plots will not be created or logged

objective = ...
run = ...

# Create a NeptuneCallback for Optuna
neptune_callback = optuna_utils.NeptuneCallback(
    run,
    plots_update_freq=10,    # create/log plots every 10 trials
    log_plot_slice=False,    # do not create/log plot_slice
    log_plot_contour=False,  # do not create/log plot_contour
)

# Pass the NeptuneCallback to the Optuna Study .optimize() method
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=50, callbacks=[neptune_callback])

# Stop logging to the Neptune Run
run.stop()

Log charts and the Study object after the sweep

Instead of logging metadata during the sweep, you can log everything from your Optuna Study after the sweep has finished with log_study_metadata().

The log_study_metadata() function logs the same metadata that NeptuneCallback logs, and you can customize it with similar flags.

objective = ...
run = ...

# Run the Optuna sweep (no callback needed during the sweep)
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10)

# Log Optuna charts and the Study object after the sweep is complete
optuna_utils.log_study_metadata(study, run, log_plot_contour=False)

# Stop logging
run.stop()

Load the Optuna Study from an existing Neptune Run

If you logged the Optuna Study to Neptune, you can load it directly from the Neptune Run with the load_study_from_run() function and continue working with it.

# Fetch an existing Neptune Run
run = neptune.init(api_token='<YOUR_API_TOKEN>',
                   project='<YOUR_WORKSPACE/YOUR_PROJECT>',  # you can pass your credentials here
                   run='NEP1-370')  # you can pass the ID of some other Run

# Load the Optuna Study from the Neptune Run
study = optuna_utils.load_study_from_run(run)

# Continue the sweep (pass a NeptuneCallback to keep logging, as shown below)
study.optimize(objective, n_trials=10)

You can log and load the Optuna Study both for InMemoryStorage and for database storage.
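Note that study.optimize() on its own sends metadata to Neptune only through the callback. If you want the continued trials to keep updating the same Run, a simple sketch is to create a NeptuneCallback for the loaded Run and pass it to optimize():

# Keep logging the continued sweep to the same Run
neptune_callback = optuna_utils.NeptuneCallback(run)
study.optimize(objective, n_trials=10, callbacks=[neptune_callback])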

Log each trial as a separate Neptune Run

In addition to logging study-level metadata (params, values, or slice plots) to the Neptune Run, you can log trial-level metadata (learning curves or diagnostic charts) for each trial to a separate trial-level Run.

To do that you need to:

  • create a study-level Run

  • create trial-level Runs inside the objective function

  • connect the trial-level Runs and the study-level Run with an ID so that you can find and explore all the Runs for the hyperparameter sweep later

Step 1: Create a unique sweep ID

import uuid
sweep_id = uuid.uuid1()
print('sweep-id: ', sweep_id)

Step 2: Create a study-level Neptune Run

run_study_level = neptune.init(api_token='ANONYMOUS',
                               project='common/optuna-integration')  # you can pass your credentials here

Step 3: Log the sweep ID to the study-level Run

run_study_level['sweep-id'] = sweep_id

Add a 'study-level' tag to distinguish between the study-level and trial-level Runs for the sweep.

run_study_level['sys/tags'].add('study-level')

Step 4: Create an objective function that logs each trial to Neptune as a Run

Inside the objective function, you need to:

  • create a trial-level Neptune Run

  • log the sweep ID and a 'trial-level' tag to distinguish between study-level and trial-level Runs

  • log parameters and scores to the trial-level Run

  • stop the trial-level Run

def objective_with_logging(trial):
    param = {
        'num_leaves': trial.suggest_int('num_leaves', 2, 256),
        'feature_fraction': trial.suggest_uniform('feature_fraction', 0.2, 1.0),
        'bagging_fraction': trial.suggest_uniform('bagging_fraction', 0.2, 1.0),
        'min_child_samples': trial.suggest_int('min_child_samples', 3, 100),
    }

    # create a trial-level Run
    run_trial_level = neptune.init(api_token='ANONYMOUS',
                                   project='common/optuna-integration')

    # log the sweep ID and a 'trial-level' tag to the trial-level Run
    run_trial_level['sys/tags'].add('trial-level')
    run_trial_level['sweep-id'] = sweep_id

    # log the parameters of the trial-level Run
    run_trial_level['parameters'] = param

    # run training and calculate the score for this parameter configuration
    score = ...

    # log the score to the trial-level Run
    run_trial_level['score'] = score

    # stop the trial-level Run
    run_trial_level.stop()

    return score

The sweep will take longer because each trial-level Run is stopped inside the objective function and needs to finish logging metadata to Neptune before the next trial starts.

Step 5: Create a study-level NeptuneCallback

neptune_callback = optuna_utils.NeptuneCallback(run_study_level)

Step 6: Pass the NeptuneCallback to the study.optimize() method and run the parameter sweep

study = optuna.create_study(direction='maximize')
study.optimize(objective_with_logging, n_trials=20, callbacks=[neptune_callback])

Step 7: Stop logging to the Neptune Run

run_study_level.stop()

Go to the Neptune UI to see your parameter sweep

Now when you go to the Neptune UI, you have:

  • all the Runs for the sweep with the same value of the 'sweep-id' field

  • all the trial-level Runs logged with 'sys/tags'='trial-level'

  • the study-level Run logged with 'sys/tags'='study-level'

To compare sweeps with each other or find your current sweep, use Group by (or see the programmatic sketch after this list):

  • Go to the Runs Table

  • Click + Group by in the top right

  • Type 'sweep-id' and click on it

  • Click Show all to see your trials in a separate Table View
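If you prefer to fetch the sweep programmatically rather than in the UI, a possible sketch, assuming your neptune-client version exposes get_project() and fetch_runs_table() (available in neptune.new 0.9.x), is:

import neptune.new as neptune

# Fetch a table of all trial-level Runs in the project
project = neptune.get_project(name='common/optuna-integration', api_token='ANONYMOUS')
runs_df = project.fetch_runs_table(tag='trial-level').to_pandas()

# Keep only the Runs that belong to the current sweep
sweep_runs = runs_df[runs_df['sweep-id'] == str(sweep_id)]
print(sweep_runs[['sys/id', 'score']])

Here sweep_id is the value created in Step 1, and the 'score' column is present because the objective logged run_trial_level['score'].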

Logging distributed hyperparameter sweeps to Neptune

You can log metadata from a distributed Optuna study to a single Neptune Run.

To do that, follow these steps:

Step 1: Create Optuna storage

optuna create-study \
--study-name "distributed-example" \
--storage "mysql://[email protected]/example"

For more information, read about distributed hyperparameter sweeps in the Optuna documentation.

Step 2: Create a Neptune Run with a custom sweep ID

Create a sweep ID and pass it to custom_run_id:

run = neptune.init(api_token='<YOUR_API_TOKEN>',
                   project='<YOUR_WORKSPACE/YOUR_PROJECT>',  # credentials
                   custom_run_id='<YOUR-SWEEP-ID>')  # pass the ID of your sweep

If your setup allows passing environment variables to worker nodes, you should:

  • pass the NEPTUNE_CUSTOM_RUN_ID environment variable to the computational node

export NEPTUNE_CUSTOM_RUN_ID='<YOUR-SWEEP-ID>'

  • create a Neptune Run without specifying the custom_run_id

run = neptune.init(api_token='<YOUR_API_TOKEN>',
                   project='<YOUR_WORKSPACE/YOUR_PROJECT>')  # credentials

Step 3: Create a Neptune Callback and pass it to a loaded Optuna Study

objective = ...
run = ...

neptune_callback = optuna_utils.NeptuneCallback(run)

if __name__ == "__main__":
    study = optuna.load_study(
        study_name="distributed-example",
        storage="mysql://[email protected]/example"
    )
    study.optimize(objective, n_trials=100, callbacks=[neptune_callback])

Step 4: Run the distributed study from multiple nodes or processes

Run the same Optuna script from multiple processes:

Process 1

python run_sweep_with_neptune.py

Process 2

python run_sweep_with_neptune.py

Step 5: See distributed Optuna study in Neptune

Now you can go to the Neptune UI and see all the trials from the distributed Optuna study logged to a single Neptune Run, identified by the custom sweep ID you chose.

Having problems?

Please visit the Getting help page. Everything regarding support is there.

What’s next