
Working with Optuna#


Custom dashboard displaying metadata logged with Optuna

Optuna is an open-source hyperparameter optimization framework that automates hyperparameter search.

With the Neptune–Optuna integration, you can:

  • Log and monitor the Optuna hyperparameter sweep live:
    • Values and params for each trial
    • Best values and params for the study
    • Hardware consumption and console logs
    • Interactive plots from the optuna.visualization module
    • Parameter distributions for each trial
    • The Study object itself (for InMemoryStorage) or the database location (for studies with database storage)
  • Load the study directly from an existing Neptune run


Before you start#

Tip

If you'd rather follow the guide without any setup, you can run the example in Colab.

Installing the Neptune–Optuna integration#

On the command line or in a terminal app, such as Command Prompt, enter one of the following:

pip install neptune-client[optuna]
conda install -c conda-forge neptune-optuna

Optuna logging example#

This example shows how to use NeptuneCallback to log Optuna visualizations, Study objects, and other metadata.

For how to customize the NeptuneCallback, see the More options section.

  1. Import Neptune and create a run:

    import neptune.new as neptune
    
    run = neptune.init_run()  # (1)
    
    1. If you haven't set up your credentials, you can log anonymously: neptune.init_run(api_token=neptune.ANONYMOUS_API_TOKEN, project="common/optuna-integration")

    To open the run, click the Neptune link that appears in the console output.

    You'll see the metadata appear as it gets logged.

    Example link: https://app.neptune.ai/o/common/org/optuna-integration/e/NEP1-370

  2. Initialize the Neptune callback:

    import neptune.new.integrations.optuna as optuna_utils
    
    neptune_callback = optuna_utils.NeptuneCallback(run)
    

    By default, the callback logs all the plots from the optuna.visualization module and the Study object itself after every trial. See the More options section for how to customize the NeptuneCallback.

  3. Run the Optuna parameter sweep with the callback.

    Pass the callback to study.optimize():

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=100, callbacks=[neptune_callback])
    

    Now, when you run your hyperparameter sweep, all the metadata will be logged to Neptune.

  4. To watch the optimization live, view the run in the Neptune app.

More options#

Customizing which plots to log and how often#

By default, NeptuneCallback creates and logs all of the plots from the optuna.visualization module. This can add overhead to your Optuna sweep, as creating those visualizations takes time.

You can customize which plots are created and logged, and how often, with the following arguments:

  • plots_update_freq:
    • Pass an integer k to create and log the plots every k trials.
    • Pass "never" to disable plot logging.
  • log_plot_contour, log_plot_slice, and other log_{OPTUNA_PLOT_FUNCTION} arguments: If you pass False, the corresponding plot is not created or logged.

objective = ...
run = ...

# Create a Neptune callback for Optuna
neptune_callback = optuna_utils.NeptuneCallback(
    run,
    plots_update_freq=10,  # create/log plots every 10 trials
    log_plot_slice=False,  # do not create/log plot_slice
    log_plot_contour=False,  # do not create/log plot_contour
)

# Pass the callback to the optimize() method
study = optuna.create_study(direction="maximize")
study.optimize(
    objective,
    n_trials=50,
    callbacks=[neptune_callback],
)

# Stop logging to the run
run.stop()

Logging charts and study object after the sweep#

After your sweep has finished, you can log all metadata from your Optuna study with log_study_metadata().

The log_study_metadata() function logs the same metadata that NeptuneCallback logs, and you can customize it with similar flags.

objective = ...
run = ...

# Run Optuna with Neptune callback
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=10)

# Log Optuna charts and study object after the sweep is complete
optuna_utils.log_study_metadata(
    study,
    run,  
    log_plot_contour=False,
    target_names=["FLOPS", "accuracy"],  # (optional) one or more study objectives
)

# Stop logging
run.stop()

Loading the study from an existing Neptune run#

If you've logged an Optuna study to Neptune, you can load the study directly from the run with the load_study_from_run() function and continue working with it.

# Initialize an existing Neptune run
run = neptune.init_run(  # (1)
    with_id="NEP1-370",  # The run ID goes here
)

# Load Optuna study from the Neptune Run
study = optuna_utils.load_study_from_run(run)

# Continue logging to the same run
study.optimize(objective, n_trials=10)
  1. If you haven't set up your credentials, you can log anonymously: neptune.init_run(api_token=neptune.ANONYMOUS_API_TOKEN, project="common/optuna-integration")
How do I find the ID?

The Neptune ID is a unique identifier for the object. It's displayed in the Details view and in the leftmost column of the table views.

The ID is stored in the system namespace. You can obtain it with object["sys/id"].fetch(). For example:

>>> run = neptune.init_run(project="ml-team/classification")
>>> run["sys/id"].fetch()
'CLS-26'

For more help, see Getting the ID.

You can log and load an Optuna study both for InMemoryStorage and database storage.

Logging each trial as a separate Neptune run#

You can log trial-level metadata, such as learning curves or diagnostic charts, to a separate run for each trial.

To find and explore all the runs of a hyperparameter sweep later, create a study-level run as well as trial-level runs inside the objective function, then connect them with a custom sweep ID.

  1. Create a unique sweep ID:

    import uuid
    sweep_id = uuid.uuid1()
    print("sweep-id: ", sweep_id)
    
  2. Create a study-level Neptune run:

    run_study_level = neptune.init_run()  # (1)
    
    1. If you haven't set up your credentials, you can log anonymously: neptune.init_run(api_token=neptune.ANONYMOUS_API_TOKEN, project="common/optuna-integration")
  3. Log the sweep ID to the study-level run:

    run_study_level["sweep-id"] = sweep_id
    

    Add a "study-level" tag to distinguish between the study-level and trial-level runs for the sweep.

    run_study_level["sys/tags"].add("study-level")
    
  4. Create an objective function that logs each trial to Neptune as a run.

    Inside of the objective function, you need to:

    • Create a trial-level Neptune run
    • Log the sweep ID and a "trial-level" tag to distinguish between the study-level and trial-level runs
    • Log parameters and scores to the trial-level run
    • Stop the trial-level run
    def objective_with_logging(trial):
    
        param = {
            "num_leaves": trial.suggest_int("num_leaves", 2, 256),
            "feature_fraction": trial.suggest_uniform("feature_fraction", 0.2, 1.0),
            "bagging_fraction": trial.suggest_uniform("bagging_fraction", 0.2, 1.0),
            "min_child_samples": trial.suggest_int("min_child_samples", 3, 100),
        }
    
        # Create a trial-level run
        run_trial_level = neptune.init_run()  # (1)
    
        # Log sweep ID to trial-level run
        run_trial_level["sys/tags"].add("trial-level")
        run_trial_level["sweep-id"] = sweep_id
    
        # Log parameters of a trial-level run
        run_trial_level["parameters"] = param
    
        # Run training and calculate the score for this parameter configuration
        score = ...
    
        # Log score of a trial-level Run
        run_trial_level["score"] = score
    
        # Stop trial-level Run
        run_trial_level.stop()
    
        return score
    
    1. If you haven't set up your credentials, you can log anonymously: neptune.init_run(api_token=neptune.ANONYMOUS_API_TOKEN, project="common/optuna-integration")

    ⚠ The sweep will take longer, as each trial-level run is stopped inside of the objective function and needs to finish logging metadata to Neptune before the next trial starts.

  5. Create a study-level Neptune callback:

    neptune_callback = optuna_utils.NeptuneCallback(run_study_level)
    
  6. Pass the callback to the study.optimize() method and run the parameter sweep:

    study = optuna.create_study(direction="maximize")
    study.optimize(
        objective_with_logging,
        n_trials=20,
        callbacks=[neptune_callback]
    )
    
  7. To stop the connection and synchronize the data, call the stop() method:

    run_study_level.stop()
    

Navigate to the Neptune app to see your parameter sweep.

  • All sweep runs have the same value in the "sweep-id" field
  • All the trial-level runs are tagged with trial-level
  • The study-level run is tagged with study-level

To compare sweeps with each other, or to find your current sweep, group the runs by sweep ID:

  1. Above the runs table, click Group by.
  2. Type "sweep-id" and select the field.
  3. To open a certain group in a new view, click Show all.
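If you'd rather do this grouping programmatically, the same operation works on a runs table exported to pandas (the DataFrame below is illustrative dummy data, with column names mirroring the fields used in this guide):

```python
import pandas as pd

# Illustrative runs table, e.g. exported from the runs table as CSV
runs = pd.DataFrame(
    {
        "sys/id": ["NEP1-1", "NEP1-2", "NEP1-3", "NEP1-4"],
        "sweep-id": ["sweep-a", "sweep-a", "sweep-b", "sweep-b"],
        "score": [0.71, 0.84, 0.66, 0.90],
    }
)

# Group the runs by sweep ID, as Group by does in the app
best_per_sweep = runs.groupby("sweep-id")["score"].max()
```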

Logging distributed hyperparameter sweeps to a single run#

You can log metadata from a distributed Optuna study to a single Neptune run by making use of the custom_run_id parameter.

  1. Create Optuna storage.

    On the command line or in a terminal app, such as Command Prompt:

    optuna create-study \
        --study-name "distributed-example" \
        --storage "mysql://root@localhost/example"
    

    For more information about distributed hyperparameter sweeps, see the Optuna documentation.

  2. Create a Neptune run with a custom sweep ID.

    Create a sweep ID and pass it to the custom_run_id argument:

    run = neptune.init_run(
        custom_run_id="your sweep ID")  # Pass an ID of your sweep
    

    Note

    If your setup allows passing environment variables to worker nodes, you should:

    1. Pass the NEPTUNE_CUSTOM_RUN_ID environment variable to the computational node:

      export NEPTUNE_CUSTOM_RUN_ID='your sweep ID'
      
    2. Then create a Neptune run without specifying the custom_run_id, as it will be picked up from the environment:

      run = neptune.init_run()
    
  3. Create a Neptune callback and pass it to a loaded Optuna study:

    objective = ...
    run = ...
    
    neptune_callback = optuna_utils.NeptuneCallback(run)
    
    if __name__ == "__main__":
        study = optuna.load_study(
            study_name="distributed-example",
            storage="mysql://root@localhost/example",
        )
        study.optimize(objective, n_trials=100, callbacks=[neptune_callback])
    
  4. Run the distributed study from multiple nodes or processes:

    Process 1

    python run_sweep_with_neptune.py
    

    Process 2

    python run_sweep_with_neptune.py
    
  5. View the distributed Optuna study in Neptune.

Navigate to the Neptune app to see all trials from the distributed Optuna study, logged to a single Neptune run. The custom run ID (stored in the system namespace, sys/custom_run_id) is the sweep ID you chose.

Logging multiple study objectives#

To log one or more study objectives, you can pass a list of objective names to the target_names argument of NeptuneCallback:

Multi-objective
neptune_callback = optuna_utils.NeptuneCallback(
    run,  # existing Neptune run
    target_names=["FLOPS", "accuracy"],
)

Then log the multi-objective study metadata to Neptune by passing the callback to the Optuna study:

study = optuna.create_study(directions=["minimize", "maximize"])  # one direction per objective
study.optimize(objective, n_trials=5, callbacks=[neptune_callback])

Tip

You can also pass the target_names to the log_study_metadata() function. See Logging charts and study object after sweep.

Manually logging metadata#

If you have other types of metadata that are not covered in this guide, you can still log them using the Neptune client library (neptune-client).

When you initialize the run, you get a run object, to which you can assign different types of metadata in a structure of your own choosing.

import neptune.new as neptune

# Create a new Neptune run
run = neptune.init_run()

# Log metrics or other values inside loops
for epoch in range(n_epochs):
    ...  # Your training loop

    run["train/epoch/loss"].log(loss)  # Each log() appends a value
    run["train/epoch/accuracy"].log(acc)

# Upload files
run["test/preds"].upload("path/to/test_preds.csv")

# Track and version artifacts
run["train/images"].track_files("./datasets/images")

# Record numbers or text
run["tokenizer"] = "regexp_tokenize"