
API reference: XGBoost integration

You can use the Neptune integration with XGBoost to capture model training metadata with NeptuneCallback.



Neptune callback for logging metadata during XGBoost model training.


This callback requires xgboost>=1.3.0.
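
One way to check this requirement before training is to compare the installed version against the minimum. This is a minimal sketch: `meets_minimum` and `installed_xgboost_ok` are hypothetical helper names, not part of XGBoost or Neptune, and the comparison looks only at the leading numeric components, so pre-release suffixes such as `rc1` are ignored.

```python
import re
from importlib import metadata

MINIMUM = (1, 3, 0)  # minimum XGBoost version stated above


def meets_minimum(version, minimum=MINIMUM):
    """Return True if the leading numeric part of `version` is >= `minimum`."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    # Missing components (for example "1.3") are treated as zero.
    parsed = tuple(int(part or 0) for part in match.groups())
    return parsed >= minimum


def installed_xgboost_ok():
    """Check the installed xgboost package; returns False if it is missing."""
    try:
        return meets_minimum(metadata.version("xgboost"))
    except metadata.PackageNotFoundError:
        return False
```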

The callback logs the following:

  • Metrics
  • All parameters
  • Learning rate
  • The pickled model
  • Visualizations (feature importances and trees)
  • best_score and best_iteration, if early stopping is activated

The callback works with XGBoost's training functions, such as xgboost.train(), and with its scikit-learn API.

Metrics are logged for every dataset in the evals list and for every metric specified.

Example: With evals = [(dtrain, "train"), (dval, "valid")] and "eval_metric": ["mae", "rmse"], four metrics are created:

  1. "train/mae"
  2. "train/rmse"
  3. "valid/mae"
  4. "valid/rmse"
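
The naming rule in this example can be sketched in plain Python: one name per dataset and metric pair. `expected_metric_names` is a hypothetical helper written for illustration, not something the integration provides, and the names it returns are the relative ones listed above.

```python
from itertools import product


def expected_metric_names(evals, eval_metric):
    """List "<dataset name>/<metric>" for every dataset and metric pair."""
    # eval_metric may be a single string or a list of strings.
    metrics = [eval_metric] if isinstance(eval_metric, str) else list(eval_metric)
    names = [name for _, name in evals]
    return [f"{name}/{metric}" for name, metric in product(names, metrics)]


# Placeholders stand in for the xgboost.DMatrix objects from the example.
dtrain, dval = object(), object()
evals = [(dtrain, "train"), (dval, "valid")]

print(expected_metric_names(evals, ["mae", "rmse"]))
# ['train/mae', 'train/rmse', 'valid/mae', 'valid/rmse']
```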


| Name | Type | Default | Description |
| --- | --- | --- | --- |
| run | Run or Handler | - | An existing run reference, as returned by neptune.init_run(), or a namespace handler. |
| base_namespace | str, optional | "training" | Namespace under which all metadata logged by the Neptune callback will be stored. |
| log_model | bool | True | Whether to log the model as a pickled file at the end of training. |
| log_importance | bool | True | Whether to log feature importance charts at the end of training. |
| max_num_features | int | 10 | Maximum number of top features to show on the importance charts. Applies when log_importance is set to True. If None or <1, all features are displayed. For details, see xgboost.plot_importance(). |
| log_tree | list of int | None | Indexes of the target trees to log as charts. Requires the Graphviz library to be installed. For details, see xgboost.to_graphviz(). |
| tree_figsize | int | 30 | Controls the size of the visualized tree image. Increase this if you work with large trees. Applies when log_tree is not None. |
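
The parameters above can be collected in one place and passed to the callback together. This is a sketch under stated assumptions: `callback_options` and `make_callback` are hypothetical names, the values are illustrative (most simply repeat the documented defaults), and `make_callback` needs the Neptune XGBoost integration installed when it is called.

```python
# Keyword arguments for NeptuneCallback, using the parameter names documented above.
callback_options = {
    "base_namespace": "training",
    "log_model": True,
    "log_importance": True,
    "max_num_features": 10,   # used only when log_importance is True
    "log_tree": [0, 1],       # tree indexes to chart; requires Graphviz
    "tree_figsize": 30,       # used only when log_tree is not None
}


def make_callback(run):
    """Build the callback for an existing run or namespace handler."""
    # Imported here so the options can be inspected without Neptune installed.
    from neptune.integrations.xgboost import NeptuneCallback

    return NeptuneCallback(run=run, **callback_options)
```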


Create a Neptune run:

import neptune

run = neptune.init_run()

If Neptune can't find your project name or API token

As a best practice, you should save your Neptune API token and project name as environment variables:

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"
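
To confirm from Python that both variables are visible to the process before initializing a run, a small check like the following may help. `missing_neptune_settings` is a hypothetical helper, not part of the Neptune client. It only tests that the variables are non-empty, not that the values are valid.

```python
import os

REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_settings(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_neptune_settings()
if missing:
    print("Not set:", ", ".join(missing))
```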

You can, however, also pass them as arguments when initializing Neptune:

run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8",  # your token here
    project="ml-team/classification",  # your full project name here
)

To find these values:

  • API token: In the bottom-left corner, expand the user menu and select Get my API token.
  • Project name: In the top-right menu, select Edit project details.

If you haven't registered, you can also log anonymously to a public project (make sure not to publish sensitive data through your code!):

run = neptune.init_run(

Create a Neptune callback and pass it to xgb.train():

import xgboost as xgb
from neptune.integrations.xgboost import NeptuneCallback

neptune_callback = NeptuneCallback(run=run)

xgb.train(..., callbacks=[neptune_callback])
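
For context, here is one way the pieces might fit together end to end. It is a sketch, not the documented example: the synthetic data, the parameter values, and the function name `train_with_neptune` are assumptions made for illustration, and running it requires xgboost, numpy, and the Neptune integration, with your Neptune credentials already configured.

```python
# Illustrative training parameters (assumed values, not taken from this page).
params = {
    "objective": "reg:squarederror",
    "eval_metric": ["mae", "rmse"],
    "learning_rate": 0.1,
    "max_depth": 3,
}


def train_with_neptune(num_boost_round=20):
    """Train on synthetic data and log the training metadata to a new run."""
    # Imports are local so the settings above can be inspected without these packages.
    import neptune
    import numpy as np
    import xgboost as xgb
    from neptune.integrations.xgboost import NeptuneCallback

    # Synthetic regression data: the target depends mostly on the first feature.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=200)
    dtrain = xgb.DMatrix(X[:150], label=y[:150])
    dval = xgb.DMatrix(X[150:], label=y[150:])

    run = neptune.init_run()
    neptune_callback = NeptuneCallback(run=run)
    booster = xgb.train(
        params,
        dtrain,
        num_boost_round=num_boost_round,
        evals=[(dtrain, "train"), (dval, "valid")],
        callbacks=[neptune_callback],
    )
    run.stop()  # close the run once training has finished
    return booster
```

Keeping the imports inside the function is only a convenience for this sketch. In a real script they would normally sit at the top of the module.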

When creating the callback, you can specify what you want to log and where:

neptune_callback = NeptuneCallback(
    log_tree=[0, 1, 2, 3],