API reference: LightGBM integration

You can use NeptuneCallback to capture model training metadata, and create_booster_summary() to log a model summary after training.


For an in-depth tutorial, see the LightGBM integration guide.


NeptuneCallback: Neptune callback for logging metadata during LightGBM model training.

The callback logs parameters, evaluation results, and info about the train_set:

  • feature names
  • number of data points (num_rows)
  • number of features (num_features)

Evaluation results are logged separately for each dataset in valid_sets. For example, with "metric": "logloss" and valid_names=["train","valid"], two logs are created: train/logloss and valid/logloss.
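
To make the naming scheme concrete, here is a minimal, dependency-free sketch of how the log names in the example above come about. The expected_log_names helper is hypothetical and only for illustration. It is not part of Neptune or LightGBM, and the real callback may nest these names further under its base namespace.

```python
from itertools import product


def expected_log_names(valid_names, metrics):
    """Return one "<valid_name>/<metric>" log name per dataset and metric pair.

    Hypothetical helper for illustration only; not part of the Neptune API.
    """
    return [f"{name}/{metric}" for name, metric in product(valid_names, metrics)]


print(expected_log_names(["train", "valid"], ["logloss"]))
# ['train/logloss', 'valid/logloss']
```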

The callback works with the lgbm.train() and lgbm.cv() functions, and with the scikit-learn API.


Name            Type                      Default     Description
run             Run or Handler, optional  None        Existing run reference, as returned by neptune.init_run(), or a namespace handler.
base_namespace  str, optional             experiment  Namespace under which all metadata logged by the Neptune callback will be stored.


Create a Neptune run:

import neptune

run = neptune.init_run()

Instantiate the callback and pass it to the training function:

import lightgbm as lgb
from neptune.integrations.lightgbm import NeptuneCallback

neptune_callback = NeptuneCallback(run=run)
gbm = lgb.train(params, ..., callbacks=[neptune_callback])

If Neptune can't find your project name or API token

As a best practice, you should save your Neptune API token and project name as environment variables:

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"
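
Before starting a run, it can help to check that both variables are actually visible to the Python process. The following is a minimal sketch using only the standard library. The missing_neptune_settings helper is a hypothetical name, not a Neptune function.

```python
import os

REQUIRED_VARIABLES = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_settings(environ=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration only; not part of the Neptune API.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


missing = missing_neptune_settings()
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("Both Neptune environment variables are set.")
```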

You can, however, also pass them as arguments when initializing Neptune:

run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8",  # your token here
    project="ml-team/classification",  # your full project name here
)

To find these values in the Neptune app:

  • API token: In the bottom-left corner, expand the user menu and select Get my API token.
  • Project name: In the top-right menu, select Edit project details.

If you haven't registered, you can also log anonymously to a public project (make sure not to publish sensitive data through your code!):

run = neptune.init_run(api_token=neptune.ANONYMOUS_API_TOKEN, project="common/quickstarts")


create_booster_summary(): Create a model summary after training that can be assigned to the run namespace.


To have all the information in a single run, you can log the summary to the same run that you used for logging model training.


Name                    Type                                    Default  Description
booster                 lightgbm.Booster or lightgbm.LGBMModel  -        The trained LightGBM model.
log_importances         bool                                    True     Whether to log feature importance charts.
max_num_features        int                                     10       Max number of top features to log on the importance charts. Works when log_importances is set to True. If None or <1, all features will be displayed. See lightgbm.plot_importance for details.
list_trees              list of int                             None     Indices of the target trees to visualize. Works when log_trees is set to True.
log_trees_as_dataframe  bool                                    False    Whether to parse the model and log trees in CSV format. Works only for Booster objects. See lightgbm.Booster.trees_to_dataframe for details.
log_pickled_booster     bool                                    True     Whether to log the model as a pickled file.
log_trees               bool                                    False    Whether to log visualized trees. This requires the Graphviz library to be installed.
tree_figsize            int                                     30       Controls the size of the visualized tree image. Increase this if you work with large trees. Works when log_trees is set to True.
log_confusion_matrix    bool                                    False    Whether to log a confusion matrix. If set to True, you need to pass y_true and y_pred.
y_true                  numpy.array                             None     True labels on the test set. Needed only if log_confusion_matrix is set to True.
y_pred                  numpy.array                             None     Predictions on the test set. Needed only if log_confusion_matrix is set to True.
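
Several options in the table only take effect together. As a rough sketch of those dependencies, the hypothetical check_summary_options helper below (not part of the Neptune API) reports combinations that the table describes as incomplete. It only restates the rules above. How the library itself reacts to such combinations is not covered here.

```python
def check_summary_options(
    log_trees=False,
    list_trees=None,
    log_confusion_matrix=False,
    y_true=None,
    y_pred=None,
):
    """Return a list of human-readable problems with the given options.

    Hypothetical helper for illustration only; defaults mirror the table above.
    """
    problems = []
    if list_trees is not None and not log_trees:
        problems.append("list_trees only works when log_trees=True")
    if log_confusion_matrix and (y_true is None or y_pred is None):
        problems.append("log_confusion_matrix=True needs both y_true and y_pred")
    return problems


print(check_summary_options(list_trees=[0, 1]))
# ['list_trees only works when log_trees=True']
print(check_summary_options(log_trees=True, list_trees=[0, 1]))
# []
```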


Returns a dict with all metadata, which you can assign to the Neptune run:

run["booster_summary"] = create_booster_summary(...)


Initialize a Neptune run:

import neptune

run = neptune.init_run(project="workspace-name/project-name")  # full project name, for example "ml-team/classification"

To copy the project name, open the project settings menu in the top-right corner and select Edit project details.

Train a LightGBM model and log the booster summary to Neptune:

import lightgbm as lgb
from neptune.integrations.lightgbm import create_booster_summary

gbm = lgb.train(params, ...)
run["lgbm_summary"] = create_booster_summary(booster=gbm)

You can customize what to log:

run["lgbm_summary"] = create_booster_summary(
    booster=gbm, log_trees=True, list_trees=[0, 1, 2, 3, 4]
)

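To log a confusion matrix, pass both the ground-truth labels (y_test here) and the predicted labels:

import numpy as np

y_pred = np.argmax(gbm.predict(X_test), axis=1)
run["lgbm_summary"] = create_booster_summary(booster=gbm, log_confusion_matrix=True, y_true=y_test, y_pred=y_pred)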
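
Note that np.argmax(..., axis=1) assumes gbm.predict() returns one row of class probabilities per sample, as a multiclass objective does. For a binary objective, predictions are typically a one-dimensional array of positive-class probabilities, so thresholding is the usual alternative. The sketch below covers both shapes with plain NumPy. The probabilities_to_labels helper and the 0.5 threshold are illustrative assumptions, not part of Neptune or LightGBM.

```python
import numpy as np


def probabilities_to_labels(probabilities, threshold=0.5):
    """Convert predicted probabilities to integer class labels.

    Hypothetical helper for illustration only.
    2-D input (samples x classes): pick the most probable class per row.
    1-D input (positive-class probabilities): apply the threshold.
    """
    probabilities = np.asarray(probabilities)
    if probabilities.ndim == 2:
        return np.argmax(probabilities, axis=1)
    return (probabilities >= threshold).astype(int)


multiclass = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]
binary = [0.2, 0.9, 0.5]
print(probabilities_to_labels(multiclass))  # [1 0]
print(probabilities_to_labels(binary))  # [0 1 1]
```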