API reference: LightGBM integration

You can use NeptuneCallback to capture model training metadata and create_booster_summary() to log a model summary after training.


NeptuneCallback

Neptune callback for logging metadata during LightGBM model training.

The callback logs parameters, evaluation results, and info about the train_set:

  • feature names
  • number of data points (num_rows)
  • number of features (num_features)

Evaluation results are logged separately for each dataset in valid_sets. For example, with "metric": "logloss" and valid_names=["train","valid"], two logs are created: train/logloss and valid/logloss.
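
To illustrate the naming scheme described above, here is a minimal pure-Python sketch that builds the expected log names from the validation set names and metric names. The helper expected_eval_paths is hypothetical and not part of the Neptune or LightGBM API; it only mirrors the "<valid_name>/<metric>" pattern from the example.

```python
def expected_eval_paths(valid_names, metrics):
    """Return one '<valid_name>/<metric>' name per dataset-metric pair."""
    return [f"{name}/{metric}" for name in valid_names for metric in metrics]


# Same setup as in the text: one metric, two named validation sets.
paths = expected_eval_paths(["train", "valid"], ["logloss"])
print(paths)
```

With more than one metric, the same pattern would presumably yield one log per dataset and metric combination.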

The callback works with the lgbm.train() and lgbm.cv() functions, and with the scikit-learn API.


Name | Type | Default | Description
run | Run or Handler, optional | None | Existing run reference, as returned by neptune.init_run(), or a namespace handler.
base_namespace | str, optional | "experiment" | Namespace under which all metadata logged by the Neptune callback will be stored.


Create a Neptune run:

import neptune

run = neptune.init_run()

Instantiate the callback and pass it to the training function:

import lightgbm as lgb
from neptune.integrations.lightgbm import NeptuneCallback

neptune_callback = NeptuneCallback(run=run)
gbm = lgb.train(params, ..., callbacks=[neptune_callback])

If Neptune can't find your project name or API token

As a best practice, you should save your Neptune API token and project name as environment variables:

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"
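
Before creating a run, a script can check that both variables are actually set. The following is a minimal sketch that uses only the Python standard library; missing_neptune_env is a hypothetical helper, not part of Neptune.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_env(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_neptune_env()
if missing:
    print("Set these environment variables first:", ", ".join(missing))
else:
    print("Neptune credentials found in the environment.")
```

A check like this can make a misconfigured environment easier to spot before any training starts.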

Alternatively, you can pass the information when using a function that takes api_token and project as arguments:

run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8",  # your token here
    project="ml-team/classification",  # your full project name here
)

  • Passing api_token and project also works for init_model(), init_model_version(), init_project(), and integrations that create Neptune runs under the hood, such as NeptuneLogger or NeptuneCallback.
  • API token: In the bottom-left corner of the Neptune app, expand the user menu and select Get my API token.
  • Project name: You can copy the path from the project details (Edit project details).

If you haven't registered, you can log anonymously to a public project by passing neptune.ANONYMOUS_API_TOKEN as the API token.


Make sure not to publish sensitive data through your code!


create_booster_summary()

Creates a model summary after training. You can assign the summary to a namespace of a run.


To have all the information in a single run, you can log the summary to the same run that you used for logging model training.


Name | Type | Default | Description
booster | lightgbm.Booster or lightgbm.LGBMModel | - | The trained LightGBM model.
log_importances | bool | True | Whether to log feature importance charts.
max_num_features | int | 10 | Max number of top features to log on the importance charts. Works when log_importances is set to True. If None or <1, all features will be displayed. See lightgbm.plot_importance for details.
list_trees | list of int | None | Indices of the target trees to visualize. Works when log_trees is set to True.
log_trees_as_dataframe | bool | False | Whether to parse the model and log trees in CSV format. Works only for Booster objects. See lightgbm.Booster.trees_to_dataframe for details.
log_pickled_booster | bool | True | Whether to log the model as a pickled file.
log_trees | bool | False | Whether to log visualized trees. This requires the Graphviz library to be installed.
tree_figsize | int | 30 | Controls the size of the visualized tree image. Increase this if you work with large trees. Works when log_trees is set to True.
log_confusion_matrix | bool | False | Whether to log a confusion matrix. If set to True, you need to pass y_true and y_pred.
y_true | numpy.array | None | True labels on the test set. Needed only if log_confusion_matrix is set to True.
y_pred | numpy.array | None | Predictions on the test set. Needed only if log_confusion_matrix is set to True.
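
The max_num_features rule described above ("If None or <1, all features will be displayed") can be sketched in plain Python. This is an illustration of the selection rule only, not the code that Neptune or LightGBM runs; top_features and the importance values are made up for the example.

```python
def top_features(importances, max_num_features=10):
    """Rank features by importance and keep the top ones.

    If max_num_features is None or smaller than 1, all features are kept.
    """
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    if max_num_features is None or max_num_features < 1:
        return ranked
    return ranked[:max_num_features]


# Made-up importance scores for four features.
scores = {"age": 12, "income": 30, "city": 3, "tenure": 18}
print(top_features(scores, max_num_features=2))
print(top_features(scores, max_num_features=None))
```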


Returns a dict with all metadata, which you can assign to the Neptune run:

run["booster_summary"] = create_booster_summary(...)
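
Assigning a nested dictionary to a run namespace stores each leaf value under a nested path. As a rough illustration, the sketch below flattens a nested dict into such paths. The keys in summary are invented for the example and are not the actual keys returned by create_booster_summary(); flatten_paths is a hypothetical helper, not part of Neptune.

```python
def flatten_paths(metadata, prefix):
    """Map every leaf value of a nested dict to a 'prefix/key/...' path."""
    flat = {}
    for key, value in metadata.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            flat.update(flatten_paths(value, path))
        else:
            flat[path] = value
    return flat


# Made-up stand-in for the returned dict; the real keys may differ.
summary = {"num_trees": 100, "features": {"count": 4}}
for path, value in flatten_paths(summary, "booster_summary").items():
    print(path, "=", value)
```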


Initialize a Neptune run:

import neptune

# Pass the full project name, for example "ml-team/classification"
run = neptune.init_run(project="workspace-name/project-name")

To find the required string in the Neptune app, click How to create a new run. You can copy the project argument from the modal that opens.

Train a LightGBM model and log the booster summary to Neptune:

import lightgbm as lgb
from neptune.integrations.lightgbm import create_booster_summary

gbm = lgb.train(params, ...)
run["lgbm_summary"] = create_booster_summary(booster=gbm)

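You can customize what to log. For example, to visualize specific trees (list_trees takes effect only when log_trees is set to True):

run["lgbm_summary"] = create_booster_summary(
    booster=gbm, log_trees=True, list_trees=[0, 1, 2, 3, 4]
)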

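To log a confusion matrix, pass both the ground truth and the predicted labels (here, y_test is assumed to hold the true labels for X_test):

import numpy as np

y_pred = np.argmax(gbm.predict(X_test), axis=1)
run["lgbm_summary"] = create_booster_summary(
    booster=gbm, log_confusion_matrix=True, y_true=y_test, y_pred=y_pred
)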
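
To make the relationship between predicted probabilities, predicted labels, and the confusion matrix concrete, here is a small pure-Python sketch. It is not how Neptune computes or renders the matrix; argmax_rows and confusion_matrix are hypothetical helpers, and the probabilities are made up.

```python
def argmax_rows(probabilities):
    """Return, for each row, the index of the class with the highest probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]


def confusion_matrix(y_true, y_pred, num_classes):
    """Count predictions: rows are true labels, columns are predicted labels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


# Made-up class probabilities for four samples and three classes.
probabilities = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
y_true = [0, 1, 2, 1]
y_pred = argmax_rows(probabilities)
print(y_pred)
print(confusion_matrix(y_true, y_pred, num_classes=3))
```

The argmax step plays the same role as np.argmax(..., axis=1) in the snippet above: it turns per-class probabilities into one predicted label per sample.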

See also

neptune-lightgbm repo on GitHub