XGBoost

You can use the Neptune integration with XGBoost to capture model training metadata through NeptuneCallback.

You can find detailed information on how to install and use the integration in the user guide.

NeptuneCallback

Neptune callback for logging metadata during XGBoost model training.

This callback logs metrics, all parameters, the learning rate, the pickled model, and visualizations. If early stopping is activated, "best_score" and "best_iteration" are also logged.

Metrics are logged for every dataset in the evals list and for every metric specified. For example, with evals = [(dtrain, "train"), (dval, "valid")] and "eval_metric": ["mae", "rmse"], four metric series are created: "train/mae", "train/rmse", "valid/mae", and "valid/rmse".
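
For illustration, here is a minimal sketch of such a setup; the synthetic data and the objective are placeholders, not part of the integration itself:

# Sketch of the evals / eval_metric setup described above (synthetic data)
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(42)
X, y = rng.normal(size=(100, 5)), rng.normal(size=100)
dtrain = xgb.DMatrix(X[:80], label=y[:80])
dval = xgb.DMatrix(X[80:], label=y[80:])

params = {"objective": "reg:squarederror", "eval_metric": ["mae", "rmse"]}
evals = [(dtrain, "train"), (dval, "valid")]
# Passed to xgb.train() together with the callback, this produces four series:
# "train/mae", "train/rmse", "valid/mae", and "valid/rmse" (under the base namespace)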

The callback works with the xgboost.train() and xgboost.cv() functions, and with the scikit-learn API's model.fit() method.

This callback requires xgboost>=1.3.0.
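
As a sketch, the callback is passed the same way to both entry points; run, params, and dtrain are assumed to be already defined (see the examples below):

# Assumes `run`, `params`, and `dtrain` already exist (see the examples below)
import xgboost as xgb
from neptune.new.integrations.xgboost import NeptuneCallback

neptune_callback = NeptuneCallback(run=run)

# With xgboost.train()
booster = xgb.train(params, dtrain, num_boost_round=50, callbacks=[neptune_callback])

# With xgboost.cv()
cv_results = xgb.cv(params, dtrain, num_boost_round=50, nfold=5, callbacks=[neptune_callback])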

Parameters

run

(Run) - An existing run reference (as returned by neptune.init()).

base_namespace

(str, optional, default is "training") - Namespace under which all metadata logged by the NeptuneCallback will be stored.

log_model

(bool, default is True) - Whether to log the model as a pickled file at the end of training.

log_importance

(bool, default is True) - Whether to log feature importance charts at the end of training.

max_num_features

(int, default is None) - Maximum number of top features to display on the importance charts. Works only if log_importance is set to True. If None, all features are displayed. See xgboost.plot_importance for details.

log_tree

(list of int, default is None) - Indices of the target trees to log as charts. See xgboost.to_graphviz for details.

This requires the graphviz library; see the user guide for installation instructions.

tree_figsize

(int, default is 30) - Controls the size of the visualized tree image. Increase it if you work with large trees. Works only if log_tree is not None. A configuration sketch follows this parameter list.
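
For example, a minimal sketch of a callback configured with these chart-related options; the values are placeholders and an existing run is assumed:

# Hypothetical chart-related configuration (assumes an existing `run`)
from neptune.new.integrations.xgboost import NeptuneCallback

neptune_callback = NeptuneCallback(
    run=run,
    max_num_features=10,  # show only the 10 most important features
    log_tree=[0, 1],      # log the first two trees as charts
    tree_figsize=40,      # larger figure, useful for big trees
)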

Examples

# Create run
import neptune.new as neptune

run = neptune.init(project="WORKSPACE/PROJECT")

# Create Neptune callback and pass it to the xgb.train() function
import xgboost as xgb
from neptune.new.integrations.xgboost import NeptuneCallback

neptune_callback = NeptuneCallback(run=run)

xgb.train(
    ...,  # params, dtrain, and the other training arguments
    callbacks=[neptune_callback],
)

# When creating the callback, you can customize what you want to log and where
neptune_callback = NeptuneCallback(
    run=run,
    base_namespace="experiment",
    log_model=False,
    log_tree=[0, 1, 2, 3],
)
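
For reference, here is a minimal end-to-end sketch; the project name is a placeholder and the data is synthetic, so adapt both to your own setup:

# Minimal end-to-end sketch (placeholder project name, synthetic data)
import numpy as np
import xgboost as xgb
import neptune.new as neptune
from neptune.new.integrations.xgboost import NeptuneCallback

run = neptune.init(project="WORKSPACE/PROJECT")
neptune_callback = NeptuneCallback(run=run, base_namespace="training")

# Synthetic regression data, only for illustration
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 5)), rng.normal(size=200)
dtrain = xgb.DMatrix(X[:160], label=y[:160])
dval = xgb.DMatrix(X[160:], label=y[160:])

params = {"objective": "reg:squarederror", "eval_metric": ["mae", "rmse"]}

xgb.train(
    params=params,
    dtrain=dtrain,
    num_boost_round=50,
    evals=[(dtrain, "train"), (dval, "valid")],
    callbacks=[neptune_callback],
)

run.stop()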