API reference: XGBoost integration
You can use the Neptune integration with XGBoost to capture model training metadata with the Neptune callback, which logs metadata during XGBoost model training.
This callback requires
The callback logs the following:
- All parameters
- Learning rate
- The pickled model
- Visualizations (feature importances and trees)
- If early stopping is activated, best_score and best_iteration are also logged.
The callback works with the xgboost.train() and xgboost.cv() functions, and with model.fit() from the scikit-learn API.
Metrics are logged for every dataset in the evals list and for every metric specified.
For example, with evals = [(dtrain, "train"), (dval, "valid")] and "eval_metric": ["mae", "rmse"], four metrics are created, one for each combination of dataset and metric.
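The dataset and metric combinations from this example can be enumerated without running any training. The sketch below only illustrates the counting; the "dataset/metric" path format is an assumption made for readability and may not match the exact field names Neptune creates.

```python
from itertools import product

# Dataset names as they appear in the evals list,
# for example evals = [(dtrain, "train"), (dval, "valid")].
dataset_names = ["train", "valid"]

# Metrics as they appear in the training parameters.
params = {"eval_metric": ["mae", "rmse"]}

# One metric series per (dataset, metric) pair.
# The "dataset/metric" format is illustrative, not a confirmed Neptune naming scheme.
metric_series = [
    f"{dataset}/{metric}"
    for dataset, metric in product(dataset_names, params["eval_metric"])
]

print(metric_series)  # ['train/mae', 'train/rmse', 'valid/mae', 'valid/rmse']
```

Adding a third dataset or metric grows the number of logged series multiplicatively, which is worth keeping in mind when many evaluation sets are passed.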
Parameters:
- An existing run reference, as returned by
- Namespace under which all metadata logged by the Neptune callback will be stored.
- Whether to log the model as a pickled file at the end of training.
- Whether to log feature importance charts at the end of training.
- Max number of top features to log on the importance charts. Works when
  For details, see
- Indexes of target trees to log as charts. Requires the Graphviz library to be installed.
  For details, see
- Controls the size of the visualized tree image. Increase this in case you work with large trees. Works when
Create a Neptune run:
If Neptune can't find your project name or API token
As a best practice, you should save your Neptune API token and project name as environment variables:
You can, however, also pass them as arguments when initializing Neptune:
Also works for
- API token: In the bottom-left corner, expand the user menu and select Get my API token.
- Project name: In the top-right menu, select Edit project details.
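The credential lookup described above can be sketched in plain Python. The helper name resolve_neptune_credentials is hypothetical and not part of the Neptune API. The sketch assumes that the environment variables are named NEPTUNE_PROJECT and NEPTUNE_API_TOKEN and that explicitly passed arguments take precedence over them.

```python
import os


def resolve_neptune_credentials(project=None, api_token=None):
    """Hypothetical helper: prefer explicit arguments, fall back to environment variables."""
    project = project or os.getenv("NEPTUNE_PROJECT")
    api_token = api_token or os.getenv("NEPTUNE_API_TOKEN")

    # Collect the names of any values that could not be resolved.
    missing = [
        name
        for name, value in (
            ("NEPTUNE_PROJECT", project),
            ("NEPTUNE_API_TOKEN", api_token),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing Neptune credentials: " + ", ".join(missing))
    return project, api_token
```

A check like this, run before training starts, surfaces a missing token or project name early instead of partway through a long job. The returned values could then be passed as the project and api_token arguments when initializing Neptune.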
If you haven't registered, you can also log anonymously to a public project (make sure not to publish sensitive data through your code!):
Create a Neptune callback and pass it to
When creating the callback, you can specify what you want to log and where:
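As a rough illustration, the following sketch creates a run, configures the callback, and passes it to xgboost.train(). It rests on several assumptions: the import path neptune.integrations.xgboost, the class name NeptuneCallback, and the keyword names (run, base_namespace, log_model, log_importance, max_num_features, log_tree) are taken from the neptune-xgboost package as commonly published, so check them against your installed version. The function name train_with_neptune and all argument values are hypothetical.

```python
def train_with_neptune(params, dtrain, dval, num_boost_round=50):
    """Sketch: train an XGBoost booster while logging metadata to Neptune."""
    # Imported inside the function so that this sketch can be loaded
    # even when the optional dependencies are not installed.
    import neptune
    import xgboost as xgb
    from neptune.integrations.xgboost import NeptuneCallback  # assumed import path

    # Assumes the project name and API token are available
    # as environment variables.
    run = neptune.init_run()

    # Keyword names below are assumptions about the callback's signature.
    neptune_callback = NeptuneCallback(
        run=run,
        base_namespace="training",  # where the metadata is stored in the run
        log_model=True,             # pickled model at the end of training
        log_importance=True,        # feature importance charts
        max_num_features=10,        # top features shown on the charts
        log_tree=[0, 1],            # tree indexes to chart; needs Graphviz
    )

    booster = xgb.train(
        params,
        dtrain,
        num_boost_round=num_boost_round,
        evals=[(dtrain, "train"), (dval, "valid")],
        callbacks=[neptune_callback],
    )

    run.stop()  # make sure all logged metadata is synchronized
    return booster
```

Here dtrain and dval are expected to be xgboost.DMatrix objects, and params a regular XGBoost parameter dictionary such as {"eval_metric": ["mae", "rmse"]}. Setting log_tree to None avoids the Graphviz requirement.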
neptune-xgboost repo on GitHub