Neptune-XGBoost Integration
What will you get with this integration?
XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. The integration with Neptune lets you log multiple training artifacts with no further customization.

The integration is implemented as an XGBoost callback and provides the following capabilities:

- Log metrics (train and eval) after each boosting iteration.
- Log the model (Booster) to Neptune after the last boosting iteration.
- Log feature importance to Neptune as an image after the last boosting iteration.
- Log visualized trees to Neptune as images after the last boosting iteration.
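Each of these behaviors can be controlled through the callback's arguments. Here is a quick sketch of the options as exposed by neptune-contrib (the defaults shown are what we assume for the tested versions; verify them against your installed release):

from neptunecontrib.monitoring.xgboost import neptune_callback

# All arguments are optional. An experiment must already be created
# (see the Quickstart below) for the callback to attach to.
callback = neptune_callback(
    log_model=True,         # upload the Booster after the last iteration
    log_importance=True,    # log the feature importance chart as an image
    max_num_features=None,  # cap how many features the chart shows
    log_tree=None,          # indices of trees to visualize, e.g. [0, 1]
)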
Note
This integration is tested with xgboost==1.2.0 and neptune-client==0.4.132.
Where to start?
To get started with this integration, follow the Quickstart below. If you want to try things out and focus only on the code, you can either:

- open the Colab notebook with the quickstart code and run it as the anonymous user “neptuner” (zero setup, it just works), or
- view the quickstart code as a plain Python script on GitHub.
Quickstart
This quickstart will show you how to log XGBoost experiments to Neptune using the XGBoost-Neptune integration. The integration is implemented as an XGBoost callback and is made available in the neptune-contrib library.

As a result, you will have an experiment logged to Neptune with metrics, the model, feature importance charts, and (optionally, requires graphviz) visualized trees. Have a look at this example experiment.
Before you start
Make sure you have Python 3.x and the following libraries installed:

- neptune-client: see the neptune-client installation guide.
- neptune-contrib[monitoring]: see the neptune-contrib installation guide.
- xgboost==1.2.0: see the XGBoost installation guide.
- pandas==1.0.5 and scikit-learn==0.23.1: see the pandas installation guide and the scikit-learn installation guide.
Example
Make sure you have created an experiment before you start XGBoost training. Use the create_experiment() method to do this.
Here is how to use the Neptune-XGBoost integration:
import neptune
...
# here you import `neptune_callback` that does the magic (the open source magic :)
from neptunecontrib.monitoring.xgboost import neptune_callback
...
# Use neptune callback
neptune.create_experiment(name='xgb', tags=['train'], params=params)
xgb.train(params, dtrain, num_round, watchlist,
          callbacks=[neptune_callback()])  # neptune_callback is here
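For context, here is a fuller, self-contained sketch of the same flow. The project name ('shared/XGBoost-integration' with the ANONYMOUS token points at Neptune's public sandbox), the Boston housing dataset, and the hyperparameters are all illustrative assumptions; substitute your own:

import neptune
import xgboost as xgb
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

from neptunecontrib.monitoring.xgboost import neptune_callback

# Point neptune-client at a project. Replace the project name and
# token with your own; ANONYMOUS works for shared sandbox projects.
neptune.init('shared/XGBoost-integration', api_token='ANONYMOUS')

# Example data only; load_boston is available in the pinned
# scikit-learn==0.23.1.
data = load_boston()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42)
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

params = {'max_depth': 5,
          'eta': 0.5,
          'objective': 'reg:squarederror',
          'eval_metric': 'rmse'}
watchlist = [(dtest, 'eval'), (dtrain, 'train')]
num_round = 20

# Create the experiment first, then attach the callback to training.
neptune.create_experiment(name='xgb', tags=['train'], params=params)
xgb.train(params, dtrain, num_round, watchlist,
          callbacks=[neptune_callback()])

# Close the experiment when training is done.
neptune.stop()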
Logged metrics
These are logged for train and eval (or whatever you defined in the watchlist) after each boosting iteration.

Logged model
The model (Booster) is logged to Neptune after the last boosting iteration. If you run cross-validation, you get a model for each fold.
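For example, a minimal cross-validation call looks like the sketch below (nfold=5 and the reuse of params, dtrain, and num_round from the Quickstart are illustrative choices):

# Cross-validation with the same callback; each fold's Booster is
# logged to Neptune after the last boosting iteration.
xgb.cv(params, dtrain, num_boost_round=num_round, nfold=5,
       callbacks=[neptune_callback()])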

Logged feature importance
This chart is very useful, as it shows how much each feature contributes to the model. It is logged to Neptune as an image after the last boosting iteration. If you run cross-validation, you get a feature importance chart for each fold’s model.

Logged visualized trees (requires graphviz)
Note
You need to install graphviz and the graphviz Python interface for the log_tree feature to work. Check Graphviz and Graphviz Python interface for installation info.
Log the first six trees at the end of training (trees with indices 0, 1, 2, 3, 4, 5):

xgb.train(params, dtrain, num_round, watchlist,
          callbacks=[neptune_callback(log_tree=[0, 1, 2, 3, 4, 5])])
Selected trees are logged to Neptune as an image after the last boosting iteration. If you run cross-validation, you get a tree visualization for each fold’s model, independently.

Explore Results
You just learned how to start logging XGBoost experiments to Neptune. Check this example experiment or view the quickstart code as a plain Python script on GitHub.

Common problems
If you are using a Windows machine with Python 3.8 and xgboost-1.2.1, you may encounter a tkinter error when logging feature importance. This problem does not occur on Windows with Python 3.8 and xgboost-1.2.0, nor with Python 3.6 or Python 3.7.
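A possible workaround, assuming the error originates in matplotlib's default Tk-based GUI backend (which the feature importance plot relies on), is to switch matplotlib to a non-interactive backend before training starts:

# Force a non-GUI matplotlib backend so no tkinter window is needed.
# Run this before anything imports matplotlib.pyplot.
import matplotlib
matplotlib.use('Agg')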
How to ask for help?
Please visit the Getting help page. Everything regarding support is there.