Evidently integration guide#
Evidently is an open source tool to evaluate, test, and monitor machine learning models. With Neptune, you can:
- Upload Evidently's interactive reports.
- Log report values as key-value pairs.
- Log and visualize production data drift.
Before you start#
- Sign up at neptune.ai/register.
- Create a project for storing your metadata.
- Have Evidently and Neptune installed. To follow the example, also install pandas and scikit-learn.
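For example, you can install everything with pip (assuming the current PyPI package names neptune and evidently):

pip install -U neptune evidently pandas scikit-learn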
Passing your Neptune credentials
Once you've registered and created a project, set your Neptune API token and full project name to the NEPTUNE_API_TOKEN and NEPTUNE_PROJECT environment variables, respectively.
To find your API token: In the bottom-left corner of the Neptune app, expand the user menu and select Get my API token.
Your full project name has the form workspace-name/project-name. You can copy it from the project settings: Click the menu in the top-right → Details & privacy.
On Windows, navigate to Settings → Edit the system environment variables, or enter the following in Command Prompt:

setx SOME_NEPTUNE_VARIABLE 'some-value'
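On Linux and macOS, you can set the variables in your terminal session instead, for example (replace the placeholder values with your own):

export NEPTUNE_API_TOKEN="your-api-token"
export NEPTUNE_PROJECT="workspace-name/project-name"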
Although it's not recommended, especially for the API token, you can also pass your credentials in the code when initializing Neptune:
run = neptune.init_run(
    project="ml-team/classification",  # your full project name here
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh...3Kb8",  # your API token here
)
For more help, see Set Neptune credentials.
Logging Evidently reports#
You can upload reports to Neptune either as HTML or as a dictionary, depending on how you want to view and access them.
You can find the entire list of presets in the Evidently documentation.
The example uses the following libraries:
from sklearn import datasets
from evidently.test_suite import TestSuite
from evidently.test_preset import DataStabilityTestPreset
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset
- Run Evidently test suites and reports:
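A minimal sketch of this step, assuming the adult dataset fetched from OpenML via scikit-learn; the reference/current split on the education column is purely illustrative:

adult_data = datasets.fetch_openml(name="adult", version=2, as_frame="auto")
adult = adult_data.frame

# Illustrative split: one slice serves as reference data, the other as current data
adult_ref = adult[~adult.education.isin(["Some-college", "HS-grad", "Bachelors"])]
adult_cur = adult[adult.education.isin(["Some-college", "HS-grad", "Bachelors"])]

# Run the data stability test suite
data_stability = TestSuite(tests=[DataStabilityTestPreset()])
data_stability.run(reference_data=adult_ref, current_data=adult_cur)

# Run the data drift report
data_drift_report = Report(metrics=[DataDriftPreset()])
data_drift_report.run(reference_data=adult_ref, current_data=adult_cur)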
- Import Neptune and start a run:
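With your credentials set as environment variables, starting a run takes no arguments:

import neptune

run = neptune.init_run()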
- If you haven't set up your credentials, you can log anonymously:
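For example, using Neptune's anonymous API token and the public project from the example link below:

import neptune

run = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN,
    project="common/evidently-support",
)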
- Save the reports.
As HTML: Using Neptune's HTML previewer, you can view and interact with Evidently's rich HTML reports on Neptune.

data_stability.save_html("data_stability.html")
data_drift_report.save_html("data_drift_report.html")

run["data_stability/report"].upload("data_stability.html")
run["data_drift/report"].upload("data_drift_report.html")
As dict: By saving Evidently's results as a dictionary to Neptune, you can have programmatic access to them to use in your CI/CD pipelines.
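One possible sketch, using Neptune's stringify_unsupported helper to convert any values in the dictionaries that Neptune can't store natively:

from neptune.utils import stringify_unsupported

run["data_stability"] = stringify_unsupported(data_stability.as_dict())
run["data_drift"] = stringify_unsupported(data_drift_report.as_dict())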
- To stop the connection to Neptune and sync all data, call the stop() method:
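run.stop()  # syncs all queued data and closes the run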
Run your script as you normally would.
To open the run, click the Neptune link that appears in the console output.
Example link: https://app.neptune.ai/common/evidently-support/e/EV-7
Result
You can view the reports in the All metadata section.
Logging production data drift#
You can also use Neptune to log the results when using Evidently to evaluate production data drift.
Load a dataset:
curl https://archive.ics.uci.edu/ml/machine-learning-databases/00275/Bike-Sharing-Dataset.zip --create-dirs -o data/Bike-Sharing-Dataset.zip
unzip -o data/Bike-Sharing-Dataset.zip -d data
import pandas as pd
bike_df = pd.read_csv("data/hour.csv")
bike_df["datetime"] = pd.to_datetime(bike_df["dteday"])
bike_df["datetime"] += pd.to_timedelta(bike_df.hr, unit="h")
bike_df.set_index("datetime", inplace=True)
bike_df = bike_df[
    [
        "season",
        "holiday",
        "workingday",
        "weathersit",
        "temp",
        "atemp",
        "hum",
        "windspeed",
        "casual",
        "registered",
        "cnt",
    ]
]
bike_df
Note
For demonstration purposes, we treat this data as the input data for a live model. To apply this approach to production models, the prediction logs should be available.
Define column mapping for Evidently:
from evidently import ColumnMapping
data_columns = ColumnMapping()
data_columns.numerical_features = ["weathersit", "temp", "atemp", "hum", "windspeed"]
data_columns.categorical_features = ["holiday", "workingday"]
Specify which metrics you want to calculate.
In this case, you can generate the Data Drift report and log the drift score for each feature.
def eval_drift(reference, production, column_mapping):
    data_drift_report = Report(metrics=[DataDriftPreset()])
    data_drift_report.run(
        reference_data=reference,
        current_data=production,
        column_mapping=column_mapping,
    )
    report = data_drift_report.as_dict()

    drifts = []
    for feature in column_mapping.numerical_features + column_mapping.categorical_features:
        drifts.append(
            (feature, report["metrics"][1]["result"]["drift_by_columns"][feature]["drift_score"])
        )

    return drifts
Specify the period to treat as the reference – Evidently will use it as the base for the comparison. Then choose the periods to treat as experiment batches, which emulate production model runs.
# Set reference dates
reference_dates = ("2011-01-01 00:00:00", "2011-06-30 23:00:00")
# Set experiment batches dates
experiment_batches = [
("2011-07-01 00:00:00", "2011-07-31 00:00:00"),
("2011-08-01 00:00:00", "2011-08-31 00:00:00"),
("2011-09-01 00:00:00", "2011-09-30 00:00:00"),
("2011-10-01 00:00:00", "2011-10-31 00:00:00"),
("2011-11-01 00:00:00", "2011-11-30 00:00:00"),
("2011-12-01 00:00:00", "2011-12-31 00:00:00"),
("2012-01-01 00:00:00", "2012-01-31 00:00:00"),
("2012-02-01 00:00:00", "2012-02-29 00:00:00"),
("2012-03-01 00:00:00", "2012-03-31 00:00:00"),
("2012-04-01 00:00:00", "2012-04-30 00:00:00"),
("2012-05-01 00:00:00", "2012-05-31 00:00:00"),
("2012-06-01 00:00:00", "2012-06-30 00:00:00"),
("2012-07-01 00:00:00", "2012-07-31 00:00:00"),
("2012-08-01 00:00:00", "2012-08-31 00:00:00"),
("2012-09-01 00:00:00", "2012-09-30 00:00:00"),
("2012-10-01 00:00:00", "2012-10-31 00:00:00"),
("2012-11-01 00:00:00", "2012-11-30 00:00:00"),
("2012-12-01 00:00:00", "2012-12-31 00:00:00"),
]
Log the drifts with Neptune:
import uuid
from datetime import datetime

custom_run_id = str(uuid.uuid4())

for date in experiment_batches:
    with neptune.init_run(
        custom_run_id=custom_run_id,  # (1)!
        tags=["prod monitoring"],  # (Optional) Replace with your own
    ) as run:
        metrics = eval_drift(
            bike_df.loc[reference_dates[0] : reference_dates[1]],
            bike_df.loc[date[0] : date[1]],
            column_mapping=data_columns,
        )

        for feature in metrics:
            run["drift"][feature[0]].append(
                round(feature[1], 3),
                timestamp=datetime.strptime(date[0], "%Y-%m-%d %H:%M:%S").timestamp(),  # (2)!
            )
- Passing a custom run ID ensures that the metrics are logged to the same run.
- Passing a timestamp in the append() method lets you visualize the date on the x-axis of the charts.
If Neptune can't find your project name or API token
As a best practice, you should save your Neptune API token and project name as environment variables:
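For example, on Linux or macOS (the values below are the same placeholders used elsewhere in this guide):

export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"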
Alternatively, you can pass the information when using a function that takes api_token and project as arguments:
run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8",  # (1)!
    project="ml-team/classification",  # (2)!
)
- In the bottom-left corner, expand the user menu and select Get my API token.
- You can copy the path from the project details (menu in the top-right → Details & privacy).
If you haven't registered, you can log anonymously to a public project:
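For example, with the anonymous API token and the public example project used in this guide:

run = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN,
    project="common/evidently-support",
)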
Make sure not to publish sensitive data through your code!
Follow the run link and explore the drifts in the Charts dashboard.
You might have to change the x-axis from Step to Time (absolute).
Related
- Add tags
- Set custom run ID
- What you can log and display
- API reference ≫ append()
- Evidently on GitHub
- Evidently documentation
- Arize integration guide