Working with Arize#
Arize and Neptune are MLOps tools that aim to improve connected but different parts of your ML pipeline and workflow.
- Arize helps you:
  - visualize your production model performance
  - understand drift and data quality issues
- Neptune logs, stores, displays, and compares your model-building metadata for better experiment tracking and model registry.
Together, Arize and Neptune help you:
- Train the best model
- Validate your model pre-launch
- Compare the production performance of those models
Related
- Arize website
- Arize documentation
- Arize on GitHub
Before you start#
- Set up Neptune. Instructions:
- Install Arize and the Neptune–Keras package:
Arize logging example#
You can use callbacks to log and visualize loss curves for each training iteration.
In this example, we'll work with Keras to build a classifier model.
- Create a run:
If Neptune can't find your project name or API token
As a best practice, you should save your Neptune API token and project name as environment variables:
export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8"
export NEPTUNE_PROJECT="ml-team/classification"
You can, however, also pass them as arguments when initializing Neptune:
run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8",  # your token here
    project="ml-team/classification",  # your full project name here
)
- Find and copy your API token by clicking your avatar and selecting Get my API token.
- Find and copy your project name in the project Settings → Properties.
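Before initializing Neptune, it can help to confirm that both environment variables are actually visible to the Python process. The following is a minimal sketch using only the standard library; the function name `missing_neptune_vars` is hypothetical and not part of the Neptune API.

```python
import os

# Names of the environment variables described above.
REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_neptune_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Neptune credentials found in the environment.")
```

If either name is reported as missing, export it in the shell that launches your script or notebook, or pass the value as an argument as shown above.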
If you haven't registered, you can also log anonymously to a public project (make sure not to publish sensitive data through your code!):
- From your Arize admin page, copy the API_KEY and ORG_KEY and replace them in the Client() arguments below:
- Define some model metadata:
- Import and load the data:
import concurrent.futures as cf
import datetime
import os
import uuid

import numpy as np
import pandas as pd
from sklearn import datasets, preprocessing
from sklearn.model_selection import train_test_split


def process_data(X, y):
    # Note: the scaler is created but never applied, so X is only reshaped
    scaler = preprocessing.MinMaxScaler()
    X = np.array(X).reshape((len(X), 30))
    y = np.array(y)
    return X, y


# Load data and split data
data = datasets.load_breast_cancer()
X, y = datasets.load_breast_cancer(return_X_y=True)
X, y = X.astype(np.float32), y
X, y = pd.DataFrame(X, columns=data["feature_names"]), pd.Series(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, random_state=42
)
- Log training callbacks:
import tensorflow as tf
import tensorflow.keras as keras
from keras.layers import Activation, Dense, Dropout, Flatten
from keras.models import Sequential

# Needed for NeptuneCallback below (skip if you already imported it
# when creating the run)
from neptune.integrations.tensorflow_keras import NeptuneCallback

# Define and compile model
model = Sequential()
model.add(Dense(10, activation="sigmoid", input_shape=((30,))))
model.add(Dropout(0.25))
model.add(Dense(20, activation="sigmoid"))
model.add(Dropout(0.25))
model.add(Dense(10, activation="sigmoid"))
model.add(Dropout(0.25))
model.add(Dense(1, activation="sigmoid"))
model.compile(
    optimizer=keras.optimizers.Adam(),
    loss=keras.losses.mean_squared_logarithmic_error,
)

# Fit model and log callbacks
params = {
    "batch_size": 30,
    "epochs": 50,
    "verbose": 0,
}

callbacked = model.fit(
    X_train,
    y_train,
    batch_size=params["batch_size"],
    epochs=params["epochs"],
    verbose=params["verbose"],
    validation_data=(X_test, y_test),
    # log to Neptune using a Neptune callback
    callbacks=[NeptuneCallback(run=run)],
)
- Run your script as you normally would.
To open the run, click the Neptune link that appears in the console output.
A live training curve should show up in the Charts section.
Stop the run when done
Once you are done logging, you should stop the Neptune run. When logging from a Jupyter notebook or other interactive environment, you need to do this manually by calling run.stop().
If you're running a script, the connection is stopped automatically when the script finishes executing. In notebooks, however, the connection to Neptune is not stopped when the cell has finished executing, but rather when the entire notebook stops.
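One way to make sure the run is always stopped, even when training raises an exception, is to wrap the logging code in try/finally. The sketch below illustrates only the pattern: `DummyRun` is a hypothetical stand-in so the snippet runs without credentials, and in real code you would pass the object returned by `neptune.init_run()` instead.

```python
class DummyRun:
    """Hypothetical stand-in for a Neptune run; it only records what happened."""

    def __init__(self):
        self.stopped = False
        self.fields = {}

    def __setitem__(self, key, value):
        self.fields[key] = value

    def stop(self):
        self.stopped = True


def train_and_log(run, fail=False):
    """Log one value and always stop the run, even if training raises."""
    try:
        if fail:
            raise RuntimeError("simulated training failure")
        run["model_id"] = "demo-model"  # made-up value for illustration
    finally:
        # Runs on success and on failure alike
        run.stop()


run = DummyRun()
train_and_log(run)
print("stopped after success:", run.stopped)
```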
More options#
Logging training and validation records to Arize#
Arize logs training and validation records to an Evaluation Store for model pre-launch validation, such as visualizing performance across different feature slices (for example, model accuracy for lower-income versus higher-income individuals).
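To make the idea of a feature slice concrete, the sketch below computes accuracy separately for each value of one feature. It is a plain-Python illustration with made-up records, not the computation Arize performs; the function name `accuracy_by_slice` is hypothetical.

```python
from collections import defaultdict


def accuracy_by_slice(slices, y_true, y_pred):
    """Return {slice_value: accuracy} for parallel sequences of equal length."""
    if not (len(slices) == len(y_true) == len(y_pred)):
        raise ValueError("slices, y_true and y_pred must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for s, actual, predicted in zip(slices, y_true, y_pred):
        total[s] += 1
        correct[s] += int(actual == predicted)
    return {s: correct[s] / total[s] for s in total}


# Made-up records: an income bucket, the actual label, and the predicted label
income = ["low", "low", "low", "high", "high", "high", "high"]
actual = [1, 0, 1, 1, 0, 0, 1]
predicted = [1, 1, 1, 1, 0, 0, 0]

result = accuracy_by_slice(income, actual, predicted)
print(result)
```

A gap between slices, like the one in this toy data, is the kind of signal that pre-launch validation is meant to surface.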
The records you send can also serve as your model baseline, which can be compared against the features that your models use for prediction in production. This helps inform you when the distributions of the features have shifted.
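As an illustration of comparing a baseline against production data, the sketch below computes the population stability index (PSI) for a single numeric feature. PSI is one commonly used drift measure and is chosen here only as an example; this page does not state which metrics Arize computes, and the function names and synthetic data are assumptions.

```python
import math
import random


def histogram_fractions(values, edges):
    """Fraction of values falling in each bin defined by ascending edges."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Values outside the edge range end up in the first or last bin
        idx = 0
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    return [c / len(values) for c in counts]


def population_stability_index(baseline, production, n_bins=10, eps=1e-6):
    """PSI between two samples, with equal-width bins over the baseline range."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    p = histogram_fractions(baseline, edges)
    q = histogram_fractions(production, edges)
    # eps keeps empty bins from producing log(0) or division by zero
    return sum((qi - pi) * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))


# Synthetic feature values: one sample like the baseline, one with a shifted mean
rng = random.Random(42)
baseline = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = population_stability_index(baseline, same)
psi_shifted = population_stability_index(baseline, shifted)
print(f"PSI, same distribution: {psi_same:.4f}")
print(f"PSI, shifted mean:      {psi_shifted:.4f}")
```

The shifted sample yields a noticeably larger value than the sample drawn from the baseline distribution, which is the behavior a drift monitor relies on.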
To learn more about the Arize Python SDK and arize.log_training_records, see the Arize documentation.
# OPTIONAL: A quick helper function to validate Arize responses
def arize_responses_helper(responses):
    for response in cf.as_completed(responses):
        r = response.result()
        if r.status_code != 200:
            raise ValueError(
                f"future failed with response code {r.status_code}, {r.text}"
            )
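To see how the helper behaves without sending anything to Arize, you can feed it already-resolved futures. In this sketch, `FakeResponse` and `completed_future` are hypothetical stand-ins that expose only the `status_code` and `text` attributes the helper reads; the helper itself is repeated so the snippet runs on its own.

```python
import concurrent.futures as cf
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in exposing only the attributes the helper reads."""

    status_code: int
    text: str = ""


def arize_responses_helper(responses):
    # Same logic as the helper above
    for response in cf.as_completed(responses):
        r = response.result()
        if r.status_code != 200:
            raise ValueError(
                f"future failed with response code {r.status_code}, {r.text}"
            )


def completed_future(response):
    """Wrap a response in an already-resolved Future."""
    future = cf.Future()
    future.set_result(response)
    return future


# All responses OK: the helper returns without raising
arize_responses_helper([completed_future(FakeResponse(200)) for _ in range(3)])

# One failed response: the helper raises ValueError
try:
    arize_responses_helper(
        [
            completed_future(FakeResponse(200)),
            completed_future(FakeResponse(401, "unauthorized")),
        ]
    )
except ValueError as err:
    print("Caught:", err)
```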
Logging training records to Arize#
# Use the model to generate predictions
y_train_pred = model.predict(X_train).T[0]
y_val_pred = model.predict(X_val).T[0]
y_test_pred = model.predict(X_test).T[0]

# Logging training
train_prediction_labels = pd.Series(y_train_pred)
train_actual_labels = pd.Series(y_train)
train_feature_df = pd.DataFrame(X_train, columns=data["feature_names"])

train_responses = arize.log_training_records(
    model_id=model_id,
    model_version=model_version,
    model_type=model_type,  # this will change depending on your model type
    prediction_labels=train_prediction_labels,
    actual_labels=train_actual_labels,
    features=train_feature_df,
)
arize_responses_helper(train_responses)
Logging validation to Arize#
val_prediction_labels = pd.Series(y_val_pred)
val_actual_labels = pd.Series(y_val)
val_features_df = pd.DataFrame(X_val, columns=data["feature_names"])

val_responses = arize.log_validation_records(
    model_id=model_id,
    model_version=model_version,
    model_type=model_type,
    batch_id="batch0",
    prediction_labels=val_prediction_labels,
    actual_labels=val_actual_labels,
    features=val_features_df,
)
arize_responses_helper(val_responses)
Storing and versioning model weights with Neptune#
Neptune allows you to organize your model metadata in a folder-like structure inside the run. For each run, you can log model weights or checkpoints.
For better integration, you can organize the different trained iterations using the same model_version tag that you used to log training records to Arize.
Note
The code for model storing is different for different frameworks. This example is only applicable to Keras.
- To have all the metadata in a single place, you can log model metadata to the same run you created earlier.
- To manage your model metadata separately, you can use the Neptune model registry.
import glob

# Storing model version 1
directory_name = f"keras_model_{model_version}"
model.save(directory_name)
run[f"{directory_name}/saved_model.pb"].upload(f"{directory_name}/saved_model.pb")
for name in glob.glob(f"{directory_name}/variables/*"):
    run[name].upload(name)

# Log "model_id", for better reference
run["model_id"] = model_id
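The upload loop above uses each file path as the field path inside the run, which is what produces the folder-like structure. The following dry-run sketch lists the field and file pairs such a loop would produce, without contacting Neptune. The function name `plan_uploads`, the version string `1.0`, and the empty placeholder files are assumptions for illustration. It uses `as_posix()` so that field paths use forward slashes on every operating system; how Neptune treats backslash-separated paths is not covered here.

```python
import tempfile
from pathlib import Path


def plan_uploads(model_dir):
    """Return (field_path, file_path) pairs for every file under model_dir."""
    root = Path(model_dir)
    pairs = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Field path starts with the model directory name and uses "/"
            field = path.relative_to(root.parent).as_posix()
            pairs.append((field, str(path)))
    return pairs


with tempfile.TemporaryDirectory() as tmp:
    # Empty placeholder files arranged like a saved Keras model directory
    model_dir = Path(tmp) / "keras_model_1.0"
    (model_dir / "variables").mkdir(parents=True)
    for rel in (
        "saved_model.pb",
        "variables/variables.index",
        "variables/variables.data-00000-of-00001",
    ):
        (model_dir / rel).write_bytes(b"")

    for field, file_path in plan_uploads(model_dir):
        print(field)
```

Printing the plan first is a cheap way to check that the namespace will look the way you expect before any files are uploaded.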
Initialize a ModelVersion object and log the metadata there, just like you would with a run.
You first need to create a Model object that functions as an umbrella for all the versions. You can create and manage each model version separately.
# Create new version of a registered model with ID "CLS-PRETRAINED"
model_v = neptune.init_model_version(model="CLS-PRETRAINED")

# Log metadata to the ModelVersion object, just like you would for runs
model_v[f"{directory_name}/saved_model.pb"].upload(f"{directory_name}/saved_model.pb")
for name in glob.glob(f"{directory_name}/variables/*"):
    model_v[name].upload(name)
The model metadata will now be displayed in the Models tab of the app.
Logging and versioning the model in production with Arize#
During production, you can use arize.bulk_log or arize.log in the Python SDK to log any data in your model serving endpoint.
In this example, we send in our test data, simulating a production setting. In actual production, you would deploy the models saved by Neptune prior to logging to Arize.
To learn more about arize.bulk_log, see the Arize documentation.
import datetime

# Generating predictions
y_test_pred = pd.Series(y_test_pred)
num_preds = len(y_test_pred)  # num_preds == 143

# Generating prediction IDs
ids_df = pd.DataFrame([str(uuid.uuid4()) for _ in range(num_preds)])

# Logging the predictions, features, and actuals
log_predictions_responses = arize.bulk_log(
    # Required arguments
    model_id=model_id,
    prediction_ids=ids_df,
    # Optional arguments
    model_version=model_version,
    prediction_labels=y_test_pred,
    actual_labels=y_test,
    features=X_test,  # we recommend logging features with predictions
    model_type=model_type,
    # we recommend using model_type on the first time logging to Arize
    feature_names_overwrite=None,
)
arize_responses_helper(log_predictions_responses)
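Because each prediction is sent together with an ID, it is worth checking that the IDs, predictions, and actuals line up one-to-one before logging. The sketch below is a plain-Python sanity check with made-up values. The helper names are hypothetical, and treating duplicate IDs as an error is an assumption of this sketch rather than a documented Arize rule.

```python
import uuid


def make_prediction_ids(n):
    """Generate n random UUID4 strings to use as prediction IDs."""
    return [str(uuid.uuid4()) for _ in range(n)]


def check_aligned(prediction_ids, predictions, actuals):
    """Raise ValueError unless the three sequences line up one-to-one."""
    n = len(prediction_ids)
    if len(predictions) != n or len(actuals) != n:
        raise ValueError(
            f"length mismatch: {n} ids, {len(predictions)} predictions, "
            f"{len(actuals)} actuals"
        )
    if len(set(prediction_ids)) != n:
        raise ValueError("prediction IDs must be unique")


# Made-up predictions and actuals standing in for the test set
predictions = [0.91, 0.12, 0.77, 0.05]
actuals = [1, 0, 1, 0]
ids = make_prediction_ids(len(predictions))

check_aligned(ids, predictions, actuals)
print(f"{len(ids)} records ready to log")
```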