Working with Arize

Arize and Neptune are MLOps tools that aim to improve connected but different parts of your ML pipeline and workflow.

  • Arize helps you:
    • visualize your production model performance
    • understand drift and data quality issues
  • Neptune logs, stores, displays, and compares your model-building metadata, which supports experiment tracking and a model registry.

Together, Arize and Neptune help you:

  • Train the best model
  • Validate your model pre-launch
  • Compare the production performance of your models

Before you start

Arize logging example

You can use callbacks to log and visualize loss curves for each training iteration.

In this example, we'll work with Keras to build a classifier model.

  1. Create a run:

    import neptune
    run = neptune.init_run()
    If Neptune can't find your project name or API token:

    As a best practice, save your Neptune API token and project name as environment variables:

    export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8"
    export NEPTUNE_PROJECT="ml-team/classification"

    Alternatively, pass them as arguments when initializing Neptune:

    run = neptune.init_run(
        api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jvYh3Kb8",  # your token here
        project="ml-team/classification",  # your full project name here
    )

    • Find and copy your API token by clicking your avatar and selecting Get my API token.
    • Find and copy your project name in the project Settings → Properties.

    If you haven't registered, you can also log anonymously to a public project (make sure not to publish sensitive data through your code!):

    run = neptune.init_run(
        api_token=neptune.ANONYMOUS_API_TOKEN,
        project="common/quickstarts",  # assumed here: Neptune's public example project
    )

  2. From your Arize admin page, copy your API key and organization key, then substitute them for the API_KEY and ORG_KEY placeholders in the Client() call below:

    from arize.api import Client
    from arize.types import ModelTypes

    arize = Client(
        organization_key="ORG_KEY",  # replace with your own
        api_key="API_KEY",  # replace with your own
    )

  3. Define some model metadata:

    model_id = "neptune_cancer_prediction_model"
    model_version = "v1"
    model_type = ModelTypes.BINARY
  4. Import and load the data:

    import concurrent.futures as cf
    import datetime
    import os
    import uuid

    import numpy as np
    import pandas as pd
    from sklearn import datasets, preprocessing
    from sklearn.model_selection import train_test_split


    def process_data(X, y):
        # Coerce features to a (n_samples, 30) array and labels to a 1-D array.
        # The scaler is instantiated but not applied in this example.
        scaler = preprocessing.MinMaxScaler()
        X = np.array(X).reshape((len(X), 30))
        y = np.array(y)
        return X, y


    # Load and split the data
    data = datasets.load_breast_cancer()
    X, y = datasets.load_breast_cancer(return_X_y=True)
    X, y = X.astype(np.float32), y
    X, y = pd.DataFrame(X, columns=data["feature_names"]), pd.Series(y)
    # Hold out a test set, then carve a validation set out of the training data
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
    X_train, X_val, y_train, y_val = train_test_split(
        X_train, y_train, random_state=42
    )

  5. Log training callbacks:

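    from tensorflow import keras
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.models import Sequential

    # Requires the neptune-tensorflow-keras package
    from neptune.integrations.tensorflow_keras import NeptuneCallback

    # Define and compile model
    model = Sequential()
    model.add(Dense(10, activation="sigmoid", input_shape=(30,)))
    model.add(Dense(20, activation="sigmoid"))
    model.add(Dense(10, activation="sigmoid"))
    model.add(Dense(1, activation="sigmoid"))
    # Assumed compile settings for a binary classifier; adjust to your needs
    model.compile(optimizer="adam", loss="binary_crossentropy")

    # Fit model and log callbacks
    params = {
        "batch_size": 30,
        "epochs": 50,
        "verbose": 0,
    }

    callbacked = model.fit(
        X_train,
        y_train,
        batch_size=params["batch_size"],
        epochs=params["epochs"],
        verbose=params["verbose"],
        validation_data=(X_test, y_test),
        # log to Neptune using a Neptune callback
        callbacks=[NeptuneCallback(run=run)],
    )
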
  6. Run your script as you normally would.

    To open the run, click the Neptune link that appears in the console output.

A live training curve should show up in the Charts section.
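
Step 1 depends on the NEPTUNE_API_TOKEN and NEPTUNE_PROJECT environment variables being set. The following is a minimal sketch of a pre-flight check you could run before calling neptune.init_run(). The helper name missing_neptune_env is hypothetical and not part of the Neptune client.

```python
import os

REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_env(env=None):
    """Return the names of required Neptune variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_neptune_env()
    if missing:
        print("Set these before calling neptune.init_run():", ", ".join(missing))
    else:
        print("Neptune credentials found in the environment.")
```

Passing a plain dictionary instead of os.environ makes the check easy to exercise in isolation.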

Stop the run when done

Once you are done logging, stop the Neptune run. You need to do this manually when logging from a Jupyter notebook or other interactive environment:

run.stop()

If you're running a script, the connection is stopped automatically when the script finishes executing. In notebooks, however, the connection to Neptune is not stopped when the cell has finished executing, but rather when the entire notebook stops.
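
In a notebook, a cell that raises partway through can leave the run open. One way to guard against that is to wrap the logging code so that stop() is always called. The sketch below assumes only that the object has a stop() method. The stopping helper and the DummyRun class are hypothetical stand-ins for illustration, not part of the Neptune client.

```python
from contextlib import contextmanager


@contextmanager
def stopping(run):
    """Yield `run` and call run.stop() on exit, even if the body raises."""
    try:
        yield run
    finally:
        run.stop()


class DummyRun:
    """Stand-in for a Neptune run, used only to demonstrate the helper."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


dummy = DummyRun()
try:
    with stopping(dummy) as run:
        raise RuntimeError("simulated failure inside a notebook cell")
except RuntimeError:
    pass

print(dummy.stopped)  # True: stop() ran despite the exception
```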

More options

Logging training and validation records to Arize

Arize logs training and validation records to an Evaluation Store for model pre-launch validation, such as visualizing performance across different feature slices (for example, model accuracy for lower-income versus higher-income individuals).
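
To make the idea of a feature slice concrete, here is a minimal sketch that computes accuracy separately for each value of one feature. It does not show how Arize computes its metrics. The function name and the sample records are made up for illustration.

```python
from collections import defaultdict


def accuracy_by_slice(slices, actuals, predictions):
    """Return {slice_value: accuracy} for parallel sequences of equal length."""
    if not (len(slices) == len(actuals) == len(predictions)):
        raise ValueError("all inputs must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for s, a, p in zip(slices, actuals, predictions):
        total[s] += 1
        correct[s] += int(a == p)
    return {s: correct[s] / total[s] for s in total}


# Made-up records: income bracket, true label, predicted label
income = ["low", "low", "low", "high", "high", "high", "high"]
actual = [1, 0, 1, 1, 0, 0, 1]
predicted = [1, 1, 0, 1, 0, 0, 1]

result = accuracy_by_slice(income, actual, predicted)
print(result)
```

In this toy data the model is right on every "high" record but on only one of three "low" records, which is the kind of gap a slice view is meant to surface.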

The records you send can also serve as your model baseline, which can be compared against the features that your models use for prediction in production. This helps inform you when the distributions of the features have shifted.
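
As a rough illustration of comparing a baseline against production data, the sketch below computes a population stability index (PSI) for a single numeric feature. PSI is one common drift measure and is used here as an example only; it is not presented as the calculation Arize performs. The function name, bin count, and sample data are assumptions.

```python
import math


def population_stability_index(baseline, production, bins=10, eps=1e-6):
    """PSI between two numeric lists, using equal-width bins from the baseline."""
    if not baseline or not production:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid zero width when the baseline is constant

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp outliers into edge bins
        # Floor at eps so empty bins do not produce log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    b = bin_fractions(baseline)
    p = bin_fractions(production)
    return sum((pi - bi) * math.log(pi / bi) for bi, pi in zip(b, p))


baseline = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in baseline]  # simulated production data with a shifted mean

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, shifted)
print(psi_same, psi_shifted)
```

Identical distributions give a PSI of zero, and the value grows as the production distribution moves away from the baseline.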

To learn more about the Arize Python SDK and arize.log_training_records, see the Arize documentation.

Adding an optional helper
# OPTIONAL: A quick helper function to validate Arize responses
def arize_responses_helper(responses):
    # Each item is a future; wait for it to complete and check the HTTP status
    for response in cf.as_completed(responses):
        r = response.result()
        if r.status_code != 200:
            raise ValueError(
                f"future failed with response code {r.status_code}, {r.text}"
            )
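
If you would rather see every failed response than stop at the first one, a variant of the helper can collect failures into a list. The sketch below is self-contained and uses a hypothetical FakeResponse class in place of real Arize responses. It assumes only that each future resolves to an object with status_code and text attributes.

```python
import concurrent.futures as cf
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Stand-in for an HTTP response; not part of the Arize SDK."""

    status_code: int
    text: str = ""


def collect_failures(futures):
    """Return (status_code, text) for every completed future that is not HTTP 200."""
    failures = []
    for future in cf.as_completed(futures):
        response = future.result()
        if response.status_code != 200:
            failures.append((response.status_code, response.text))
    return failures


with cf.ThreadPoolExecutor(max_workers=2) as pool:
    futures = [
        pool.submit(FakeResponse, 200, "ok"),
        pool.submit(FakeResponse, 500, "server error"),
        pool.submit(FakeResponse, 200, "ok"),
    ]
    failures = collect_failures(futures)

print(failures)  # [(500, 'server error')]
```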

Logging training records to Arize

# Use the model to generate predictions
y_train_pred = model.predict(X_train).T[0]
y_val_pred = model.predict(X_val).T[0]
y_test_pred = model.predict(X_test).T[0]

# Logging training
train_prediction_labels = pd.Series(y_train_pred)
train_actual_labels = pd.Series(y_train)
train_feature_df = pd.DataFrame(X_train, columns=data["feature_names"])

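train_responses = arize.log_training_records(
    model_id=model_id,
    model_version=model_version,
    model_type=model_type,  # this will change depending on your model type
    prediction_labels=train_prediction_labels,
    actual_labels=train_actual_labels,
    features=train_feature_df,
)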


Logging validation to Arize

val_prediction_labels = pd.Series(y_val_pred)
val_actual_labels = pd.Series(y_val)
val_features_df = pd.DataFrame(X_val, columns=data["feature_names"])

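val_responses = arize.log_validation_records(
    model_id=model_id,
    model_version=model_version,
    model_type=model_type,
    batch_id="validation_batch_0",  # assumed identifier for this validation batch
    prediction_labels=val_prediction_labels,
    actual_labels=val_actual_labels,
    features=val_features_df,
)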


Storing and versioning model weights with Neptune

Neptune allows you to organize your model metadata in a folder-like structure inside the run. For each run, you can log model weights or checkpoints.

To keep the two tools aligned, label each trained iteration with the same model_version value that you used when logging training records to Arize.


The code for storing a model differs between frameworks. This example applies only to Keras.

  • To have all the metadata in a single place, log the model files to the same run you created earlier:

import glob

# Storing model version 1
directory_name = f"keras_model_{model_version}"
model.save(directory_name)  # assumes TensorFlow 2.x SavedModel output (variables/ folder)

# Upload each weights file under a matching path in the run
for name in glob.glob(f"{directory_name}/variables/*"):
    run[name].upload(name)

# Log "model_id", for better reference
run["model_id"] = model_id

  • To manage your model metadata separately, use the Neptune model registry:

First, create a Model object that serves as an umbrella for all of its versions. Then initialize a ModelVersion object and log the metadata there, just like you would with a run. You can create and manage each model version separately.

# Create new version of a registered model with ID "CLS-PRETRAINED"
model_v = neptune.init_model_version(model="CLS-PRETRAINED")

# Log metadata to the ModelVersion object, just like you would for runs
for name in glob.glob(f"{directory_name}/variables/*"):
    model_v[name].upload(name)

The model metadata will now be displayed in the Models tab of the app.

Logging and versioning the model in production with Arize

During production, you can use arize.bulk_log or arize.log in the Python SDK to log data from your model-serving endpoint.

In this example, we send in our test data, simulating a production setting. In actual production, you would deploy the models saved by Neptune prior to logging to Arize.

To learn more about arize.bulk_log, see the Arize documentation.

import uuid

# Generating predictions
y_test_pred = pd.Series(y_test_pred)
num_preds = len(y_test_pred)  # num_preds == 143

# Generating prediction IDs
ids_df = pd.DataFrame([str(uuid.uuid4()) for _ in range(num_preds)])

# Logging the predictions, features, and actuals
log_predictions_responses = arize.bulk_log(
    # Required arguments
    model_id=model_id,
    prediction_ids=ids_df,
    # Optional arguments
    model_version=model_version,
    prediction_labels=y_test_pred,
    actual_labels=y_test,
    features=X_test,  # we recommend logging features with predictions
    # we recommend using model_type on the first time logging to Arize
    model_type=model_type,
)
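
The prediction IDs generated above give each record a unique key. One plausible use, sketched here as an assumption rather than as documented Arize behavior, is matching predictions with ground truth that arrives later. The helper names assign_prediction_ids and join_actuals are hypothetical, and the data is made up.

```python
import uuid


def assign_prediction_ids(predictions):
    """Pair each prediction with a fresh UUID4 string."""
    return {str(uuid.uuid4()): p for p in predictions}


def join_actuals(logged, actuals_by_id):
    """Return (prediction, actual) pairs for IDs present in both mappings."""
    return [(logged[i], actuals_by_id[i]) for i in logged if i in actuals_by_id]


# Predictions served now, keyed by ID
logged = assign_prediction_ids([0.91, 0.12, 0.67])

# Ground truth arrives later for only the first two predictions
first_two_ids = list(logged)[:2]
actuals_by_id = dict(zip(first_two_ids, [1, 0]))

pairs = join_actuals(logged, actuals_by_id)
print(pairs)  # [(0.91, 1), (0.12, 0)]
```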