# TensorFlow integration guide
**Tip:** See also the Keras integration guide.
In this guide, we'll use Neptune to log metadata while training models with TensorFlow. We'll cover the following:
- Tracking and versioning the training data.
- Logging losses and other metrics generated from training.
- Logging predictions over multiple epochs.
- Saving the generated model to Neptune.
## Before you start
- Sign up at neptune.ai/register.
- Create a project for storing your metadata.
To follow this example, have the following installed:

- The Neptune client library (`neptune`)
- TensorFlow 2.x (`tensorflow`)
- `requests`, for downloading the dataset
## Logging example
In this example, we'll work with the MNIST dataset. We'll prepare the data, set up a training loop, and log the metadata with Neptune.
### Create a script
- Import the needed libraries:
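A minimal set, based on what the rest of the script in this guide uses:

```python
import io

import numpy as np
import requests
import tensorflow as tf

import neptune
```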
- Start a Neptune run:
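Assuming your credentials are set as environment variables (see the note below), this can be as simple as:

```python
run = neptune.init_run()
```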
If Neptune can't find your project name or API token

As a best practice, save your Neptune API token and project name as the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables.

Alternatively, you can pass the information when using a function that takes `api_token` and `project` as arguments:

```python
run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8",  # See note 1
    project="ml-team/classification",  # See note 2
)
```

1. In the bottom-left corner, expand the user menu and select **Get my API token**.
2. You can copy the path from the project details (**Details & privacy**).

If you haven't registered, you can log anonymously to a public project:
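For example (the project name below is illustrative; substitute a public project that accepts anonymous logging):

```python
run = neptune.init_run(
    api_token=neptune.ANONYMOUS_API_TOKEN,
    project="common/quickstarts",  # hypothetical public project name
)
```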
Make sure not to publish sensitive data through your code!
- Download the MNIST dataset and track its metadata:

```python
response = requests.get(
    "https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz"
)
with open("mnist.npz", "wb") as f:
    f.write(response.content)

run["datasets/version"].track_files("mnist.npz")
```
You can use the `track_files()` method on a file or folder when you want to track the metadata rather than upload the files in full.

Learn more: See how to work with files that are tracked as artifacts: Track artifacts
- Set up train and test sets:
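One way to load the arrays from the archive downloaded above (TensorFlow's `mnist.npz` stores them under the `x_train`, `y_train`, `x_test`, and `y_test` keys):

```python
with np.load("mnist.npz") as data:
    train_examples = data["x_train"]
    train_labels = data["y_train"]
    test_examples = data["x_test"]
    test_labels = data["y_test"]
```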
- Define and log model parameters:

```python
params = {
    "batch_size": 1024,
    "shuffle_buffer_size": 100,
    "lr": 0.001,
    "num_epochs": 10,
    "num_visualization_examples": 10,
}
run["training/model/params"] = params
```
You can use simple assignment (`=`) to log single values or dictionaries to a field in the run. You can define the structure freely. In this case, we're creating the nested namespaces `training/model` and, inside those, the `params` field where the dictionary is logged.

Learn more: Learn about the structure of Neptune objects: Namespaces and fields
- Normalize and prepare the data for training:

```python
def normalize_img(image):
    """Normalizes images: `uint8` -> `float32`."""
    return tf.cast(image, tf.float32)


train_examples = normalize_img(train_examples)
test_examples = normalize_img(test_examples)

train_dataset = tf.data.Dataset.from_tensor_slices((train_examples, train_labels))
test_dataset = tf.data.Dataset.from_tensor_slices((test_examples, test_labels))

train_dataset = train_dataset.shuffle(params["shuffle_buffer_size"]).batch(
    params["batch_size"]
)
test_dataset = test_dataset.batch(params["batch_size"])
```
- Prepare the model:

```python
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ]
)

loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(params["lr"])

# Capture the model summary as a string
with io.StringIO() as s:
    model.summary(print_fn=lambda x, **kwargs: s.write(x + "\n"))
    model_summary = s.getvalue()
```
- Log the model summary:
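For example, by assigning the captured summary string to a field (the exact field path is up to you):

```python
run["training/model/summary"] = model_summary
```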
- Set up a training loop with Neptune logging:

```python
def loss_and_preds(model, x, y, training):
    # training=training is needed only if there are layers with different
    # behavior during training versus inference (e.g. Dropout)
    y_ = model(x, training=training)
    return loss_object(y_true=y, y_pred=y_), y_


def grad(model, inputs, targets):
    with tf.GradientTape() as tape:
        loss_value, _ = loss_and_preds(model, inputs, targets, training=True)
    return loss_value, tape.gradient(loss_value, model.trainable_variables)


for epoch in range(params["num_epochs"]):
    epoch_loss_avg = tf.keras.metrics.Mean()
    epoch_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()

    for x, y in train_dataset:
        loss_value, grads = grad(model, x, y)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        epoch_loss_avg.update_state(loss_value)
        epoch_accuracy.update_state(y, model(x, training=True))

    # Log train metrics for the epoch
    run["training/train/loss"].append(epoch_loss_avg.result())
    run["training/train/accuracy"].append(epoch_accuracy.result())

    # Log test metrics
    test_loss, test_preds = loss_and_preds(model, test_examples, test_labels, False)
    run["training/test/loss"].append(test_loss)
    acc = epoch_accuracy(test_labels, test_preds)
    run["training/test/accuracy"].append(acc)

    # Log test predictions as images
    for idx in range(params["num_visualization_examples"]):
        np_image = test_examples[idx].numpy().reshape(28, 28)
        image = neptune.types.File.as_image(np_image)
        pred_label = test_preds[idx].numpy().argmax()
        true_label = test_labels[idx]
        run[f"training/visualization/epoch_{epoch}"].append(
            image, description=f"pred={pred_label} | actual={true_label}"
        )

    if epoch % 5 == 0 or epoch == (params["num_epochs"] - 1):
        print(
            "Epoch {:03d}: Loss: {:.3f}, Accuracy: {:.3%}".format(
                epoch, epoch_loss_avg.result(), epoch_accuracy.result()
            )
        )
```
- To stop the connection to Neptune and sync all data, call the `stop()` method:
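```python
run.stop()
```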
## Run the training
Once you execute the code, you should see a Neptune link printed to the console output. Sample output:

```
[neptune] [info ] Neptune initialized. Open in the app:
https://app.neptune.ai/workspace/project/e/RUN-1
```

Follow the link to open the run in Neptune.
## Analyze the results in Neptune
In the **All metadata** section, you can see our two custom namespaces that contain the logged metadata:

- `datasets` – our dataset metadata is tracked here.
- `training` – contains the other model and training metadata.

The other namespaces are generated by default. They contain automatically logged system and basic metadata.