TensorFlow integration guide#

Before you start#

Sign up at neptune.ai/register.
Create a project for storing your metadata.

To follow this example, have the following installed:

pip:

pip install -U neptune tensorflow numpy requests

conda:

conda install -c conda-forge neptune tensorflow numpy requests
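
Before moving on, it can help to confirm that the packages are importable in your environment. The sketch below uses only the standard library. The helper name `missing_packages` is made up for this illustration and is not part of Neptune.

```python
import importlib.util


def missing_packages(names):
    """Return the names that can't be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the packages used in this guide
required = ["neptune", "tensorflow", "numpy", "requests"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All packages found.")
```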

Logging example#

In this example, we'll work with the MNIST dataset. We'll prepare the data, set up a training loop, and log the metadata with Neptune.

Create a script#

Import the needed libraries:

import io

import requests
import tensorflow as tf
import numpy as np

import neptune

Start a Neptune run:
```
run = neptune.init_run()
```
If Neptune can't find your project name or API token

As a best practice, you should save your Neptune API token and project name as environment variables:
```
export NEPTUNE_API_TOKEN="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8"
```
```
export NEPTUNE_PROJECT="ml-team/classification"
```
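If you want your script to fail early with a clear message, you could check for these variables before starting the run. This is a minimal sketch using only the standard library. `missing_neptune_settings` is a hypothetical helper, not a Neptune function.

```python
import os


def missing_neptune_settings(env=None):
    """Return the Neptune environment variables that are unset or empty."""
    env = os.environ if env is None else env
    names = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")
    return [name for name in names if not env.get(name)]


missing = missing_neptune_settings()
if missing:
    print("Set these variables before starting a run:", ", ".join(missing))
```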
Alternatively, you can pass the information when using a function that takes api_token and project as arguments:
```
run = neptune.init_run(
    api_token="h0dHBzOi8aHR0cHM6Lkc78ghs74kl0jv...Yh3Kb8",  # your API token
    project="ml-team/classification",  # workspace-name/project-name
)
```
This also works for init_model(), init_model_version(), init_project(), and integrations that create Neptune runs under the hood, such as NeptuneLogger or NeptuneCallback.
To find your API token, expand the user menu in the bottom-left corner of the app and select Get my API token.
You can copy the project path from the project details (Details & privacy).
If you haven't registered, you can log anonymously to a public project:
```
api_token=neptune.ANONYMOUS_API_TOKEN
project="common/quickstarts"
```
Make sure not to publish sensitive data through your code!
Download the MNIST dataset and track its metadata:
```
response = requests.get(
    "https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz"
)
with open("mnist.npz", "wb") as f:
    f.write(response.content)

run["datasets/version"].track_files("mnist.npz")
```
You can use the track_files() method for a file or folder when you want to track the metadata rather than upload the files in full.
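
To illustrate what tracking metadata instead of uploading can mean, the sketch below records a file's size and content hash without copying the file anywhere. It is a conceptual illustration only: `file_fingerprint` is a hypothetical helper, and the exact metadata and hashing that Neptune records for tracked files may differ.

```python
import hashlib
import os


def file_fingerprint(path, chunk_size=1 << 20):
    """Return the size and SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files don't need to fit in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return {"size_bytes": os.path.getsize(path), "sha256": digest.hexdigest()}
```

Calling `file_fingerprint("mnist.npz")` after the download gives a compact record that changes whenever the file contents change.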

Learn more

See how to work with files that are tracked as artifacts: Track artifacts

Set up train and test sets:

with np.load("mnist.npz") as data:
    train_examples = data["x_train"]
    train_labels = data["y_train"]
    test_examples = data["x_test"]
    test_labels = data["y_test"]

Define and log model parameters:
```
params = {
    "batch_size": 1024,
    "shuffle_buffer_size": 100,
    "lr": 0.001,
    "num_epochs": 10,
    "num_visualization_examples": 10,
}

run["training/model/params"] = params
```
You can use simple assignment (=) to log single values or dictionaries to a field in the run. You can define the structure freely. In this case we're creating the nested namespaces "training/model" and, inside those, the "params" field where the dictionary is logged.
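
One way to picture the resulting structure is as slash-separated paths, one per dictionary key. The sketch below flattens a nested dictionary into such paths. `flatten_to_paths` is a hypothetical helper written for this illustration, not a Neptune function, and it only mirrors the naming scheme rather than how Neptune stores the data.

```python
def flatten_to_paths(data, prefix=""):
    """Flatten a nested dict into {"a/b/c": value} style paths."""
    paths = {}
    for key, value in data.items():
        path = f"{prefix}/{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Nested dictionaries become nested namespaces
            paths.update(flatten_to_paths(value, path))
        else:
            paths[path] = value
    return paths


example_params = {"batch_size": 1024, "lr": 0.001}
for path, value in flatten_to_paths(example_params, "training/model/params").items():
    print(path, "=", value)
```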

Learn more

Learn about the structure of Neptune objects: Namespaces and fields

Normalize and prepare the data for training:

def normalize_img(image):
    """Normalizes images: `uint8` in [0, 255] -> `float32` in [0, 1]."""
    return tf.cast(image, tf.float32) / 255.0


train_examples = normalize_img(train_examples)
test_examples = normalize_img(test_examples)

train_dataset = tf.data.Dataset.from_tensor_slices((train_examples, train_labels))
test_dataset = tf.data.Dataset.from_tensor_slices((test_examples, test_labels))

train_dataset = train_dataset.shuffle(params["shuffle_buffer_size"]).batch(
    params["batch_size"]
)
test_dataset = test_dataset.batch(params["batch_size"])
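
As a quick sanity check on the batching, the number of batches per epoch is the dataset size divided by the batch size, rounded up, because `batch()` keeps a smaller final batch by default. The sketch assumes the standard MNIST split of 60,000 training and 10,000 test images. `num_batches` is a name chosen for this example.

```python
import math


def num_batches(num_examples, batch_size):
    """Number of batches when the last, smaller batch is kept."""
    return math.ceil(num_examples / batch_size)


print(num_batches(60_000, 1024))  # 59 training batches per epoch
print(num_batches(10_000, 1024))  # 10 test batches
```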

Prepare the model:

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ]
)

loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(params["lr"])

with io.StringIO() as s:
    model.summary(print_fn=lambda x, **kwargs: s.write(x + "\n"))
    model_summary = s.getvalue()

Log the model summary:

run["training/model/summary"] = model_summary
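
The summary lists the number of trainable parameters per layer. For a dense layer, that is `inputs * units + units` (weights plus biases), so the figures can be checked by hand. The helper below is a hypothetical illustration using this model's layer sizes.

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units


flattened = 28 * 28  # Flatten turns each 28x28 image into 784 values
hidden = dense_params(flattened, 128)  # 100480
output = dense_params(128, 10)  # 1290
print(hidden + output)  # 101770
```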

Set up a training loop with Neptune logging:

def loss_and_preds(model, x, y, training):
    # training=training is needed only if there are layers with different
    # behavior during training versus inference (e.g. Dropout)
    y_ = model(x, training=training)

    return loss_object(y_true=y, y_pred=y_), y_


def grad(model, inputs, targets):
    with tf.GradientTape() as tape:
        loss_value, _ = loss_and_preds(model, inputs, targets, training=True)
    return loss_value, tape.gradient(loss_value, model.trainable_variables)


for epoch in range(params["num_epochs"]):
    epoch_loss_avg = tf.keras.metrics.Mean()
    epoch_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()

    for x, y in train_dataset:
        loss_value, grads = grad(model, x, y)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        epoch_loss_avg.update_state(loss_value)
        epoch_accuracy.update_state(y, model(x, training=True))

    # Log metrics for the epoch
    # Train metrics
    run["training/train/loss"].append(epoch_loss_avg.result())
    run["training/train/accuracy"].append(epoch_accuracy.result())

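    # Log test metrics
    test_loss, test_preds = loss_and_preds(model, test_examples, test_labels, False)
    run["training/test/loss"].append(test_loss)
    # Use a separate metric so test data doesn't mix into the train accuracy
    test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()
    acc = test_accuracy(test_labels, test_preds)
    run["training/test/accuracy"].append(acc)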

    # Log test prediction
    for idx in range(params["num_visualization_examples"]):
        np_image = test_examples[idx].numpy().reshape(28, 28)
        image = neptune.types.File.as_image(np_image)
        pred_label = test_preds[idx].numpy().argmax()
        true_label = test_labels[idx]
        run[f"training/visualization/epoch_{epoch}"].append(
            image, description=f"pred={pred_label} | actual={true_label}"
        )

    if epoch % 5 == 0 or epoch == (params["num_epochs"] - 1):
        print(
            "Epoch {:03d}: Loss: {:.3f}, Accuracy: {:.3%}".format(
                epoch, epoch_loss_avg.result(), epoch_accuracy.result()
            )
        )
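
For reference, sparse categorical accuracy compares the index of the largest logit with the integer label and averages the matches. Below is a minimal pure-Python sketch of that calculation. It is not the Keras implementation, and `sparse_accuracy` is a name chosen for this example.

```python
def sparse_accuracy(labels, logits):
    """Fraction of rows where the argmax of the logits equals the label."""
    correct = 0
    for label, row in zip(labels, logits):
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


example_logits = [[0.1, 2.0, -1.0], [3.0, 0.2, 0.5]]
print(sparse_accuracy([1, 2], example_logits))  # 0.5
```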

To stop the connection to Neptune and sync all data, call the stop() method:
```
run.stop()
```
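
If the training code can raise an exception, wrapping it in try/finally is one way to make sure `stop()` is still called. The sketch uses a stand-in class, `DummyRun`, which is hypothetical and only mimics a `stop()` method so that the example runs without a Neptune connection.

```python
class DummyRun:
    """Stand-in for a run object; only tracks whether stop() was called."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


def train_with_cleanup(run_object, train_fn):
    """Call train_fn and always stop the run afterwards."""
    try:
        train_fn()
    finally:
        run_object.stop()


dummy = DummyRun()
train_with_cleanup(dummy, lambda: None)
print(dummy.stopped)  # True
```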

Run the training#

Once you execute the code, you should see a Neptune link printed to the console output.

Sample output

[neptune] [info ] Neptune initialized. Open in the app: https://app.neptune.ai/workspace/project/e/RUN-1

Follow the link to open the run in Neptune.

Analyze the results in Neptune#

In the All metadata section, you can see our two custom namespaces that contain logged metadata:

datasets – our dataset metadata is tracked here.
training – contains other model and training metadata.

The other namespaces are generated by default. They contain automatically logged system and basic metadata.

More options#

Saving to the model registry#

To organize your model training metadata separately from the runs, you can log the metadata to model objects. This will make the data appear in the Models section of the project.

Register a model#

You first need to register a unique model. You can then create as many versions of it as you like.

The following example registers a model with the key KERAS:

model_object = neptune.init_model(
    key="KERAS",
    name="Keras model",  # optional
    description="Model trained on MNIST with Keras",  # optional
)

model_object.stop()

Create a model version#

We can now initialize a version of the model we created above. The model is identified by the project key and the model key joined with a hyphen. For example, if our project key is CLAS, the model ID is CLAS-KERAS:

model_version = neptune.init_model_version(
    model="CLAS-KERAS",
)
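
If you build the identifier in code, it is the project key and the model key joined with a hyphen. A tiny sketch, where `model_id` is a hypothetical helper rather than a Neptune function:

```python
def model_id(project_key, model_key):
    """Build the ID used to refer to a registered model."""
    return f"{project_key}-{model_key}"


print(model_id("CLAS", "KERAS"))  # CLAS-KERAS
```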

Now, we can log the metadata to the model version object. We can use the same metadata tracking methods as for runs.

model_version["run_id"] = run["sys/id"].fetch()
model_version["metrics/test_loss"] = test_loss
model_version["metrics/test_accuracy"] = acc
model_version["datasets/version"].track_files("mnist.npz")

# Save the model (the Keras Sequential model created earlier)
model.save("weights.keras")

# Upload saved model
model_version["weights"].upload("weights.keras")

Finally, remember to stop Neptune objects once they're no longer needed.

model_version.stop()
