How to use Neptune with Docker#

You can use Neptune in any Python environment, including containerized Python scripts or applications.

In this guide, you'll learn how to use Neptune to log experimentation metadata inside a Docker container.

Before you start#

Set up authentication#

You have a couple of options for using your Neptune API token with Docker:

Add Neptune to your script#

Create a Python script named train.py, or use an existing model training script and add the Neptune-specific snippets as needed (in the sample script below, these are the lines that reference neptune or run).

train.py
import torch
import torch.nn as nn
import torch.optim as optim
from neptune.utils import stringify_unsupported
from torchvision import datasets, transforms

import neptune

# Initialize Neptune and create new run
run = neptune.init_run()

DATA_DIR = "data"  # modify as needed
data_tfms = {
    "train": transforms.Compose(
        [
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    )
}

params = {
    "lr": 1e-2,
    "batch_size": 128,
    "input_size": 32 * 32 * 3,
    "n_classes": 10,
    "model_filename": "basemodel",
}

trainset = datasets.CIFAR10(DATA_DIR, transform=data_tfms["train"], download=True)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=params["batch_size"], shuffle=True
)
dataset_size = {"train": len(trainset)}

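# BaseModel is not defined elsewhere in this guide. The small fully connected
# network below is an illustrative stand-in (an assumption) so the script runs.
class BaseModel(nn.Module):
    def __init__(self, input_sz, hidden_dim, n_classes):
        super().__init__()
        self.main = nn.Sequential(
            nn.Flatten(),  # [batch, 3, 32, 32] -> [batch, 3072]
            nn.Linear(input_sz, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, n_classes),
        )

    def forward(self, x):
        return self.main(x)


model = BaseModel(params["input_size"], params["input_size"], params["n_classes"])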
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=params["lr"])

# Log config and parameters
run["config/dataset/path"] = DATA_DIR
run["config/dataset/transforms"] = stringify_unsupported(data_tfms)
run["config/dataset/size"] = dataset_size
run["config/params"] = params

# Log losses and metrics
for i, (x, y) in enumerate(trainloader, 0):
    optimizer.zero_grad()
    outputs = model(x)
    _, preds = torch.max(outputs, 1)
    loss = criterion(outputs, y)
    acc = torch.sum(preds == y) / len(x)

    # Log batch loss as a plain Python float
    run["metrics/training/batch/loss"].append(loss.item())

    # Log batch accuracy as a plain Python float
    run["metrics/training/batch/acc"].append(acc.item())

    loss.backward()
    optimizer.step()

# Stop logging
run.stop()
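
Because neptune.init_run() is called without arguments in the script above, the client falls back to environment variables for its credentials. Inside a container, a missing variable can be easy to overlook, so it may help to check for it before the run is created. The following is a minimal sketch, not part of the original script: the helper names missing_neptune_vars and require_neptune_env are hypothetical, and it assumes the project name is supplied through NEPTUNE_PROJECT alongside NEPTUNE_API_TOKEN.

```python
import os
import sys

# Environment variables that neptune.init_run() falls back to when it is
# called without arguments. Including NEPTUNE_PROJECT here is an assumption
# about how the project name is provided in your setup.
REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_vars(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


def require_neptune_env(environ=os.environ):
    """Exit with a readable message if a required variable is missing."""
    missing = missing_neptune_vars(environ)
    if missing:
        sys.exit("Missing environment variables: " + ", ".join(missing))
```

Calling require_neptune_env() at the top of train.py, before neptune.init_run(), would make the container stop with a short message instead of a longer client-side traceback.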

Create a Dockerfile#

  1. Create a requirements.txt file with all the dependencies needed for this guide – in particular, the Neptune client library (neptune).

    requirements.txt
    neptune
    torch==1.9.0
    torchvision==0.10.0
    
  2. Create a Dockerfile that will:

    1. Specify the base container image that we'll use to build our own.
    2. Install dependencies specified in requirements.txt.
    3. Copy our training script to the container image and define the command for executing it.
    Dockerfile
    # syntax=docker/dockerfile:1
    FROM python:3.8-slim-buster
    
    RUN apt-get update
    RUN apt-get -y install gcc
    
    COPY requirements.txt requirements.txt
    RUN pip3 install -r requirements.txt
    
    # Copy all files in the current directory to
    # the main directory of the container
    COPY . .
    CMD [ "python3", "-W", "ignore", "train.py" ]
    

Build and run a Docker container#

Next, we'll build a container image using the Dockerfile we created in the previous step.

  1. To build the container image, execute the following command:

    docker build --tag <image-name> . # image-name: neptune-docker
    
  2. Run a container from the image, passing your Neptune API token as a Docker environment variable:

    # image-name: neptune-docker
    docker run -e NEPTUNE_API_TOKEN="<YOUR_API_TOKEN>" <image-name>
    
    How do I find my API token?

    In the bottom-left corner of the Neptune app, open the user menu and select Get your API token.

    You can copy your token from the dialog that opens. It's very long – make sure to copy and paste it in full!

  3. Run the container wherever you like, such as in GitHub Actions or on a local or remote machine. Neptune works the same way in each case.
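
If you launch the container from a Python script (for example, in a CI job), the run step above could be sketched as follows. This is an illustrative helper, not part of the Neptune or Docker tooling: the function name docker_run_command is hypothetical, and the sketch assumes that NEPTUNE_API_TOKEN is already set in the host environment. Passing -e NAME without a value asks Docker to forward the host's value, so the token itself does not appear in the command.

```python
def docker_run_command(image, env_names=("NEPTUNE_API_TOKEN",)):
    """Build a `docker run` argument list that forwards host variables.

    `-e NAME` (without `=value`) makes Docker copy the variable from the
    environment of the process that invokes `docker run`.
    """
    argv = ["docker", "run", "--rm"]  # --rm removes the container on exit
    for name in env_names:
        argv += ["-e", name]
    argv.append(image)
    return argv
```

The returned list can be passed to subprocess.run(..., check=True). Building a list rather than a single shell string avoids quoting issues with the image name.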

To open the run and watch the model training live, click the Neptune link in the console output.