Google Colab

What will you get with this integration?

Google Colab is a temporary runtime environment. This means you lose all your data (unless saved externally) once you restart your kernel.

This is where you can leverage Neptune. By running model training on Google Colab and tracking it with Neptune, you can log and download things like:

  • parameters,

  • metrics and losses,

  • images, interactive charts, and other media,

  • hardware consumption,

  • model checkpoints and other artifacts.

By doing that, you keep your run metadata safe even after the Google Colab kernel has died.

Introduction

This guide will show you how to:

  • Install neptune-client,

  • Connect Neptune to your Colab notebook and create the first run,

  • Log simple metrics to Neptune and explore them in the Neptune UI.

Step 0: Before you start

Make sure that:

  1. you have an account with both Google and Neptune

  2. you have created a project from the Neptune UI that you will use for tracking metadata.

Registering with Neptune and creating a project is optional if you are just trying out the application as an 'ANONYMOUS' user.

Step 1: Install Neptune client and import Neptune

Install neptune-client in a code cell at the top of your Colab notebook:

! pip install --upgrade neptune-client

Then import neptune:

import neptune.new as neptune

Step 2: Initialize Neptune

Basically, you tell Neptune:

  • who you are: your Neptune api_token

  • where you want to send your data: your Neptune project.

At this point, you will have a new run in Neptune. From now on you will use the run object to log metadata to it.

Get your personal api_token to initialize Neptune

There are a few special, public projects that show how Neptune works. For those projects, you can use the 'ANONYMOUS' API token and log as the public user 'neptuner'.

For example:

run = neptune.init(api_token='ANONYMOUS',
                   project='common/neptune-and-google-colab')

Get your Neptune API token and pass it to Neptune:

How to find your API token

The preferred way of doing this is by using the getpass() method so that your token remains private even if you share the notebook.

from getpass import getpass
api_token = getpass('Enter your private Neptune API token: ')
# api_token = 'ANONYMOUS' for ANONYMOUS users

Enter the token in the input box. This will save your token to api_token.

Initialize your project

Remember to create a new project from the UI that you will use for metadata tracking.

workspace = 'your_neptune_username' # workspace = 'common' for ANONYMOUS users
project = 'your_project_name'
project_name = workspace + '/' + project

The full project name can be found in the Neptune UI under Settings → Properties.

How to find the full name of the project.

run = neptune.init(project=project_name, api_token=api_token)

Executing this cell will give you a link. Click on the link to open the run in Neptune. For now, it is empty but keep the tab with the run open to see what happens next.

Runs can be viewed as dictionary-like structures - namespaces - that you can define in your code. You can apply a hierarchical structure to your metadata that will be reflected in the UI as well. Thanks to this you can easily organize your metadata in a way you feel is most convenient.
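
For example, logging under slash-separated paths creates nested namespaces that show up as folders in the UI. A minimal sketch (the 'params' and 'metrics' names here are only illustrative, not required):

# slash-separated paths become nested namespaces in the UI
run['params/optimizer'] = 'Adam'        # appears under params
run['metrics/train/acc'] = 0.92         # appears under metrics/train
run['metrics/valid/acc'] = 0.90         # appears under metrics/valid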

Step 3: Log metadata during training

Log metrics or losses under a name of your choice. You can log one or multiple values.
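
For instance, assigning to a field stores a single value, while calling .log() appends to a series (a small sketch with arbitrary field names; the full cell below does the same at scale):

run['train/final_score'] = 0.95       # single value
run['train/batch_loss'].log(0.32)     # first point of a series
run['train/batch_loss'].log(0.29)     # each call appends another point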

Now run the cell below.

from time import sleep

params = {'learning_rate': 0.1}

# log params
run['parameters'] = params

# log name and append tags
run["sys/name"] = 'colab-example'
run["sys/tags"].add(['colab', 'simple'])

# log loss during training
for epoch in range(132):
    sleep(0.1)  # to see logging live
    run["train/loss"].log(0.97 ** epoch)
    run["train/loss-pow-2"].log((0.97 ** epoch) ** 2)

# log train and validation scores
run['train/accuracy'] = 0.95
run['valid/accuracy'] = 0.93

# log files/artifacts
! echo "Welcome to Neptune" > file.txt
run['artifacts/sample'].upload('file.txt')  # file will be uploaded as sample.txt

The snippet above logs:

  • parameters with just one field: learning rate,

  • name of the run and two tags,

  • train/loss and train/loss-pow-2 as series of numbers, visualized as charts in the UI,

  • train/accuracy and valid/accuracy as single values,

  • file.txt, which will be visible under All metadata/artifacts as sample.txt.

Step 4: Stop logging

Once you are done logging, stop tracking the run with the stop() method. This is only needed when logging from a notebook environment; when logging from a script, Neptune stops tracking automatically once the script finishes execution.

run.stop()

Step 5: Explore the Run in Neptune UI

Switch over to the Neptune UI. Go to the All metadata and Charts sections to see the logged metadata. You can also check an example run.

Charts in the Neptune UI

Neptune automatically logs hardware consumption during the run. You can see these metrics in the Monitoring section of the Neptune UI.

Monitoring hardware usage with Neptune

To view the structure of the run object, use the print_structure() method.

run.print_structure()
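
Because the metadata now lives in Neptune rather than in the Colab runtime, you can also fetch it back programmatically later, even from a fresh Colab session. A minimal sketch, assuming your run got the ID 'COL-1' (replace it with the ID shown in the UI):

# resume the finished run in read-only mode using its ID from the UI
run = neptune.init(project=project_name, api_token=api_token,
                   run='COL-1', mode='read-only')

run['train/accuracy'].fetch()        # returns the logged value, e.g. 0.95
run['train/loss'].fetch_values()     # returns the series as a pandas DataFrame
run['artifacts/sample'].download()   # downloads the uploaded file to the working directory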

Conclusion

You’ve learned how to:

  • Install neptune-client,

  • Connect Neptune to your Google Colab notebook and create a run,

  • Log metadata to Neptune,

  • See your metrics, parameters, and scores,

  • See hardware consumption during the run.

What's next