Keeping track of Jupyter Notebooks¶
Jupyter Notebooks are a useful and popular tool for data scientists, regardless of their area of specialization. They allow data scientists to work interactively, keeping code and results, such as visualizations, in a single document.
While Neptune is essentially a platform for tracking experiments, it provides Jupyter and JupyterLab extensions that also let you track Jupyter Notebooks.
In Neptune, each Notebook consists of a collection of checkpoints that you upload directly from the Jupyter user interface.
In any project, an unlimited number of Notebooks and checkpoints is allowed.
You can browse the checkpoint history across all Notebooks in the project.
You can share a Notebook as a link.
You can compare two Notebooks side-by-side, like source code.
To try it now, without registering with Neptune, look at the sample Notebooks in the public project onboarding. Use the public user’s API token that appears below, and the username neptuner, to upload some snapshots to this project. You still need to install and configure the Jupyter extension.
Public user’s API token:
The Notebooks tab in the Neptune UI provides a table of all the Notebooks in the current project.
This view lets you see what your team members are working on, review details and checkpoints associated with a Notebook, as well as share or download a Notebook and compare two or more Notebooks.
The Notebook data is arranged in the following columns:
In addition, for each Notebook, there are buttons for downloading the Notebook, comparing it with another Notebook, or for sharing a link to it.
A Compare button at the top right displays a Notebooks Comparison pane. See Compare Notebooks.
Once you select a Notebook, you can see all its contents: code and Markdown cells, outputs, and execution counts.
There are two tabs on the right:
Details: Shows the ID, size, creation date, latest checkpoint, owner, description, and associated experiments of the selected Notebook.
Checkpoints: Lists all the checkpoints of the Notebook. Click a checkpoint to see its details in the main pane. From this tab, you can also access the experiments that are associated with the checkpoint.
You can also view snapshots of your work on the Notebook, and download, share, or compare the selected checkpoint with another checkpoint.
The Notebooks Comparison pane lets you compare Notebook checkpoints.
To display the pane, click the Compare button wherever it appears in the Notebooks pane.
In the Notebooks Comparison pane, select two Notebook checkpoints, then click Compare to see a side-by-side comparison, just like source code.
Differences in code, markdown, output and execution count are highlighted.
Summary information about the differences is displayed at the top of the pane.
Notebooks are stored as files on your computer.
Each Notebook file (.ipynb) is a JSON document containing everything that the user can see in a Notebook, plus some metadata.
Neptune uses metadata to associate particular files with Notebook entities on Neptune servers. That means that after a Notebook is uploaded to Neptune, the file on disk is changed to include the ID of the entity on the Neptune server.
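To make the role of the metadata concrete, here is a minimal sketch that reads the top-level metadata of a .ipynb file and looks up a stored Notebook ID. The standard Notebook format does keep a top-level "metadata" object, but the section name "neptune" and the key "notebookId" used here are assumptions for illustration, as is the ID value. Check a file you have actually uploaded to see what the extension writes.

```python
import json
import tempfile
from pathlib import Path


def read_notebook_metadata(path):
    """Return the top-level metadata dictionary of a .ipynb file."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    return notebook.get("metadata", {})


def neptune_notebook_id(path, section="neptune", key="notebookId"):
    """Return the stored Neptune ID, or None if the file has none.

    The default section and key names are assumptions, not confirmed
    by this page.
    """
    section_data = read_notebook_metadata(path).get(section, {})
    return section_data.get(key)


# Build a tiny example Notebook so the sketch runs on its own.
example = {
    "cells": [],
    "metadata": {"neptune": {"notebookId": "hypothetical-id-123"}},
    "nbformat": 4,
    "nbformat_minor": 5,
}
example_path = Path(tempfile.mkdtemp()) / "example.ipynb"
example_path.write_text(json.dumps(example), encoding="utf-8")

print(neptune_notebook_id(example_path))
```

A file that has never been uploaded would simply return None under these assumptions.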
If you copy a Notebook file (let’s call it “Notebook A”) and edit it with the intention of creating something completely separate from Notebook A, the association with Notebook A on the Neptune server remains. If the name of the Notebook changes from “Notebook A”, you will be warned.
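If you really do want the copy to be separate, one possible approach is to remove the stored association from the copy before uploading it. The sketch below works on the Notebook JSON as a Python dictionary. It assumes the association lives in a metadata section named "neptune", which this page does not confirm, and it has not been checked against the Neptune extension, so treat it as an illustration of the idea rather than a supported procedure.

```python
import copy
import json


def detach_from_neptune(notebook, section="neptune"):
    """Return a copy of a Notebook dict without the assumed Neptune section.

    The section name is an assumption; the original dict is left untouched.
    """
    detached = copy.deepcopy(notebook)
    detached.get("metadata", {}).pop(section, None)
    return detached


# Hypothetical contents of "Notebook A" after an upload.
notebook_a = {
    "cells": [],
    "metadata": {
        "kernelspec": {"name": "python3", "display_name": "Python 3"},
        "neptune": {"notebookId": "hypothetical-id-123"},
    },
    "nbformat": 4,
    "nbformat_minor": 5,
}

notebook_b = detach_from_neptune(notebook_a)
print(json.dumps(notebook_b["metadata"], indent=2))
```

All other metadata, such as the kernel specification, is kept, so the copy still opens normally in Jupyter.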
When you download a Notebook checkpoint, the ID in the metadata is preserved, so that when, after some work, you click Upload, Neptune knows that this may be another checkpoint in a particular Notebook. You can do some work, upload some intermediate snapshot, go to another computer (or another SageMaker instance, and so on), download the Notebook and keep on working on it.
The capability is comparable to Google Docs in that there’s a place where you store your work and you can access it easily from wherever you choose.
Depending on their roles, members of a project can view and download all Notebooks (and their checkpoints) in the project.
Viewers can download Notebooks.
Contributors and Owners can also upload them.
A user who uploads a new Notebook becomes its owner. Only the owner of a Notebook can upload new checkpoints of that Notebook.
Uploading a Notebook¶
You can upload Notebook checkpoints from Jupyter to Neptune.
To upload the current Notebook as a checkpoint:
In the dialog that is displayed, select a project from the list.
(Optional) Type in a checkpoint name and description.
Click Upload checkpoint.
A confirmation message is displayed. You can click the link in the message to open the Notebook in Neptune.
Downloading a Notebook¶
You can download a specific Notebook checkpoint from Neptune to Jupyter.
To download a Notebook checkpoint:
In the dialog that is displayed, select the following from the respective lists:
You can create Notebooks and update Notebook checkpoints in Neptune from the command line, using Neptune’s notebook sync command.
Using CLI commands is an alternative if you prefer not to use the neptune-notebooks extensions in Jupyter or JupyterLab.
Syncing Notebook checkpoints using the neptune-notebooks extension is highly recommended!
There is a single, yet powerful, CLI command:
neptune notebook sync --project ENTITY_NAME/PROJECT_NAME your_notebook.ipynb
ENTITY_NAME is either your workspace name in the case of a team account or a username in the case of an individual account.
--project: The project to which the Notebook or checkpoint is logged. If the NEPTUNE_PROJECT environment variable is set, this option overrides it.
--new: Takes the Notebook file your_notebook.ipynb and creates a new Notebook object in Neptune.
If the Notebook is not known to Neptune, it is created automatically. In that case, you do not need to use this flag.
neptune notebook sync --project ENTITY_NAME/PROJECT_NAME your_notebook.ipynb --new
--update: Updates the Notebook your_notebook.ipynb in Neptune by adding a new checkpoint to it.
If the Notebook is already known to Neptune, it is updated automatically. In that case, you do not need to use this flag.
neptune notebook sync --project ENTITY_NAME/PROJECT_NAME your_notebook.ipynb --update
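If you call the command from a script, for example at the end of a scheduled job, it can help to assemble the arguments in one place. The following is a minimal sketch: the helper name build_sync_command and its mode parameter are hypothetical, and only the command, the --project option, and the --new and --update flags come from this page.

```python
import shlex


def build_sync_command(project, notebook_path, mode=None):
    """Build the argument list for `neptune notebook sync`.

    mode is None (no flag, Neptune decides), "new", or "update".
    """
    if mode not in (None, "new", "update"):
        raise ValueError("mode must be None, 'new', or 'update'")
    command = ["neptune", "notebook", "sync", "--project", project, notebook_path]
    if mode is not None:
        command.append(f"--{mode}")
    return command


command = build_sync_command(
    "ENTITY_NAME/PROJECT_NAME", "your_notebook.ipynb", mode="update"
)
print(shlex.join(command))

# To actually run it (requires the Neptune CLI and NEPTUNE_API_TOKEN):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems when the Notebook path contains spaces.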
To use the CLI command, you must export your NEPTUNE_API_TOKEN as an environment variable. You can do this in either of two ways:
Use this command:
export NEPTUNE_API_TOKEN='YOUR_LONG_API_TOKEN'
Append the line above to your
Always keep your API token secret. It is like a password to the application. Appending the line export NEPTUNE_API_TOKEN='YOUR_LONG_API_TOKEN' to your ~/.bash_profile file is the recommended method to ensure it remains secret.
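A script that wraps the CLI can check for the token before doing any work, and avoid printing it in full. The sketch below is one way to do that. The helper names get_api_token and mask are hypothetical, and only the NEPTUNE_API_TOKEN variable name comes from this page.

```python
import os


def get_api_token(environ=None):
    """Return the token from NEPTUNE_API_TOKEN or raise a clear error."""
    environ = os.environ if environ is None else environ
    token = environ.get("NEPTUNE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "NEPTUNE_API_TOKEN is not set; export it before running the CLI."
        )
    return token


def mask(token, visible=4):
    """Hide all but the last few characters so the token can be logged safely."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]


# A placeholder mapping is passed here so the sketch runs without a real token.
print(mask(get_api_token({"NEPTUNE_API_TOKEN": "YOUR_LONG_API_TOKEN"})))
```

In real use, call get_api_token() with no argument so that it reads the process environment.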