neptune.experiments
An Experiment is everything that you log to Neptune, beginning at neptune.create_experiment() (ref: create_experiment()) and ending when the script finishes or when you explicitly stop the experiment with neptune.stop() (ref: stop()).
# Set project
neptune.init('my_workspace/my_project')
# Create new experiment
exp = neptune.create_experiment()
# log metadata
exp.log_metric()
exp.log_image()
exp.log_artifact()
# Log whatever else you want
...
# Close the experiment namespace
exp.stop()
You can log many ML metadata types to the experiment, including:
metrics,
losses,
model weights,
images,
interactive charts,
predictions,
and much more.
Have a look at the complete list of metadata types you can log to the experiment.
Besides logging data, you can also
download experiment data to your local machine, or
update an existing experiment even when it’s closed.
Module Contents
Classes
Experiment – A class for managing a Neptune experiment.
-
neptune.experiments._logger
-
class neptune.experiments.Experiment(backend, project, _id, internal_id)
Bases: object
A class for managing a Neptune experiment.
Each time a user creates a new experiment, an instance of this class is created. It lets you manage the experiment with log_metric(), log_text(), log_image(), set_property(), and much more.
Parameters
backend (neptune.Backend) – A Backend object
project (neptune.Project) – The project this experiment belongs to
_id (str) – Experiment id
internal_id (str) – Internal UUID
Example
Assuming that project is an instance of Project:
experiment = project.create_experiment()
Warning
Users should never create instances of this class manually. Always use create_experiment().
-
IMAGE_SIZE_LIMIT_MB = 15
-
id
Experiment short id.
Combination of project key and unique experiment number. Format is <project_key>-<experiment_number>, for example: MPI-142.
Returns
str - experiment short id
Examples
Assuming that experiment is an instance of Experiment:
exp_id = experiment.id
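The short id format described above can be taken apart with plain string handling. The sketch below is only an illustration: split_experiment_id is a hypothetical helper, not part of the Neptune client, and it assumes the experiment number is the part after the last dash.

```python
def split_experiment_id(short_id):
    """Split '<project_key>-<experiment_number>' into (key, number).

    Hypothetical helper, not part of the Neptune client.
    """
    # Split on the last dash; the experiment number is the trailing part.
    project_key, sep, number = short_id.rpartition('-')
    if not sep or not project_key or not number.isdigit():
        raise ValueError('not a short experiment id: {!r}'.format(short_id))
    return project_key, int(number)


project_key, experiment_number = split_experiment_id('MPI-142')
```

With a real experiment, the string passed in would be experiment.id.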
-
name
Experiment name.
Returns
str - experiment name
Examples
Assuming that project is an instance of Project:
experiment = project.create_experiment('exp_name')
exp_name = experiment.name
-
state
Current experiment state.
Possible values: ‘running’, ‘succeeded’, ‘failed’, ‘aborted’.
Returns
str - current experiment state
Examples
Assuming that experiment is an instance of Experiment:
state_str = experiment.state
-
internal_id
-
limits
-
get_system_properties(self)
Retrieve experiment properties.
Experiment properties are, for example: owner, created, name, hostname. The list of experiment properties may change over time.
Returns
dict - dictionary mapping a property name to value.
Examples
Assuming that experiment is an instance of Experiment:
sys_properties = experiment.get_system_properties()
-
get_tags(self)
Get tags associated with the experiment.
Returns
list of str with all tags for this experiment.
Example
Assuming that experiment is an instance of Experiment:
experiment.get_tags()
-
append_tag(self, tag, *tags)
Append tag(s) to the current experiment.
Alias: append_tags(). Only [a-zA-Z0-9] and - (dash) characters are allowed in tags.
Parameters
tag (single str or multiple str or list of str) – Tag(s) to add to the current experiment.
  If str is passed, a single tag is added.
  If multiple, comma-separated str are passed, all of them are added as tags.
  If list of str is passed, all elements of the list are added as tags.
Examples
neptune.append_tag('new-tag')  # single tag
neptune.append_tag('first-tag', 'second-tag', 'third-tag')  # few str
neptune.append_tag(['first-tag', 'second-tag', 'third-tag'])  # list of str
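To check tags locally before sending them, the three accepted call forms and the character rule above can be sketched as follows. normalize_tags and the regular expression are assumptions made for illustration; they are not part of the Neptune client, and the client's own validation may differ.

```python
import re

# Assumed pattern, derived from the rule "only [a-zA-Z0-9] and - are allowed".
_TAG_PATTERN = re.compile(r'[a-zA-Z0-9-]+')


def normalize_tags(tag, *tags):
    """Flatten single str / several str / list of str into one list of tags.

    Hypothetical helper, not part of the Neptune client.
    """
    if isinstance(tag, (list, tuple)):
        candidates = list(tag) + list(tags)
    else:
        candidates = [tag] + list(tags)
    invalid = [t for t in candidates
               if not isinstance(t, str) or not _TAG_PATTERN.fullmatch(t)]
    if invalid:
        raise ValueError('invalid tag(s): {!r}'.format(invalid))
    return candidates


single = normalize_tags('new-tag')
several = normalize_tags('first-tag', 'second-tag', 'third-tag')
from_list = normalize_tags(['first-tag', 'second-tag', 'third-tag'])
```

The resulting list could then be passed on, for example as experiment.append_tag(from_list).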
-
append_tags(self, tag, *tags)
Append tag(s) to the current experiment.
Alias for: append_tag()
-
remove_tag(self, tag)
Removes a single tag from the experiment.
Parameters
tag (str) – Tag to be removed
Example
Assuming that experiment is an instance of Experiment:
# assuming experiment has tags: `['tag-1', 'tag-2']`.
experiment.remove_tag('tag-1')
Note
Removing a tag that is not assigned to this experiment is silently ignored.
-
get_channels(self)
Alias for get_logs()
-
get_logs(self)
Retrieve all log names along with their last values for this experiment.
Returns
dict - A dictionary mapping each log name to that log’s last value.
Example
Assuming that experiment is an instance of Experiment:
exp_logs = experiment.get_logs()
-
send_metric(self, channel_name, x, y=None, timestamp=None)
Log metrics (numeric values) in Neptune.
Alias for log_metric()
-
log_metric(self, log_name, x, y=None, timestamp=None)
Log metrics (numeric values) in Neptune.
If a log with the provided log_name does not exist, it is created automatically. If the log exists (determined by log_name), the new value is appended to it.
Parameters
log_name (str) – The name of the log, e.g. mse, loss, accuracy.
x (double) – Depending on whether the y parameter is passed:
  y not passed: The value of the log (data-point).
  y passed: Index of the log entry being appended. Must be strictly increasing.
y (double, optional, default is None) – The value of the log (data-point).
timestamp (time, optional, default is None) – Timestamp to be associated with the log entry. Must be Unix time. If None is passed, time.time() is invoked to obtain the timestamp.
Example
Assuming that experiment is an instance of Experiment and the ‘accuracy’ log does not exist:
# Both groups of calls below have the same effect
# Common invocation, providing log name and value
experiment.log_metric('accuracy', 0.5)
experiment.log_metric('accuracy', 0.65)
experiment.log_metric('accuracy', 0.8)
# Providing both x and y params
experiment.log_metric('accuracy', 0, 0.5)
experiment.log_metric('accuracy', 1, 0.65)
experiment.log_metric('accuracy', 2, 0.8)
# Common invocation, logging loss tensor in PyTorch
import torch
loss = torch.Tensor([0.89])
experiment.log_metric('log-loss', loss)
# Common invocation, logging metric tensor in Tensorflow
import tensorflow as tf
acc = tf.constant([0.93])
experiment.log_metric('accuracy', acc)
f1_score = tf.constant(0.78)
experiment.log_metric('f1_score', f1_score)
Note
For efficiency, logs are uploaded in batches via a queue. Hence, if you log a lot of data, you may experience slight delays in the Neptune web application.
Note
Passing either the x or y coordinate as NaN or +/-inf causes this log entry to be ignored. A warning is printed to stdout.
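The way x and y are interpreted, together with the NaN and +/-inf rule, can be summarised in a small stand-alone sketch. resolve_log_entry is a hypothetical function that only mirrors the rules documented above; it is not the client's implementation, and it assumes that a missing index is assigned later by Neptune.

```python
import math
import time


def resolve_log_entry(x, y=None, timestamp=None):
    """Return (index, value, timestamp), or None if the entry would be ignored.

    Hypothetical helper that mirrors the documented rules only.
    """
    if y is None:
        # Only x was given: x is the value, the index is left unset (assumption).
        index, value = None, x
    else:
        # Both were given: x is the index, y is the value.
        index, value = x, y
    # NaN or +/-inf in either coordinate means the entry is ignored.
    for coordinate in (index, value):
        if coordinate is not None and not math.isfinite(coordinate):
            return None
    if timestamp is None:
        timestamp = time.time()  # Unix time, as the timestamp parameter requires
    return index, value, timestamp


value_only = resolve_log_entry(0.5, timestamp=1560430912)
index_and_value = resolve_log_entry(2, 0.8, timestamp=1560430912)
ignored = resolve_log_entry(float('nan'))
```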
-
send_text(self, channel_name, x, y=None, timestamp=None)
Log text data in Neptune.
Alias for log_text()
-
log_text(self, log_name, x, y=None, timestamp=None)
Log text data in Neptune.
If a log with the provided log_name does not exist, it is created automatically. If the log exists (determined by log_name), the new value is appended to it.
Parameters
log_name (str) – The name of the log, e.g. mse, my_text_data, timing_info.
x (double or str) – Depending on whether the y parameter is passed:
  y not passed: The value of the log (data-point). Must be str.
  y passed: Index of the log entry being appended. Must be strictly increasing.
y (str, optional, default is None) – The value of the log (data-point).
timestamp (time, optional, default is None) – Timestamp to be associated with the log entry. Must be Unix time. If None is passed, time.time() is invoked to obtain the timestamp.
Example
Assuming that experiment is an instance of Experiment:
# common case, where log name and data are passed
experiment.log_text('my_text_data', str(data_item))
# log_name, x and timestamp are passed
experiment.log_text(log_name='logging_losses_as_text', x=str(val_loss), timestamp=1560430912)
Note
For efficiency, logs are uploaded in batches via a queue. Hence, if you log a lot of data, you may experience slight delays in the Neptune web application.
Note
Passing the x coordinate as NaN or +/-inf causes this log entry to be ignored. A warning is printed to stdout.
-
send_image(self, channel_name, x, y=None, name=None, description=None, timestamp=None)
Log image data in Neptune.
Alias for log_image()
-
log_image(self, log_name, x, y=None, image_name=None, description=None, timestamp=None)
Log image data in Neptune.
If a log with the provided log_name does not exist, it is created automatically. If the log exists (determined by log_name), the new value is appended to it.
Parameters
log_name (str) – The name of the log, e.g. bboxes, visualisations, sample_images.
x (double) – Depending on whether the y parameter is passed:
  y not passed: The value of the log (data-point). See the y parameter.
  y passed: Index of the log entry being appended. Must be strictly increasing.
y (multiple types supported, optional, default is None) – The value of the log (data-point). Can be one of the following types:
  PIL image (see Pillow docs)
  matplotlib.figure.Figure (see Matplotlib 3.1.1 docs)
  str - path to image file
  2-dimensional numpy.array with values in the [0, 1] range - interpreted as grayscale image
  3-dimensional numpy.array with values in the [0, 1] range - behavior depends on last dimension:
    if last dimension is 1 - interpreted as grayscale image
    if last dimension is 3 - interpreted as RGB image
    if last dimension is 4 - interpreted as RGBA image
  torch.tensor with values in the [0, 1] range. torch.tensor is converted to numpy.array via the .numpy() method and logged.
  tensorflow.tensor with values in the [0, 1] range. tensorflow.tensor is converted to numpy.array via the .numpy() method and logged.
image_name (str, optional, default is None) – Image name
description (str, optional, default is None) – Image description
timestamp (time, optional, default is None) – Timestamp to be associated with the log entry. Must be Unix time. If None is passed, time.time() is invoked to obtain the timestamp.
Example
Assuming that experiment is an instance of Experiment:
# path to image file
experiment.log_image('bbox_images', 'pictures/image.png')
experiment.log_image('bbox_images', x=5, y='pictures/image.png')
experiment.log_image('bbox_images', 'pictures/image.png', image_name='difficult_case')
# PIL image
import PIL.Image
img = PIL.Image.new('RGB', (60, 30), color='red')
experiment.log_image('fig', img)
# 2d numpy array (values in the [0, 1] range)
import numpy
array = numpy.random.rand(300, 200)
experiment.log_image('fig', array)
# 3d grayscale array
array = numpy.random.rand(300, 200, 1)
experiment.log_image('fig', array)
# 3d RGB array
array = numpy.random.rand(300, 200, 3)
experiment.log_image('fig', array)
# 3d RGBA array
array = numpy.random.rand(300, 200, 4)
experiment.log_image('fig', array)
# torch tensor
import torch
tensor = torch.rand(10, 20)
experiment.log_image('fig', tensor)
# tensorflow tensor
import tensorflow
tensor = tensorflow.random.uniform(shape=[10, 20])
experiment.log_image('fig', tensor)
# matplotlib figure example 1
from matplotlib import pyplot
pyplot.plot([1, 2, 3, 4])
pyplot.ylabel('some numbers')
experiment.log_image('plots', pyplot.gcf())
# matplotlib figure example 2
from matplotlib import pyplot
import numpy
numpy.random.seed(19680801)
data = numpy.random.randn(2, 100)
figure, axs = pyplot.subplots(2, 2, figsize=(5, 5))
axs[0, 0].hist(data[0])
axs[1, 0].scatter(data[0], data[1])
axs[0, 1].plot(data[0], data[1])
axs[1, 1].hist2d(data[0], data[1])
experiment.log_image('diagrams', figure)
Note
For efficiency, logs are uploaded in batches via a queue. Hence, if you log a lot of data, you may experience slight delays in the Neptune web application.
Note
Passing the x coordinate as NaN or +/-inf causes this log entry to be ignored. A warning is printed to stdout.
Warning
Only images up to 15MB are supported. Larger files will not be logged to Neptune.
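The array shape rules listed for the y parameter can be expressed as a short lookup. This is a sketch of the documented rules only; describe_image_shape is a hypothetical helper, and how the client treats other shapes is not stated here, so the sketch simply rejects them.

```python
def describe_image_shape(shape):
    """Map an array shape to the image interpretation documented for log_image().

    Hypothetical helper, not part of the Neptune client.
    """
    if len(shape) == 2:
        return 'grayscale'
    if len(shape) == 3:
        # For 3-dimensional arrays the last dimension selects the mode.
        modes = {1: 'grayscale', 3: 'RGB', 4: 'RGBA'}
        if shape[-1] in modes:
            return modes[shape[-1]]
    raise ValueError('shape not covered by the documented rules: {!r}'.format(shape))


mode_2d = describe_image_shape((300, 200))
mode_rgb = describe_image_shape((300, 200, 3))
mode_rgba = describe_image_shape((300, 200, 4))
```

With numpy available, the argument would be array.shape.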
-
send_artifact(self, artifact, destination=None)
Save an artifact (file) in experiment storage.
Alias for log_artifact()
-
log_artifact(self, artifact, destination=None)
Save an artifact (file) in experiment storage.
Parameters
artifact (str or IO object) – A path to the file in the local filesystem or an IO object. It can be an open file descriptor or an in-memory buffer like io.StringIO or io.BytesIO.
destination (str, optional, default is None) – A destination path. If None is passed, the artifact file name will be used.
Note
If you use in-memory buffers like io.StringIO or io.BytesIO, remember that in the typical case, when you write to such a buffer, its current position is set to the end of the stream. To read its content, you need to move its position back to the beginning. We recommend calling seek(0) on an in-memory buffer before passing it to Neptune. Additionally, if you provide io.StringIO, it will be encoded in ‘utf-8’ before being sent to Neptune.
Raises
FileNotFound – When the artifact file was not found.
StorageLimitReached – When the storage limit in the project has been reached.
Example
Assuming that experiment is an instance of Experiment:
# simple use
experiment.log_artifact('images/wrong_prediction_1.png')
# save file in other directory
experiment.log_artifact('images/wrong_prediction_1.png', 'validation/images/wrong_prediction_1.png')
# save file under different name
experiment.log_artifact('images/wrong_prediction_1.png', 'images/my_image_1.png')
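The note about in-memory buffers is easy to verify with the standard library alone. The snippet below shows why seek(0) matters; it does not call Neptune, and the buffer contents are made-up sample data.

```python
import io

buffer = io.BytesIO()
buffer.write(b'model weights')

# After writing, the position is at the end of the stream, so nothing is read.
read_at_end = buffer.read()

# Rewind, then read again: now the full content is returned.
buffer.seek(0)
read_from_start = buffer.read()

# Rewind once more so that a consumer of the buffer starts at the beginning.
buffer.seek(0)

# Text buffers hold str; encoding to UTF-8 yields the bytes that would be stored.
text_buffer = io.StringIO()
text_buffer.write('accuracy: 0.93')
text_buffer.seek(0)
encoded = text_buffer.read().encode('utf-8')
```

A rewound buffer could then be passed on, for example as experiment.log_artifact(buffer, 'weights.bin'), where the destination name is only an example.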
-
delete_artifacts(self, path)
Removes one or more artifacts (files or directories) from the experiment storage.
Parameters
path (list or str) – Path or list of paths to remove from the experiment’s output
Raises
FileNotFound – If a path in experiment artifacts does not exist.
Examples
Assuming that experiment is an instance of Experiment:
experiment.delete_artifacts('forest_results.pkl')
experiment.delete_artifacts(['forest_results.pkl', 'directory'])
experiment.delete_artifacts('')
-
download_artifact(self, path, destination_dir=None)
Download an artifact (file) from the experiment storage.
Download a file indicated by path from the experiment artifacts and save it in destination_dir.
Parameters
path (str) – Path to the file to be downloaded.
destination_dir (str) – The directory where the file will be downloaded. If None is passed, the file will be downloaded to the current working directory.
Raises
NotADirectory – When destination_dir is not a directory.
FileNotFound – If a path in experiment artifacts does not exist.
Examples
Assuming that experiment is an instance of Experiment:
experiment.download_artifact('forest_results.pkl', '/home/user/files/')
-
download_sources(self, path=None, destination_dir=None)
Download a directory or a single file from the experiment’s sources as a ZIP archive.
Download a subdirectory (or file) path from the experiment sources and save it in destination_dir as a ZIP archive. The name of the archive will be the name of the downloaded directory (or file) with the ‘.zip’ extension.
Parameters
path (str) – Path of a directory or file in experiment sources to be downloaded. If None is passed, all source files will be downloaded.
destination_dir (str) – The directory where the archive will be downloaded. If None is passed, the archive will be downloaded to the current working directory.
Raises
NotADirectory – When destination_dir is not a directory.
FileNotFound – If a path in experiment sources does not exist.
Examples
Assuming that experiment is an instance of Experiment:
# Download all experiment sources to current working directory
experiment.download_sources()
# Download a single directory
experiment.download_sources('src/my-module')
# Download all experiment sources to user-defined directory
experiment.download_sources(destination_dir='/tmp/sources/')
# Download a single directory to user-defined directory
experiment.download_sources('src/my-module', 'sources/')
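To locate the archive after a download, the naming rule above can be sketched with path handling from the standard library. expected_archive_path is a hypothetical helper, not part of the Neptune client. It assumes that '.zip' is appended to the last path component, and it does not cover path=None, because the archive name for that case is not stated here.

```python
import posixpath


def expected_archive_path(path, destination_dir='.'):
    """Guess where the ZIP archive for a downloaded directory ends up.

    Hypothetical helper based on the documented naming rule.
    """
    # The archive is named after the last component of the downloaded path.
    name = posixpath.basename(path.rstrip('/'))
    return posixpath.join(destination_dir, name + '.zip')


in_cwd = expected_archive_path('src/my-module')
in_sources = expected_archive_path('src/my-module', 'sources/')
```

The resulting path could then be opened with zipfile.ZipFile to extract the downloaded files.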
-
download_artifacts(self, path=None, destination_dir=None)
Download a directory or a single file from the experiment’s artifacts as a ZIP archive.
Download a subdirectory (or file) path from the experiment artifacts and save it in destination_dir as a ZIP archive. The name of the archive will be the name of the downloaded directory (or file) with the ‘.zip’ extension.
Parameters
path (str) – Path of a directory or file in experiment artifacts to be downloaded. If None is passed, all artifacts will be downloaded.
destination_dir (str) – The directory where the archive will be downloaded. If None is passed, the archive will be downloaded to the current working directory.
Raises
NotADirectory – When destination_dir is not a directory.
FileNotFound – If a path in experiment artifacts does not exist.
Examples
Assuming that experiment is an instance of Experiment:
# Download all experiment artifacts to current working directory
experiment.download_artifacts()
# Download a single directory
experiment.download_artifacts('data/images')
# Download all experiment artifacts to user-defined directory
experiment.download_artifacts(destination_dir='/tmp/artifacts/')
# Download a single directory to user-defined directory
experiment.download_artifacts('data/images', 'artifacts/')
-
reset_log(self, log_name)
Resets the log.
Removes all data from the log and enables it to be reused from scratch.
Parameters
log_name (str) – The name of the log to reset.
Raises
ChannelDoesNotExist – When the log with name log_name does not exist on the server.
Example
Assuming that experiment is an instance of Experiment:
experiment.reset_log('my_metric')
Note
Check the Neptune web application to see that reset charts have no data.
-
get_parameters(self)
Retrieve parameters for this experiment.
Returns
dict - dictionary mapping a parameter name to value.
Examples
Assuming that experiment is an instance of Experiment:
exp_params = experiment.get_parameters()
-
get_properties(self)
Retrieve user-defined properties for this experiment.
Returns
dict - dictionary mapping a property key to value.
Examples
Assuming that experiment is an instance of Experiment:
exp_properties = experiment.get_properties()
-
set_property(self, key, value)
Set a key-value pair as an experiment property.
If a property with the given key does not exist, a new one is added.
Parameters
key (str) – Property key.
value (obj) – New value of the property.
Examples
Assuming that experiment is an instance of Experiment:
experiment.set_property('model', 'LightGBM')
experiment.set_property('magic-number', 7)
-
remove_property(self, key)
Removes a property with the given key.
Parameters
key (single str) – Key of the property to remove.
Examples
Assuming that experiment is an instance of Experiment:
experiment.remove_property('host')
-
get_hardware_utilization(self)
Retrieve GPU, CPU and memory utilization data.
Get hardware utilization metrics for the entire experiment as a single pandas.DataFrame object. The returned DataFrame has the following columns (assuming a single GPU with index 0):
x_ram - time (in milliseconds) from the experiment start,
y_ram - memory usage in GB,
x_cpu - time (in milliseconds) from the experiment start,
y_cpu - CPU utilization percentage (0-100),
x_gpu_util_0 - time (in milliseconds) from the experiment start,
y_gpu_util_0 - GPU utilization percentage (0-100),
x_gpu_mem_0 - time (in milliseconds) from the experiment start,
y_gpu_mem_0 - GPU memory usage in GB.
If more GPUs are available, they have separate columns with appropriate indices (0, 1, 2, …), for example: x_gpu_util_1, y_gpu_util_1.
The returned DataFrame may contain NaNs if one of the metrics has more values than others.
Returns
pandas.DataFrame - DataFrame containing the hardware utilization metrics.
Examples
The following values denote that after 3 seconds, the experiment used 16.7 GB of RAM:
x_ram = 3000
y_ram = 16.7
Assuming that experiment is an instance of Experiment:
hardware_df = experiment.get_hardware_utilization()
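Because the time columns are in milliseconds and rows may contain NaN, a small post-processing step is often useful. The sketch below works on plain Python lists so that it runs without pandas; ram_usage_points is a hypothetical helper, and the sample values are made up apart from the 3000 ms / 16.7 GB pair taken from the example above.

```python
import math


def ram_usage_points(x_ram, y_ram):
    """Pair x_ram (milliseconds) with y_ram (GB) as (seconds, GB), skipping NaN rows.

    Hypothetical helper, not part of the Neptune client.
    """
    points = []
    for milliseconds, gigabytes in zip(x_ram, y_ram):
        if math.isnan(milliseconds) or math.isnan(gigabytes):
            continue  # padding row: another metric had more values
        points.append((milliseconds / 1000.0, gigabytes))
    return points


points = ram_usage_points([3000, 6000, float('nan')], [16.7, 17.1, float('nan')])
```

With a real DataFrame, the inputs would be hardware_df['x_ram'] and hardware_df['y_ram'].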
-
get_numeric_channels_values(self, *channel_names)
Retrieve values of specified metrics (numeric logs).
The returned pandas.DataFrame contains 1 additional column x along with the requested metrics.
Parameters
*channel_names (one or more str) – comma-separated metric names.
Returns
pandas.DataFrame - DataFrame containing values for the requested metrics. The returned DataFrame may contain NaNs if one of the metrics has more values than others.
Example
Invoking get_numeric_channels_values('loss', 'auc') returns a DataFrame with columns x, loss, auc.
Assuming that experiment is an instance of Experiment:
batch_channels = experiment.get_numeric_channels_values('batch-1-loss', 'batch-2-metric')
epoch_channels = experiment.get_numeric_channels_values('epoch-1-loss', 'epoch-2-metric')
Note
It’s a good idea to request metrics that share a common temporal pattern (like iteration or batch/epoch number). That way, each row of the returned DataFrame holds metrics from the same moment in the experiment. For example, combine epoch metrics into one DataFrame and batch metrics into another.
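Why NaNs appear when metrics differ in length can be illustrated without pandas. The sketch below pads shorter series with NaN; it is an analogy for the behaviour described above, not the way the client builds its DataFrame, and align_metrics is a hypothetical helper.

```python
import math
from itertools import zip_longest


def align_metrics(**series):
    """Combine metric series row by row, padding shorter ones with NaN.

    Hypothetical helper used only to illustrate the NaN padding.
    """
    names = list(series)
    rows = []
    for values in zip_longest(*(series[name] for name in names), fillvalue=math.nan):
        rows.append(dict(zip(names, values)))
    return rows


# 'loss' has three values, 'auc' only two, so the last row has NaN for 'auc'.
rows = align_metrics(loss=[0.9, 0.7, 0.5], auc=[0.6, 0.8])
```

Metrics logged at the same cadence have equal lengths and produce no padding, which is the point of the note above.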
-
_start(self, upload_source_entries=None, abort_callback=None, logger=None, upload_stdout=True, upload_stderr=True, send_hardware_metrics=True, run_monitoring_thread=True, handle_uncaught_exceptions=True)
-
stop(self, exc_tb=None)
Marks the experiment as finished (succeeded or failed).
Parameters
exc_tb (str, optional, default is None) – Additional traceback information to be stored in experiment details in case of failure (stacktrace, etc). If this argument is None, the experiment will be marked as succeeded. Otherwise, the experiment will be marked as failed.
Examples
Assuming that experiment is an instance of Experiment:
# Marks experiment as succeeded
experiment.stop()
# Assuming 'ex' is some exception,
# it marks experiment as failed with exception info in experiment details.
experiment.stop(str(ex))
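A common way to apply the rule above is to wrap the training code so that stop() receives the traceback only when something fails. The sketch below uses FakeExperiment, a hypothetical stand-in that mimics only the documented behaviour of stop(), so it runs without a Neptune connection; run_tracked is likewise an assumed helper name.

```python
import traceback


class FakeExperiment:
    """Hypothetical stand-in mimicking only the documented stop() rule."""

    def __init__(self):
        self.state = 'running'
        self.traceback = None

    def stop(self, exc_tb=None):
        self.traceback = exc_tb
        self.state = 'succeeded' if exc_tb is None else 'failed'


def run_tracked(experiment, train):
    """Run train() and stop the experiment as succeeded or failed."""
    try:
        result = train()
    except Exception:
        # Pass the formatted traceback so that the failure details are stored.
        experiment.stop(traceback.format_exc())
        raise
    experiment.stop()
    return result


succeeded = FakeExperiment()
answer = run_tracked(succeeded, lambda: 42)
```

With a real experiment, the first argument would be the object returned by create_experiment().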