X-coordinates (step) must be strictly increasing#
Error
Error occurred during asynchronous operation processing: X-coordinates (step) must be strictly increasing for series attribute. <field_name>. Invalid point: <invalid step value>
Issue#
This error occurs when you log a value to a series field with a step that is less than or equal to the previously logged step. For the chart to be constructed correctly, step values must be strictly increasing.
You might run into this in the following scenarios:
- You're resuming an existing run and appending fresh values to a series field with steps already logged. Remember that append() is iterative, so each call adds a value to the existing set of values.
- You have distributed systems logging to the same monitoring namespace. For example, two GPUs logging utilization to the same field.
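Both scenarios come down to the same rule, which can be sketched in plain Python. The helper below is hypothetical and not part of the Neptune API. It assumes a rejected point is dropped and does not change the last accepted step.

```python
def find_invalid_steps(steps):
    """Return the steps that break the strictly increasing rule.

    Hypothetical helper, not part of the Neptune API: it tracks the last
    accepted step and flags any step that is less than or equal to it.
    """
    invalid = []
    last = None
    for step in steps:
        if last is not None and step <= last:
            invalid.append(step)  # this point would trigger the error
        else:
            last = step
    return invalid


# Resumed run: steps 0-2 are already logged, then the loop restarts at 0.
resumed = [0, 1, 2, 0, 1, 2, 3]
print(find_invalid_steps(resumed))  # [0, 1, 2]

# Two processes writing to the same field, each with its own step counter.
interleaved = [0, 0, 1, 1, 2, 2]
print(find_invalid_steps(interleaved))  # [0, 1, 2]
```

In the resumed case, only step 3 is accepted after the restart, because it is the first value greater than the last logged step.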
Solution#
Starting a new series#
If you want to resume a run and overwrite an existing series field, you need to either:
- delete the field, or
- change the name of the field so that effectively a new field is created.
Example#
In the example below, train/distribution is a FileSeries field. We want to start it from the beginning when resuming the run, so we first delete the field with the del command (del run["train/distribution"]). Once the field is deleted, steps can be appended from the beginning again:
for epoch in range(100):
    plt_fig = get_histogram()
    run["train/distribution"].append(plt_fig, step=epoch)
Tips for distributed systems#
If system metrics logging is the cause of the error, see the resources below for tips:
- Best practices
- Tracking distributed training jobs with Neptune
- Learn how the monitoring namespace works: Logging system metrics
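One way to keep processes from writing to the same field is to give each process its own monitoring namespace. The sketch below is a minimal illustration under stated assumptions: the helper name is hypothetical, and it assumes each process exposes its rank through a RANK environment variable, as launchers such as torchrun do.

```python
import os


def monitoring_namespace_for_rank(base="monitoring", env_var="RANK"):
    """Build a per-process namespace such as 'monitoring/rank_0'.

    Hypothetical helper: reads the process rank from an environment
    variable and falls back to rank 0 when the variable is not set.
    """
    rank = int(os.environ.get(env_var, "0"))
    return f"{base}/rank_{rank}"


namespace = monitoring_namespace_for_rank()
# The result could then be passed when initializing the run, for example:
# run = neptune.init_run(monitoring_namespace=namespace)
```

With separate namespaces, each process keeps its own step sequence, so steps from one GPU cannot collide with steps from another.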
Related
- Learn more about the step argument: Log custom x values for graph
- API references for series fields
- Trash and delete data
Filtering out the error#
Instead of turning off logging completely, you can filter out this error by adding the following snippet to your scripts:
import logging

import neptune


class _FilterCallback(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Drop only the async-processing errors emitted by the "neptune" logger.
        return not (
            record.name == "neptune"
            and record.getMessage().startswith(
                "Error occurred during asynchronous operation processing:"
            )
        )


neptune.internal.operation_processors.async_operation_processor.logger.addFilter(
    _FilterCallback()
)
The above filters out both of the following messages:
Error occurred during asynchronous operation processing: Timestamp must be non-decreasing for series attribute
Error occurred during asynchronous operation processing: X-coordinates (step) must be strictly increasing for series attribute
To filter out only one of the messages above, change the string passed to startswith() accordingly.
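As a sketch of that modification, the filter below drops only the step error and lets the timestamp error through. It uses only the standard logging module. The class name PrefixFilter and the constant names are illustrative, not part of Neptune.

```python
import logging

PREFIX = "Error occurred during asynchronous operation processing: "
STEP_ERROR = PREFIX + "X-coordinates (step) must be strictly increasing"


class PrefixFilter(logging.Filter):
    """Drop records from one logger whose message starts with a prefix."""

    def __init__(self, logger_name, prefix):
        super().__init__()
        self.logger_name = logger_name
        self.prefix = prefix

    def filter(self, record):
        is_match = (
            record.name == self.logger_name
            and record.getMessage().startswith(self.prefix)
        )
        # Returning False tells the logging module to discard the record.
        return not is_match


# Filters only the step error; the timestamp error is still logged.
step_filter = PrefixFilter("neptune", STEP_ERROR)
```

The step_filter instance can be attached with the same addFilter() call shown earlier, in place of _FilterCallback().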