Storage troubleshooting and tips
In this guide, you'll learn:
- How to identify what data is taking up storage space.
- What you can do to free up space.
- How you can reduce or manage the amount of data that is logged to Neptune.
Finding what takes up space
Which projects take up the most space?
To find out which of your projects take up the most space, click your workspace name in the top-left corner → Subscription → Usage.
Here you'll find a table of all your projects, which you can sort by storage used.
You can also get this information by querying the API and sorting the results by project or by user. The following Jupyter notebook demonstrates how to do this.
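As a rough illustration of the API approach, the sketch below sums the size of every run per project or per owner. It assumes the `neptune` Python client (1.x) with an API token set in the environment, and that each run exposes its size in bytes in the `sys/size` field and its owner in `sys/owner`. The helper names are made up for this example and are not part of the Neptune API.

```python
from collections import defaultdict


def total_size_by(rows, key):
    """Sum "sys/size" over all rows, grouped by `key`, largest total first."""
    totals = defaultdict(float)
    for row in rows:
        size = row.get("sys/size")
        if size is None or size != size:  # skip missing values and NaN
            continue
        totals[row.get(key, "unknown")] += size
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def fetch_run_rows(project_name):
    """Return one dict per run with its size and owner (needs the neptune package)."""
    import neptune  # imported lazily so total_size_by() works without it

    project = neptune.init_project(project=project_name, mode="read-only")
    try:
        table = project.fetch_runs_table(columns=["sys/size", "sys/owner"]).to_pandas()
    finally:
        project.stop()
    rows = table.to_dict("records")
    for row in rows:
        row["project"] = project_name  # remember which project the run came from
    return rows
```

To compare projects, collect the rows from `fetch_run_rows()` for each project name into one list, then call `total_size_by(rows, "project")` or `total_size_by(rows, "sys/owner")`.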
Trashed runs or models still take up space
Trashed items are not deleted until you manually empty the trash.
To check or empty trash, navigate to a project and select the Trash tab.
- To permanently delete all listed objects, select Empty trash.
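If you prefer to script this step, the following is a minimal sketch that assumes the `neptune` Python client (1.x) and its `management.clear_trash()` function. The wrapper and its `confirm` flag are hypothetical additions that guard against accidental deletion.

```python
def empty_trash(project_name, confirm=False):
    """Permanently delete everything in the project's trash.

    Does nothing and returns False unless confirm=True is passed.
    """
    if not confirm:
        return False
    from neptune import management  # lazy import: only needed for the real call

    management.clear_trash(project=project_name)
    return True
```

Deletion from the trash cannot be undone, so the call only runs when `confirm=True` is passed explicitly.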
Which runs or models take up the most space?
In the table view for runs, models, or model versions, you can sort the objects by size.
- Click Add column.
- Select the sys/size field, or start typing it until you see it in the search results.
- In the Size column that appears, click the icon on the column and select Sort descending.
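The same ranking can be approximated outside the app. The sketch below assumes the `neptune` Python client (1.x) and that `sys/size` holds the run size in bytes. The function names are illustrative only, and the binary units are a display choice that may not match what the app shows.

```python
def format_bytes(num_bytes):
    """Render a byte count with a binary unit, for example 1536 -> '1.5 KiB'."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if value < 1024 or unit == "TiB":
            return f"{value:.1f} {unit}"
        value /= 1024


def _size(row):
    """Return the row's "sys/size" as a float, or None if missing or NaN."""
    try:
        size = float(row.get("sys/size"))
    except (TypeError, ValueError):
        return None
    return None if size != size else size


def largest_runs(rows, n=10):
    """Return the n rows with the largest "sys/size", sorted descending."""
    sized = [row for row in rows if _size(row) is not None]
    return sorted(sized, key=_size, reverse=True)[:n]


def fetch_rows(project_name):
    """Fetch ID and size for every run in a project (needs the neptune package)."""
    import neptune  # lazy import so the helpers above work without it

    project = neptune.init_project(project=project_name, mode="read-only")
    try:
        table = project.fetch_runs_table(columns=["sys/id", "sys/size"]).to_pandas()
    finally:
        project.stop()
    return table.to_dict("records")
```

For example, `for row in largest_runs(fetch_rows("my-workspace/my-project"), 5): print(row["sys/id"], format_bytes(row["sys/size"]))` would list the five largest runs of a hypothetical project.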
Scenario A: Certain objects take up a lot of space
If only a few runs or model objects take up a lot of storage space, look into what kind of metadata they tend to have logged.
- You can check the size of FileSet fields by browsing All metadata of a particular run.
- Simple types, such as String values, take up very little space and are unlikely to be the problem.
If there are any runs you no longer need, consider deleting them. To keep your storage manageable, make this clean-up a monthly or quarterly activity.
In the runs table, you can display old runs by filtering based on the
If you want to keep the old runs, check if there are some individual metadata fields that you could delete, such as model checkpoints or large visualizations. You can do this via API by resuming each run and deleting fields or entire namespaces you don't need.
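A minimal sketch of that resume-and-delete workflow follows, assuming the `neptune` Python client (1.x). The helper names, the run IDs, and the field paths in the usage note are placeholders, not values from your project.

```python
def delete_fields(run, paths):
    """Delete each field or namespace in `paths` that exists on `run`.

    Returns the list of paths that were actually removed.
    """
    removed = []
    for path in paths:
        if run.exists(path):
            del run[path]
            removed.append(path)
    return removed


def prune_runs(project_name, run_ids, paths):
    """Resume each run by ID, delete the given paths, and close the run."""
    import neptune  # lazy import so delete_fields() can be used without it

    for run_id in run_ids:
        run = neptune.init_run(
            project=project_name,
            with_id=run_id,
            # Avoid logging new monitoring data while cleaning up.
            capture_stdout=False,
            capture_stderr=False,
            capture_hardware_metrics=False,
        )
        try:
            delete_fields(run, paths)
        finally:
            run.stop()
```

A call such as `prune_runs("my-workspace/my-project", ["RUN-1", "RUN-2"], ["model/checkpoints"])` would remove the hypothetical `model/checkpoints` namespace from both runs and leave all other metadata in place.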
If you have large datasets logged, consider storing them in dedicated cloud storage and only tracking them in Neptune as artifacts.
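For illustration, artifact tracking with the `neptune` Python client (1.x) could look like the sketch below. It relies on `track_files()`, which records metadata about the files rather than uploading their contents. The wrapper function, the field path, and the bucket URL in the usage note are assumptions made for this example.

```python
def track_dataset(run, field_path, location):
    """Track the files at `location` as an artifact without uploading them."""
    run[field_path].track_files(location)
    return field_path
```

With a run created by `neptune.init_run()`, a call such as `track_dataset(run, "datasets/train", "s3://my-bucket/train/")` would store only the artifact metadata in Neptune, while the data itself stays in the bucket.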
Scenario B: Objects are largely uniform in size
If all runs are similar in size and there are no clear outliers, the next step would be to identify what metadata takes the most space.
- If all the metadata is important for the runs, consider deleting some runs that are no longer needed.
- You can modify the code to avoid logging some of the heavy metadata, or store the data externally (for example, in dedicated cloud storage) and only store a link to it in Neptune.
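One possible way to apply the second option in logging code is sketched below: small files are uploaded as usual, while for large files only a link is stored. The function name and the 10 MiB threshold are arbitrary choices for this example. The sketch assumes the `neptune` Python client (1.x) and that you copy the file to external storage yourself, because the helper does not do that.

```python
import os


def log_or_link(run, field_path, local_path, external_url, max_bytes=10 * 1024 * 1024):
    """Upload the file if it is small enough, otherwise store only its external URL.

    Returns "uploaded" or "linked" to indicate which branch was taken.
    """
    if os.path.getsize(local_path) <= max_bytes:
        run[field_path].upload(local_path)
        return "uploaded"
    # The file is assumed to already exist at external_url.
    run[field_path] = external_url
    return "linked"
```

The threshold keeps lightweight files viewable in the app and prevents large checkpoints or visualizations from counting toward your storage.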