Log datasets

You can track the metadata and version of your datasets by logging them as artifacts.

Use the track_files() method and pass a file or folder path:

# Single file
run["train/dataset"].track_files("./datasets/train.csv")

# Folder (here, in S3 storage)
run["train/images"].track_files("s3://datasets/images")

This saves a record of the file (or collection of files), not its contents, with the following information:

  • URL and file path
  • MD5 hash
  • Size
  • Last modified

To learn more, see Track artifacts.
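
To make the recorded fields concrete, here is a minimal sketch that collects the same kind of information for one local file using only the Python standard library. The helper name describe_file is hypothetical and this is not how track_files() is implemented; it only illustrates what a path, MD5 hash, size, and last-modified record might look like.

```python
import hashlib
import os
from datetime import datetime, timezone


def describe_file(path):
    """Collect path, MD5 hash, size, and last-modified time for a local file.

    Illustrative only: a hypothetical helper, not part of the tracking API.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            md5.update(chunk)
    stat = os.stat(path)
    return {
        "path": os.path.abspath(path),
        "md5": md5.hexdigest(),
        "size": stat.st_size,  # size in bytes
        "last_modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
    }
```

Because the hash depends on every byte of the file, two records with different hashes point to different contents, even if the path is unchanged.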

Uploading datasets

If you want to upload the dataset itself (or a sample of it) rather than only its metadata, you can upload it as you would any other file:

run["aux/data"].upload("auxiliary-data.zip")

For details, see Upload files.
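
If the full dataset is too large to upload, one option is to write a sample to a separate file and upload that instead. The following is a minimal sketch under some assumptions: the dataset is a CSV file with a header row and is small enough to read into memory. The helper name write_csv_sample and the file names in the usage comment are hypothetical.

```python
import csv
import random


def write_csv_sample(src, dst, n_rows, seed=0):
    """Write the header plus up to n_rows randomly chosen rows of src to dst.

    Assumes src has a header row. Returns the number of data rows written.
    """
    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    # Sort the chosen indices so the sample keeps the original row order.
    chosen = sorted(rng.sample(range(len(rows)), min(n_rows, len(rows))))
    with open(dst, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows[i] for i in chosen)
    return len(chosen)


# Hypothetical usage with an existing run object:
# write_csv_sample("./datasets/train.csv", "train-sample.csv", n_rows=1000)
# run["aux/data"].upload("train-sample.csv")
```

Fixing the seed means the same sample is produced on every run, so the uploaded file stays comparable across experiments.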