Log datasets
You can track the metadata and version of your datasets by logging them as artifacts.
Use the track_files() method and pass a file or folder path:
# Single file
run["train/dataset"].track_files("./datasets/train.csv")
# Folder
run["train/images"].track_files("s3://datasets/images")
This saves a record of a file (or collection of files) with the following information:
- URL and file path
- MD5 hash
- Size
- Last modified
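To make the recorded fields concrete, the sketch below computes the same kind of metadata for a local file using only the Python standard library. It is an illustration, not Neptune's implementation: the helper name `describe_file` and the keys of the returned dictionary are hypothetical.

```python
import hashlib
import os
from datetime import datetime, timezone


def describe_file(path, chunk_size=1 << 20):
    """Collect path, MD5 hash, size, and last-modified time for one file.

    Hypothetical helper for illustration only; not part of the Neptune API.
    """
    md5 = hashlib.md5()
    # Read in chunks so large dataset files are not loaded into memory at once.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    stat = os.stat(path)
    return {
        "path": os.path.abspath(path),
        "md5": md5.hexdigest(),
        "size": stat.st_size,  # size in bytes
        "last_modified": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
    }
```

If the contents of a file change, its hash changes too, which is what makes a record like this useful for telling dataset versions apart.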
To learn more, see Track artifacts.
Uploading datasets
To upload a dataset (or a sample of it) in full, upload it as you would any other file.
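As a minimal sketch, the helper below wraps the upload() call on a run field. The function name `upload_dataset_sample`, the field name `train/dataset_sample`, and the file path in the usage comment are assumptions made for illustration. The `run` argument is expected to be a Neptune run object like the one used in the track_files() examples.

```python
def upload_dataset_sample(run, local_path, field="train/dataset_sample"):
    """Upload one file in full to the given field of a run.

    Hypothetical wrapper: `run` is assumed to be a Neptune run object,
    and the default field name is only an example.
    """
    run[field].upload(local_path)
    return field


# Typical use (assumes the neptune package is installed and
# credentials are configured; the path is a placeholder):
#   import neptune
#   run = neptune.init_run()
#   upload_dataset_sample(run, "./datasets/train_sample.csv")
#   run.stop()
```

Unlike track_files(), which saves only metadata about the file, this uploads the file contents, so it is usually better suited to small datasets or samples.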
For details, see Upload files.
See also
- Tutorials ≫ Data versioning
- Downloading or fetching files