Logging datasets
You can track the metadata of your datasets by logging them as artifacts.
Use the track_files() method and pass a file or folder path:
# Single file
run["train/dataset"].track_files("./datasets/train.csv")
# Folder
run["train/images"].track_files("s3://datasets/images")
This saves a record of a file (or collection of files) with the following information:
- URL and file path
- MD5 hash
- Size
- Last modified
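To make the recorded fields concrete, the sketch below computes comparable metadata for a local file using only the Python standard library. It is an illustration, not Neptune's implementation: the helper name `describe_file` and the dictionary keys are hypothetical, and the exact format Neptune stores may differ.

```python
import hashlib
import os
from datetime import datetime, timezone


def describe_file(path, chunk_size=1 << 20):
    """Collect path, MD5 hash, size, and last-modified time for a local file.

    Hypothetical helper for illustration only; not part of the Neptune API.
    """
    md5 = hashlib.md5()
    # Read in chunks so that large dataset files are not loaded into memory at once.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)

    stat = os.stat(path)
    return {
        "path": os.path.abspath(path),
        "md5": md5.hexdigest(),
        "size_bytes": stat.st_size,
        "last_modified": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
    }
```

Because the hash depends only on the file contents, two runs that point at the same path but record different MD5 values were given different versions of the data.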
To learn more, see Tracking artifacts.
Uploading datasets
If you want to upload the dataset itself (or a sample of it) rather than only its metadata, you can upload it as you would any other file.
For details, see Uploading files.
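As a rough sketch of the "sample" case, the helper below writes the header and the first few rows of a CSV file to a new file, which can then be uploaded. The helper `write_csv_sample`, the field name `train/dataset_sample`, and the file paths are hypothetical. The commented usage assumes an already initialized run object named `run` and the `upload()` method covered under Uploading files.

```python
import csv


def write_csv_sample(src_path, dst_path, n_rows=100):
    """Copy the header and the first n_rows data rows of a CSV file.

    Hypothetical helper for illustration only; not part of the Neptune API.
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        for i, row in enumerate(reader):
            if i > n_rows:  # row 0 is the header, so stop after n_rows data rows
                break
            writer.writerow(row)
    return dst_path


# Usage, assuming `run` is an initialized Neptune run:
# sample_path = write_csv_sample("./datasets/train.csv", "./datasets/train_sample.csv")
# run["train/dataset_sample"].upload(sample_path)
```

Taking the first rows is the simplest choice. If the file is sorted by label or date, a random sample would be more representative.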
See also
Use cases ≫ Data versioning