We have been working on an open source data version control tool, built in Rust, aimed at versioning large sets of images, videos, audio, text, data frames, etc., i.e. the types of data you need to work with for modern machine learning systems. The tooling can index hundreds of thousands of images in seconds, and it uses fast hashing and modern network protocols to sync them to the remote extremely fast. You can check out some performance numbers on the CelebA facial recognition dataset here.
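To give a rough feel for what "indexing with hashing" can mean in a data version control setting, here is a minimal Python sketch. It is not the tool's actual implementation: the function names (`hash_file`, `build_index`, `diff_index`) are made up for illustration, and BLAKE2b from the standard library is just a stand-in for whatever fast hash the tool really uses. The idea is only that each file gets a content hash, and comparing two indexes tells you which files would need to be synced.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return a hex digest of the file's contents, read in 1 MiB chunks."""
    # BLAKE2b is an assumption here, chosen because it ships with Python.
    digest = hashlib.blake2b(digest_size=16)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its content hash."""
    return {
        p.relative_to(root).as_posix(): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_index(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two indexes and report which paths changed between them."""
    return {
        "added": sorted(k for k in new if k not in old),
        "removed": sorted(k for k in old if k not in new),
        "modified": sorted(k for k in new if k in old and old[k] != new[k]),
    }
```

With something like this, only the paths listed under `added` and `modified` would need to be uploaded on the next sync; unchanged files are skipped because their hashes match. A real tool would also parallelize the hashing and cache results, which this sketch leaves out.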