If you want to perform vector search over your data, then Qdrant (https://qdrant.tech) is worth checking out.
We've been working on an open-source tool called Oxen to help store large ML datasets. It's optimized for large sets of unstructured data (i.e. images, video, audio, text) as well as Parquet- or Arrow-style DataFrames.
Would love to get some feedback on it!
Vector databases in general are good for storing large amounts of unstructured data by first converting it into embeddings via ML models. There are also feature stores, which store and organize features for later use in model training or predictive analytics. Feature stores generally come in _before_ models get trained, while vector databases generally come _after_ (i.e. they use trained models).
Milvus (https://milvus.io) and Feast (https://feast.dev/) are two of the best-known vector databases and feature stores, respectively.
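To make the "convert to embeddings, then search" idea a bit more concrete, here's a rough sketch in plain Python. Everything in it is an assumption for illustration: `embed` is a toy word-count stand-in for a trained model, and `search` is a brute-force scan, not how Milvus or any real vector database works internally (those use approximate nearest-neighbor indexes).

```python
import math
from collections import Counter

def embed(text, vocab):
    """Hypothetical stand-in for a trained embedding model:
    counts how often each vocabulary word appears in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, index, k=1):
    """Brute-force nearest-neighbor search over (document, vector) pairs."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# Made-up vocabulary and documents, just to exercise the functions.
vocab = ["cat", "dog", "stock", "market"]
docs = ["cat and dog photos", "stock market report"]

# "Indexing": embed every document once and keep the vectors around.
index = [(d, embed(d, vocab)) for d in docs]

# "Querying": embed the query with the same model, then rank by similarity.
result = search(embed("dog pictures", vocab), index)
print(result)
```

The point is just the order of operations: the model has to exist first, then documents and queries both go through it, which is why vector databases sit _after_ training.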