Oxen-release Alternatives
Similar projects and alternatives to oxen-release
-
gpt-2-output-dataset
Dataset of GPT-2 outputs for research in detection, biases, and more
-
Milvus
A cloud-native vector database with high-performance and high scalability.
-
qdrant
Qdrant - Vector Search Engine and Database for the next generation of AI applications. Also available in the cloud https://cloud.qdrant.io/
-
dud
A lightweight CLI tool for versioning data alongside source code and building data pipelines.
oxen-release reviews and mentions
-
A tale of Phobos – how we almost cracked a ransomware using CUDA
We've been working on some open source tooling called "oxen" that was built for large datasets of images, video, audio, text etc. We wanted to solve the exact problem you're flagging here with git.
Feel free to check it out here: https://github.com/Oxen-AI/oxen-release#-oxen. We would love any feedback!
-
A Critical Field Guide for Working with Machine Learning Datasets
We've been working on an open source tool called Oxen to help store large ML datasets. It's optimized for large sets of unstructured data, i.e. images, video, audio, and text, as well as Parquet- or Arrow-style DataFrames.
Would love to get some feedback on it!
-
Oxen.ai: Fast Unstructured Data Version Control
Creators of the project here. The first difference is the raw speed if you have many images, videos, audio files, etc.
We did some benchmarking here: https://github.com/Oxen-AI/oxen-release/blob/main/Performanc...
~TLDR~
The comparison with DVC is biased https://github.com/Oxen-AI/oxen-release/blob/main/Performanc...
I got nowhere near the same performance with oxen. The analysis is heavily biased in Oxen's favor. I wish people had more integrity before trying so hard to push a half-baked product into the market.
-
🐂 🌾 Oxen.ai - Blazing Fast Unstructured Data Version Control
We have been working on an open source data version control tool, built in Rust, and aimed at versioning large sets of images, videos, audio, text, data frames, etc., i.e. the types of data you need to work with for modern machine learning systems. The tooling can index hundreds of thousands of images in seconds and uses fast hashing and modern network protocols to sync them to the remote extremely fast. You can check out some performance numbers on the CelebA facial recognition dataset here.
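To make the "fast hashing" idea above concrete, here is a minimal Python sketch of content-hash indexing: each file is hashed, and only files whose hash changed need to be synced. This is an illustration under stated assumptions, not Oxen's actual implementation. The function names (`hash_file`, `index_directory`, `changed_paths`) are hypothetical, and BLAKE2b from the standard library stands in for whatever hash Oxen really uses.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return a hex digest of the file's contents, read in 1 MiB chunks."""
    # Assumption: BLAKE2b is a stand-in; the source does not name Oxen's hash.
    digest = hashlib.blake2b(digest_size=16)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_directory(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its content hash."""
    return {
        p.relative_to(root).as_posix(): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_paths(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Return paths that are new or whose content hash differs."""
    return sorted(path for path, h in new.items() if old.get(path) != h)
```

Comparing two indexes this way means an unchanged file never has to be re-uploaded, which is one plausible reason hashing speed matters for large image datasets.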
-
🐂 🌾 Oxen.ai - Blazing Fast Unstructured Data Version Control, built in Rust
It is, but git-lfs is extremely slow when it comes to these types of use cases, due to a few factors including the hashing algorithm and network protocols. I have some benchmarking here: https://github.com/Oxen-AI/oxen-release/blob/main/Performance.md
-
Detect ChatGPT Generated Content
Where are you hosting the dataset? Would love to help out, I'm building an open source data version control tool to help iterate on ML datasets.
https://github.com/Oxen-AI/oxen-release
Would be cool if we could get a community around the test dataset to ensure that 93% accuracy rate. Then people can add their failure cases to the repo and you can iterate on them.
-
A note from our sponsor - SonarLint
www.sonarlint.org | 25 Mar 2023
Stats
Oxen-AI/oxen-release is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.