-
label-errors
🛠️ Corrected Test Sets for ImageNet, MNIST, CIFAR, Caltech-256, QuickDraw, IMDB, Amazon Reviews, 20News, and AudioSet
We benchmarked the minimum (lower bound) of error detection across the ten most commonly used real-world ML datasets and found that the lower bound is at least 50% accurate. You can see these errors yourself at labelerrors.com (all found with cleanlab studio, a more advanced version of the algorithms in confident learning). This work was nominated for a best paper award at NeurIPS 2021.
-
cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
If you have trained a speech-to-text model and are able to get its probabilistic predictions over the word/token at each position, then you can use the token_classification module in our open-source cleanlab library for this purpose.
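As a rough illustration of the idea in the comment above, the sketch below flags tokens whose given label disagrees with the model's most likely class and ranks them by the probability the model assigned to the given label. This is a simplified stand-in written in plain Python, not cleanlab's actual algorithm or API; the function name `rank_suspect_tokens` and the toy data are hypothetical, and the input layout (one list of labels and one list of per-token probability rows per sentence) is an assumption.

```python
def rank_suspect_tokens(labels, pred_probs):
    """Return (sentence_idx, token_idx, score) tuples, lowest score first.

    labels: list of sentences, each a list of integer class ids.
    pred_probs: list of sentences, each a list of per-token probability
        rows (one probability per class).
    """
    suspects = []
    for i, (sent_labels, sent_probs) in enumerate(zip(labels, pred_probs)):
        for j, (given, probs) in enumerate(zip(sent_labels, sent_probs)):
            # Class the model considers most likely for this token.
            predicted = max(range(len(probs)), key=probs.__getitem__)
            if predicted != given:
                # Score = probability the model gives to the given label.
                suspects.append((i, j, probs[given]))
    # Tokens whose given label the model trusts least come first.
    return sorted(suspects, key=lambda item: item[2])


# Toy data (made up for illustration): 2 sentences, 3 classes.
labels = [[0, 1, 2], [2, 0]]
pred_probs = [
    [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.7, 0.2, 0.1]],
    [[0.3, 0.3, 0.4], [0.05, 0.9, 0.05]],
]
issues = rank_suspect_tokens(labels, pred_probs)
print(issues)
```

In practice the probabilities would come from the speech-to-text model itself, ideally computed out-of-sample, and the highest-ranked tokens would be reviewed by hand.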
Related posts
- Automated Data Quality at Scale
- [Research] Detecting Errors in Numerical Data via any Regression Model
- Detecting Errors in Numerical Data via Any Regression Model
- cleanlab v2.5 now supports all major ML tasks (adds regression, object detection, and image segmentation)
- Enhancing Product Analytics and E-commerce Business