Top 9 weak-supervision Open-Source Projects
-
cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
-
argilla
Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.
Project mention: [Research] Detecting Annotation Errors in Semantic Segmentation Data | /r/MachineLearning | 2023-11-05
We have freely open-sourced our new method for improving segmentation data, published a paper on the research behind it, and released a 5-min code tutorial. You can also read more in the blog if you'd like.
Project mention: Open-Source Data Collection Platform for LLM Fine-Tuning and RLHF | news.ycombinator.com | 2023-06-05
I'm Dani, CEO and co-founder of Argilla.
Happy to answer any questions you might have and excited to hear your thoughts!
More about Argilla
GitHub: https://github.com/argilla-io/argilla
Among all these feel-good stories, how about one with a slightly different ending?
During my master's, I created an ML library that dealt with noise in datasets. I implemented a bunch of papers, but unlike your usual research code, I spent a long time obsessing about its API and performance, and created documentation, CI, the whole shebang [1]. But then, as with the average research code, I moved on and promptly forgot about it.
One day about a year ago, the cofounder of a very new, small startup working on something similar messaged me on LinkedIn about the project. We chatted for a bit, but as a guy who thinks he's too cool for LinkedIn, I only logged in again, and saw his last message about wanting to collaborate, some three or four months after the fact.
Well, they raised $25 million a few months ago :(
[1] https://github.com/Shihab-Shahriar/scikit-clean
weak-supervision related posts
-
[Research] Detecting Annotation Errors in Semantic Segmentation Data
-
[R] Automated Quality Assurance for Object Detection Datasets
-
[Research] Detecting Errors in Numerical Data via any Regression Model
-
Detecting Errors in Numerical Data via Any Regression Model
-
cleanlab v2.5 now supports all major ML tasks (adds regression, object detection, and image segmentation)
-
Automated Data Quality at Scale
-
Enhancing Product Analytics and E-commerce Business
Index
What are some of the best open-source weak-supervision projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | cleanlab | 8,673 |
2 | snorkel | 5,712 |
3 | argilla | 3,122 |
4 | skweak | 910 |
5 | wrench | 211 |
6 | weasel | 153 |
7 | auto_annotate | 148 |
8 | zeroshot_topics | 60 |
9 | scikit-clean | 13 |