weak-supervision

Open-source projects categorized as weak-supervision

Top 9 weak-supervision Open-Source Projects

  • cleanlab

    The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

  • Project mention: [Research] Detecting Annotation Errors in Semantic Segmentation Data | /r/MachineLearning | 2023-11-05

    We have freely open-sourced our new method for improving segmentation data, published a paper on the research behind it, and released a 5-min code tutorial. You can also read more in the blog if you'd like.

  • snorkel

    A system for quickly generating training data with weak supervision

  • argilla

    Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.

  • Project mention: Open-Source Data Collection Platform for LLM Fine-Tuning and RLHF | news.ycombinator.com | 2023-06-05

    I'm Dani, CEO and co-founder of Argilla.

    Happy to answer any questions you might have and excited to hear your thoughts!

    More about Argilla

    GitHub: https://github.com/argilla-io/argilla

  • skweak

    skweak: A software toolkit for weak supervision applied to NLP tasks

  • wrench

    [NeurIPS 2021] WRENCH: Weak supeRvision bENCHmark

  • weasel

    Weakly Supervised End-to-End Learning (NeurIPS 2021) (by autonlab)

  • auto_annotate

    Labeling is boring. Use this tool to speed up your next object detection project!

  • zeroshot_topics

    Topic Inference with Zeroshot models

  • scikit-clean

    A collection of algorithms for detecting and handling label noise

  • Project mention: Ask HN: What side projects landed you a job? | news.ycombinator.com | 2023-12-03

    Among all these feel-good stories, how about one with a bit different ending?

    During my master's, I created an ML library that dealt with noise in datasets. I implemented a bunch of papers, but unlike your usual research code, I spent a long time obsessing about its API and performance, and created documentation, CI, the whole shebang [1]. But then, like average research code, I moved on and promptly forgot about it.

    One day about a year ago, the cofounder of a very new, small startup working on something similar texted me about the project on LinkedIn. We chatted for a bit, but as a guy who thinks he's too cool for LinkedIn, I next logged in and saw his last message about wanting to collaborate about 3/4 months after the fact.

    Well, they raised $25 million a few months ago :(

    [1] https://github.com/Shihab-Shahriar/scikit-clean

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

weak-supervision related posts

  • [Research] Detecting Annotation Errors in Semantic Segmentation Data

    1 project | /r/MachineLearning | 5 Nov 2023
  • [R] Automated Quality Assurance for Object Detection Datasets

    1 project | /r/computervision | 28 Sep 2023
  • [Research] Detecting Errors in Numerical Data via any Regression Model

    1 project | /r/statistics | 20 Sep 2023
  • Detecting Errors in Numerical Data via Any Regression Model

    1 project | news.ycombinator.com | 18 Sep 2023
  • cleanlab v2.5 now supports all major ML tasks (adds regression, object detection, and image segmentation)

    1 project | /r/coolgithubprojects | 17 Sep 2023
  • Automated Data Quality at Scale

    2 projects | news.ycombinator.com | 27 Jul 2023
  • Enhancing Product Analytics and E-commerce Business

    1 project | /r/ecommerce | 6 Jul 2023

Index

What are some of the best open-source weak-supervision projects? This list will help you:

Rank Project Stars
1 cleanlab 8,673
2 snorkel 5,712
3 argilla 3,122
4 skweak 910
5 wrench 211
6 weasel 153
7 auto_annotate 148
8 zeroshot_topics 60
9 scikit-clean 13
