[Discussion] What is your go-to technique for labelling data?

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  • cleanlab

    The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

    You can save a lot of money using cleanlab: https://github.com/cleanlab/cleanlab

  • pigeonXT

    🐦 Quickly annotate data from the comfort of your Jupyter notebook

    If you want something easy that you can run from a Jupyter notebook, I would take a look at https://github.com/dennisbakhuis/pigeonXT

  • labelflow

    The open platform for image labelling

    Check labelflow.ai. It's free, the code is published, the web UI is super simple, and the images do not need to be uploaded to remote servers, so you can get started in no time. For classification, you would press the 1 key if the image contains a hot dog, otherwise the right key to go to the next image. Not gonna lie, you're going to need a bit of time for 10k images, but it's definitely doable alone on a simple use case like that. To be fully transparent, I work there! Classification features are still in beta; they will be released in 2 weeks. Happy labeling!
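
To make the cleanlab suggestion above more concrete, here is a minimal sketch of the kind of check such a tool automates: flagging examples whose given label disagrees with a classifier's confident prediction. It is loosely modeled on the confident learning idea that cleanlab builds on, but it is not cleanlab's API or implementation. The function name `flag_suspect_labels`, the per-class mean threshold, and the toy hot dog data are all assumptions made for illustration, and the probabilities are assumed to come from some classifier you have already trained (ideally out-of-sample predictions).

```python
from collections import defaultdict


def flag_suspect_labels(labels, pred_probs):
    """Return indices of examples whose given label looks doubtful.

    labels:     list of int class ids (the possibly noisy labels)
    pred_probs: list of per-class probability lists from any classifier
    (Hypothetical helper for illustration; not part of cleanlab.)
    """
    # Per-class threshold: the model's average confidence in class k,
    # computed over the examples that are currently labelled k.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for y, p in zip(labels, pred_probs):
        sums[y] += p[y]
        counts[y] += 1
    thresholds = {k: sums[k] / counts[k] for k in counts}

    suspects = []
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        # Other classes that the model predicts at or above their threshold.
        confident_others = [
            k for k in thresholds if k != y and p[k] >= thresholds[k]
        ]
        # Flag the example only if the model is unsure about the given label
        # AND confident about a different class.
        if confident_others and p[y] < thresholds[y]:
            suspects.append((p[y], i))

    # Least confident in the given label first, so review starts there.
    return [i for _, i in sorted(suspects)]


if __name__ == "__main__":
    # Toy data: class 1 = hot dog, class 0 = not hot dog.
    labels = [1, 1, 1, 0, 0, 0]
    pred_probs = [
        [0.1, 0.9],
        [0.2, 0.8],
        [0.9, 0.1],  # labelled hot dog, but the model strongly disagrees
        [0.8, 0.2],
        [0.9, 0.1],
        [0.7, 0.3],
    ]
    print(flag_suspect_labels(labels, pred_probs))
```

Only the flagged indices would then need a second look in an annotation tool, rather than the whole dataset. For real projects, the cleanlab repository linked above documents the library's own functions for this.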

