Python data-labeling

Open-source Python projects categorized as data-labeling

Top 8 Python data-labeling Projects

  • label-studio

    Label Studio is a multi-type data labeling and annotation tool with standardized output format

    Project mention: Preprocessing data for CNN tips? | reddit.com/r/deeplearning | 2023-02-10

    I’m fairly new to deep learning and learning as I go, so sorry if this is very basic, but I’m working on a model for detecting invasive coconut rhinoceros beetles destroying palm trees using drone photography. The 1080p photos I’m given were taken at 250 ft AGL and were cropped into equal-size smaller images, some containing one or more palm trees and some containing none. I’m using Label Studio to generate the XML files that point to the paths of their JPG counterparts.
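
    The XML-per-image workflow described in this mention can be sketched as follows. This is a minimal example that assumes the annotation files follow the Pascal VOC layout (a `path` or `filename` element plus one `object` element per bounding box); the sample file content, the `palm_tree` label, and the helper name `parse_voc` are hypothetical and not taken from the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, written in the Pascal VOC layout (an assumption).
SAMPLE_XML = """<annotation>
  <filename>tile_0001.jpg</filename>
  <path>images/tile_0001.jpg</path>
  <object>
    <name>palm_tree</name>
    <bndbox>
      <xmin>40</xmin><ymin>60</ymin><xmax>200</xmax><ymax>230</ymax>
    </bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return (image_path, boxes) where boxes is a list of
    (label, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    # Prefer the full path; fall back to the bare file name.
    image_path = root.findtext("path") or root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return image_path, boxes


image_path, boxes = parse_voc(SAMPLE_XML)
print(image_path, boxes)
```

    An image with no palm trees would simply have no `object` elements, so `boxes` would come back as an empty list.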

  • doccano

    Open source annotation tool for machine learning practitioners.

    Project mention: How do I connect application running in a notebook server to my local machine. | reddit.com/r/Kubeflow | 2022-12-11

    I followed the guide to set up doccano (https://github.com/doccano/doccano) in a notebook server in Kubeflow. It is running and the Django connection is established, but it fails to connect to my local machine, so when I try to open the link, it does not respond. Is there a way to connect apps running on remote notebook servers to a local machine in Kubeflow?

  • refinery

    The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.

    Project mention: [P] We are building a curated list of open source tooling for data-centric AI workflows, looking for contributions. | reddit.com/r/MachineLearning | 2023-03-03

    You definitely forgot https://www.kern.ai/ :)

  • compose

    A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels for supervised learning. (by alteryx)

    Project mention: 20+ Free Tools & Resources for Machine Learning | dev.to | 2022-03-31

    Compose targets labeling raw data, allowing you to set labeling functions for your data in Python in order to make the labeling process easier.
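
    The idea of a labeling function can be illustrated without any library. The sketch below does not use Compose's API; it only shows the general pattern of writing a plain Python function that turns a group of raw records into a label. The transaction data, the threshold, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw data: (customer_id, amount) transactions.
transactions = [
    ("a", 30.0), ("a", 90.0),
    ("b", 15.0), ("b", 20.0),
    ("c", 200.0),
]


def total_spent_over(rows, threshold=100.0):
    """Labeling function: True if the group's total amount exceeds threshold."""
    return sum(amount for _, amount in rows) > threshold


def make_labels(rows, labeling_function):
    """Group rows by their first field and apply the labeling function per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return {key: labeling_function(group) for key, group in groups.items()}


labels = make_labels(transactions, total_spent_over)
print(labels)
```

    Swapping in a different labeling function changes the prediction problem while the grouping code stays the same.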

  • bbox-visualizer

    Make drawing and labeling bounding boxes easy as cake

  • hover

    Label data at scale. Fun and precision included. (by phurwicz)

  • mutate

    A library to synthesize text datasets using Large Language Models (LLM)

  • modzy-labelstudio-sample

    Create training data labels from a production model with Modzy, Dropbox, and Label Studio

    Project mention: Data Labeling for ML Model Retraining with Label Studio | reddit.com/r/artificial | 2022-09-26

    Link to GitHub repo.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-03-03.

Index

What are some of the best open-source data-labeling projects in Python? This list will help you:

#  Project                    Stars
1  label-studio              12,336
2  doccano                    7,505
3  refinery                   1,142
4  compose                      410
5  bbox-visualizer              327
6  hover                        296
7  mutate                       142
8  modzy-labelstudio-sample      11