Python data-centric

Open-source Python projects categorized as data-centric

Top 4 Python data-centric Projects

  1. ludwig

    Low-code framework for building custom LLMs, neural networks, and other AI models

    Project mention: Show HN: Toolkit for LLM Fine-Tuning, Ablating and Testing | news.ycombinator.com | 2024-04-07

    This is a great project, a little bit similar to https://github.com/ludwig-ai/ludwig, but it includes testing capabilities and ablation.

    Questions regarding the LLM testing aspect: How extensive is the test coverage for LLM use cases, and what is the current state of this project area? Do you offer any guarantees, or is it considered an open-ended problem?

    Would love to see more progress in this area!

  2. Encord Active

    Open-source active learning toolkit to find failure modes in your computer vision models, prioritize data to label next, and drive data curation to improve model performance.

  3. DataCLUE

    DataCLUE: a data-centric NLP benchmark and toolkit

  4. pypely

    From local functions to cloud deployed pipelines

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python data-centric related posts

  • [R] DataCLUE: A Benchmark Suite for Data-centric NLP

    2 projects | /r/MachineLearning | 17 Nov 2021

Index

What are some of the best open-source data-centric projects in Python? This list will help you:

#  Project        Stars
1  ludwig         11,320
2  Encord Active  446
3  DataCLUE       144
4  pypely         16

