Python Datascience

Open-source Python projects categorized as Datascience

Top 21 Python Datascience Projects

  • ludwig

    Low-code framework for building custom LLMs, neural networks, and other AI models

  • Project mention: Show HN: Toolkit for LLM Fine-Tuning, Ablating and Testing | news.ycombinator.com | 2024-04-07

    This is a great project, a little bit similar to https://github.com/ludwig-ai/ludwig, but it includes testing capabilities and ablation.

    Questions regarding the LLM testing aspect: How extensive is the test coverage for LLM use cases, and what is the current state of this project area? Do you offer any guarantees, or is it considered an open-ended problem?

    Would love to see more progress in this area!

  • modin

    Modin: Scale your Pandas workflows by changing a single line of code

  • Project mention: The Distributed Tensor Algebra Compiler (2022) | news.ycombinator.com | 2023-06-15
  • Taipy

    Turns Data and AI algorithms into production-ready web applications in no time.

  • Project mention: +10 Resources to Empower Women in Technology | dev.to | 2024-03-06

    I’ve been working in tech for more than five years. I started as a Data Scientist, and now I’m exploring and loving the DevRel 🥑 role for Taipy. Needless to say, evolving in the tech scene has been a ride full of ups, downs, and everything in between.

  • metaflow

    🚀 Build and manage real-life ML, AI, and data science projects with ease!

  • Project mention: FLaNK Stack 05 Feb 2024 | dev.to | 2024-02-05
  • Mimesis

    Mimesis is a powerful Python library that empowers developers to generate massive amounts of synthetic data efficiently.

  • panel

    Panel: The powerful data exploration & web app framework for Python (by holoviz)

  • Project mention: This Week In Python | dev.to | 2024-04-12

    panel – data exploration & web app framework for Python

  • PyFunctional

    Python library for creating data pipelines with chain functional programming

  • Project mention: Python: Uncovering the Overlooked Core Functionalities | news.ycombinator.com | 2023-07-24

    If you actually think this code is better there's a real library that does this: https://github.com/EntilZha/PyFunctional.

  • Fast-F1

    FastF1 is a Python package for accessing and analyzing Formula 1 results, schedules, timing data, and telemetry

  • Project mention: Consume Live Timing/Telemetry From API During Race | /r/F1Technical | 2023-05-28

    F1 broadcasts their live timing via the SignalR protocol. The endpoint itself is unauthenticated. You can look at FastF1’s implementation of the SignalR client, and the endpoints it connects to, in the code documentation for the FastF1 SignalR client.

  • CleverCSV

    CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.

  • openllmetry

    Open-source observability for your LLM application, based on OpenTelemetry

  • Project mention: Show HN: You don't need to adopt new tools for LLM observability | news.ycombinator.com | 2024-02-14

    So why should it be different when the app you're building happened to be using LLMs?

    So today we're open-sourcing OpenLLMetry-JS. It's an open protocol and SDK, based on OpenTelemetry, that provides traces and metrics for LLM JS/TS applications and can be connected to any of the 15+ tools that already support OpenTelemetry. Here's the repo: https://github.com/traceloop/openllmetry-js

    A few months ago we launched the Python flavor here (https://news.ycombinator.com/item?id=37843907) and we've now built a compatible one for Node.js.

    Would love to hear your thoughts and opinions!

    Check it out -

    Docs: https://www.traceloop.com/docs/openllmetry/getting-started-t...

    Github:

  • streamlit-geospatial

    A multi-page streamlit app for geospatial

  • Project mention: how i can create a timelapse of a specfic region | /r/remotesensing | 2023-07-05
  • DGFraud

    A Deep Graph-based Toolbox for Fraud Detection

  • socios-brasil

    Captures data on the partners of Brazilian companies from the Receita Federal (Brazil's federal revenue service) and exports it to a human-readable format

  • Mobile-Phone-Dataset-GSMArena

    Python script for building a mobile phone dataset from the GSMArena website.

  • gretel-python-client

    The Gretel Python Client allows you to interact with the Gretel REST API.

  • PathDict

    Easily query and modify Python dicts!

  • linkedin-connections-analyzer

    LinkedIn connections analyzer

  • TagMaps

    Spatio-Temporal Tag and Photo Location Clustering for generating Tag Maps

  • scrape-google-play-store-app

    Single script to scrape Google Play Store App info without browser automation

  • Machine-Learning-Cyrillic-Classifier

    This is a web app where you can draw a letter of the Russian alphabet and the ML algorithm will predict which letter you drew.

  • OLX-Analytics

    🔍 This project allows easy and efficient browsing of classifieds on the OLX portal. Users can register for a subscription and receive the latest listings from the category that interests them every 4 hours.

  • Project mention: [Python] Project ideas for every level of advancement | dev.to | 2023-10-27

    Stack: Python, Flask, HTML, CSS, Bootstrap, Docker, SQLite, APScheduler

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
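
The CleverCSV entry above describes the package as a drop-in replacement for the built-in csv module with improved dialect detection. For context, the following is a minimal sketch of what dialect detection looks like with the standard library alone, which is the baseline CleverCSV says it improves on. It does not use CleverCSV itself, and the sample text and the candidate delimiter list are assumptions made up for illustration.

```python
import csv
import io

# Hypothetical sample: a small semicolon-delimited file held in memory.
sample = "name;stars\nludwig;10801\nmodin;9465\n"

# Ask the standard-library sniffer to guess the dialect.
# The candidate delimiters passed here are an assumption for this sketch.
dialect = csv.Sniffer().sniff(sample, delimiters=";,\t")

# Parse the same text using the detected dialect.
rows = list(csv.reader(io.StringIO(sample), dialect))

print("detected delimiter:", repr(dialect.delimiter))
for row in rows:
    print(row)
```

The standard-library sniffer relies on simple frequency heuristics and can misjudge messy files, which is the kind of input the CleverCSV description refers to.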

Index

What are some of the best open-source Datascience projects in Python? This list will help you:

Rank Project Stars
1 ludwig 10,801
2 modin 9,465
3 Taipy 8,371
4 metaflow 7,586
5 Mimesis 4,304
6 panel 4,192
7 PyFunctional 2,332
8 Fast-F1 2,178
9 CleverCSV 1,213
10 openllmetry 1,224
11 streamlit-geospatial 800
12 DGFraud 655
13 socios-brasil 547
14 Mobile-Phone-Dataset-GSMArena 57
15 gretel-python-client 43
16 PathDict 23
17 linkedin-connections-analyzer 12
18 TagMaps 6
19 scrape-google-play-store-app 2
20 Machine-Learning-Cyrillic-Classifier 1
21 OLX-Analytics 1
