Python Datascience

Open-source Python projects categorized as Datascience

Top 20 Python Datascience Projects

  1. Taipy

    Turns Data and AI algorithms into production-ready web applications in no time.

  2. modin

    Modin: Scale your Pandas workflows by changing a single line of code

  3. metaflow

    Build, Manage and Deploy AI/ML Systems

    Project mention: Metaflow: Build, Manage and Deploy AI/ML Systems | news.ycombinator.com | 2025-07-16

    Stay tuned! We have some cool new features coming soon to support agentic workloads (teaser: https://github.com/Netflix/metaflow/pull/2473)

    If you are curious, join the Metaflow Slack at http://slack.outerbounds.co and start a thread on #ask-metaflow

  4. openllmetry

    Open-source observability for your GenAI or LLM application, based on OpenTelemetry

    Project mention: What is OpenTelemetry, and why does it matter for AI agents? | dev.to | 2026-05-11

    OpenLLMetry: Pre-built instrumentations for LangChain, Anthropic, OpenAI, LlamaIndex, Ollama, Qdrant, and others. Reduces boilerplate if your agent uses popular frameworks.

  5. panel

    Panel: The powerful data exploration & web app framework for Python (by holoviz)

  6. Fast-F1

    FastF1 is a Python package for accessing and analyzing Formula 1 results, schedules, timing data and telemetry

  7. Mimesis

    Mimesis is a fast Python library for generating fake data in multiple languages.

  8. PyFunctional

    Python library for creating data pipelines with chained functional programming

  9. CleverCSV

    CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the built-in csv module with improved dialect detection, and comes with a handy command-line application for working with CSV files.

  10. streamlit-geospatial

    A multi-page Streamlit app for geospatial applications

  11. DGFraud

    A Deep Graph-based Toolbox for Fraud Detection

  12. dingo

    Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool (by MigoXLab)

    Project mention: Show HN: Dingo 1.9.0 released: With enhanced hallucination detection | news.ycombinator.com | 2025-07-31

  13. socios-brasil

    Captures data on the partners of Brazilian companies from the Receita Federal and exports it to a human-readable format

  14. Mobile-Phone-Dataset-GSMArena

    Python script for creating a mobile phone dataset from the GSMArena website.

  15. Skyulf

    Build and ship production ML pipelines faster: a pipeline library with an optional self-hosted visual layer for modular, reproducible workflows, local testing, and experiment tracking.

    Project mention: Show HN: I built a visual, MLOps tool (Skyulf) | news.ycombinator.com | 2026-01-06

  16. PathDict

    Easily query and modify Python dicts!

  17. linkedin-connections-analyzer

    LinkedIn connections analyzer

  18. TagMaps

    Spatio-Temporal Tag and Photo Location Clustering for generating Tag Maps

  19. scrape-google-play-store-app

    Single script to scrape Google Play Store App info without browser automation

  20. Machine-Learning-Cyrillic-Classifier

    This is a web app where you can draw a letter of the Russian alphabet and the ML algorithm will predict the letter that you drew.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Datascience related posts

  • Start contributing to a Popular Open Source Project

    2 projects | dev.to | 28 Jan 2025
  • Build a Stock Dashboard in less than 40 lines of Python code!🤓

    1 project | dev.to | 5 Dec 2024
  • 🤓 Top 12 Open Source Repositories to Watch in 2025 to become the ultimate developer

    1 project | dev.to | 2 Dec 2024
  • 9 Open-Source Python Tools to Build Better Data Apps in 2025

    1 project | dev.to | 18 Nov 2024
  • Python Day 9: Building Interactive Web Apps without HTML/CSS and JavaScript

    1 project | dev.to | 26 Apr 2024
  • +10 Resources to Empower Women in Technology

    1 project | dev.to | 6 Mar 2024
  • Show HN: Building data and AI apps, an alternative to Streamlit

    1 project | news.ycombinator.com | 12 Feb 2024

Index

What are some of the best open-source Datascience projects in Python? This list will help you:

# Project Stars
1 Taipy 19,237
2 modin 10,391
3 metaflow 10,129
4 openllmetry 7,193
5 panel 5,700
6 Fast-F1 5,139
7 Mimesis 4,813
8 PyFunctional 2,487
9 CleverCSV 1,326
10 streamlit-geospatial 1,020
11 DGFraud 752
12 dingo 710
13 socios-brasil 607
14 Mobile-Phone-Dataset-GSMArena 66
15 Skyulf 44
16 PathDict 26
17 linkedin-connections-analyzer 13
18 TagMaps 9
19 scrape-google-play-store-app 2
20 Machine-Learning-Cyrillic-Classifier 2

Did you know that Python is the most popular programming language based on number of references?