Python Statistics

Open-source Python projects categorized as Statistics

Top 23 Python Statistics Projects

  1. scikit-learn

    scikit-learn: machine learning in Python

    Project mention: Must-Know 2025 Developer’s Roadmap and Key Programming Trends | dev.to | 2025-02-05

    Python’s Growth in Data Work and AI: Python continues to lead because of its easy-to-read style and the huge number of libraries available for tasks from data work to artificial intelligence. Tools like TensorFlow and PyTorch make it a must-have. Whether you’re experienced or just starting, Python’s clear style makes it a good choice for diving into machine learning. Actionable Tip: If you’re new to Python, try projects that combine data with everyday problems. For example, build a simple recommendation system using Pandas and scikit-learn.

  2. ydata-profiling

    1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

  3. statsmodels

    Statsmodels: statistical modeling and econometrics in Python

    Project mention: The Truth About Linear Regression | news.ycombinator.com | 2024-07-30

    statsmodels is the closest thing in python to R. statsmodels has mixed model support, but mgcv apparently requires more. It is well above my paygrade, but this seems relevant: https://github.com/statsmodels/statsmodels/issues/8029 (i.e. no out of the box support, you might be able to build an approximation on your own).

  4. imbalanced-learn

    A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

  5. boltons

    🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.

  6. Tautulli

    A Python based monitoring and tracking tool for Plex Media Server.

  7. statsforecast

    Lightning ⚡️ fast forecasting with statistical and econometric models.

  8. github-stats

    Better GitHub statistics images for your profile, with stats from private repos too

  9. sweetviz

    Visualize and compare datasets, target values and associations, with one line of code.

  10. eiten

    Statistical and Algorithmic Investing Strategies for Everyone

  11. uncertainty-baselines

    High-quality implementations of standard and SOTA methods on a variety of tasks.

  12. pycm

    Multi-class confusion matrix library in Python

  13. geomstats

    Computations and statistics on manifolds with geometric structures.

  14. causal-learn

    Causal Discovery in Python. It also includes (conditional) independence tests and score functions.

    Project mention: Survey: Integrating Large Language Models in Causal Discovery: A Statistical Causal Approach | dev.to | 2024-05-24

  15. maloja

    Self-hosted music scrobble database to create personal listening statistics and charts

  16. hierarchicalforecast

    Probabilistic Hierarchical forecasting 👑 with statistical and econometric methods.

  17. sportsipy

    A free sports API written for python

  18. popmon

    Monitor the stability of a Pandas or Spark dataframe ⚙︎

  19. meteostat-python

    Access and analyze historical weather and climate data with Python.

  20. pypinfo

    Easily view PyPI download statistics via Google's BigQuery.

  21. pytensor

    PyTensor allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

  22. fitter

    Fit data to many distributions

  23. Contributions-Importer-For-Github

    This tool helps users to import contributions to GitHub from private git repositories, or from public repositories that are not hosted in GitHub.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Statistics related posts

  • Tea Tasting: Python package for statistical analysis of A/B tests

    1 project | news.ycombinator.com | 24 Aug 2024
  • tea-tasting VS confidence - a user suggested alternative

    2 projects | 16 Aug 2024
  • The Truth About Linear Regression

    3 projects | news.ycombinator.com | 30 Jul 2024
  • Show HN: Aurora – Problem solving focused statistical and ML software toolkit

    1 project | news.ycombinator.com | 21 Jul 2024
  • How to Build a Logistic Regression Model: A Spam-filter Tutorial

    1 project | dev.to | 5 May 2024
  • Frouros: An open-source Python library for drift detection in machine learning

    1 project | news.ycombinator.com | 6 Apr 2024
  • Ask HN: How to Do a GitHub Wrapped?

    1 project | news.ycombinator.com | 19 Dec 2023

Index

What are some of the best open-source Statistics projects in Python? This list will help you:

 #  Project                             Stars
 1  scikit-learn                       61,000
 2  ydata-profiling                    12,720
 3  statsmodels                        10,422
 4  imbalanced-learn                    6,915
 5  boltons                             6,561
 6  Tautulli                            5,785
 7  statsforecast                       4,137
 8  github-stats                        3,048
 9  sweetviz                            2,980
10  eiten                               2,889
11  uncertainty-baselines               1,480
12  pycm                                1,461
13  geomstats                           1,295
14  causal-learn                        1,273
15  maloja                              1,263
16  hierarchicalforecast                  617
17  sportsipy                             505
18  popmon                                498
19  meteostat-python                      465
20  pypinfo                               427
21  pytensor                              417
22  fitter                                381
23  Contributions-Importer-For-Github     363
