Python Statistics

Open-source Python projects categorized as Statistics

Top 23 Python Statistics Projects

  • scikit-learn

    scikit-learn: machine learning in Python

    Project mention: scikit-learn test case results? | reddit.com/r/scikit_learn | 2022-01-05

  • statsmodels

    Statsmodels: statistical modeling and econometrics in Python

    Project mention: Advice required to choose appropriate software for an assignment | reddit.com/r/econometrics | 2021-04-26

    Can't you get a student discount for Stata? R would definitely be able to handle everything. For Python, have a look through the statsmodels package: https://github.com/statsmodels/statsmodels

  • boltons

    🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.

  • Tautulli

    A Python based monitoring and tracking tool for Plex Media Server.

    Project mention: A way to notify my users what new content is on the server | reddit.com/r/PleX | 2022-01-25

  • eiten

    Statistical and Algorithmic Investing Strategies for Everyone

    Project mention: Has anyone worked on problems involving "portfolio optimization theory"? | reddit.com/r/datascience | 2021-07-29

  • pgmpy

    Python Library for learning (Structure and Parameter), inference (Probabilistic and Causal), and simulations in Bayesian Networks.

    Project mention: [D] Python toolboxes for probabilistic graphical model inference | reddit.com/r/MachineLearning | 2021-11-03

    I do know of a few promising toolboxes such as pgmpy, pymc3, and pyro, but have not used any of them (for this purpose) and am at a bit of a loss picking one to start with.

  • sweetviz

    Visualize and compare datasets, target values and associations, with one line of code.

    Project mention: Automated Data Profiling and Attribute Clustering using unsupervised ML techniques | reddit.com/r/datascience | 2021-07-03

    Take a look at this package, which computes associations between variables, produces other visualizations, and can infer some types: https://github.com/fbdesignpro/sweetviz

  • github-stats

    Better GitHub statistics images for your profile, with stats from private repos too

    Project mention: My Awesome Github Readme | dev.to | 2021-07-01

    GitHub Stats: https://github.com/jstrieb/github-stats

  • lifetimes

    Lifetime value in Python

    Project mention: Customer lifetime value | reddit.com/r/BusinessIntelligence | 2021-10-20

    If you haven't come across it, I recommend this straightforward library as a starting point: https://github.com/CamDavidsonPilon/lifetimes

  • pycm

    Multi-class confusion matrix library in Python

    Project mention: PyCM 3.4 released: Multi-class confusion matrix library in Python | reddit.com/r/coolgithubprojects | 2022-01-27

  • uncertainty-baselines

    High-quality implementations of standard and SOTA methods on a variety of tasks.

    Project mention: Google AI Introduces ‘Uncertainty Baselines Library’ For Uncertainty and Robustness in Deep Learning | reddit.com/r/artificial | 2021-10-17

    Code for https://arxiv.org/abs/2106.04015 found: https://github.com/google/uncertainty-baselines

  • sportsipy

    A free sports API written for python

    Project mention: Is there any baseball website that won't get pissed at me scraping it often? | reddit.com/r/Sabermetrics | 2021-06-29

    I have used this for my own projects: https://github.com/roclark/sportsipy

  • popmon

    Monitor the stability of a pandas or spark dataframe ⚙︎

    Project mention: Monitor the stability of a pandas or spark dataframe | news.ycombinator.com | 2021-09-15

  • choochoo

    Training Diary

  • fitter

    Fit data to many distributions

    Project mention: Analysing Time-to-Events | reddit.com/r/AskStatistics | 2021-05-21

    Never heard of this library, looks neat. Linking for posterity: https://github.com/cokelaer/fitter

  • emerge

    emerge is a source code analysis tool and dependency visualizer that can be used to gather insights about source code structure, metrics, dependencies and complexity of software projects. After scanning the source code of a project it provides you an interactive web interface to explore and analyze your project by using graph structures.

    Project mention: Is it possible to generate a flow diagram from Javascript code? | reddit.com/r/vscode | 2021-09-06

    There's no VS Code extension for it AFAIK, but it's the best (and almost only) tool that I know which can do it for JavaScript code. There's also madge and emerge, in case the first one doesn't fit your needs.

  • tokei-pie

    Render tokei's output to interactive sunburst chart.

    Project mention: Tokei-pie: render tokei's output to interactive sunburst chart | news.ycombinator.com | 2021-11-18

  • graphsignal

    Graphsignal Logger

    Project mention: [P] Model Performance Monitoring in Production | reddit.com/r/MachineLearning | 2021-11-01

    And the logger repo is https://github.com/graphsignal/graphsignal.

  • dcor

    Distance correlation and related E-statistics in Python

    Project mention: Highly used R packages with no Python equivalent | reddit.com/r/bioinformatics | 2021-08-07

    BreakoutDetection. I actually started doing something like this, and contributed a few things to the dcor library that could be used for the underlying e-statistics, but I never got around to implementing the whole thing.

  • django-ai

    Artificial Intelligence for Django

    Project mention: How to effectively incorporate ML in django? | reddit.com/r/django | 2021-11-18

    You should check https://github.com/math-a3k/django-ai and https://github.com/math-a3k/covid-ht as an integrated way of doing this

  • roc_comparison

    The fast version of DeLong's method for computing the covariance of unadjusted AUC.

    Project mention: [D]getting a good p-value and confidence interval for cross-validated AUC | reddit.com/r/MachineLearning | 2021-05-12

    https://statisticaloddsandends.wordpress.com/2020/06/07/what-is-the-delong-test-for-comparing-aucs/
    https://www.rdocumentation.org/packages/pROC/versions/1.17.0.1
    http://pamixsun.github.io/papers/sun2014fast.pdf
    https://github.com/yandexdataschool/roc_comparison

  • speedtest-to-influxdb

    Script to periodically run the Speedtest CLI application by Ookla and post results to InfluxDB. (by aidengilmartin)

    Project mention: Continuous monitoring of internet speed via the QNAP? | reddit.com/r/de_EDV | 2021-11-04

  • nhlstats

    Thin wrapper library/CLI for accessing the NHL Live API.

    Project mention: NHL goalies save percentage by goal differential | reddit.com/r/hockey | 2021-05-18

    I used the nhlstats tool to download play-by-play and shift data for the current NHL season, and then loaded it into a PostgreSQL database. 99% of the analysis was done using postgres queries, though I did use a little bit of pandas / Seaborn to create charts because of how intertwined those ecosystems are. The queries are relatively complex so I won't break them down here unless there is great demand, but I am planning to release the code to download & process the data once I have it cleaned up a bit.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-01-27.

Index

What are some of the best open-source Statistics projects in Python? This list will help you:

# Project Stars
1 scikit-learn 48,629
2 statsmodels 7,034
3 boltons 5,687
4 Tautulli 4,216
5 eiten 2,195
6 pgmpy 1,973
7 sweetviz 1,894
8 github-stats 1,216
9 lifetimes 1,214
10 pycm 1,204
11 uncertainty-baselines 782
12 sportsipy 312
13 popmon 226
14 choochoo 201
15 fitter 190
16 emerge 116
17 tokei-pie 111
18 graphsignal 102
19 dcor 86
20 django-ai 69
21 roc_comparison 68
22 speedtest-to-influxdb 41
23 nhlstats 37