Python Statistics

Open-source Python projects categorized as Statistics

Top 23 Python Statistics Projects

  • GitHub repo scikit-learn

    scikit-learn: machine learning in Python

    Project mention: Will I be able to switch into a hardware job if my first job is in data science? | reddit.com/r/ElectricalEngineering | 2021-12-07

    I can't tell you whether you'd like data science or machine learning, but I can tell you I took a class in it last year. It was an applied ML class targeting power systems engineers. ML is extremely statistics and probability heavy. I personally found the theory to be very dry, but the application to be rather enjoyable. We used scikit-learn, which is an interesting Python package targeting academic data science and machine learning. https://scikit-learn.org/

  • GitHub repo statsmodels

    Statsmodels: statistical modeling and econometrics in Python

    Project mention: Advice required to choose appropriate software for an assignment | reddit.com/r/econometrics | 2021-04-26

    Can't you get a student discount for Stata? R would definitely be able to handle everything. For Python, have a look through the statsmodels package https://github.com/statsmodels/statsmodels

  • GitHub repo boltons

    🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.

  • GitHub repo Tautulli

    A Python based monitoring and tracking tool for Plex Media Server.

    Project mention: Tracking Movie Collection via Spreadsheet | reddit.com/r/DataHoarder | 2021-12-05

  • GitHub repo eiten

    Statistical and Algorithmic Investing Strategies for Everyone

    Project mention: Has anyone worked on problems involving "portfolio optimization theory"? | reddit.com/r/datascience | 2021-07-29

  • GitHub repo pgmpy

    Python Library for learning (Structure and Parameter) and inference (Probabilistic and Causal) in Bayesian Networks.

    Project mention: [D] Python toolboxes for probabilistic graphical model inference | reddit.com/r/MachineLearning | 2021-11-03

    I do know of a few promising toolboxes such as pgmpy, pymc3, and pyro, but have not used any of them (for this purpose) and am at a bit of a loss picking one to start with.

  • GitHub repo sweetviz

    Visualize and compare datasets, target values and associations, with one line of code.

    Project mention: Automated Data Profiling and Attribute Clustering using unsupervised ML techniques | reddit.com/r/datascience | 2021-07-03

    Take a look at this package which computes associations between variables and other viz and can infer some types https://github.com/fbdesignpro/sweetviz

  • GitHub repo lifetimes

    Lifetime value in Python

    Project mention: Customer lifetime value | reddit.com/r/BusinessIntelligence | 2021-10-20

    If you haven't come across it, I recommend this straightforward library as a starting point https://github.com/CamDavidsonPilon/lifetimes

  • GitHub repo pycm

    Multi-class confusion matrix library in Python

    Project mention: [P] PyCM 3.3 released: Comparison of Classifiers Based on Confusion Matrix | reddit.com/r/MachineLearning | 2021-10-27

  • GitHub repo github-stats

    Better GitHub statistics images for your profile, with stats from private repos too

    Project mention: My Awesome Github Readme | dev.to | 2021-07-01

    GitHub Stats: https://github.com/jstrieb/github-stats

  • GitHub repo uncertainty-baselines

    High-quality implementations of standard and SOTA methods on a variety of tasks.

    Project mention: Google AI Introduces ‘Uncertainty Baselines Library’ For Uncertainty and Robustness in Deep Learning | reddit.com/r/artificial | 2021-10-17

    Code for https://arxiv.org/abs/2106.04015 found: https://github.com/google/uncertainty-baselines

  • GitHub repo sportsipy

    A free sports API written for python

    Project mention: Is there any baseball website that won't get pissed at me scraping it often? | reddit.com/r/Sabermetrics | 2021-06-29

    I have used this for my own projects: https://github.com/roclark/sportsipy

  • GitHub repo choochoo

    Training Diary

    Project mention: Any tools for monitoring fitness without a power meter, besides Strava? | reddit.com/r/Strava | 2021-01-21

    if not, and you're competent with docker, you may be able to get choochoo running - https://github.com/andrewcooke/choochoo

  • GitHub repo popmon

    Monitor the stability of a pandas or spark dataframe ⚙︎

    Project mention: Monitor the stability of a pandas or spark dataframe | news.ycombinator.com | 2021-09-15

  • GitHub repo fitter

    Fit data to many distributions

    Project mention: Analysing Time-to-Events | reddit.com/r/AskStatistics | 2021-05-21

    Never heard of this library, looks neat. Linking for posterity: https://github.com/cokelaer/fitter

  • GitHub repo graphsignal

    Graphsignal Logger

    Project mention: [P] Model Performance Monitoring in Production | reddit.com/r/MachineLearning | 2021-11-01

    And the logger repo is https://github.com/graphsignal/graphsignal.

  • GitHub repo tokei-pie

    Render tokei's output to interactive sunburst chart.

    Project mention: Tokei-pie: render tokei's output to interactive sunburst chart | news.ycombinator.com | 2021-11-18

  • GitHub repo emerge

    emerge is a source code analysis tool and dependency visualizer that can be used to gather insights about source code structure, metrics, dependencies and complexity of software projects. After scanning the source code of a project it provides you an interactive web interface to explore and analyze your project by using graph structures.

    Project mention: Is it possible to generate a flow diagram from Javascript code? | reddit.com/r/vscode | 2021-09-06

    There's no VS Code extension for it AFAIK, but it's the best (and almost only) tool that I know which can do it for JavaScript code. There's also madge and emerge, in case the first one doesn't fit your needs.

  • GitHub repo dcor

    Distance correlation and related E-statistics in Python

    Project mention: Highly used R packages with no Python equivalent | reddit.com/r/bioinformatics | 2021-08-07

    BreakoutDetection. I actually started doing something like this, and contributed a few things to the dcor library that could be used for the underlying e-statistics, but I never got around to implementing the whole thing.

  • GitHub repo django-ai

    Artificial Intelligence for Django

    Project mention: How to effectively incorporate ML in django? | reddit.com/r/django | 2021-11-18

    You should check https://github.com/math-a3k/django-ai and https://github.com/math-a3k/covid-ht as an integrated way of doing this

  • GitHub repo roc_comparison

    The fast version of DeLong's method for computing the covariance of unadjusted AUC.

    Project mention: [D]getting a good p-value and confidence interval for cross-validated AUC | reddit.com/r/MachineLearning | 2021-05-12

    https://statisticaloddsandends.wordpress.com/2020/06/07/what-is-the-delong-test-for-comparing-aucs/ https://www.rdocumentation.org/packages/pROC/versions/1.17.0.1 http://pamixsun.github.io/papers/sun2014fast.pdf https://github.com/yandexdataschool/roc_comparison

  • GitHub repo nhlstats

    Thin wrapper library/CLI for accessing the NHL Live API.

    Project mention: NHL goalies save percentage by goal differential | reddit.com/r/hockey | 2021-05-18

    I used the nhlstats tool to download play-by-play and shift data for the current NHL season, and then loaded it into a PostgreSQL database. 99% of the analysis was done using postgres queries, though I did use a little bit of pandas / Seaborn to create charts because of how intertwined those ecosystems are. The queries are relatively complex so I won't break them down here unless there is great demand, but I am planning to release the code to download & process the data once I have it cleaned up a bit.

  • GitHub repo speedtest-to-influxdb

    Script to periodically run the Speedtest CLI application by Ookla and post results to InfluxDB. (by aidengilmartin)

    Project mention: Continuous monitoring of internet speed via the QNAP? | reddit.com/r/de_EDV | 2021-11-04

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-12-07.

Index

What are some of the best open-source Statistics projects in Python? This list will help you:

#   Project                 Stars
1   scikit-learn            48,142
2   statsmodels             6,897
3   boltons                 5,662
4   Tautulli                4,136
5   eiten                   2,081
6   pgmpy                   1,937
7   sweetviz                1,830
8   lifetimes               1,196
9   pycm                    1,186
10  github-stats            1,085
11  uncertainty-baselines   718
12  sportsipy               298
13  choochoo                201
14  popmon                  199
15  fitter                  179
16  graphsignal             101
17  tokei-pie               97
18  emerge                  95
19  dcor                    83
20  django-ai               67
21  roc_comparison          63
22  nhlstats                37
23  speedtest-to-influxdb   37