Python Statistics

Open-source Python projects categorized as Statistics

Top 23 Python Statistics Projects

  • scikit-learn

    scikit-learn: machine learning in Python

  • Project mention: AutoCodeRover resolves 22% of real-world GitHub in SWE-bench lite | news.ycombinator.com | 2024-04-09

    Thank you for your interest. There are some interesting examples in the SWE-bench-lite benchmark which are resolved by AutoCodeRover:

    - From sympy: https://github.com/sympy/sympy/issues/13643. AutoCodeRover's patch for it: https://github.com/nus-apr/auto-code-rover/blob/main/results...

    - Another one from scikit-learn: https://github.com/scikit-learn/scikit-learn/issues/13070. AutoCodeRover's patch (https://github.com/nus-apr/auto-code-rover/blob/main/results...) modified a few lines below (compared to the developer patch) and wrote a different comment.

    There are more examples in the results directory (https://github.com/nus-apr/auto-code-rover/tree/main/results).

  • ydata-profiling

    1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

  • Project mention: FLaNK 25 December 2023 | dev.to | 2023-12-26
  • statsmodels

    Statsmodels: statistical modeling and econometrics in Python

  • Project mention: statsmodels Release Candidate 0.14.0rc0 tagged | /r/Python | 2023-04-26
  • imbalanced-learn

    A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

  • Project mention: What’s your approach to highly imbalanced data sets? | /r/datascience | 2023-05-26

    There's a plethora of undersampling and oversampling models you can try out. To avoid removing information from the dataset, you can focus on oversampling techniques. You can try imbalanced-learn or smote-variants. Given enough data, using fully synthetic data is also an option; you can check ydata-synthetic for it. Let us know how it turned out!
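As a rough illustration of the oversampling idea mentioned in the comment above, here is a minimal sketch of random oversampling using only the Python standard library. It is not imbalanced-learn's implementation; the function name `random_oversample` and the toy data are hypothetical, chosen only for this example. The sketch duplicates randomly chosen minority-class rows until every class has as many rows as the largest class.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are the same size.

    Hypothetical helper for illustration only; not part of imbalanced-learn.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to fill the gap (k is 0 for the largest class).
        extra = rng.choices(pool, k=target - n)
        out_rows.extend(extra)
        out_labels.extend([cls] * (target - n))
    return out_rows, out_labels


# Toy dataset: five rows of class 0 and a single row of class 1.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [9.0]]
labels = [0, 0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Because no rows are removed, no information is lost from the original dataset, but exact duplicates can encourage overfitting. Techniques such as SMOTE, which imbalanced-learn provides, interpolate new minority samples instead of copying existing ones.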

  • boltons

    🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.

  • Project mention: Boltons is a set of over 250 BSD-licensed, pure-Python utilities | news.ycombinator.com | 2023-12-11
  • Tautulli

    A Python based monitoring and tracking tool for Plex Media Server.

  • Project mention: I'm fine with the basics of Plex - now what can I do to really use plex to it's full potential? | /r/PleX | 2023-12-09

    With Tautulli you get a better monitoring system than what Plex offers: streaming history split by user, plus notifications to many services such as Slack and email. You can even create newsletters that are sent to users based on what was added to your server.

  • statsforecast

    Lightning ⚡️ fast forecasting with statistical and econometric models.

  • Project mention: TimeGPT-1 | news.ycombinator.com | 2023-10-13

    I can't find the TimeGPT-1 model.

    LICENSE Apache-2

    https://github.com/Nixtla/statsforecast/blob/main/LICENSE

    Mentions ARIMA, ETS, CES, and Theta modeling

  • sweetviz

    Visualize and compare datasets, target values and associations, with one line of code.

  • github-stats

    Better GitHub statistics images for your profile, with stats from private repos too

  • Project mention: Ask HN: How to Do a GitHub Wrapped? | news.ycombinator.com | 2023-12-19

    I have done similar work using the GitHub APIs before. I recommend using their GraphQL explorer to develop your queries interactively. You may need to fall back on the REST API instead of the GraphQL one for certain stats.

    https://docs.github.com/en/graphql/overview/explorer

    You can also refer to my code here, which may already collect some of the statistics you're interested in.

    https://github.com/jstrieb/github-stats/blob/master/github_s...

    I predict the most annoying part of this project will be dealing with authentication. There are a handful of ways to do it, and the permissions can be finicky depending on what data you are fetching.

    Best of luck!
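To make the advice in the comment above concrete, here is a minimal sketch of how a request to GitHub's GraphQL endpoint could be built with the Python standard library. It is not taken from the github-stats code. The helper names `build_request` and `run_query` are hypothetical, and the fields selected in the query are only an illustrative choice; the explorer linked above is the place to check which fields fit your use case. The token is assumed to be a personal access token with whatever scopes your data requires.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.github.com/graphql"

# Illustrative query: the authenticated user's login and commit contribution count.
QUERY = """
query {
  viewer {
    login
    contributionsCollection {
      totalCommitContributions
    }
  }
}
"""


def build_request(token, query=QUERY):
    """Build (but do not send) a POST request carrying the GraphQL query."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=payload,
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(token, query=QUERY):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(token, query)) as resp:
        return json.load(resp)
```

Calling `run_query(token)` with a valid token performs the network request. Keeping request construction separate from sending makes it easier to inspect headers and payload while sorting out authentication problems.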

  • eiten

    Statistical and Algorithmic Investing Strategies for Everyone

  • pgmpy

    Python Library for learning (Structure and Parameter), inference (Probabilistic and Causal), and simulations in Bayesian Networks.

  • lifetimes

    Lifetime value in Python

  • pycm

    Multi-class confusion matrix library in Python

  • Project mention: PyCM 4.0 Released: Multilabel Confusion Matrix Support | /r/coolgithubprojects | 2023-06-07
  • uncertainty-baselines

    High-quality implementations of standard and SOTA methods on a variety of tasks.

  • maloja

    Self-hosted music scrobble database to create personal listening statistics and charts

  • Project mention: Can I get charged for having FLAC content on my server hosted at Germany? | /r/selfhosted | 2023-07-07

    Along with that recommendation, I want to add Maloja to the mix (a scrobble server that works great with Navidrome).

  • hierarchicalforecast

    Probabilistic Hierarchical forecasting 👑 with statistical and econometric methods.

  • Project mention: [D] When less is more in the hierarchical forecasting case. | /r/MachineLearning | 2023-07-03
  • popmon

    Monitor the stability of a Pandas or Spark dataframe ⚙︎

  • sportsipy

    A free sports API written for python

  • pypinfo

    Easily view PyPI download statistics via Google's BigQuery.

  • fitter

    Fit data to many distributions

  • meteostat-python

    Access and analyze historical weather and climate data with Python.

  • Project mention: Historical weather data | /r/croatia | 2023-06-15

    Try: https://github.com/meteostat/meteostat-python

  • Contributions-Importer-For-Github

    This tool helps users to import contributions to GitHub from private git repositories, or from public repositories that are not hosted in GitHub.

  • github-repo-stats

    GitHub Action for advanced repository traffic analysis and reporting

  • Project mention: How I Fixed GitHub's Repo Traffic Insights 🛠️ 📊 | dev.to | 2023-12-03

    Within the discussion, I came across a GitHub action tool that fetches traffic data and stores it in a CSV file, also generating a PDF report:

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Statistics projects in Python? This list will help you:

#   Project                            Stars
1   scikit-learn                       58,046
2   ydata-profiling                    12,022
3   statsmodels                        9,534
4   imbalanced-learn                   6,697
5   boltons                            6,415
6   Tautulli                           5,361
7   statsforecast                      3,540
8   sweetviz                           2,833
9   github-stats                       2,713
10  eiten                              2,655
11  pgmpy                              2,612
12  lifetimes                          1,433
13  pycm                               1,429
14  uncertainty-baselines              1,362
15  maloja                             936
16  hierarchicalforecast               512
17  popmon                             485
18  sportsipy                          472
19  pypinfo                            394
20  fitter                             353
21  meteostat-python                   352
22  Contributions-Importer-For-Github  337
23  github-repo-stats                  279
