Top 23 Python Statistic Projects
-
ydata-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
-
boltons
Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.
-
pgmpy
Python Library for learning (Structure and Parameter), inference (Probabilistic and Causal), and simulations in Bayesian Networks.
-
uncertainty-baselines
High-quality implementations of standard and SOTA methods on a variety of tasks.
-
hierarchicalforecast
Probabilistic hierarchical forecasting with statistical and econometric methods.
-
Contributions-Importer-For-Github
This tool helps users to import contributions to GitHub from private git repositories, or from public repositories that are not hosted in GitHub.
Project mention: AutoCodeRover resolves 22% of real-world GitHub in SWE-bench lite | news.ycombinator.com | 2024-04-09
Thank you for your interest. There are some interesting examples in the SWE-bench-lite benchmark which are resolved by AutoCodeRover:
- From sympy: https://github.com/sympy/sympy/issues/13643. AutoCodeRover's patch for it: https://github.com/nus-apr/auto-code-rover/blob/main/results...
- Another one from scikit-learn: https://github.com/scikit-learn/scikit-learn/issues/13070. AutoCodeRover's patch (https://github.com/nus-apr/auto-code-rover/blob/main/results...) modified a few lines below (compared to the developer patch) and wrote a different comment.
There are more examples in the results directory (https://github.com/nus-apr/auto-code-rover/tree/main/results).
Project mention: What's your approach to highly imbalanced data sets? | /r/datascience | 2023-05-26
There's a plethora of undersampling and oversampling methods you can try out. To avoid removing information from the dataset, you can focus on oversampling techniques. You can try imbalanced-learn or smote-variants. Given enough data, using fully synthetic data is also an option; you can check ydata-synthetic for it. Let us know how it turned out!
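As a rough illustration of the oversampling idea in the comment above, here is a minimal sketch of plain random oversampling using only the standard library. It is not the imbalanced-learn or smote-variants API; the function name `random_oversample` and the toy data are assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many samples as the largest class.

    Hypothetical helper for illustration only; libraries such as
    imbalanced-learn provide more careful implementations.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows that belong to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw (with replacement) as many extra rows as the class is short.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data: four samples of class 0, one sample of class 1.
rows = [[0.1], [0.2], [0.3], [0.4], [5.0]]
labels = [0, 0, 0, 0, 1]

new_rows, new_labels = random_oversample(rows, labels)
print(Counter(new_labels))
```

No original row is dropped, which matches the comment's point about not removing information. Because the added rows are exact duplicates, oversampling should normally be applied to the training split only.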
Project mention: Boltons is a set of over 250 BSD-licensed, pure-Python utilities | news.ycombinator.com | 2023-12-11
Project mention: I'm fine with the basics of Plex - now what can I do to really use plex to it's full potential? | /r/PleX | 2023-12-09
With Tautulli you have a better monitoring system than what Plex offers. It splits streaming history by user, and you can send notifications to many services such as Slack and email. You can even create newsletters that are sent to users based on what was added to your server.
I can't find the TimeGPT-1 model.
LICENSE Apache-2
https://github.com/Nixtla/statsforecast/blob/main/LICENSE
Mentions ARIMA, ETS, CES, and Theta modeling
I have done similar work using the GitHub APIs before. I recommend using their GraphQL explorer to develop your queries interactively. You may need to fall back on the REST API instead of the GraphQL one for certain stats.
https://docs.github.com/en/graphql/overview/explorer
You can also refer to my code here, which may already collect some of the statistics you're interested in.
https://github.com/jstrieb/github-stats/blob/master/github_s...
I predict the most annoying part of this project will be dealing with authentication. There are a handful of ways to do it, and the permissions can be finicky depending on what data you are fetching.
Best of luck!
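The advice above about GitHub's GraphQL API could look roughly like the following sketch, which only builds the authenticated request and sends it in a separate function. The endpoint `https://api.github.com/graphql` and bearer-token header are GitHub's documented interface; the helper names, the chosen fields, and the placeholder token are assumptions for illustration, and the query is best checked in the GraphQL explorer first.

```python
import json
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# Example query (an assumption for this sketch): basic statistics for
# repositories owned by a user. Verify field names in the explorer.
REPO_STATS_QUERY = """
query($login: String!) {
  user(login: $login) {
    repositories(first: 10, ownerAffiliations: OWNER) {
      totalCount
      nodes { name stargazerCount forkCount }
    }
  }
}
"""


def build_stats_request(login, token):
    """Build (but do not send) a POST request for the GraphQL endpoint."""
    payload = json.dumps(
        {"query": REPO_STATS_QUERY, "variables": {"login": login}}
    )
    return urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_stats(login, token):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(build_stats_request(login, token)) as resp:
        return json.load(resp)


# Building the request needs no network access or real token.
request = build_stats_request("octocat", "YOUR_TOKEN")
print(request.full_url, request.get_method())
```

Keeping request construction separate from sending makes the authentication part, which the comment flags as the most finicky, easy to inspect before any call is made. Which token scopes are required depends on the data being fetched.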
Project mention: PyCM 4.0 Released: Multilabel Confusion Matrix Support | /r/coolgithubprojects | 2023-06-07
Project mention: Can I get charged for having FLAC content on my server hosted at Germany? | /r/selfhosted | 2023-07-07
With that recommendation, I want to add Maloja to the mix (scrobbling, which works great with Navidrome).
Project mention: [D] When less is more in the hierarchical forecasting case. | /r/MachineLearning | 2023-07-03
Try: https://github.com/meteostat/meteostat-python
Within the discussion, I came across a GitHub action tool that fetches traffic data and stores it in a CSV file, also generating a PDF report:
Python Statistics related posts
- Frouros: An open-source Python library for drift detection in machine learning
- Ask HN: How to Do a GitHub Wrapped?
- [D] Major bug in Scikit-Learn's implementation of F-1 score
- 80% faster, 50% less memory, 0% loss of accuracy Llama finetuning
- Contraction Clustering (RASTER): A fast clustering algorithm
- Cubic Spline Interpolation
- TimeGPT-1
Index
What are some of the best open-source Statistic projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | scikit-learn | 58,046 |
2 | ydata-profiling | 12,022 |
3 | statsmodels | 9,534 |
4 | imbalanced-learn | 6,697 |
5 | boltons | 6,415 |
6 | Tautulli | 5,361 |
7 | statsforecast | 3,540 |
8 | sweetviz | 2,833 |
9 | github-stats | 2,713 |
10 | eiten | 2,655 |
11 | pgmpy | 2,612 |
12 | lifetimes | 1,433 |
13 | pycm | 1,429 |
14 | uncertainty-baselines | 1,362 |
15 | maloja | 936 |
16 | hierarchicalforecast | 512 |
17 | popmon | 485 |
18 | sportsipy | 472 |
19 | pypinfo | 394 |
20 | fitter | 353 |
21 | meteostat-python | 352 |
22 | Contributions-Importer-For-Github | 337 |
23 | github-repo-stats | 279 |