Top 23 Statistic Open-Source Projects
-
7. Scikit-learn - Machine Learning
-
Probabilistic-Programming-and-Bayesian-Methods-for-Hackers
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
-
Project mention: Your Guide To Using Open Source Software as an Indie Developer | dev.to | 2025-05-25
There was a time when open source software meant “functional, but clunky.” That’s changed. Tools like Plausible (analytics), N8N (automation), Umami (web stats), and Vaultwarden (password manager) are beautifully built, stable, and powerful. Many match or even beat their commercial alternatives.
-
Plausible Analytics
Simple, open source, lightweight and privacy-friendly web analytics alternative to Google Analytics.
-
excelize
Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets
-
ydata-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
WhatTheDuck does SQL with duckdb-wasm IIRC
Pygwalker does open-source descriptive statistics and charts from pandas dataframes: https://github.com/Kanaries/pygwalker
ydata-profiling does Exploratory Data Analysis (EDA) with Pandas and Spark DataFrames and integrates with various apps: https://github.com/ydataai/ydata-profiling
-
-
statsmodels is the closest thing in python to R. statsmodels has mixed model support, but mgcv apparently requires more. It is well above my paygrade, but this seems relevant: https://github.com/statsmodels/statsmodels/issues/8029 (i.e. no out of the box support, you might be able to build an approximation on your own).
-
miller
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Project mention: XAN: A Modern CSV-Centric Data Manipulation Toolkit for the Terminal | news.ycombinator.com | 2025-03-27
I recently came across https://github.com/johnkerl/miller. I don't know how these tools compare.
-
gonum
Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and more
-
scc
Sloc, Cloc and Code: scc is a very fast accurate code counter with complexity calculations and COCOMO estimates written in pure Go
Related: https://github.com/boyter/scc, which can also separately count code that is generated (based on keywords in the files).
This is useful in cases where API layers (think protobuf -> target lang) are generated by a compiler, and you want to know how much code is manually created.
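To make the idea concrete, here is a minimal sketch in Python of separating generated files from hand-written ones by looking for marker keywords near the top of each file. This is not scc's implementation; the marker strings, the `is_generated` and `count_lines` helpers, and the number of lines inspected are all assumptions chosen for illustration.

```python
from pathlib import Path

# Hypothetical markers; real code generators use varying header comments.
GENERATED_MARKERS = ("code generated by", "do not edit", "@generated")


def is_generated(text, head_lines=5):
    """Return True if any marker appears in the first few lines of `text`."""
    head = "\n".join(text.splitlines()[:head_lines]).lower()
    return any(marker in head for marker in GENERATED_MARKERS)


def count_lines(paths):
    """Tally non-blank lines separately for generated and hand-written files."""
    totals = {"generated": 0, "manual": 0}
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        lines = sum(1 for line in text.splitlines() if line.strip())
        key = "generated" if is_generated(text) else "manual"
        totals[key] += lines
    return totals
```

Calling `count_lines` on a list of source paths returns a dictionary with a `generated` and a `manual` total, which is roughly the split the comment describes.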
-
-
boltons
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.
-
I first tried to use growthbook. They had only react support. I thought - I could use the js sdk and work around it. Ok fine. It seemed a bit complicated to use in terms of their UI. Okay fine, I try to find an easier one maybe I can self-host. That way I could even put it behind cloudflare CDN and use caching on it and clever cache-busting when I change values could help propagate changes. Okay fine I have a plan. I ended up going with Flagsmith instead. It was even easier. Perfect.
-
git-quick-stats
▁▅▆▃▅ Git quick statistics is a simple and efficient way to access various statistics in git repository.
-
Kotlin can use any Java library, giving you access to powerful machine learning frameworks like DeepLearning4J, Smile, and Weka.
-
-
evidence
Business intelligence as code: build fast, interactive data visualizations in SQL and markdown
Project mention: Data viz library built with Apache ECharts, Leaflet, and shadcn | news.ycombinator.com | 2025-04-12
It would be better to link to the main page, https://evidence.dev/, which is titled "Evidence - Business intelligence as code".
-
We hope that you'll join us in our mission to advance cutting-edge scientific computation in JavaScript. Start by showing your support and starring the project on GitHub today: https://github.com/stdlib-js/stdlib.
-
Regarding the 'sudo' issue: Doing a benchmark by just running an example executable is not really recommended because there's a ton of reasons why you might get differing performance.
It's probably better to set up an actual benchmark using a crate like Criterion instead [0].
[0] https://github.com/bheisler/criterion.rs
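The point about single runs being noisy holds in any language. As a rough illustration, here is a minimal sketch in Python using only the standard library's `timeit` and `statistics` modules. It is not Criterion, which is a Rust crate with its own API; the `workload` function and the repeat counts are hypothetical placeholders.

```python
import statistics
import timeit


def workload():
    """Hypothetical stand-in for the code under test."""
    return sum(i * i for i in range(10_000))


def benchmark(func, repeats=7, number=50):
    """Time `func` in several independent batches and summarize the spread.

    A single run mixes the cost of the code with cache warm-up, scheduler
    noise, and similar effects; repeating the measurement makes that noise
    visible instead of hiding it in one number.
    """
    # timeit.repeat returns one total time per batch of `number` calls.
    totals = timeit.repeat(func, repeat=repeats, number=number)
    per_call = [total / number for total in totals]
    return {
        "min": min(per_call),
        "median": statistics.median(per_call),
        "stdev": statistics.stdev(per_call),
    }


if __name__ == "__main__":
    for name, seconds in benchmark(workload).items():
        print(f"{name}: {seconds * 1e6:.1f} microseconds per call")
```

Reporting the minimum and median alongside the standard deviation shows how much the timings vary between batches, which a one-off run of an example executable cannot reveal.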
-
statsforecast – Forecasting with statistical and econometric models
-
-
Statistics discussion
Statistics related posts
-
Show HN: Simplest Git Statistics in CLI
-
Your Guide To Using Open Source Software as an Indie Developer
-
Ruby 3.5 Feature: Namespace on read
-
Ask HN: Freelancer? Seeking freelancer? (May 2025)
-
Lovely Tensors: Tensors, for human consumption
-
Plausible 3.0.0
-
Statistics for Strava, a self-hosted service for better stats
-
A note from our sponsor - SaaSHub
www.saashub.com | 19 Jun 2025
Index
What are some of the best open-source Statistic projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | scikit-learn | 62,340 |
2 | Probabilistic-Programming-and-Bayesian-Methods-for-Hackers | 27,488 |
3 | Umami | 26,836 |
4 | Plausible Analytics | 22,693 |
5 | excelize | 19,279 |
6 | ydata-profiling | 12,975 |
7 | tokei | 12,623 |
8 | statsmodels | 10,732 |
9 | miller | 9,321 |
10 | gonum | 8,042 |
11 | scc | 7,414 |
12 | imbalanced-learn | 6,999 |
13 | boltons | 6,633 |
14 | growthbook | 6,638 |
15 | git-quick-stats | 6,556 |
16 | Smile | 6,194 |
17 | Tautulli | 5,992 |
18 | evidence | 5,293 |
19 | stdlib | 5,215 |
20 | criterion.rs | 5,076 |
21 | statsforecast | 4,408 |
22 | datascience | 4,403 |
23 | probability | 4,341 |