Top 23 Python Science and Data analysis Projects
-
Pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Feature transformations should be deterministic: The same input should produce the same output when the same feature definition and configuration are applied. This is what allows training, backtesting, and live inference to remain aligned. Tools such as Pandas, Spark, or feature platforms such as Feast can be used to implement that logic.
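The determinism point above can be sketched with Pandas. This is a minimal illustration, not a reference implementation: the function name `build_features`, the column names `user_id`, `ts`, and `amount`, and the rolling-mean feature are all hypothetical, and the sketch assumes each (`user_id`, `ts`) pair is unique so that sorting gives a single well-defined row order.

```python
import pandas as pd


def build_features(raw: pd.DataFrame, window: int = 2) -> pd.DataFrame:
    """Hypothetical feature definition: output depends only on `raw` and `window`."""
    # Sort on explicit keys so the result does not depend on input row order.
    # sort_values returns a new frame, so the caller's data is not mutated.
    df = raw.sort_values(["user_id", "ts"]).reset_index(drop=True)
    # Per-user rolling mean over the last `window` rows (no clocks, no randomness).
    df["amount_roll_mean"] = df.groupby("user_id")["amount"].transform(
        lambda s: s.rolling(window, min_periods=1).mean()
    )
    return df[["user_id", "ts", "amount_roll_mean"]]


raw = pd.DataFrame(
    {
        "user_id": [1, 1, 2, 1, 2],
        "ts": [3, 1, 1, 2, 2],
        "amount": [30.0, 10.0, 5.0, 20.0, 15.0],
    }
)
# The same rows in a different order, as might happen between training and serving.
shuffled = raw.sample(frac=1.0, random_state=0)

features_a = build_features(raw)
features_b = build_features(shuffled)
print(features_a.equals(features_b))
```

Keeping the transformation a pure function of its inputs and configuration is what makes it reasonable to reuse the same definition for training, backtesting, and live inference.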
-
NumPy
-
Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26
NetworkX - networkx.org/
-
Project mention: Uv is fantastic, but its package management UX is a mess | news.ycombinator.com | 2026-05-21
SciPy maintainer here. The main issue with the wheels was SciPy's Fortran 77 code throwing wrenches into the mix. With C/C++, self-compilation should be quite straightforward. We (all Scientific Python packages) really worked hard on that.
From version 1.19 of SciPy there will be no need for Fortran compilers (because we translated everything to C, https://github.com/scipy/scipy/issues/18566), and then everything becomes much easier on all platforms thanks to the wide availability of C compilers. Together with the Stable API developments in CPython, the wheel clash issues will "hopefully" decrease gradually.
-
It looks like Narwhals; "Narwhals and scikit-Lego came together to achieve dataframe-agnosticism" https://news.ycombinator.com/item?id=40950813 :
> Narwhals: https://narwhals-dev.github.io/narwhals/ :
>> Extremely lightweight compatibility layer between [pandas, Polars, cuDF, Modin]
> Lancedb/lance works with [Pandas, DuckDB, Polars, Pyarrow]; https://github.com/lancedb/lance
SymPy has solvers for ODEs and PDEs and convex optimization. SymPy also has lambdify to compile from a relatively slow symbolic expression tree to faster 'vectorized' functions.
From https://news.ycombinator.com/item?id=40683777 re: warp :
> sympy.utilities.lambdify.lambdify() https://github.com/sympy/sympy/blob/main/sympy/utilities/lam... :
>>> """Convert a SymPy expression into a function that allows for fast numeric evaluation""" [with e.g. the CPython math module, mpmath, NumPy, SciPy, CuPy, JAX, TensorFlow, PyTorch (*), SymPy, numexpr, but not yet cmath]
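As a rough illustration of the lambdify workflow quoted above, the sketch below builds a small symbolic expression and converts it into a NumPy-backed function. The expression and the variable names are arbitrary choices for the example, not taken from the original comment.

```python
import numpy as np
import sympy as sp

x = sp.symbols("x")
# A symbolic expression tree; evaluating it point by point with subs() is slow.
expr = sp.sin(x) * sp.exp(-x**2)

# lambdify generates a plain Python function that calls NumPy routines,
# so it accepts whole arrays at once.
f = sp.lambdify(x, expr, modules="numpy")

xs = np.linspace(0.0, 1.0, 5)
ys = f(xs)

# Cross-check against slow symbolic evaluation, one point at a time.
ys_symbolic = np.array([float(expr.subs(x, v)) for v in xs])
print(np.allclose(ys, ys_symbolic))
```

The `modules` argument selects the numeric backend. "numpy" is used here; the quoted docstring lists the other backends lambdify can target.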
-
Project mention: Python JIT project was asked to pause development | news.ycombinator.com | 2026-06-06
Also you can use projects like numba https://numba.pydata.org/
-
Project mention: Hierarchical Bayesian Regression with PyMC: When Groups Share Strength | dev.to | 2026-04-26
By the end of this post, you'll build a hierarchical Bayesian regression model in PyMC, compare it against pooled and unpooled alternatives, and see shrinkage in action on synthetic insurance data.
-
Project mention: Orange: No-code data mining, visualization and machine learning toolbox | news.ycombinator.com | 2025-10-22
-
fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
-
bcbio-nextgen
Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
Python Science and Data analysis discussion
Python Science and Data analysis related posts
-
Python JIT project was asked to pause development
-
MLOps Lifecycle: Stages, Workflow, and Best Practices
-
What Training Exists for Security Professionals Learning AI and Data Science?
-
Uv is fantastic, but its package management UX is a mess
-
16 Python Libraries You Should Know
-
Best AI Cybersecurity Training for Security Teams: How to Pick
-
Introduction to Python for Data Analysis: A Beginner’s Guide
Index
What are some of the best open-source Science and Data analysis projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | Pandas | 48,955 |
| 2 | NumPy | 32,155 |
| 3 | NetworkX | 16,984 |
| 4 | pygwalker | 15,833 |
| 5 | SciPy | 14,744 |
| 6 | SymPy | 14,665 |
| 7 | Dask | 13,848 |
| 8 | statsmodels | 11,455 |
| 9 | Numba | 11,042 |
| 10 | PyMC | 9,630 |
| 11 | orange | 5,629 |
| 12 | astropy | 5,178 |
| 13 | Biopython | 5,064 |
| 14 | statsforecast | 4,806 |
| 15 | blaze | 3,195 |
| 16 | fugue | 2,165 |
| 17 | Cubes | 1,480 |
| 18 | bcbio-nextgen | 1,030 |
| 19 | NIPY | 826 |
| 20 | Neupy | 736 |
| 21 | bccb | 644 |
| 22 | Bubbles | 460 |
| 23 | PyDy | 411 |