Top 23 Python Science and Data Analysis Projects
-
pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Latest mention: Hacktoberfest: 69 Beginner-Friendly Projects You Can Contribute To | dev.to | 2020-09-29
https://github.com/pandas-dev/pandas
-
numpy
The fundamental package for scientific computing with Python.
NumPy recently removed support for Apple's Accelerate framework as a BLAS implementation, reportedly because of bugs in Accelerate.
-
networkx
Network Analysis in Python
-
scipy
SciPy library main repository
-
dask
Parallel computing with task scheduling
-
sympy
A computer algebra system written in pure Python
The trouble with SymPy is it's, well, buggy. I tried it years ago and as soon as I got serious I quite quickly ran into problems that I reported, some of which I see they apparently still haven't gotten around to addressing. [1] [2]
Symbolic math is hard; they have my sympathies. I don't think I could do better. But as long as bugs like these exist, it's going to be hard to convince people to switch away from better tools like Mathematica.
-
numba
NumPy-aware dynamic Python compiler using LLVM
-
statsmodels
Statsmodels: statistical modeling and econometrics in Python
Latest mention: [C] I have an MS in Statistics - how can I get better at coding? | reddit.com/r/statistics | 2021-01-04
-
pymc3
Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano
-
blaze
NumPy and Pandas interface to Big Data
-
orange3
🍊 📊 💡 Orange: Interactive data analysis
-
biopython
Official git repository for Biopython (originally converted from CVS)
-
ipyparallel
Interactive Parallel Computing in Python
-
cubes
Light-weight Python OLAP framework for multi-dimensional data analysis
-
mining
Business Intelligence (BI) in Python, OLAP
-
bcbio-nextgen
Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
-
neupy
NeuPy is a TensorFlow-based Python library for prototyping and building neural networks
-
nipype
Workflows and interfaces for neuroimaging packages
-
bcbb
Incubator for useful bioinformatics code, primarily in Python and R
-
bubbles
[NOT MAINTAINED] Bubbles – Python ETL framework
-
pydy
Multibody dynamics tool kit.
-
harold
An open-source systems and controls toolbox for Python 3
-
signac
Manage large and heterogeneous data spaces on the file system.
Index
What are some of the best open-source Science and Data analysis projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | pandas | 28,089 |
2 | numpy | 15,953 |
3 | networkx | 8,474 |
4 | scipy | 7,839 |
5 | dask | 7,759 |
6 | sympy | 7,740 |
7 | numba | 5,979 |
8 | statsmodels | 5,915 |
9 | pymc3 | 5,480 |
10 | blaze | 2,920 |
11 | orange3 | 2,568 |
12 | biopython | 2,562 |
13 | ipyparallel | 1,850 |
14 | cubes | 1,379 |
15 | mining | 1,114 |
16 | bcbio-nextgen | 795 |
17 | neupy | 662 |
18 | nipype | 542 |
19 | bcbb | 487 |
20 | bubbles | 426 |
21 | pydy | 228 |
22 | harold | 125 |
23 | signac | 64 |