Python Science and Data analysis

Open-source Python projects categorized as Science and Data analysis

Top 23 Python Science and Data analysis Projects

  • pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

    Latest mention: Hacktoberfest: 69 Beginner-Friendly Projects You Can Contribute To | dev.to | 2020-09-29

    https://github.com/pandas-dev/pandas

  • numpy

    The fundamental package for scientific computing with Python.

    Latest mention: The Secret Apple M1 Coprocessor | news.ycombinator.com | 2021-01-16

    NumPy recently removed Apple's Accelerate framework as a supported BLAS implementation, reportedly because of bugs in Accelerate.

    https://github.com/numpy/numpy/pull/15759

  • networkx

    Network Analysis in Python

  • scipy

    Scipy library main repository

    Latest mention: Top Optimal Control Libraries for Python | reddit.com/r/ControlTheory | 2021-01-17

  • dask

    Parallel computing with task scheduling

  • sympy

    A computer algebra system written in pure Python

    Latest mention: Doing Symbolic Math with SymPy | news.ycombinator.com | 2021-01-08

    The trouble with SymPy is it's, well, buggy. I tried it years ago and as soon as I got serious I quite quickly ran into problems that I reported, some of which I see they apparently still haven't gotten around to addressing. [1] [2]

    Symbolic math is hard; they have my sympathies. I don't think I could do better. But as long as bugs like these exist, it's going to be hard to convince people to switch away from better tools like Mathematica.

    [1] https://github.com/sympy/sympy/issues/12561

    [2] https://github.com/sympy/sympy/issues/12562

  • numba

    NumPy aware dynamic Python compiler using LLVM

  • statsmodels

    Statsmodels: statistical modeling and econometrics in Python

    Latest mention: [C] I have an MS in Statistics - how can I get better at coding? | reddit.com/r/statistics | 2021-01-04

  • pymc3

    Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano

  • blaze

    NumPy and Pandas interface to Big Data

  • orange3

    🍊 📊 💡 Orange: Interactive data analysis

  • biopython

    Official git repository for Biopython (originally converted from CVS)

  • ipyparallel

    Interactive Parallel Computing in Python

  • cubes

    Light-weight Python OLAP framework for multi-dimensional data analysis

  • mining

    Business Intelligence (BI) in Python, OLAP

  • bcbio-nextgen

    Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis

  • neupy

    NeuPy is a TensorFlow-based Python library for prototyping and building neural networks

  • nipype

    Workflows and interfaces for neuroimaging packages

  • bcbb

    Incubator for useful bioinformatics code, primarily in Python and R

  • bubbles

    [NOT MAINTAINED] Bubbles – Python ETL framework

  • pydy

    Multibody dynamics tool kit.

  • harold

    An open-source systems and controls toolbox for Python3

  • signac

    Manage large and heterogeneous data spaces on the file system.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Science and Data analysis projects in Python? This list will help you:

Rank  Project        Stars
1     pandas         28,089
2     numpy          15,953
3     networkx        8,474
4     scipy           7,839
5     dask            7,759
6     sympy           7,740
7     numba           5,979
8     statsmodels     5,915
9     pymc3           5,480
10    blaze           2,920
11    orange3         2,568
12    biopython       2,562
13    ipyparallel     1,850
14    cubes           1,379
15    mining          1,114
16    bcbio-nextgen     795
17    neupy             662
18    nipype            542
19    bcbb              487
20    bubbles           426
21    pydy              228
22    harold            125
23    signac             64