Top 23 Science and Data analysis Open-Source Projects
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more.
Project mention: VBA vs. Power BI | reddit.com/r/FPandA | 2021-03-01
VBA is used for writing scripts that automate some process in Excel. VBA performance is incredibly slow and, honestly, terrible. You're better off learning some programming (Python) and libraries that will allow you to manipulate, clean, and wrangle data. Look into pandas.
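As a minimal sketch of the kind of cleaning and wrangling the comment has in mind, the snippet below runs pandas on a small made-up table. The column names, the values, and the helper `clean_and_summarize` are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical ledger rows, invented for illustration only.
raw = pd.DataFrame(
    {
        "department": [" Sales", "sales", "Ops ", "Ops", None],
        "amount": ["100", "250.5", "n/a", "75", "20"],
    }
)


def clean_and_summarize(df: pd.DataFrame) -> pd.Series:
    """Normalize labels, coerce amounts to numbers, and total by department."""
    out = df.copy()
    # Strip whitespace and unify case so " Sales" and "sales" match.
    out["department"] = out["department"].str.strip().str.lower()
    # Non-numeric entries such as "n/a" become NaN instead of raising.
    out["amount"] = pd.to_numeric(out["amount"], errors="coerce")
    # Drop rows that are missing either field after cleaning.
    out = out.dropna(subset=["department", "amount"])
    return out.groupby("department")["amount"].sum()


totals = clean_and_summarize(raw)
print(totals)
```

The same steps in a spreadsheet macro would usually mean looping over cells one at a time; here each step is a single column-wide operation.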
The fundamental package for scientific computing with Python.
Project mention: Making A Synthesizer Using Python | reddit.com/r/Python | 2021-03-02
What do you mean by uploads? If you mean additional libraries besides Python, then for control input you need the midi module from pygame, and for audio output pyaudio. Other than that, numpy. You can install these using pip.
PredictionIO, a machine learning server for developers and ML engineers.
Network Analysis in Python
Project mention: [P] I made Communities: a library of clustering algorithms for network graphs (link in comments) | reddit.com/r/MachineLearning | 2021-02-22
It would be nice if communities natively supported both networkx and igraph data structures.
Parallel computing with task scheduling
Project mention: Too much data to preprocess to work with pandas — is pyspark.sql a feasible alternative? | reddit.com/r/PySpark | 2021-02-25
I haven't used it myself, I have to admit, but I think dask could fit your workflow. Spark might add a little too much overhead if you're not used to it and you're not using a distributed system, but of course it would also work.
SciPy library main repository
Project mention: I’m never updating Scipy | reddit.com/r/physicsmemes | 2021-01-26
A computer algebra system written in pure Python
Project mention: Python Math Library made in 3 Days as a 14 year-old - libmaths | reddit.com/r/Python | 2021-02-23
Now compare that to SymPy: https://github.com/sympy/sympy/blob/9e8f62e059d83178c1d8a1e19acac5473bdbf1c1/sympy/ntheory/primetest.py#L472-L634
NumPy aware dynamic Python compiler using LLVM
Project mention: I need help to speed up my program! | reddit.com/r/learnpython | 2021-03-02
The first thing I would do is write the code in a non-vectorized fashion to see where I could get rid of any unnecessary copying/allocating. Then you could rewrite the code using a more efficient sequence of vectorized operations, or you could JIT it using a library like numba.
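The approach in that comment can be sketched as follows. This is not the original poster's program: the computation (a sum of squared differences between neighbouring elements) and the function names are assumptions chosen for illustration. The loop version makes every operation explicit, the vectorized version uses NumPy, and the JIT version compiles the same loop with numba's `njit` when numba is installed.

```python
import numpy as np

try:
    # numba is optional for this sketch.
    from numba import njit
except ImportError:
    def njit(func):
        """Fallback used when numba is missing: return the function unchanged."""
        return func


def sum_sq_diff_loop(x):
    """Explicit loop: no temporary arrays are allocated."""
    total = 0.0
    for i in range(1, len(x)):
        d = x[i] - x[i - 1]
        total += d * d
    return total


def sum_sq_diff_vectorized(x):
    """Vectorized: np.diff allocates one temporary array of differences."""
    d = np.diff(x)
    return float(np.dot(d, d))


# Compile the plain loop; without numba this is just the Python loop again.
sum_sq_diff_jit = njit(sum_sq_diff_loop)

x = np.linspace(0.0, 1.0, 1001) ** 2
print(sum_sq_diff_loop(x), sum_sq_diff_vectorized(x), sum_sq_diff_jit(x))
```

Writing the loop first shows exactly which intermediate arrays the vectorized form creates, which is the copying and allocating the comment suggests looking for.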
Statsmodels: statistical modeling and econometrics in Python
Project mention: [C] I have an MS in Statistics - how can I get better at coding? | reddit.com/r/statistics | 2021-01-04
Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and more.
Project mention: Loop elimination | reddit.com/r/golang | 2021-01-04
I'm not sure what exactly you are trying to accomplish, but there are already numeric packages such as https://github.com/gonum/gonum that have asm loops for the common stuff. And there's https://github.com/mmcloughlin/avo, which makes working with assembly less painful.
BigDL: Distributed Deep Learning Framework for Apache Spark
Breeze is a numerical processing library for Scala.
Interactive and Reactive Data Science using Scala and Spark.
NumPy and Pandas interface to Big Data
Repository for the Astropy core package
🍊 📊 💡 Orange: Interactive data analysis
Project mention: Informatica per la SCIENZA, per un ignorante in materia. (Computer science for SCIENCE, for someone who knows nothing about it.) | reddit.com/r/ItalyInformatica | 2021-02-28
Official git repository for Biopython (originally converted from CVS)
Project mention: How is computer science used in biotechnology? | reddit.com/r/biotech | 2021-02-21
You probably mean genetic engineering, which also uses a lot of software tools. The latest iteration, called synthetic biology, also relies heavily on computer-assisted DNA design, cloning and modelling of gene expression networks. You may check out Biopython, the Synthetic Biology Open Language (SBOL), the GBA software, or CUBA for examples of software used in synbio.
Abstract Algebra for Scala
A well tested and comprehensive Golang statistics library package with no dependencies.
Interactive Parallel Computing in Python
A repository for plotting and visualizing data
What are some of the best open-source Science and Data analysis projects? This list will help you:
| 22 | Interactive Parallel Computing with IPython | 1,899 |