Top 23 Python Numpy Projects
-
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
-
Project mention: What are the best Python libraries to learn for beginners? | reddit.com/r/learnpython | 2023-01-30
NumPy: A scientific computing library, and the most popular one I know of, especially in data science.
-
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Project mention: Need help with a data science project | reddit.com/r/learnmachinelearning | 2023-01-30
-
Data scientists work on phenomenally large datasets, and Dask is a handy tool for exploration within the confines of a single cloud VM or their local PCs. Location data visualization is an essential part of deciding further algorithm development and roadmap for projects. This lays the foundation for data engineering and science to work at scale, with petabytes of data.
-
I haven't tried it yet, but you can check out something like this: https://github.com/rougier/numpy-100. This is for NumPy; maybe there is something similar for pandas or matplotlib.
-
ROCm's great for data centers, but good luck finding anything about desktop GPUs on their site apart from this lone blog post: https://community.amd.com/t5/instinct-accelerators/exploring...
There's a good explanation of AMD's ROCm targets here: https://news.ycombinator.com/item?id=28200477
It's currently a PITA to get common Python libs like Numba to even talk to AMD cards (admittedly Numba won't talk to older Nvidia cards either and they deprecate ruthlessly; I had to downgrade 8 versions to get it working with a 5yo mobile workstation). YC-backed Ivy claims to be working on unifying ML frameworks in a hardware-agnostic way but I don't have enough experience to assess how well they're succeeding yet: https://lets-unify.ai
I was happy to see DiffusionBee does talk to the GPU in my late-model Intel Mac, though for some reason it only uses 50% of its power right now. I'm sure the situation will improve as Metal 3.0 and Vulkan get more established.
-
scientific-visualization-book
An open access book on scientific visualization using python and matplotlib
I had the same problem until I found this tutorial:
https://github.com/rougier/matplotlib-tutorial
If you want something deeper, the same person has written a book:
-
Project mention: mlcourse.ai: NEW Courses - star count:8584.0 | reddit.com/r/algoprojects | 2023-02-04
-
The heavy part of a backtest is the calculations, and they are done in pandas, which is partially written in C. We can also use Numba: https://numba.pydata.org/
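As a rough illustration of that point, the sketch below JIT-compiles a rolling-mean loop with Numba's `njit` decorator and falls back to plain Python when Numba is not installed. The function name `rolling_mean`, the window size, and the sample prices are hypothetical and not taken from the post.

```python
import numpy as np

try:
    from numba import njit  # compiles the decorated function to machine code
except ImportError:
    # Assumption: run uncompiled when Numba is unavailable.
    def njit(func):
        return func


@njit
def rolling_mean(prices, window):
    """Mean of each trailing `window` of prices (hypothetical helper)."""
    n = prices.shape[0]
    out = np.empty(n - window + 1)
    running = 0.0
    for i in range(n):
        running += prices[i]
        if i >= window:
            # Drop the value that just left the window.
            running -= prices[i - window]
        if i >= window - 1:
            out[i - window + 1] = running / window
    return out


prices = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(rolling_mean(prices, 3))
```

Explicit loops like this one are where a JIT compiler tends to help most; code that is already vectorized in pandas or NumPy may see little difference.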
-
Project mention: The founder of Gmail claims that ChatGPT can “kill” Google in two years. | reddit.com/r/Futurology | 2023-01-31
But a couple years later they came out with open source implementations yeah: https://github.com/google/trax/tree/master/trax/models/reformer
-
u/Spataner's answer is great. If you WANT GPU-enabled numpy functions, I would check out CuPy: https://cupy.dev/
-
Sorry, maybe someone could chime in and help, but I use Chainer to upscale. https://github.com/chainer/chainer
-
In version 1.45.0, we introduced msgspec as our serialization backend, replacing orjson. This had some immediate performance benefits, but that's not the main reason we made the switch.
-
The only thing I can think of is Orange, which has some statistics capability, but that isn't its focus.
-
Project mention: TensorFlow Datasets (TFDS): a collection of ready-to-use datasets | news.ycombinator.com | 2022-12-21
I tried Librispeech, a very common dataset for speech recognition, in both HF and TFDS.
TFDS performed extremely badly.
First, it failed because the official hosting server allows only 5 simultaneous connections; TFDS ignores that limit and makes up to 50 simultaneous downloads, which breaks. I wonder if anyone actually tested this?
Then you need to have some computer with 30GB to do the preparation, which might fail on your computer. This is where I stopped. https://github.com/tensorflow/datasets/issues/3887. It might be fixed now but it took them 8 months to respond to my issue.
On HF, it just worked. There was a smaller issue in how the dataset was split up but that is fixed now, and their response was very fast and great.
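The connection-limit problem described above can be sketched with a bounded worker pool. This is only an illustration of capping concurrency at 5, not how TFDS or the Hugging Face library actually downloads data; `fetch`, the placeholder URLs, and the sleep that stands in for network time are all hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 5  # the server limit reported in the post

active = 0  # downloads currently in flight
peak = 0    # highest number of simultaneous downloads observed
lock = threading.Lock()


def fetch(url):
    """Stand-in for a real download (hypothetical); tracks concurrency."""
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.01)  # pretend to transfer data
    with lock:
        active -= 1
    return url


# 50 placeholder shard URLs, matching the number of downloads in the post.
urls = [f"https://example.invalid/shard-{i}" for i in range(50)]

# The pool never runs more than MAX_CONNECTIONS tasks at the same time.
with ThreadPoolExecutor(max_workers=MAX_CONNECTIONS) as pool:
    results = list(pool.map(fetch, urls))

print(f"downloaded {len(results)} shards, peak concurrency {peak}")
```

All 50 tasks are queued up front, but the pool size keeps the number of open connections within the limit.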
-
Image, 3D, or data visualization applications using OpenCV and the SciPy ecosystem. The Graphics View Framework can display an image and let the user interact with it, and the Python ecosystem is very rich for image processing, data analysis, and visualization. For example, LabelMe for image labeling, PyQtGraph for scientific graphics, or custom QWidget integration in Maya.
-
PyTorch and JAX are used heavily in climate science on the ML side. For more general analytics, not so much. Many of our users like to use Xarray as a high-level API. There has been some work to integrate Xarray with PyTorch (https://github.com/pydata/xarray/issues/3232) but we're not there yet.
The Python Array API standard should help align these different back-ends: https://data-apis.org/array-api/latest/
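To make the idea of aligned back-ends concrete, here is a minimal sketch of a function written against the array API standard rather than a specific library. It assumes the input array exposes `__array_namespace__` (as recent NumPy versions do) and falls back to NumPy otherwise; `standardize` is a hypothetical helper, not part of Xarray or the standard.

```python
import numpy as np


def standardize(x):
    """Scale an array to zero mean and unit variance (hypothetical helper)."""
    # Arrays that implement the standard return their own namespace here.
    # Assumption: fall back to NumPy for arrays that lack the method.
    if hasattr(x, "__array_namespace__"):
        xp = x.__array_namespace__()
    else:
        xp = np
    # Only functions defined by the standard (mean, std) are used.
    return (x - xp.mean(x)) / xp.std(x)


data = np.asarray([1.0, 2.0, 3.0, 4.0])
scaled = standardize(data)
print(scaled)
```

In principle the same function body would accept an array from any library that implements the standard, which is the kind of alignment the comment describes.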
-
mars
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
-
Project mention: How usable is Julia for Natural Language Processing Machine learning? | reddit.com/r/Julia | 2022-10-28
-
numpyro
Probabilistic programming with NumPy powered by JAX for autograd and JIT compilation to GPU/TPU/CPU.
-
Project mention: Introspect type hints Pythonically in O(1) time. | reddit.com/r/Python | 2022-09-25
Much thanks to @tlambert03 – who also authors Napari, a fast multidimensional image viewer in Python you might also enjoy.
-
Project mention: Using Rust to speed up 3D rendering in the browser | reddit.com/r/rust | 2022-03-16
Even though it's neither Rust nor browser-based, I'm leaving this Python library here, because I was made aware of it recently: https://github.com/marcomusy/vedo
-
Python Numpy related posts
- A backtester idea
- The founder of Gmail claims that ChatGPT can “kill” Google in two years.
- What are the best Python libraries to learn for beginners?
- #01 Benchmark of four JIT Backends
- A new way to accelerate your data science workflow
- Python numpy, pandas, matplotlib
- Joining the Open Source Development Course
-
Index
What are some of the best open-source Numpy projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | data-science-ipython-notebooks | 24,571 |
2 | NumPy | 22,581 |
3 | datasets | 15,143 |
4 | Dask | 10,716 |
5 | numpy-100 | 9,754 |
6 | ivy | 8,785 |
7 | scientific-visualization-book | 8,740 |
8 | mlcourse.ai | 8,586 |
9 | Numba | 8,231 |
10 | trax | 7,319 |
11 | cupy | 6,635 |
12 | einops | 6,316 |
13 | chainer | 5,765 |
14 | orjson | 4,215 |
15 | orange | 3,919 |
16 | datasets | 3,723 |
17 | PyQtGraph | 3,104 |
18 | xarray | 2,837 |
19 | mars | 2,541 |
20 | gluon-nlp | 2,466 |
21 | numpyro | 1,642 |
22 | napari | 1,596 |
23 | vedo | 1,531 |