Top 5 cuDF Open-Source Projects
- pygraphistry: PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
- Optimus: Agile data preparation workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark (by ironmussa)
The interesting thing about Polars is that it does not try to be a drop-in replacement for pandas, as Dask, cuDF, and Modin do, and instead has its own expressive API. Despite being a young project, it quickly became popular thanks to its easy installation process and its “lightning fast” performance.
Extra fun: We find most enterprise/gov graph analytics work only requires 1-2 attributes to go along with the graph index, and those attributes are often already numeric (time, $, ...) or can be dictionary-encoded as discussed here (categorical, ID, ...), so even 'tough' billion-scale graphs are fine on 1 GPU.
Early, but that's been the basic thinking behind our new GFQL system: slice into the columns you want, and then do all the in-GPU traversals you want. In our V1, we keep things dataframe-native, including the in-GPU data representation, and are already working on the first extensions to support switching to more graph-native indexing for steps as needed.
Ex: https://github.com/graphistry/pygraphistry/blob/master/demos...
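The dictionary encoding mentioned in the comment above can be sketched in plain Python. This is only an illustration of the idea, not how cuDF or PyGraphistry implement it: the helper `dictionary_encode` and the sample `edge_kinds` values are hypothetical names made up for this example. The point is that a string-valued attribute becomes a compact integer column plus a small lookup table, which is what lets categorical or ID attributes sit next to a graph index cheaply.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def dictionary_encode(values: Sequence[Hashable]) -> Tuple[List[int], List[Hashable]]:
    """Replace each value with a small integer code, assigned in first-seen order.

    Hypothetical helper for illustration; returns (codes, vocabulary).
    """
    code_of: Dict[Hashable, int] = {}
    codes: List[int] = []
    for value in values:
        # The first time a value appears it gets the next unused code;
        # later occurrences reuse the code already stored.
        codes.append(code_of.setdefault(value, len(code_of)))
    # Dicts preserve insertion order, so index i of the vocabulary
    # holds the original value for code i.
    return codes, list(code_of)


# Made-up categorical edge attribute.
edge_kinds = ["wire", "card", "wire", "cash", "card"]
codes, vocab = dictionary_encode(edge_kinds)

# Decoding is a plain lookup into the vocabulary.
decoded = [vocab[c] for c in codes]
```

Here `codes` is an integer column the same length as the input and `vocab` holds each distinct value once, so repeated strings are stored only one time. Dataframe libraries typically expose the same idea through a categorical column type rather than a hand-written helper.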
cuDF related posts
- Why we dropped Docker for Python environments
- [D] Can we use Ray for distributed training on vertex ai ? Can someone provide me examples for the same ? Also which dataframe libraries you guys used for training machine learning models on huge datasets (100 gb+) (because pandas can't handle huge data).
- Story of my life
- Artificial Intelligence in Python
- Benchmarking Pandas, CuDF, Modin, Apache Arrow and Spark on a Billion Taxi Rides dataset
- Buka | Sains Data GPU RAPIDS
Index
What are some of the best open-source cuDF projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | cudf | 7,311 |
2 | pygraphistry | 2,060 |
3 | Optimus | 1,446 |
4 | awesome-pandas-alternatives | 29 |
5 | udsb | 8 |