Top 5 cuDF Open-Source Projects

cudf

23 7,311 9.9 C++

cuDF - GPU DataFrame Library

Project mention: A Polars exploration into Kedro | dev.to | 2023-05-17

The interesting thing about Polars is that it does not try to be a drop-in replacement for pandas, like Dask, cuDF, or Modin, and instead has its own expressive API. Despite being a young project, it quickly got popular thanks to its easy installation process and its “lightning fast” performance.

pygraphistry

9 2,060 9.2 Python

PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer

Project mention: Graph Data Fits in Memory | news.ycombinator.com | 2024-04-15

Extra fun: We find most enterprise/gov graph analytics work only requires 1-2 attributes to go along with the graph index, and those attributes often are already numeric (time, $, ...) or can be dictionary-encoded as discussed here (categorical, ID, ...)... so even 'tough' billion scale graphs are fine on 1 gpu.
Early, but that's been the basic thinking behind our new GFQL system: slice into the columns you want, and then do all the in-GPU traversals you want. In our V1, we keep things dataframe-native, including the in-GPU data representation, and are already working on the first extensions to support switching to more graph-native indexing for steps as needed.
Ex: https://github.com/graphistry/pygraphistry/blob/master/demos...
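
The dictionary encoding mentioned in the comment above can be illustrated with a minimal sketch. This is not PyGraphistry or GFQL code; the helper `dictionary_encode` and the sample `edge_types` values are hypothetical, and the sketch uses only the Python standard library. It shows the general idea: each distinct categorical value is replaced by a small integer code, so the column stored alongside the graph index is numeric.

```python
def dictionary_encode(values):
    """Map each distinct value to an integer code, in first-seen order.

    Hypothetical helper for illustration only; not part of any library.
    """
    dictionary = {}
    codes = []
    for value in values:
        # setdefault assigns the next free code the first time a value is seen
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    # dicts keep insertion order, so position in this list equals the code
    categories = list(dictionary)
    return codes, categories


# Made-up categorical edge attribute
edge_types = ["wire", "card", "wire", "cash", "card", "wire"]
codes, categories = dictionary_encode(edge_types)

# Decoding recovers the original values from the integer codes
decoded = [categories[c] for c in codes]
```

Dataframe libraries typically offer a built-in equivalent (for example a categorical dtype), which would be the usual choice in practice; the loop here only makes the mechanism explicit.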

Optimus

0 1,446 0.6 Python

Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark (by ironmussa)

awesome-pandas-alternatives

1 29 10.0

Awesome list of alternative dataframe libraries in Python.

udsb

1 8 4.4 Jupyter Notebook

Unlimited Data-Science Benchmarks for Numeric, Tabular and Graph Workloads

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

cuDF related posts

Why we dropped Docker for Python environments

1 project | /r/dataengineering | 12 Apr 2023

[D] Can we use Ray for distributed training on vertex ai ? Can someone provide me examples for the same ? Also which dataframe libraries you guys used for training machine learning models on huge datasets (100 gb+) (because pandas can't handle huge data).

1 project | /r/MachineLearning | 9 Feb 2023

Story of my life

1 project | /r/ProgrammerHumor | 28 Nov 2022

Artificial Intelligence in Python

1 project | /r/learnpython | 30 Oct 2022

Benchmarking Pandas, CuDF, Modin, Apache Arrow and Spark on a Billion Taxi Rides dataset

2 projects | /r/Python | 21 Sep 2022

Open | RAPIDS GPU Data Science

1 project | /r/opencv | 21 Feb 2022

Index

What are some of the best open-source cuDF projects? This list will help you:

Rank	Project	Stars
1	cudf	7,311
2	pygraphistry	2,060
3	Optimus	1,446
4	awesome-pandas-alternatives	29
5	udsb	8
