Cudf Alternatives
Similar projects and alternatives to cudf
-
chia-blockchain
Chia blockchain python implementation (full node, farmer, harvester, timelord, and wallet)
-
Pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
-
Apache Arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
-
DataProfiler
What's in your data? Extract schema, statistics and entities from datasets
-
annoy
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
-
Kedro
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
-
PSRayTracing
A (modern) C++ implementation of the Peter Shirley Ray Tracing mini-books (https://raytracing.github.io). Features a clean project structure, perf. improvements (compared to the original code), multi-core rendering, and more.
cudf reviews and mentions
-
A Polars exploration into Kedro
The interesting thing about Polars is that it does not try to be a drop-in replacement for pandas, like Dask, cuDF, or Modin, and instead has its own expressive API. Despite being a young project, it quickly became popular thanks to its easy installation process and its “lightning fast” performance.
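To make the "drop-in replacement" idea concrete, here is a minimal sketch of how a script might pick whichever pandas-compatible library is installed and bind it to the usual alias. The helper name `first_available` and the candidate order are assumptions for illustration, not something the quoted post prescribes; a library with its own API, such as Polars, would not fit this pattern.

```python
import importlib


def first_available(module_names):
    """Return the first module in module_names that imports successfully.

    Hypothetical helper: tries each candidate in order and skips the ones
    that are not installed.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of these modules could be imported: {module_names}")


# Intended use (assumes at least one candidate is installed):
#   pd = first_available(["cudf", "modin.pandas", "pandas"])
#   df = pd.DataFrame({"a": [1, 2, 3]})
# The rest of the script keeps using the pandas-style API through `pd`.
```

Coverage of the pandas API differs between these libraries, so a real project would still need to check the calls it relies on.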
-
Introducing TeaScript C++ Library
Yes sure, that is how OpenMP does it; but on the other hand: you seem to already do some basic type inference and build an AST, no? Then you also know the size and type of your vectors, and can execute actions in parallel if there is enough data to be worth parallelizing. Is there anyone who doesn't want their code to execute faster if it is possible? Those who work in the big data domain use threads and vectorized instructions without the user having to type in any directive; they just import a different library. For example, numpy or numpy with a CUDA backend, or similar GPU-accelerated libraries like cudf.
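The "parallelize only if there is enough data" decision from the comment above can be sketched in a few lines. This is an illustration under stated assumptions, not how TeaScript, numpy, or cudf work internally: the names `scaled`, `chunked`, and `PARALLEL_THRESHOLD` are hypothetical, and the cutoff value is arbitrary. The caller writes no directive; the function chooses the serial or the threaded path from the input size.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cutoff: below this size, scheduling overhead likely outweighs any gain.
PARALLEL_THRESHOLD = 10_000


def chunked(values, n_chunks):
    """Split values into at most n_chunks contiguous slices, preserving order."""
    size = -(-len(values) // n_chunks)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def scaled(values, factor, workers=4, threshold=PARALLEL_THRESHOLD):
    """Multiply every element by factor; use worker threads only for large inputs."""
    if not values or len(values) < threshold:
        # Small input: a plain loop is simpler and avoids thread overhead.
        return [v * factor for v in values]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields chunk results in submission order, so order is preserved.
        parts = list(pool.map(lambda chunk: [v * factor for v in chunk],
                              chunked(values, workers)))
    return [v for part in parts for v in part]
```

In CPython, threads running pure-Python arithmetic are serialized by the GIL, so this sketch shows only the dispatch decision; the libraries named in the comment get their speed from native or GPU code behind the same kind of interface.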
-
[D] [R] Large-scale clustering
try https://rapids.ai/
-
[P] Looking for state of the art clustering algorithms
As a companion to the other comments, I'd like to mention that the RAPIDS library cuML provides GPU-accelerated versions of quite a few of the algorithms mentioned in this thread (HDBSCAN, UMAP, SVM, PCA, {Exact, Approximate} Nearest Neighbors, DBSCAN, KMeans, etc.).
-
Integrating multiple point clouds?
-
Dask – a flexible library for parallel computing in Python
You can probably use https://github.com/rapidsai/cudf/tree/main/python/dask_cudf, a Dask wrapper around cuDF.
-
An Engineer's View of Venture Capitalists (2011)
-
Notes from the Meeting on Python GIL Removal Between Python Core and Sam Gross
https://news.ycombinator.com/item?id=18040664
Today, conda-forge compiles CPython to relocatable platform+architecture-specific binaries with LLVM. https://github.com/conda-forge/python-feedstock/blob/master/...
Pyodide (JupyterLite) compiles CPython to WASM (or LLVM IR?) with LLVM/emscripten IIRC. Hopefully there's a clear way to implement the new GIL-less multithreading support with Web Workers in WASM, too?
The https://rapids.ai/ org has a bunch of fast Python for HPC; pair it with Dask and pick a scheduler. With the new GIL removal approach, less process overhead and less need for interprocess locking of memory handles that cross contexts would be even faster than the debuggable one-process-per-core Python.
-
New pipelined multi-threaded plotter implementation (work in progress)
Can you describe what will be needed in terms of GPU hardware? I acquired some stuff while messing with rapids.ai, but it's such a pain to support that I gave up. Would be great if an OpenCL enhancement for Chia appears.
-
Unifying the CUDA Python Ecosystem
that project might be abandoned but this strategy is used in nvidia and nvidia adjacent projects (through llvm):
https://github.com/rapidsai/cudf/blob/branch-0.20/python/cud...
https://github.com/gmarkall/numba/blob/master/numba/cuda/com...
>but we also need high level expressibility that doesn't require writing kernels in C
the above are possible because C is actually just a frontend to PTX
https://docs.nvidia.com/cuda/parallel-thread-execution/index...
fundamentally you are never going to have a way to write cuda kernels without thinking about cuda architecture, any more than you'll ever be able to write async code without thinking about concurrency.
-
Stats
rapidsai/cudf is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.
The primary programming language of cudf is C++.