

-
scikit-learn-intelex
Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
There is a somewhat similar project that supports Intel GPU offloading: https://github.com/intel/scikit-learn-intelex
-
- (For those familiar with NumPy) CuPy is closer to NumPy than jax.numpy
But CuPy does not support automatic differentiation, so if you do deep learning, use JAX instead. Or PyTorch, if you do not trust Google to maintain a project for a prolonged period of time: https://killedbygoogle.com/
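The API overlap is close enough that CuPy ships `cupy.get_array_module` for writing backend-agnostic code. A minimal sketch (it falls back to NumPy when CuPy is absent, so it also runs CPU-only):

```python
import numpy as np

def xp_of(a):
    # cupy.get_array_module(a) returns cupy for device arrays and
    # numpy for host arrays; fall back to numpy if cupy is not installed.
    try:
        import cupy
        return cupy.get_array_module(a)
    except ImportError:
        return np

def normalize(a):
    xp = xp_of(a)  # numpy or cupy -- the call sites below are identical
    return a / xp.linalg.norm(a)

v = normalize(np.array([3.0, 4.0]))
print(v)  # [0.6 0.8]
```

The same `normalize` works unchanged on a `cupy.ndarray`, which is the sense in which CuPy is "closer to NumPy" than `jax.numpy` (JAX arrays are immutable and dispatch through its own tracing machinery).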
-
I'm surprised to see pytorch and Jax mentioned as alternatives but not numba : https://github.com/numba/numba
I've recently had to implement a few kernels to lower the memory footprint and runtime of some PyTorch functions: it's been really nice, because Numba kernels support type hints (as opposed to raw CuPy kernels).
-
cuda-api-wrappers
Thin C++-flavored header-only wrappers for core CUDA APIs: Runtime, Driver, NVRTC, NVTX.
> probably the easiest way to interface with custom CUDA kernels
In Python? Perhaps. In general? No, it isn't. cuda-api-wrappers gives you the full power of the CUDA APIs, including all runtime-compilation options: https://github.com/eyalroz/cuda-api-wrappers/
Example:
// the source could be a string literal, loaded from a .cu file, etc.
-
For my tasks, I had some success with algebraic multigrid solvers as preconditioner, for example from AMGCL or PyAMG. They are also reasonably easy to get started with.
https://github.com/pyamg/pyamg
https://github.com/ddemidov/amgcl
But I only have to deal with positive definite systems, so YMMV.
I am not sure whether those libraries can deal with multiple right-hand sides, but most of the complexity is in the preconditioners anyway.
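A sketch of how such a preconditioner plugs into a Krylov solve, using SciPy's CG on an SPD model problem. To keep it SciPy-only, a diagonal (Jacobi) preconditioner stands in for the AMG one; with PyAMG you would pass `pyamg.smoothed_aggregation_solver(A).aspreconditioner()` as `M` instead, everything else unchanged:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg, LinearOperator

# SPD model problem: 1-D Poisson matrix (tridiagonal, positive definite)
n = 200
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# Jacobi preconditioner as a stand-in for an AMG V-cycle; swap in
# pyamg.smoothed_aggregation_solver(A).aspreconditioner() for the real thing.
d = A.diagonal()
M = LinearOperator(A.shape, matvec=lambda r: r / d)

x, info = cg(A, b, M=M)
print(info, np.linalg.norm(A @ x - b))  # info == 0 means converged
```

The point of the AMG preconditioner is that the CG iteration count stays roughly constant as the mesh is refined, whereas with Jacobi (or none) it grows with the problem size.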
-
If you like CuPy, definitely check out the multi-node, multi-GPU version, cuNumeric: https://github.com/nv-legate/cunumeric
Would love to get any feedback from the community.