How to make CUDA libraries more performant?

triton

Mentions: 30 | Stars: 11,054 | Activity: 9.9 | Language: C++

Development repository for the Triton language and compiler

Writing your own CUDA code seems hard: as far as I can tell, each implementation has to be architecture-specific, and learning that many architectures is just not feasible. Are there any alternatives to writing CUDA that are commonly used by the community? I read about openai/triton. Is there a compiler that does this automatically, or do I have to go the long route and learn CUDA for each architecture?

maxas

Mentions: 3 | Stars: 784 | Activity: 0.0 | Language: Sass

Discontinued Assembler for NVIDIA Maxwell architecture

cuDNN is already highly optimized, but if you want to read about optimization, here you go (Maxwell-specific): https://github.com/NervanaSystems/maxas/wiki/SGEMM. There is an accompanying paper, or you can read NVIDIA CUTLASS.
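For readers who want a starting point before the material above, here is a minimal sketch of shared-memory tiling, the basic idea that hand-tuned SGEMM kernels build on. It is not the maxas or CUTLASS implementation, and it is nowhere near their performance. The kernel name `sgemm_tiled`, the tile size of 16, the square row-major layout, and the test matrix size are all assumptions made for illustration.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int TILE = 16;  // tile edge; an illustrative choice, not a tuned value

// Computes C = A * B for row-major n x n matrices.
// Each block computes one TILE x TILE tile of C, staging tiles of A and B
// in shared memory so each global value is loaded once per block per tile.
__global__ void sgemm_tiled(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int a_col = t * TILE + threadIdx.x;
        int b_row = t * TILE + threadIdx.y;
        // Out-of-range elements are padded with zeros so n need not divide by TILE.
        As[threadIdx.y][threadIdx.x] = (row < n && a_col < n) ? A[row * n + a_col] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (b_row < n && col < n) ? B[b_row * n + col] : 0.0f;
        __syncthreads();  // both tiles fully loaded before anyone reads them

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // finish reading before the next iteration overwrites
    }
    if (row < n && col < n) C[row * n + col] = acc;
}

int main() {
    const int n = 100;  // deliberately not a multiple of TILE
    const size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    std::vector<float> hA(n * n), hB(n * n), hC(n * n);
    for (int i = 0; i < n * n; ++i) {
        hA[i] = (i % 7) * 0.5f;
        hB[i] = static_cast<float>(i % 5) - 2.0f;
    }

    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    sgemm_tiled<<<grid, block>>>(dA, dB, dC, n);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        std::printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost);  // also synchronizes

    // Compare against a straightforward CPU reference.
    float max_err = 0.0f;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            float ref = 0.0f;
            for (int k = 0; k < n; ++k) ref += hA[i * n + k] * hB[k * n + j];
            max_err = std::fmax(max_err, std::fabs(ref - hC[i * n + j]));
        }
    std::printf("max abs error vs CPU reference: %g\n", max_err);

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return max_err < 1e-3f ? 0 : 1;
}
```

This should build with something like `nvcc sgemm_tiled.cu -o sgemm_tiled`. Nothing in the kernel is tied to a particular GPU generation, which is part of why it is slow next to tuned libraries: the linked write-up and CUTLASS go further with techniques such as register blocking and architecture-specific instruction scheduling.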

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

This page summarizes the projects mentioned and recommended in the original post on /r/CUDA. Post date: 10 Mar 2022