Metric | gpu-benches | amh-code
---|---|---
Mentions | 1 | 8
Stars | 158 | 624
Growth | - | -
Activity | 7.5 | 10.0
Last commit | 2 months ago | over 1 year ago
Language | Jupyter Notebook | Jupyter Notebook
License | GNU General Public License v3.0 only | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
gpu-benches
- Maxing out the device
The snippet is taken from lines 30-38 of https://github.com/te42kyfo/gpu-benches/blob/master/um-stream/main.cu
amh-code
- Ask HN: Recommendations for high quality, free CS books online
I recently stumbled on https://en.algorithmica.org/hpc/, which I absolutely loved. It's really well written, comprehensible and concise. It was a pleasure to read, which I find really rare with CS textbooks, and I feel like I've come out of it understanding how computers work a bit better.
Does anyone have any similar CS books they'd recommend? Ideally they'd be:
- Algorithms for Modern Hardware
- Ask HN: How can I learn about performance optimization?
I admire Daniel Lemire’s work on SIMD implementations. [Lemire]
[Lemire] https://lemire.me/en/#publications
I learn a lot by reading my compiler’s and profiler’s documentation.
For Rust, the Rust Performance Book by Nicholas Nethercote et al. [Nethercote] seems like a nice place to start after reading the Cargo and rustc books.
[Nethercote] https://nnethercote.github.io/perf-book/
Algorithms for Modern Hardware by Sergey Slotin [Slotin] is a dense and approachable overview.
[Slotin] https://en.algorithmica.org/hpc/
Quantitative understanding of the underlying implementations and computer architecture has been invaluable for me. Computer Architecture: A Quantitative Approach by John L. Hennessy and David A. Patterson [H&P] and Computer Organization and Design: The Hardware/Software Interface by Patterson and Hennessy [P&H ARM, P&H RISC] are the two introductory books I like best. The second book comes in three editions: ARM, MIPS and RISC-V.
[H&P] https://www.google.com/books/edition/_/cM8mDwAAQBAJ
- Algorithms for Modern Hardware – Algorithmica
- Ask HN: Programming Courses for Experienced Coders?
Hello, recently I've enjoyed Casey Muratori's Performance-Aware Programming course[0]. You could read Algorithms for Modern Hardware[1] to learn a similar set of material, though. Casey's course is aimed at bringing beginners all the way to a nearly-industry-leading understanding of performance issues, while the book assumes a bit more knowledge, and I think a lot of people have trouble getting into this stuff using a book if they don't have related experience.
I've also found Hacker's Delight Second Edition[2] to be a useful reference, and I really wish that I would get around to reading What Every Programmer Should Know About Memory[3] in full, because I end up reading a bunch of other things[4] to learn stuff that's surely in there.
[0]: https://www.computerenhance.com/p/welcome-to-the-performance...
[1]: https://en.algorithmica.org/hpc/
[2]: https://github.com/lancetw/ebook-1/blob/80eccb7f59bf102586ba...
[3]: https://people.freebsd.org/~lstewart/articles/cpumemory.pdf
[4]: https://danluu.com/3c-conflict/
- SIMD Everywhere Optimization from ARM Neon to RISC-V Vector Extensions
https://en.algorithmica.org/hpc/ and http://0x80.pl/ have some stuff about this, but the latter can be dense. I've had fun getting my hands dirty with some problems at https://highload.fun/, but there's not much direction unless you go to the Telegram chat and ask people questions.
- Fastest Branchless Binary Search
Other fast binary searches: https://github.com/sslotin/amh-code/tree/main/binsearch
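For readers unfamiliar with the term, here is a minimal sketch of what a "branchless" binary search can look like. It is not taken from either repository; the function name `branchless_lower_bound` and the choice of `int` elements are assumptions made for illustration. The idea is to replace the usual if/else on the comparison with arithmetic on its result, which compilers can often lower to a conditional move rather than a branch.

```cpp
#include <cstddef>

// Hypothetical helper (not from amh-code or sb_lower_bound).
// Returns the index of the first element >= key in the sorted array a[0..n),
// or n if every element is smaller than key.
std::size_t branchless_lower_bound(const int* a, std::size_t n, int key) {
    if (n == 0) return 0;
    const int* base = a;   // start of the current search window
    std::size_t len = n;   // size of the current search window
    while (len > 1) {
        std::size_t half = len / 2;
        // The comparison yields 0 or 1; multiplying by half moves the window
        // forward without an if/else on the data being searched.
        base += static_cast<std::size_t>(base[half - 1] < key) * half;
        len -= half;
    }
    // One element is left: step past it if it is still smaller than key.
    return static_cast<std::size_t>(base - a) + static_cast<std::size_t>(*base < key);
}
```

The loop runs a number of iterations that depends only on `n`, not on the data, which is what makes the control flow predictable. Whether the compiler actually emits branch-free code depends on the compiler and flags, so inspecting the generated assembly is the only way to be sure.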
What are some alternatives?
rankseg - [JMLR 2023] RankSEG: A consistent ranking-based framework for segmentation
sb_lower_bound - Fastest Branchless Binary Search
mlscorecheck - Testing the consistency of binary classification performance scores reported in papers
branchless-binary-search - Binary search implementation that avoids branch instructions
Nim - Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
tigerbeetle - The distributed financial transactions database designed for mission critical safety and performance.
ThinkingInSimd - An essay comparing performance implications of ignoring AVX acceleration
std-simd - std::experimental::simd for GCC [ISO/IEC TS 19570:2018]
zig - General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
OpenCV - Open Source Computer Vision Library
1brc - 1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java
Exercism - Scala Exercises - Crowd-sourced code mentorship. Practice having thoughtful conversations about code.