gpu-benches vs Microbenchmarks
Metric | gpu-benches | Microbenchmarks
---|---|---
Mentions | 1 | 1
Stars | 158 | 87
Growth | - | -
Activity | 7.5 | 2.3
Last commit | 2 months ago | 5 months ago
Language | Jupyter Notebook | Jupyter Notebook
License | GNU General Public License v3.0 only | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
gpu-benches
- Maxing out the device
  The snippet is taken from https://github.com/te42kyfo/gpu-benches/blob/master/um-stream/main.cu, lines 30-38.
Microbenchmarks
- Romeo and Julia, Where Romeo Is Basic Statistics
> Every language I've ever seen with garbage collection has gone through decades of "now the garbage collection is better" or "just wait until the next version, garbage collection will be better".
OK, but the Go example I linked is already in production; you can use it right now. This isn't an "it will get better in two releases" situation: Go's GC as of today has sub-millisecond pause times. The Java Shenandoah example I linked is still mostly in beta, but it's also something you can use right now, though admittedly it'll probably be a while before it's in a mainline release.
> This is besides the point of performance and no longer talking about reality, it's just FUD from a "what if" future.
It's not "just FUD"; there are dozens of reported security issues that were caused by mistakes in manual memory management. Off the top of my head, Heartbleed was a famous case.
This isn't me badmouthing anyone; manual memory management is hard to get right, even for very smart people.
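For context, Heartbleed came down to echoing back as many bytes as the peer claimed to have sent, without checking that claim against what was actually received. The sketch below is a simplified C++ illustration of that missing check, not OpenSSL's actual code; `echo_payload` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical heartbeat-style echo. `received` is the data that actually
// arrived; `claimed_len` is the length field supplied by the peer.
std::optional<std::string> echo_payload(std::string_view received,
                                        std::size_t claimed_len) {
    // The bounds check: never trust the peer's length over the real size.
    // Without it, a copy of `claimed_len` bytes would read past the buffer.
    if (claimed_len > received.size()) {
        return std::nullopt;
    }
    return std::string(received.substr(0, claimed_len));
}
```

A request such as `echo_payload("bird", 500)` is rejected here, whereas an unchecked copy of 500 bytes would leak whatever happened to sit in memory after the 4-byte payload.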
> Right, but you get it by avoiding allocation and avoiding the garbage collector the same way avoiding allocation in C++ is important, but in julia it won't be woven in to the performance, it will cause big pauses.
Fair enough. I looked at the code for the official benchmarks (https://github.com/JuliaLang/Microbenchmarks/blob/master/per...) and, outside of the integer parsing code, it does indeed seem to avoid dynamic allocations, so I will concede that the benchmarks might be a bit skewed compared to real-world code.
I still have a hunch that if you compared allocation-heavy Julia to malloc+free-heavy C++, the difference wouldn't be that large, but that's just a hunch and I don't have data to back it up. It might be a fun test to write, though, so maybe I'll try that this weekend.
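A rough sketch of what the C++ half of such a test could look like. `churn_malloc_free` and `AllocResult` are hypothetical names, and the workload (a fixed pool of live blocks that is repeatedly freed and reallocated) is an assumption, not something taken from the Microbenchmarks repository.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct AllocResult {
    double seconds;          // wall-clock time spent in the churn loop
    std::uint64_t checksum;  // sum of first bytes, so the work is observable
};

// Performs `rounds` malloc calls of `block_bytes` each, keeping at most
// `live` blocks alive at once; the oldest block in a slot is freed before
// the slot is reused.
AllocResult churn_malloc_free(std::size_t rounds, std::size_t live,
                              std::size_t block_bytes) {
    assert(live > 0 && block_bytes > 0);
    std::vector<unsigned char*> slots(live, nullptr);
    std::uint64_t checksum = 0;

    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < rounds; ++i) {
        std::size_t slot = i % live;
        if (slots[slot] != nullptr) {
            checksum += slots[slot][0];  // read before freeing
            std::free(slots[slot]);
        }
        auto* p = static_cast<unsigned char*>(std::malloc(block_bytes));
        if (p == nullptr) std::abort();
        p[0] = static_cast<unsigned char>(i);  // touch the block
        slots[slot] = p;
    }
    // Release whatever is still alive at the end.
    for (unsigned char* p : slots) {
        if (p != nullptr) {
            checksum += p[0];
            std::free(p);
        }
    }
    auto stop = std::chrono::steady_clock::now();

    return {std::chrono::duration<double>(stop - start).count(), checksum};
}
```

Calling, say, `churn_malloc_free(10'000'000, 1024, 64)` and printing `seconds` gives one data point; a Julia counterpart would need to reproduce the same allocation pattern for the comparison to mean anything.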
-----
Sort of tangential, but I also think there's value in having decent concurrency constructs built into the language. With C++, if you stick to built-ins you are basically stuck with mutexes, and despite what people like to pretend, mutex-based code is really hard to get right and very easy to screw up in a non-obvious way. If you allow yourself to use libraries, then you have things like ZeroMQ and OpenMP, so realistically it's not that dire. However, I think there's value in having nice, easy-to-use concurrency constructs in the language other than mutexes, and I do wonder whether that encourages people to use multiple threads more often, because they don't have to worry as much about weird deadlock situations.
Again, I believe Rust actually does address this because of the single-owner-enforced-at-compile-time stuff, but I haven't used it enough to really draw a conclusion on it.
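To make the deadlock point concrete: two threads that lock the same pair of mutexes in opposite orders can each end up holding one and waiting on the other. As a sketch, C++17's `std::scoped_lock` acquires several mutexes with a deadlock-avoidance algorithm, which removes that particular trap while still being mutex-based. `Account`, `transfer`, and `run_opposing_transfers` are hypothetical names for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

struct Account {
    std::mutex m;
    long balance = 0;
};

// Moves `amount` between accounts. Locking `from.m` and then `to.m` with two
// separate lock_guards could deadlock when another thread transfers in the
// opposite direction; std::scoped_lock takes both without that risk.
void transfer(Account& from, Account& to, long amount) {
    if (&from == &to) return;  // never lock the same mutex twice
    std::scoped_lock lock(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}

// Runs two threads that transfer in opposite directions and returns the
// final balances of both accounts.
std::pair<long, long> run_opposing_transfers(int iterations) {
    Account a;
    Account b;
    a.balance = 1000;
    b.balance = 1000;

    std::thread t1([&] {
        for (int i = 0; i < iterations; ++i) transfer(a, b, 1);
    });
    std::thread t2([&] {
        for (int i = 0; i < iterations; ++i) transfer(b, a, 1);
    });
    t1.join();
    t2.join();

    return {a.balance, b.balance};
}
```

Both threads move the same total amount, so the balances end where they started. Note that the protection only holds if every code path that touches both accounts goes through `transfer`; nothing in the language enforces that.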
What are some alternatives?
rankseg - [JMLR 2023] RankSEG: A consistent ranking-based framework for segmentation
mlscorecheck - Testing the consistency of binary classification performance scores reported in papers