Metric | hotspot | rustc-perf
---|---|---
Mentions | 16 | 26
Stars | 3,874 | 592
Growth | 1.4% | 0.0%
Activity | 9.3 | 9.6
Last commit | 3 days ago | 6 days ago
Language | C++ | Rust
License | GNU General Public License v3.0 or later | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
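As a rough illustration of the growth figure, month-over-month growth can be read as the percentage change in stars between two monthly snapshots. The exact formula the site uses is not stated, so the function below is an assumption, and the snapshot values are hypothetical rather than real data for either project.

```python
def month_over_month_growth(stars_prev: int, stars_now: int) -> float:
    """Return the percentage change in stars between two monthly snapshots.

    Assumed formula: (now - previous) / previous * 100.
    """
    if stars_prev <= 0:
        raise ValueError("previous star count must be positive")
    return (stars_now - stars_prev) / stars_prev * 100.0


# Hypothetical snapshots, chosen only to illustrate the arithmetic.
growth = month_over_month_growth(1000, 1014)
print(f"{growth:.1f}%")
```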
hotspot
- Hotspot: A GUI for the Linux perf profiler
-
What is your favourite profiling tool for C++?
perf with Hotspot 👌
-
Profiling C code on an M1 mac
If you’re able to use perf on Linux, I would recommend hotspot for visualizing the results.
-
What is the problem with transfer speeds within Dolphin?
I can recommend you using the https://github.com/KDAB/hotspot/ tool whenever you want to study performance.
-
Data-driven performance optimization with Rust and Miri
Every Linux C/C++/Rust developer should know about https://github.com/KDAB/hotspot. It's convenient and fast. I use it for Rust all the time, and it provides all of these features on the back of regular old `perf`.
-
How to interpret a flamegraph?
Flamegraphs alone aren't a full picture of what your application is doing, but they can give you hints as to where to look. Another tool I often use is Hotspot, which can open the perf.data file and provides more options for filtering and digging into the gathered data beyond the single flamegraph.
-
Twenty Years of Valgrind
Ignore the command, it's just a placeholder to get meaningful values. The -d flag adds basic cache events, by adding another -d you also get load and load miss events for the dTLB, iTLB and L1i cache.
But as mentioned, you can instrument any event supported by your system, including very obscure events such as uops_executed.cycles_ge_2_uops_exec (Cycles where at least 2 uops were executed per-thread) or frontend_retired.latency_ge_2_bubbles_ge_2 (Retired instructions that are fetched after an interval where the front-end had at least 2 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall).
You can also record data using perf-record(1) and inspect it using perf-report(1) or - my personal favorite - the Hotspot tool (https://github.com/KDAB/hotspot).
Sorry for hijacking the discussion a little, but I think perf is an awesome little tool and not as widely known as it should be. IMO, when using it as a profiler (perf-record), it is vastly superior to any language-specific built-in profiler. Unfortunately some languages (such as Python or Haskell) are not a good fit for profiling using perf instrumentation as their stack frame model does not quite map to the C model.
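The perf workflow described in this comment (counting events with repeated -d flags, then recording and reporting) can be sketched as a few command builders. This is only an illustration: the helper names and the target program `./my_app` are hypothetical, and the sketch prints the command lines instead of running them, so it works on machines without perf installed.

```python
import shlex


def perf_stat_cmd(target, detail_level=1):
    """Build a `perf stat` command line.

    Each extra -d requests more detailed events; perf accepts up to three.
    """
    if not 0 <= detail_level <= 3:
        raise ValueError("detail_level must be between 0 and 3")
    return ["perf", "stat"] + ["-d"] * detail_level + ["--"] + list(target)


def perf_record_cmd(target, output="perf.data"):
    """Build a `perf record` command that samples call graphs (-g)."""
    return ["perf", "record", "-g", "-o", output, "--"] + list(target)


def perf_report_cmd(data="perf.data"):
    """Build a `perf report` command that reads a recorded data file."""
    return ["perf", "report", "-i", data]


# Hypothetical program to profile; replace with a real binary.
target = ["./my_app", "--input", "data.bin"]
for cmd in (perf_stat_cmd(target, 2), perf_record_cmd(target), perf_report_cmd()):
    print(shlex.join(cmd))
```

The resulting perf.data file is what the comment suggests opening in Hotspot instead of perf-report.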
-
Linux Perf Examples
> [...] how Perf compares to vendor tools like vTune [...] ?
Regarding the hardware events that Perf can capture on x86, it has pretty much all of them. So it should be equivalent to vTune for all practical purposes.
The big difference is in the UI -- or absence thereof. Perf is a low-level tool and its output is mostly text files. There is a curses-based TUI for perf-report (and even a GTK version, but it is essentially the same as the TUI, just using GTK2 widgets), but that's about it.
By contrast, vTune comes with a heavy (electron-based?) GUI and is quite helpful in guiding beginners, with many graphs and explanations.
Of course, one can (and is expected to) complement Perf with an assortment of tools that process its output for visualization. For example, the flamegraph [1] and heat map [2] tools described in the article, but also KDAB hotspot [3] or HPerf [4] for a vTune-style perf-report.
[1] https://github.com/brendangregg/FlameGraph
[2] https://github.com/brendangregg/HeatMap
[3] https://github.com/KDAB/hotspot
[4] https://www.poirrier.ca/hperf/
-
Parsers that don't yet exist?
https://github.com/KDAB/hotspot might contain parsing code you could use as an example (other than perf script). It always accepts raw perf.data, and there doesn't seem to be a way to feed it the output of perf script, so it might be parsing it directly instead of calling perf script.
rustc-perf
-
Adding runtime benchmarks to the Rust compiler benchmark suite
> what do people use to run benchmarks on CI?
Typically, you purchase/rent a server that does nothing but sequentially run queued benchmarks (and the size/performance of this server doesn't really matter, as long as the performance is consistent), then sends the report somewhere for hosting and processing. Of course, this could be triggered by something running in CI, and the CI job could wait for the results, if benchmarking is an important part of your workflow.
But CI and benchmarks really shouldn't be run on the same host.
> What does the rust project use?
It's not clear exactly where the Rust benchmark "perf-runner" is hosted, but here are the specifications of the machine at least: https://github.com/rust-lang/rustc-perf/blob/414230abc695bd7...
> What do other projects use?
Essentially what I described above, a dedicated machine that runs benchmarks. The Rust project seems to do it via GitHub comments (as I understand https://github.com/rust-lang/rustc-perf/tree/master/collecto...), others have API servers that respond to HTTP requests done from CI/chat, others have remote GUIs that trigger the runs. I don't think there is a single solution that everyone/most are using.
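The dedicated-machine setup described in this comment can be sketched as a queue that runs benchmarks strictly one at a time, so that jobs never compete for the same hardware. This is a minimal sketch under stated assumptions: the `BenchmarkQueue` class and its method names are hypothetical and are not how rustc-perf is implemented.

```python
import time
from collections import deque


class BenchmarkQueue:
    """Run queued benchmarks sequentially and collect timing reports."""

    def __init__(self):
        self._pending = deque()
        self.reports = []

    def submit(self, name, fn):
        """Queue a benchmark; a CI job or chat command could call this."""
        self._pending.append((name, fn))

    def run_all(self):
        """Run benchmarks in submission order, never concurrently."""
        while self._pending:
            name, fn = self._pending.popleft()
            start = time.perf_counter()
            fn()
            elapsed = time.perf_counter() - start
            self.reports.append({"name": name, "seconds": elapsed})
        return self.reports


# Hypothetical workloads standing in for real benchmark jobs.
queue = BenchmarkQueue()
queue.submit("sum-range", lambda: sum(range(100_000)))
queue.submit("sort-list", lambda: sorted(range(100_000, 0, -1)))
for report in queue.run_all():
    print(report["name"], f"{report['seconds']:.6f}s")
```

In a real deployment the reports would be sent elsewhere for hosting and processing, as the comment notes.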
- [rustc-perf] Runtime benchmarks got finally merged
-
Ask HN: Was programming more interesting when memory usage was a concern?
A lot of effort is spent to reduce the size of structs in the Rust compiler
https://nnethercote.github.io/2023/03/24/how-to-speed-up-the...
Improvements of 3% and 6% don't seem like much, but at the level of rustc those are big wins
Performance of rustc must be continuously tracked (here https://perf.rust-lang.org/) because if you don't proactively fight against bloat, the tendency is that the code will become slower over time (due to new features etc)
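The idea of continuously tracking performance can be illustrated by comparing each new measurement against a baseline and flagging anything that got slower by more than a threshold. This is only a sketch: the function name, the 1% threshold, and the benchmark numbers are hypothetical, and it does not describe how perf.rust-lang.org actually decides what counts as a regression.

```python
def find_regressions(baseline, current, threshold_pct=1.0):
    """Return benchmarks whose metric grew by more than threshold_pct.

    Both arguments map a benchmark name to a cost metric (lower is better).
    """
    regressions = {}
    for name, old in baseline.items():
        new = current.get(name)
        if new is None or old <= 0:
            continue  # skip benchmarks that cannot be compared
        change_pct = (new - old) / old * 100.0
        if change_pct > threshold_pct:
            regressions[name] = change_pct
    return regressions


# Hypothetical measurements, e.g. compile time in milliseconds.
baseline = {"crate-a": 100.0, "crate-b": 200.0}
current = {"crate-a": 103.0, "crate-b": 200.0}
for name, pct in find_regressions(baseline, current).items():
    print(f"{name}: +{pct:.1f}%")
```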
-
Can Rust's compile time match its runtime performance?
Hmm, that's really, really hard to answer :'). I think it comes down to tradeoffs. Correct me if I'm wrong (I'm not qualified to say this), but Rust has several compilation phases (and will probably add more in the future, guarded by the compiler metrics at https://perf.rust-lang.org/) that, given the differences from other languages, might not exist in any other language's compiler. Where they do exist, I think rustc has already done its best (some phases are already being parallelized; I might be wrong, since I can't point to any specific MRs, but there are labels tracking this work).
-
How to catch performance regressions in Rust
About a year ago I was looking for a tool like Rust perf for my application code. I did some research and found a lot of prior art. However, nothing checked all the boxes I was looking for, so I built Bencher!
- Rust – Are We Game Yet?
-
Next Rust Compiler
https://www.pingcap.com/blog/rust-compilation-model-calamity... is a good overview. In general it varies depending on the crate but we track the performance at https://perf.rust-lang.org/ - if you look at cargo, for example, over 60% of the time is spent in codegen through LLVM: https://perf.rust-lang.org/detailed-query.html?commit=222d1f...
- Data-driven performance optimization with Rust and Miri
-
Generic associated types to be stable in Rust 1.65
Something like https://perf.rust-lang.org/?
-
This Week in Rust #463
The performance full-report link is dead: https://github.com/rust-lang/rustc-perf/blob/master/triage/2022-10-04.md
What are some alternatives?
FlameGraph - Stack trace visualizer
zig - General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
polkit-dumb-agent - a polkit agent in 145 lines of code, because polkit is dumb and none of the other agents worked
glTF-Sample-Models - glTF Sample Models
firestorm - A fast intrusive flamegraph
unreal-rust - Rust integration for Unreal Engine 5
gta5view - Open Source Snapmatic and Savegame viewer/editor for GTA V
rusty-dos - A Rust skeleton for an MS-DOS program for IBM compatibles and the PC-98, including some PC-98-specific functionality
cargo-flamegraph - Easy flamegraphs for Rust projects and everything else, without Perl or pipes <3
RustPython - A Python Interpreter written in Rust
optick-rs - Optick for Rust
nanoserde - Serialisation library with zero dependencies