FlameGraph
pmu-tools
Metric | FlameGraph | pmu-tools |
---|---|---|
Mentions | 53 | 3 |
Stars | 16,438 | 1,912 |
Growth | - | - |
Activity | 4.5 | 9.2 |
Last commit | 15 days ago | 6 days ago |
Language | Perl | Python |
License | - | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
FlameGraph
-
JVM Profiling in Action
We'll use async-profiler and flame graphs for profiling. To simplify the process, we'll run the code using JBang.
-
Memray – A Memory Profiler for Python
And flame graphs excel at this kind of thing
https://www.brendangregg.com/flamegraphs.html
-
All my favorite tracing tools: eBPF, QEMU, Perfetto, new ones I built and more
which can output in a format understood by Brendan Gregg's flame graphs (https://www.brendangregg.com/flamegraphs.html)
But that's not quite the kind of tracing you're talking about. We also built a printf-style interface to our recording files, which seems closer:
-
Recap of Werner Vogels' Keynote at re:Invent 2023
Strategies included discontinuing or resizing underutilized services, transitioning to more cost-effective solutions, reducing the current resources to the amount of resources that we need for our application, and conducting detailed analyses of computing resource utilization through tools like flamegraphs. This detailed scrutiny helped identify and rectify significant cost-driving areas, such as garbage collection and application configurations.
-
Pinpoint performance regressions with CI-Integrated differential profiling
Flame Graphs by Brendan Gregg
-
Flameshow: A Terminal Flamegraph Viewer
Historically brendangregg's since AIUI he basically invented flamegraphs
https://www.brendangregg.com/flamegraphs.html
So if you can make your tool eat whatever https://github.com/brendangregg/FlameGraph is fed with you're going to support a lot of existing tooling across OSes and languages.
-
Introducing Flame graphs: It’s getting hot in here
“Flame graphs are a visualization of hierarchical data, created to visualize stack traces of profiled software so that the most frequent code-paths to be identified quickly and accurately.”
-
Using SVG to create simple sparkline charts
SVGs are amazing for interactive visualisation too. Like Flamegraphs: https://www.brendangregg.com/flamegraphs.html
-
Good example of using flame graphs to speed up java code (50x improvement)
This may be a good example of the application of a flame graph but it is not a good demonstration of flame graphs; the graph is nearly incidental. The source has an actual explanation.
-
Intro to PostGraphile V5 (Part 1): Replacing the Foundations
A profiling flame graph from Graphile Crystal (a precursor to Grafast) using GraphQL.js' executor (each tick is 1ms, total: 29ms). As we removed more and more responsibilities from GraphQL.js, we ended up only using it for output. Replacing this final responsibility with a custom implementation in Graphile Crystal itself, we reduced execution time for this query down to 15.5ms (effectively removing the majority of the yellow portion of the flame graph).
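Several of the posts above point out that a tool becomes compatible with existing flame graph tooling if it can produce the input that https://github.com/brendangregg/FlameGraph consumes. A minimal sketch of that "folded stacks" input follows, assuming stack samples are already available as root-first lists of frame names; the `fold_stacks` helper and the sample data are hypothetical and only for illustration.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse stack samples into folded lines: 'frame1;frame2;... count'.

    `samples` is an iterable of root-first lists of frame names
    (a hypothetical input shape chosen for this sketch).
    """
    # Identical stacks are merged and counted; empty samples are skipped.
    counts = Counter(";".join(frames) for frames in samples if frames)
    # Sorting is not required by the format; it just keeps output stable.
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


# Made-up samples, as a sampling profiler might collect them.
samples = [
    ["main", "parse", "read_token"],
    ["main", "parse", "read_token"],
    ["main", "render"],
]

folded = fold_stacks(samples)
print("\n".join(folded))
```

Saved to a file such as `out.folded`, these lines can be rendered with `./flamegraph.pl out.folded > out.svg`.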
pmu-tools
-
Gallery of Processor Cache Effects
I am not seeing it mentioned anywhere, but for people looking for a good starting point on "low-level" CPU performance debugging, Intel's top-down microarchitecture analysis method (https://www.intel.com/content/www/us/en/docs/vtune-profiler/...) is a good systematic way to understand where your CPU is spending most of its cycles.
They also have two tools which basically implement this analysis and spit out a bunch of very useful metrics that are actionable and very easy to understand:
- Intel VTune is a fantastic tool to start with. It's currently free to use, supports most OSes, and is very beginner-friendly.
- Intel pmu-tools (https://github.com/andikleen/pmu-tools) is basically a command-line version of VTune.
-
if you had to restart at 0 knowledge what would you do?
Install some tool that would help you see the performance of your system, like a graph of the CPU usage, the top processes being used, disk activity/read/write, etc. Every time you run your program, glance at those numbers, eventually you'll develop an intuition. Basically write code and profile. A good exercise would be practicing with data structures, this site has an exhaustive list of them, find some stuff that's interesting then google the implementation, then build it yourself, test it, debug, profile, optimize, and understand the performance constraints. Eventually you'll develop better understanding and can compare between other people's works, optimizing them. If you want to go beyond, read some papers on lock-free algorithms https://github.com/JCTools/JCTools/tree/master/resources then read Brendan Gregg's blog and books. Read about how profiling tools work https://github.com/andikleen/pmu-tools/wiki/toplev-manual
-
Linux Perf Examples
Toplev is a godsend (thank you Andi Kleen!). If you work with perf you'll love this.
https://github.com/andikleen/pmu-tools
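The top-down method mentioned in the posts above splits pipeline issue slots into four level-1 categories. The sketch below shows that arithmetic as it is commonly described for a 4-wide core. It is an illustration under stated assumptions, not the formulas toplev or VTune use on any particular CPU: the function, its parameter names, and the counter values are all hypothetical.

```python
def topdown_level1(clocks, uops_not_delivered, uops_issued,
                   uops_retired_slots, recovery_cycles, width=4):
    """Return the four level-1 top-down fractions for one measurement.

    All arguments are raw counter totals (hypothetical names);
    `width` is the assumed number of issue slots per cycle.
    """
    slots = width * clocks
    frontend_bound = uops_not_delivered / slots
    bad_speculation = (uops_issued - uops_retired_slots
                       + width * recovery_cycles) / slots
    retiring = uops_retired_slots / slots
    # Whatever is left over is attributed to the backend.
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)
    return {
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "retiring": retiring,
        "backend_bound": backend_bound,
    }


# Made-up counter values, chosen only to make the arithmetic easy to follow.
result = topdown_level1(
    clocks=1000,
    uops_not_delivered=800,
    uops_issued=2200,
    uops_retired_slots=2000,
    recovery_cycles=50,
)
for name, fraction in result.items():
    print(f"{name}: {fraction:.1%}")
```

The category with the largest share is where the method suggests looking next; in practice the counters would come from a tool such as toplev rather than being entered by hand.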
What are some alternatives?
hotspot - The Linux perf GUI for performance analysis.
bips - Bitcoin Improvement Proposals
benchmark - A microbenchmark support library
tracing-bunyan-formatter - A Layer implementation for tokio-rs/tracing providing Bunyan formatting for events and spans.
HeatMap - Heat map generation tools
Aeron - Efficient reliable UDP unicast, UDP multicast, and IPC message transport
node-clinic - Clinic.js diagnoses your Node.js performance issues
JCTools
etcd - Distributed reliable key-value store for the most critical data of a distributed system
Event Store - EventStoreDB, the event-native database. Designed for Event Sourcing, Event-Driven, and Microservices architectures