| | parca-agent | samply |
|---|---|---|
| Mentions | 10 | 8 |
| Stars | 484 | 1,784 |
| Growth | 5.0% | - |
| Activity | 9.9 | 9.4 |
| Last commit | 1 day ago | 1 day ago |
| Language | Go | Rust |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
parca-agent
-
Flameshow: A Terminal Flamegraph Viewer
If that's true, you should probably update the docs. Everything I could find implied dotnet, jvm, python were still unsupported. For example, the roadmap section of the readme mentions most of these but nothing mentions dotnet. However I did find your tickets and a demo being merged in which makes it seem maybe supported?
Ticket: https://github.com/parca-dev/parca-agent/issues/161
Demo: https://github.com/parca-dev/parca-demo/pull/18
-
How to troubleshoot memory leaks in Go with Grafana Pyroscope
Couldn't see any advantages to this over https://github.com/parca-dev/parca-agent, which uses eBPF so it can be used with non-instrumented apps and code paths.
-
Frame pointers vs. DWARF – my verdict
The pervasive lack of frame pointers is the reason why we've developed a custom format derived from DWARF unwind information, thanks to some insights: DWARF unwind information is incredibly flexible, it supports many arches and allows restoring any arbitrary register. But we only need 3: the frame pointer, the stack pointer, and, on non-x86, the return address.
In addition, this encoding doesn't use that many bytes, but unfortunately reading and parsing that information is quite expensive.
For that reason I've developed a new unwinder that uses custom unwind information derived from DWARF (https://www.polarsignals.com/blog/posts/2022/11/29/profiling..., previously discussed in https://news.ycombinator.com/item?id=33788794) that runs in BPF. This new compact representation can be binary searched easily and each unwind row has a size of 16 bytes. I am currently working on reducing it down to ~10 bytes.
All the code is fully OSS (Apache 2.0 for userspace and GPL for BPF), and part of the Parca project (https://github.com/parca-dev/parca-agent).
We've also given some talks at FOSDEM going deeper into how we made it scale for many big processes.
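To make the "compact, binary-searchable unwind table" idea concrete, here is a minimal sketch in Python. It is not Parca Agent's actual format: the 16-byte row layout (`ROW`), the rule constants, and the helper names `build_table`, `find_row`, and `unwind_step` are all assumptions made for illustration, and the real unwinder runs in BPF rather than userspace. The sketch only shows how fixed-size rows sorted by program counter let an unwinder find the rule for an address with a binary search and then recover the caller's frame on x86-64.

```python
import struct

# Hypothetical 16-byte row: pc (u64), CFA base register (u8), frame-pointer
# rule (u8), CFA offset (i16), frame-pointer offset (i16), 2 padding bytes.
ROW = struct.Struct("<QBBhhxx")
assert ROW.size == 16

# Hypothetical rule constants, not taken from any real implementation.
CFA_RSP, CFA_RBP = 1, 2
RBP_UNCHANGED, RBP_AT_CFA_OFFSET = 0, 1


def build_table(rows):
    """Pack (pc, cfa_reg, rbp_rule, cfa_offset, rbp_offset) tuples sorted by pc."""
    return b"".join(ROW.pack(*row) for row in sorted(rows))


def find_row(table, pc):
    """Binary search for the last row whose pc is <= the given pc."""
    lo, hi = 0, len(table) // ROW.size
    while lo < hi:
        mid = (lo + hi) // 2
        row_pc = ROW.unpack_from(table, mid * ROW.size)[0]
        if row_pc <= pc:
            lo = mid + 1
        else:
            hi = mid
    if lo == 0:
        return None  # pc lies before the first covered address
    return ROW.unpack_from(table, (lo - 1) * ROW.size)


def unwind_step(row, rsp, rbp, read_u64):
    """Apply one row on x86-64: return (return_address, caller_rsp, caller_rbp)."""
    _, cfa_reg, rbp_rule, cfa_offset, rbp_offset = row
    cfa = (rsp if cfa_reg == CFA_RSP else rbp) + cfa_offset
    return_address = read_u64(cfa - 8)  # return address sits just below the CFA
    if rbp_rule == RBP_AT_CFA_OFFSET:
        rbp = read_u64(cfa + rbp_offset)
    return return_address, cfa, rbp
```

A caller would pass a `read_u64` callback that reads eight bytes from the sampled stack; repeating `find_row` and `unwind_step` with the returned values walks the stack one frame at a time.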
-
Dwarf-Based Stack Walking Using eBPF
I find this surprising! Was this for off the shelf applications or some custom binaries?
As mentioned above, we see DWARF expressions such as `DW_CFA_def_cfa_expression` on the regular. See the "Test Plan" section and commit messages of the PR that introduced support for this particular opcode [0].
[0]: https://github.com/parca-dev/parca-agent/pull/1058
- Parca Agent rewrites eBPF in-kernel C code in Rust (using Aya-rs)
-
Fantastic Symbols and Where to Find Them - Part 2
Let's see an example perf map file for NodeJS. The runtimes out there output this file with more or less the same format, more or less!
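As a rough illustration of what consuming such a file involves, the sketch below parses the common perf map layout, where each line holds a hexadecimal start address, a hexadecimal size, and a symbol name that may itself contain spaces. The function names `parse_perf_map` and `symbolize` and the sample addresses and symbols are made up for this example; the parser skips lines that do not fit, since runtimes follow the format only "more or less".

```python
import bisect


def parse_perf_map(text):
    """Parse 'START SIZE NAME' lines (hex start and size) into sorted tuples."""
    entries = []
    for line in text.splitlines():
        # Split at most twice so that spaces inside the symbol name survive.
        parts = line.strip().split(None, 2)
        if len(parts) != 3:
            continue  # tolerate blank or malformed lines
        try:
            start = int(parts[0], 16)
            size = int(parts[1], 16)
        except ValueError:
            continue  # tolerate non-hex garbage
        entries.append((start, size, parts[2]))
    entries.sort()
    return entries


def symbolize(entries, addr):
    """Return the name of the entry covering addr, or None if nothing matches."""
    starts = [start for start, _, _ in entries]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, size, name = entries[i]
    return name if addr < start + size else None


# Made-up sample in the style of a JIT runtime's perf map output.
SAMPLE = """\
3ef414c0 398 RegExp:[{(]
3ef418a0 104 LazyCompile:~module.exports /app/index.js:1:1
not a valid line
3ef41a00 40 Stub:CEntryStub
"""

entries = parse_perf_map(SAMPLE)
```

Calling `symbolize(entries, addr)` with a sampled instruction address returns the JIT-compiled function that covers it, which is the lookup a profiler performs for frames that have no entry in any ELF symbol table.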
-
Fantastic Symbols and Where to Find Them - Part 1
The good news is we got you covered. If you are using Parca Agent, we already do the heavy lifting for you to symbolize captured stack traces. And we keep extending our support for the different languages and runtimes.
samply
- Samply: Command-line sampling profiler for macOS and Linux
- samply: Command line CPU profiler which uses the Firefox profiler as its UI
-
Help with Rust Program performance
Regarding profilers, I really like samply. It doesn't require modifying source code, runs on Linux and macOS, and automatically loads profiling data into the Firefox Profiler UI.
-
AI learns to play flappy bird (code in comments)
I grabbed a quick profile using samply and noticed two things: Even in fast mode, the simulation only updates when the screen is redrawn, so its update frequency is limited by the refresh rate. And the simulation seems to mostly be bottlenecked by Vec reallocation, so re-using Vecs might help.
-
Firefox Profiler
I ran across this when I found samply [0], a CLI sampling profiler. On samply's GitHub there's a link to a sample profile that opens in the Firefox Profiler and I was in awe at just how fast it is! Try dragging your mouse over the timeline for a second: https://share.firefox.dev/3j3PJoK
[0]: https://github.com/mstange/samply
-
Frame pointers vs. DWARF – my verdict
IMHO, perf's decision to write whole stacks directly to the disk and unwind them as a post-process is a really bad design. It wastes disk space, and as the author pointed out, it also has a lot of IO overhead.
As an alternative approach, https://github.com/mstange/samply processes data streamed from perf and unwinds it in realtime. The unwinding overhead is surprisingly low: it only takes around 1% of (single) CPU per CPU profiled. Solving the disk waste alone has been a tremendous improvement to the profiling experience. As a bonus, the unwinding and symbolization work reliably, whereas postprocessing frequently failed to terminate when I used the perf CLI directly.
-
Data-driven performance optimization with Rust and Miri
samply supports showing inline frames in call stacks. I find this makes a huge difference when profiling Rust.
- Samply: A work in progress of a command-line profiler for macOS and Linux
What are some alternatives?
kubectl-flame - Kubectl plugin for effortless profiling on kubernetes
pprof-rs - A Rust CPU profiler implemented with the help of backtrace-rs
ebpf - ebpf-go is a pure-Go library to read, modify and load eBPF programs and attach them to various hooks in the Linux kernel.
rust-flappy-bird-ai - AI learns to play flappy bird using neuro-evolution, implemented in Rust using macroquad
perf-map-agent - A java agent to generate method mappings to use with the linux `perf` tool
flamegraph - Easy flamegraphs for Rust projects and everything else, without Perl or pipes <3
pwru - Packet, where are you? -- eBPF-based Linux kernel networking debugger
profiler - Firefox Profiler — Web app for Firefox performance analysis
rbspy - Sampling CPU profiler for Ruby
rayon - Rayon: A data parallelism library for Rust
go-profiler-notes - felixge's notes on the various go profiling methods that are available.
perfmon