Linux Perf Examples (brendangregg.com)
128 points by arkj on March 18, 2022 | 14 comments


Toplev is a godsend (thank you Andi Kleen!). If you work with perf you'll love this.

https://github.com/andikleen/pmu-tools/wiki/toplev-manual


Perf is a great tool. I'm still amazed at the way it can trace the execution of a program right through the kernel layer to show where the process is really spending (wasting) its time.


It's not a trace in the ptrace sense. Last I checked it uses clock based samples or hardware-native performance counters, depending on what's useful. The results are incredibly useful but the process is somewhat 'simple' IMO. Imagine an ISR from a clock source that records the PC. Ok, it's not quite that simple but it's what is at the core.
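To make the "ISR that records the PC" idea concrete, here is a toy sketch in Python. It is not how perf is implemented: it assumes a Unix system, uses `ITIMER_PROF` as the clock source, and records the name of the running Python function instead of a hardware program counter. The names `record_sample`, `busy`, and `profile` are made up for illustration.

```python
import collections
import signal

# Sample counts per function name; stands in for perf's PC histogram.
samples = collections.Counter()

def record_sample(signum, frame):
    # The "ISR": note which function was executing when the timer fired.
    samples[frame.f_code.co_name] += 1

def busy(n):
    # CPU-bound work, so the profiling timer keeps firing while it runs.
    total = 0
    for i in range(n):
        total += i * i
    return total

def profile(func, *args, interval=0.001):
    # ITIMER_PROF counts CPU time used by the process and sends SIGPROF
    # each time `interval` seconds of it have elapsed.
    samples.clear()
    signal.signal(signal.SIGPROF, record_sample)
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    try:
        return func(*args)
    finally:
        # Stop the timer; the handler stays installed, which is harmless.
        signal.setitimer(signal.ITIMER_PROF, 0, 0)

if __name__ == "__main__":
    profile(busy, 5_000_000)
    for name, count in samples.most_common():
        print(f"{count:6d}  {name}")
```

The histogram it prints is the whole result: no tracing, just counting where the timer happened to land.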

One thing that can be confusing: when a program does some I/O and/or IPC, the time it spends waiting won't be counted, and sometimes perf will point to the 'wrong' bottleneck. That's why it's helpful to understand the tool's design. Intel's perf counter was called something like "CLK_UNHALTED" back in the day, and that was helpful to remind you exactly what was being measured. If your program is slow because of computation, memory access, etc., it's perfect.
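A rough way to see the gap between "time on the CPU" and "time on the clock" without perf at all: compare process CPU time against wall-clock time for a sleep (standing in for blocking I/O) and for a spin loop. This is only an analogy for what a cycles-based sample would and would not see; `timed` and `spin` are hypothetical helper names.

```python
import time

def timed(func, *args):
    # Return (wall seconds, CPU seconds) spent inside func(*args).
    wall0 = time.perf_counter()
    cpu0 = time.process_time()
    func(*args)
    return time.perf_counter() - wall0, time.process_time() - cpu0

def spin(seconds):
    # Burn CPU until this process has used `seconds` more of CPU time.
    end = time.process_time() + seconds
    while time.process_time() < end:
        pass

if __name__ == "__main__":
    # time.sleep stands in for a blocking read or an IPC wait.
    for name, func in (("sleep", time.sleep), ("spin", spin)):
        wall, cpu = timed(func, 0.2)
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

The sleep shows up almost entirely as wall time with next to no CPU time, which is the part a CPU-cycle profile would miss.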


It does two things (or three, but all of them use the same hardware): PC sampling, and counters that tick on specific events. The events can be processor instruction related, like cache misses, or kernel events.


If you're on an Intel CPU try vTune, it's like perf on crack. Everything perf does vTune does better and in a pretty GUI that draws diagrams and pipelines for you.


Or just KDAB's Hotspot.


Not even close. Hotspot is a nice GUI for perf but it's like using COBOL compared to vTune's [modern language]


Anyone have a good resource for how Perf compares to vendor tools like vTune and uProf? Is there another perf tool for Arm or is that even necessary?


> [...] how Perf compares to vendor tools like vTune [...] ?

Regarding the hardware events that Perf can capture on x86, it has pretty much all of them. So it should be equivalent to vTune for all practical purposes.

The big difference is in the UI -- or absence thereof. Perf is a low-level tool and its output is mostly text files. There is a curses-based TUI for perf-report (and even a GTK version, but it is essentially the same as the TUI, just using GTK2 widgets), but that's about it.

By contrast, vTune comes with a heavy (electron-based?) GUI and is quite helpful in guiding beginners, with many graphs and explanations.

Of course, one can (and is expected to) complement Perf with an assortment of tools that process its output for visualization. For example, the flamegraph [1] and heat map [2] tools described in the article. But also KDAB hotspot [3] for flame graphs or HPerf [4] for a vTune-style perf-report.

[1] https://github.com/brendangregg/FlameGraph

[2] https://github.com/brendangregg/HeatMap

[3] https://github.com/KDAB/hotspot

[4] https://www.poirrier.ca/hperf/
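For anyone wondering what the post-processing step looks like: the FlameGraph scripts work on "folded" stacks, one line per unique stack, frames joined by semicolons (root first), followed by a sample count. The sketch below produces that format from already-parsed stacks; the sample data and the `fold_stacks` name are invented, and parsing real `perf script` output (which stackcollapse-perf.pl handles) is left out.

```python
from collections import Counter

def fold_stacks(stacks):
    """Collapse sampled stacks (root frame first) into folded-stack lines."""
    # Identical stacks are merged and their occurrences counted.
    counts = Counter(";".join(frames) for frames in stacks)
    # One "frame;frame;frame count" line per unique stack, sorted for stable output.
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]

if __name__ == "__main__":
    # Hypothetical samples: each inner list is one sampled call stack.
    sampled = [
        ["main", "parse", "read"],
        ["main", "parse", "read"],
        ["main", "render"],
    ]
    print("\n".join(fold_stacks(sampled)))
```

Lines in this shape are what flamegraph.pl turns into the SVG.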


Perf is a great way to generate data for offline analysis. But it lacks a GUI, and its text mode interface is very sparse.

The good news is, it is a very solid foundation to build on. They got the fundamentals right.

If you’re profiling short pieces of code PAPI might be better. It lets you bracket code snippets with its function calls and get performance counter data. So you can count instructions, memory accesses etc.
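The bracketing pattern looks roughly like the sketch below. To be clear, this is not PAPI and does not read hardware counters: it assumes a Unix system and substitutes the software counters the kernel exposes through `getrusage` (CPU time, page faults, context switches), just to show the start/stop shape. The `counted` helper and `FIELDS` tuple are made up.

```python
import resource
from contextlib import contextmanager

# Kernel-maintained per-process counters, used here as a stand-in for
# hardware events such as instructions retired or cache misses.
FIELDS = ("ru_utime", "ru_stime", "ru_minflt", "ru_majflt", "ru_nvcsw", "ru_nivcsw")

@contextmanager
def counted(result):
    # Snapshot the counters before the bracketed region...
    before = resource.getrusage(resource.RUSAGE_SELF)
    try:
        yield result
    finally:
        # ...and store the per-field difference once the region ends.
        after = resource.getrusage(resource.RUSAGE_SELF)
        for field in FIELDS:
            result[field] = getattr(after, field) - getattr(before, field)

if __name__ == "__main__":
    deltas = {}
    with counted(deltas):
        buf = bytearray(64 * 1024 * 1024)  # region of interest
    for field, value in deltas.items():
        print(f"{field:10s} {value}")
```

With a real counter library the snapshot calls would be replaced by its start and stop functions, but the structure around the region of interest stays the same.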


Does everyone else find the default "perf report" output completely baffling?

I find the only way to understand the output is to convert it to a flame graph (thanks Brendan!). I wish the TUI output looked a lot more like GNU's gprof tool.


Are there any real world examples?

This looks extremely useful, but reading some of this still leaves me wondering how I will know when to reach for perf out of the toolbox.


Great stuff of course.

> Last Updated: 29-Jul-2020


And first submitted 5 years ago.



