
Author here!

This project looks interesting and, as you mention, it seems conceptually similar! I would agree with that quote, generally speaking, but I think it really varies from binary to binary.

Some binaries outsource most of the work to dynamic libraries. Unfortunately, DWARF expressions are typically emitted for these program counter ranges, so it's desirable to at least implement a subset of expressions [0].

Even if that's not the case, we want to produce profiles that are as accurate as possible :)

You are totally right: they were discussing kernel stacks, where the stakes are higher because unwinding needs to work perfectly; otherwise kernel live patching, among other things, would not work reliably. The kernel now has its own unwind table format for x86_64, called ORC [1].

That being said, the parser for DWARF would still have to live in the kernel, and I am not sure kernel devs would accept such a patch.

Ideally we would transition to an unwind-specific format for user-space (something like ours, for example) and perhaps have a suitable unwinder in the kernel, rather than having to implement it in a BPF program. This is something we are considering for the future, but it's not free from problems (increased executable size, redundant unwind information, etc). But this is exactly why we wanted to have a conversation with the communities interested in this work!
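
To make the "unwind-specific format" idea a bit more concrete, here is a minimal sketch of a table-driven lookup. It is not Parca's actual format, nor ORC's; every name in it (`CfaRule`, `UnwindRow`, `UnwindTable`, `locate_return_address`, `example_table`) is made up for illustration, and the addresses are invented. The assumption is that each row says, for a program counter range, how to compute the frame's CFA from the stack or frame pointer and where the return address sits relative to it. Fixed-size rows sorted by program counter mean a lookup is just a bounded binary search.

```rust
/// Rule for computing the Canonical Frame Address (CFA) of the current frame.
/// Hypothetical: a real format would cover more cases.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CfaRule {
    /// CFA = stack pointer + offset
    SpOffset(i32),
    /// CFA = frame pointer + offset
    FpOffset(i32),
}

/// One fixed-size row; it applies from `pc` up to the next row's `pc`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct UnwindRow {
    pc: u64,
    cfa: CfaRule,
    /// The return address is stored at CFA + ra_offset (-8 on x86_64).
    ra_offset: i32,
}

struct UnwindTable {
    rows: Vec<UnwindRow>,
    /// First program counter past the range covered by the table.
    end_pc: u64,
}

impl UnwindTable {
    fn new(mut rows: Vec<UnwindRow>, end_pc: u64) -> Self {
        rows.sort_by_key(|row| row.pc);
        Self { rows, end_pc }
    }

    /// Binary search for the last row whose `pc` is <= the given one.
    fn lookup(&self, pc: u64) -> Option<&UnwindRow> {
        if pc >= self.end_pc {
            return None;
        }
        let idx = self.rows.partition_point(|row| row.pc <= pc);
        idx.checked_sub(1).map(|i| &self.rows[i])
    }
}

#[derive(Debug, Clone, Copy)]
struct Registers {
    sp: u64,
    fp: u64,
}

/// Adds a signed offset to an address, wrapping on overflow.
fn add_signed(base: u64, offset: i32) -> u64 {
    base.wrapping_add(offset as i64 as u64)
}

/// Returns (CFA, address where the return address is stored).
fn locate_return_address(row: &UnwindRow, regs: Registers) -> (u64, u64) {
    let cfa = match row.cfa {
        CfaRule::SpOffset(off) => add_signed(regs.sp, off),
        CfaRule::FpOffset(off) => add_signed(regs.fp, off),
    };
    (cfa, add_signed(cfa, row.ra_offset))
}

/// Rows for an invented x86_64 function at 0x1000 with the usual prologue:
/// `push rbp` (1 byte) followed by `mov rbp, rsp` (3 bytes).
fn example_table() -> UnwindTable {
    UnwindTable::new(
        vec![
            UnwindRow { pc: 0x1000, cfa: CfaRule::SpOffset(8), ra_offset: -8 },
            UnwindRow { pc: 0x1001, cfa: CfaRule::SpOffset(16), ra_offset: -8 },
            UnwindRow { pc: 0x1004, cfa: CfaRule::FpOffset(16), ra_offset: -8 },
        ],
        0x1020,
    )
}

fn main() {
    let table = example_table();
    let regs = Registers { sp: 0x7000, fp: 0x7100 };
    match table.lookup(0x1001) {
        Some(row) => {
            let (cfa, ra_addr) = locate_return_address(row, regs);
            println!("CFA = {cfa:#x}, return address stored at {ra_addr:#x}");
        }
        None => println!("no unwind information for this pc"),
    }
}
```

A real unwinder would then read the return address from that location, restore the caller's registers and repeat until it runs out of frames; the sketch only shows a single step.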

[0]: https://github.com/parca-dev/parca-agent/pull/1058/commits

[1]: https://lwn.net/Articles/728339/



> Ideally we would transition to an unwind-specific format for user-space (something like ours, for example) and perhaps have a suitable unwinder in the kernel, rather than having to implement it in a BPF program. This is something we are considering for the future, but it's not free from problems (increased executable size, redundant unwind information, etc). But this is exactly why we wanted to have a conversation with the communities interested in this work!

I'm just an interested user rather than someone who has an ownership stake in any major distribution, language, library, or application, but fwiw:

I'd love for every binary/library to embed enough information to cheaply unwind during profiling [edit: and symbolize shortly afterward], in one simple format, so that this sort of profiling tool Just Works, as well as backtraces generated by the program itself (e.g. Rust's panics). I'd prefer it support inlining well (so ideally not just turning on frame pointers everywhere but that'd still be a huge improvement over the status quo). As you know, this is all a complete mess now, and I hate that. I think a reasonable amount of space overhead for this is fine. I want to be able to profile pre-packaged applications and my own Rust applications that use some C libraries without hassle. And have profiles written in a format that I can ship to another machine without trouble reading the symbols there or concern about excessive sensitive information. All stuff you've touched on. I would consider it a genuine miracle if the parties involved could all get behind this.

Full DWARF information, e.g. Rust's debug = 2 [1] (to be clear: my understanding is this is more or less to allow debuggers to print variable contents reasonably well?) is another matter. Seems like that adds a huge amount of bloat and is more rarely used. I like split debuginfo. As long as there's an easy way to get it on demand via a debuginfod service and/or installing an additional package, it's fine.

[1] https://doc.rust-lang.org/cargo/reference/profiles.html


> I'd love for every binary/library to embed enough information to cheaply unwind during profiling

Definitely! We hope to be able to change this. There's the .ctf_frame format [0] that could be used too. While its design goals are very much aligned with ours, implementing a reader for this format in BPF would be complicated.

> and symbolize shortly afterward

Definitely, I think that C Type Format (CTF) [1] could be a candidate here. Another possibility would be to get BTF (BPF Type Format) [2] working in userspace. While it wouldn't cover everything that DWARF deals with, it might be a reasonable tradeoff for symbolization.

FWIW both in Polar Signals and Parca (our OSS offering) we resolve to async symbolization done in the server, as it's very expensive to do and we want to reduce load where the Agent runs.

[0]: https://gcc.gnu.org/wiki/cauldron2022talks

[1]: https://lwn.net/Articles/795384/#:~:text=The%20Compact%20C%2....

[2]: https://docs.kernel.org/bpf/btf.html


I wish you success!

AFAICT from some searching this morning, CTF is implemented in the GNU toolchain (compiler -> debugger) but not at all in LLVM? So it seems like it'd be a while before this really happens, even if everyone gets behind that format.

> FWIW both in Polar Signals and Parca (our OSS offering) we resolve to async symbolization done in the server, as it's very expensive to do and we want to reduce load where the Agent runs.

Makes total sense for your cluster-wide always-on profiling. I've worked some with GWP [1] and Google Cloud Profiler, and they similarly defer/offload the symbolization to a dedicated service. But in a small-scale setting for e.g. a Rust std::backtrace::Backtrace with an Error or panic, I generally want to print a nice symbolized version in-process shortly after capture (or never). For small-scale on-demand profiling, similarly I likely want to symbolize soon after, although probably out-of-process.
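
For the small-scale case, a minimal sketch of what "capture with the Error, print in-process" can look like with only the standard library. `ConfigError` and `load_config` are hypothetical names invented for this example; `Backtrace::force_capture` captures regardless of the `RUST_BACKTRACE` environment variable, and how well the output is symbolized depends on what debug information the binary carries.

```rust
use std::backtrace::{Backtrace, BacktraceStatus};
use std::fmt;

/// Hypothetical error type that records where it was created.
#[derive(Debug)]
struct ConfigError {
    message: String,
    backtrace: Backtrace,
}

impl ConfigError {
    fn new(message: impl Into<String>) -> Self {
        Self {
            message: message.into(),
            // Capture now; symbol resolution is left to the standard library
            // and may be deferred until the backtrace is printed.
            backtrace: Backtrace::force_capture(),
        }
    }
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl std::error::Error for ConfigError {}

/// Hypothetical fallible operation that always fails, for demonstration.
fn load_config() -> Result<(), ConfigError> {
    Err(ConfigError::new("missing field `listen_addr`"))
}

fn main() {
    if let Err(err) = load_config() {
        eprintln!("error: {err}");
        // Capturing can be unsupported on some platforms, so check first.
        if err.backtrace.status() == BacktraceStatus::Captured {
            eprintln!("{}", err.backtrace);
        }
    }
}
```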

[1] https://research.google/pubs/pub36575/


No idea about LLVM's support for it, but I know Apple's XNU kernel, built with LLVM, uses it (check for the __CTF,__ctf section). I think it uses ctf_insert(1), so it probably doesn't need the toolchain itself to support it.


Thanks so much!

You are right! Hopefully these formats will become standard and be implemented in more toolchains. My worry is that there seem to be several "competing" formats for debug (as opposed to unwind) information that is more compact than DWARF. I hope this doesn't lead to more fragmentation of the ecosystem.

Definitely! I assume you work at Google, so most of your binaries probably have some crash handlers set up already, right? I find this very useful, and wish it were more commonplace. I really like Folly's implementation of this idea [0] that just requires a single line of code: `folly::symbolizer::installFatalSignalHandler()`


> I assume you work at Google

I did for a really long time. At a startup now.

> so most of your binaries probably have some crash handlers set up already, right?

Yes, there was nice instrumentation there too. All google3 binaries called something like InitGoogle(), which kicked off command-line flag parsing, signal handler installation, log hooks, mlock/huge page setup, etc.



