I want to like BPF, but so few of my problems are in the kernel. Performance problems are usually at the application layer, either in my code or my dependencies. Even in the rare case that it's something outside of my control (like the compiler, or the nature of the data), it's almost never the kernel. Lastly, when it is the kernel's fault, there's usually a sysctl or other knob to turn to fix it. Real kernel problems, such as missing functionality, are usually better resolved by committing changes to the kernel itself than by on-demand filters. BPF is a solution in search of problems.
Most performance problems are measured as some function of the hardware resources (CPU usage, throughput, disk IOPS, network latency, ...). How do you find out which application, or which part of your application, is causing problems or consuming resources? eBPF helps with observability from the kernel up through the application, connecting what your machine is doing (disk throughput too slow, maybe?) with what your applications are doing (frequency of reads/writes, object sizes, prefetching, buffer cache usage?).
For some use cases, the fact that you can insert probes and eBPF code into a running program is a huge win. This is more obvious to kernel developers, who can't always recompile a kernel, deploy it, and recreate a particular state to debug a problem. Application developers may think they can just change the code and add printf to get better observability, or maybe use gdb? Even there, eBPF has its advantages.
For many of the kernel tracing tools, I'll add user stack traces as needed for the user context. TCP connections and latency _with_ the Java code paths responsible; ditto for disk I/O, memory growth, lock contention, etc. If you've ever had a network problem, a disk I/O problem, a memory problem, etc., BPF can give you new insights that are unavailable from user-space tooling.
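To make the "kernel event plus user stack" idea concrete, here's a rough sketch of just the aggregation step, in plain Python with no BPF involved. The sample data, function names, and frame names are all made up for illustration; in a real tool the (stack, latency) pairs would come out of a BPF map keyed by stack ID rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical samples: (user-space stack, I/O latency in microseconds).
# Assumption: a real tracer would read these from a BPF map, not a list.
SAMPLES = [
    (("main", "handle_request", "read_config"), 120),
    (("main", "handle_request", "write_log"), 900),
    (("main", "handle_request", "read_config"), 80),
    (("main", "flush_cache", "write_log"), 400),
]


def latency_by_stack(samples):
    """Return {stack: (event_count, total_latency_us)} per distinct stack."""
    totals = defaultdict(lambda: [0, 0])  # stack -> [count, total_us]
    for stack, latency_us in samples:
        entry = totals[stack]
        entry[0] += 1
        entry[1] += latency_us
    return {stack: tuple(values) for stack, values in totals.items()}


def folded(samples):
    """Render 'frame;frame;frame total_us' lines, slowest code path first."""
    by_stack = latency_by_stack(samples)
    ordered = sorted(by_stack.items(), key=lambda kv: kv[1][1], reverse=True)
    return [f"{';'.join(stack)} {total}" for stack, (_, total) in ordered]


if __name__ == "__main__":
    for line in folded(SAMPLES):
        print(line)
```

The point is only that once each kernel-side event carries the user stack that triggered it, attributing latency to a code path is a simple group-by.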
But that's also why BPF doesn't seem to have a place in this world. Anything surfaced by a BPF program should probably be surfaced by a proper kernel module or syscall. As far as I can tell, the utility of BPF tracing lies solely in the window between the time a bug comes up and, a few weeks later, when a kernel upgrade exposes this info anyway.
Neither of your suggestions really gets at the point of eBPF, which is to instrument the kernel safely (goodbye kernel modules) and dynamically (goodbye syscalls).