Seeing a systems language like Zig require runtime polymorphism for something as common as standard IO operations seems off to me -- why force that runtime overhead on everyone when the concrete IO implementation could be known statically in almost all practical cases?
I/O strikes me as one place where dynamic dispatch overhead would likely be negligible in practice. Obviously it depends on the I/O target and would need to be measured, but they don't call them "I/O bound" (as opposed to "CPU bound") programs for no reason.
Pretty sure the intent is for programs that only use one Io implementation to get a compiler optimization that elides the cost of the double indirection... but also, you're doing I/O! Usually something else is the bottleneck, so one extra indirection is likely to be peanuts.
I think it's just the Zig philosophy to care more about binary size than speed. Allocators have the same tradeoff: ArrayListUnmanaged is not generic over the allocator, so every allocation goes through dynamic dispatch. In practice the overhead of actually allocating, or of writing a file, will dwarf the overhead of an indirect call. Can't argue with those binary sizes.
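For anyone who hasn't looked at how that works, here's a minimal sketch of the pointer-plus-vtable pattern, in the spirit of std.mem.Allocator but with made-up, simplified signatures (the real VTable has more functions and different parameters). The container only stores the two-word interface value, so it doesn't have to be generic over the concrete allocator; the price is one indirect call per operation.

```zig
const std = @import("std");

// Simplified stand-in for std.mem.Allocator's type-erased interface.
const AnyAllocator = struct {
    ctx: *anyopaque,
    vtable: *const VTable,

    const VTable = struct {
        alloc: *const fn (ctx: *anyopaque, len: usize) ?[*]u8,
        free: *const fn (ctx: *anyopaque, buf: []u8) void,
    };

    fn alloc(self: AnyAllocator, len: usize) ?[*]u8 {
        // The dynamic dispatch: one indirect call through the vtable.
        return self.vtable.alloc(self.ctx, len);
    }

    fn free(self: AnyAllocator, buf: []u8) void {
        self.vtable.free(self.ctx, buf);
    }
};

// A trivial concrete implementation, to show how `ctx` is recovered.
const NullAllocator = struct {
    fail_count: usize = 0,

    fn any(self: *NullAllocator) AnyAllocator {
        return .{
            .ctx = self,
            .vtable = &.{ .alloc = allocImpl, .free = freeImpl },
        };
    }

    fn allocImpl(ctx: *anyopaque, len: usize) ?[*]u8 {
        const self: *NullAllocator = @ptrCast(@alignCast(ctx));
        self.fail_count += 1;
        _ = len;
        return null; // always reports "out of memory"
    }

    fn freeImpl(ctx: *anyopaque, buf: []u8) void {
        _ = ctx;
        _ = buf;
    }
};

test "dispatch through the type-erased interface" {
    var null_alloc: NullAllocator = .{};
    const a = null_alloc.any();
    try std.testing.expect(a.alloc(16) == null);
    try std.testing.expect(null_alloc.fail_count == 1);
}
```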
(And before anyone mentions it, devirtualization is a myth, sorry)
Wouldn't this only work if there's only one implementation throughout the entire compilation unit? If you use 2 allocators in your app, your restricted function type has 2 possible callees for each entry, and you're back to the same problem.
> A side effect of proposal #23367, which is needed for determining upper bound stack size, is guaranteed de-virtualization when there is only one Io implementation being used (also in debug builds!).
> In the less common case when a program instantiates more than one Io implementation, virtual calls done through the Io interface will not be de-virtualized, as that would imply doubling the amount of machine code generated, creating massive code bloat.
I wonder how massive it actually would be. I'm guessing it really wouldn't be all that massive in practice, even though it's of course easy to construct massive examples with code patterns people don't typically write.
Having a limited number of known callees is already better than a virtual function (unrestricted function pointer). A compiler could, in theory, devirtualize every two-possible-callee call site into `if (function_pointer == callee1) callee1() else callee2()`, which can then be inlined at compile time or branch-predicted at runtime.
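Something like this, with made-up reader functions just to show the shape of the rewrite (whether a given compiler actually performs it is another question):

```zig
// Made-up example: a call site whose function pointer is known to be one of
// exactly two callees.
fn readSocket(buf: []u8) usize {
    _ = buf;
    return 0; // stand-in implementation
}

fn readFile(buf: []u8) usize {
    _ = buf;
    return 0; // stand-in implementation
}

const ReadFn = *const fn ([]u8) usize;

// Before: an opaque indirect call.
fn readIndirect(read_fn: ReadFn, buf: []u8) usize {
    return read_fn(buf);
}

// After the guarded "devirtualization": each branch is a direct call, so it
// can be inlined, and the pointer comparison is cheap and branch-predictable.
fn readGuarded(read_fn: ReadFn, buf: []u8) usize {
    if (read_fn == &readSocket) {
        return readSocket(buf);
    } else {
        return readFile(buf);
    }
}
```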
In any case, if you have two different implementations of something then you have to switch between them somewhere -- either at compile-time or link-time or load-time or run-time (or jit-time). The trick is to find an acceptable compromise between performance, (machine) code bloat, and API simplicity.
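To make the two ends of that spectrum concrete, here's a small sketch with made-up logging functions (not any std API): the comptime version is monomorphized, so you get a direct call per implementation at the cost of more machine code, while the function-pointer version keeps one copy of the code and pays an indirect call.

```zig
const std = @import("std");

fn writeToStderr(msg: []const u8) void {
    std.debug.print("{s}", .{msg});
}

fn writeToNowhere(msg: []const u8) void {
    _ = msg;
}

// Compile-time switch: `sink` is a comptime parameter, so the compiler emits
// a separate, directly-called copy of `logComptime` per sink.
fn logComptime(comptime sink: fn ([]const u8) void, msg: []const u8) void {
    sink(msg);
}

// Run-time switch: one copy of the code, one indirect call per use.
fn logRuntime(sink: *const fn ([]const u8) void, msg: []const u8) void {
    sink(msg);
}

pub fn main() void {
    logComptime(writeToStderr, "compile-time dispatch\n");
    logComptime(writeToNowhere, "also compile-time, second instantiation\n");

    // The choice survives to run time; the compiler keeps one logRuntime.
    const noisy = @mod(std.time.timestamp(), 2) == 0; // arbitrary runtime condition
    const sink: *const fn ([]const u8) void =
        if (noisy) &writeToStderr else &writeToNowhere;
    logRuntime(sink, "run-time dispatch\n");
}
```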
I think having a thread pool on top of some evented IO isn't _that_ uncommon.
You might have a thread pool doing some very specific thing. You can write your own thread pool which won't use the Io interface. But if one of the tasks in the thread pool wanted to read a file, I guess you'd have to pass in the blocking Io implementation.
One of the Io implementations provided is a standard thread-pool Io. And if it was really important, you could write your own Io implementation that selects between the std thread-pool and std blocking ones based on an option (I'm guessing, I don't know, but it seems reasonable).
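Purely hypothetical sketch of that idea -- none of the names below are the proposed std.Io API, and the real interface is certainly richer. It just shows the shape of "pick an implementation from an option at startup; everything downstream sees the same interface value":

```zig
// Made-up minimal Io-like interface: a context pointer plus one operation.
const Io = struct {
    ctx: *anyopaque,
    readFileFn: *const fn (ctx: *anyopaque, path: []const u8, buf: []u8) usize,
};

// Stand-in for a blocking implementation.
const BlockingIo = struct {
    state: u8 = 0, // a real implementation would carry real state

    fn io(self: *BlockingIo) Io {
        return .{ .ctx = self, .readFileFn = readFile };
    }

    fn readFile(ctx: *anyopaque, path: []const u8, buf: []u8) usize {
        _ = ctx;
        _ = path;
        _ = buf;
        return 0; // would do a plain blocking read here
    }
};

// Stand-in for a thread-pool implementation (same shape, different behavior).
const ThreadPoolIo = struct {
    state: u8 = 0,

    fn io(self: *ThreadPoolIo) Io {
        return .{ .ctx = self, .readFileFn = readFile };
    }

    fn readFile(ctx: *anyopaque, path: []const u8, buf: []u8) usize {
        _ = ctx;
        _ = path;
        _ = buf;
        return 0; // would hand the read off to a worker thread here
    }
};

// The "selects based off of an option" part is a single branch at startup;
// everything downstream just takes an `Io` value and can't tell which it got.
fn selectIo(use_thread_pool: bool, blocking: *BlockingIo, pooled: *ThreadPoolIo) Io {
    return if (use_thread_pool) pooled.io() else blocking.io();
}
```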
In larger Rust applications or servers I find myself doing this very often -- for example, one application I'm working on mostly uses blocking I/O for occasional filesystem access but has a little bit of async networking.
It is bad if you're introducing branching in a tight loop, or preventing the compiler from inlining things it would otherwise inline, and other similar situations, maybe?