This could be partially automated with `__attribute__((__packed__))` and a bit of `-fipa-struct-reorg` for better cache performance. Sadly, there is no `__reorder_yes_i_know_and_i_want_to_violate_c_standard__` attribute of any kind. But I really believe managing and optimizing memory layout (unless explicitly necessary, like when declaring serialization formats) should be the compiler's job, not a human's.
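For context on what such a reorder attribute would buy: the C standard requires struct members to be laid out in declaration order, so the compiler may pad between them but may not move them. A minimal sketch of the manual version follows. The struct and function names (`as_written`, `hand_reordered`, `padding_saved`) are made up for illustration, and the sizes in the comments assume a typical x86-64 Linux ABI; other targets may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Members in the order a programmer might naturally write them.
 * On a typical x86-64 ABI: kind at 0, 7 bytes padding, id at 8,
 * flags at 16, 7 bytes tail padding, so sizeof is 24. */
struct as_written {
    uint8_t  kind;
    uint64_t id;
    uint8_t  flags;
};

/* Same members, hand-sorted by decreasing alignment requirement.
 * On the same ABI: id at 0, kind at 8, flags at 9, 6 bytes tail
 * padding, so sizeof is 16. */
struct hand_reordered {
    uint64_t id;
    uint8_t  kind;
    uint8_t  flags;
};

/* Holds on any ABI for this pair of structs: sorting by decreasing
 * alignment does not make this layout larger. */
static_assert(sizeof(struct hand_reordered) <= sizeof(struct as_written),
              "reordering should not grow the struct");

/* Hypothetical helper: bytes saved per object by the manual reorder. */
size_t padding_saved(void)
{
    return sizeof(struct as_written) - sizeof(struct hand_reordered);
}
```

This is the transformation the comment would like the compiler to do on its own; today it has to be done by hand, one struct at a time.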


Adding just the __packed__ attribute and not also an __aligned__ attribute is not necessarily a good idea -- if a struct is marked __packed__ gcc assumes it might be at any alignment, so on architectures which don't permit unaligned loads it may have to generate a lot of byte loads and shifts to do simple reads of integer members.
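A rough sketch of the difference, assuming GCC or Clang attribute syntax. The struct and function names here are invented for illustration, and whether the first accessor really turns into byte loads depends on the target and compiler version; comparing the generated assembly for a strict-alignment target is the way to check.

```c
#include <assert.h>
#include <stdint.h>

/* Packed only: no padding, and the whole struct has alignment 1,
 * so `value` (at offset 1) may sit at any address. */
struct packed_only {
    uint8_t  tag;
    uint32_t value;
    uint8_t  flag;
} __attribute__((__packed__));

/* Packed and aligned(4): still no internal padding, but the struct
 * itself starts on a 4-byte boundary. With `value` placed first, the
 * compiler knows it is 4-byte aligned. The size is rounded up from
 * 6 to 8 so that array elements stay aligned. */
struct packed_aligned {
    uint32_t value;
    uint8_t  tag;
    uint8_t  flag;
} __attribute__((__packed__, __aligned__(4)));

static_assert(sizeof(struct packed_only) == 6, "no padding when packed");
static_assert(_Alignof(struct packed_only) == 1, "packed drops alignment to 1");
static_assert(sizeof(struct packed_aligned) == 8, "size rounded up to alignment");
static_assert(_Alignof(struct packed_aligned) == 4, "aligned(4) restores alignment");

/* On targets without unaligned loads, this may compile to four byte
 * loads plus shifts and ORs. */
uint32_t read_packed_only(const struct packed_only *p)
{
    return p->value;
}

/* Here the compiler can assume a 4-byte-aligned address and may use a
 * single word load. */
uint32_t read_packed_aligned(const struct packed_aligned *p)
{
    return p->value;
}
```

Both accessors return the same value; the difference is only in what the compiler is allowed to assume about the address, and therefore in the code it can emit.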


Fwiw, -fipa-struct-reorg was removed in GCC 4.8.x. From the release notes:

The struct reorg and matrix reorg optimizations (command-line options -fipa-struct-reorg and -fipa-matrix-reorg) have been removed. They did not always work correctly, nor did they work with link-time optimization (LTO), hence were only applicable to programs consisting of a single translation unit.


I'm not sure letting the compiler go wild would be such a great idea: one of the strengths of C is predictable performance, which would be hard to obtain if the compiler is allowed to e.g. move data across cache lines.


If you're running on one microarchitecture, I agree. But if you're running on more than one, manual structure packing may actually give more unpredictable performance than letting GCC handle it. At least GCC will make an effort to optimize for each one and hopefully avoid things that are absolutely terrible to do on that arch (orders-of-magnitude performance loss type stuff), while a manually chosen packing optimized for one microarch can be hugely pessimal on another one.

That's a guess though, no numbers. :)



