All of these sound weird to me—most non-stupid (hello 802.2) protocols and hardware are going to have natural-aligned structure fields, so basically any mainstream (8-bit-byte, two’s complement, etc.) ABI is going to lay them out the same way, packed or not.
As for RV64 and Arm64, the layout rules for same-size scalar types in their common ABIs are outright identical aren’t they?
We’re (most of us) a long way away from the time where each DOS compiler had its own opinions on whether long double should be 8-, 16-, 32-, or 64-bit aligned and 80 or 128 bits long.
> All of these sound weird to me—most non-stupid (hello 802.2) protocols and hardware are going to have natural-aligned structure fields, so basically any mainstream (8-bit-byte, two’s complement, etc.) ABI is going to lay them out the same way, packed or not.
In the long ago year of 2015 I worked on a project where the same binary packet was:
1. Generated by an 8-bit microcontroller
2. Consumed by a 32-bit Cortex M3
3. Passed onto iPhones, Androids, Windows Phones, and Windows PCs running ObjC, Java, C#, and C++ respectively
4. Uploaded to a cloud provider
The phrase "natural aligned" has no meaning in that context.
> The phrase "natural aligned" has no meaning in that context.
The phrase “naturally aligned” as I’m accustomed to seeing it used refers to the alignment of a power-of-two-sized type (usually a scalar one) being equal to its size. Unless you’re working with, say, 18-bit or 24-bit integers (that do exist in obscure places), it does have a meaning, and unless you’re using non-eight-bit bytes that meaning is fairly universal (and if you’re not, your I/O is probably screwed up in hard-to-predict ways[1]).
At least for your items 2, 3, and 4—excluding Java and C# which are not relevant to TFA about C and are likely to use manual packing code—you have, let’s see,
- The bytes are eight bits wide, and ASCII byte strings have their usual meaning;
- The integer types are wraparound unsigned and two’s-complement signed, little-endian, with no padding bits or trap representations; they come in 8-bit, 16-bit, 32-bit, and 64-bit sizes, each aligned to its own size;
- The floating-point types are IEEE 754 single and double precision floats, little endian, respectively 32 bits and 64 bits in size and of identical alignment, though you should probably avoid relying on subnormals or the exact choice of NaNs;
- Structures and unions have the alignment requirement of their most strictly aligned member;
- The members of a structure are laid out at increasing offsets, with each member starting at the earliest offset permitted by its alignment (while the members of a union all start at offset zero as the standard requires);
- The structure or union is then padded at the end so that its alignment divides its size.
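The layout rules in the list above can be sketched as a small offset calculator. This is only an illustration under the "naturally aligned" assumption (alignment equals size for every scalar member); `align_up`, `layout_struct`, and `struct sample` are hypothetical names, not anything from a real ABI document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round `offset` up to the next multiple of `align`.
   `align` must be a power of two. */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical helper: place n scalar members of the given sizes, assuming
   each one is naturally aligned (alignment == size). Writes the member
   offsets to `offsets` and returns the padded size of the structure. */
static size_t layout_struct(const size_t *sizes, size_t n, size_t *offsets)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < n; i++) {
        size_t align = sizes[i];           /* assumption: natural alignment */
        offset = align_up(offset, align);  /* earliest permitted offset */
        offsets[i] = offset;
        offset += sizes[i];
        if (align > max_align)
            max_align = align;             /* struct takes the strictest alignment */
    }
    return align_up(offset, max_align);    /* tail padding */
}

/* The structure the size list {1, 4, 2, 8} is meant to model. */
struct sample {
    uint8_t  tag;
    uint32_t length;
    uint16_t flags;
    uint64_t stamp;
};
```

Feeding it the sizes `{1, 4, 2, 8}` gives offsets 0, 4, 8, 16 and a total size of 24. Comparing those numbers with `offsetof` and `sizeof` on `struct sample` is a quick way to see whether a given compiler and ABI actually follow these rules.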
If you avoid extended precision and SIMD types, the default ABI settings should get you completely compatible layouts here. (On an earlier ARM you might’ve run into mixed-endian floats, but not on any Cortex flavour.) Even bitfields would be entirely fine, except Microsoft bloody Windows had to be stupid there.
Honestly the only potential problem is 1, an unspecified 8-bit controller, and that only because the implicit integer promotions of standard C make getting decent performance out of those a bit of a crapshoot, leading to noncompliant hacks like 8-bit ints or 48-bit long longs. Still, if the usual complement of 8/16/32/64-bit integers is available, the worst you’re likely to have to do is spell out any structure padding explicitly.
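As a rough sketch of what "spelling out the padding explicitly" could look like: every gap the compiler would otherwise insert becomes a named byte array, and the expected layout is pinned with C11 `_Static_assert`. The `struct packet` and its field names are made up for illustration. It assumes 8-bit bytes and that no member needs stricter alignment than its size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet layout with all padding written out by hand, so the
   compiler has no gaps left to fill on its own. */
struct packet {
    uint8_t  kind;
    uint8_t  pad0[3];    /* explicit padding up to offset 4 */
    uint32_t sequence;
    uint16_t flags;
    uint8_t  pad1[6];    /* explicit padding up to offset 16 */
    uint64_t timestamp;
};

/* Compilation fails on any target where the layout comes out differently. */
_Static_assert(offsetof(struct packet, sequence) == 4, "sequence offset");
_Static_assert(offsetof(struct packet, flags) == 8, "flags offset");
_Static_assert(offsetof(struct packet, timestamp) == 16, "timestamp offset");
_Static_assert(sizeof(struct packet) == 24, "packet size");
```

Because every member already sits at a multiple of its size, a target that only requires weaker alignment (4-byte alignment for 64-bit types, say) should arrive at the same offsets. Byte order is a separate question that this does nothing about.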
I do my current work (embedded) on an architecture with the following properties:
- 8-bit bytes
- 16-bit aligned accesses to 32-bit types
- 32-bit aligned accesses to 64-bit types
- Struct alignment depends on the size of the struct (32-bit aligned for >= 64-bit structs)
It's a pretty common architecture in the automotive industry, though probably would be considered esoteric for other applications.
This is not the first platform I've encountered with "unnatural" alignment rules in the embedded space, and I'm sure it won't be the last. (The extra packing this allows is actually quite handy.)
I think we agree that this makes sense on some metaphysical level. The problem is that there are definitely platforms where the normal alignment isn't what you describe above. And there isn't to my knowledge a switch in GCC to force it to follow these rules on any given platform. There isn't __attribute__((natural_alignment)). But there is __attribute__((packed)).
Since C11 there is _Alignas(sizeof(T)), which forces one of the proposed meanings of alignment, and _Alignof(T), which queries the actual (i.e. natural, per another meaning) alignment. But, yeah, the argument upthread seems to be more about the implicit meaning of "natural" than anything else.
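A minimal sketch of the `_Alignas` approach, assuming a C11 compiler, a 32-bit `uint32_t`, and a 64-bit `double`. The `struct reading` is hypothetical. Each member is forced to an alignment equal to its size, whatever the platform default happens to be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: each wider member is explicitly aligned to its own
   size, which is the closest C11 gets to a "natural alignment" switch. */
struct reading {
    uint8_t channel;
    _Alignas(sizeof(uint32_t)) uint32_t raw;
    _Alignas(sizeof(double))   double   scaled;
};

_Static_assert(offsetof(struct reading, raw) == 4, "raw offset");
_Static_assert(offsetof(struct reading, scaled) == 8, "scaled offset");
_Static_assert(_Alignof(struct reading) == 8, "struct alignment");
_Static_assert(sizeof(struct reading) == 16, "struct size");
```

Note that `_Alignas` can only strengthen alignment, never weaken it, so this helps on targets that align less strictly than size, not on ones that align more strictly.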
On that note, something that caught me off guard once is that C11 _Alignof and GCC __alignof__ can differ: for example, on 32-bit x86, __alignof__(double) == 8 but _Alignof(double) == 4; however, __alignof__(struct { double d; }) == 4. Apparently __alignof__ gives the preferred alignment, whereas _Alignof gives the alignment required by the ABI.
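For anyone who wants to check this on their own toolchain, here is a small sketch that exposes both operators side by side. It assumes GCC or Clang, since `__alignof__` is an extension; the function and struct names are invented for the example. On most 64-bit targets all three values should simply agree.

```c
#include <assert.h>
#include <stddef.h>

struct wrapped_double { double d; };

/* Alignment the ABI requires for a double (C11 operator). */
static size_t required_align_double(void)
{
    return _Alignof(double);
}

/* Alignment the compiler prefers for a standalone double (GNU extension). */
static size_t preferred_align_double(void)
{
    return __alignof__(double);
}

/* Same GNU operator applied to a struct that only contains a double. */
static size_t preferred_align_wrapped(void)
{
    return __alignof__(struct wrapped_double);
}
```

Printing the three results with `%zu` when building with `-m32` on x86 is one way to reproduce the 4 versus 8 discrepancy described above.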
> the default ABI settings should get you completely compatible layouts here
That's not true! You must not assume that the alignment always equals the size of a type. For example, the SysV i386 ABI uses 32-bit alignment for 64-bit types (double, int64_t). The Microsoft x86 ABI, however, uses 64-bit alignment, as do the mainstream 64-bit ABIs (see https://stackoverflow.com/a/11110283).
If you want to share structs directly between different machines, you should use appropriate struct packing directives, unless you really know what you are doing.
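A minimal sketch of the packing approach, assuming GCC or Clang (`__attribute__((packed))` is an extension; MSVC spells it `#pragma pack`). The `struct wire_header` and `read_header` names are hypothetical. Packing removes the padding, but it does not settle byte order, so both ends still have to agree on endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire format: no padding anywhere, so the size is just the sum
   of the member sizes on every target that supports the attribute. */
struct __attribute__((packed)) wire_header {
    uint8_t  kind;
    uint32_t sequence;
    uint64_t timestamp;
};

_Static_assert(offsetof(struct wire_header, sequence) == 1, "sequence offset");
_Static_assert(offsetof(struct wire_header, timestamp) == 5, "timestamp offset");
_Static_assert(sizeof(struct wire_header) == 13, "header size");

/* Copy the header out of a raw byte buffer. memcpy avoids forming a
   misaligned pointer to the buffer contents. */
static struct wire_header read_header(const unsigned char *buf)
{
    struct wire_header h;
    memcpy(&h, buf, sizeof h);
    return h;
}
```

Members of a packed struct should be read by value, as here. Taking the address of a misaligned member and dereferencing it elsewhere can fault on targets that require aligned accesses.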
What's worse, MSVC's 32-bit x86 ABI reports an 8-byte alignment requirement (via __alignof) for 64-bit integer types, and its struct layout algorithm uses that alignment to determine padding, but those integers and structs are only aligned to 4 bytes when allocated on the stack! This has caused issues with Rust code trying to link with MSVC code [0], since Rust's standard library documentation asserts that properly aligned pointers have addresses that are always a multiple of the alignment used for struct layout.
This was just a placeholder, perhaps a bad example. I program a proprietary CPU architecture which does not require alignment, and for which the compiler naturally prefers to pack structs. Getting it to mimic Arm-style struct padding is much harder and more error-prone than just having the Arm pack everything.
Maybe you are right and we are heading for a One True Struct Layout in the future. Today I think it is still too scary to pass the same unpacked struct declaration to various compiler archs and hope they come up with the same interpretation.
I don’t think RV32 actually differs re alignment or struct layout, it’s just that with RV64 and Arm64 even the non-fixed-width names for the integer types are the same (LP64) except for Windows-on-ARM.