> A compiler can generate the same code either way by shifting by 1 at compile t...

ModernMech · on Sept 29, 2022

> subtract an offset from the pointer to the array

Yeah, you shift the array, not the index. But the memory-access instruction on many architectures already takes a constant offset, so whether that offset is 1 or 0, it's still just one instruction. Here's a community of programming language developers talking about exactly this question: https://www.reddit.com/r/ProgrammingLanguages/comments/x95ni...

The consensus there is: it's really not something to worry about, as it all comes out in the wash. Performance is certainly not an argument for choosing one versus the other, as there are other dimensions to the choice, like target user familiarity/comfort.

klodolph · on Sept 30, 2022

It sounds like you're definitely misunderstanding what I'm saying.

The linked thread is about x86, an architecture with unusually rich addressing modes. People make compilers for all sorts of languages and architectures. The fact that in SOME cases it's simpler not to have an offset means that there is some pressure to use 0 as the base for your array. If you pick x86 or amd64 as your benchmark, then you're going to get a very narrow picture.

Just to pick an example, consider Arm. I'm looking at an older version of the architecture right now, but you cannot use both a register offset and an immediate offset at the same time. So if your arrays start at index 1, then you must either adjust the index or adjust the array.
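To make the two options concrete, here is a minimal C sketch of what a compiler for a 1-based language could emit. The names (`get1_adjust_index`, `OneBasedView`, `make_view`, `get1_adjust_base`) are hypothetical, and the shifted base is modelled with `uintptr_t` arithmetic because forming a pointer one element before the start of an array is undefined behavior in C itself. The integer-to-pointer round trip is implementation-defined; it is assumed to behave as on mainstream flat-memory targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Option 1: adjust the index on every access.
   Where the addressing mode cannot fold in the constant,
   this costs an extra subtract per access. */
int get1_adjust_index(const int *first, size_t i) {
    assert(i >= 1);               /* 1-based: index 0 is invalid */
    return first[i - 1];
}

/* Option 2: adjust the array once, then index with the raw 1-based value.
   shifted_base holds the address of element 1 minus sizeof(int). */
typedef struct {
    uintptr_t shifted_base;
} OneBasedView;

OneBasedView make_view(const int *first) {
    /* Assumption: done in integer space to avoid an out-of-bounds pointer. */
    OneBasedView v = { (uintptr_t)first - sizeof(int) };
    return v;
}

int get1_adjust_base(OneBasedView v, size_t i) {
    assert(i >= 1);
    /* base + i * element_size: a plain register-offset access. */
    return *(const int *)(v.shifted_base + i * sizeof(int));
}
```

With `int a[3] = {10, 20, 30}`, both `get1_adjust_index(a, 1)` and `get1_adjust_base(make_view(a), 1)` read `a[0]`. Option 2 moves the cost out of the access path, but the shifted base has to be computed and kept somewhere, which is the overhead being discussed.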

Not all language designers follow the zero-overhead principle like C++ does, but you can see how this would cause some language designers to decide that 0 is a more natural array index because it results in the lowest overhead in the most scenarios (across different architectures, not just x86).

Same thing explains why little endian architectures won over big endian. Big endian is definitely more convenient to debug, it's more convenient to look at hex dumps. However, with little endian, the correspondence between memory and registers is slightly simpler. The difference is not so extreme that big endian architectures do not exist, but it is large enough that we've "drifted" into a world where big endian architectures are all but gone, and you mostly find them these days in network appliances.
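One way to read "the correspondence between memory and registers is slightly simpler" is that, on a little-endian machine, a narrower load from the same address yields the low-order bits of the wider value. The C sketch below illustrates that reading only; the function names are hypothetical and it is not a claim about any particular architecture's design history.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the least significant byte first. */
int host_is_little_endian(void) {
    uint32_t probe = 1;
    uint8_t first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Reads 16 bits from the same address where a 32-bit value is stored.
   Little endian: result is the low half  (value & 0xFFFF).
   Big endian:    result is the high half (value >> 16). */
uint16_t load16_at_same_address(uint32_t value) {
    uint16_t narrow;
    memcpy(&narrow, &value, sizeof narrow);
    return narrow;
}
```

For the value `0x11223344`, a little-endian host returns `0x3344` from `load16_at_same_address`, which matches truncating the value in a register, while a big-endian host returns `0x1122`.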

And to be clear, I'm not saying that the advantage of 0-based indexes is some massive advantage that makes it a clear winner. I'm just saying that there's slight pressure to use 0-based indexes.

ModernMech · on Sept 30, 2022

> If you pick x86 or amd64 as your benchmark, then you're going to get a very narrow picture.

99% of the language designers I know target x86 first and foremost. But in the given link only one person mentioned x86, while others were speaking generally and voiced an opinion that it really doesn’t matter.

I take your point that other designers might have other pressures, but the choice of 1 vs 0 in the common case (not a narrow slice as you seem to suggest) comes down to other factors. The pressure to conform to developer expectations for 0-based indexing is much greater than anything else; as demonstrated in this HN thread, some people won’t even consider using a language if it has 1-based indexing. Other communities face persistent confusion with 0-based indexing. That provides far more pressure than the ARM instruction set for language devs that I know. Maybe the ones you know feel differently.

Your point is taken on big-endian vs. little-endian, but the same has not happened for indexing. Languages haven’t settled on either; instead they have bifurcated into dev-targeted and end-user-targeted languages, with the former being 0-based and the latter being 1-based.