Hacker News

0 based indexing was arbitrarily chosen due to the memory model of old programming languages.


I strongly disagree. The decision was not in the least bit arbitrary. Language designers chose zero-based indexing because they believed its advantages outweighed the advantages of one-based indexing.

And they still do, in my opinion. In the end, no matter how high-level your programming language is, you'll eventually be computing offsets from a base address in RAM using indexes that are zero-based. Having that be consistent at every level has great value.


It was arbitrary in that they chose to use a leaky abstraction to make it easier for themselves to make arrays using their memory model. The first person to do this could also have subtracted one from the user’s index to get the memory offset.


Or they could have saved themselves a CPU cycle every time they needed to index an array.

Don't get me wrong, non-zero-cost abstractions are great when they improve developer ergonomics commensurate with how much they cost. Garbage collection at the expense of GC overhead? Yes please. Classes and methods and inheritance at the expense of vtable lookups? Sure!

One-indexed arrays at the expense of either giving up the memory of the first item or subtracting one on every index operation? Eh... I think I'd rather count from zero.
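The two costs weighed here can be made concrete with a small sketch. Python lists are 0-based, so a 1-based array built on top of one must either subtract one on each access or leave slot 0 unused. The class names `OneBasedSubtract` and `OneBasedPadded` are hypothetical, invented for this illustration only.

```python
class OneBasedSubtract:
    """1-based view that pays one subtraction on every access."""

    def __init__(self, items):
        self._data = list(items)

    def __getitem__(self, i):
        if not 1 <= i <= len(self._data):
            raise IndexError(i)
        return self._data[i - 1]  # the extra "- 1" on each lookup


class OneBasedPadded:
    """1-based view that wastes slot 0 instead of subtracting."""

    def __init__(self, items):
        self._data = [None] + list(items)  # slot 0 is never used

    def __getitem__(self, i):
        if not 1 <= i < len(self._data):
            raise IndexError(i)
        return self._data[i]  # no subtraction, one element of overhead

    def storage_slots(self):
        return len(self._data)
```

Both views return the same element for the same 1-based index; they differ only in whether the cost is paid in arithmetic or in storage.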


That's not what arbitrary means.

Arbitrary means there was no logical reason behind the choice; they just picked one. You can dislike the reason all you want. You can disagree with their choice, or think there are more important reasons to choose what you think they should have chosen, but that doesn't make their choice arbitrary.


It was chosen because that's how pointer math works and nothing has happened to change that.

I can see some benefits of using 1-base, but on the whole, if 0-base has to be used anyway, I'd rather only use the necessary version so I'm not making off-by-one errors when switching languages.


Yes, I understand pointer array math. Zero-based indexing is great for that. It’s still a leaky abstraction to make the implementation correct. I’m also not saying it’s not my preferred way, but it’s not for any really good reason; it’s cultural.


Any fixed array range (at least starting index) is an example of worse is better, pushing the work onto the user of the language to map whatever the actual range should be into a 0-based (or 1-based) form. Different problems have different natural ranges, and you shouldn't have to write a routine that manages that for you. It results in one of several possible outcomes (when your range isn't naturally starting at whatever the language forces on you):

1. You don't help your own users, instead they have to know everywhere that they need to do `histogram[c - 'a']` to calculate the actual index (clutters the code, chance to forget something).

2. You do help your own users, but now they have to remember the function/procedure call to access it: `inc_histogram(c)`. This means creating a plethora of setter/getter routines to gloss over the issue and to bring performance back to that of straight array access.

3. You do help your own users, but they realize it's "just" an array and they can use `histogram[c-'a']` to access values (and set them directly) bypassing your API.

Better languages let you do this:

  histogram[c]++
Done.
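Pascal and Ada, for example, let an array be declared with an index range such as 'a' to 'z'. Python has no such built-in, so the sketch below only approximates the idea: the `c - 'a'` mapping is written once inside a class, and call sites use the natural index. `RangedArray` and `letter_histogram` are hypothetical names made up for this illustration.

```python
class RangedArray:
    """Fixed-size array indexed by the characters lo..hi inclusive."""

    def __init__(self, lo, hi, fill=0):
        self._lo = ord(lo)
        self._data = [fill] * (ord(hi) - self._lo + 1)

    def _slot(self, key):
        slot = ord(key) - self._lo  # the "c - 'a'" mapping, written once
        if not 0 <= slot < len(self._data):
            raise IndexError(key)
        return slot

    def __getitem__(self, key):
        return self._data[self._slot(key)]

    def __setitem__(self, key, value):
        self._data[self._slot(key)] = value


def letter_histogram(text):
    """Count lowercase ASCII letters in text, ignoring everything else."""
    histogram = RangedArray("a", "z")
    for c in text:
        if "a" <= c <= "z":
            histogram[c] += 1  # call site uses the natural index
    return histogram
```

For instance, `letter_histogram("banana!")["a"]` evaluates to 3. Unlike a language with native ranged arrays, this version still pays for a method call on each access.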


If you have an array A with indices from S to T, you can use as its base address the address of A[S] minus S multiplied by sizeof(A[S]); your address calculations are then the same as with 0-based arrays. So I'd say it's a problem of the memory model of one old programming language, that is, C.
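The arithmetic described here can be checked with plain integers standing in for addresses. This is a sketch of the calculation only, not of any particular compiler; the function names `virtual_base` and `element_address` are assumptions made for the example.

```python
def virtual_base(first_element_address, first_index, element_size):
    """Address that a hypothetical element with index 0 would occupy."""
    return first_element_address - first_index * element_size


def element_address(virtual, index, element_size):
    """Same formula a 0-based array uses: origin + index * size."""
    return virtual + index * element_size


# An array with indices 1..10 of 4-byte elements whose first element,
# A[1], is stored at address 1000.
origin = virtual_base(1000, first_index=1, element_size=4)
address_of_first = element_address(origin, 1, 4)   # 1000
address_of_last = element_address(origin, 10, 4)   # 1036
```

The subtraction happens once, when the origin is computed, rather than on every access. In C source code, forming a pointer that lies before the start of an array is undefined behavior, so this adjustment is something for a compiler or runtime to do rather than for user code.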


Not really (the memory model of old machines was chosen too), but who cares?

0-based indexing is also slightly less prone to off-by-one errors. But again, none of that matters more than keeping it standard.
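One common illustration of the off-by-one claim, which may or may not be what the commenter had in mind, is converting a flat index into a row and column. The function names below are hypothetical.

```python
def to_row_col_zero_based(i, width):
    """0-based flat index -> 0-based (row, col)."""
    return i // width, i % width


def to_row_col_one_based(i, width):
    """1-based flat index -> 1-based (row, col)."""
    return (i - 1) // width + 1, (i - 1) % width + 1
```

The 1-based version needs four adjustments by one, and each of them is a place where an off-by-one error can slip in.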


Exactly, it’s an arbitrary cultural standard now.


That's the point I was making. It doesn't matter if it's arbitrary. The world runs on 0-based index languages. I have no interest in giving that up for another arbitrary number. There are literally zero benefits and only downsides in getting used to it for any software engineer.


But software engineers aren’t the only ones who program these days. It’s true that many popular languages used by developers are 0-based, but the most popular language, Excel, is 1-based.

Non-devs seem to prefer 1-based indexing in my experience. As a teacher of Java and C++ to new programmers, I find the 0-based nature of those languages is always a sticking point; it causes novice programmers to write incorrect code, which frustrates them and leaves them with the perception that programming is filled with arbitrary rules that are only there to confuse people. And they’re not so wrong, seeing as even programmers here can argue endlessly about 1 vs 0 indexing with no definitive answer.


If it was chosen for a reason, then it wasn't chosen arbitrarily.



