
Memory addresses are exactly analogous to array indices, and suffer from exactly the same semantic issues. After all, low-level memory is just an array of bytes.

We do have a convention that dereferencing a memory address returns the 8 bits to the right of that address. We've even optimized our hardware for that convention. But that's just a convention of the dereference operation; it's not fundamental to the addresses themselves.

I agree that a C pointer isn't analogous to an array index; that's because a pointer is a range, determined by a pair of memory addresses. One, stored at runtime, refers to the location before the first byte of the range. The other, implicitly derived from the runtime value and the size information in the pointer type, refers to the location after the last byte of the range. When we think of memory addresses as the article's indexes, and pointers as the article's ranges, everything falls into place.
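A minimal C sketch of that range reading, in case it helps. The helper names (`range_start`, `range_end`, `range_len`) are made up for illustration, and `int32_t` is just an arbitrary pointee type; the end address is simply what pointer arithmetic on `p + 1` yields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the address actually stored at runtime. */
static const unsigned char *range_start(const int32_t *p) {
    return (const unsigned char *)p;
}

/* Hypothetical helper: the end address, derived from the pointee type's size. */
static const unsigned char *range_end(const int32_t *p) {
    return (const unsigned char *)(p + 1);
}

/* Length of the range in bytes; equals sizeof(int32_t) for any valid p. */
static size_t range_len(const int32_t *p) {
    assert(p != NULL);
    return (size_t)(range_end(p) - range_start(p));
}
```

Calling `range_len(&x)` on any `int32_t x` gives `sizeof(int32_t)`, without ever reading the bytes in between.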

(Incidentally, please be careful calling out people for not understanding computers. C isn't actually the lowest level of computing, and pointers aren't as primitive as your post implies. When you call someone out, you need to be 100% clear and 100% right.)



> We do have a convention

^ there, semantic issue resolved

specific languages might reuse the word array for abstracting underlying optimizations, but calling an indexed object an array doesn't really change what an array is, any more than calling a dolphin a fish stops it from being a mammal

also, a pointer is a range only when paired with a type. otherwise a pointer is the index of a cell within the address space, and you want the address space to start at zero not because it's convenient, but because otherwise you couldn't reference the last cell (its address would overflow your word size) unless you do extra work to normalize the one-based address back to zero-based
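a toy sketch of the overflow argument, assuming a made-up machine with an 8-bit address word and 256 cells (`CELLS` and both function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

enum { CELLS = 256 };  /* every cell an 8-bit address word can name */

/* Zero-based: the last cell is number 255, which fits in uint8_t. */
static uint8_t last_zero_based(void) {
    assert(CELLS - 1 <= UINT8_MAX);
    return (uint8_t)(CELLS - 1);
}

/* One-based: the last cell would be number 256, which wraps to 0 in 8 bits. */
static uint8_t last_one_based(void) {
    return (uint8_t)CELLS;
}
```

with one-based numbering the last cell's address is indistinguishable from 0 unless every access subtracts 1 first.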

using cell deliberately because memory can be accessed by word, byte, page etc

anyway. what do you call a contiguous memory area that has a type and can be navigated by offset? that's an array. well then, are you going to use the pointer convention for it, or have a +1 to remove at every access operation?

and we're back again to what an array is. arbitrary memory constructs that happen to be called arrays shouldn't be taken into account, because they're the ones causing the whole confusion we're in, and we shouldn't be in it, because an array is an array and an indexed object is not


Gotcha. Perhaps the article would be better off using a word like list instead of array, to avoid the additional semantics that C attaches to that word.

In any case, I think I agree that dereferencing an address should return the byte to the right for the reason that you mention. That's a solid point, and I totally didn't think of that :) That's a really important property of the dereferencing operation.

I still feel like that doesn't make the mental model of memory-addresses-are-gaps-between-bytes any less valuable, though, nor does it mean that abstractions built on top of this memory model need to use the same conventions as the underlying system - that's the point of abstractions, after all :)


A pointer points to the start of its pointee - i.e. the point "just before" its pointee. That's how derived-to-base casts work. That's also how you can have "one-past-the-end" pointers, which are actually "just after" the relevant array.

for example, if you have the following structs

    typedef struct { void *key; } base;
    typedef struct { base b; int misc; int data[2]; } derived;

then derived is laid out as follows

    -----+------+---------+---------+---------+-----
     ... | base |  misc   | data[0] | data[1] | ...
    -----+------+---------+---------+---------+-----
         ^                ^                   ^
         |                |                   |
        base         derived.data      &derived.data[2]
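
A small C sketch of the two claims, reusing the structs above. The function names `as_base` and `data_end` are invented here; the only facts relied on are that a struct's first member sits at offset 0 and that a one-past-the-end pointer may be formed and compared but not dereferenced.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { void *key; } base;
typedef struct { base b; int misc; int data[2]; } derived;

/* Derived-to-base: the first member starts where the whole struct starts. */
static base *as_base(derived *d) {
    assert(offsetof(derived, b) == 0);
    return (base *)d;
}

/* One-past-the-end: sits "just after" data[1]; never dereference it. */
static int *data_end(derived *d) {
    return d->data + 2;
}
```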


Yep. A pointer is a memory range, but it supports a cast operation that allows you to change the pointer type, and therefore the end address. I think we agree, right? :)
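
Here is roughly what I mean, sketched in C with the structs from the parent comment. The helper names (`bytes_as_derived`, `bytes_as_base`) are hypothetical; the cast keeps the start address and only the implied end moves.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { void *key; } base;
typedef struct { base b; int misc; int data[2]; } derived;

/* Range length when the address is viewed through a derived pointer. */
static size_t bytes_as_derived(derived *d) {
    return (size_t)((char *)(d + 1) - (char *)d);
}

/* Same start address viewed through a base pointer: the end moves closer. */
static size_t bytes_as_base(derived *d) {
    base *b = (base *)d;
    assert((void *)b == (void *)d);  /* the cast does not change the start */
    return (size_t)((char *)(b + 1) - (char *)b);
}
```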


Except you can have a void* that does not have an end address.


Hmm, yeah, void pointers are weird. I'd be inclined to say that a void pointer's start and end addresses are the same and it's a range over 0 bytes of memory, and the fact that dereferencing fails is an artifact of the dereferencing operation itself... but I don't know enough about void pointer voodoo to know whether that's actually a consistent interpretation.
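
For what it's worth, here is a hedged C sketch of how code usually copes with that. It doesn't settle whether the zero-byte reading is consistent; it only shows that a `void *` carries a start and nothing else, so the length has to travel separately, as it does with `memcpy`. The names `copy_range` and `void_range_end` are made up. (ISO C doesn't allow `sizeof(void)`; GCC accepts it as an extension and treats it as 1.)

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A void * has a start but no pointee size, so the caller supplies the length. */
static void copy_range(void *dst, const void *src, size_t len) {
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
}

/* To compute an end address, first pick a type with a size (unsigned char). */
static const void *void_range_end(const void *start, size_t len) {
    return (const unsigned char *)start + len;
}
```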



