I think it means there is a small chance that enough hard drives fail at the same time that no surviving copies of the affected data remain.

They make so many backups so quickly that there is only a 0.00000000001% (I didn't count the zeros) chance of this occurring.
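For what it's worth, the published figure for S3 is 99.999999999% ("eleven nines") annual durability per object, which puts the loss probability at 10^-11 per object per year. A quick Python sanity check:

    # S3's published design target: 99.999999999% ("eleven nines")
    # annual durability per object.
    durability = 0.99999999999
    p_loss = 1 - durability       # annual loss probability per object
    print(f"{p_loss:.0e}")        # -> 1e-11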



Which of course means that (if they're telling the truth) the probability of losing your data mostly comes from really big events: collapse of civilization, global thermonuclear war, Amazon being bought by some entity that just wants to melt its servers down for scrap, etc. (Their probability is clearly a lot more than 10^-11 per year; the big bang was only on the order of 10^10 years ago.)


There's some clever wordplay/marketing here... "designed to provide 99.99..99%" means that the theoretical model of the system tells you that you lose 1 in X files per year when everything is working as modeled (e.g. "disks fail at expected rate as independent random variables"). If something not in the model goes wrong (e.g. power goes out, a bug in S3 code), data can be lost above and beyond this "designed" percentage. The actual probability of data loss is therefore much, much higher than this theoretical percentage.
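As a sketch of the kind of "designed" model in question (all parameters here are assumptions, not AWS's actual numbers): with r independent replicas, per-disk annual failure probability f, and rebuilds finishing within a repair window w, an object is lost only if the remaining replicas all die inside the same window:

    # Toy durability model, illustration only -- every parameter is assumed.
    r = 3             # replication factor
    f = 0.02          # annual failure probability of one disk
    w = 24 / 8760     # 24-hour repair window, as a fraction of a year

    # Some replica fails (~r*f times a year); the object dies only if the
    # other r-1 replicas also fail before the rebuild completes.
    p_loss = r * f * (f * w) ** (r - 1)
    print(f"modeled annual loss probability per object: {p_loss:.1e}")  # ~2e-10

A number like this is only as good as the independence assumption; a correlated event (power loss, an S3 bug) invalidates the whole model rather than just nudging f.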

A more comical way to look at it: the percentage is AWS saying "to keep costs low, we plan to lose this many files per year; when we screw up and things don't go quite to plan, we lose a _lot_ more."


That figure is per object. So although the chance of losing any particular object is tiny, the chance of you losing something is proportional† to the number of objects (see the sketch below). Still extremely small.

†roughly proportional if you have << 1e11 objects
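Concretely, with p = 1e-11 per object, the chance of losing at least one of n objects is 1 - (1-p)^n, which is ≈ n·p until n approaches 1/p:

    import math

    # P(lose at least one of n objects), each lost independently with p = 1e-11.
    p = 1e-11
    for n in (1e3, 1e6, 1e9, 1e11):
        p_any = -math.expm1(n * math.log1p(-p))   # stable 1 - (1-p)**n
        print(f"n = {n:.0e}: P(any loss) ~ {p_any:.2e}")

At n = 1e11 this gives about 0.63 rather than 1.0, which is where the "roughly proportional" caveat kicks in.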


Yes. Though I bet the real lossage probabilities are dominated by failure events that take out a substantial fraction of all the objects there are, and that happen a lot more often than once per 10^11 years.


Agreed. More likely a catastrophic, significant loss for a small number of customers than a tiny fractional loss spread across a large number of them.

Similar deal for hard drive bit error rates, where the quoted average BER doesn't necessarily reflect what happens in the real world. For example, an unrecoverable read error loses 4096 bits (a 512-byte sector) or 32768 bits (a 4K sector) all at once, rather than individual bits flipping randomly over a long period.
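Back-of-the-envelope, assuming the commonly quoted consumer-drive spec of one unrecoverable error per 10^14 bits read (vendor figures vary):

    # Expected unrecoverable read errors (UREs) when reading a 10 TB drive
    # end to end, assuming a quoted rate of 1 URE per 1e14 bits read.
    ber = 1e-14               # UREs per bit read (assumed vendor spec)
    drive_bits = 10e12 * 8    # 10 TB in bits
    print(f"expected UREs per full read: {ber * drive_bits:.1f}")  # ~0.8
    # ...and each URE takes out a whole sector at once:
    print(f"bits lost per 4K-sector URE: {4096 * 8}")              # 32768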


More important than the speed of those backups is this:

Amazon Glacier synchronously stores your data across multiple facilities before returning SUCCESS on uploading archives.
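In other words, the ack is withheld until every copy exists. A minimal sketch of that pattern (illustrative names, not AWS's actual code):

    # Illustration only: "synchronous" means SUCCESS is not returned until
    # every facility has confirmed its copy of the archive.
    class Facility:
        def __init__(self) -> None:
            self.store: list[bytes] = []

        def write(self, archive: bytes) -> None:
            self.store.append(archive)    # stand-in for a durable write

    def upload(archive: bytes, facilities: list[Facility]) -> str:
        for facility in facilities:
            facility.write(archive)       # block until this copy is durable
        return "SUCCESS"                  # only after all copies exist

    print(upload(b"backup", [Facility() for _ in range(3)]))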



