
This is a fallacy. Infinity does not contain all possibilities.

There are an infinite number of integers. You can start at 10 and count up forever, never running out of integers. But no matter how high you count, you’ll never count to “orange” - “orange” is not contained in the sequence of infinite integers.

You’ll need to first prove that every sequence of integers is contained somewhere in pi, since the number of possible integer sequences grows faster than the “space” for sequences in pi. In other words, I can always pick a digit that creates a valid, non repeating, integer sequence from the pool of possible sequences while never creating the integer sequence “123456789123456789123456789.” You’d need to prove that pi doesn’t do this.

Even if pi does contain every sequence of integers and you could map that to bytes which, in turn, maps to a file, this would not compress.

Your metadata directory would be larger than the raw files unless you get very lucky and your file is very early in the sequence of pi.

A byte can represent 256 unique values. 256 unique values can not compress to less than a byte. So if your index is a digit of pi where your file starts, your file starts after some other number of files. Your index is going to be the index inside of the address space of “all possible files.” This will get large very quickly.
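A rough counting sketch of the pigeonhole argument above, for anyone who wants to see the numbers. The function names are made up for illustration and have nothing to do with the project. It counts the files of exactly n bytes and the byte strings strictly shorter than n bytes that could act as a shorter index or description.

```python
def files_of_length(n: int) -> int:
    """Number of distinct files that are exactly n bytes long."""
    return 256 ** n


def shorter_descriptions(n: int) -> int:
    """Number of distinct byte strings strictly shorter than n bytes,
    counting the empty string."""
    return sum(256 ** k for k in range(n))


for n in (1, 2, 4, 8):
    files = files_of_length(n)
    shorter = shorter_descriptions(n)
    # The ratio is the best-case fraction of n-byte files that could be
    # given any shorter description at all.
    print(f"n={n}: files={files} shorter={shorter} ratio={shorter / files:.6f}")
```

Since the shorter strings always number (256^n - 1) / 255, fewer than 1 in 255 files of a given length can get any shorter index, whichever digit sequence is being indexed into.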



> This is a fallacy. Infinity does not contain all possibilities.

From the link: One of the properties that π is conjectured to have is that it is normal, which is to say that its digits are all distributed evenly, with the implication that it is a disjunctive sequence, meaning that all possible finite sequences of digits will be present somewhere in it. If we consider π in base 16 (hexadecimal), it is trivial to see that if this conjecture is true, then all possible finite files must exist within π. The first record of this observation dates back to 2001.

> Your metadata directory would be larger than the raw files unless you get very lucky and your file is very early in the sequence of pi.

This is very, very clearly a tongue-in-cheek project and not intended as a practical way to store files.


> One of the properties that π is conjectured to have is that it is normal, which is to say that its digits are all distributed evenly

.(0123456789) has this property too.


Being normal requires not only the correct frequencies of single digit substrings, but of k-digit substrings as well, for all k. One number that is normal to base 10 is Champernowne's number [1].

[1] https://en.wikipedia.org/wiki/Champernowne_constant
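A small sketch of that difference, using finite prefixes only, so it illustrates the definition rather than proving anything about normality. The helper name `ngram_counts` is just for illustration. It compares a prefix of 0.(0123456789) with a prefix of Champernowne's number.

```python
from collections import Counter


def ngram_counts(digits: str, n: int) -> Counter:
    """Count every overlapping n-digit substring of `digits`."""
    return Counter(digits[i:i + n] for i in range(len(digits) - n + 1))


# Prefix of 0.(0123456789): 1000 digits of the repeating block.
repeating = "0123456789" * 100
# Prefix of Champernowne's number 0.123456789101112...
champernowne = "".join(str(i) for i in range(1, 1000))

# Distinct 1-digit and 2-digit substrings seen in each prefix.
print(len(ngram_counts(repeating, 1)), len(ngram_counts(repeating, 2)))        # 10 10
print(len(ngram_counts(champernowne, 1)), len(ngram_counts(champernowne, 2)))  # 10 100
```

The repeating decimal has perfectly even single-digit counts, but only 10 of the 100 possible two-digit strings ever show up, which is why it fails the k-digit requirement.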


.(0123456789) is rational, though. I think the implication is that if pi is normal in addition to its other properties — specifically irrationality — that it has to contain all possible digit sequences.

Edit: oops, I forgot about n-length sequences of digits, .(0123456789) is definitely not normal, this is why I’m not a mathematician.


Pi may contain any sequence of digits but it's not proven that it contains all sequences of digits. IMHO, it's possible to construct a sequence of digits which will not be in Pi. It's easy to construct such a sequence for the first N digits of Pi, for any N (for example, N zeroes).


Constructing things for N digit subsets can be misleadingly easy.

It's easy to construct a sequence of digits that is larger than any N digit number. But it's obviously impossible for any such number to be the largest number.


We’re talking about a hypothetical scenario in which pi is normal, though.


This reasoning is true of every real number, yet it's been proven that almost all real numbers are absolutely normal and therefore contain every finite sequence of digits.


It's a slight misstatement of what a normal number is that changes it significantly. It isn't that each digit is evenly distributed, but that for every n, every string of n digits is evenly distributed.


>This is a fallacy.

It is not, since the readme considers it an (unproven) working assumption.

>This will get large very quickly.

What? I can not believe that this joke repo, which even explains that saving a few hundred bytes takes minutes, is not in fact practical.


> you’ll never count to “orange” - “orange”

Ok but I don't see how that relates to the idea of storing sequences of numbers (the basic function of a filesystem) in Pi. Orange is not a number so it doesn't make sense to look for it in any sequence of numbers. "Orange" can however be represented as a sequence of ASCII characters, and if every sequence of numbers is contained in Pi, then an ascii representation of any string can also be found.

> This will get large very quickly.

Right, but here's a thought...

Suppose the place in Pi where your sequence is located is huge. You could maybe (probably) find a sequence earlier in Pi that is a pointer to your actual location :) So the metadata would need like 3 things to make this work. 1) length of your data, 2) checksum of your data and 3) pointer into Pi. If hash of (pointer+length) doesn't match checksum, it's a pointer (which may point to a different pointer etc).

That might make it compress a bit better. Although you may need a huge amount of RAM just to dereference the pointers.


> Suppose the place in Pi where your sequence is located is huge. You could maybe (probably) find a sequence earlier in Pi that is a pointer to your actual location

If the sequence of bytes necessary to represent the file encodes to an integer index into pi, and that integer index encodes to a sequence of bytes larger than the original file, I have no reason to believe trying to repeat that process would result in anything other than an even larger integer index.
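Here is a minimal sketch for poking at this empirically. Everything in it is an assumption for illustration: it works on decimal digits rather than bytes, searches only the first 5000 digits (computed with Machin's formula), and the names `pi_digits`, `first_index` and `pointer_chain` are hypothetical, not the repo's API.

```python
from typing import List, Optional


def pi_digits(n: int) -> str:
    """Return '3' followed by the first n decimal digits of pi, using
    Machin's formula pi = 16*arctan(1/5) - 4*arctan(1/239) on integers."""
    guard = 10  # extra digits to absorb truncation error
    scale = 10 ** (n + guard)

    def arctan_inv(x: int) -> int:
        # scale * arctan(1/x) via the alternating Taylor series.
        power = scale // x
        total = power
        x2 = x * x
        k = 3
        sign = -1
        while power:
            power //= x2
            total += sign * (power // k)
            sign = -sign
            k += 2
        return total

    pi_scaled = 16 * arctan_inv(5) - 4 * arctan_inv(239)
    return str(pi_scaled // 10 ** guard)


DIGITS = pi_digits(5000)  # arbitrary search window


def first_index(data: str) -> Optional[int]:
    """Index of the first occurrence of `data` in DIGITS, or None."""
    i = DIGITS.find(data)
    return i if i >= 0 else None


def pointer_chain(data: str, max_hops: int = 5) -> List[str]:
    """Replace data by its index, then that index by its own index, etc.
    Stops when nothing is found or the pointer stops changing."""
    chain = [data]
    for _ in range(max_hops):
        idx = first_index(chain[-1])
        if idx is None or str(idx) == chain[-1]:
            break
        chain.append(str(idx))
    return chain


for data in ("14", "2718", "999999"):
    chain = pointer_chain(data)
    print(data, "->", chain, "lengths:", [len(c) for c in chain])
```

Comparing the lengths along each chain shows whether any hop actually saved digits. A few lucky strings do shrink, but by the counting argument most cannot, and a string that is missing from the window has no pointer at all.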


This is such a hackernews comment lmao, storing all your files inside the decimal representation of pi is obviously a joke project.


> But no matter how high you count, you’ll never count to “orange” - “orange” is not contained in the sequence of infinite integers.

Well, certainly not with that attitude!

Meanwhile, in my highly efficient, proprietary fruit string compression scheme, "orange" is represented by the integer 8. (Don't confuse it with "blood orange" though; that's 26.)


My prime is less legal than your irrationals

https://en.m.wikipedia.org/wiki/Illegal_number

I want my ShorFS where data is stored as factors.


> you’ll never count to “orange” - “orange” is not contained in the sequence of infinite integers.

What is a string, but a sequence of bytes? What is a sequence of bytes, but a decomposed integer?

"orange" = 111,494,907,916,911
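For reference, that number is what you get by reading the ASCII bytes of "orange" as a little-endian integer. A quick sketch to check it (the byte order is my inference from the value, the comment doesn't say):

```python
word = "orange"

# Interpret the ASCII bytes as one little-endian unsigned integer.
as_int = int.from_bytes(word.encode("ascii"), byteorder="little")
print(f"{as_int:,}")  # 111,494,907,916,911

# The mapping is reversible, so no information is lost.
back = as_int.to_bytes(len(word), byteorder="little").decode("ascii")
print(back)  # orange

# Big-endian is an equally valid convention and gives a different integer.
as_int_big = int.from_bytes(word.encode("ascii"), byteorder="big")
```

The integer only means "orange" once both sides agree on the encoding and byte order.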


No. Encoding is not a clever way around infinity.

You can encode orange in RGB and HSL too, but the set of all integers still does not contain the concept of an orange. You’ve just assigned meaning to an integer.

In the same way you can’t count to orange, you can never start at 1 and count up to -1. There are an infinite number of different integers greater than 1, and that infinity does not contain -1.

SciFi likes abusing this. Just because there are an infinite number of universes doesn’t mean there is a universe that contains anything you can dream up Rick and Morty style.


> You can encode orange in RGB and HSL too, but the set of all integers still does not contain the concept of an orange.

Nobody claimed "the concept of an orange" can be found. The claim that the word "orange" can't be found, however, is provably false.

If we're going by this logic, then nothing but 0 and 1 can ever be stored on any medium, because a hard drive can't possibly store "the concept of an orange" either.

> Just because there are an infinite number of universes doesn’t mean there is a universe that contains anything you can dream up Rick and Morty style.

And you know this with certainty.... how? With enough branches in a truly infinite timeline, the likelihood of there being a combination which produces a specific outcome is quite high.


> And you know this with certainty.... how? With enough branches in a truly infinite timeline, the likelihood of there being a combination which produces a specific outcome is quite high.

It is not. It is quite low. The number of possible states grows at a faster rate than the rate necessary to maintain “infinity.” It’s only true if you assert that all possible states at a branch are added to the set, which is a tautology.

A fun puzzle. Let’s say you have an empty set whose members are sets. You add a set to it of infinite size (like the set of all positive integers). Next, you add another set to it whose size is also infinite but whose members are not contained in the first set. Now you repeat this process an infinite number of times.

You have an infinite set of infinite sets. The question is: is such a set possible and, if so, does your infinite set of infinite sets necessarily contain all possible infinite sets?


Most pitches for Everett branches (which are a specific enough concept that it's silly to get all technical, but whatever) hold that there is an amplitude distribution over the entire configuration space of the universe. While the proposed mechanism of decoherence implies a sort of discreteness, every single possible configuration would have non-zero amplitude. Of course there's no specific evidence for this over, say, collapse theories (or something stranger, like Bohmian mechanics), but there's certainly no philosophical issue. Not sure what you think ZFC has to do with physics, but an empty set with members is a tad nonexistent.


> It’s only true if you assert that all possible states at a branch are added to the set

That is generally how "multiverses"/branching timelines are handled, yes.


> the set of all integers still does not contain the concept of an orange.

Nor can any file system store the "concept of an orange".

File systems are really just pointers to numbers, if you think about it.


Did you reply to the wrong person? I actually said just as much in my comment.

> If we're going by this logic, then nothing but 0 and 1 can ever be stored on any medium, because a hard drive can't possibly store "the concept of an orange" either.


Ah yes, sorry! This should have gone one level up.


What if I created a new system of encoding words, where the digit "1" represents the word "orange"? Does that disprove the claim, just as encoding it with bytes does?


> the set of all integers still does not contain the concept of an orange. You’ve just assigned meaning to an integer.

We all do, all day, every day. Isn’t that exactly what computer science and mathematics is?

> Just because there are an infinite number of universes doesn’t mean there is a universe that contains anything you can dream up Rick and Morty style.

Sure, there are many different kinds of infinities, but what does that have to do with pi and file systems?


What if a digit (say, 9) only appears in pi up to a certain decimal point? If your sequence contains that digit and doesn't appear by that point, pi won't contain your sequence.


If pi is indeed normal (whether proven or not), that would never happen. A normal number, by definition, contains an even distribution of digits. There would be no "last 9" because there would have to be more after to satisfy the distribution. Basically, the frequency of nines must match that of each of the nine other digits as the number of digits goes to infinity.

Even if the first million digits of some unnamed number were all nines (with no more afterward), it would not be normal because the distribution of nines, as the number of digits goes to infinity, would go down.


Right, the point is that we don’t know whether pi is normal.


According to https://math.stackexchange.com/a/216578, “𝜋 is not known to have this property, but it is expected to be true.”


The repo takes it as a working assumption that this property is indeed true.


Come on, the length of the index into Pi is anywhere between 0 and infinity. I don't think anyone feels defrauded. If anything this is a gambler's dream!


If π is a normal number, then it contains all possible sequences. However, it is currently unknown whether π is a normal number. I agree with you regarding the fact that this might not be a great compression algorithm, but it is certainly fun.


> since the number of possible integer sequences grows faster than the “space” for sequences in pi

Is this true? Aren't the digits of pi and the number of possible (finite) integer sequences both countable?
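One way to see that the finite digit sequences are countable is to list them shortest first, which gives every sequence a definite position. A sketch of that enumeration (function names are illustrative only):

```python
from itertools import count, islice, product


def all_digit_strings():
    """Yield every finite, non-empty digit string: by length, then in order."""
    for length in count(1):
        for combo in product("0123456789", repeat=length):
            yield "".join(combo)


def position(s: str) -> int:
    """Zero-based position of digit string `s` in the enumeration above."""
    shorter = sum(10 ** k for k in range(1, len(s)))  # all strings shorter than s
    return shorter + int(s)                           # rank among same-length strings


print(list(islice(all_digit_strings(), 12)))
# ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '00', '01']
print(position("01"))  # 11
```

Every finite sequence gets exactly one natural-number position, the same kind of infinity as the digit positions of pi. The set of infinite sequences is the uncountable one, but files are finite.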


Pi is a function in reality, not a number.


Pi is a number. You can approximate it to an arbitrary degree of precision by computing a function, but the map is not the territory.


I tried something similar, finding all the integers in pi:

https://pi.paradite.com/


> Even if pi does contain every sequence of integers and you could map that to bytes which, in turn, maps to a file, this would not compress. [...] This will get large very quickly.

Psst... That was the joke :)


The set of reals is larger than the set of integers, despite both being infinitely large.



