Elegant six-page proof reveals the emergence of random structure (quantamagazine.org)
589 points by davidvarela_us on April 26, 2022 | 166 comments


Jinyoung Park's Path to Math video on IAS is a really wonderful 3 minute story of how she came to study threshold behavior in random discrete structures, just like this work. Highly recommended. https://www.ias.edu/ideas/paths-math-jinyoung-park


Since we regularly get comments on HN about people wanting to study math in some capacity and about the challenges in doing so - I thought this was a really great quote from this video:

Jinyoung (after having been a secondary school Math teacher for 7 years):

"... there was a big obstacle in studying mathematics or pursuing my career in mathematics, which was me, myself. Because I just couldn't stop thinking that oh I'm too old and I started too late or I didn't learn enough amount of mathematics in college ..."

and now here she is with some world class work to her name.


From the link:

"Jinyoung is a first-generation college graduate, and after seven years as a secondary school teacher in South Korea, she went on to earn a mathematics Ph.D. from Rutgers University. Jinyoung’s story demonstrates the importance of role models at all levels of one’s education and the fact that it is never too late to begin anew."


And yet people will shit on DEI efforts. Incorporating more folks into the scientific endeavor helps combat things like stagnation in science precisely because it reduces academic incest.


That was not the topic of OP's post, or of this thread really.


DEI efforts discriminate against Asians like Jinyoung, so yes, I will shit on them.


Not necessarily, given that DEI efforts are not one-dimensional. They consider factors like first-generation status, gender, class background, etc.

(I am an Asian male, so I really have nothing to selfishly benefit from promoting DEI efforts)


Well, one problem when discussing DEI is that there are many implementations of it. My own department only considers race or gender, and really only focuses on gender, although I know many DEI programs really do take a more holistic view of what diversity means.


I am back in graduate school at 31 after a few years working and I also wondered if I was too old to go back. It is definitely possible if you have the drive to do it. I can say it is both easier and harder than my first time around. My drive to succeed is much higher, my focus is sharper, but my overall energy and free time budgets are more limited (kids). I would recommend it if one can and wants to do it.


Math is one of those fields where there seems to be a clear correlation betwixt youth and creativity. It's just the consequence of having fresh eyes looking at an ancient problem ;)

But experience is clearly necessary due to the sheer vastness of the topic. I find history and philosophy of math to be just as engaging and often crucial. Understanding the origins of probability in 17th century gambling houses, for example, sheds enormous light on the subject.

To that end, I think something like "Mathematical Evolutions" might be the best place to start:

"To tell the tale of The Isometric Problem, one must begin by quoting Virgil..."

https://www.maa.org/sites/default/files/pdf/upload_library/2...


I'm not sure the peak is driven by fresh eyes. I think fluid intelligence is a big requirement and the brain peaks in that regard in your 20's. It doesn't mean you can't still make great progress later in life, but it sure does get harder in my personal experience.


The video doesn't say (maybe because the answer is 0, but important if not) how much math Park did between the basic college math for the BS Math Education degree, and starting PhD at age ~30.

Did she study advanced math in college, or as a hobby during the pre-PhD years, or did she spend 10 years studying K-12 math very thoroughly while teaching, and then start going deeper?


[flagged]


Jinyoung Park is literally one of the co-authors of the paper in question.


She was at the Institute for Advanced Study at Princeton. It's -the- most elite intellectual society. The top 1% of the 1% academics have to compete for admission. If that doesn't wash away ill-formed biases, I don't know what can! What she accomplished is no joke!! Huge respect tbh.


As I already commented in the first submission of this article:

Now that's something I can really use in one of my current problems: calculating a minimal perfect hash by creating acyclic random graphs. This conjecture now gives thresholds for when to stop trying to create random graphs and start afresh.

This is needed e.g. for large perfect hashes in gperf or for integer sets in compilers, such as for C switch statements with Unicode. A switch in C is only efficient for dense tables, not sparse ones. Sparse tables are linear, but could be constant. A switch in Pascal was constant, but so far GCC and LLVM haven't cared; neither has gperf.

see http://ilan.schnell-web.net/prog/perfect-hash/algo.html and http://programming.sirrida.de/hashing.html#c_case_of_string
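A rough sketch of the retry loop in Python (the seeded hash functions and the 2.5x vertex-to-key ratio below are illustrative placeholders, not what gperf or CHM92 actually use):

    import random

    def is_acyclic(num_vertices, edges):
        """Union-find check: the edge set is acyclic iff no edge joins two
        vertices that are already connected."""
        parent = list(range(num_vertices))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:                  # this edge would close a cycle
                return False
            parent[ru] = rv
        return True

    def find_acyclic_key_graph(keys, num_vertices, max_tries=100):
        """Retry with fresh random seeds until the key graph is acyclic."""
        for attempt in range(1, max_tries + 1):
            seed1, seed2 = random.random(), random.random()
            edges = [(hash((seed1, k)) % num_vertices,
                      hash((seed2, k)) % num_vertices) for k in keys]
            if is_acyclic(num_vertices, edges):
                return attempt, edges
        return None, None

    keys = [f"key{i}" for i in range(1000)]
    tries, _ = find_acyclic_key_graph(keys, num_vertices=2500)   # ~2.5x the keys
    print("acyclic key graph found after", tries, "tries")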


Isn't the proof for Hamiltonian cycles? From what I could understand from your link to CHM92, it only requires a cyclic graph to stop, which isn't necessarily supposed to touch all vertices. Is that right?


The Hamiltonian cycle is only a motivating example. The conjecture which was proven is much more general.

https://gilkalai.wordpress.com/2022/04/02/amazing-jinyoung-p...


Huh, I was thinking the same thing about a related problem: compressed static maps by solving sparse matrices: https://docs.rs/compressed_map/0.1.0/compressed_map/

To get a concrete bound though, you'd probably need to know what K is.


A nice 'visual' illustration of the threshold is found in percolation theory https://en.wikipedia.org/wiki/Percolation_theory : if you have an infinite grid of square rooms with either a wall or a door between each pair of adjacent rooms, there is a shift in what is connected when the percentage of doors gets just over 50%, as visualized in this animation: https://en.wikipedia.org/wiki/Percolation_theory#/media/File...
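To see the shift numerically on a finite grid (finite, so the transition is smeared out compared to the infinite lattice), here's a rough sketch that estimates the chance that open doors connect the left wall of the grid to the right wall:

    import random

    def crosses(n, p):
        """n x n grid of rooms; each wall between adjacent rooms is a door
        with probability p. Return True if open doors connect the left
        column to the right column (a simple flood fill over rooms)."""
        door = {}
        for r in range(n):
            for c in range(n):
                if c + 1 < n:
                    door[((r, c), (r, c + 1))] = random.random() < p
                if r + 1 < n:
                    door[((r, c), (r + 1, c))] = random.random() < p

        def neighbors(cell):
            r, c = cell
            for other in ((r, c - 1), (r, c + 1), (r - 1, c), (r + 1, c)):
                key = (cell, other) if (cell, other) in door else (other, cell)
                if door.get(key):
                    yield other

        frontier = [(r, 0) for r in range(n)]
        seen = set(frontier)
        while frontier:
            cell = frontier.pop()
            if cell[1] == n - 1:
                return True
            for nxt in neighbors(cell):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return False

    for p in (0.35, 0.45, 0.50, 0.55, 0.65):
        hits = sum(crosses(40, p) for _ in range(50))
        print(f"p = {p:.2f}  ->  left-right crossing in {hits}/50 trials")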


I recommend everyone interested in percolation theory to also check https://meltingasphalt.com/interactive/going-critical/ and play with critical thresholds interactively!


This was fascinating! It started with some simple simulations, but led into some very enlightening generalizations about why culture spreads more rapidly in cities than in the countryside, and why new language spreads more rapidly among adolescents.


Great link, thanks for sharing!


Thank you!


And, to give an example we've probably all become familiar with, this is equivalent to how a disease becomes an epidemic as the R value crosses 1.


That seems more like a truism: If on average one case leads to more than one other (R > 1), the number of cases will grow. The R number therefore appears to be nothing more than a measure of growth or decline. What am I missing?


As far as I understand it R is not a rate because it does not include time. Diseases that take on average a few days for you to pass on versus a few months will have different growth rates. For example, an STI might have an R factor of 3, but the rate is completely dependent on how often people have unprotected sex.


Good point, R is often referred to as a rate both in media and academic publications, but "number" or "value" appears to be more correct. Fixed.


R is not a measure of anything (i.e. real world data), it's more of a prediction based on a mathematical modeling procedure. It's not constant for a pathogen like Sars-CoV-2, as it also depends on other conditions: environmental factors like temperature/humidity, and social factors (norms like handshaking, etc.).

The models used to generate R-numbers have some assumptions which may or may not be valid: (1) rectangular and stationary age distribution, and (2) homogeneous mixing of the population. Thus, the result is a fairly rough estimate.

A major use is in getting a decent estimate of what percentage of a population needs to be vaccinated in order to halt the spread of a viral infection. However, this supposes 'sterilizing vaccination', i.e. vaccinated individuals are not asymptomatic carriers and spreaders of the infection. While this was the case for the smallpox vaccine, it doesn't seem to be the case with all known Covid19 vaccines, where there are many breakthrough cases (even though symptoms are reduced and hospitalization is minimal).


This is essentially what is happening in the percolation case as well.


What I mean is that R varies by disease (or for a given disease, according to interventions, etc).

The size of the total epidemic varies sharply as R is changed in this way.


When you say R, are you referring to R_0? Because R_0 is not a mathematical predictor of percolation.


I think there's lots of ways to formulate this to see how it's percolation. I'm not a percolation expert though so please call me out if you are, and I'm wrong:

To sketch here:

Consider R the reproductive number of a disease (or R_e, the effective reproductive number, to be specific). This depends both on the inherent contagiousness of the disease and on the behavior of the population. If people choose to have fewer contacts (or bars are closed) then R decreases.

Let's say covid is spreading. We ask people to limit their contacts, and see if this stops spread.

We can think of this as trying to remove edges from the contact graph that the disease spreads on.

The contact graph becoming connected or unconnected as we remove edges is clearly percolation.

(Subject to some modeling assumptions about edges being removed at random, but these assumptions are common to both Erdos renyi graph models and SIR style compartmental models).

Make sense?
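(For anyone who wants to poke at this: a tiny sketch of the thought experiment, assuming an Erdos-Renyi contact graph with edges dropped uniformly at random; purely illustrative, not a real epidemic model. The sharp drop shows up once the kept fraction pushes the mean number of contacts below 1, the percolation threshold.)

    import random
    from collections import defaultdict

    def largest_component_fraction(n, edges):
        """Fraction of the n vertices in the largest connected component."""
        adj = defaultdict(list)
        for u, v in edges:
            adj[u].append(v)
            adj[v].append(u)
        seen, best = set(), 0
        for start in range(n):
            if start in seen:
                continue
            stack, size = [start], 0
            seen.add(start)
            while stack:
                node = stack.pop()
                size += 1
                for nxt in adj[node]:
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
            best = max(best, size)
        return best / n

    n = 2000
    mean_contacts = 4      # illustrative average contacts per person
    all_edges = [(u, v) for u in range(n) for v in range(u + 1, n)
                 if random.random() < mean_contacts / n]

    # "Limit contacts": keep a random fraction of edges. The giant component
    # collapses sharply once the kept fraction pushes mean contacts below 1.
    for keep in (1.0, 0.5, 0.3, 0.25, 0.2, 0.1):
        kept = [e for e in all_edges if random.random() < keep]
        frac = largest_component_fraction(n, kept)
        print(f"keep {keep:.2f} of contacts -> largest component covers {frac:.1%} of people")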


Sudden thresholds were just recently relevant with disease transmission and recoverability too.


I wonder if this has implications for the evolution of life. Life is about DNA, which is a graph of nucleotides connecting to each other in specific structures. But DNA must have evolved out of random structures and random mutations. So if there's a threshold at which structures become inevitable, then DNA has a chance to be born?


> But DNA must have evolved out of random structures and random mutations

Before DNA there was only RNA. DNA came later but established itself because it is much better at conserving information. There are still retroviruses carrying their genetic information in the form of RNA, reminiscent of that RNA era.

So the course of evolution might have been like this: 1) random RNA coils slowly gained structures able to catalyze reactions, e.g. replicating themselves. 2) Replicators become catalysts for other reactions. 3) Specialized RNA molecules start entangling themselves, e.g. one of them replicating the others while another provides access to chemistry providing energy (aka "food"). 4) RNAs start encoding proteins and enzymes, adding to the entanglements. 5) Membranes appear, isolating the entanglements from the environment and inventing cells. 6) DNAs are created from RNAs, forming the modern biochemical tri-unity of DNA, RNA and proteins.


Current origin-of-life theories are pretty diverse, but a very promising approach postulates the ribosome, the protein synthesis machine composed of proteins and RNA, as the original self-replicating entity. This model supposes the abiotic synthesis of both simple amino acids and nucleosides, the building blocks of proteins and nucleic acids, and some kind of random polymerization process.

If such random polymers were able to assemble into something like the ribosome, that structure could plausibly start replicating itself using the abiotic pool of free amino acids and nucleosides. This is somewhat different from older 'RNA world' theories in that it involves both amino acids and nucleosides. Abiotic synthesis of these precursors is fairly plausible.


There is the field of catalytic random networks, the gist being that DNA can encode for enzymes which greatly increase the rate of protein generation.

I read a book on it by Stuart Kauffman from the Santa Fe Institute of Complexity about 20 years ago; things may well have changed since then, but the core idea was amazing.

Also reminds me of rates of reaction from Physical Chemistry, how some concentrations of chemicals can have a huge impact on the overall reaction.

Edit: Updated with the author’s name.


Yes, I was also wondering about the implication that in a system like a universe born out of a big bang, life structures will eventually appear with a certain probability.


Here's a direct link to the preprint if you want to skip the fluff:

https://arxiv.org/abs/2203.17207


Actually, the "fluff" in this case is outstanding. The paper itself doesn't provide much context, but the Quanta piece does a great job at explaining why the result is important.


I agree it's a pleasant read, but like other Quanta articles, it spends too long on the journey and not enough time on the precise details for my liking. There are few symbolic expressions to be found. Others agree, which is why I shared the link.

Plus, calling it 'fluff' is surely the mildest of sassy descriptions, no?


The direct link to the actual paper is certainly very useful and much appreciated. I think yshklarov's point is just (1) it won't be accessible or interesting to the vast majority of HN readers (including me) without significant background info and (2) Quanta does a shockingly good job at this background (but is not intended to replace the paper itself).


> calling it 'fluff' is surely the mildest of sassy descriptions, no?

On the contrary, "fluff" suggests that the piece has nothing to offer outside of the source material.


Nah, it just suggests that some of the article is superfluous, which I stand by.

Like I said a comment or two up, it's a good article, but it doesn't contain enough of the meat.


It does an okay job, but there are still a lot of unanswered questions, or assumed knowledge for a lay person:

> Mathematicians want to know when such a graph is likely to have some sort of interesting structure.

Why?

> a Hamiltonian cycle, a chain of edges that passes through every vertex exactly once

> adding more edges to a graph that already contains the property will not destroy the property.

Adding one edge after the exact number of edges required to create a Hamiltonian cycle (number of edges equal to number of vertices) would appear to break the property.

> mathematicians often rely on an easier computation, one that provides a minimum possible value, or lower bound, for the threshold [of probability]

How can there be a lower bound other than zero? However small the possibility, surely given an infinite number of cases, there are infinite possibilities of a particular structure being created.

> The sunflower conjecture considers whether collections of sets can be constructed in ways that resemble the petals of a sunflower.

What does this mean?

> If Hamiltonian cycles are “spread out” nicely, that means that not too many cycles contain the same edge or subset of edges.

This seems to suggest multiple Hamiltonian cycles in a graph, contradicting the earlier definition that every vertex must be connected. I guess they meant every vertex in any particular Hamiltonian cycle.

And so on...


> Adding one edge after the exact number of edges required to create a Hamiltonian cycle (number of edges equal to number of vertices) would appear to break the property.

A Hamiltonian cycle is a simple cycle (no repeated vertices or edges) that contains every vertex of the graph. If a graph G has a Hamiltonian cycle, then adding more edges to G will not make that cycle go away; it will still be there. So the property of "has a Hamiltonian cycle" is not broken by adding more edges.

As a simple example: consider the graph which is the cycle on 5 vertices. That is, the graph has 5 vertices, and is just one big cycle with 5 edges. This graph has a Hamiltonian cycle (the entire graph itself is one such cycle). If we add an extra edge to this graph, say between vertices 1 and 3, the original Hamiltonian cycle does not go away.

> How can there be a lower bound other than zero? However small the possibility, surely given an infinite number of cases, there are infinite possibilities of a particular structure being created.

The lower bound can be other than zero because they are looking at the threshold (probability) at which the probability of the object existing goes from "very low" to "extremely high". This is alluded to in the following quote:

  "When edges are added to a random graph of N vertices with a probability of less than log(N)/N, for instance, the graph is unlikely to contain a Hamiltonian cycle. But when that probability is adjusted to be just a hair greater than log(N)/N, a Hamiltonian cycle becomes extremely likely."
I don't know the precise probabilities, but this would be something like: "When the probability of an edge being present is less than log(N)/N then the probability of there being a Hamiltonian cycle is 1/(N^2). When the probability of an edge being present is slightly more than log(N)/N then the probability of there being a Hamiltonian cycle becomes (1 - 1/(N^2))."

(Note that I plucked the above probabilities out of thin air just for the sake of illustration, just to give you an idea of the form that these statements take. For the precise probabilities, please ask Google.)
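(If you want to poke at the shape of that statement numerically: testing Hamiltonicity directly is expensive, but connectivity has the same kind of log(n)/n threshold in the G(n, p) model and is cheap to check, so this rough sketch uses connectivity as a stand-in.)

    import math
    import random

    def is_connected(n, p):
        """Sample G(n, p) and check connectivity with a DFS from vertex 0."""
        adj = [[] for _ in range(n)]
        for u in range(n):
            for v in range(u + 1, n):
                if random.random() < p:
                    adj[u].append(v)
                    adj[v].append(u)
        seen, stack = {0}, [0]
        while stack:
            node = stack.pop()
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return len(seen) == n

    n = 200
    threshold = math.log(n) / n
    for factor in (0.6, 0.8, 1.0, 1.2, 1.5):
        p = factor * threshold
        hits = sum(is_connected(n, p) for _ in range(100))
        print(f"p = {factor:.1f} * log(n)/n  ->  connected in {hits}% of 100 samples")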

> What does this mean?

See: https://en.wikipedia.org/wiki/Sunflower_(mathematics)

> This seems to suggest multiple Hamiltonian cycles in a graph, contradicting the earlier definition that every vertex must be connected.

This is no contradiction. There can be multiple Hamiltonian cycles in a graph. Consider the complete graph on n vertices; there are roughly n-factorial-many Hamiltonian cycles. Any permutation of the vertices corresponds to one such cycle. Different permutations can correspond to the same cycle, so the number is not exactly n-factorial. But you get the idea.


> Consider the complete graph on n vertices; there are roughly n-factorial-many Hamiltonian cycles. Any permutation of the vertices corresponds to one such cycle. Different permutations can correspond to the same cycle, so the number is not exactly n-factorial.

Isn't it just (n-1)!/2 as each cycle can be cyclically permuted (n possibilities) and (for n>=3) also be reversed? (Cases n=2 and n=1 have only one cycle.)

(That wasn't the point you were trying to get to, I know. I'm just thinking about the precise number.)
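(Quick brute-force sanity check of that count for small n, treating two cycles as the same when they use the same set of edges:)

    from itertools import permutations
    from math import factorial

    def count_hamiltonian_cycles(n):
        """Count distinct Hamiltonian cycles of the complete graph K_n,
        treating cycles as equal if they use the same set of edges."""
        cycles = set()
        for perm in permutations(range(n)):
            edges = frozenset(
                frozenset((perm[i], perm[(i + 1) % n])) for i in range(n)
            )
            cycles.add(edges)
        return len(cycles)

    for n in range(3, 8):
        print(n, count_hamiltonian_cycles(n), factorial(n - 1) // 2)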


Thanks for the answers to some of my questions. I think the conclusion is that the article is above my comprehension, which is absolutely fine given who I imagine Quanta's readership is.


Hm, they prove the existence of a K, but don't put any explicit bounds on it. Someone's going to have to proof-mine this one...


I'm trying to visualize this. I go to Wolfram Alpha and type "chance of getting 504 heads in 1000 coin flips" and see the answer is about 1/40, and when I change 504 to 505 I see the odds are about 1/41 - only slightly worse.

Then I check the differences between 524 and 525 and I see that the odds are decreasing much more sharply (1/400, 1/459). The little graph they helpfully provided shows what's happening: I've moved from the flattish "top" of the Bernoulli distribution to the steepish "slope" of the distribution. And at larger numbers still, the differences between adjacent numbers become negligible again as I reach the flattish "trough" at the edge of the distribution. You could say that the top of the distribution has values that are all pretty similar to each other, the bottom values are also similar to each other, and sides are a region where small differences are comparatively much more important.

Is this roughly what the article means when it discusses thresholds? The rather sharp transition from "both pretty likely" to "the second one is a lot less likely" to "both pretty unlikely"? And if so, how sharply would the slope of the distribution have to change to qualify as being a threshold?
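(For reference, the exact numbers can be computed directly from the binomial distribution; a quick sketch:)

    from math import comb

    def prob_heads(k, n=1000):
        """Probability of exactly k heads in n fair coin flips (binomial)."""
        return comb(n, k) / 2 ** n

    for k in (504, 505, 524, 525):
        print(f"P({k} heads) = {prob_heads(k):.5f}  (about 1/{1 / prob_heads(k):.0f})")

    # Ratio of adjacent probabilities: close to 1 near the flat top of the
    # distribution, noticeably smaller out on the steep slope.
    for k in (504, 524, 580):
        print(k, prob_heads(k + 1) / prob_heads(k))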


The effect you're seeing on the coin flip is best understood by noting that each coin flip is independent, and in no way connected to the past. So, the odds are based on a fair 50/50 per flip, which leaves you with a simple 2^n sample space of one of each binary combination for n flips. The math for that works out easily, and can be plotted.

The emergence of random structure in graphs, however, is different. The chance of a specific structure, such as a cycle of some length or a spanning tree of certain dimension, can go from not particularly likely (<20%) to significantly likely (95%+) in just a single additional node. Those transition thresholds at which the percentage changes in an intuitively surprising way are the subject matter of interest here.


Eh, the coins weren't important, that's just an easy thing for me to visualize. If one graphs that emergence - if one graphs the number of nodes in a graph against the chance of finding some structure in a graph of that size - I had imagined one would see a smooth curve like one half of a binomial distribution curve. It sounds like you're saying the graph would look discontinuous?


It's as continuous as a discrete function can be, but yes, it has a region of explosive growth, sort of like a sigmoid.

Think of all the Jenga games. What is the probability of the tower collapsing on turn T (or T/H, for a tower of size H), graphed as a function of T (or T/H)?


The coins example would work here if he asked what the odds are of getting n heads or tails in a row, or how many n-length runs show up for a few different values of n in a given number of flips.


I think it's slightly different. Consider this problem instead: what's the chance of two people sharing the same birthday in a group of n? It's 1 - (365 * 364 * ... * (365-n+1))/(365^n). If you plot this out, it increases rapidly. At n=23 it's about 50%. At around 60, it's a bit more than 99%. It's similar to the other problems, where a given condition can have an arbitrarily high chance of being present at surprisingly low graph sizes.
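(Quick check of those numbers, assuming 365 equally likely birthdays:)

    def shared_birthday_probability(n):
        """P(at least two of n people share a birthday), 365 equally likely days."""
        p_all_distinct = 1.0
        for i in range(n):
            p_all_distinct *= (365 - i) / 365
        return 1 - p_all_distinct

    for n in (10, 23, 41, 60, 80):
        print(f"{n:3d} people -> {shared_birthday_probability(n):.4f}")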


That's a sigmoid (it starts out looking exponential and then smoothly becomes asymptotically constant).


I don't think it's Bernoulli, binomial rather?


Neat comment on the arxiv (https://arxiv.org/abs/math/0603218):

The gap between expectations and reality is studied, 7 pages

(these comments are added by the submitting author).


Direct link to the PDF: https://arxiv.org/pdf/2203.17207



Reminds me of the Pythagorean perspective that there are basic harmonies (wholenesses) in math itself-- and due to these harmonies in math (arithmetic, geometric, etc), harmonies manifest in the cosmos.

So, the question might be: does this finding have any implications for physical phenomena?


Maybe someone familiar with the graph theory math can help me.

I don't understand why these properties aren't numerically accessible. Why is the estimation necessary?

E.g. Assume N nodes, and edges E can be created at random with redundancy (picking same edge e twice just keeps the prior edge) by picking pairs of points.

Assume N random edge picks are made, then the probability of e.g. a Hamiltonian cycle appearing in those N picks can be calculated.

(My quick back of envelope sketch says P(H-cycle) = N! / N^N , but please consult the expert literature.)

But one can also calculate the probability of a H-cycle given N+1 random picks, N+2 random picks, ... and that appears as = P(H-cycle) * (1 + extra factors that account for increases in other edges not redundantly in the H-cycle)

(Again, my back of envelope sketch for N + 2 picks gives: = P(H-cycle) * ( 1 + 2/(N-1)[N - 3 - 1/N]) , but please consult the expert literature.)

These probabilities would seem to tell a user that given a graph with N nodes and E edges, that if e.g. in the H-cycle case, E > N the user can get an explicit probability for the likelihood of a H-cycle being present.

Are there graph properties that prevent this approach being viable?
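(To be clear about what I mean by numerically accessible: for tiny N these probabilities can be estimated by brute force, as in the rough sketch below; the enumeration just blows up quickly, which is presumably why the asymptotic threshold results are the useful form. The sketch doesn't rely on my formulas above.)

    import random
    from itertools import permutations

    def has_hamiltonian_cycle(n, edges):
        """Brute force over vertex orderings; only sensible for small n."""
        edge_set = set(edges)
        for perm in permutations(range(1, n)):    # fix vertex 0 to cut symmetry
            order = (0,) + perm
            if all(tuple(sorted((order[i], order[(i + 1) % n]))) in edge_set
                   for i in range(n)):
                return True
        return False

    def estimate_cycle_probability(n, num_picks, trials=500):
        """Make num_picks random edge picks (duplicates collapse) and estimate
        the chance the resulting graph contains a Hamiltonian cycle."""
        hits = 0
        for _ in range(trials):
            edges = set()
            for _ in range(num_picks):
                u, v = sorted(random.sample(range(n), 2))
                edges.add((u, v))
            hits += has_hamiltonian_cycle(n, edges)
        return hits / trials

    n = 7
    for picks in (n, n + 3, n + 6, n + 9):
        p = estimate_cycle_probability(n, picks)
        print(f"{picks:2d} edge picks on {n} vertices -> P(H-cycle) ~ {p:.3f}")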


Wasn't there also a connection to constraint satisfaction (k-SAT Boolean CSP examples), where the ratio of the number of clauses to the number of variables makes it super easy or super hard for solvers to find a solution?

I vaguely remember this from reading literature when writing heuristic solvers for my CS grad course in AI search.


To answer my own question here are some references to the problem at hand https://dl.acm.org/doi/10.1145/3491210


The article says the Kahn-Kalai conjecture also holds for hypergraphs - IIRC hypergraphs are the foundation of the Wolfram Physics project, meaning this result is likely to have implications for the emergence of patterns within the models they are working on.


It always fascinated me how basically two quarks, one electron and some bosons can create so much chemical complexity.

It feels like throwing just 6 types of Lego bricks into a huge bag and, after a lot of random mixing, obtaining a whole universe of diversity.


All the complexity of computing comes from just two bits.

(Some even argue that’s true of the universe itself)


Not true; the bits live in super complex hardware, they don't create any complexity on their own.


Except there’s no such thing as bits on the hardware level. There’s voltages. High and low. That’s usually 5V and 0V, respectively.


You're thinking too concretely. The hardware we know is only one tiny implementation of computation. The broader concept holds across any writeable tape of any sort, whether it uses voltages in silicon, rocks on a beach, or exists only in a person's mind, does not matter one bit -- they're all equally powerful.


How can a Hamiltonian cycle be increasing? If you have one and you add an edge, then you no longer have a chain of edges that passes through every vertex exactly once, because two vertices will have two edges.


The chain of edges forming the cycle doesn't have to include all edges in the graph. There just has to exist a set of edges that form a Hamiltonian cycle. Adding further edges doesn't change that (but adding further vertices would).

Edit: Put another way, it's not the Hamiltonian cycle itself that's increasing, it's the property of there existing a Hamiltonian cycle in the graph.


So the definition in the article, “a chain of edges that passes through every vertex exactly once”, is incorrect?


No, that's the correct definition of a Hamiltonian cycle. However, the article isn't as clear as it could be about what the "increasingness" aspect applies to.

> It’s possible to think about any property, so long as it is “increasing” — that is, if adding more edges to a graph that already contains the property will not destroy the property.

The "property" here isn't the Hamiltonian cycle itself, it's the property "a Hamiltonian cycle exists within this graph". This property, rather than the cycle itself, is increasing: if you add edges to a graph that contains a Hamiltonian cycle it will still contain a Hamiltonian cycle.


Ah, I get it now. Thanks.


> 2 (of a scientific theory or solution to a problem) pleasingly ingenious and simple: the grand unified theory is compact and elegant in mathematical terms.

A 6 page proof described as "elegant" must be incredibly dense.


As the other poster noted, for a big math result this is comparatively rather short. For a concrete example[0], a recent important result related to the Arnold conjecture is 268 pages long, with another 100 pages in appendices.

[0]: https://arxiv.org/abs/2103.01507


The proof section is 3.5 pages. That's really pretty short, as far as modern math results go. I get your point, though.


The abstract is a one-liner and the intro has a formula or two in its first line. No idea if this is the usual tone of math papers, but I call this dense :)


Compared to some of the papers I've read recently, this is extremely elegant and succinct and doesn't just jam all the math in line by line.

By comparison, one of the papers I looked at the other day was something like 100 pages long, with a ~30 page proof section packed full of complex mathematics and very little free whitespace.


You might be interested in this 1 page paper by John Nash, which proves the existence of equilibria for finite N-player games (an extremely powerful result). In essence it uses a set theory result (Kakutani's fixed point theorem), and simply notes that his description of a N-player game meets the required conditions for that result to hold. http://www.sscnet.ucla.edu/polisci/faculty/chwe/austen/nash1...


So it seems this allows us to understand when phase transitions occur. I wonder if this has applications in improving high-temperature superconductors.


Can anyone with a solid CS background comment on whether this relates to P!=NP at all?

I mean, this deals with Hamiltonian cycles (which are the focus of a well-known NP-hard problem). It talks about lower bounds for randomness (which relates to the information stored in the graph). The article talks about gaps which are logarithmic.

I can't quite articulate it, but somehow intuitively, it sounds like it could connect.


Off topic: Can I get some more resources like Quanta, which explain new research on mainstream or obscure subjects to a layman?


Scientific American, or American Scientist are my go-to favourites.

https://www.scientificamerican.com/

https://www.americanscientist.org/


> To make a random graph, take a biased coin — one that lands on heads with a probability of 1%, or 30%, or any other percentage between zero and 100 — and flip it

Nooooo!!!! This is impossible. You can bias dice, but not a coin flip.


Actually, you can if you bend the coin!


What level of expertise is this proof accessible to? Could a final year undergraduate understand this perhaps with a bit of help or background reading?

I am talking about reading and understanding it, not necessarily validating the correctness.


Be bold!

Any paper, any field, if you can get your hands on it: take half an hour and read the words. Look at the pictures, look at the graphs. Take a break. Go for a walk, maybe get some coffee, and decide if you want to continue. This will give you a very rough survey of the terrain. Often, I'll quit right here if there's nothing for me to latch on to.

Take a couple of hours, take notes about the things you don't understand - maybe take a minute here or there to look up concepts, maybe take a minute there to graph equations.

Get a good night's sleep. Look at your notes and think about all the stuff you don't understand. There are plenty of things I don't get _at all_. There are also things I can latch onto. Put some serious effort into linking the equations with the words. Decide if you want to spend days or weeks really digging in and _understanding_.

I find each step rewarding. I often quit early, and move on to the next shiny new thing. There's some real value in each step. I learn about things I never knew existed. I certainly can't explain them, but sometimes things pop up again and again and I do get the motivation to take a deeper dive - not necessarily mastery, but at least an understanding of my depth of ignorance. Maybe watch a lecture on that topic for perspective.

You never know what's going to be useful. Nurture that spark of interest. You might get nothing out of it today, but 10 years from now, you'll see some other problem and vaguely recall this is kinda like that other thing I never really got. There's definitely some opportunity cost. But if you're just farting around on reddit, an hour here and there can be really enjoyable, and possibly someday useful.


I would say yes because I am very close to understanding it :) Although I have a math teaching degree, we didn't really learn that much higher level stuff and also it's been twenty years since I touched anything like this. Notation-wise, I am missing A△B and not much else. For the theorems mentioned, I am reasonably sure if I worked through https://www.ihes.fr/~duminil/publi/2017percolation.pdf I would understand this paper in question.

Edit: ah! I paged through the percolation introduction and it mentions A∆B ∶= (B ∖ A) ∪ (A ∖ B) so that's the symmetric difference between A and B. And it seems the first ten pages is enough, we only need to get to the Russo theorem.


I think so. To be honest I'm not 100% sure, but it seems you would need a bit of probability theory plus an understanding of what graphs/hypergraphs are. So my first impression is that it is very accessible, which I would also expect from an elegant proof...


I notice that most links on physics, astronomy and math cover pre-prints.

Would it not be prudent to wait for the final version?

Are all the peer reviews and "fact checking" done prior to the preprint?


For a big result like this, enough experts read the paper to catch glaring mistakes. Sort of a de facto peer review


Exactly. Quanta would have the experts they interview go over and evaluate the paper before making any comments, if they hadn't already done so.

Plus arXiv papers are freely accessible ;)


Reporters often rely on social input to decide what to cover—when someone they trust tells them, “you should check this out.” Then there is an exploratory phase where they gather some basic facts and reach out to contacts to see if it is in fact newsworthy, and maybe to start gathering quotes to add context and fill out the story. At some point they say “this is real” and start working with an editor on a schedule to publish it.

Whether a paper is on a preprint server is not a big deal. If a lot of experts are excited about it, that’s a story anyway.

And if it falls apart later due to some surprise flaw… well that is also a story!


Publications don't check facts. They just check formatting, interestingness, and plausibility. Peer review is totally separate, not part of the publication's pseudo "peer review".


Very interesting. I would like to see that math applied to the emergence of life. How slim were our chances? Or maybe not slim at all, and maybe even unavoidable?


>The methods that would eventually lead to [new discovery] began with a breakthrough on a seemingly unrelated problem.

Science, baby!


> young mathematicians

Is a 40 yr old mathematician "young"?


Yes


Can someone explain to me why proving that random graphs can produce known shapes is important? Because this seems absurdly obvious to any layman. Why is this a complex proof? I'm guessing it's more that they proved the thresholds for these shapes being formed than just that they can form?


The interesting thing about graph properties (like the emergence of connectivity, giant components, Hamiltonian paths etc.) is that they happen as "phase transitions".

A classic example is cuckoo hashing: You want to know how many edges the random graph of hashes can have before it contains a cycle, since that's when you need to rehash into a larger table. You might expect that this number is "pretty random" in that you sometimes get a cycle with few edges or sometimes have many edges but no cycle. However, it turns out that it very predictably happens exactly when the graph gets to a certain size. In the same way as 1000 coin flips very predictably have 450-550 heads.

What's so cool about the theorem is that it proves _any_ property you can think of has _some_ sudden threshold like that.
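(A rough simulation of the cuckoo picture: strictly, insertion fails once some connected component has more edges than vertices, i.e. two cycles in one component, so that's what this sketch tracks with union-find. It ignores the bipartite two-table structure of real cuckoo hashing, but the stopping times land close to n/2 edges, which is the kind of sharp threshold I mean.)

    import random

    def edges_until_overloaded(n):
        """Add random edges (keys hashed to two buckets) one at a time and
        return the count at which some connected component first has more
        edges than vertices (a second cycle), detected via union-find."""
        parent = list(range(n))
        extra = [0] * n                    # cycle-closing edges per root

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        count = 0
        while True:
            u, v = random.sample(range(n), 2)
            count += 1
            ru, rv = find(u), find(v)
            if ru == rv:
                extra[ru] += 1             # edge closes a cycle
                if extra[ru] >= 2:
                    return count
            else:
                parent[ru] = rv
                extra[rv] += extra[ru]
                if extra[rv] >= 2:         # merged two unicyclic components
                    return count

    n = 10_000
    samples = sorted(edges_until_overloaded(n) for _ in range(30))
    print("overload after", samples[0], "to", samples[-1], "edges; n/2 =", n // 2)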


Surely not _any_ property? From the article I understood that it’s only those that are monotonic under adding edges?


It seems to me most graph problems that people care about are monotonic under adding edges. Actually, I can't think of a single problem in graph theory I've done where that wouldn't hold.


Being a https://en.wikipedia.org/wiki/Perfect_graph is not monotonic under adding edges, and it's certainly an interesting property for a graph to have! (But yes, most interesting properties are monotonic under adding edges)


> It seems to me most graph problems that people care about are monotonic under adding edges. Actually, I can't think of a single problem in graph theory I've done where that wouldn't hold.

https://en.wikipedia.org/wiki/Braess%27s_paradox is very famous.


You are right! Luckily that's the definition of a graph property.

Something that definitely won't work: The parity of the number of edges.

Still, it's pretty general.


> However it turns out that it very predictably happens exactly when the graph gets to a certain size. In the same way as 1000 coin flips very predictably have 450-550 heads.

> What's so cool about the theorem is thst it proves _any_ property you can think of has _some_ sudden threshold like that.

...like what? The example you give, of the number of heads yielded from 1000 coin flips, doesn't have a sudden threshold at any point. What do you mean by a sudden threshold?


You are right. We have to define the notion of sudden. It works like this: As n goes to infinity you have 100% chance of 49% heads or more, but 0% chance of 51% or more.


In the coin flip scenario, the intrinsic properties of a coin result in a threshold of 50% which, above a certain scale, is very "sharp" or sudden in the transition. Compare this to a hash table or queuing theory, where a certain amount of capacity works well up to a certain point, then it falls apart. If you can design a random test which will display that property, then you can determine it before the fact without having a way to directly calculate that property, like we do for coin tosses.


The cases I know tend to be about infinite systems, where you have a critical probability below which the chance of some property is exactly 0. Very often, you get that above that probability the chance is exactly 1 (because of [1]).

In general what I am familiar with is 'percolation theory'.

[1] https://en.wikipedia.org/wiki/Kolmogorov%27s_zero%E2%80%93on...


This is exactly what the Cuckoo Cycle [1] family of graph-theory Proof-of-Work systems gets its name from.

The puzzle instances are pseudo-random graphs in which edges are defined by the siphash24 hash function, and the solutions are cycles of length L, which have a chance of about 1/L of occurring.

[1] https://github.com/tromp/cuckoo


Cuckoo hash must be a great example. I was interested in the problem solved by this paper solely because I learnt the analysis of time complexity of cuckoo hash just a few months ago.


How do you make the theorem talk about random number sequences? What should the finite set X be?


This is quite the rabbit hole but according to some background it links to a number of interesting problems, for example phase transitions in spin-glasses in physics. The wiki page on spin glasses is a good place to start, perhaps. Here's more on background:

http://assets.press.princeton.edu/chapters/i9917.pdf

"These are all examples of what are called combinatorial optimization problems, which typically, though not always, arise from a branch of mathematics called graph theory...What have spin glasses to do with all this? As it turns out, quite a lot. Investigations into spin glasses have turned up a number of surprising features, one of which is that the problem of finding low-energy states of spin glasses is just another one of these kinds of problems. This led directly from studies of spin glasses to the creation of new algorithms for solving the TSP [Traveling Salesman] and other combinatorial optimization problems."

The first reference in the original Kahn-Kalai “expectation threshold” conjecture (as linked in the article) is this 2005 Nature paper, "Rigorous location of phase transitions in hard optimization problems"

https://www.cs.cornell.edu/selman/papers/pdf/05.nature.phase...

> "Constraint satisfaction problems are at the heart of statistical physics, information theory and computer science. Typically, they involve a large set of variables, each taking values in a small domain, such as {0, 1}, and a collection of constraints, each binding a few of the variables by forbidding some of their possible joint values. Examples include spin-glasses in statistical physics, error-correcting codes in information theory, and satisfiability and graph colouring in computer science. Given a collection of constraints, a fundamental scientific question is how many of them can be satisfied simultaneously."

So... the article notes "each property has what’s called a threshold: a probability at which the structure emerges, often very abruptly."

Here's a short video of a sudden phase transition in supercooled water, which is sort of comparable to a phase transition in a spin glass, or say, the transition from amorphous to crystalline silicon. These things happen at sharp thresholds but have defied first-principles calculation, as far as I know (which isn't that much, but it seems to agree with your latter question re importance):

https://www.youtube.com/watch?v=PM9nwYF1uR4

Perhaps then as a result of this work, your theoretical condensed matter physicists now have new proven mathematical tools to understand things like this?


Thank you!


Yes, this is all about knowing precisely when the properties are going to emerge. How many edges are needed for the Hamiltonian cycle to be present (with high probability).


From article, “The conjecture pertains to far more than random graphs. If true, it holds for random sequences of numbers, for generalizations of graphs called hypergraphs, and for even broader types of systems.”

“Obvious to a layman” is also frequently not at all related to proof, particularly in a mathematical sense.

Your summary of what they’ve proven is off, as well.


Ah yes all clear now :)


> Because this seems absurdly obvious to any layman.

That something seems obvious does not suffice for mathematics. It needs to be proven.

> Why is this a complex proof?

Because no one has come up with a simpler proof. (Although this proof is in fact not very complex, as far as modern mathematical proofs go).


It's not about _can produce_ as much as _will it_ and _when_, which is now an open path for further discovery.


If you think something is absurdly obvious you usually need to re-read what you've read.


No. It's not that rare for absurdly obvious things to get published to great fanfare.

I'm still bitter about "De Morgan's Laws". There are two of them:

1. If two things are not both true, then one or more of them is false.

2. If neither of two things is true, then both of them are false.

Of course this is obvious to everyone. Writing it down did not merit having it named after yourself. I guarantee many other people had also written it down earlier.


A great and hilarious example of that is this heavily cited paper titled "A mathematical model for the determination of total area under glucose tolerance and other metabolic curves."[0]

The title of this paper is not misleading at all. They basically just reinvented Riemann sum in 1994.

0. https://pubmed.ncbi.nlm.nih.gov/8137688/


While these are "consequences" of De Morgan's laws, the laws themselves are written in formal logical notation.

And you can bet a lot of node developers will get tripped up by those if they need to simplify or rewrite an if statement.

It's "obvious" but not so much (especially for the time), and shows the importance of publishing (formalizing and adding your name) to things that might be obvious but maybe not


Don't forget the celebrated Bayes theorem that falls directly out of the definition of conditional probability P(A|B) = P(A & B) / P(B).

If you had to squint at it and turn that into P(B|A) = P(A & B) / P(A), you'd realize that you can simply multiply the top and bottom by P(A), then pull out the remaining P(A)/P(B).

       P(A & B) / P(B)
     = P(A & B) * P(A) / (P(A) * P(B))
     = P(A & B) / P(A) * P(A) / P(B)
     = P(B | A) * P(A) / P(B).


Interestingly, both rules are rejected in so-called "intuitionist" math. They introduce a third state any proposition can be in, as long as it hasn't been proven true or false (using other axioms). The math that derives from that is pretty nutty, like most things spawned after Godel numbering was discovered.


True, and there are lots of these (Bayes's theorem, Bresenham's line drawing algorithm, Dijkstra's shortest path algorithm all spring to mind). But I kind of assume that it's because it's useful to have a name for De Morgan's Laws and these others (so it stuck as a good meme in the original meaning of "meme"), rather than because we're all so impressed with this amazing result.


Knowing that true things are obviously true is easy. The difference between a naive and a professional mathematician is the ability to be precise enough to avoid "knowing" that false things are obviously true.

To wit, what you stated is not De Morgan's law.

> De Morgan is given credit for stating the laws in the terms of modern formal logic, and incorporating them into the language of logic


Obligatory xkcd: https://xkcd.com/2042/


The art museum visitor would be unambiguously correct in the case of De Morgan's laws. Rolle's Theorem depends on some fairly tricky setup work.

But for an even more obvious theorem that was actually difficult to prove (Rolle's theorem isn't), see https://en.wikipedia.org/wiki/Jordan_curve_theorem

("Any path which begins in the interior of a closed curve, and ends in the exterior of the same curve, must cross the curve at some point.")


From that link:

"The first formal proof of the Jordan curve theorem was created by Hales in the HOL Light system, in January 2005, and contained about 60,000 lines. Another rigorous 6,500-line formal proof was produced in 2005 by an international team of mathematicians using the Mizar system. Both the Mizar and the HOL Light proof rely on libraries of previously proved theorems, so these two sizes are not comparable."


I wish Quanta was a print publication I could subscribe to. Definitely the types of articles I'd like to sit down and read not on a computer.


I feel like there’s a market for a publishing service that could partner with popular blogs and websites. They could either allow articles to be selected by the user or the partner site, which could then be auto-paginated and printed being being posted to the subscriber. Hell, sell a service where you will curate articles based on used interests. Add some relevant Twitter threads for letters to the editor.

Partner sites could also benefit from some increased subscriber revenue, and ads could actually be relevant again (remember when Scientific American advertisements were kind of cool and industry related?).

Call the prototype deadtreepress.io or something.

(I recognise the environmental implications, but I think paper and postage can be sustainable in some regions. Maybe you could integrate offsets into the price?)


Such a service exists. Here are two examples:

https://www.myscreenbreak.com/

https://waldenpond.press/


These look really cool! Might check them out


It’s a fine idea, but ignores the importance of typesetting and design for print.

I’m sure plenty of people would want something like this regardless, but the product would usually look much less polished than people expect to see in printed and bound materials and that would reflect on the authors/editors.

Authors and editors who take pride in the presentation of their work might be a hard sell.


Sounds like a job for a design & typesetting DALL-E AI.

BTW, Kindle is pretty successful, and pretty much all books use a standard design template, so the great importance of typesetting & design is questionable.


"Pretty successful" is a poor proxy for "satisfies condition X." Kindle is known to be bad for anything where the spatial layout matters.


My other half is an editor and graphic designer and constantly points out bad design. Many CMS platforms do this out of the box to allow cross-platform compatibility (with varying degrees of success), but you could just define some core defaults to ensure compatibility.


Slightly tongue in cheek, but perhaps you could use ML to automatically typeset it. I've seen some ML that would convert a design to HTML and I'd imagine this isn't too far removed from that.


I feel I would want that too even though I got used to reading on screens.

Maybe it could be a PDF; I think part of it is having some finite amount of content to consume. I guess for me, an ebook every 3-12 months with articles that are still relevant and will likely stay relevant for at least a few years would be awesome.


This is a great idea. You could draw in a casual nerdy audience willing to pay a reasonable price for access to high quality OpenAI style explainer pages with access to the raw data and papers.


Here Quanta's editor-in-chief explains his thoughts on a print version [2018]: https://www.publishersweekly.com/pw/by-topic/columns-and-blo...


Wow, thanks for posting this. I would never have come across this. It's good to get some background info on the editor-in-chief's thinking on when and why we might get a print version of the magazine.


Agreed! Just for a fun read, you might enjoy this collection they put out a few years ago...

https://mitpress.mit.edu/books/prime-number-conspiracy


I recently started scaling back my reading online, and subscribing to a newspaper and a bunch of magazines. I went searching for Quanta in print and was disappointed to find nothing. Honestly it's improved my mood, focus and comprehension, and given me many more opportunities to start conversations and share with my family and friends, especially kids. I'm sure it's possible to schedule reading time on a device with a screen and go and sit on a park bench and relax, but I've never been able to do it without distractions. Plus I've really enjoyed getting into cryptic crosswords again, and having puzzle pullouts in the paper is a guilt-free activity I can pressgang the children into.

That said, I've really enjoyed New Scientist. It's not as deep or esoteric as Quanta but still satisfying for a layman like me.


This is what an e-reader and Pocket in Firefox is for! Transferring an article to your e-reader in a single click.


Did we all shift to a timeline where the computer printer wasn’t invented?


The commenter was probably referring to something more involved than just printing out an html page.


Why not simply use a tablet?


[flagged]


With statements like "left handed people can reason too" or "I drove a red car and nothing happened" you are implying a doubt, that the opposite could be an idea worth considering. If a prejudice has a context, that context is relevant - otherwise the idea is presented as general.

[Edited for content a second time: I quoted an article I was just reading by chance, presenting data that are exemplary of another point, about the relation between generic statements and actual differences: I now remove it realizing I missed a detail in that quote that excluded the professional context.]


You've stated an axiom.


Ok, please ELI5 this for me. The very statement that anything at all is random is completely absurd to me, especially in an academic context. Are they using a definition of random that is equivalent to "nearly impossible to predict"? I mean, yes, from the perspective of a limited observer random things can exist, but in an absolute sense, for something to be random then even with the knowledge of all things past that lead up to an event, you would not be able to find the cause of that event because it was truly random.

Do they mean "unpredictable for humans with known means of predicting"?

You might as well use the term "magical" or "miraculous" if you use "random".


Lemme go through details, not because anyone doesn't know the details, but to get them written down and discussable.

"Random" means I draw up a list of all possible outcomes. The probability of an event is defined as the fraction of outcomes in which that event is true.

Example. Choose two numbers "randomly" between 1 and 3. What's the probability that the two numbers are equal? The list of outcomes is: (1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3). There are 9 possible outcomes. In 3 outcomes the first and second numbers are equal. So the probability is 3/9 = 1/3.

The math part is just bookkeeping.* The math isn't random, it just uses the rule above. And there is no claim (within the math) that things are or aren't "really random" in the actual world.

*Can be very slippery bookkeeping. People who know what they're doing get wrong answers all the time. But bookkeeping.


Thank you, my original comment was for ELI5, I was asking because I didn't know and clarifying like this helps.

Wouldn't it be more correct to say arbitrary instead of random?


I think random is often used as a keyword that means "use ideas from probability." Those ideas are tricky and (I think) not well captured by any one word.

Example of a tricky situation: Say we play cards. You shuffle the cards well, and you don't show them to me. I might say the order of the deck is random, to indicate that I have no idea which card is in what position. But you might be looking at the faces of the cards. In that case you might say the same cards are ordered in a way that's not random.

Could we describe the same situation using the word arbitrary but not the word random? I'm not sure. Seems likely.

I know how to make up models and test the models by experiment. I don't know whether randomness is real or not. I don't think it's deeply meaningful what words we use, as long as it's clear what the math is and how we're applying it.


I agree, but in crypto for example they use the term "cryptographically random" as in for the purposes of computing it is not possible to predict the value.

As a complete outsider to the field, I had no idea what is meant by random in a mathematical sense because I thought math was always realistic, as in it mirrors real things, you can't have 2+2=3 because reality doesn't work that way.


This is mathematics, not physics. You can define stuff that doesn't "really" exist in the world, but that works for the purpose of calculating stuff.

"randomness" has as much right to exist in a mathematical sense as a "point" or a "plane".


Randomness exists in physics as well.


I am just responding to the comment which assumes that it doesn't.


How can you prove something that isn't real? How is randomness defined as a concept in mathematics?


As long as you can clearly define your starting axioms you can mathematically prove or disprove anything. A physical example would be to say: suppose a universe with the same properties as ours exists but where the gravitational constant is twice as large; prove whether X can occur in that universe.

As far as how randomness is defined I believe that might be field dependent but this is me getting out of my depth (I've only taken a few courses on combinatorics).


Randomness is usually defined in terms of a probability distribution, which is just a measure where the total mass assigned to all elements adds up to 1.


You can actually only prove unreal things because you impose limitations. Real things can not be proven, but only disproved.


What they're using is the mathematical definition of randomness - an event whose outcomes cannot be determined prior to its occurrence. The assumption is that it's physically impossible to predict with certainty what the true outcome is.

In many cases, what you're describing applies - there are many real world phenomena for which we have very limited information or are too complicated to model exactly. Probability theory is really useful in these situations. Why probability theory is useful is because even unpredictable events have some high level patterns that we can discover.

In the case of the article, they're interested in understanding properties of random graphs. Random graphs are useful because they're good models of real-life graphs (such as for social media websites). This is because the real-life graph is constructed in a way that appears "random" (such as to whoever maintains the social media site). A lot of graph properties are proxies for real life social behavior - triangles indicate close knit groups, cycles indicate broad friend circles. If you can estimate some of these properties in a random graph, you can do the same for real life graphs. For example, Facebook might be interested in identifying close-knit friend circles to identify potential users who may be friends or recommend groups that users can join based on who their friends are. It will also give them sensible estimates on how many people know each other personally, how many friend circles exist and so on.

However, there are certain real life processes which are considered to be 'truly random' - quantum tunnelling, radioactive decay. It's physically impossible (regardless of what technology we can develop) to predict if an electron will tunnel through a barrier, or how long it takes for a uranium atom to decay. However, there are patterns to this randomness, such as the probability that the particle tunnels and the half life for radioactive decay. These are macroscopic descriptions of random phenomena.

EDIT: added more relevant information about random graphs



