The hint suggests looking at a random graph; the giant component size there comes out to about 797 000 000. That answer (or reasonably close values) is not accepted by expii. Of course, the process is not exactly the same as for a random graph, so the natural next step is to simulate the actual process. That I did (hint: you don't need to generate the whole graph; it is enough to count the total number of people in the connected component of the starting vertex), and I got about 850 000 000. I tried that and close values (because of experimental imprecision). Then it escalated and I ended up trying everything between 650 000 000 and 1 000 000 000. Of course, none of these values is accepted.

So, my question is simple: am I really supposed to enter a number followed by six zeroes in this entry field, or should I simply enter the number of millions (i.e. a rounding of the answer divided by 1 000 000)?
I seem to be having the same problem you did. Working with Mathematica I simulated some directed graphs, got the average in-degree of the vertices (limited to a maximum of 1) and computed an answer of 865 million. I tried that answer and others around it, but the system rejects them all.
It turns out that the answer is not simply recoverable from the giant component size of an Erdos-Renyi graph, as I originally had thought. That's due in large part to the fact that the out-degree distribution in the breadth-first exploration is not Poisson(2). Instead, it has a very discrete distribution, and is either 0 or 200. Although the expected value is 2, the end result is very different.
I'm very happy to see that this is under discussion here now, because maybe we can figure out the answer together. :)
This is related to the giant component in a random graph, in the sense that it can be modeled by the following branching process:
Maintain a queue of "active" vertices, and a set of "seen" vertices. Everything else is "unseen".
Start with a single vertex in the "active" queue, and nothing in the "seen" set.
Each iteration, take a vertex V out of the "active" queue, and with probability 1%, select 200 uniformly random and independent vertices from the entire graph. For each such vertex which is not already in the "active" or "seen" sets, add it to the "active" queue.
The question is what the expected number of "seen" vertices is in the end.
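The branching process described above is easy to simulate directly. Here is a minimal Python sketch; the network size of 100,000 users is my own scaled-down choice, purely for speed, not the problem's actual population:

```python
import random
from collections import deque

def simulate(n, p=0.01, followers=200, rng=random):
    """One run of the branching process on n users; returns the
    number of "seen" vertices once the "active" queue empties."""
    seen = {0}
    active = deque([0])
    while active:
        active.popleft()
        if rng.random() < p:  # this vertex retweets with probability 1%
            for _ in range(followers):
                v = rng.randrange(n)  # 200 uniform, independent picks
                if v not in seen:
                    seen.add(v)
                    active.append(v)
    return len(seen)

rng = random.Random(1)   # fixed seed for reproducibility
n = 100_000              # scaled-down network, for speed
sizes = [simulate(n, rng=rng) for _ in range(2000)]
big = [s for s in sizes if s > 20_000]
print(len(big), "of 2000 runs blew up; largest sizes:", sorted(big)[-3:])
```

In runs like this the outcomes split sharply in two: almost all runs die out while tiny, and the rare survivors land near 80% of the population, which is exactly the dichotomy discussed in this thread.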
We see, for example, that there is 99% probability that the first vertex adds no other vertices to the queue, and the entire process dies out with only 1 vertex seen. This immediately means that the answer to the whole question must be less than or equal to $0.99 \cdot 1 + 0.01 \cdot n$, where $n$ is the total number of users.
We know that with probability 99%, the total number of "seen" vertices is 1.
Now inspect further into the remaining 1% of probability. We now have a queue with 200 vertices in it. My instinct is that, heuristically, as long as the queue doesn't die out prematurely, it will reach the size of the giant component in the Erdos-Renyi random graph with average degree 2.
Also, as the process gains more vertices, it is less and less likely to die out prematurely. For example, once it has 200 vertices in the queue, there are 200 independent chances to have one of the vertices retweet to 200 followers, and each such chance is 1%. So, the probability that none of them retweet is now only $0.99^{200} \approx 0.134$.
Remember we had so far figured out that in 0.99 of the probability space, the number of "seen" vertices would be 1.

This means that in $0.99 + 0.01 \cdot 0.99^{200} \approx 0.9913$ of the probability space, the process ends with at most 201 "seen" vertices.

The remaining probability splits according to how many of the 200 queued vertices retweet, a Binomial(200, 0.01) count within the 1% branch. The probability of exactly 1 retweet is $\binom{200}{1}(0.01)(0.99)^{199} \approx 0.271$,

and the probability of exactly 2 retweets is $\binom{200}{2}(0.01)^2(0.99)^{198} \approx 0.272$,

and the probability of exactly 3 retweets is $\binom{200}{3}(0.01)^3(0.99)^{197} \approx 0.181$,

and the probability of exactly 4 retweets is $\binom{200}{4}(0.01)^4(0.99)^{196} \approx 0.090$.
These start to build up. For example, if there were 2 retweets among the first 200-batch, then there are now 400 in the queue, and the probability of 0 retweets is only $0.99^{400} \approx 0.018$,
and so on. So, if the process picks up critical mass, I think it runs away to fill up to the large size that matches the end behavior of the Erdos-Renyi giant component breadth-first exploration process. Very importantly, the process would almost definitely end with either a relatively small total number (under 20,000) or it would blow up to the full size (about 797 million).
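The batch probabilities above are just Binomial(200, 0.01) terms, and can be checked numerically (a quick Python sketch, rather than the Mathematica mentioned earlier in the thread):

```python
from math import comb

p, m = 0.01, 200
p0 = (1 - p) ** m        # no retweets among a batch of 200 active vertices
pk = {k: comb(m, k) * p**k * (1 - p)**(m - k) for k in range(1, 5)}
p0_400 = (1 - p) ** 400  # no retweets once the queue holds 400 vertices
print(round(p0, 3))                        # ~0.134
print({k: round(v, 3) for k, v in pk.items()})
print(round(p0_400, 3))                    # ~0.018
```

These values are close to the Poisson(2) probabilities, as expected for a Binomial with a small success probability and mean 2.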
To calculate the answer, we then need to estimate the probability that the process dies out early. Since we're approximating the answer, this can likely be done with just a few rounds of calculations like what I just did above, because as the dying-out-probability falls too low, it will become an error term that is swamped by the huge outcome (hundreds of millions) that appears when critical mass is passed. This would indeed parallel the giant component structure in the Erdos-Renyi random graph, where there is a single linear-size component and all other components are logarithmic in the number of vertices.
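One way to carry the die-out estimate to completion (a sketch of one approach, not necessarily the intended calculation): let $q$ be the probability that the cascade started from a single active vertex eventually dies out. Ignoring collisions, which are negligible while the process is small, $q$ must satisfy $q = 0.99 + 0.01 \, q^{200}$: either the vertex does not retweet, or it retweets and all 200 spawned lines die out independently. A fixed-point iteration solves this quickly:

```python
# Extinction probability q of a single vertex's line:
# q = P(no retweet) + P(retweet) * P(all 200 spawned lines die out)
q = 0.5
for _ in range(200):  # the map is a contraction near the fixed point
    q = 0.99 + 0.01 * q ** 200
print("extinction probability:", round(q, 5))
print("survival probability:", round(1 - q, 5))
```

If this heuristic is right, the process reaches critical mass with probability of roughly 0.8%, and the expectation would then be dominated by (survival probability) times (blow-up size), matching the swamping argument above.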
To check the intuition on the end behavior, I think you might be able to model the growth process by a system of recursions (or maybe differential equations). Imagine the process is already in an advanced stage, with a linear number of vertices seen, and a linear number of vertices in the "active" queue. If there are $xn$ vertices already "seen" out of $n$ total, then each of the 200 followers reached by a new retweet is unseen with probability about $1 - x$, so the growth rate of the process can be tracked in terms of $x$ alone.
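For the end behavior itself, the advanced-stage balance leads to the same fixed-point equation as the Erdos-Renyi giant component with average degree 2: the final seen fraction $x$ satisfies $x = 1 - e^{-2x}$, since each seen vertex generates 2 follower-picks on average ($0.01 \times 200$). Solving it numerically, and assuming a network of $n = 10^9$ users (my assumption, consistent with the 797 million figure quoted above):

```python
from math import exp

# Fixed point of x = 1 - exp(-2x): fraction of users eventually "seen",
# given that each seen user triggers 2 uniform follower-picks on average.
x = 0.5
for _ in range(100):  # contraction mapping, converges quickly
    x = 1 - exp(-2 * x)
print(round(x, 4), "-> about", round(x * 1e9 / 1e6), "million users")
```

The solution is $x \approx 0.7968$, which on $10^9$ users is indeed about 797 million, matching the giant component size reported earlier in the thread.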
Thank you for joining me on this journey through problems. I am sorry that there are sometimes errors. I'm creating the problems fresh each week on my own, by trying to identify some mathematical insights in real world phenomena. I am often surprised to learn new things about the world, and to see the world in a new light. However, the creation of puzzles weekly is not easy, and there are sometimes errors and oversights. I truly appreciate all of your feedback, even if I am often unable to reply directly. (Indeed, that is why I upgraded this forum so that we can have a community of people discussing, and I am active on this new forum as well.) It is a pleasure to have a worldwide community of enthusiasts who are able to help think through these creations together. I hope that my post above provides a useful starting point for us to collectively solve this problem completely.