To me, the thing that makes probability fun is that the results are frequently surprising. We’ve got very strong instincts about how we expect numbers to work. But when you do anything that involves a lot of computations with big numbers, our intuition goes out the window – nothing works the way we expect it to. A great example of that is something called the birthday paradox.
Suppose you’ve got a classroom full of people. What’s the probability that there are two people with the same birthday? Intuitively, most people expect that it’s pretty unlikely. It seems like it shouldn’t be likely – 365 possible birthdays, and 20 or 30 people in a classroom? Very little chance, right?
Let’s look at it, and figure out how to compute the probability.
Interesting probability problems are all about finding out how to put things together. You’re looking at things where there are huge numbers of possible outcomes, and you want to determine the odds of a specific class of outcomes. Finding the solutions is all about figuring out how to structure the problem.
A great example of this is something called the birthday paradox. This is a problem with a somewhat surprising outcome. It’s also a problem where finding the right way to structure the problem is has a dramatic result.
Here’s the problem: you’ve got a group of 30 people. What’s the probability that two people out of that group of thirty have the same birthday?
We’ll look at it with some simplifying assumptions. We’ll ignore leap year – so we’ve got 365 possible birthdays. We’ll assume that all birthdays are equally likely – no variation for weekdays/weekends, no variation for seasons, and no holidays, etc. Just 365 equally probable days.
How big is the space? That is, how many different ways are there to assign birthdays to 30 people? It’s 36530 or something in the vicinity of 7.4*1076.
To start off, we’ll reverse the problem. It’s easier to structure the problem if we try to ask “What’s the probability that no two people share a birthday”. If P(B) is the probability that no two people share a birthday, then 1-P(B) is the probability that at least two people share a birthday.
So let’s look at a couple of easy cases. Suppose we’ve got two people? What’s the odds that they’ve got the same birthday? 1 in 365: there are 3652 possible pairs of birthdays; there are 365 possible pairs. So there’s a probability of 365/3652 that the two people have the same birthday. For just two people, it’s pretty easy. In the reverse form, there’s a 364/365 chance that the two people have different birthdays.
What about 3 people? It’s the probability of the first two having different birthdays, and the probability of the third person having a different birthday that either of those first two. There are 365 possible birthdays for the third person, and 363 possible days that don’t overlay with the first two. So for N people, the probability of having distinct birthdays is .
At this point, we’ve got a nice recursive definition. Let’s say that is the probability of people having distinct birthdays. Then:
- For 2 people, the probability of distinct birthdays is 364/365. ()
- For N>2 people, the probability of distinct birthdays is
.
Convert that to a closed form, and you get: . For 30 people, that’s
. Work it out, and that’s
0.29 – so the probability of everyone having distinct
birthdays is 29% – which means that the probability of at least
two people in a group of 30 having the same birthday is 71%!
You can see why our intuitions are so bad? We’re talking about something where one factor in the computation is the factorial of 365!
Let’s look a bit further: how many people do you need to have, before there’s a 50% chance of 2 people sharing a birthday? Use the formulae we wrote up above, and it turns out to be 23. Here’s the numbers – remember that this is the reverse probability, the probability of all birthdays being distinct.
1 | 1 |
2 | 0.997260273973 |
3 | 0.991795834115 |
4 | 0.983644087533 |
5 | 0.9728644263 |
6 | 0.959537516351 |
7 | 0.943764296904 |
8 | 0.925664707648 |
9 | 0.905376166111 |
10 | 0.883051822289 |
11 | 0.858858621678 |
12 | 0.832975211162 |
13 | 0.805589724768 |
14 | 0.776897487995 |
15 | 0.747098680236 |
16 | 0.716395994747 |
17 | 0.684992334703 |
18 | 0.653088582128 |
19 | 0.620881473968 |
20 | 0.588561616419 |
21 | 0.556311664835 |
22 | 0.524304692337 |
23 | 0.492702765676 |
24 | 0.461655742085 |
25 | 0.431300296031 |
26 | 0.401759179864 |
27 | 0.373140717737 |
28 | 0.345538527658 |
29 | 0.319031462522 |
30 | 0.293683757281 |
31 | 0.269545366271 |
32 | 0.24665247215 |
33 | 0.225028145824 |
34 | 0.20468313538 |
35 | 0.185616761125 |
36 | 0.16781789362 |
37 | 0.151265991784 |
38 | 0.135932178918 |
39 | 0.121780335633 |
40 | 0.108768190182 |
41 | 0.0968483885183 |
42 | 0.0859695284381 |
43 | 0.0760771443439 |
44 | 0.0671146314486 |
45 | 0.0590241005342 |
46 | 0.0517471566327 |
47 | 0.0452255971667 |
48 | 0.0394020271206 |
49 | 0.0342203906773 |
50 | 0.029626420422 |
With just 23 people, there’s a greater than 50% chance that two people will have the same birthday. By the time you get to just 50 people, there’s a greater than 97% chance that two people have the same birthday!
As an amusing aside, the first time I saw this problem worked through was in an undergraduate discrete probability theory class, with 37 people in the class, and no duplicate birthdays!
Now – remember at the beginning, I said that the trick to working probability problems is all about how you formulate the problem. There’s a much, much better way to formulate this.
Think of the assignment of birthdays as a function from people to birthdays: . The number of ways of assigning birthdays to people is the size of the set of functions from people to birthdays. How many possible functions are there? . is the number of days in the year – 365, and is the number of people in the group.
The set of assignments to unique birthdays is the number of injective functions. (An injective function is a function where .) How many injective functions are there? .
The probability of all birthdays being unique is the size of the set of injective functions divided by the size of the set of all assignments: .
So we’ve got the exact same result – but it’s a whole lot easier in term of the discrete functions!
The math here is nice, for sure, and the result is surprising.
I like to hunt more for ways to make the surprise into something more intuitive. Avoiding the huge numbers like 365! is certainly helpful for that.
With 30 people, there are (30 choose 2) chances for a birthday coincidence among a pair of them, and that’s 435 shots at it, each with a 1/365 chance of success (not independent, so this won’t be exact of course!). So we should expect there to be a little over 1 coincidence on average. And wait a minute, expected value is linear regardless of the independence, so maybe that average amount is actually exact, even though the probability distribution could be quite complicated.
Now that’s always meta-surprising to me: that all the complications of the shape of the distribution and the independence of events go away when you look at the expected value.
When my geometry teacher (on a whim, to take our minds off the numbing of number-less proofs for a few minutes) mentioned this, we did the in-class survey and, well, also had one of those rarities where a class of 32 had no two sharing a birthday.
The closed form seems to be “off-by-one”. The first factor in the denominator should read (365-n)!
A reliable rule of thumb is that if you have N pigeonholes, and randomly assign pigeons to them, then you only need √N pigeons to get a 50% chance of two pigeons occupying the same hole. The rule of thumb is good for N>10 or so, and gets better as N goes to infinity. As an exercise, prove this. (Hint: Poisson distribution.)
This has obvious applications in hashing, but is also a handy rule of thumb for random sampling: If the size of the population is N, √N is a good sample size (assuming an unbiassed sample), in the sense that you probably don’t need stratification.
Thanks Mark. I read an even more simplified version of this. Incidentally, and based on the same maths, here’s a sucker bet that you can use while driving.
This assumes that the last two digits of a numberplate are numerical. The bet is to say that 2 cars in the next 20 will have the same last two digits. The odds are supposedly better than 7 to 1.
I remember doing the birthday problem with a class of 14 year olds many years ago. They were not convinced by the maths so I sent a couple on a survey of the classes in the school, I can’t remember the exact results but they surveyed at least fifty classes with 30 or more students. The results convinced them
Alternatively the denominator of the closed form solution could be fixed by changing it to (364 – (n – 1)), which is mathematically identical to garfen’s suggested fix.
I went a direction similar to Josh Zucker, but I don’t fully understand the slight differences I get in results from what Mark has. Here’s my rationale:
There are 435 potential pairings of 2 that can be made from 30. The chance that two people have the same birthday is 1/365 (i.e., 365 * 1/365^2).
So, chance of a single pair NOT matching birthdays would be 1-1/365 = 364/365. Therefore, the probability that none of the 435 pairings contains people with matching birthdays (meaning none of the 30 individuals’ birthdays match) would be 364^435/365^435.
Now, when I try this math it is very close to the table Mark presents. For example, with 30 people (giving 435 combinations of 30 people taken 2 at a time), I get a probability of having no matches of .303 to Mark’s .294.
As for the breakpoint at 23 people where there is greater than 50% probability that in a group of 30 at least one birthday matches, I get .4995 chance of having all distinct birthdays vs. Mark’s .4927–again very close but not exact. (253 combinations of 23 taken 2 at a time, so 364^253/365^253).
While my answer is very close it is always just a bit higher. Is it just coincidence that this approach happens to approximate Mark’s results? Where am I off?
You’re wrong because you’re considering that the probability of each pair not sharing their birthday is independent from the other pairs when you combine them into your P=(364/365)^435.
You can’t do this. To see the dependence, imagine you’ve got three people, 1, 2, and 3. You’ve got three pairs, and (1,2), (1,3), and (2,3). Now imagine I give you the following information: (1,2) share their birthday, and (1,3) also share their birthday. Do you still think that the probability of (2,3) sharing their birthday is 364/365? Of course not. The probability is 1. These are not independent events, so you can’t just multiply their probabilities, having instead to use the famous P(A and B) = P(B) * P(A given B).
Mark is doing it the right way.
“Do you still think that the probability of (2,3) sharing their birthday is 364/365?” should be “not sharing”, of course.
And the probability should be 0. I was thinking in reverse 🙁
10th grade for me. It was mind-blowing at the time. The teacher couldn’t help but smile as she talked about the problem, so we all knew something was up.
A more esoteric version of the same problem, but one which does a much better job of highlight just how bad humans are at intuitively estimating probabilities in cases like this, is the following:
Caesar gets stabbed, and with this dying breath says “Et tu, Brute?” Now, take a deep breath. What are the odds that one of the atoms Caesar expelled in his dying breath was one of the atoms you just breathed in?
your closed formula seems to be wrong (e. g. for n = 2: f(2) = (365! / ((365-(2-1))! * 365^n) = 1/365 vs. 364/365); additionally you never said that N has to be less or equal to 365
I’ve seen this paradox before, but I’d forgotten the number at which the probability is 50%. When I saw it, I recognized it instantly: 23 is also the maximum number of players on an NHL roster! So one would expect about half the teams in the NHL to have two roster players who share a birthday.
As it turns out, the NHL’s birthdays are quite biased (see http://bit.ly/1iBY2iL and other articles), so I would expect *more* than half of the teams to have “birthday twins” (I’ll use that term as shorthand for “people with the same birthday”).
That got me thinking: is there a way to calculate a group’s bias based on how far off it is from the expected number of birthday twins in each case? For instance, let’s take each NHL team as one case. There are 30 teams, so we’d expect 15 of these to have birthday twins. If it turns out that 20 of them do, is there anything we can infer about the NHL (as a whole) from that?
Pingback: 编程面试10大算法概念汇总 | WHO1753
Pingback: 编程面试的10大算法概念汇总 | zengine
Pingback: 编程面试的10大算法概念汇总 — 蓝枭博客
Pingback: 编程面试的10大算法概念汇总 | aftertoday
I have a fond memory of the time my high school teacher went through this for our math class. There were only about 10 students in the class, so the teacher said something about it being very likely that there were no shared birthdays; but when he went through and wrote everybody’s birthday on the board, it turns out that there was one shared birthday — he had forgotten that there was a pair of twins in the class.
Pingback: Carnival of Mathematics #105 | MatheMazier
Pingback: 编程面试过程中常见的10大算法概念汇总 – 农夫庄园