A recent post on Coding Horror about the nature of probability paradoxes struck a chord with me; I’ve always been fascinated by probability and its counterintuitive nature. Even though I rationally understand such concepts, I really do find it hard to “internalise” the thing. I thought I’d write about it to try and explain how to understand the problem.
The paradox in question is the Boy/Girl problem, which Wikipedia has an explanation of here. However, I don’t like the Wikipedia explanation, so I will try to do better.
The question is:
“A family has two children. You know that (at least) one of the children is a boy. What is the probability that the other child is a girl?”.
The kneejerk response to this is “50%”. This is completely understandable and comes from a lifetime of learning that “chance has no memory”. The chance of a child being a boy or a girl is 50%, right? So how does knowing the sex of one of the children possibly affect the sex of the other?
The answer is that you have to consider the whole range of outcomes: the results of the two childbirth events taken as a whole. There are two children, with two possibilities each, so the range of possible outcomes looks like this:
BB BG GB GG
Where B and G mean boy and girl, obviously. Each of these four outcomes has equal probability.
However, since you have been told that at least one of the children is a boy, you are forced to remove one of the outcomes, leaving you with:
BB BG GB
See it? Since you know for sure that there are not two girls, there are only three possibilities left. In two of those three possibilities, the other child is a girl. So the chance of the other child being a girl is actually two thirds – 66.67%.
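The counting above can be checked mechanically. The following is a minimal Python sketch, assuming (as the post does) that boys and girls are equally likely and that the two births are independent; the variable names are my own, chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely two-child families: BB, BG, GB, GG.
families = ["".join(pair) for pair in product("BG", repeat=2)]

# Keep only the families consistent with "at least one child is a boy".
at_least_one_boy = [f for f in families if "B" in f]

# Among those, count the families in which the other child is a girl.
with_a_girl = [f for f in at_least_one_boy if "G" in f]

probability = Fraction(len(with_a_girl), len(at_least_one_boy))
print(at_least_one_boy)  # ['BB', 'BG', 'GB']
print(probability)       # 2/3
```

Using Fraction rather than a float keeps the answer as an exact two thirds instead of a rounded 66.67%.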
That is counterintuitive enough, but is fairly understandable when explained. Where it really gets freaky is when you introduce order. This is where my brain absolutely chokes and I have real difficulty accepting at an intuitive level the implications of having this extra information.
Let’s ask the question again, with information about the order:
“A family has two children. You know that the eldest child is a boy. What is the probability that the other child is a girl?”.
Sounds the same, right? My intuitive soul practically screams that the outcome should be the same. But it’s not.
Let’s look at the total range of possible outcomes again.
BB BG GB GG
We know the eldest child is a boy. Reading each pair as younger child first, elder child second, we now have to remove not one, but TWO, possible outcomes (BG and GG), leaving:
BB GB
In other words, the probability of the other child being a girl is back to 50%.
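The same elimination can be sketched in a few lines of Python. This assumes the second letter of each pair stands for the elder child, which is the reading that matches the remaining outcomes (BB GB) listed above; the variable names are illustrative only.

```python
from fractions import Fraction

# Assumption: in each pair the second letter is the elder child,
# which matches the remainder (BB GB) given in the text.
families = ["BB", "BG", "GB", "GG"]

# Keep only the families whose elder child is a boy.
elder_is_boy = [f for f in families if f[1] == "B"]

# Among those, count the families whose other (younger) child is a girl.
other_is_girl = [f for f in elder_is_boy if f[0] == "G"]

probability = Fraction(len(other_is_girl), len(elder_is_boy))
print(elder_is_boy)  # ['BB', 'GB']
print(probability)   # 1/2
```

The only change from the "at least one boy" version is the filter: it tests one specific position instead of asking whether a B appears anywhere in the pair.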
This seems utterly insane. How the fucking hell could knowing the order of birth influence probability in this way? After all, the boy has to be either the younger or the older child, doesn’t he? One or the other – it’s assumed! How can this change anything?
Well, the thing to understand here is that we’re not talking about two events any more. We’ve actually removed one of the events, so we don’t need to consider four possible outcomes, of which one has been removed. We’re considering one event with two possible outcomes – 50%.
In order to understand it more, let’s actually switch analogies to tossing a coin. We’re not used to thinking about order in relation to children, but in coin-tossing it is natural. When it comes down to it, though, it’s all the same thing. Let’s re-ask the questions – in terms of coin tosses.
Question: A coin was tossed twice. At least once, it came up heads. What’s the chance it came up tails the other time?
Outcomes: HH HT TH TT
Eliminate: TT
Remainder: HH HT TH
Answer: Two out of three times, i.e. 66.67%
And now, let’s introduce our knowledge of the order:
Question: A coin was tossed twice, and came up heads the second time. What’s the chance it came up tails the first time?
Outcomes: HH HT TH TT
Eliminate: HT TT
Remainder: HH TH
Answer: 50%
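If the counting argument still feels like a trick, both coin questions can be tested by brute force. Below is a rough simulation sketch, assuming a fair coin and independent tosses; the function name simulate, the trial count and the seed are arbitrary choices of mine, not anything from the original problem.

```python
import random

def simulate(trials=100_000, seed=0):
    """Estimate both conditional probabilities by tossing a fair coin twice, many times."""
    rng = random.Random(seed)
    at_least_one_head = tails_other = 0
    second_head = tails_first = 0
    for _ in range(trials):
        first, second = rng.choice("HT"), rng.choice("HT")
        # Question 1: at least one head; was the other toss tails?
        if "H" in (first, second):
            at_least_one_head += 1
            if "T" in (first, second):
                tails_other += 1
        # Question 2: the second toss was heads; was the first toss tails?
        if second == "H":
            second_head += 1
            if first == "T":
                tails_first += 1
    return tails_other / at_least_one_head, tails_first / second_head

unordered, ordered = simulate()
print(f"at least one head: {unordered:.3f}")
print(f"second toss heads: {ordered:.3f}")
```

With this many trials the first figure should land close to 0.667 and the second close to 0.5, though the exact digits depend on the seed.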
Suddenly our knowledge of the order seems valuable. We’ve been given more information, and as a result the probability question is far more specific and back in line with the “independent event” intuitive expectations we had in the first place. By knowing everything about one of the events, we remove it from the equation.
Understanding how knowing the order influences probability has powerful ramifications. For example, we can now understand why the “statistical” result of 66.67% for the first “other child” example above doesn’t square with our intuitive expectation of 50%. When we first consider the problem, we can’t separate things out and are thinking in terms of “if the family has one boy, and another child is born, then that child has 50% chance of being either sex”. But see? That’s because we removed the event! If we don’t know the order, it’s back to 66.67%.
To me, rephrasing the question in terms of coins produces an “a-ha” moment in which I can intuitively grasp why the probability has suddenly “changed”. I hope the explanation works equally well for you.