To understand a lot of statistical ideas, you need to know about
probability. The two fields are inextricably entwined: statistics based on samples
works because of probabilistic properties of populations.
I approach writing about probability with no small amount of trepidation.
For some reason that I’ve never quite understood, discussions of probability
theory bring out an intensity of emotion that is more extreme than anything else
I’ve seen in mathematics. It’s an almost religious topic, like programming
languages in CS. This post is really intended as a flame attractor: that is, I’d request that if you want to argue about Bayesian probability versus frequentist probability, please do it here, and don’t clutter up every comment thread that
discusses probability!
There are two main schools of thought in probability:
frequentism and Bayesianism, and the Bayesians have an intense contempt for the
frequentists. As I said, I really don’t get it: the intensity seems to be mostly
one way – I can’t count the number of times that I’ve read Bayesian screeds about
the intense stupidity of frequentists, but not the other direction. And while I
sit out the dispute – I’m undecided; sometimes I lean frequentist, and sometimes I
lean Bayesian – every time I write about probability, I get emails and comments
from tons of Bayesians tearing me to ribbons for not being sufficiently
Bayesian.
It’s hard to even define probability without getting into trouble, because the
two schools of thought end up defining it quite differently.
The frequentist approach to probability basically defines probability in terms
of experiment. If you repeated an experiment an infinite number of times, and
found that, on average, a given outcome occurred in 350 out of every 1,000 trials, then
a frequentist would say that the probability of that outcome was 35%. Based on
that, a frequentist says that for a given event, there is a true
probability associated with it: the probability that you’d get from repeated
trials. The frequentist approach is thus based on studying the “real” probability
of things – trying to determine how close a given measurement from a set of
experiments is to the real probability. So a frequentist would define probability
as the mathematics of predicting the actual likelihood of certain events occurring
based on observed patterns.
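To make the frequentist picture a bit more concrete, here’s a minimal sketch in Python. It simulates an experiment whose “true” probability is assumed to be the 35% from the example, and shows the observed frequency for different numbers of trials. The function name relative_frequency and the fixed seed are my own choices for illustration; a real experiment obviously doesn’t come with a known p_true.

```python
import random


def relative_frequency(p_true, n_trials, seed=0):
    """Run n_trials simulated experiments and return the observed frequency.

    p_true is the assumed "real" probability of the outcome; in a real
    experiment this is exactly the number you don't know.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    hits = sum(1 for _ in range(n_trials) if rng.random() < p_true)
    return hits / n_trials


# Observed frequency for increasingly long runs of the same experiment.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(0.35, n))
```

With more trials, the printed frequency should settle closer to 0.35, but any finite run will be a bit off. That gap between the measurement and the “real” probability is what the frequentist is trying to pin down.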
The Bayesian approach is based on incomplete knowledge. It says that you only
associate a probability with an event because there is uncertainty about it –
because you don’t know all the facts. In reality, a given event either will happen
(probability=100%) or it won’t happen (probability=0%). Anything else is an
approximation based on your incomplete knowledge. The Bayesian approach is
therefore based on the idea of refining predictions in the face of new knowledge.
A Bayesian would define probability as a mathematical system of measuring the
completeness of knowledge used to make predictions. So to a Bayesian, strictly speaking, it’s incorrect to say “I predict that there’s a 30% chance of P”; the correct statement is “Based on the current state of my knowledge, I am 30% certain that P will occur.”
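One common textbook way to model that kind of refinement is to describe your current belief with a Beta distribution and update it as observations come in. The sketch below is just that, a sketch: it’s one standard model, not the only way Bayesians work, and the uniform starting prior and the observation counts are numbers I made up for illustration.

```python
from fractions import Fraction


def update(alpha, beta, successes, failures):
    """Return the Beta posterior parameters after observing new outcomes."""
    return alpha + successes, beta + failures


def mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution: the current point estimate."""
    return Fraction(alpha, alpha + beta)


# Assumption: start from a uniform prior, Beta(1, 1), i.e. "no idea at all".
alpha, beta = 1, 1
print("before any data:", mean(alpha, beta))

# Hypothetical observations: P occurred 3 times out of 10.
alpha, beta = update(alpha, beta, successes=3, failures=7)
print("after 10 observations:", mean(alpha, beta))

# More hypothetical data: P occurred 30 times out of the next 100.
alpha, beta = update(alpha, beta, successes=30, failures=70)
print("after 110 observations:", mean(alpha, beta))
```

With these made-up numbers, the estimate starts at one half and drifts toward 30% as the data accumulates. The number isn’t a property of P; it’s a summary of what you know so far.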
Like I said, I tend to sit in the middle. On the one hand, I think that the
Bayesian approach makes some things clearer. For example, a lot of people
frequently misunderstand how to apply statistics: they’ll take a study showing
that, say, 10 out of 100 smokers will develop cancer, and assume that it means
that for a specific smoker, there’s a 10% chance that they’ll develop cancer.
That’s not true. The study showing that 10 out of 100 people who smoke will develop cancer can be taken as a good starting point for making a prediction – but a Bayesian will be very clear on the fact that it’s incomplete knowledge, and that it therefore isn’t very meaningful unless you can add more information to increase the certainty.
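Here’s a rough sketch of what “adding more information” looks like, using Bayes’ rule. The 10-in-100 figure is the one from the example; everything else is invented. In particular, the extra piece of evidence and its two likelihoods (3/5 and 1/5) are purely hypothetical numbers chosen to make the arithmetic easy, not results from any study.

```python
from fractions import Fraction


def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Apply Bayes' rule: P(H | E) = P(E | H) P(H) / P(E)."""
    numerator = p_evidence_if_true * prior
    # P(E) expanded over the two cases: H true and H false.
    denominator = numerator + p_evidence_if_false * (1 - prior)
    return numerator / denominator


# Starting point from the study in the example: 10 out of 100.
prior = Fraction(10, 100)

# HYPOTHETICAL extra fact about one specific person, with made-up likelihoods:
# seen in 3/5 of the people who develop cancer and 1/5 of those who don't.
posterior = bayes_update(prior, Fraction(3, 5), Fraction(1, 5))

print("population-level estimate:", prior)
print("estimate for this person: ", posterior)
```

The point isn’t the particular numbers. The population-level figure is only where you start, and the estimate for a specific person can move a long way once you know something more about them.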
On the other hand, Bayesian reasoning is often used by cranks. A Bayesian
argues that you can do a probabilistic analysis of almost anything, by lining
up the set of factors that influence it, and combining your knowledge of those factors in the correct way. That’s been used incredibly frequently by cranks for
arguing for the existence of God, for the “fact” that aliens have visited the
earth, for the “fact” that artists have been planting secret messages in
paintings, for the “fact” that there are magic codes embedded in various holy texts, etc. I’ve dealt with these sorts of arguments numerous times on this blog; the link above is a typical example.
Frequentism doesn’t fall victim to that problem; a frequentist only
believes probabilities make sense in the setting of a repeatable experiment. You
can’t properly formulate something like a probabilistic proof of God under the
frequentist approach, because the existence of a creator of the universe isn’t a
problem amenable to repeated experimental trials. But frequentism suffers
from the idea that there is an absolute probability for things – which is often ridiculous.
I’d argue that they’re both right, and both wrong, each in their own settings. There are definitely settings in which the idea of a fixed probability based on a model of repeatable, controlled experiment is, quite simply, silly. And there
are settings in which the idea of a probability only measuring a state of knowledge is equally silly.