Schools of thought in Probability Theory

To understand a lot of statistical ideas, you need to know about
probability. The two fields are inextricably entwined: sampled statistics
works because of probabilistic properties of populations.

I approach writing about probability with no small amount of trepidation.

For some reason that I’ve never quite understood, discussions of probability
theory bring out an intensity of emotion that is more extreme than anything else
I’ve seen in mathematics. It’s an almost religious topic, like programming
languages in CS. This post is really intended as a flame attractor: if you want to argue about Bayesian probability versus frequentist probability, please do it here, and don't clutter up every comment thread that discusses probability!

There are two main schools of thought in probability:
frequentism and Bayesianism, and the Bayesians have an intense contempt for the
frequentists. As I said, I really don’t get it: the intensity seems to be mostly
one way – I can’t count the number of times that I’ve read Bayesian screeds about
the intense stupidity of frequentists, but not the other direction. And while I
sit out the dispute – I’m undecided; sometimes I lean frequentist, and sometimes I
lean Bayesian – every time I write about probability, I get emails and comments
from tons of Bayesians tearing me to ribbons for not being sufficiently
Bayesian.

It’s hard to even define probability without getting into trouble, because the
two schools of thought end up defining it quite differently.

The frequentist approach to probability basically defines probability in terms of experiment. If you repeated an experiment over and over, and found that a given outcome occurred in 350 out of every 1,000 trials, then a frequentist would say that the probability of that outcome was 35%. Based on that, a frequentist says that for a given event, there is a true probability associated with it: the probability that the observed frequency would converge to over infinitely repeated trials. The frequentist approach is thus based on studying the "real" probability of things – trying to determine how close a given measurement from a set of experiments is to the real probability. So a frequentist would define probability as the mathematics of predicting the actual likelihood of certain events occurring, based on observed patterns.
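To make the frequentist picture concrete, here's a minimal simulation sketch in Python. The experiment – rolling two dice and checking for a sum of 7 – is my own illustrative choice, not anything from the discussion above: the point is just that a probability is estimated as a long-run frequency over repeated trials.

```python
import random

# Frequentist sketch: approximate a probability as the long-run frequency
# of an outcome over many repeated trials. The "experiment" here (two dice
# summing to 7) is an illustrative choice of mine.

def trial():
    return random.randint(1, 6) + random.randint(1, 6) == 7

trials = 100_000
successes = sum(trial() for _ in range(trials))
print(f"observed frequency: {successes / trials:.4f}")
# As the number of trials grows, this converges on the "true"
# probability, 6/36 ≈ 0.1667.
```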

The Bayesian approach is based on incomplete knowledge. It says that you only
associate a probability with an event because there is uncertainty about it –
because you don’t know all the facts. In reality, a given event either will happen
(probability=100%) or it won’t happen (probability=0%). Anything else is an
approximation based on your incomplete knowledge. The Bayesian approach is
therefore based on the idea of refining predictions in the face of new knowledge.
A Bayesian would define probability as a mathematical system for measuring the completeness of the knowledge used to make predictions. So to a Bayesian, strictly speaking, it's incorrect to say "I predict that there's a 30% chance of P"; you should instead say "Based on the current state of my knowledge, I am 30% certain that P will occur."

Like I said, I tend to sit in the middle. On the one hand, I think that the
Bayesian approach makes some things clearer. For example, a lot of people misunderstand how to apply statistics: they'll take a study showing that, say, 10 out of 100 smokers will develop cancer, and assume that it means that for a specific smoker, there's a 10% chance that they'll develop cancer. That's not true. The study showing that 10 out of 100 people who smoke will develop cancer is a good starting point for making a prediction – but a Bayesian will be very clear that it's incomplete knowledge, and that the prediction isn't very meaningful unless you can add more information to increase the certainty.
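To show what "adding more information" looks like mechanically, here's a minimal Bayesian-update sketch in Python. The 10% base rate is the study figure from above; the conditional probabilities for the extra evidence (say, a family history of cancer) are numbers I've invented purely for illustration.

```python
# Bayesian update sketch: treat the study's 10% figure as a prior, then
# revise it with one extra piece of evidence using Bayes' theorem:
#   P(cancer | evidence) = P(evidence | cancer) * P(cancer) / P(evidence)

p_cancer = 0.10               # prior: the study's 10-in-100 base rate
p_ev_given_cancer = 0.40      # invented for illustration
p_ev_given_no_cancer = 0.15   # invented for illustration

p_evidence = (p_ev_given_cancer * p_cancer
              + p_ev_given_no_cancer * (1 - p_cancer))
posterior = p_ev_given_cancer * p_cancer / p_evidence
print(f"updated estimate: {posterior:.3f}")  # ≈ 0.229, up from 0.10
```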

On the other hand, Bayesian reasoning is often used by cranks.
A Bayesian
argues that you can do a probabilistic analysis of almost anything, by lining
up the set of factors that influence it, and combining your knowledge of those factors in the correct way. That’s been used incredibly frequently by cranks for
arguing for the existence of God, for the “fact” that aliens have visited the
earth, for the “fact” that artists have been planting secret messages in
paintings, for the “fact” that there are magic codes embedded in various holy texts, etc. I’ve dealt with these sorts of arguments numerous times on this blog; the link above is a typical example.

Frequentism doesn’t fall victim to that problem; a frequentist only
believes probabilities make sense in the setting of a repeatable experiment. You
can’t properly formulate something like a probabilistic proof of God under the
frequentist approach, because the existence of a creator of the universe isn’t a
problem amenable to repeated experimental trials. But frequentism suffers
from the idea that there is an absolute probability for things – which is often ridiculous.

I’d argue that they’re both right, and both wrong, each in their own settings. There are definitely settings in which the idea of a fixed probability based on a model of repeatable, controlled experiment is, quite simply, silly. And there
are settings in which the idea of a probability only measuring a state of knowledge is equally silly.

Lying Losers and Cheap Victories: Uncommon Descent at its best

I just had to promote this to the top level of the blog.

If you remember, way back in December, I posted something about Sal Cordova’s new blog. (As an interesting sidenote, Sal started his blog after
supposedly resigning from Uncommon Descent, claiming that he was returning to school, and that the evil darwinists would sabotage his academic career if he
continued to be associated with UnD. But of course, now, he’s back with
the UnDs.)

Anyway… I was mocking him because he'd posted something on his blog about how math and physics were going to prove his young-earth creationist rubbish. What I mocked was that he posted what he called "fundamental theorems of intelligent design": a couple of equations, presented without bothering to tell you what the symbols in them meant.

He also babbled about Fourier transforms – copying and pasting equations from Wikipedia, again without bothering to define anything (and in fact, copying and pasting the wrong equation).

Sal showed up to “defend” himself, rather poorly. Then he disappeared. The
comment thread died out on December 20th of last year.

There was no activity at all on the thread until March, when two pretty random
comments were posted. And then, again, silence.

Until April 2nd. On April 2nd, Sal showed up again, and posted a comment. More than three months after his last appearance on the blog; more than three months since the comment thread ended; more than one month since the last comment of any kind.

Then, on April 4th at 8am, Sal posted a comment over at Uncommon Descent, in which he declared victory: "Look at the very end of that blog at Mark Chu-Carroll's blog. I posted right there in hostile territory on April 2, 2008. Did you notice no one offered a rebuttal?".

Yes, this is the Uncommon Descent version of victory. You make an idiotic mistake, make a fool of yourself trying to defend it, wait more than three months to post a reply comment to a thread that no one has looked at for months, and then, less than two days later, crow about how no one dared to offer a rebuttal.

Of course, now there are multiple rebuttals. But even if no one had
bothered to reply at all – the simple fact of the matter is that this is a perfect demonstration of the typical tactics of the UD folks. Don’t engage in real
debates. Don’t have real discussions. But find ways to misquote people,
to pretend, to create a fake victory. Truth doesn’t matter. What matters
is whether you trick people. The fact that Sal posted something mind-bogglingly stupid on his blog – that means absolutely nothing in Sal's world. The fact that he still doesn't have any clue about what he was talking about all those months ago, that he never managed to come within miles of making a coherent point – that means nothing in Sal's world.

What matters is winning – where winning is defined in the shallowest possible
way. Let months go by, post something in a months-old comment thread, and then wait less than two days before you crow about how no one could rebut you. That’s Sal’s idea of victory. Not winning an argument; not doing an experiment; not
proving a point; no… victory is a trick.

Liars: No Information Allowed

Bad from the Bad Ideas Blog sent me a link to some clips from Ben Stein’s new Magnum Opus, “Expelled”. I went and took a look. Randomly, I picked one that looked like a clip from the movie rather than a trailer – it’s the one titled “Genetic Mutation”.

Care to guess how long it took me to find an insane, idiotic error?

4 seconds.

Continue reading

Understanding Non-Euclidean Hyperbolic Spaces – With Yarn!

[Image: a crocheted model of a hyperbolic surface]

One of my fellow ScienceBloggers, Andrew Bleiman from Zooillogix, sent me an amusing link. If you've done things like study topology, then you'll know about non-Euclidean spaces. Non-Euclidean spaces are often very strange, and with the exception of a few simple cases (like the surface of a sphere), getting a handle on just what a non-Euclidean space looks like can be extremely difficult.

One of the simple-to-define but hard-to-understand examples is called a hyperbolic space. The simplest definition of a hyperbolic space is a space where, if you take open spheres of increasing radius around a point, the amount of space inside those spheres increases exponentially with the radius.

Think of a sheet of paper: if you take a point and draw progressively larger circles around it, the amount of space inside a circle grows with the square of the radius – for a circle with radius R, the space inside is proportional to R^2. If you did it in three dimensions, the amount of space in the spheres would be proportional to R^3. But it's always a fixed exponent.

In a hyperbolic space, you've got a constant N, which defines the "dimensionality" of the space – and the open spheres around a point enclose a quantity of space proportional to N^R. The larger the open sphere around a point, the higher the exponent.
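If you want to see the difference numerically, here's a small Python sketch. The hyperbolic-disk area formula 2π(cosh(r) − 1) is the standard one for the curvature −1 hyperbolic plane; the side-by-side comparison is just my illustration of the polynomial-versus-exponential growth described above.

```python
import math

# Compare how much "space" fits inside a disk of radius r:
#  - Euclidean plane: area = pi * r^2 (polynomial, fixed exponent)
#  - hyperbolic plane (curvature -1): area = 2*pi*(cosh(r) - 1),
#    which grows exponentially, roughly like pi * e^r for large r.

for r in [1, 2, 4, 8]:
    euclidean = math.pi * r ** 2
    hyperbolic = 2 * math.pi * (math.cosh(r) - 1)
    print(f"r={r}: euclidean area = {euclidean:10.1f}, "
          f"hyperbolic area = {hyperbolic:12.1f}")
```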

What Andrew sent me is a link about how you can create models of hyperbolic
spaces using simple crochet.
And then you can get a sense of just how a hyperbolic space works by playing with the thing you crocheted!

It’s absolutely brilliant. Once you see it, it’s totally obvious
that this is a great model of a hyperbolic space, and just about anyone
can make it, and then experiment with it to get an actual tactile sense
of how it works!

It just happens that right near where I live, there’s a great yarn shop whose owners my wife and I have become friends with. So if you’re interested in trying this out, you should go to their shop, Flying Fingers, and buy yourself some yarn and crochet hooks, and crochet yourself some hyperbolic surfaces! And tell Elise and Kevin that I sent you!

The Real Murphy's Law

[Image: a rocket sled track]

I know better than to attempt to write an April Fools' Day post that really tries to fool anyone. I'm not a good enough writer to carry that kind of thing off in a genuinely amusing way. On the other hand, I love April Fools' Day pranks, and I generally like the silly mood of the day. So I thought I'd write some posts in the spirit of silliness.

As someone who works in engineering, I have a particular fondness for Murphy's Law. The thing about Murphy's Law is that, odds are, what you just thought when I said "Murphy's Law" is not, in fact, Murphy's Law. Odds are, you think that Murphy's Law says "if anything can go wrong, it will". That's not what it says – Murphy's Law is almost always stated wrong!

The real Murphy’s law: If there’s more than one way to do something,
and one way will result in disaster, then someone will do it that way.

Continue reading

Zero Sum Games

In game theory, perhaps the most important category of simple games is the zero-sum game. It's also one of those mathematical terms that are widely abused by the clueless – you constantly hear references to "zero-sum games" in all sorts of contexts, and they're almost always wrong.

A zero-sum game is a game in which the players are competing for resources, and the pool of resources is fixed. The fixed pool means that any gain by one player is necessarily offset by a loss by another player. The reason this is called zero-sum is that you can take any result of the game and "add it up" – the losses will always equal the wins, so the sum of the wins and losses in the result of the game will always be 0.
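Here's a minimal sketch of that adding-up in Python. The example game, matching pennies, is my own choice – any two-player zero-sum game would do – but it makes the bookkeeping obvious: in every possible outcome, the two players' payoffs cancel.

```python
# Zero-sum sketch using matching pennies: whatever one player wins,
# the other loses, so every outcome's payoffs sum to zero.
# payoffs[(row_choice, col_choice)] = (row player's payoff, column player's payoff)
payoffs = {
    ("heads", "heads"): (1, -1),
    ("heads", "tails"): (-1, 1),
    ("tails", "heads"): (-1, 1),
    ("tails", "tails"): (1, -1),
}

# The defining property: the payoffs in every cell sum to zero.
assert all(a + b == 0 for a, b in payoffs.values())
print("every outcome sums to zero: it's a zero-sum game")
```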

Continue reading

Introduction to Linear Regression

Suppose you’ve got a bunch of data. You believe that there’s a linear
relationship between two of the values in that data, and you want to
find out whether that relationship really exists, and if so, what the properties
of that relationship are.
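As a taste of where this is going, here's a minimal least-squares sketch in Python. The closed-form slope and intercept formulas are the standard ones; the data points are invented purely for illustration.

```python
# Fit a least-squares line y = a*x + b to some made-up data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = covariance(x, y) / variance(x)
a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
     / sum((x - mean_x) ** 2 for x in xs))
b = mean_y - a * mean_x

print(f"best-fit line: y = {a:.3f}x + {b:.3f}")  # y = 1.960x + 0.140
```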

Continue reading

Framing and Expelled: Why the Framers are Mis-Framing

I’m going to jump into the framing wars again. As I mentioned last time,
I think that most folks who are “opposed” to framing really don’t understand what they’re talking about – and I’ll once again explain why. But on the other hand,
I think that our most prominent framing advocates here at SB are absolutely
terrible at it – and by their ineptitude, are largely responsible for
the opposition to the whole thing.

Continue reading

Basic Statistics: Mean and Standard Deviation

Several people have asked me to write a few basic posts on statistics. I've written a few basic posts on the subject before – like this post on mean, median, and mode – but I've never really started from the beginning, for people who really don't understand statistics at all.

To begin with: statistics is the mathematical analysis of aggregates. That is, it's a set of tools for looking at a large quantity of data about a population, and finding ways to measure, analyze, describe, and understand the information about that population.

There are two main kinds of statistics: sampled statistics, and
full-population statistics. Full-population statistics are
generated from information about all members of a population; sampled statistics
are generated by drawing a representative sample – a subset of the population that should have the same pattern of properties as the full population.

My first exposure to statistics was full-population statistics, and that’s
what I’m going to talk about in the first couple of posts. After that, we’ll move on to sampled statistics.
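As a preview of the two statistics in the title, here's a minimal Python sketch computing a full-population mean and standard deviation. Note the division by N rather than N−1: that's the full-population form, since we have data for every member. The data itself is made up.

```python
import math

# Full-population mean and standard deviation (divide by N, not N-1,
# because this is the whole population, not a sample).
population = [4, 8, 6, 5, 3, 7]  # made-up data

n = len(population)
mean = sum(population) / n
variance = sum((x - mean) ** 2 for x in population) / n
std_dev = math.sqrt(variance)

print(f"mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
# mean = 5.50, standard deviation = 1.71
```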

Continue reading