I was recently fortunate enough to get a review copy of Cory Doctorow’s new book, <a href="Little Brother“>”Little Brother”. I’ve never read Doctorow before, but the book
was edited by Patrick Neilsen Hayden, who I think is the best editor in
the business, and Patrick says that this book is one of the best things
he’s ever worked on. In his words, it’s “one of the books that, should I happen to be run down by a beer truck next tuesday, I’d most like to be remembered for having helped into print”. So when Patrick posted on his blog that he had review copies available, I jumped at the chance.
Monthly Archives: April 2008
Implementing Compact Binary Heaps
Last post, I described the basic idea of the binary heap data
structure. But I didn’t talk about how to implement it – and there’s
a very cool way of implementing it – optimally space efficient,
very fast, and with respectable cache locality for moderate sized
datasets.
Binary Heaps
One of the most neglected data structures in the CS repertoire is
the heap. Unfortunately, the jargon is overloaded, so “heap” can mean
two different things – but there is a connection, which I’ll explain
in a little while.
In many programming languages, when you can dynamically create
objects in memory, the space from which you allocate the memory to
create them is called the heap. That is not what I’m talking about.
What I am talking about is an extremely simple but powerful structure
that is used for maintaining a collection of values from which you want
to be able to quickly remove the largest object quickly. This may sound
trivial, but this turns out to be an incredibly useful structure. You can use this basic structure to implement a very fast sorting algorithm; to maintain a prioritized set of values/tasks; to manage chunks of memory; etc. (The
prioritized set is the most common case in my experience, but that’s probably because I’ve spent so much of my career working on distributed systems.)
Simple Games, Utility Functions, and Saddle Points
Last time I wrote about Game Theory, I explained the basic idea of
zero sum games. In their simplest form, a game can be described by a payoff matrix,where each dimension of the matrix is the set of strategies which can be selected by one player, and each entry in the matrix describes the payoffs that result when all players select their strategy. Informally, a game is zero-sum when the total payoff is zero – that is, when any positive payout to one player is balanced by payouts from other players.
We can formalize some of the basic ideas more precisely. What we can say is that each move results in a state; and for each player y, there is a function, up, called a utility function, which maps from a state to a utility value. Taking the player utility functions as components, we can define the game’s utility function in terms of tuples: for players 1..n,
the utility function maps from states to tuples of utility values: u(state)=(u1, …, un).
More Bad Bayesians: No ETs!
Remember when I talked about the problems with Bayesian probability? As you’ll probably recall, one of the things that drives me crazy about Bayesianism is that you get a constant stream of crackpots abusing it. Since the basic assumption of Bayesian probability is that you can always use it, you’ll constantly get people abusing it.
Christopher Mims, who was one of the people running ScienceBlogs when I first signed on, sent me a classic example. A professor has published a paper in a journal called “Astrobiology”, arguing that there’s an exceedingly low probability of intelligent life elsewhere in the Universe.
Asteroid Apophosis, Orbit Changes, and Boy (not)Geniuses
You might have heard the story that’s been going round about the asteroid
Apophis. This is an asteroid that was, briefly, considered by NASA to be a collision risk with earth. But after more observations to gather enough data to compute its orbit more precisely, the result was that it’s not a significant risk. The current NASA estimates are that it’s a collision risk of about one in 45,000.
The news around it is that some German kid claims to have figured out that
NASA got it wrong, and that the real risk is 1 in 450. What was NASA’s big
mistake, according to the kid?
He says that if the asteroid were to hit a satellite, that it would change the satellite’s trajectory enough to make it hit the earth.
This has been reported with ridiculous credulity. Anyone with the least
bit of mathematical literacy should know, pretty much without even needing to
think about it, that this is absolutely silly.
XKCD and Friendly Numbers
I’ve been getting mail all day asking me to explain something
that appeared in today’s XKCD comic. Yes, I’ve been reduced to explaining geek comics to my readers. I suppose that there are worse fates. I just can’t
think of any. 🙂
But seriously, I’m a huge XKCD fan, and I don’t mind explaining interesting things no matter what the source. If you haven’t read today’s
comic, follow the link, and go look. It’s funny, and you’ll know what
people have been asking me about.
The comic refers to friendly numbers. The question,
obviously, is what are friendly numbers?
First, we define something called a divisors function over the integers, written σ(n). For any integer, there’s a set of integers that divide
into it. For example, for 4, that’s 1, 2, and 4. For 5, it’s just 1 and 5. And for 6, it’s 1, 2, 3, 6. The divisors function, σ(n) is the sum of all of the divisors of n. So
$ sigma(4)=8, sigma(5)=6, sigma(6)=12.$
For each integer, there is a characteristic ratio, defined
using the divisors function. For the integer n, the characteristic
is the ratio of the divisors function over the the number itself: σ(n)/n. So the characteristic ratio of 4 is 7/4; for 6, it’s
12/6=2.
For any characteristic ratio, the set of numbers that share that characteristic are friendly with each other. A friendly number is,
therefore, any integer that shares its characteristic ratio with at least one other integer. If an integer isn’t friendly, then it’s called a solitary number. 1, 2, 3, 4, and 5 are all solitary numbers. 6 is
friendly with 28 (1+2+4+7+14+28/28 = 56/28 = 2).
replaceMath( document.body );
Ben Stein and Darwin: Truth is what matters.
Like the rest of the skeptical blogosphere, I’ve been watching the
uproar around Ben Stein’s new movie with a lot of amusement, but also with
a lot of disgust. There’s one thing that I feel compelled to comment on that
I think has, for some reason, not been addressed nearly enough.
Probability Distributions
I’d like to start with a quick apology. Sorry that both the abstract algebra and the new game theory posts have been moving so slowly. I’ve been a bit overwhelmed lately with things that need doing right away, and
by the time I’m done with all that, I’m too tired to write anything that
requires a lot of care. I’ve known the probability theory stuff for so long,
and the parts I’m writing about so far are so simple that it really doesn’t take nearly as much effort.
With that out of the way, today, I’m going to write about probability distributions.
Random Variables
The first key concept in probability is called a random variable.
Random variables are a key concept – but since they’re a key concept of the
frequentist school, they are alas, one of the things that bring out more of
the Bayesian wars. But the idea of the random variable, and its key position
in understanding probability and statistics predates the divide between
frequentist and Bayesian though. So please, folks, be a little bit patient,
and don’t bring the Bayesian flamewars into this post, OK? If you want to
rant about how stupid frequentist explanations are, please keep it in the comments here. I’m trying to
explain basic ideas, and you really can’t talk about probability and
statistics without talking about random variables.