I was browsing around my crackpottery archives, looking for something fun to write about. I noticed a link from one of them to an old subject of one of my posts, the inimitable Miles Mathis. And following that, I noticed an interesting comment from Mr. Mathis about Monte Carlo methods: “Every mathematician knows that ‘tools’ like Monte Carlo are used only when you’ve got nothing else to go on and you are flying by the seat of your pants.” I find Monte Carlo computations absolutely fascinating. So instead of wasting time making fun of more of Mathis’s rubbish, I decided to talk about Monte Carlo methods.
It’s a little hard to talk about Monte Carlo methods, because there’s a lot of disagreement about exactly what they are. I’m going to use the broadest definition: a Monte Carlo method is a way of generating a computational result using repeated computations and random sampling.
In other words, Monte Carlo methods are a way of using random sampling to solve problems.
I’ll start with a really simple example. Suppose you want to know the value of π. (Pretend that you don’t know the analytical solution.) One thing that you could do is try to measure the circumference of a rod, and then divide it by its diameter. That would work, but it would be really hard to get much accuracy. You could, instead, get a great big square sheet of paper, draw the largest circle that fits inside it, and cover the whole thing in a single layer of grains of sand. Then, very carefully, you could remove the grains of sand that weren’t in the circle, and compare the number you removed to the number that were in the circle. An inscribed circle covers exactly π/4 of its square, so by doing that, you could get a very, very accurate measurement of the area of the circle relative to the square – and using that, you could get a much more accurate estimate of π.
The problem with that is: it’s really hard to get a perfect single-grain layer of sand all over the paper. And it would be a lot of very, very tedious work to get all of the grains that weren’t in the circle. And it would be very tedious to count them. It’s too much trouble.
Instead, you could just take 1,000 grains of sand, and drop them randomly all over the circle and the square. Then you could count how many landed in the circle. Or even easier, you could just go to a place where lots of drunk people play darts! Draw a square around the dartboard, and count how many holes there are in the square wall around it, versus how many in the dartboard!
You’re not going to get a super-precise value for π – but you might be surprised just how good you can get!
That’s the basic idea of Monte Carlo simulation: you’ve got a problem that’s hard to compute, or one where you don’t know a closed-form solution that would make it easy to compute. Getting the answer some other way is intractable, because it requires more work than you can reasonably do. But you’ve got an easy way to do a test – like the “is it in the circle or not” test. So you generate a ton of random numbers, and use those, together with the test, to do a sequence of trials. Then, using the information from the trials, you can get a reasonable estimate of the value you wanted. The more trials you do, the better your estimate will be.
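To make that concrete, here’s a minimal sketch of the darts version in Python – the function name and the trial counts are my own choices, nothing canonical:

```python
import random

def estimate_pi(trials: int) -> float:
    """Estimate pi by throwing random "darts" at the unit square.

    A dart at (x, y) lands inside the quarter-circle of radius 1
    when x*x + y*y <= 1, which happens with probability pi/4.
    """
    hits = 0
    for _ in range(trials):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / trials

print(estimate_pi(1_000))      # rough, like 1,000 grains of sand
print(estimate_pi(1_000_000))  # many more trials, much better estimate
```

The error of an estimate like this shrinks roughly like 1/√trials – which is exactly why more trials give you a better estimate.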
The more you understand the probability distribution of the space you’re sampling, the better your estimate will be. For example, in the example above, we assumed that the grains of sand/darts would be distributed uniformly over the space. If you were using the dartboard in a bar, the odds are that the distribution wouldn’t be uniform – there’d be many more darts hitting the dartboard than hitting the wall (unless I was playing). If you assumed a uniform distribution, your estimate would be off!
That’s obviously a trivial example. But in reality, the Monte Carlo method is incredibly useful for a wide range of purposes. It was used during World War II by the Manhattan Project to help design the first atom bomb! They needed to figure out how to create a critical mass that would sustain a nuclear chain reaction; to do that, they needed to be able to compute neutron diffusion through a mass of radioactive uranium. But that’s a very hard problem: there are so many degrees of freedom – so many places where things could proceed in several seemingly (or actually!) random directions. With the computers they had available to them at the time, there was absolutely no way that they could write a precise numerical simulation!
But, luckily for them, they had some amazing mathematicians working on the problem! One of them, Stanislaw Ulam, had been sick, and while he was recovering, fooled around with some mathematical problems. One of them involved a solitaire game called Canfield. Ulam wanted to figure out how often the game of Canfield was winnable. He couldn’t figure out how to do it analytically, but he realized that since shuffled deals are uniformly distributed, if you were to take a computer and make it play through 1,000 games, the fraction of games that it won would be a pretty good estimate of how often the game was winnable in general.
In that case, it’s obvious that a complete solution isn’t feasible: there are 52! possible deals – roughly 8×10^67! But with just a couple of hundred trials, you can get a really good estimate.
Ulam figured that out for the card game. He explained it to John von Neumann, and von Neumann realized that the same basic method could be used for the neutron diffusion process!
Since then, it’s been used as the basis for a widely applicable approach to numeric integration. It’s used for numerous physics simulations, where there is no tractable exact solution – for example, weather prediction. (We’ve been able to get reasonably accurate weather predictions up to six days in advance, using very sparse input data, by using Monte Carlo methods!) It’s an incredibly useful, valuable technique – and anyone who suggests that using Monte Carlo is in any way a half-assed solution is an utter jackass.
I’ll finish up with a beautiful example – my favorite example of combining analytical methods with Monte Carlo. It’s another way of computing an estimate of π, but it gets a more accurate result with fewer trials than the sand/darts approach.
It’s based on a problem called Buffon’s needle. Buffon’s needle is a problem first proposed by the Count of Buffon during the 1700s. He asked: suppose I drop a needle onto a panelled wood floor. What’s the probability that the needle will fall so that it crosses one of the joints between different boards?
Using some very nice analytical work, you can show that if the panels have uniform width t, and the needle has length l (no longer than t), then the probability of the needle crossing a joint is 2l/(πt). That gives us the nice property that if we let l = t/2, then the probability of crossing a line is exactly 1/π.
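For the curious, the standard derivation is short (x and θ below are my labels). Drop the needle, and let x be the distance from its center to the nearest joint – uniform on [0, t/2] – and θ the acute angle it makes with the joints – uniform on [0, π/2]. The needle crosses a joint exactly when x ≤ (l/2)·sin θ, so the probability is

P = (2/π) ∫₀^{π/2} ((l/2)·sin θ)/(t/2) dθ = (2l/(πt)) · [−cos θ]₀^{π/2} = 2l/(πt).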
Using that, you can do a Monte Carlo computation: take a sheet of paper, and a couple of hundred matchsticks. Draw lines on the paper, separated by twice the length of a matchstick. Then scatter the matchsticks all over the paper. Divide the total number of matchsticks by the number that crossed a line. The result will be roughly π.
For example – with 200 trials, I got 63 crossing a line. That gives me roughly 3.17 as an estimate of π. That’s not half-bad for a five minute experimental estimate!
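If you don’t have matchsticks handy, the same experiment is easy to simulate. Here’s a minimal sketch under the same setup (line spacing equal to twice the needle length); the names and trial counts are mine:

```python
import math
import random

def buffon_pi(trials: int, needle: float = 1.0) -> float:
    """Estimate pi by Buffon's needle, with line spacing = 2 * needle length."""
    spacing = 2.0 * needle
    crossings = 0
    for _ in range(trials):
        # Distance from the needle's center to the nearest line, and the
        # acute angle the needle makes with the lines.
        x = random.uniform(0.0, spacing / 2.0)
        theta = random.uniform(0.0, math.pi / 2.0)
        if x <= (needle / 2.0) * math.sin(theta):
            crossings += 1
    return math.inf if crossings == 0 else trials / crossings

print(buffon_pi(200))        # the five-minute matchstick experiment
print(buffon_pi(1_000_000))  # a few seconds of simulated matchsticks
```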
While Monte Carlo techniques do cover a lot of ground (the precise boundaries of the term are not well-defined), I think that Monte Carlo integration has a fairly well-understood meaning: you identify a random variable whose expected value is the value of the desired integral.
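As a concrete illustration of that definition: for U uniform on [0, 1], the expected value of f(U) is ∫₀¹ f(x) dx, so a sample mean of f at random points estimates the integral. A minimal sketch (the integrand is my choice, picked because its exact value is known):

```python
import random

def mc_integrate(f, trials: int) -> float:
    """Monte Carlo estimate of the integral of f over [0, 1]: the sample
    mean of f(U) for U ~ Uniform(0, 1) converges to the integral by the
    law of large numbers."""
    return sum(f(random.random()) for _ in range(trials)) / trials

# Example: the integral of 4 / (1 + x^2) over [0, 1] is exactly pi.
print(mc_integrate(lambda x: 4.0 / (1.0 + x * x), 1_000_000))
```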
If you want to expand on this, why not do an article on the Metropolis-Hastings algorithm?
Yes, it’s very beautiful to believe that mathematics should give us intuitions and not numbers, but from time to time you should measure something to get paid XD
Darts and pi are very beautiful examples, but we really need more imaginative ones. I frequently use these methods, for example, to check Stokes’-theorem-based computations of properties like volume or the inertia tensor. It has saved me hundreds of hours of numerical debugging. These methods can be coded in minutes, are extremely general, have no inherent accuracy limitations (maybe slow convergence for some problems), and are simple enough to be easily checked with high confidence.
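To illustrate the kind of cross-check being described here – assuming all you have is an inside/outside test for the solid and a bounding box, with the unit sphere standing in for a real model – a minimal sketch:

```python
import random

def mc_volume(inside, box_min, box_max, trials: int) -> float:
    """Estimate the volume of a solid from an inside(point) predicate and
    an axis-aligned bounding box: (box volume) * (fraction of hits)."""
    box_volume = 1.0
    for lo, hi in zip(box_min, box_max):
        box_volume *= hi - lo
    hits = 0
    for _ in range(trials):
        p = tuple(random.uniform(lo, hi) for lo, hi in zip(box_min, box_max))
        if inside(p):
            hits += 1
    return box_volume * hits / trials

# Sanity check against a known answer: the unit sphere, volume 4*pi/3 ~= 4.19.
in_sphere = lambda p: p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= 1.0
print(mc_volume(in_sphere, (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0), 1_000_000))
```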
What makes Monte Carlo methods better than simply sampling uniformly within your defined space of variable values?
I suppose it’s easier to include more samples in your estimate, but for a fixed number of samples, is it actually more accurate?
Evenly spaced (grid) sampling would be a very bad idea in the general case. The full technical explanation would require a very long post and some knowledge about signals, aliasing and probability, so I’m just hinting at the real reason. Using random sampling and a correct computation (with some restrictions), the Law of Large Numbers guarantees that the value of the computation converges to the correct value as more samples are generated. There is no such guarantee for evenly spaced sampling.
For one concrete example, imagine trying to count the number of primes in 1 .. 1 trillion by evenly spaced sampling. If you take every nth number, your sample is just the multiples of n, so it contains at most 1 prime; if you take every second number, you’ll see exactly one prime (2) and conclude that there are about two primes in the whole range. A random sample, on the other hand, will come up with an estimate close to the true count, which is about 3.76 × 10^10 primes.
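Here’s a sketch of that comparison (the sample size is arbitrary; the fixed Miller-Rabin base set is a standard deterministic choice valid far beyond 10^12):

```python
import random

MR_BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_prime(n: int) -> bool:
    """Deterministic Miller-Rabin primality test (valid for n up to ~3e24)."""
    if n < 2:
        return False
    for p in MR_BASES:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in MR_BASES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

N = 10**12
SAMPLES = 100_000

# Random sampling: hit fraction * range size estimates the prime count.
hits = sum(is_prime(random.randrange(1, N + 1)) for _ in range(SAMPLES))
print(f"random-sample estimate: {hits / SAMPLES * N:.3g}")

# Evenly spaced sampling with step 2 visits 2, 4, 6, ... and sees exactly
# one prime, so the scaled-up "estimate" would be 2 -- wildly wrong.
```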
Mark,
Would you consider doing a post on pseudorandom number generation?
I vaguely recall using a Monte Carlo simulation to estimate some probabilities involving a type of puzzle. I suspected that the estimate for the most improbable state may have been inflated by a linear congruential generator.
Sure – it’s an interesting subject! I don’t actually know much about it, so it’ll take a bit of research. (Translate as: don’t expect it this evening :-))
One small historical correction: while the Monte Carlo Method was developed and used by Stan Ulam and others who worked on the Manhattan Project, they used it for research into the *hydrogen* (not the *atom*) bomb, which was developed *after* World War II.
Stan Ulam discusses this in his wonderful book, *Adventures of a Mathematician*, starting on page 199:
“The Monte Carlo method came into concrete form with its attendant rudiments of a theory after I proposed the possibilities of such probabilistic schemes to Johnny in 1946 during one of our conversations. It was an especially long discussion in a government car while we were driving from Los Alamos to Lamy. We talked throughout the trip, and I remember to this day what I said at various turns in the road or near certain rocks. (I mention this because it illustrates what may be multiple storing in the memory in the brain, just as one often remembers the place on the page where certain passages have been read, whether it is on the left- or right-hand page, up or down, and so on.) After this conversation we developed together the mathematics of the method. It seems to me that the name Monte Carlo contributed very much to the popularization of this procedure. It was named Monte Carlo because of the elements of chance, the production of random numbers with which to play the suitable games.”
You can also find the nice original paper by Metropolis, the Rosenbluths, and the Tellers – the proposal of what later became the Metropolis-Hastings algorithm – here:
http://bayes.wustl.edu/Manual/EquationOfState.pdf
The name Mathis sounded familiar, and sure enough, he’s the fellow who wrote this hilariously mistaken article about tides: http://milesmathis.com/tide.html. Most notably, he is confused about the basic equation of tides, and therefore concludes that the tidal force is really inverse-square rather than inverse-cube.
Okay, cool. I’m in a field (evolutionary biology) that uses Monte Carlo techniques for a lot of analyses, and – partly because I’m only between undergrad and grad school, and partly because understanding hasn’t been necessary (there’s lots of more-or-less-black-box software available) – I hadn’t understood the principle very well until now.
Maybe some time in the future you could tackle Markov Chains and how MCMC techniques work?
I couldn’t find an email for you so I thought the following would fit here, given that you started with pi. I don’t know if you’ve encountered the crank known as “Jain 108” but he’s proven that the accepted value of pi is wrong. The true value, which he calls “J pi” (for “Jain 108 pi”, get it?) is 3.144…. His stuff is all over and he wants you to buy one of his many books but I got the following at: http://www.jainmathemagics.com/product/137/default.asp?
Here’s the money quote:
” You’ve been spoonfed a lie for your whole life! The anointed Pi that you know (the Circle To Square relationship) is a deliberate disharmonic frequency. It does not equal 3.141592… Ask NASA, they use another value for pi! This erroneous pi value is merely an approximation for the limit or sum of millions and billions of triangular polygons (like narrow slices of a pizza) used to derive the circumference of the circle, fudging straight lines to achieve a smooth curve, using the Linear to comprehend or describe the Non-Linear Curve-Circumference. It does not reflect or account for the millions and billions of tiny or infinitesimal little Areas Under The Curve. The fundamental error is that our greatest mathematicians for thousands of years assumed, with brute force, that the more polygons we imposed the more magically that the Area Under The Curve would simply just disappear. Like der. An ancient and embarrassing intellectual faux pas.
• Fortunately, the True Value of Pi, known as JainPi = 3.144605511029693144… can be determined, by copying and comprehending the Phi-ratioed mathematics of flowers and crystals, the human canon and planetary distances from the sun: based on the Golden Root or Square Root of Phi (1.272… which is the height of the Cheop’s pyramid when its apothem or centre of base to edge length is 1 unit). JainPi = 4 / √ϕ (4 ÷ by the square root of Phi). It has algebraic golden roots derived from a 4th dimensional equation/polynomial (x⁴ + 16x² − 256 = 0), contradicting all the books on its transcendentality.
• One purpose of this compendium is the promotion of International Conferences entitled: “IS PI A LIE?.” Its aim is to gather and harvest the greatest mathematical minds on the planet today, towards the Fibonaccization of East & West, to finally put an end to this ancient debate.
• Lots of interesting information on Traditional or Legacy Pi, has been included in this book and its sequel, volume 9, is specifically on Pi with some extra notes on JainPi.
• … it has been over a decade that Jain of Oz has speculated that Pi could be falsch and now with the release of his major work The Book of Phi, volume 8, confirming that it is indeed falsch, seasoned and once skeptical mathematicians are now reaching for their thesauri in search of superlatives that would do Jain 108 justice… openly and excitedly announcing without any doubt the correctness of JainPi… a phi-nomenon that takes us to the source of non-decaying spin of the electron and the physics of black holes… ”
My favorite part is ” once skeptical mathematicians are now reaching for their thesauri in search of superlatives that would do Jain 108 justice.” And NASA apparently uses J pi. What’s not to like about that?