Big Science News! Inflation, Gravity, and Gravitational Waves

[Image: the wave pattern observed in the polarization of the cosmic microwave background]

So, big announcement yesterday. Lots of people have asked if I could try to explain it! People have been asking since yesterday morning, but folks, I’ve got a job! I’ve been writing when I have time while running slow tests in another window, so it’s taken more than a day to get to it.

The announcement is really, really fascinating. A group has been able to observe gravity wave fluctuations in the cosmic microwave background. This is a huge deal! For example, Sean Carroll (no relation) wrote:

other than finding life on other planets or directly detecting dark matter, I can’t think of any other plausible near-term astrophysical discovery more important than this one for improving our understanding of the universe.

Why is this such a big deal?

This is not an easy thing to explain, but I’ll do my best.

We believe that the universe started with the big bang – all of the matter and energy, all of the space in the universe, expanding outwards from a point. There’s all sorts of amazing evidence for the big bang – not least the cosmic microwave background.

But the big-bang theory has some problems. In particular, why is everything the same everywhere?

That sounds like a strange question. Why wouldn’t it be the same everywhere?

Here’s why: because for changes to occur at the same time in different places, we expect there to be some causal connection between those places. If there is no plausible causal connection, then there’s a problem: how could things happen at the same time, in the same way?

That causal connection is a problem. To explain why, first I need to explain the idea of the observable universe.

Right now, there is some part of the universe that we can observe – because light from it has reached us. There’s also some part of the universe that we can’t observe, because light from it hasn’t reached us yet. Every day, every moment, the observable universe gets larger – not because the universe is expanding (it is, but here we’re talking not about the size of the universe, but about the part of it that we can observe). It’s literally getting larger, because there are parts of the universe that are so far away from us that the first light they emitted after the universe started didn’t reach us until right now. That threshold – the boundary of the stuff that couldn’t possibly have gotten here yet – is constantly expanding, getting farther and farther away.

There are parts of the universe that are so far away, that the light from them couldn’t reach us until now. But when we look at that light, and use it to see what’s there, it looks exactly like what we see around us.

The problem is, it shouldn’t. If you just take the big bang, and you don’t have a period of inflation, what you would expect is a highly non-uniform universe with a very high spatial curvature. Places very far away shouldn’t be exactly the same as here, because there is no mechanism which can make them evolve in exactly the same way that they did here! As energy levels from the big bang decreased, local fluctuations should have produced very different outcomes. They shouldn’t have ended up the same as here – because there are many different ways things could have turned out, and they can’t be causally connected, because there’s no way that information could have gotten from there to here in time for it to have any effect.

Light is the fastest thing in the universe – but light from these places just got here. That means that until now, there couldn’t possibly be any connection between here and there. How could all of the fundamental properties of space – its curvature, the density of matter and energy – be exactly the same as here, if there was never any possible causal contact between us?

The answer to that is an idea called inflation. At some time in the earliest part of the universe – during a tiny fraction of the first second – the universe itself expanded at a speed faster than light. (Note: this doesn’t mean that stuff moved faster than light – it means that space itself expanded, creating space between things so that the distance between them expanded faster than light. Subtle distinction, but important!) So the matter and energy all got “stretched” out, at the same time, in the same way, giving the universe the basic shape and structure that it has now.

Inflation is the process that created the uniform universe. This process, which happened to the entire universe, had tiny but uniform fluctuations because of the basic quantum structure of the universe. Those fluctuations were the same everywhere – because when they happened, they were causally connected! Inflation expanded space, but those fluctuations provided the basic structure on which the stuff we observe in the universe developed. Since that basic underlying structure is the same everywhere, everything built on top of it is the same as well.

We’ve seen lots of evidence for inflation, but it hasn’t quite been a universally accepted idea.

The next piece of the puzzle is gravity. Gravity, at least, appears to be very strange. All of the other forces in our universe behave in a consistent way. In fact, we’ve been able to show that they’re ultimately different aspects of the same underlying phenomenon. All of the other forces can be described quantum mechanically, and they operate through exchange particles that transmit force/energy – for example, electromagnetic forces are transmitted by photons. But not gravity: we have no working quantum theory for how gravity works! We strongly suspect that it must have one, but we don’t know how, and up to now, we had never found any actual proof that gravity does behave quantumly. But if it does, and if inflation happened, that means that those quantum fluctuations during expansion, the things that provided the basic lattice on which matter and energy hang, should have created an echo in gravity!

Unfortunately, we can’t see gravity. The combination of inflation and quantum mechanics means that there should be gravitational fluctuations in the universe – waves in the basic field of gravity. We’ve predicted those waves for a long time. But we haven’t been able to actually test that prediction, because we didn’t have a way to see gravitational waves.

So now, I can finally get to this new result.

They believe that they found gravity waves in the cosmic microwave background. They used a brilliant scheme to observe them: if we look at the cosmic microwave background – not at any specific celestial object, but just at the background – gravitational waves would create a very subtle tensor polarization effect. So they created a system that could observe polarization. Then they removed all of the kinds of polarization that could be explained by anything other than gravitational waves. What they were left with was a very clear wave pattern in the polarization – exactly what was predicted by quantum inflation! You can see one of their images of this wave pattern at the top of this post.

If these new observations are confirmed, that means that we have new evidence for two things:

  1. Inflation happened. These gravity waves are an expected residue of inflation. They’re exactly what we would have expected if inflation happened, and we don’t have any other explanation that’s compatible with them.
  2. Gravity is quantum! If gravity wasn’t quantum, then expansion would have completely smoothed out the gravitational effects, and we wouldn’t see gravitational waves. Since we do see waves, it’s strong evidence that gravity really does have a quantum aspect. We still don’t know how it works, but now we have some really compelling evidence that it must!

Monte Carlo!

I was browsing around my crackpottery archives, looking for something fun to write about. I noticed a link from one of them to an old subject of one of my posts, the inimitable Miles Mathis. And following that, I noticed an interesting comment from Mr. Mathis about Monte Carlo methods: “Every mathematician knows that ‘tools’ like Monte Carlo are used only when you’ve got nothing else to go on and you are flying by the seat of your pants.” I find Monte Carlo computations absolutely fascinating. So instead of wasting time making fun of more of Mathis’s rubbish, I decided to talk about Monte Carlo methods.

It’s a little hard to talk about Monte Carlo methods, because there’s a lot of disagreement about exactly what they are. I’m going to use the broadest definition: a Monte Carlo method is a way of generating a computational result using repeated computations and random sampling.

In other words, Monte Carlo methods are a way of using random sampling to solve problems.

I’ll start with a really simple example. Suppose you want to know the value of \pi. (Pretend that you don’t know the analytical solution.) One thing that you could do is try to measure the circumference of a rod, and then divide it by its diameter. That would work, but it would be really hard to get much accuracy. You could, instead, get a great big square sheet of paper, draw the largest circle that fits on it, and cover the whole thing in a single layer of grains of sand. Then, very carefully, you could remove the grains of sand that weren’t inside the circle, and compare the number of grains inside the circle to the total number of grains that covered the square. By doing that, you could get a very, very accurate measurement of the relative area of the circle, and using that, you could get a much more accurate estimate of \pi.

The problem with that is: it’s really hard to get a perfect single-grain layer of sand all over the paper. And it would be a lot of very, very tedious work to get all of the grains that weren’t in the circle. And it would be very tedious to count them. It’s too much trouble.

Instead, you could just take 1,000 grains of sand, and drop them randomly all over the circle and the square. Then you could count how many landed in the circle. Or even easier, you could just go to a place where lots of drunk people play darts! Draw a square around the dartboard, and count how many holes there are in the square wall around it, versus how many in the dartboard!

You’re not going to get a super-precise value for \pi – but you might be surprised just how good you can get!

That’s the basic idea of Monte Carlo simulation: you’ve got a problem that’s hard to compute, or one where you don’t know a closed-form solution to make it easy to compute. Getting the answer some other way is intractable, because it requires more work than you can reasonably do. But you’ve got an easy way to do a test – like the “is it in the circle or not” test. So you generate a ton of random numbers, and use those, together with the test, to do a sequence of trials. Then using the information from the trials, you can get a reasonable estimate of the value you wanted. The more trials you do, the better your estimate will be.
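
Here’s what that looks like in code – a minimal sketch in Python (my own illustration, not from the original post), using the quarter-circle version of the same “is it in the circle?” test:

```python
import random

def estimate_pi(trials: int) -> float:
    """Estimate pi by dropping random points ("darts") into the unit square.

    The quarter circle of radius 1 has area pi/4, so the fraction of points
    that land inside it approaches pi/4 as the number of trials grows.
    """
    hits = 0
    for _ in range(trials):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:   # the "is it in the circle?" test
            hits += 1
    return 4.0 * hits / trials

if __name__ == "__main__":
    for n in (1_000, 100_000, 10_000_000):
        print(n, estimate_pi(n))
```

Even with just a few thousand points, the estimate usually lands within a couple of percent of the true value, and it keeps improving as you add trials.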

The more you understand the probability distribution of the space you’re sampling, the better your estimate will be. For example, in the \pi example above, we assumed that the grains of sand/darts would be distributed equally all over the space. If you were using the dartboard in a bar, the odds are that the distribution wouldn’t be uniform – there’d be many more darts hitting the dartboard than hitting the wall (unless I was playing). If you assumed a uniform distribution, your estimate would be off!

That’s obviously a trivial example. But in reality, the Monte Carlo method is incredibly useful for a wide range of purposes. It was used during World War II by the Manhattan project to help design the first atom bomb! They needed to figure out how to create a critical mass that would sustain a nuclear chain reaction; to do that, they needed to be able to compute neutron diffusion through a mass of radioactive uranium. But that’s a very hard problem: there are so many degrees of freedom – so many places where things could proceed in several seemingly (or actually!) random directions. With the computers they had available to them at the time, there was absolutely no way that they could write a precise numerical simulation!

But, luckily for them, they had some amazing mathematicians working on the problem! One of them, Stanislaw Ulam, had been sick, and while he was recovering, fooled around with some mathematical problems. One of them involved a solitaire game, called Canfield. Ulam wanted to figure out how often the game of Canfield was winnable. He couldn’t figure out how to do it analytically, but he realized that since the deals of cards are drawn from a uniform distribution, if you were to take a computer and make it play through 1000 games, the number of times that it won would be a pretty good estimate of how many times the game was winnable in general.

In that case, it’s obvious that a complete solution isn’t feasible: there are 52! possible deals – roughly 8\times 10^{67}! But with just a couple of hundred trials, you can get a really good estimate.

Ulam figured that out for the card game. He explained it to John von Neumann, and von Neumann realized that the same basic method could be used for the neutron diffusion process!

Since then, it’s been used as the basis for a widely applicable approach to numeric integration. It’s used for numerous physics simulations, where there is no tractable exact solution – for example, weather prediction. (We’ve been able to get reasonably accurate weather predictions up to six days in advance, using very sparse input data, by using Monte Carlo methods!) It’s an incredibly useful, valuable technique – and anyone who suggests that using Monte Carlo is in any way a half-assed solution is an utter jackass.

I’ll finish up with a beautiful example – my favorite example of combining analytical methods with Monte Carlo. It’s another way of computing an estimate of \pi, but it gets a more accurate result with fewer trials than the sand/darts approach.

It’s based on a problem called Buffon’s needle. Buffon’s needle is a problem first proposed by the Count of Buffon during the 1700s. He asked: suppose I drop a needle onto a panelled wood floor. What’s the probability that the needle will fall so that it crosses one of the joints between different boards?

Using some very nice analytical work, you can show that if the panels have uniform width t, and the needle has length l, then the probability of a needle crossing a line is: \frac{2l}{\pi t}. That gives us the nice property that if we let l = \frac{t}{2}, then the probability of crossing a line is \frac{1}{\pi}.

Using that, you can do a Monte Carlo computation: take a sheet of paper, and a couple of hundred matchsticks. Draw lines on the paper, separated by twice the length of a matchstick. Then scatter the matchsticks all over the paper. Divide the total number of matchsticks by the number that crossed a line. The result will be roughly \pi.

For example – with 200 trials, I got 63 crossing a line. That gives me roughly 3.17 as an estimate of \pi. That’s not half-bad for a five minute experimental estimate!
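
If you don’t have matchsticks handy, the same experiment is easy to simulate. Here’s a rough sketch in Python (my own illustration, not from the post). Note that it quietly cheats by using \pi to pick a uniformly random needle angle, which is fine for illustrating the method:

```python
import math
import random

def buffon_estimate(drops: int, t: float = 2.0, l: float = 1.0) -> float:
    """Estimate pi via Buffon's needle, with needle length l = t/2.

    For each drop, pick the distance from the needle's center to the nearest
    line (uniform on [0, t/2]) and a random orientation; the needle crosses a
    line when (l/2) * sin(angle) reaches that distance. With l = t/2 the
    crossing probability is 1/pi, so drops / crossings estimates pi.
    """
    crossings = 0
    for _ in range(drops):
        center = random.uniform(0.0, t / 2.0)       # distance to nearest line
        angle = random.uniform(0.0, math.pi / 2.0)  # needle orientation
        if (l / 2.0) * math.sin(angle) >= center:
            crossings += 1
    return drops / crossings

if __name__ == "__main__":
    print(buffon_estimate(1_000_000))
```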

Squishy Equivalence with Homotopy

In topology, we always talk about the idea of continuous deformation. For example, we say that two spaces are equivalent if you can squish one into the other – if your space was made of clay, you could reshape it into the other just by squishing and molding, without ever tearing or gluing edges.

That’s a really nice intuition. But it’s a very informal intuition. And it suffers from the usual problem with informal intuition: it’s imprecise. There’s a reason why math is formal: because it needs to be! Intuition is great, as far as it goes, but if you really want to be able to understand what a concept means, you need to go beyond just intuition. That’s what math is all about!

We did already talk about what topological equivalence really is, using homeomorphism. But homeomorphism is not the easiest idea, and it’s really hard to see just how it connects back to the idea of continuous deformation.

What we’re going to do in this post is look at a related concept, called homotopy. Homotopy captures the idea of continuous deformation in a formal way, and using it, we can define a form of homotopic equivalence. It’s not quite equivalent to homeomorphism: if two spaces are homeomorphic, they’re always homotopy equivalent; but there are homotopy equivalent spaces that aren’t homeomorphic.

How can we capture the idea of continuous transformation? We’ll start by looking at it in functions: suppose I’ve got two functions, f and g. Both f and g map from points in a topological space A to a topological space B. What does it mean to say that the function f can be continuously transformed to g?

We can do it using a really neat trick. We’ll take the unit interval space – the interval from 0 to 1, treated as a topological space with the usual absolute-difference metric. Call it U = [0, 1].

f can be continuously deformed into g if, and only if, there is a continuous function t: A \times U \rightarrow B, where \forall a \in A: t(a, 0) = f(a) \land t(a, 1) = g(a).

If that’s true, then we say t is a homotopy between f and g, and that f and g are homotopic.
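
To make that concrete, here’s a standard example (my illustration, not part of the original definition): suppose B is the plane \mathbb{R}^2, or any convex subset of it. Then any two continuous functions f and g from A to B are homotopic, using the straight-line homotopy t(a, s) = (1 - s)f(a) + s g(a). At s = 0 that’s just f(a); at s = 1 it’s g(a); and since it’s built out of continuous operations, it’s a continuous function from A \times U to B. Intuitively, it slides each point f(a) toward g(a) along the straight line between them.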

That’s just the first step. Homotopy, the way we just defined it, doesn’t say anything about topological spaces. We’ve got two spaces, but we’re not looking at how to transform one space into the other; we’re just looking at functions that map between the spaces. Homotopy says when two functions between two spaces are loosely equivalent, because one can be continuously deformed into the other.

To get from there to the idea of transformability of spaces, we need to think about what we’re trying to say. We want to say that a space A can be transformed into a space B. What does that really mean?

One way to say it would be that if I’ve got A, I can mush it into a shape B, and then mush it back to A, without ever tearing or gluing anything. Putting that in terms of functions instead of squishies, that means that there’s a continuous function f from A to B, and then a continuous function g back from B to A. It’s not enough just to have that pair of functions: if you apply f to map A to B, and then apply g to map back, you need to get back something that’s indistinguishable from what you started with.

Formally, if A and B are topological spaces, and f: A \rightarrow B and g: B \rightarrow A are continuous functions, then the spaces A and B are homotopically equivalent – equivalent over squishing and remolding, but not tearing or gluing – if g \circ f is homotopic to the identity function on A, and f \circ g is homotopic to the identity function on B.
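
Here’s a quick standard example (again, my illustration) of why this is weaker than homeomorphism: let A be a single point \{p\}, and let B be the plane. Define f: A \rightarrow B by f(p) = 0 (the origin), and g: B \rightarrow A by g(x) = p for every x. Then g \circ f is exactly the identity on A, and f \circ g sends every point of the plane to the origin – which is homotopic to the identity on B via the straight-line homotopy t(x, s) = s \cdot x. So a single point and the whole plane are homotopy equivalent, even though they obviously aren’t homeomorphic.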

That captures exactly the notion of continuous transformation that we tried to get with the intuition at the start. Only now it’s complete and precise – we’ve gotten rid of the fuzziness of intuition.

Multiplying Spaces

When people talk informally about topology, we always say that the basic idea of equivalence is that two spaces are equivalent if they can be bent, stretched, smushed, or twisted into each other, without tearing or gluing. A mug is the same shape as a donut, because you can make a donut out of clay, and then shape that donut into a mug without tearing, punching holes, or gluing pieces together. A sphere is the same shape as a cube, because if you’ve got a clay sphere, you can easily reshape it into a cube, and vice-versa.

Homeomorphism is the actual formal definition of that sense of equivalence. The intuition is fantastic – it’s one of the best informal descriptions of a difficult formal concept that I know of in math! But it’s not ideal. When you take a formal idea and make it informal, you always lose some details.

What we’re going to do here is try to work our way gradually through the idea of transformability and topological equivalence, so that we can really understand it. Before we can do that, we need to be able to talk about what a continuous transformation is. To talk about continuous transformations, we need to be able to talk about some topological ideas called homotopy and isotopy. And to be able to define those, we need to be able to use topological products. (Whew! Nothing is ever easy, is it?) So today’s post is really about topological products!

The easiest way that I can think of to explain the product of two topological spaces is to say that it’s a way of combining the structures of the spaces by adding dimensions. For example, if you start with two spaces, each of which is a line segment, the product of those two spaces is a filled square – which, topologically, is the same thing as a disk, an octagon, or any other simple filled shape. You started with two one-dimensional spaces, and used them to create a new two-dimensional space. If you start with a circle and a line, the product is a cylinder.

In more formal terms, topological products are a direct extension of cartesian set products. As the mantra goes, topological spaces are just sets with structure, which means that the product of two topological spaces is just the cartesian product of their point-sets, plus a combined structure that preserves attributes of the original structure of the spaces.

Let’s start with a reminder of what the cartesian product of two sets is. Given a set A and a set B, the cartesian product A \times B is defined as the set of all possible pairs (a, b), where a \in A and b \in B. If A=\{1, 2, 3\} and B=\{4, 5\}, then A\times B = \{ (1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5)  \}.

In category theory, we take the basic idea of the cartesian product, and extend it to a general product of different mathematical objects. We do this by using the idea of projections. In this model, instead of saying that the product of sets A and B is a set of pairs (a, b), we can instead say that the product is a set S of objects, and two functions P_A : S \rightarrow A and P_B : S \rightarrow B. (To be complete, we’d need to add some conditions, but the idea should be clear from this much.) Given any object in the product set S, P_A(S) will give us the projection of that object onto A. This becomes more interesting when we consider sets of objects. The A-projection of a collection of points from the product set S is the shadow that those points cast onto the set A.

A topological product is easiest to understand with that categorical approach. The set of points in a product space A \times B is the cartesian product of the set of points in A and the set of points in B. The trick, with topologies, is that you need to describe the topological structure of the product set: you need to be able to say what the neighborhoods are. There are lots of ways that you could define the neighborhoods of the product, but we define it as the topological space with the smallest collection of open sets that still lets the projections behave properly. To understand how we get that, the projections of the category-theoretical approach make it much easier.

Informally, the neighborhoods in the product A \times B are things that cast shadows into the topological spaces A and B which are neighborhoods in A and B.

Suppose we have topological spaces A and B. If S is the product space A \times B, then it has projection functions P_A: S \rightarrow A and P_B: S \rightarrow B.

The projection functions from the product need to maintain the topological structure of the original topologies. That means that the projection functions must be continuous. And that, in turn, means that the inverse image of any open set under a projection function is an open set. So: for each open set O in A, P_A^{-1}(O) is an open set in S.
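
To see what that means in a familiar setting (a standard example, not from the post): think of the plane as the product of two copies of the real line. If O = (a, b) is an open interval in the first factor, then P_A^{-1}(O) is the vertical strip (a, b) \times \mathbb{R}, which is an open set in the plane. Intersecting a vertical strip with a horizontal strip P_B^{-1}((c, d)) gives an open rectangle (a, b) \times (c, d), and unions of those open rectangles give exactly the usual open sets of the plane – so the product topology on the plane agrees with its ordinary topology.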

Let’s look at an example. We’ll start with two simple topological spaces – a cartesian plane (2d), and a line (1d). In the plane, the neighborhoods are open circles; in the line, the neighborhoods are open intervals. I’ve illustrated those below.

[Image: the open neighborhoods of the two factor spaces – open circles in the plane, and open intervals on the line]

The product of those two is a three dimensional space. The neighborhoods in this space are cylinders. If you use the projection from the product to the plane, you get open circles – the neighborhood structure of the plane. If you use the projection from the product to the line, you get open intervals – the neighborhood structure of the line.

[Image: a cylindrical open neighborhood in the three-dimensional product space]

One interesting side-point here. One thing that we come across constantly in this kind of formal math is the axiom of choice. The AoC is an annoying bugger, because it varies from being obviously true to being obviously ridiculously false. Topological products are one of the places where it’s obviously true. The axiom of choice is equivalent to the statement that given a collection of non-empty topological spaces, the product space is not empty. Obvious, right? But then look at the Banach-Tarski paradox.

Friday Random Ten (2/28): Music for the new site!

I haven’t done one of these in quite a while. The new home of this blog seems like a good excuse to start again.

  1. Adrian Belew, “Troubles”: Adrian Belew did an absolutely fantastic set of three solo albums of divinely weird music, called Side One, Side Two, and Side Three. This is the first track off of the third: Belew playing funky blues. Pure fun.
  2. Genesis, “Afterglow”: Old Genesis; is there anything better to an old proghead? I love Wind and Wuthering – it’s the best of the post-Gabriel Genesis.
  3. Marillion, “The Hollow Man”: and we transition from a sad old Genesis song, to a sad old Marillion song. Hollow Man is a beautiful, simple track, which really shows off Hogarth’s vocals, and Rothery’s guitar.
  4. Gogol Bordello, “Harem in Tuscany”: After two sad songs, this is a wonderful change. Eastern European Gypsy Punk!
  5. Do Make Say Think, “You, you’re a history in rust”: Great post-rock. Very dense, atmospheric. Perfect music to work to – grabs you, draws you in, engages you, but doesn’t distract you.
  6. Reddy, “Hamster Theatre”: This one, I struggle to describe. I’ve been told that genre-wise, they’re “Rock in Opposition” – but the only other group I know that anyone calls RiO is Thinking Plague, which shares members. This is mostly instrumental, with elements of rock, jazz, and European folk, played by a band featuring sax and accordion. I don’t know what the heck it is. It’s definitely not something I want to listen to frequently, but when I’m in the mood, it’s terrific.
  7. Djam Karet, “The Great Plains of North Dakota”: Anyone who knows me – especially anyone who’s read any of my past FRTs – knows that I’m a big old proghead. Instrumental prog, though, is frequently a bit of a tough sell for me. Too often, far too often, listening to instrumental prog is rather like watching someone masturbate – content free, emotion free, done solely for the gratification of the performer. Djam Karet is one of the instrumental bands that is not like that at all: they’re absolutely brilliant.
  8. Sylvan, “The Fountain of Glow, Pt. 2”: More prog. For some reason, I just can’t get into Sylvan. I can’t say what it is about them, but even though they seem like they should be right in my musical territory, they just don’t work for me.
  9. NOW Ensemble, “Waiting in the Rain for Snow”: One of my more recent musical loves is post-classical music. There’s a wonderful little label based out of NY called “New Amsterdam”, and I’ve learned to pretty much buy all of their albums, sight unseen, as soon as they come out. They’re hard to describe – but 20th century classical chamber music blended with rock is enough to give you a sense. NOW is one of New Amsterdam’s house ensembles. They’re towards the more classical end of the NA spectrum, with a minimalist feel. Absolutely brilliant stuff.
  10. William Brittelle, “Powaballad”: After NOW, I had to listen to another New Amsterdam artist. And this is amazingly weird in comparison to NOW. It’s still in that same basic family – very much the rock/classical chamber fusion – but with more of the rock side mixed in, and a much less traditional classical structure.

Bitcoin, MtGox, and Deflation

This is the last bitcoin post. Here I’ll try to answer the questions that led me to start writing about it.

What is MtGox? What happened there?

MtGox was a company that maintained bitcoin wallets. The basic idea is that they acted like a bank/broker for bitcoins. If you want to get bitcoins, you can go to someone like MtGox, and give them some money. They create a public/private keypair for you, and use it to create a transaction giving you the bitcoins. When you want to make a purchase, you go to your MtGox account and tell them to transfer the bitcoins; they use your key to sign the transaction, and then broadcast it to the bitcoin network.

By using MtGox, you don’t need to have a program that participates in the bitcoin network to do transactions. You don’t need to worry about keeping your keys safe. You don’t need to have software capable of generating and signing transactions. All you need is your web-browser, to log in to MtGox.

Here’s where the problems start: MtGox didn’t start off as a bitcoin bank. In fact, they started off about as far from banking as you can imagine. From the name, you might think that MtGox is named after a mountain. Nope! It’s an acronym, for “Magic: the Gathering Online Exchange”. MtGox started off as a trading card exchange market.

This continues to boggle my mind. I just can’t quite wrap my head around it. A hacked-together trading card exchange site decides to start acting as a sort of electronic bank/currency broker. And people trusted them with hundreds of millions of dollars!

What happened is completely predictable.

You have an online site that manages massive quantities of money. Criminals are going to try to steal from it. Hell, when I was administering Scientopia, at least once a week, I’d get email from someone with some kind of scam to try to manipulate google ads with fake clickthroughs, offering to split the profit. Scientopia’s revenue was only in the hundred-dollars-a-month range – but it was still enough to attract crooks and scammers. Imagine what happens when it’s not a hundred dollars to be made, but $100,000,000?!

Crooks tried to steal money from MtGox. From what we know (there’s still a lot about this that’s still being figured out), they succeeded. They found a weakness in the MtGox implementation of the bitcoin protocol, and they exploited it to steal a massive number of bitcoins.

The ridiculous thing about all of this is, as I said above, it was totally predictable. You should never just hack together cryptosystems. You should never just hack together anything that handles money. When you hack together a cryptosystem that handles money, it’s pretty much a given that money is going to get lost.

If you want to deal with money, you need to be really, really serious about security. That doesn’t just mean making sure that you write careful code. It means having an entire staff of people whose job it is to make sure that you don’t fuck up. It means having people working full time, trying to break your system – because if they can break it, so can someone else! It means having a strongly adversarial setup, where the people trying to break it really want to break it – they can’t be the same people who want it to not get broken. It means having a different team of people whose full-time job is auditing – constantly watching the system, checking transactions, verifying them, making sure that everything is working correctly, catching any potential problems the moment they start, instead of letting them continue until they become disasters.

MtGox had none of that. It was a hacked together site. To get a sense of the way it was built, just look at the CEO’s blog, where he talks about implementing SSH in PHP. I’m not saying that he used this SSH code in MtGox – but read it, and read the comments, and you’ll get a sense of how poorly he understands security issues.

What does it mean when people say that Bitcoin is deflationary?

When you read the hype around bitcoin, you also see a lot of criticisms from the skeptics. I am one of the skeptics, but I’m trying to be as fair as I can in these posts. One of the criticisms that you constantly see is that Bitcoin is deflationary.

As I mentioned in yesterday’s post, the only source of new bitcoins is mining. Each time the ledger gets updated with a new block in the blockchain, the person who generated the solution for that block gets a bounty, in the form of newly created bitcoins. Today, the bounty for a block is 25 bitcoins. But the bitcoin protocol specifies that that bounty will gradually decline and eventually disappear. When that happens, the miners will receive a commission, in the form of a transaction fee for transactions in the new block, but they won’t get new bitcoins. When the system gets to that point, the supply of bitcoins will be fixed: no new bitcoins, ever.
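
To see concretely why the supply tops out, here’s a rough sketch of the issuance schedule (my own illustration, using numbers from the actual protocol that aren’t discussed in this post: the reward started at 50 bitcoins and is cut in half every 210,000 blocks; the real protocol works in whole satoshis rather than floats):

```python
# Rough sketch of bitcoin's issuance schedule: the block reward started at
# 50 BTC and halves every 210,000 blocks until it rounds down to nothing.
reward = 50.0
total = 0.0
while reward >= 1e-8:       # 1e-8 BTC (one "satoshi") is the smallest unit
    total += reward * 210_000
    reward /= 2
print(total)                # just under 21 million coins, and it never grows
```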

Lots of people think that that’s a good thing. After all, inflation sucks, right? This will be a fixed supply of money, whose value can’t be manipulated by politicians.

The catch is that nothing is ever that simple.

First: the fact that new bitcoins will not be issued means that the total supply of bitcoins will decline. People die without giving their passwords to their heirs. Passwords get lost. People forget about bank accounts. All of those things mean that bitcoins fall out of circulation. So not only is the supply of bitcoins going to stop increasing, it’s going to start decreasing. In fact, the bitcoin folks are completely open about this:

Because of the law of supply and demand, when fewer bitcoins are available the ones that are left will be in higher demand, and therefore will have a higher value. So, as Bitcoins are lost, the remaining bitcoins will eventually increase in value to compensate. As the value of a bitcoin increases, the number of bitcoins required to purchase an item decreases. This is a deflationary economic model. As the average transaction size reduces, transactions will probably be denominated in sub-units of a bitcoin such as millibitcoins (“Millies”) or microbitcoins (“Mikes”).

Is it really a problem? Maybe. I don’t know enough about economics to have a strong opinion, but it’s certainly enough to be worrying. The argument runs as follows:

When the supply of money is decreasing, it means that there’s less money available for making purchases – which means that the value of the money needs to increase. A bitcoin will need to be able to purchase more today than it did yesterday. And that is a serious problem.

Economies work best when money is kept moving. In an ideal world, money isn’t an asset at all: it’s just a medium. You want people to make products, sell them to other people, and then use the money that they made. If they take their money and hide it in a mattress, there’s going to be less activity in the economy than if they used it. The whole idea of money is just to make it easier to match up producers and consumers; when money is taken out of the system, it means that there’s potential economic activity that can’t happen, because the money to make it happen has been withdrawn from the system.

This is why most governments try to run their economies so that there is a moderate amount of inflation. Inflation means that if you take your money and hide it in your mattress, its value will slowly decrease. It means that withdrawing your money from the system is a losing proposition! So a bit of inflation acts as a motivation to put your money to work producing something.

Deflation, on the other hand, does the opposite. Suppose that today, I’ve got 10 bitcoins and 100 dollars, and they’re worth the same amount of money. I’m going to go buy some bacon. I can spend $10 buying bacon, and keep $90 and 10 bitcoins; or I can spend 1 bitcoin, and keep 9 bitcoins and $100. So overall, I’ve got the equivalent of $190 and some bacon.

Next week, the value of bitcoins has risen to $15/bitcoin. If I spent my dollars to buy bacon, then now I’ve got $150 worth of bitcoins, $90 worth of dollars, and some bacon – my total assets are equal to $240 and some bacon. If I spent my bitcoin, then I’d have $135 worth of bitcoins, $100 worth of dollars, and some bacon – $235. If I used my bitcoin to buy stuff, I lost $5.

That means that I’m strongly motivated to not use my bitcoins. And that’s not a good thing. That kind of deflation is very harmful to an economy – for example, look at Japan during the 1990s and 2000s, and to some extent still today.

The Tech of Bitcoin

Now we can get to the stuff about bitcoin that’s actually interesting. How does it work?

Before I start, one major caveat. I’m deliberately talking about this in vague terms – enough so that you can understand how and why it works, but not enough so that you could do something like implement it. Like anything else involving cryptography: if you’re thinking about implementing your own cryptosystem, don’t!

Cryptography is an area where even seasoned experts can easily make mistakes. A serious cryptosystem is built by a team of professionals, including people whose entire job is to do everything in their power to break it. And even then, it’s all too easy to wind up with un-noticed bugs! When it comes to something like bitcoin, an inexperienced cryptographer trying to implement a new agent for the bitcoin network is insane: you’re dealing with money, and you’re risking losing a whole lot of it if you screw up. (That’s basically what appears to have happened to MtGox – and if the reports I’m reading are correct, they managed to lose hundreds of millions of dollars worth of bitcoins.)

On to the fun part!

Basically, bitcoin is a protocol. That means that it’s really just a system that defines how to communicate information between a collection of computers. Everything about bitcoin is defined by that protocol.

At the heart of the protocol is the ledger. The ledger is a list of transactions that says, essentially A gave N bitcoins to B. The only way to get a bitcoin is by a transaction: there needs to be a transaction saying that someone or something gave you a bitcoin. Once you have one, you can transfer it to someone else, by adding a new transaction to the ledger.

That’s the basic idea of it. The beauty of bitcoin is that at the core, it’s incredibly simple – it’s just a list of transactions. What makes it interesting is the cryptographic magic that makes it work. In theory, the ledger could be simple text. Each line would contain a sender, a receiver, and a quantity. If you have that, it’s enough to manage transactions. But there are a bunch of problems with a simple ledger approach, which mostly involve making sure that transactions in the ledger are valid and that they can’t be changed.
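
Just to make the “simple text ledger” idea concrete, here’s a toy sketch in Python (my own illustration, with made-up names – it has none of the properties that make bitcoin interesting, which is exactly the point of the rest of this post):

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    sender: str     # who is giving up the coins
    receiver: str   # who is receiving them
    amount: float   # how many bitcoins move

# A naive ledger is nothing but an ordered list of transactions.
ledger = [
    Transaction("mint", "MarkCC", 25.0),
    Transaction("MarkCC", "Barbera", 1.0),
]

def balance(name: str) -> float:
    """Compute a balance by replaying the entire ledger."""
    received = sum(t.amount for t in ledger if t.receiver == name)
    spent = sum(t.amount for t in ledger if t.sender == name)
    return received - spent

print(balance("MarkCC"))   # 24.0 - but nothing stops anyone from editing the list!
```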

In addition, there are problems that come about because of the fact that bitcoin wants to be completely decentralized. There is no authority in charge of bitcoin. That means that you need to have a consensus based system. Anyone can join the network of bitcoin managers at any time, and anyone in the network can drop out at any time – but the network as a whole must always have a consensus about what the current ledger is. There’s also the question of where the coins come from!

We’ll start with the question of authentication.

Suppose I own a bitcoin, and I want to use it to buy a loaf of bread from friendly baker, Barbera. I need to give Barbera my bitcoin. To do that, I need to add an entry to the ledger saying “MarkCC gave one bitcoin to Barbera”. The problem is, the ledger entry needs to contain something to prove that I’m really the owner of the bitcoin that I’m trying to transfer! If it doesn’t do that, then my arch-nemesis, Tony the thief, could just add a ledger entry saying “MarkCC gave one bitcoin to Tony”. Only I should be able to add a ledger entry transferring my bitcoin.

This one is simple: it’s a textbook use-case for public key cryptography. In a public key system, you have key-pairs. One member of the key pair is called the public key, and the other is the private key. Anything encrypted with the public key can only be decrypted using the private key; anything encrypted with the private key can only be decrypted with the public key. So you get a key pair, lock the private key away somewhere safe, and give away copies of the public key to anyone who wants it. Then if someone sees a message that can be decoded with your public key, it means that you must have been the one who sent it. No one else could encrypt a message using your private key!

In the bitcoin ledger, the lines are signed. Instead of directly saying “MarkCC gave one bitcoin to Barbera”, they say “One bitcoin was transferred from the owner of MarkCC’s cryptokey to the owner of Barbera’s cryptokey”. The ledger entry for that transaction is signed using a signature generated from the ledger entry using MarkCC’s private key. Anyone can verify the validity of the transaction by checking the signature using MarkCC’s public key. Once that’s in the ledger, Barbera now owns a bitcoin, and only Barbera (or whoever has access to Barbera’s private key) can transfer that bitcoin to anyone else.
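
Here’s a sketch of what signing and verifying a ledger entry looks like, using the third-party Python cryptography package. To be clear about the assumptions: bitcoin actually signs transactions with ECDSA over the secp256k1 curve, and the entry below is just an illustrative string – I’m using Ed25519 only because it keeps the example short:

```python
# Sketch of signing/verifying a ledger entry; requires "pip install cryptography".
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# MarkCC's keypair: the private key stays secret, the public key is shared.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

entry = b"1 bitcoin: from the owner of MarkCC's key to the owner of Barbera's key"
signature = private_key.sign(entry)

# Anyone holding the public key can check that the entry was produced by the
# holder of the matching private key, and that it hasn't been altered.
try:
    public_key.verify(signature, entry)
    print("valid transaction")
except InvalidSignature:
    print("forged or altered transaction")
```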

Now, on to the complicated part! There is no authoritative ledger in bitcoin. There are many copies – thousands of copies! – of the ledger, and they’re all equal. So how can you be sure that a transaction is real? Someone could write a transaction to a ledger, show you the ledger, and then you could find out that in every ledger except the one you looked at, the transaction doesn’t exist! If there is no single place that you can check to be sure that a transaction is in the ledger, how can you be sure that a transaction is real?

The answer is a combination of a consensus protocol, and a bit of computational cleverness. When you want to add a ledger entry, you broadcast a message to the bitcoin network. Every 10 minutes or so, participants in the bitcoin network take whatever transactions were added to the current ledger, put them into a structure called a block, and perform a very difficult, semi-random computational task using the block. When they find a solution, they sign the block using the solution, and broadcast it to the network.

The first agent in the bitcoin network that completes a solution and broadcasts it to the network “wins” – their block becomes the new basis for the next block.

The blocks form a chain: block 1 contains some transactions; block 2 contains more transactions that happened after block 1 was broadcast; block 3 contains transactions that happened after block 2 was broadcast, and so on.

At any time, the consensus ledger – that is the master, canonical ledger – is the one with the longest verifiable chain. So someone can broadcast a new solution to an old block, but it will be ignored, because there is already a longer chain in the consensus. You can be certain that your transaction can’t be revoked or lost once you see a new block issued that builds on the block containing your transaction.

This all relies on people finding solutions to computationally expensive problems. Why would anyone do it? That’s why this process of computing hashes for the blocks is called mining: because if you’re the first one who finds a solution for a block, then you get to add a transaction giving yourself 25 brand new bitcoins! Mining – the process of maintaining the ledger – becomes a sort of lottery, where the people doing the mining randomly get bonuses to motivate them to do it.

The computational side of it is clever in its simplicity. The bitcoiners want the problem to be hard enough to make it effectively impossible to cheat. But they also want to be sure that it’s easy enough that they get blocks frequently. If it takes an hour before anyone has a solution for a new block, that’s a problem: if it takes that long to commit a transaction to the ledger, then people aren’t going to trust bitcoin for fast transactions. They can’t be sure that a bitcoin was transferred to them unless they know that the transaction was committed in an accepted block. But there’s a trick there: people want to get the mining rewards. That means that they’re constantly trying to push the limits of what they can get away with computationally. People started with bunches of PCs, and then discovered that the GPUs on their graphics cards could do it faster, and then started building custom PCs with tons of graphics cards to do bitcoin mining computations in parallel. And of course, there’s always Moore’s law: computers are constantly getting faster. That means that they can’t just pick a particular complexity for the problem and stick with it.

The solution is to make the problem variable. They start with a well known algorithm that’s a very good one-way problem (meaning that it’s relatively easy to compute a result given an input, but very, very hard to figure out the inverse – to get an input that produces a desired result). In slightly more mathematical terms, if y=f(x), then computing y is easy if you know x, but it’s very hard to compute x if you know y.

There are a bunch of well known one-way computations. They just picked one, called SHA-256. Now the clever part: they turn SHA-256 into a variable complexity problem by picking a threshold T, agreed on by consensus in the bitcoin ledger protocol. The solution for a block is a bit of extra data which, combined with the block, hashes to a value smaller than T: for a ledger block L, they need to find a value N, called a nonce, where \textbf{SHA}(L + N) < T.

Because SHA-256 is a one-way function, there’s no good way to predict what value of N will give them a hash value that’s smaller than the threshold – the only way to do it is to just keep guessing new N-values, and hoping that one of them will produce an acceptable result. By reducing the value of T, you can make the problem harder; by increasing the value of T, you can make it easier. The bitcoin protocol specifies a regular interval and an algorithm for selecting a new T.
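
Here’s a rough sketch of that nonce search in Python, using the standard hashlib module. It’s my own simplification: the real protocol double-hashes a specific block-header format and encodes the target differently, but expressing the threshold as a number of leading zero bits shows the same idea:

```python
import hashlib

def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Search for a nonce N such that SHA-256(block || N) is below a threshold.

    A smaller threshold means fewer acceptable hashes, so a harder search;
    here the threshold is expressed as a count of leading zero bits.
    """
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

block = b"...serialized transactions for this block..."
nonce = mine(block, difficulty_bits=20)   # roughly 1 in 2**20 guesses succeeds
print(nonce, hashlib.sha256(block + nonce.to_bytes(8, "big")).hexdigest())
```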

When a miner solves the problem, they publish the new ledger block to the bitcoin network, with the new ledger section and its hash and nonce values. Once a new block is issued, all future ledger entries can only get added to the next unsolved block in the ledger.

The reason that this is safe is a matter of computation. You can go back in time, and find an old transaction, remove an entry from it, and recalculate the block. But it takes time, and other people are still moving on, computing new blocks. For your change to be accepted by the bitcoin network, you would need to issue a new version of the altered block, plus new versions of any other blocks issued since the one you altered, and you’d have to do it before anyone else could issue a new block. The consensus is the longest block-chain, so issuing a chain that isn’t longer than the longest chain in the network is a waste of time. Because the computation is hard, even being one block behind is enough to make it effectively impossible to change the ledger: to become the new longest chain when you’re just one block behind means you’d need to compute solutions for three blocks before anyone else could find a solution for just one more block!

That last bit isn’t so clear when you read it, so let’s work through an example.

  1. Block B is issued
  2. Block C is issued
  3. You want to change block B.
  4. The current longest blockchain is […, B, C].
  5. To replace B with a new block B’, you need to issue a longer blockchain than the current consensus of […, B, C].
  6. That means that you need to issue B’, C’, and a new block D’ before anyone else can just issue D.
  7. By falling just one block behind, you need to issue three new blocks before anyone else can issue just one.

And that, my friends, is effectively impossible.

Money and Bitcoins part 1: What is money?

Bitcoin has been in the news a lot lately. I’ve been trying to ignore it, even though lots of people have sent me questions about it.

In the last couple of days, Bitcoin has been in the news even more. MtGox, one of the major companies associated with bitcoin, has gotten into serious trouble. It’s not entirely clear what exactly is going on, but it appears that MtGox lost a massive quantity of bitcoins. As a result, they’re almost certainly going bankrupt, and a whole lot of people are about to lose/have already lost a huge amount of money. This burst of news about MtGox has turned the trickle of questions into a flood.

In order to shut you all up, I’m going to try to answer the questions. I started off working on one post, but it’s gotten out of hand – over 5000 words, and not finished yet! Instead of just posting that monstrosity as one gigantic mega-post, I’m splitting it up into several posts.

To understand bitcoin, first you need to understand what money really is. That’s going to be the focus of this post: what is money? And how is bitcoin both similar to, and different from, other things that we call money?

Money is, ultimately, an elaborate shell game. Currency is a bunch of worthless tokens that we all agree to pretend are worth something, so that we have some way of exchanging valuable things in a reasonable, fair, and easy way.

To understand that, let’s start with a really simple scenario. We’ve got two farmers: Will grows wheat and turns it into flour, and Pam raises pigs and chickens. Will has lots of wheat, so he can grind it into flour and make bread – but he’d also like to be able to have some bacon with his toast for breakfast. Pam, meanwhile, has all the bacon she can eat (lucky lady!), but she’d like to have some bread, so that she can make a BLT for lunch.

There’s an easy solution. Will goes over to Pam’s place, and offers her some bread in exchange for some bacon. Between them, they figure out how much bread is worth how much bacon, and make the trade. And hurrah! both are happy. This is simple barter – no money needed.

Things become more complicated when we add more people to the story. We can add Mary the miller who gets wheat from Will and grinds it into flour, and Bob the Baker who gets flour from Mary and bakes it into bread. Now the process of making bread has gotten more elaborate: the person who grows the wheat is no longer the person who sells the final product to the pig-farmer!

Now if Will wants his bacon, it’s harder. He has wheat, but Pam doesn’t want wheat, she wants bread! As long as Mary and Bob both like bacon, this can be made to work. Bob can go to Pam, and trade bread for her bacon. Then he can take some of the bacon he got from Pam, and give it to Mary in exchange for flour. Mary can take part of the bacon she got from Bob, and give it to Will in exchange for wheat. Then Mary, Bob, and Will all have their bacon, and Pam has her bread, and everyone is happy. We’re still in the land of barter, without any money, but it’s getting difficult, because everything is stuck going through intermediaries.

This works, as long as everyone in the chain is happy with bread and bacon. But as things get more complicated, and you get more people involved, you get situations where you have people who want something that they can’t easily trade for what they have. We could add Phil the plowmaker to our scenario. Phil makes the best plows you’ve ever seen. If Phil wants to get some wheat, he’s in great shape: Will would love to get one of Phil’s plows to plow his fields. But Phil doesn’t want to deal with freshly harvested wheat – he wants bread and bacon! That means he’s got a problem: Pam has no use for a plow, and neither does Bob. In order to get bread and bacon in exchange for his plows, he somehow needs to get something that Pam and Bob want. He’s not part of the chain from Will to Pam.

If you’re sticking with barter, then poor Phil has a very complicated problem. He needs to go to Pam, and figure out what she wants in exchange for bacon – she could say she needs a new butcher’s knife, and she’ll trade that for bacon. So now Phil needs to find someone who’ll trade a plow for a butcher’s knife. He can go to Kevin the knifemaker, and see if he’ll take a plow. If Kevin doesn’t want a plow, then he needs to find out what Kevin wants, and see if he can find someone who’ll trade a plow for that. He’s stuck running around, trying to find the sequence of exchanges that give him what he wants. If he can’t find a chain, he’s stuck, and he’ll just have to give up and not have any bacon, even though he’s got beautiful plows.

The solution to this mess is to create a medium of exchange. You need to create something which has a symbolic value, which everyone is willing to trade for his or her stuff. Will exchanges his wheat for little green pieces of paper. Pam exchanges her bacon for little green pieces of paper. Phil exchanges his plows for little green pieces of paper. The little green pieces of paper are completely worthless themselves – they’re just green paper! But everyone recognizes that other people want them. Because the other people want them, they know that if they take some, they’ll be able to use them to get stuff from other people.

For Phil to get his bacon, there’s still got to be some chain: he exchanges his plow for some green paper; he goes to Pam, and gives her green paper in exchange for the bacon. Pam uses that green paper to get stuff she wants. The chain is still there: ultimately, what Phil did by giving Pam the green paper is give her something that she wanted. But instead of needing to find a concrete product or service that she wanted, he was able to give her symbolic tokens of value, which she could then exchange for whatever she wanted.

That’s what money is. It’s a symbolic medium, where we all agree that the symbolic tokens are worth something. The tokens only have value because we believe that other people will also take them in exchange for their stuff/their work. And because we believe that, we’ll take them in exchange for our work. When we do this, we’re simultaneously being perfectly rational and completely delusional. We’re trading the products of our work for something utterly worthless, because we’ve all agreed to pretend that the worthless stuff isn’t worthless.

Of course, when you’ve got valuable stuff moving around, governments get involved. In fact, governments exist largely to help make it safe for valuable stuff to move around! Money ends up getting tied to governments, because the governments exist in large part specifically to provide an enforceable legal system to manage the exchange of money.

In today’s world, money is just a set of tokens. If you go back a hundred years, all over the world, people had what they believed was a fundamentally different kind of money. Money was backed by gold. If you had a dollar, it wasn’t just a green piece of paper. It was a lightweight representation of about 1.67 grams of gold. You could go to the government with your dollars, and exchange them for that quantity of gold. According to many people, derogatorily called goldbugs, this was fundamentally different from today’s money, which they call fiat currency, because it was backed by a tangible valuable asset.

The problem with that argument is: why is gold any more valuable than green paper?

Gold is valuable because it is a pretty yellow metal, incredibly malleable and easy to work with, non-corrosive, and useful for a wide variety of artistic and industrial purposes. It’s also relatively rare. In other words: it’s valuable because people want it. It’s not valuable because it’s tangible! No one would say that a currency backed by rocks is intrinsically more valuable than a so-called fiat currency, even though rocks are tangible. People don’t want a lot of rocks, so rocks aren’t worth much.

Now we can finally get around to just what bitcoin is.

Bitcoin is a currency which isn’t backed by any government. In fact, it’s not backed by anyone. It’s a fundamentally decentralized system of currency. There’s no central authority behind it. Instead, it works based on an interesting protocol of computation over communication networks. Everything about it is distributed all over the world. You could pick any individual computer or group of computers involved in bitcoin, blow them to bits, and bitcoin would be unaffected, because there would still be other people in the bitcoin network. It’s all driven by distributed computation. A bitcoin is an utterly intangible asset. There is no coin in a bitcoin.

I’ll go into more detail in my next post. But the basic idea of bitcoin is really, really simple. There are a bunch of computers on the network that are keeping track of bitcoin transactions. Between them, they maintain a ledger, which consists of a list of transactions. Each transaction says, effectively, “X gave N bitcoins to Y”. The only way that you can own a bitcoin is if, somewhere in the ledger, there is a transaction saying that someone gave a bitcoin to you. There is no coin. There is no identifying object that represents a bitcoin. There’s not even anything like the serial number printed on a banknote. There is no bitcoin: there is just a transaction record in a ledger saying that someone gave you a bitcoin. There is absolutely no notion of traceability associated with a bitcoin: you can’t take a bitcoin that someone gave you, and ask where it came from. Bitcoins have no identity.

The point of it is specifically that intangibility. Bitcoin exists largely as a way to move valuable stuff around without the involvement of governments. Bitcoin is, really, just a way of making a computationally safe exchange of value, without transferring anything tangible, and without any single central authority in control of it. You know that someone gave you money, because there are thousands of computers around the world that have agreed on the fact that someone gave you money.

Obviously, there needs to be some technical muscle behind it to make people trust in those unknown entities managing the ledgers. We’ll talk about that in the next post.

What’s amusing about bitcoin is that in many ways, it’s no different from any other kind of money: it’s absolutely worthless, except when people agree to pretend that it isn’t. And yet, in bitcoin circles, you’ll constantly see people talking disdainfully about “fiat money”. But bitcoin is the ultimate in fiat: there’s nothing there except the ledger, and the only reason that anyone thinks it’s worth anything is because they’ve all agreed to pretend that it’s worth something.

Run! Hide your children! Protect them from math with letters!

Normally, I don’t write blog entries during work hours. I sometimes post stuff then, because it gets more traffic if it’s posted mid-day, but I don’t write. Except sometimes, when I come across something that’s so ridiculous, so offensive, so patently mind-bogglingly stupid that I can’t work until I say something. Today is one of those days.

In the US, many school systems have been adopting something called the Common Core. The Common Core is an attempt to come up with one basic set of educational standards that are applied consistently in all of the states. This probably sounds like a straightforward, obvious thing. In my experience, most Europeans are actually shocked that the US doesn’t have anything like this. (In fact, at best, it’s historically been standardized state-by-state, or even school district by school district.) In the US, a high school diploma doesn’t really mean anything: the standards are so widely varied that you can’t count on much of anything!

The total mishmash of standards is obviously pretty dumb. The Common Core is an attempt to rationalize it, so that no matter where you go to school, there should be some basic commonality: when you finish 5th grade, you should be able to read at a certain level, do math at a certain level, etc.

Obviously, the common core isn’t perfect. It isn’t even necessarily particularly good. (The US being the US, it’s mostly focused on standardized tests.) But it’s better than nothing.

But again, the US being the US, there’s a lot of resistance to it. Some of it comes from the flaky left, which worries about how common standards will stifle the creativity of their perfect little flower children. Some of it comes from the loony right, which worries about how it’s a federal takeover of the education system which is going to brainwash their kiddies into perfect little socialists.

But the worst, the absolute inexcusable worst, are the pig-ignorant jackasses who hate standards because it might turn children into adults who are less pig-ignorant than their parents. The poster child for this bullshit attitude is State Senator Al Melvin of Arizona. Senator Melvin repeats the usual right-wing claptrap about the federal government, and goes on to explain what he dislikes about the math standards.

The math standards, he says, teach “fuzzy math”. What makes it fuzzy math? Some of the problems use letters instead of numbers.

The state of Arizona should reject the Common Core math standards, because the math curriculum sometimes uses letters instead of numbers. After all, everyone knows that there’s nothing more to math than good old simple arithmetic! Letters in math problems are a liberal conspiracy to convince children to become gay!

The scary thing is that I’m not exaggerating here. An argument that I have, horrifyingly, heard several times from crazies is that letters are used in math classes to try to introduce moral relativism into math. They say that the whole reason for using letters is because with numbers, there’s one right answer. But letters don’t have a fixed value: you can change what the letters mean. And obviously, we’re introducing that into math because we want to make children think that questions don’t have a single correct answer.

No matter where in the world you go, you’ll find stupid people. I don’t think that the US is anything special when it comes to that. But it does seem like we’re more likely to take people like this, and put them into positions of power. How does a man who doesn’t know what algebra is get put into a position where he’s part of the committee that decides on educational standards for a state? What on earth is wrong with people who would elect someone like this?

Senator Melvin isn’t just some random guy who happened to get into the state legislature. He’s currently the front-runner in the election for Arizona’s next governor. Hey Arizona, don’t you think that maybe, just maybe, you should make sure that your governor knows high school algebra? I mean, really, do you think that if he can’t understand a variable in an equation, he’s going to be able to understand the state budget?!

Topological Spaces: Defining Shapes by Closeness

When people talk about what a topological space is, you’ll constantly hear one refrain: it’s just a set with structure!

I’m not a fan of that saying. It’s true, but it just doesn’t feel right to me. What makes a set into a topological space is a relationship between its members. That relationship – closeness – is defined by a structure in the set, but the structure isn’t the point; it’s just a building block that allows us to define the closeness relations.

The way that you define a topological space formally is:

A topological space is a triple (X, T, N), where X is a set of objects, called points; T is a set of subsets of X; and N is a function that maps each element of X to a collection of elements of T (called the neighborhoods of that point), where the following conditions hold:

  1. Neighborhood basis: \forall A \in N(p): p \in A: every neighborhood of a point must include that point.
  2. Neighborhood supersets: \forall A \in N(p): \forall B \subseteq X: B \supseteq A \Rightarrow B \in N(p). If B is a superset of a neighborhood of a point, then B must also be a neighborhood of that point.
  3. Neighborhood intersections: \forall A, B \in N(p): A \cap B \in N(p): the intersection of any two neighborhoods of a point is a neighborhood of that point.
  4. Neighborhood relations: \forall A \in N(p): \exists B \in N(p): \forall b \in B: A \in N(b). If A is a neighborhood of a point p, then there’s another neighborhood B of p, where A is also a neighborhood of every point in B.

The collection of sets T is called a topology on X, and the neighborhood relation N is called a neighborhood topology on X.
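
To see what those four conditions actually require, here’s a minimal sketch in Python (my own illustration, not part of the formal definition) that mechanically checks them on a tiny made-up example: a two-point set whose neighborhood assignment comes from the open sets \emptyset, \{1\}, and \{1, 2\}.

    from itertools import product

    # Two points; N maps each point to the collection of its neighborhoods.
    # This particular assignment is induced by the open sets {}, {1}, {1, 2}.
    X = frozenset({1, 2})
    N = {
        1: {frozenset({1}), frozenset({1, 2})},
        2: {frozenset({1, 2})},
    }

    def subsets(s):
        # Every subset of a finite set, as frozensets.
        elems = list(s)
        return [frozenset(e for e, keep in zip(elems, bits) if keep)
                for bits in product([False, True], repeat=len(elems))]

    # 1. Every neighborhood of p contains p.
    axiom1 = all(p in A for p in X for A in N[p])

    # 2. Any subset of X containing a neighborhood of p is also a neighborhood of p.
    axiom2 = all(B in N[p]
                 for p in X for A in N[p]
                 for B in subsets(X) if A <= B)

    # 3. The intersection of two neighborhoods of p is a neighborhood of p.
    axiom3 = all((A & B) in N[p] for p in X for A in N[p] for B in N[p])

    # 4. Every neighborhood A of p has some neighborhood B of p such that
    #    A is a neighborhood of every point of B.
    axiom4 = all(any(all(A in N[b] for b in B) for B in N[p])
                 for p in X for A in N[p])

    print(axiom1, axiom2, axiom3, axiom4)  # True True True True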

Like many formal definitions, this is both very precise, and not particularly informative. What the heck does it mean?

In the previous topology post, I talked about metric spaces. Every metric space is a topological space (but not vice-versa), and we can use that to help explain how the set-of-sets T defines a meaningful definition of closeness for a topological space.

In the metric space, we define open balls around each point in the space. Each one forms an open set around the point. For any point p in the metric space, there is a sequence of ever-larger open balls of points around p.

That sequence of open balls defines the closeness relation in the metric space:

  • a point q is closer to p than r is if q is in one of the open balls around p which r isn’t in. (In a metric space, that’s equivalent to saying that d(q, p) < d(r, p).)
  • two points q and r are equally close to p if there is no open ball around p where q is included but r isn’t, or where r is included but q isn’t. (In a metric space, that’s equivalent to saying that d(q, p) = d(r, p).)
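
Here’s a small sketch of that relation on the real line with the usual distance d(x, y) = |x - y|. It’s my own example, and the list of radii is just a finite stand-in for “all the open balls around p”.

    # Closeness via open balls on the real line, with d(x, y) = |x - y|.
    # The radii below are a finite sample standing in for "every open ball around p".

    def in_ball(point, center, radius):
        # Is `point` inside the open ball of the given radius around `center`?
        return abs(point - center) < radius

    def closer(q, r, p, radii=(0.5, 1.0, 2.0, 4.0)):
        # q is closer to p than r is if some open ball around p contains q but not r.
        return any(in_ball(q, p, eps) and not in_ball(r, p, eps) for eps in radii)

    p, q, r = 0.0, 0.7, 3.0
    print(closer(q, r, p))          # True: the radius-1 ball around p contains q but not r
    print(abs(q - p) < abs(r - p))  # True: the same answer as just comparing distances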

In a topological space, we don’t necessarily have a distance metric to define open balls. But the neighborhoods of each point p define the closeness relation in the same way as the open balls in a metric space:

  • The neighborhoods N(p) of a point are equivalent to the open balls around p in a metric space.
  • The open sets of the topology (the members of T) are equivalent to the open sets of the metric space.
  • The complements of the members of T are equivalent to the closed sets of the metric space.

One of the most important ideas in topology is the notion of continuity. Some people would say that it’s the fundamental abstraction of topology, because the whole idea of the equivalence between two shapes is that there is a continuous transformation between them. And now that we know what a topological space is, we can define continuity.

Continuity isn’t a property of a topological space, but rather a property of a function between two topological spaces: if (T, X_T, N_T) and (U, X_U, N_U) are both topological spaces (where T and U are the point sets, and X_T and X_U are the topologies), then a function f: T \rightarrow U is continuous if and only if for every open set C \in X_U, the inverse image of f on C is an open set in X_T. (The inverse image of f on C is the set of points \{ x \in T : f(x) \in C \}.)

Once again, we’re stuck with a very precise definition that’s really hard to make any sense out of. I mean really, the inverse image of the function on an open set is an open set? What the heck does that mean?

What it’s really capturing is that there are no gaps in mapping from one space to the other. If there was a gap, it would create a boundary – there would be a hard edge in the mapping, and so the inverse image would show that as a closed set. Think of the metric-space idea of open sets. Imagine an open set with a cube cut out of the middle. It’s definitely not continuous. If you took a function on that open set, and its inverse image was the set with the cube cut out, then the function is not smoothly mapping from the open set to the other topological space. It’s only mapping part of the open set, leaving an ugly, hard-edged gap.

In topology, we say that two shapes are equivalent if and only if they can be continuously transformed into each other. In intuitive terms, that continuous transformation means that you can do the transformation without tearing holes or gluing edges. That gives us a clue about how to understand this definition. What the definition is really saying is pretty much that there’s no gluing or tearing: it says that if a set in the target is an open set, the set of everything that mapped to it is also an open set. That, in turn, means that if f(x) and f(y) are close together in U, then x and y must have been close together in T: so the structure of neighborhood relations is preserved by the function’s mapping.
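
Here’s a minimal sketch of the definition in Python, on two tiny made-up finite spaces where the open sets are listed explicitly. It just checks the “inverse image of every open set is open” condition; the spaces and functions are invented for illustration.

    # Continuity check on finite topological spaces: f is continuous iff the
    # inverse image of every open set in the codomain is open in the domain.

    def preimage(f, domain, C):
        # All points of the domain that f maps into the set C.
        return frozenset(x for x in domain if f[x] in C)

    def is_continuous(f, domain, open_dom, open_cod):
        return all(preimage(f, domain, C) in open_dom for C in open_cod)

    # Two tiny spaces, with their open sets given explicitly.
    T = frozenset({"a", "b"})
    open_T = {frozenset(), frozenset({"a"}), frozenset({"a", "b"})}

    U = frozenset({1, 2})
    open_U = {frozenset(), frozenset({1}), frozenset({1, 2})}

    f = {"a": 1, "b": 2}   # sends a to 1 and b to 2
    g = {"a": 2, "b": 1}   # swaps the roles

    print(is_continuous(f, T, open_T, open_U))  # True: preimages {}, {a}, {a, b} are all open
    print(is_continuous(g, T, open_T, open_U))  # False: the preimage of {1} is {b}, which isn't open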

One continuous map from a topological space isn’t enough for equivalence. It’s possible to create a continuous mapping from one topological space to another when they’re not the same – for example, you could map part of the topology T onto U. As long as for that part, it’s got the continuity properties, that’s fine. For two topologies to be equivalent, there must be a homeomorphism between the sets. That is, a function f such that:

  • f is one-to-one, total, and onto
  • Both f and f^{-1} are continuous.
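
Continuing the same made-up finite example, here’s a sketch of the homeomorphism test: check that f is a bijection, and that both f and its inverse pass the continuity check from above.

    # Homeomorphism check on finite spaces: f must be a bijection, and both
    # f and its inverse must be continuous.

    def preimage(f, domain, C):
        return frozenset(x for x in domain if f[x] in C)

    def is_continuous(f, domain, open_dom, open_cod):
        return all(preimage(f, domain, C) in open_dom for C in open_cod)

    def is_homeomorphism(f, T, open_T, U, open_U):
        # one-to-one, total, and onto: f must be a bijection from T to U
        if set(f) != set(T) or set(f.values()) != set(U) or len(set(f.values())) != len(f):
            return False
        f_inv = {v: k for k, v in f.items()}
        return (is_continuous(f, T, open_T, open_U) and
                is_continuous(f_inv, U, open_U, open_T))

    T, open_T = frozenset({"a", "b"}), {frozenset(), frozenset({"a"}), frozenset({"a", "b"})}
    U, open_U = frozenset({1, 2}), {frozenset(), frozenset({1}), frozenset({1, 2})}

    print(is_homeomorphism({"a": 1, "b": 2}, T, open_T, U, open_U))  # True
    print(is_homeomorphism({"a": 2, "b": 1}, T, open_T, U, open_U))  # False: neither direction is continuous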

As a quick aside: here’s one of the places where you can see the roots of category theory in algebraic topology. There’s a very natural category of topological spaces. The objects in the category are, obviously, the topological spaces. The arrows are continuous functions between the spaces. And an isomorphism in the category is exactly a homeomorphism between the objects.