*(Note: in the original version of this, I made an absolutely **huge** error. One of my faults in discussing topology is scrambling when to use forward functions, and when to use inverse functions. Continuity is dependent on properties defined in terms of the *inverse* of the function; I originally wrote it in the other direction. Thanks to commenter Dave Glasser for pointing out my error. I’ll try to be more careful in the future!)*
Since I’m back, it’s time to get back to topology!
I’m going to spend a bit more time talking about what continuity means; it’s a really important concept in topology, and I don’t think I did a particularly good job at explaining it in my first attempt.
Continuity is a concept of a certain kind of *smoothness*. In non-topological mathematics, we define continuity with a very straightforward algebraic idea of smoothness. A standard intuitive definition of a *continuous function* in algebra is “a function whose graph can be drawn without lifting your pencil”. The topological idea of continuity is very much the same kind of thing – but since a topological space is just a set with some additional structure, the definition of continuity has to be generalized to the structure of topologies.
The closest we can get to the algebraic intuition is to talk about *neighborhoods*. We’ll define them more precisely in a moment, but first we’ll just talk intuitively. Neighborhoods only exist in topological metric spaces, since they end up being defined in terms of distance; but they’ll give us the intuition that we can build on.
Let’s look at two topological spaces, **S** and **T**, and a function f : **S** → **T** (that is, a function from *points* in **S** to *points* in **T**). What does it mean for f to be continuous? What does *smoothness* mean in this context?
Suppose we’ve got a point, *s* ∈ **S**. Then f(*s*) ∈ **T**. If f is continuous, then for any point p in **T** *close to f(s)*, f-1(p) will be *close to* *s*. What does close to mean? Pick any distance – any *neighborhood* N(f(s)) in **T** – no matter how small; there will be a corresponding neighborhood of M(*s*) around s in **S** so that for all points p in N(f(s)), f-1 will be in M(*s*). If that’s a bit hard to follow, a diagram might help:
To be a bit more precise: let’s define a neighborhood. A neighborhood N(p) of a point p is a set of points that are *close to* p. We’ll leave the precise definition of *close to* open, but you can think of it as being within a real-number distance in a metric space. (*close to* for the sake of continuity is definable for any topological space, but it can be a strange concept of close to.)
The function f is continuous if and only if for all points f(s) ∈ **T**, for all neighborhoods N(f(s)) of f(s), there is some neighborhood M(s) in **S** so that f(M(s)) ⊆ N(f(s)). Note that this is for *all* neighborhoods of *all* points in **T** mapped to by f – so no matter how small you shrink the neighborhood around f(s), the property holds – and it implies that as the neighborhood in **T** shrinks, so does the corresponding neighborhood in **S**, until you reach the single points f(s) and s.
Why does this imply *smoothness*? It means that you can’t find a set of points in the range of f in **T** that are close together, but that weren’t close together in **S** before being mapped by f. f won’t put things together that weren’t together originally. And it won’t pull things apart that weren’t
close together originally. *(This paragraph was corrected to be more clear based on comments from Daniel Martin.)*
For a neat exercise: go back to the category theory articles, where we defined *initial* and *final* objects in a category. There are corresponding notions of *initial* and *final* topologies in a topological space for a set. The definitions are basically the same as in category theory – the arrows from the initial object are the *continuous functions* from the topological space, etc.
Category Archives: Good Math
Pathetic Statistics from HIV/AIDS Denialists
While I was on vacation, I got some email from Chris Noble pointing me towards a discussion with some thoroughly innumerate HIV-AIDS denialists. It’s really quite shocking what passes for a reasonable argument among true believers.
The initial stupid statement is from one of Duesberg’s papers, [AIDS Acquired by Drug Consumption and Other Noncontagious Risk Factors][duesberg], and it’s quite a whopper. During a discussion of the infection rates shown by HIV tests of military recruits, he says:
>(a) “AIDS tests” from applicants to the U.S. Army and the U.S. Job
>Corps indicate that between 0.03% (Burke et al.,1990) and 0.3% (St
>Louis et al.,1991) of the 17- to 19-year-old applicants are HIV-infected
>but healthy. Since there are about 90 million Americans under the age
>of 20, there must be between 27,000 and 270,000(0.03%-0.3% of 90
>million) HIV carriers. In Central Africa there are even more, since 1-2%
>of healthy children are HIV-positive (Quinn et al.,1986).
>
>Most, if not all, of these adolescents must have acquired HIV from
>perinatal infection for the following reasons: sexual transmission of
>HIV depends on an average of 1000 sexual contacts, and only 1in 250
>Americans carries HIV (Table 1). Thus, all positive teenagers would
>have had to achieve an absurd 1000 contacts with a positive partner, or
>an even more absurd 250,000 sexual contacts with random Americans
>to acquire HIV by sexual transmission. It follows that probably all of
>the healthy adolescent HIV carriers were perinatally infected, as for
>example the 22-year-old Kimberly Bergalis (Section 3.5.16).
Now, I would think that *anyone* who reads an allegedly scientific paper like this would be capable of seeing the spectacular stupidity in this quotation. But for the sake of pedantry, I’ll explain it using small words.
If the odds of, say, winning the lottery are 1 in 1 million, that does *not* mean that if I won the lottery, that means I must have played it one million times. Nor does it mean that the average lottery winner played the lottery one million times. It means that out of every one million times *anyone* plays the lottery, *one* person will be expected to win.
To jump that back to Duesberg, what he’s saying is: if the transmission rate of HIV/AIDS is 1 in 1000, then the average infected person would need to have had sex with an infected partner 1000 times.
Nope, that’s not how math works. Not even close.
Suppose we have 1000 people who are infected with HIV, and who are having unprotected sex. *If* we follow Duesberg’s lead, and assume that the transmission rate is a constant 0.1%, then what we would expect is that if each of those 1000 people had sex with one partner one time, we would see one new infected individual – and that individual would have had unprotected sex with the infected partner only one time.
This isn’t rocket science folks. This is damned simple, high-school level statistics.
Where things get even sadder is looking at the discussion that followed when Chris posted something similar to the above explanation. Some of the ridiculous contortions that people go through in order to avoid admitting that the great Peter Duesberg said something stupid is just astounding. For example, consider [this][truthseeker] from a poster calling himself “Truthseeker”:
>If Duesberg had said that, he would indeed be foolish. The foolishness,
>however, is yours, since you misintepret his reasoning. He said, as you note
>
>>Most, if not all, of these adolescents must have acquired HIV from perinatal
>>infection for the following reasons: sexual transmission of HIV depends on an
>>average of 1000 sexual contacts, and only 1 in 250 Americans carries HIV
>>(Table 1). Thus, all positive teenagers would have had to achieve an absurd
>>1000 contacts with a positive partner, or an even more absurd 250,000 sexual
>>contacts with random Americans to acquire HIV by sexual transmission.
>
>This states the average transmission requires 1000 contacts, not every
>transmission. With such a low transmission rate and with so few Americans
>positive – you have to engage with 250 partners on average to get an average
>certainty of 100% for transmission, if the transmission rate was 1. Since it is
>1 in 1000, the number you have to get through on average is 250,000. Some might
>do it immediately, some might fail entirely even at 250,000. But the average
>indicates that all positive teenagers would have had to get through on average
>250,000 partner-bouts.
Truthseeker is making exactly the same mistake as Duesberg. The difference is that he’s just had it explained to him using a simple metaphor, and he’s trying to spin a way around the fact that *Duesberg screwed up*.
But it gets even worse. A poster named Claus responded with [this][claus] indignant response to Chris’s use of a metaphor about plane crashes:
>CN,
>
>You would fare so much better if you could just stay with the science
>points and refrain from your ad Duesbergs for more than 2 sentences at
>a time. You know there’s a proverb where I come from that says ‘thief thinks
>every man steals’. I’ve never seen anybody persisting the way you do in
>calling other people ‘liars’, ‘dishonest’ and the likes in spite of the
>fact that the only one shown to be repeatedly and wilfully dishonest
>here is you.
>
>Unlike yourself Duesberg doesn’t deal with matters on a case-by-case only basis
>in order to illustrate his statistical points. precisely as TS says, this shows
>that you’re the one who’s not doing the statistics, only the misleading.
>
>In statistics, for an illustration to have any meaning, one must assume that
>it’s representative of an in the context significant statistical average no?
>Or perphaps in CN’s estimed opinion statistics is all about that once in a
>while when somebody does win in the lottery?
Gotta interject here… Yeah, statistics *is* about that once in a while when someone wins the lottery, or when someone catches HIV, or when someone dies in a plane crash. It’s about measuring things by looking at aggregate numbers for a population. *Any* unlikely event follows the same pattern, whether it’s catching HIV, winning the lottery, or dying in a plane crash, and that’s one of the things that statistics is specifically designed to talk about: that fundamental probabilistic pattern.
>But never mind we’ll let CN have the point; the case in question was that odd
>one out, and Duesberg was guilty of the gambler’s fallacy. ok? You scored one
>on Duesberg, happy now? Good. So here’s the real statistical point abstracted,
>if you will, from the whole that’s made up by all single cases, then applied to
>the single case in question:
>
>>Thus, all positive teenagers would have had to achieve an absurd 1000 contacts
>>with a positive partner, or an even more absurd 250,000 sexual contacts with
>>random Americans to acquire HIV by sexual transmission.
>
>This is the statistical truth, which is what everybody but CN is interested in.
Nope, this is *not* statistical truth. This is an elementary statistical error which even a moron should be able to recognize.
>Reminder: Whenever somebody shows a pattern of pedantically reverting to single
>cases and/or persons, insisting on interpreting them out of all context, it’s
>because they want to divert your attention from real issues and blind you to
>the overall picture.
Reminder: whenever someone shows a pattern of pedantically reverting to a single statistic, insisting on interpreting it in an entirely invalid context, it’s because they want to divert your attention from real issues and blind you to the overall picture.
The 250,000 average sexual contacts is a classic big-numbers thing: it’s so valuable to be able to come up with an absurd number that people will immediately reject, and assign it to your opponents argument. They *can’t* let this go, no matter how stupid it is, no matter how obviously wrong. Because it’s so important to them to be able to say “According to *their own statistics*, the HIV believers are saying that the average teenage army recruit has had sex 250,000 times!”. As long as they can keep up the *pretense* of a debate around the validity of that statistic, they can keep on using it. So no matter how stupid, they’ll keep defending the line.
[duesberg]: www.duesberg.com/papers/1992%20HIVAIDS.pdf
[truthseeker]: http://www.newaidsreview.org/posts/1155530746.shtml#1487
[claus]: http://www.newaidsreview.org/posts/1155530746.shtml#1496
Programs are Proofs: Models and Types in Lambda Calculus
Lambda calculus started off with the simple, untyped lambda calculus that we’ve been talking about so far. But one of the great open questions about lambda calculus was: was it sound? Did it have a valid model?
Church found that it was easy to produce some strange and non-sensical expressions using the simple lambda calculus. In order to try to work around those problems, and end up with a consistent system, Church introduced the concept of *types*, producing the *simply typed lambda calculus*. Once types hit the scene, things really went wild; the type systems for lambda calculi have never stopped developing: people are still finding new things to do by extending the LC type system today! Most lambda calculus based programming languages are based on the Hindley-Milner lambda calculus, which is a simplification of one of the standard sophisticated typed lambda calculi called *SystemF*. There’s even a [Lambda Cube][cube], though it’s not related to the Time Cube. Once people really started to understand types, they realized that the *untyped* lambda calculus was really just a pathologically simple instance of the simply typed lambda calculus: a typed LC with only one base type.
The semantics of lambda calculus are easiest to talk about in a typed version. For now, I’ll talk about the simplest typed LC, known as the *simply typed lambda calculus*. One of the really amazing things about this, which I’ll show, is that a simply typed lambda calculus is completely semantically equivalent to an intuitionistic propositional logic: each type in the program is a proposition in the logic; each β reduction corresponds to an inference step; and each complete function corresponds to a proof! Look below the fold for how.
Why oh why Y?
So in the last few posts, I’ve been building up the bits and pieces that turn lambda calculus into a useful system. We’ve got numbers, booleans, and choice operators. The only thing we’re lacking is some kind of repetition or iteration.
In lambda calculus, all iteration is done by recursion. In fact, recursion is a pretty natural way of expressing iteration. It takes a bit of getting used to, but anyone who’s programmed in a language like Scheme, ML, or Haskell for a while gets very used to idea, and feels frustrated coming back to a language like Java, where you need to write loops.
It can be a bit difficult if you’re not used to thinking recursively. I wrote [an explanation of recursion][recursion] which you can go read if you’re not used to what recursion is or how it works.
be found here). But since functions in lambda calculus don’t have names, that means that we resort to something tricky. It’s called the Y combinator, aka the lambda fixed point operator.
Let’s start by looking at a simple recursive function outside of the lambda calculus. The factorial function, n!, is the standard example:
factorial(n) = 1 if n = 0,
or factorial(n) = n*factorial(n-1) if n > 0
If we want to start trying to write that in lambda calculus, we’d need a couple of tools… We need a test for equality to zero, and we need a way of multiplying numbers; and we need a way of subtracting one.
For testing equality to zero, we’ll use a function named *IsZero*, which takes three parameters: a number, and two values. If the number is 0, it returns the first value; if it’s not 0, then it returns the second value.
For multiplication – multiplication is an iterative algorithm, so we can’t write multiplication until we work out recursion. But we’ll just handwave that for now, and have a function *Mult x y*.
And finally, for subtracting one, we’ll use *Pred x* for the predecessor of x – that is, x – 1.
So – a first stab at factorial, written with the recursive call left with a blank in it, would be:
*λ n . IsZero n 1 (Mult n (**something** (Pred n)))*
Now, the question is, what kind of “something” can we plug in to there? What we’re really like to do is plug in a copy of the function itself:
*Fact ≡ λ n . IsZero n 1 (Mult n (Fact (Pred n)))*
How can we do that? Well, the usual way of plugging something in to a lambda calculus function is through a parameter:
*Fact ≡ (λ f n . IsZero n 1 (Mult n (f (Pred n)))) Fact*
Of course, we can’t plug in a copy of the function as its own parameter that way: the name *Fact* doesn’t exist in the expression in which we’re trying to use it. You can’t use an undefined name – and in lambda calculus, the *only* way to bind a name is by passing it as a parameter to a λ expression. So what can we do?
The answer is to use something called a *combinator*. A combinator is a special kind of *higher order function* which can be defined without reference to anything but function applications. (A higher order function is a function which takes functions as parameters and returns functions as results). The Y combinator is the special, almost magical function that makes recursion possible. Here’s what it looks like:
Y ≡ λ y . (λ x . y (x x)) (λ x . y (x x))
If you look at it, the reason for calling it Y is because it is *shaped* like a Y. To show you that more clearly, sometimes we write lambda calculus using trees. Here’s the tree for the Y combinator:
Why is the Y combinator an answer to our problem in defining the factorial function? The Y combinator is something called a *fixed point* combinator. What makes it special is the fact that for any function *f*, *Y f* evaluates to *f Y f*; which evaluates to *f (f Y f)*; which evaluates to *f (f (f Y f))*. See why it’s called Y?
Let’s try walking through “*Y f*”:
1. Expand Y: “*(λ y . (λ x . y (x x)) (λ x . y (x x))) f*”
2. β: “*(λ x . f (x x)) (λ x . f (x x))*
3. β again: “*f (λ x . f (x x)) (λ x . f (x x))*”
4. Since “*Y f = (λ x . f (x x)) (λ x . f (x x))*”, what we just got in step three is “f Y f”.
See, there’s the magic of “Y”. No matter what you do, you can’t make it consume itself. Evaluating “*Y f*” will produce another copy of *f*, and leave the “Y f” part untouched.
So how do we use this crazy thing?
Remember our last attempt at defining the factorial function? Let’s look at it again:
*Fact ≡ (λ f n . IsZero n 1 (Mult n (f (Pred n)))) Fact*
Let’s rewrite it just a tiny bit to make it easier to talk about:
*Metafact ≡ (λ f n . IsZero n 1 (Mult n (f (Pred n))))*
*Fact ≡ Metafact Fact*
Now – the trick is, “Fact” is not an identifier defined inside of “Fact”. How do we let “Fact” reference “Fact”? Well, we did a lambda abstraction to let us pass the “Fact” function as a parameter; so what we needed to do is to find a way to write “Fact” that lets us pass it to itself as a parameter.
What does “Y f” do? It expands into a call to “f” with “Y f” as its first parameter. And “Y f” will expand into “f Y f” – that is, a call to “f” with “Y f” as its parameter. In other words, Y f turns “f” into a recursive function with *itself* as its first parameter.
So the factorial function is:
*Fact ≡ Y Metafact*
*(Y metafact)* is the parameter value of “f” in the metafact lambda; when we do β on the function, if n is zero, then it just returns 1. If it’s not zero, then we get the call to *f (Pred n)*. *f* betas to *Y metafact*. Which does that funky magic copying thing, giving us *metafact (Y metafact) (Pred n)*.
Voila, recursion. s
I learned about the Y combinator back in my undergrad days, which would place it around 1989 – and I still find it rather mystifying. I do understand it now, but I can’t imagine how on earth anyone ever figured it out!
If you’re interested in this, then I highly recommend getting a copy of the book [The Little Schemer][schemer]. It’s a wonderful little book – set up like a childrens’ book, where the front of each page is a question; and the back of each page is the answer. It’s written in a delightfully playful style, it’s very fun and engaging, and it will not only teach you to program in Scheme.
As an important side-note there are actually a couple of different versions of the Y combinator. There are different ways of evaluating lambda calculus: given an expression like *(λ x y . x * y) 3 ((λ z. z * z) 4)*”
we can do it in two different orders: we can first do the beta on “*(λ x y . x * y)*”,which would give us: “*3 * ((λ z . z * z) 4)*”.
Or, we could beta “*((λ z . z * z) 4)*” first: “*(λ x y . x * y) 3 (4 * 4)*”. Nn this case, the two orders end up with the same result; but that’s not always the case. Sometimes the order of evaluation matters – and the way that the Y combinator works is one of those times. One order will result in expanding the Y infinitely; the other will result in a clean recursive function.
The first order is what we call *lazy evaluation*: don’t evaluate the parameters to a function until they’re needed. (This is also pretty much the same thing as what we sometime call *by name* parameter passing.) The second is called *eager evaluation* : always evaluate parameters *before* the functions that they’re passed to. (In real programming languages, Lisp, Scheme, and ML are lambda-calculus based languages that use eager evaluation; Haskell and Miranda are lambda calculus based languages that use lazy evaluation.) The Y combinator I described above is the Y for *lazy* evaluation. If we used eager evaluation, then Y combinator above wouldn’t work – in fact, it would copy Ys forever. There is another version of Y which works for eager evaluation – in fact, it’s the one described in “The Little Schemer”, and they explain it so much better than I do that I’ll just recommend again that you head for whatever bookstore you prefer, and buy yourself a copy.
[recursion]: http://goodmath.blogspot.com/2006/03/clarifying-recursion.html
[schemer]: http://www.amazon.com/gp/redirect.html?link_code=ur2&tag=goodmathbadma-20&camp=1789&creative=9325&location=/gp/search%3F%26index=books%26keywords=little%20schemer%26_encoding=UTF8
The Genius of Alonzo Church (rerun)
I’m on vacation this week, so I’m posting reruns of some of the better articles from when Goodmath/Badmath was on Blogger. Todays is a combination of two short posts on numbers and control booleans in λ calculus.
So, now, time to move on to doing interesting stuff with lambda calculus. To
start with, for convenience, I’ll introduce a bit of syntactic sugar to let us
name functions. This will make things easier to read as we get to complicated
stuff.
To introduce a *global* function (that is a function that we’ll use throughout our lambda calculus introduction without including its declaration in every expression), we’ll use a definition like the following:
*square ≡ λ x . x × x*
This declares a function named “square”, whose definition is “*λ x . x×x*”. If we had an expression “square 4”, the definition above means that it would effectively be treated as if the expression were: “*(λ square . square 4)(λ x . x×x)*”.
Numbers in Lambda Calculus
——————————
In some of the examples, I used numbers and arithmetic operations. But numbers don’t really exist in lambda calculus; all we really have are functions! So we need to invent some way of creating numbers using functions. Fortunately, Alonzo Church, the genius who invented the lambda calculus worked out how to do that. His version of numbers-as-functions are called Church Numerals.
In Church numerals, all numbers are functions with two parameters:
* Zero ≡ *λ s z . z*
* One ≡ *λ s z . s z*
* Two ≡ *λ s z . s (s z)*
* Three ≡ *λ s z . s (s (s z))*
* Any natural number “n”, is represented by a Church numeral which is a function which applies its first parameter to its second parameter “n” times.
A good way of understanding this is to think of “z” as being a a name for a zero-value, and “s” as a name for a successor function. So zero is a function which just returns the “0” value; one is a function which applies the successor function once to zero; two is a function which applies successor to the successor of zero, etc. It’s just the Peano arithmetic definition of numbers transformed into lambda calculus.
But the really cool thing is what comes next. If we want to do addition, x + y, we need to write a function with four parameters; the two numbers to add; and the “s” and “z” values we want in the resulting number:
add ≡ *λ s z x y . x s (y s z)*
Let’s curry that, to separate the two things that are going on. First, it’s taking two parameters which are the two values we need to add; second, it needs to normalize things so that the two values being added end up sharing the same binding of the zero and successor values.
add_curry ≡ λ x y. (λ s z . (x s (y s z)))
Look at that for a moment; what that says is, to add x and y: create the church numeral “y” using the parameters “s” and “z”. Then **apply x** to that new church numeral y. That is: a number is a function which adds itself to another number.
Let’s look a tad closer, and run through the evaluation of 2 + 3:
add_curry (λ s z . s (s z)) (λ s z . s (s (s z)))
To make things easier, let’s alpha 2 and 3, so that “2” uses “s2” and “z2”, and 3 uses “s3” and “z3”;
add_curry (λ s2 z2 . s2 (s2 z2)) (λ s3 z3 . s3 (s3 (s3 z3)))
Now, let’s do replace “add_curry” with its definition:
(λ x y .(λ s z. (x s y s z))) (λ s2 z2 . s2 (s2 z2)) (λ s3 z3 . s3 (s3 (s3 z3)))
Now, let’s do a beta on add:
λ s z . (λ s2 z2 . s2 (s2 z2)) s (λ s3 z3 . s3 (s3 (s3 z3)) s z)
And now let’s beta the church numeral for three. This basically just “normalizes” three: it replaces the successor and zero function in the definition of three with the successor and zero functions from the parameters to add.
λ s z . (λ s2 z2 . s2 (s2 z2)) s (s (s (s z)))
Now.. Here comes the really neat part. Beta again, this time on the lambda for two. Look at what we’re going to be doing here: two is a function which takes two parameters: a successor function, and zero function. To add two and three, we’re using the successor function from add function; and we’re using the result of evaluating three *as the value of the zero!* for two:
λ s z . s (s (s (s (s z))))
And we have our result: the church numeral for five!
Choice in Lambda Calculus
—————————
Now that we have numbers in our Lambda calculus, there are only two things missing before we can express arbitrary computations: a way of expressing choice, and a way of expressing repetition. So now I’ll talk about booleans and choice; and then next post I’ll explain repetition and recursion.
We’d like to be able to write choices as if/then/else expressions, like we have in most programming languages. Following the basic pattern of the church numerals, where a number is expressed as a function that adds itself to another number, we’ll express true and false values as functions that perform an if-then-else operation on their parameters. These are sometimes called *Church booleans*. (Of course, also invented by Alonzo Church.)
* TRUE ≡ λ t f . t
* FALSE ≡ λ t f . f
So, now we can write an “if” function, whose first parameter is a condition expression, second parameter is the expression to evaluate if the condition was true, and third parameter is the expression to evaluate if the condition is false.
* IfThenElse ≡ *λ cond t f . cond t f*
For the boolean values to be useful, we also need to be able to do the usual logical operations:
* BoolAnd ≡ *λ x y .x y FALSE*
* BoolOr ≡ *λ x y. x TRUE y*
* BoolNot ≡ *λ x . x FALSE TRUE*
Now, let’s just walk through those a bit. Let’s first take a look at BoolAnd. We’ll try evaluating “*BoolAnd TRUE FALSE*”:
* Expand the TRUE and FALSE definitions: “*BoolAnd (λ t f . t) (λ t f . f)*”
* Alpha the true and false: “*BoolAnd (λ tt tf . tt) (λ ft ff . ff)*”
* Now, expand BoolAnd: *(λ t f. t f FALSE) (λ tt tf . tt) (λ ft ff . ff)*
* And beta: *(λ tt tf.tt) (λ ft ff. ff) FALSE*
* Beta again: *(λ xf yf . yf)*
And we have the result: *BoolAnd TRUE FALSE = FALSE*. Now let’s look at “*BoolAnd FALSE TRUE*”:
* *BoolAnd (λ t f . f) (λ t f .t)*
* Alpha: *BoolAnd (λ ft ff . ff) (λ tt tf . tt)*
* Expand BoolAnd: *(λ x y .x y FALSE) (lambda ft ff . ff) (lambda tt tf . tt)*
* Beta: *(λ ft ff . ff) (lambda tt tf . tt) FALSE
* Beta again: FALSE
So *BoolAnd FALSE TRUE = FALSE*. Finally, *BoolAnd TRUE TRUE*:
* Expand the two trues: *BoolAnd (λ t f . t) (λ t f . t)*
* Alpha: *BoolAnd (λ xt xf . xt) (λ yt yf . yt)*
* Expand BoolAnd: *(λ x y . x y FALSE) (λ xt xf . xt) (λ yt yf . yt)*
* Beta: *(λ xt xf . xt) (λ yt yf . yt) FALSE*
* Beta: (λ yt yf . yt)
So *BoolAnd TRUE TRUE = TRUE*.
The other booleans work in much the same way.
So by the genius of Alonzo church, we have *almost* everything we need in order to write programs in λ calculus.
A Lambda Calculus rerun
I’m on vacation this week, so I’m recycling some posts that I thought were particularly interesting to give you something to read.
In computer science, especially in the field of programming languages, we tend to use one particular calculus a lot: the Lambda calculus. Lambda calculus is also extensively used by logicians studying the nature of computation and the structure of discrete mathematics. Lambda calculus is great for a lot of reasons, among them:
- It’s very simple.
- It’s Turing complete.
- It’s easy to read and write.
- It’s semantics are strong enough that we can do reasoning from it.
- It’s got a good solid model.
- It’s easy to create variants to explore the properties of various alternative ways of structuring computations or semantics.
The ease of reading and writing lambda calculus is a big deal. It’s led to the development of a lot of extremely good programming languages based, to one degree or another, on the lambda calculus: Lisp, ML, and Haskell are very strongly lambda calculus based.
The lambda calculus is based on the concept of *functions*>. In the pure lambda calculus, *everything* is a function; there are no values *at all* except for functions. But we can build up anything we need using functions. Remember back in the early days of this blog, I talked a bit about how to build mathematics? We can build the entire structure of mathematics from nothing but lambda calculus.
So, enough lead-in. Let’s dive in a look at LC. Remember that for a calculus, you need to define two things: the syntax, which describes how valid expressions can be written in the calculus; and a set of rules that allow you to symbolically manipulate the expressions.
Lambda Calculus Syntax
The lambda calculus has exactly three kinds of expressions:
- Function definition: a function in lambda calculus is an expression, written: “
λ param . body
” which defines a function with one parameter. - Identifier reference: an identifier reference is a name which matches the name of a parameter defined in a function expression enclosing the reference.
- Function application: applying a function is written by putting the function value in front of its parameter, as in “
x y
” to apply the functionx
to the valuey
.
Currying
There’s a trick that we play in lambda calculus: if you look at the definition above, you’ll notice that a function (lambda expression) only takes one parameter. That seems like a very big constraint – how can you even implement addition with only one parameter?
It turns out to be no problem, because of the fact that *functions are values*. So instead of writing a two parameter function, you can write a one parameter function that returns a one parameter function – in the end, it’s effectively the same thing. It’s called currying, after the great logician Haskell Curry.
For example, suppose we wanted to write a function to add x and y. We’d like to write something like: λ x y . x + y
. The way we do that with one-parameter functions is: we first write a function with one parameter, which returns another function with one parameter.
Adding x plus y becomes writing a one-parameter function with parameter x, which returns another one parameter function which adds x to its parameter:
λ x . (λ y . x + y)
Now that we know that adding multiple parameter functions doesn’t *really* add anything but a bit of simplified syntax, we’ll go ahead and use them when it’s convenient.
Free vs Bound Identifiers
One important syntactic issue that I haven’t mentioned yet is *closure* or *complete binding*. For a lambda calculus expression to be evaluated, it cannot reference any identifiers that are not *bound*. An identifier is bound if it a parameter in an enclosing lambda expression; if an identifier is *not* bound in any enclosing context, then it is called a *free* variable.
- *λ x . plus x y*: in this expression, “y” and “plus” are free, because they’re not the parameter of any enclosing lambda expression; x is bound because it’s a parameter of the function definition enclosing the expression “plus x y” where it’s referenced.
- *λ x y.y x*: in this expression both x and y are bound, because they are parameters of the function definition.
- *λ y . (λ x . plus x y*: In the inner lambda, “*λ x . plus x y*”, y and plus are free and x is bound. In the full expression, both x and y are bound: x is bound by the inner lambda, and y is bound by the other lambda. “plus” is still free.
We’ll often use “free(x)” to mean the set of identifiers that are free in the expression “x”.
A lambda calculus expression is completely valid only when all of its variables are bound. But when we look at smaller subexpressions of a complex expression, taken out of context, they can have free variables – and making sure that the variables that are free in subexpressions are treated right is very important.
Lambda Calculus Evaluation Rules
There are only two real rules for evaluating expressions in lambda calculus; they’re called α and beta. α is also called “conversion”, and beta is also called “reduction”.
α Conversion
α is a renaming operation; basically it says that the names of variables are unimportant: given any expression in lambda calculus, we can change the name of the parameter to a function as long as we change all free references to it inside the body.
So – for instance, if we had an expression like:
λ x . if (= x 0) then 1 else x^2
We can do α to replace X with Y (written “α[x/y]” and get):
λ y . if (= y 0) then 1 else y^2
Doing α does *not* change the meaning of the expression in any way. But as we’ll see later, it’s important because it gives us a way of doing things like recursion.
β Reduction
β reduction is where things get interesting: this single rule is all that’s needed to make the lambda calculus capable of performing *any* computation that can be done by a machine.
Beta basically says that if you have a function application, you can replace it with a copy of the body of the function with references to the parameter identifiers replaced by references to the parameter value in the application. That sounds confusing, but it’s actually pretty easy when you see it in action.
Suppose we have the application expression: “*(λ x . x + 1) 3*“. What beta says is that we can replace the application by taking the body of the function (which is “*x + 1*”); and replacing references to the parameter “*x*” by the value “*3*”; so the result of the beta reduction is “*3 + 1*”.
A slightly more complicated example is the expression:
λ y . (λ x . x + y)) q
It’s an interesting expression, because it’s a lambda expression that when applied, results in another lambda expression: that is, it’s a function that creates functions. When we do beta reduction in this, we’re replacing all references to the parameter “y” with the identifier “q”; so, the result is “*λ x. x+q*”.
One more example, just for the sake of being annoying:
“(λ x y. x y) (λ z . z × z) 3“.
That’s a function that takes two parameters, and applies the first one to the second one. When we evaluate that, we replace the parameter “*x*” in the body of the first function with “*λ z . z × z*”; and we replace the parameter “*y*” with “3”, getting: “*(λ z . z × z) 3*”. And we can perform beta on that, getting “*3 × 3*”.
Written formally, beta says:
λ x . B e = B[x := e] if free(e) ⊂ free(B[x := e]
That condition on the end, “if free(e) subset free(B[x := e]” is why we need α: we can only do beta reduction *if* doing it doesn’t create any collisions between bound identifiers and free identifiers: if the identifier “z” is free in “e”, then we need to be sure that the beta-reduction doesn’t make “z” become bound. If there is a name collision between a variable that is bound in “B” and a variable that is free in “e”, then we need to use α to change the identifier names so that they’re different.
As usual, an example will make that clearer: Suppose we have a expression defining a function, “*λ z . (λ x . x+z)*”. Now, suppose we want to apply it:
(λ z . (λ x . x + z)) (x + 2)
In the parameter “*(x + 2)*”, x is free. Now, suppose we break the rule and go ahead and do beta. We’d get “*λ x . x + x + 2*”.
The variable that was *free* in “x + 2” is now bound. Now suppose we apply that function: “*(λ x . x + x + 2) 3*”. By beta, we’d get “3 + 3 + 2”, or 8.
What if we did α the way we were supposed to?
- By α[x/y], we would get “*λ z . (λ y . y + z) (x+2)*”.
- by α[x/y]:
lambda z . (lambda y . y+z)) (x + 2)
. Then by β, we’d get “*λ y . y + x + 2*”. If we apply this function the way we did above, then by β, we’d get “*3+x+2*”.
“*3+x+2*” and “*3+3+2*” are very different results!
And that’s pretty much it. There’s another *optional* rule you can add called η-conversion. η is a rule that adds *extensionality*, which provides a way of expressing equality between functions.
η says that in any lambda expression, I can replace the value “f” with the value “g” if/f for all possible parameter values x, *f x = g x*.
What I’ve described here is Turing complete – a full effective computation system. To make it useful, and see how this can be used to do real stuff, we need to define a bunch of basic functions that allow us to do math, condition tests, recursion, etc. I’ll talk about those in my next post.
We also haven’t defined a model for lambda-calculus yet. (I discussed models here and here.) That’s actually quite an important thing! LC was played with by logicians for several years before they were able to come up with a complete model for it, and it was a matter of great concern that although LC looked correct, the early attempts to define a model for it were failures. After all, remember that if there isn’t a valid model, that means that the results of the system are meaningless!
Ultimate Spaghetti Coding with Linguine
Todays dose of programming pathology is a monstrosity called *Linguine*. Linguine is a language that (obviously) carries spaghetti code to an extreme. You can see the [language specification][linguine-spec], or download an [implementation with examples][linguine-impl].
It’s really a remarkably simple language. There are only 10 commands, including input and output:
1. “`x=y`”: assign the value of y to the variable x.
2. “`x+y`”: assign the value of x+y to the variable x.
3. “`x-y`”: assign the value of x-y to the variable x.
4. “`x|y`”: assign the value of x NAND y to the variable x.
5. “`x>y`”: assign the value of shifting x by y bits to x. If y is positive, then it shifts y bits right; if y is negative, then it shifts -y bits left.
6. “`x?`”: Input one character and store it in the variable x.
7. “`x$`”: Output the content of register x as a character.
8. “`x#`”: Output the content of register x as an integer.
9. “`x<y:loc`”: if x < y then jump to the program line labeled “loc”.
10. “`x~y:loc`”: if x = y then jump to the program line labeled “loc”.
Command operands on either integer literals, or on the contents of an unlimited set of numbered registers each of which contains an integer. Valid operands are either an integer (which identifies a register number if it appears on the left, or a literal value on the right); or an operand preceeded by “*”, which performs an indirection on the value it precedes. So for example, “*3” is the contents of register three. “**3” is the contents of the register whose number is stored in register 3, etc. Any of the operands may be any of these forms, so for example, “1~*3:**4” is a valid command.
A linguine statement consists of three parts: a line label, a list of commands, and a branch target, written “label[commands]target”. When executed, the line evaluates each command in the list in sequence. If it reaches the end of the list without having branched, then it will branch to the statement labeled by the branch target.
The lines of the program may appear in any order. The first statement executed will be the statement whose label is smallest. Branching to 0 ends program execution.
Comments start with a “‘”, and extend to the end of the line.
All quite simple, right?
As usual, we’ll start with hello world.
1[0=72,0$,0+29,0$,0+7,0$,0$,0+3,0$,1=32,1$,0-24,0$,0+24,0$,0+3,0$,0-6,0$,0-8,0$,1+1,1$,1-23,1$]0
That’s perfectly clear, isn’t it? Actually, it’s basically the same as the hello world program we saw in brainfuck. We’ll piece it apart.
* 0=72,0$
: 72 is the ascii code for “H”. So store that in register 0. Then output register 0.
* 0+29,0$
: 101 is the ascii code for “e”. So add 29 to the contents of register 0, which will make the new value of register 0 be 101. Then output that.
* etc.
How about something a tad more interesting? Let’s look at generating the
Fibonacci sequence:
1[0=32,2=1,1#,0$,2#]2
2[1+*2,3=*1,1=*2,2=*3,0$,2#]2
* This starts with line 1:
* “0=32
: Store the ascii code for a whitespace in register 0.
* “`2=1`”: store the value 0 in register 2.
* “`1#`”: output the value of register 1. (Since registers contain 0 when the program starts, that prints a zero.)
* “`0$`”: output the contents of register 0 as a character, which prints a whitespace.
* “`2#`”: print the contents of register 2 (1).
* Finally, branch to statement 2.
* Statement 2: registers 1 and 2 contain the previous two fibonacci numbers. So this adds them together to get the next one, and then rotates the values of the registers so that the newest fibonacci number is in register 2, and the one before that is in register one.
* “`1+*2`”: add the value of register 2 to register 1.
* “`3=*1`”: store the value of register 1 in register 3.
* “`1=*2`”: store the value of register 2 in register 1.
* “`2=*3`”: store the value of register 3 in register 2.
* “`0$,2#`”: output a space, and then output the value of register 2.
* Finally, the branch instruction repeats statement 2, to keep generating fibonacci numbers.
Here’s a nice little multiply program, which demonstrates how to write subroutines. Note that recursion doesn’t work here, because we need to use fixed register numbers. (You can set up recursion using additional levels of indirection, it’s just messy as heck.)
‘Multiply: -2 *= *-3
‘Programmed by Jeffry Johnston, 2005
‘-1=return jump, -4=temp, -5=temp
100[-4=*-3,-5=*-2,-2=0,-4<0:101]102
101[-4~0:*-1,-2-*-5,-4+1]101 '-3 is negative
102[-4~0:*-1,-2+*-5,-4-1]102 '-3 is positive
This takes the address of the instruction to return to in register -1; it takes the left operand of multiple (which is also the target of the result) in register -2, and the right operand in register -3. Registers -4 and -5 are used for temporary storage.
* Line 100: set up, and determine whether the right operand is positive or negative.
* "`-4=*-3,-5=*-2`": Copy the contents of register -3 to register -4, and the contents of register -2 to register -5.
* "`-2=0`": Store 0 in register -2.
* "`-4<0:101`": If the contents of register -4 is less than 0, then branch to statement 101.
* If you reach the end of this line, then the contents of register -4 is non-negative, and you branch to statement 102.
* Line 101: do multiplication by repeated subtraction
* "`-4~0:*-1`": If the contents of register -4 is zero, then we're done, so return by branching to the line labeled by the contents of register -1.
* "`-2-*-5`": Subtract the contents of register -5 from the contents of register -2 (the result register).
* "`-4-1`": decrement (or increment, depending on your point of view) the contents of register four, to show that you've done one subtraction.
* Repeat the line.
* Line 102: basically exactly the same as 101, except that the does a repeated add rather than a subtract.
An example of calling this would be:
1[-1=2,-2=36,-3=13]200
2[-2#]0
This stores 36 and 13 as the parameters to the multiple subroutine, and then invokes it by branching to it at line 200. The "return address" is set to statement 2 by storing 2 in register -1. When control returns to statement 2, the result of the multiplication is printed out.
One last example, which you can puzzle out on your own. A brainfuck interpreter in 21 lines of Linguine:
'BF interpreter in Linguine
'Programmed by Jeffry Johnston, 2005
1[*-2?,*-2~33:2,-2+1]1
2[*-2=-1,-2+1]3
3[-3=**-1,-3~-1:0,-3~43:4,-3~44:6,-3~45:7,-3~46:9,-3~60:10,-3~62:11,-3~91:12,-3~93:14]15
4[*-2+1,*-2~256:5]15 '+
5[*-2=0]15
6[*-2?,*-2~-1:5]15 ',
7[*-2-1,*-2~-1:8]15 '-
8[*-2=255]15
9[*-2$]15 '.
10[-2-1]15 '
12[*-2~0:13]15 ‘[
13[-3=1]16
14[-3=-1]16 ‘]
15[-1+1]3
16[-4=1]17
17[-1+*-3,*-1~91:18,*-1~93:19]20
18[-4+*-3]20
19[-4-*-3]20
20[-4~0:21]17
21[-3~1:15]3
[linguine-spec]: http://kidsquid.com/files/linguine/README
[linguine-impl]: http://kidsquid.com/files/linguine/linguine.tar.gz
Topological Spaces
Yesterday, I introduced the idea of a *metric space*, and then used it to define *open* and *closed* sets in the space. (And of course, being a bozo, I managed to include a typo that made the definition of open sets equivalent to the definition of closed sets. It’s been corrected, but if you’re not familiar with this stuff, you might want to go back and take a look at the corrected version. It’s just replacing a ≤ with a <, but that makes a *big* difference in meaning!)
Today I’m going to explain what a *topological space* is, and what *continuity* means in topology. (For another take on continuity, head on over to [Antopology][antopology] where Ofer has posted his explanation.)
A *topological space* is a set **X** and a collection **T** of subsets of **X**, where the following conditions hold:
1. ∅ ∈ **T** and **X** ∈ **T*.
2. ∀ C ∈ ℘(**T**); : ∪(c ∈ C) ∈ **T**. (That is, the union of any collection of subsets of **T** is an element of **T**. )
3. ∀ s, t ∈ **T** : s ∩ t ∈ T. *(The intersection of any two subsets of **T** is also in **T**.)
The collection **T** is called a *topology* on **T**. The *members* of **T** are the *open sets* of the topology. The *closed sets* are the set complements of the members of **T**. Finally, the *elements* of the topological space **X** are called *points*.
The connection to metric spaces should be pretty obvious. The way we built up open and closed sets over a metric space can be used to produce topologies. The properties we worked out for the open and closed sets are exactly the properties that are *required* of the open and closed sets of the topology. There are many ways to build a topology other than starting with a metric space, but that’s definitely the easiest way.
One of the most important ideas in topology is the notion of *continuity*. In some sense, it’s *the* fundamental abstraction of topology. We can now define it.
A *function* from topological space **X** to topological space **U** is *continuous* if/f for every open sets C ∈ **T** the *inverse image* of f on C is an open set. The inverse image is the set of points x in **X** where f(x) ∈ C.
That’s a bit difficult to grasp. What it’s really capturing is that there are no *gaps* in the function. If there were a gap, then the open spaces would no longer be open. Think of the metric spaces idea of open sets. Imagine an open set with a cube cut out of the middle. It’s definitely not continuous. If you took a function on that open set, and its inverse image was the set with the cube cut out, then the function is not smoothly mapping from the open set to the other topological space. It’s mapping *part of* the open set, leaving a big ugly gap.
Now, here’s were it gets kind of nifty. The set of of topological spaces and continuous functions form a *category*. with the spaces as objects and the functions as morphisms. We call this category **Top**. It’s often easiest to talk about topological spaces using the constructs of category theory.
So, for example, one of the most important ideas in topology is *homeomorphism*. A homeomorphism is a bicontinuous bijection (a bicontinuous function is a continuous function with a continuous inverse; a bijection is a bidirectional total function between sets.) A homeomorphism between topological spaces is a *homomorphism* in **Top**.
From the perspective of topology, any two topological spaces with a homeomorphism between them are *identical*. (Which ends up corresponding exactly to how we defined the idea of *equality* in category theory.)
[antopology]: http://antopology.blogspot.com/2006/08/continuity-introduced.html
Metric Spaces
Topology usually starts with the idea of a metric space. A metric space is a set of values with some concept of distance. We need to define that first, before we can get into anything really interesting.
Metric Spaces and Distance
What does distance mean?
Let’s look at a set, S, consisting of elements s1, s2, s3,…,sn. What does it mean to measure a distance from si to sj?
We’ll start by looking at a simple number-line, with the set of real numbers. What’s the distance between two numbers x and y? It’s a measure of how far over the number-line you have to go to get from x to y. But that’s really circular; we’ve defined distance as “how far you have to go”, which is defined by distance. Let’s try again. Take a blank ruler, and put it next to the numberline, so that the left edge of the ruler is on X, and then draw a mark on the ruler where it touches Y. The length of the ruler up to that mark is the distance from x to y. The reason that this one isn’t circular is because now, you can take that ruler, and use it to answer the question: is the number v farther from the number w than x is from y? Because the ruler gives you a *metric* that you can use *that is separate from* the number-line itself.
A metric over S is a function that takes two elements si and sj, and returns a *real number* which measure the distance between those two elements. To be a proper metric, it needs to have a set of required properties. To be formal, a function d : S × S → ℜ is a *metric function over S* if/f:
- ∀ si, sj ∈ S: d(si,sj) = 0 if/f i=j. (the identity property)
- ∀ si, sj ∈ S: d(si,sj) = d(sj,si) (the symmetry property)
- ∀ si, sj, sk ∈ S: d(si,sk) ≤ d(si,sj) + d(sj,sj) (the triangle inequality property)
Some people also add a fourth property, called the non-negativity property; I prefer to leave it out, because it can be inferred from the others. But for completeness, here it is: ∀ si, sj ∈ S: d(si,sj) ≥ 0.
A metric space is just the pair (S,d) of a set S, and a metric function d over the set.
For example: - The real numbers are a metric space with the ruler-metric function. You can easily verify that properties of a metric function all work with the ruler-metric. In fact, they are are all things that you can easily check with a ruler and a number-line, to see that they work. The function that you’re creating with the ruler is: d(x,y) = |x-y| (the absolute value of x – y). So the ruler-metric distance from 1 to 3 is 2.
- A cartesian plane is a metric space whose distance function is the euclidean distance:
d((ax,ay), (bx,by)) = ((ax-bx)2 + (ay-by)2 )1/2. - In fact, for every n, the euclidean n-space is a metric space using the euclidean distance.
- A checkerboard is a metric space if you use the number of kings moves as the distance function.
- The Manhattan street grid is a metric space where the distance function between two intersections is the sum of the number of horizontal blocks and the number of vertical blocks between them.
Open and Closed Sets in Metric Spaces
You can start moving from metric spaces to topological spaces by looking at open sets. Take a metric space, (S,d), and a point p∈S. An open ball B(p,r) (a ball of radius r around point p) in S is the set of points x such that d(p,x) 0, B(p,r)∩T≠∅. The closure of T (usually written as T with a horizontal line over it; sometimes written as T by computer scientists, because that’s the closure notation in many CS subjects). is the set of all points adherent to T. *(note: a typo was corrected in this paragraph. Thanks to the commenter who caught it!)
A subset T of S is called a closed subset if/f T=T. Intuitively, T is closed if it *contains the surface that forms its boundary. So in 3-space, a solid sphere is a closed space. The contents of the sphere (think of the shape formed by the air in a spherical balloon) is not a closed space; it’s bounded by a surface, but that surface is not part of the space.
Introducing Topology
Back when GM/BM first moved to ScienceBlogs, we were in the middle of a poll about the next goodmath topic for me to write about. At the time, the vote was narrowly in favor of topology, with graph theory as a very close second.
We’re pretty much done with category theory, so it’s topology time!
So what’s topology about? In some sense, it’s about the fundamental abstraction of *continuity*: if I have a bunch of points that form a continuous line or surface, what does that really mean? In particular, what does it mean *from within* the continuous surface?
Another way of looking at is as the study of what kinds of *structures* are formed from continuous sets of points. This viewpoint makes much of topology look a lot like category theory: a study of mathematical structures, what they mean, and how we can build them and create mappings between them.
Let’s take a quick look at an example. There’s a famous joke about topologists; you can always recognize a topology at breakfast, because they’re the people who can’t tell the difference between their coffee mug and their donut.
It’s not just a joke; there’s a real example hidden in there. From the viewpoint of topology, the coffee mug and the donut *are the same shape*. They’re both toruses. In topology, the exact shape doesn’t matter: what matters is the basic continuities of the surface: what is *connected* to what, and *how* they are connected. In the following diagram, all three shapes are *topologically* identical:
If you turn the coffee mug into clay, you can remold it from mug-shape to donut-shape *without tearing or breaking*. Just squishing and stretching. So in topology, they *are* the same shape. On the other hand, a sphere is different: you can’t turn a donut into a sphere without tearing a whole in it. If you’ve got a sphere and you want to turn it into a torus, you can either flatten it and punch a hole in the middle; or you can roll it into a cylinder, punch holes in the ends to create a tube, and then close the tube into a circle. And you can’t turn a torus into a sphere without tearing it: you need to break the circle of the torus and then close the ends to create a sphere. In either case, you’re tearing at least one whole in what was formerly a continuous surface.
Topology was one of the hottest mathematical topics of the 20th century, and as a result, it naturally has a lot of subfields. A few examples include:
1. **Metric topology**: the study of *distance* in different spaces. The measure of distance and related concepts like angles in different topologies.
2. **Algebraic topology**: the study of topologies using the tools of abstract algebra. In particular, studies of things like how to construct a complex space from simpler ones. Category theory is largely based on concepts that originated in algebraic topology.
3. **Geometric topology**: the study of manifolds and their embeddings. In general, geometric topology looks at lower-dimensional structures, most either two or three dimensional. (A manifold is an abstract space where every point is in a region that appears to be euclidean if you only look at the local neighborhood. But on a larger scale, the euclidean properties may disappear.)
4, **Network topology**: topology in the realm of discrete math. Network topologies are graphs (in the graph theory sense) consisting of nodes and edges.
5. **Differential Topology**: the study of differential equations in topological spaces that have the properties necessary to make calculus work.
Personally, I find metric topology rather dull, and differential topology incomprehensible. Network topology more properly belongs in a discussion of graph theory, which is something I want to write about sometime. So I’ll give you a passing glance at metric topology to see what it’s all about, and algebraic topology is where I’ll spend most of my time.
One of the GM/BM readers, Ofer Ron (aka ParanoidMarvin) is starting a new blog, called [Antopology][antopology] where he’ll be discussing topology, and we’re going to be tag-teaming our way through the introductions. Ofer specializes in geometric topology (knot theory in particular, if I’m not mistaken), so you can get your dose of geometric topology from him.
[antopology]: http://antopology.blogspot.com/