Category Archives: Good Math

π really is wrong!

I’ve written recently about several different crackpots who insist, for a variety of completely ridiculous reasons, that pi is wrong. But the other day, someone sent me a link to a completely serious site that makes a pretty compelling argument that pi really is wrong.

Continue reading

The Glorious Horror of TECO

In my oh-so-abundant free time, I’ve been working on my own little text editor. And one of my motivations is TECO: one of the oldest, and one of the very best, ever written. It’s both a text editor and a programming language – and, in fact, that’s exactly what made it such a brilliant tool. So much of the drudgery of programming is stuff that really could be done by a program. But we’ve spent so much time learning to be fancy that we’ve lost track of that. Nowadays, you can write an emacs lisp program to do the stuff you used to do in TECO; only it’s awkward enough that you usually don’t.

The problem, though, with just re-implementing TECO with a modern UI is that it was designed in a different time. When TECO was written, every byte was critical. And so the language, the syntax, the structure – all of it was completely ridiculous. And, as a result, it became the world’s most useful pathological programming language. It’s a glorious, hideous, wonderful, horrific piece of computing history.

TECO is one of the most influential pieces of software ever written. If, by chance, you’ve ever heard of a little editor called “emacs”; well, that was originally a set of editor macros for TECO (EMACS = Editor MACroS). As a language, it’s both wonderful and awful. On the good side, the central concept of the language is wonderful: it’s a powerful language for processing text, which works by repeatedly finding text that matches some kind of pattern, taking some kind of action when it finds it, and then selecting the next pattern to look for. That’s a very natural, easy-to-understand way of writing programs to do text processing. On the bad side, it’s got the most god-awful hideous syntax ever imagined.

Continue reading

Fractals without a Computer!

This is really remarkably clever:

Since I can’t stand to just post a video without any explanation:

A fractal is a figure with a self-similar pattern. What that means is that there is some way of looking at it where a piece of it looks almost the same as the whole thing. In this video, what they’ve done is set up three screens, in a triangular pattern, and set them to display the input from a camera. When you point the camera at the screens, what you get is whatever the camera is seeing, repeated three times in a triangular pattern. Since what’s on the screens is what the camera is seeing, and what the camera is seeing is (after a bit of delay) what’s on the screens, you’re getting a self-similar feedback system. If you watch, they’re able to manipulate it to get Julia fractals, Sierpinski triangles, and several other really famous fractals.

It’s very cool – partly because it looks neat, but also partly because it shows you something important about fractals. We tend to think of fractals in computational terms, because in general we generate fractal images using digital computers. But you don’t need to. Fractals are actually fascinatingly ubiquitous, and you can produce them in lots of different ways – not just digitally.
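For contrast with the analog trick in the video, here’s what the usual computational route looks like: a minimal “chaos game” sketch (the corner coordinates and parameters are my own choices) that generates points of a Sierpinski triangle digitally:

```python
import random

# Chaos game: repeatedly jump halfway toward a randomly chosen corner
# of a triangle. The set of visited points traces out a Sierpinski triangle.
corners = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]

def chaos_game(steps=5000, seed=42):
    random.seed(seed)
    x, y = 0.25, 0.25                       # arbitrary starting point
    points = []
    for _ in range(steps):
        cx, cy = random.choice(corners)
        x, y = (x + cx) / 2, (y + cy) / 2   # move halfway to the chosen corner
        points.append((x, y))
    return points

pts = chaos_game()
# Every generated point stays inside the triangle's bounding box.
assert all(0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 for x, y in pts)
```

Plot the points and the familiar triangular gasket appears – which makes the video’s point nicely: the self-similarity is in the iteration, not in the computer.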

Topological Spaces and Continuity

In the last topology post, I introduced the idea of a metric space, and then used it to define open and closed sets in the space.

Today I’m going to explain what a topological space is, and what continuity means in topology.

A topological space is a set X and a collection T of subsets of X, where the following conditions hold:

  1. \emptyset \in T \land X \in T: both the empty set and the entire set X are in the set of subsets, T. T is going to be the thing that defines the structure of the topological space.
  2. \forall C \in {\bf 2}^T: \bigcup_{c \in C} c \in T: the union of any collection of members of T is also a member of T.
  3. \forall s,t \in T: s \cap t \in T: the intersection of any two elements of T is also a member of T.

The collection T is called a topology on X. The members of T are the open sets of the topology. The closed sets are the set complements of the members of T. Finally, the elements of the topological space X are called points.
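For a finite set, the three conditions are easy to check mechanically. Here’s a minimal sketch (the toy space and candidate topologies are my own examples, not from the post):

```python
from itertools import chain, combinations

# A toy space X and a candidate topology T on it, as frozensets.
X = frozenset({1, 2, 3})
T = {frozenset(), frozenset({1}), frozenset({1, 2}), X}

def is_topology(X, T):
    # Condition 1: the empty set and X itself are open.
    if frozenset() not in T or frozenset(X) not in T:
        return False
    # Condition 2: every union of a collection of open sets is open.
    # (X is finite, so checking all subcollections of T suffices.)
    members = list(T)
    for k in range(len(members) + 1):
        for coll in combinations(members, k):
            if frozenset(chain.from_iterable(coll)) not in T:
                return False
    # Condition 3: every pairwise intersection of open sets is open.
    return all(s & t in T for s in T for t in T)

print(is_topology(X, T))  # True
# Fails: {1} and {2} are open, but their union {1, 2} is missing.
print(is_topology(X, {frozenset(), frozenset({1}), frozenset({2}), X}))  # False
```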

The connection to metric spaces should be pretty obvious. The way we built up open and closed sets over a metric space can be used to produce topologies. The properties we worked out for the open and closed sets are exactly the properties that are required of the open and closed sets of the topology.

The idea of the topology T is that it defines the structure of X. We say collection when we talk about it partly for emphasis: T is a set of subsets of X – a subset of the power set of X – and it can be (and frequently is) considerably larger than X itself.

What it does is define the notion of nearness for the points of a set. Take three points in the set X: a, b, and c. T contains a series of open sets around each of a, b, and c. At least conceptually, there’s a smallest open set containing each of them. Given the smallest open set around a, there is a larger open set containing it, and a larger one containing that. On and on, ever larger. Closeness in a topological space gets its meaning from those open sets. Take that sequence of increasingly large open sets around a. If you get to an open set around a that contains b before you get to one that contains c, then b is closer to a than c is.

There are many ways to build a topology other than starting with a metric space, but that’s definitely the easiest way. One of the most important ideas in topology is the notion of continuity. In some sense, it’s the fundamental abstraction of topology. Now that we know what a topological space is, we can define what continuity means.

A function f from topological space T to topological space U is continuous if and only if for every open set C \subseteq U, the inverse image of f on C is an open set in T.

Of course that makes no sense unless you know what the heck an inverse image is. If C is a set of points, then the image f(C) is the set of points \{ f(x) | x \in C \}. The inverse image of f on C, written f^{-1}(C), is the set of points \{ x | f(x) \in C \}.
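For finite topological spaces, the definition can be checked directly. A minimal sketch (the two spaces and the map are toy examples of my own):

```python
# Check continuity of f from one finite space to another by testing
# that the preimage of every open set in the target is open in the source.
T_open = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})}
U_open = {frozenset(), frozenset({'a'}), frozenset({'a', 'b'})}

f = {1: 'a', 2: 'a', 3: 'b'}   # a map from {1, 2, 3} to {'a', 'b'}

def is_continuous(f, T_open, U_open):
    for C in U_open:
        preimage = frozenset(x for x in f if f[x] in C)
        if preimage not in T_open:
            return False
    return True

print(is_continuous(f, T_open, U_open))  # True
```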

Even with the definition, it’s a bit hard to visualize what that really means. But basically, if you’ve got an open set in U, what this says is that whatever maps onto that open set must itself be an open set in T. You can’t hit an open set in U with a continuous function from T unless what you started with was an open set. What that’s really capturing is that there are no gaps in the function. If there were a gap, then the open sets would no longer be open.

Think of the metric-space idea of open sets. Imagine an open set with a cube cut out of the middle. A function whose inverse image on an open set is that set-with-a-hole isn’t continuous: it isn’t smoothly mapping the open set into the other topological space. It’s mapping part of the open set, and leaving a big ugly gap.

If you read my old posts on category theory, here’s something nifty.

The collection of topological spaces and continuous functions forms a category, with the spaces as objects and continuous functions as arrows. We call this category Top.

Aside from the interesting abstract connection, when you look at algebraic topology, it’s often easiest to talk about topological spaces using the constructs of category theory.

For example, one of the most fundamental ideas in topology is homeomorphism: a homeomorphism is a bicontinuous bijection (a bicontinuous function is a continuous function with a continuous inverse; a bijection is a function between sets that is one-to-one and onto).

In terms of the category {\bf Top}, a homeomorphism between topological spaces is an isomorphism between objects in {\bf Top}. That much alone is pretty nice: if you’ve gotten the basics of category theory, it’s a whole lot easier to understand a homeomorphism as an iso-arrow in {\bf Top}.

But there’s more: from the perspective of topology, any two topological spaces with a homeomorphism between them are identical. And – if you go and look at the category-theoretic definition of equality? It’s exactly the same: so if you know category theory, you get to understand topological equality for free!

The Halting Problem

Some people expressed interest in seeing a full-out, formal presentation of the Halting problem proof. So, I’m going to walk through it. There are actually a lot of different versions of this proof; the one that I’m going to use is based on the one used by my grad-school theory professor, Dr. John Case. To be clear, I’m doing it from memory, so any errors are completely my own fault, not John’s!

To start off, we need to define what a computing device is. In formal mathematical terms, we don’t care how it works – all we care about is what it can do in abstract terms. So what we do is define something called an effective computing device. An effective computing device is any Turing-equivalent computing device. An ECD is modelled as a two-parameter function \varphi : \mathbb{N} \times \mathbb{N} \rightarrow \mathbb{N}. The first parameter is an encoding of a program as a natural number; the second parameter is the input to the program. It’s also a natural number, which might seem limiting – but we can encode any finite data structure as a natural number, so it’s really not a problem. The return value is the result of the program, if the program halts. If it doesn’t halt, then we say that the pair of program and input isn’t in the domain of \varphi. So if you wanted to describe running the program f on the input 7, you’d write that as \varphi(f, 7). And, finally, the way that we write that a program p doesn’t halt for input i is \varphi(p, i) = \bot.

So now we’ve got our basic effective computing device. There’s one thing we still need before we can formulate the halting problem. We need to be able to deal with more parameters. After all – a halting oracle is a program that takes two inputs: another program, and the input to that program. The easiest way to do that is to use something called a pairing function. A pairing function is a one-to-one function from ordered pairs of natural numbers to natural numbers. There are lots of possible pairing functions – for example, you can convert both numbers to binary, left-pad the shorter until they’re equal length, and then interleave the bits. Given (9,3), you convert 9 to 1001, and 3 to 11; then you pad 3 to 0011, and interleave them to give you 10000111 – 135. We’ll write our pairing function as angle brackets around the two values: \langle x, y \rangle.
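The bit-interleaving scheme is easy to sketch in code (the function names here are my own, not standard):

```python
def pair(x, y):
    """Interleave the bits of x and y, left-padding the shorter with zeros."""
    xb, yb = bin(x)[2:], bin(y)[2:]
    width = max(len(xb), len(yb))
    xb, yb = xb.zfill(width), yb.zfill(width)
    return int(''.join(a + b for a, b in zip(xb, yb)), 2)

def unpair(n):
    """Invert pair(): de-interleave the even-length bit string."""
    bits = bin(n)[2:]
    if len(bits) % 2:          # restore a leading zero dropped by bin()
        bits = '0' + bits
    return int(bits[0::2], 2), int(bits[1::2], 2)

print(pair(9, 3))    # 135: interleaving 1001 and 0011 gives 10000111
print(unpair(135))   # (9, 3)
```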

With the help of the pairing function, we can now express the halting problem. The question is, does there exist a program O, called a halting oracle, such that:

\forall p, \forall i: \varphi(O, \langle p, i \rangle) = \left\{ \begin{array}{ll} 0 & \mbox{if } \varphi(p, i) = \bot \\ 1 & \mbox{if } \varphi(p, i) \neq \bot \end{array} \right.

In English: does there exist a program O such that for all pairs of programs p and inputs i, the oracle returns 1 if \varphi(p, i) halts, and 0 if it doesn’t?

Time, finally, for the proof. Suppose that we do have a halting oracle, O. That means that for any program p and input i, \varphi(O, \langle p, i \rangle) = 0 iff \varphi(p, i) = \bot.

So, can we devise a program p_d and input i where \varphi(p_d, i) \neq \bot, but \varphi(O, \langle p_d, i \rangle) = 0?

Of course we can. We need a p_d which takes two parameters: an oracle, and an input. So it should be really simple, right? Well, not quite as easy as it might seem. You see, the problem is, p_d needs to be able to pass itself to the oracle. But how can it do that? A program can’t pass itself into the oracle. Why not? Because we’re working with the program as a Gödel number – and a program can’t contain its own Gödel number. If it contained it, it would have to be larger than itself. And that’s rather a problem.

But there’s a nice workaround. What we care about is: is there any combination of program and input such that O will incorrectly predict the halting status? So what we’ll do is just turn p_d into a parameter to itself. That is, we’ll look at a program p_d like the following:

def deceiver(input):
    (oracle, program) = unpair(input)
    if oracle(program, input):
        # the oracle predicts we halt, so loop forever
        while True:
            continue
    else:
        # the oracle predicts we don't halt, so halt immediately
        return

Then we’ll be interested in the case where the value of the program parameter is a Gödel number of the deceiver itself.

(As an aside, there are a variety of tricks to work around this. One of the more classical ones is based on the fact that for any given program, p, there are an infinite number of versions of the same program with different Gödel numbers. Using that property, you can embed a program p_{d_2} into another program p_{d}. But there are a few other tricks involved in getting it right. It’s not simple – Alan Turing screwed it up in the first published version of the proof!)

Now, when input = \langle O, p_d \rangle, then O will make the wrong prediction about what p_d will do. So – once again, we’re back where we were in the simpler version of the proof. A halting oracle is a program which, given any pair of program and input, will correctly determine whether that program will halt on that input. We’re able to construct a pair of program and input where the oracle doesn’t make the right prediction, and therefore, it isn’t a halting oracle.

This version is more abstract, but it’s still got that wonderfully concrete property. Even in the most abstract way of talking about a computing device, if you’ve got something that you believe is a halting oracle, this shows you how to create a program that will prove that the halting oracle is wrong. So you can’t possibly create a halting oracle.

And to be extra clear: this doesn’t rely on any ambiguities about definitions, or any distinction between values and meanings. It shows a way of producing a real, concrete counterexample for any purported halting oracle. No trickery, no fudging – if you think you have a halting oracle, you’re wrong, and this proof shows you exactly how to create a program that will demonstrate that.
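Here’s a small sketch of the diagonal trick in runnable Python, sidestepping the Gödel-numbering machinery by passing the program around as a first-class function (all the names here are illustrative, not part of the formal proof):

```python
def make_deceiver(oracle):
    """Given any purported halting oracle (a predicate on programs),
    build a program on which the oracle's prediction must be wrong."""
    def deceiver():
        if oracle(deceiver):    # oracle says "halts"...
            while True:         # ...so loop forever
                pass
        else:                   # oracle says "doesn't halt"...
            return              # ...so halt immediately
    return deceiver

# Any concrete oracle is defeated. Try one that claims everything halts:
always_halts = lambda program: True
d1 = make_deceiver(always_halts)
# always_halts predicts d1 halts -- but d1() would loop forever.

# And one that claims nothing halts:
never_halts = lambda program: False
d2 = make_deceiver(never_halts)
d2()  # returns immediately: it halts, contradicting the prediction
```

Whatever the candidate oracle says about its own deceiver, the deceiver does the opposite – which is exactly the counterexample the proof constructs.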

Metric Spaces

One of the things that topologists like to say is that a topological space is just a set with some structure. That structure is, basically, the nearness relation – a relation that allows us to talk about what points are near other points.

So to talk about topology, you need to be able to talk about nearness. The way that we do that in topology is through a fundamental concept called an open sphere. An open sphere defines the set of all points that are close to a particular point according to some metric. That’s not the only way of defining it; there are various other ways of explaining it, but I find the idea of using a metric to be the easiest one to understand.

Of course, there’s a catch. (There’s always a catch, isn’t there?) The catch is, we need to define just what we mean by “according to some metric”. Fundamentally, we need to understand just what we mean by distance. Remember – we’re starting with a completely pure set of points. Any structure like a plane, or a sphere, or anything like that will be defined in terms of our open spheres – which, in turn, will be defined by the distance metric. So we can’t use any of that to define distance.

Defining Distance

So. Suppose we’ve got a totally arbitrary set of points, S, consisting of elements s_1, s_2, s_3, s_4, ..., s_n, .... What’s the distance between s_i and s_j?

Let’s start by thinking about a simple number line with the set of real numbers. What’s the distance between two numbers on the number line? It’s a measure of how far over the number line you have to go to get from one point to the other. But that’s cheating: how far you have to go is really just a re-arrangement of the words; it’s defining distance in terms of distance.

But now, suppose that you’ve got your real number line, and you’ve got a ruler. Then you can measure distances over the number line. The ruler defines what distances are. It’s something in addition to the set of points that allows you to define distance.

So what we really want to do is to define an abstract ruler. In pure mathematical terms, that ruler is just a function that takes two elements, s_i and s_j, and returns a real number. That real number is the distance between those two points.

To be a metric, a distance function d needs to have four fundamental properties:

Non-Negativity
\forall s_i, s_j \in S: d(s_i, s_j) \geq 0: distance is never negative.
Identity
\forall s_i, s_j \in S: d(s_i, s_j) = 0 \iff s_i = s_j; that is,
the distance from a point to itself is 0; and no two distinct points are separated by a 0 distance.
Symmetry
\forall s_i, s_j \in S: d(s_i, s_j) = d(s_j, s_i). It doesn’t matter which way you measure: the distance between two points is always the same.
Triangle Inequality
\forall s_i, s_j, s_k \in S: d(s_i, s_j) + d(s_j, s_k) \geq d(s_i, s_k).

A metric space is a pair (S, d) of a set, and a metric over the set.

For example:

  1. The real numbers are a metric space with the ruler-metric function. You can easily verify that the properties of a metric function all hold for the ruler-metric. In fact, they are all things that you can easily check with a ruler and a number line, to see that they work. The function that you’re creating with the ruler is: d(x,y) = |x-y| (the absolute value of x - y). So the ruler-metric distance from 1 to 3 is 2.
  2. A cartesian plane is a metric space whose distance function is the euclidean distance: d((a_x,a_y), (b_x,b_y)) = ((a_x-b_x)^2 + (a_y-b_y)^2)^{\frac{1}{2}}. In fact, for every n, euclidean n-space is a metric space using the euclidean distance.
  3. A checkerboard is a metric space if you use the number of king’s moves as the distance function.
  4. The Manhattan street grid is a metric space where the distance function between two intersections is the sum of the number of horizontal blocks and the number of vertical blocks between them.
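The four metric axioms are easy to spot-check numerically. Here’s a minimal sketch (the distance functions and sample points are my own) testing them for the euclidean and Manhattan metrics:

```python
from itertools import product

def euclidean(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def check_metric_axioms(d, points):
    for a, b, c in product(points, repeat=3):
        assert d(a, b) >= 0                            # non-negativity
        assert (d(a, b) == 0) == (a == b)              # identity
        assert d(a, b) == d(b, a)                      # symmetry
        assert d(a, b) + d(b, c) >= d(a, c) - 1e-12    # triangle inequality
    return True                                        # (tolerance for float rounding)

pts = [(0, 0), (1, 0), (0, 2), (3, 4), (-1, -1)]
print(check_metric_axioms(euclidean, pts))  # True
print(check_metric_axioms(manhattan, pts))  # True
```

The same harness works for the king’s-move metric or any other candidate ruler; a function that fails one of the asserts isn’t a metric.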

With that, we can define the open spheres.

Open and Closed Sets

You can start moving from metric spaces to topological spaces by looking at open sets. Take a metric space, (S,d), and a point p \in S. An open sphere B(p,r) (a ball of radius r around point p) in S is the set of points x such that d(p,x) < r.

Now, think of a subset T \subseteq S. A point p \in S is in the interior of T if and only if there is some radius r > 0 where B(p,r) \subseteq T. T is an open subset of S if every element of T is in its interior. A subset of the space formed by an open ball is always an open subset. An open subset of a metric space S is also called an open space in S.

Here’s where we can start to do some interesting things, that foreshadow what we’ll do with topological spaces. If you have two open spaces T and U in a metric space S, then T \cup U is an open space in S. So if you have open spaces, you can glue them together to form other open spaces.

In fact, in a metric space S, every open space is the union of a collection of open spheres in S.

In addition to the simple gluing, we can also prove that every intersection of two open subsets is open. In fact, the intersection of any finite set of open subsets forms an open subset. So we can assemble open spaces with all sorts of bizarre shapes by gluing together collections of open balls, and then taking intersections between the shapes we’ve built.

So now, think about a subspace T of a metric space S. We can say that a point p is adherent to T if \forall r > 0: B(p, r) \cap T \neq \emptyset. The closure of T, written \overline{T}, is the set of all points adherent to T.

A subset T of S is called a closed subset if and only if T=\overline{T}. Intuitively, T is closed if it contains the surface that forms its boundary. So in 3-space, a solid sphere is a closed space. The interior of the sphere (think of the shape formed by the air inside a spherical balloon, excluding the balloon’s skin) is not a closed space; it’s bounded by a surface, but that surface is not part of the space.

Introducing Three-Valued Logic

So, finally, I’m getting around to three-valued logic!

To start, why are we even looking at three-valued logic? We want to get to fuzzy reasoning, and three-valued logic really doesn’t do the job. But it’s a useful starting point, because in general, most of us have only worked with standard logics where statements are either true or false. In fuzzy logic, we don’t have that – and that can lead to a lot of complexity. So it’s worth looking at something simple that shows us how a value other than “true” or “false” can actually make sense in logic.
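To make the idea concrete, here’s a tiny sketch of one standard three-valued scheme, Kleene’s strong three-valued logic (the numeric encoding is my own choice), where a third value sits between true and false:

```python
# Kleene's strong three-valued logic, encoding false as 0.0,
# "unknown" as 0.5, and true as 1.0.
T, U, F = 1.0, 0.5, 0.0

def k_not(a):     return 1.0 - a
def k_and(a, b):  return min(a, b)
def k_or(a, b):   return max(a, b)

print(k_and(T, U))  # 0.5: "true AND unknown" is unknown
print(k_or(T, U))   # 1.0: "true OR unknown" is true
print(k_not(U))     # 0.5: the negation of unknown is still unknown
```

Notice that min/max and 1-x are exactly the connectives fuzzy logic generalizes to the whole interval [0,1] – which is why three-valued logic makes such a good stepping stone.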

Continue reading

An Introduction to Topology

When I took a poll of topics that people wanted me to write about, an awful lot of you asked me to write about topology. I did that before – right after I moved my blog to ScienceBlogs. But it’s been a while. So I’m going to go back to those old posts, do some editing and polishing, correct some errors, and repost them. Along the way, I’ll add a few new posts.

We’ll start with the fundamental question: just what is topology?

I’ve said before that the way that I view math is that it’s fundamentally about abstraction. Math is taking complex ideas, breaking them down into simple concepts, and then exploring what those concepts really mean, and exactly what you can build using them.

I argue that topology, at its deepest level, is about continuity and nearness. In a continuous surface, what does it mean for things to be close to one another? What kind of structures can you build using nothing but nearness relationships? What is a structure defined solely by that notion of nearness?

Continue reading

Holy Freaking Cow… P != NP??

Very big, very exciting news in the theoretical comp sci world today!

A group at HP research has published a proof that, if correct, shows that the classic P versus NP problem of computational complexity has been solved, once and for all. It’s still far from certain that it’s correct. But it’s the most credible attempt that I’ve ever seen, and it’s getting some preliminary favorable feedback from big names in complexity theory. In fact, one of the biggest names in complexity theory, Stephen Cook, has apparently said that it should be taken seriously. If Cook thinks it’s credible, then it’s damn-well credible. It might not be correct, but it’s not just crackpottery either.

For the non-CS folks out there, here’s my attempt to explain what the problem is, and what the solution means.

Continue reading

Leading in to Fuzzy Logic

So, as I said in the intro to the new Scientopical Good Math/Bad Math, I’m going to be writing some articles about fuzzy logic.

Before diving into the fuzzy stuff, let’s start at the beginning. What
is fuzzy logic? More fundamentally, what is logic? And what problem does fuzziness solve?

So, first, what’s logic?

Logic isn’t really a single thing. There are many different logics, for different purposes. But at the core, all logics have the same basic underlying concept: a logic is a symbolic reasoning system. A logic is a system for expressing statements in a form where you can reason about them in a completely mechanical way, without even knowing what they mean. If you know that some collection of statements is true, using the logic, you can infer – meaning derive, from a sequence of purely mechanical steps – whether or not other statements are true. If the logic is formulated properly, and you started with an initial set of true statements, then any conclusions drawn using the inference process from those statements will also be true.
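That mechanical flavor is easy to demonstrate. Here’s a minimal sketch (a toy encoding of my own, not any standard system) of purely symbolic inference: modus ponens applied over and over to propositions the program knows nothing about:

```python
def forward_chain(facts, rules):
    """Repeatedly apply modus ponens: from P and P -> Q, conclude Q.
    Facts are opaque strings; rules are (premise, conclusion) pairs.
    The symbols mean nothing to the program: inference is pure mechanism."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

rules = [("socrates_is_a_man", "socrates_is_mortal"),
         ("socrates_is_mortal", "socrates_will_die")]
print(sorted(forward_chain({"socrates_is_a_man"}, rules)))
# ['socrates_is_a_man', 'socrates_is_mortal', 'socrates_will_die']
```

The program never understands “mortal”; it just shuffles symbols according to rules – and if the starting facts and rules are true, so is everything it derives.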

When we say logic, we tend to automatically think of a particular logic: the first order predicate logic, which is the most common, fundamental logic used in both mathematics, and in rhetoric and debate. But there are an astonishing number of different logics for different purposes. Fuzzy logic is one particular
variation, which tries to provide a way of reasoning about vagueness.

Continue reading