The first donor to the Chesapeake Math program asked me to write about the 196 algorithm, a problem also known as the mystery of the Lychrel numbers. To be completely honest, that’s something that people have asked for in the past, but I’ve never written about it, because I didn’t see what I could add. But I made a promise, so… here goes!
Take any number N with at least two digits. Reverse the digits of N, giving you M, and then add N+M. That gives you a new number. If you keep repeating this process, most of the time you’ll get a palindromic number really fast.
For example, let’s start with 16:
- 16 reversed is 61.
- 16+61=77. 77 is a palindromic number.
- So one reverse+add, and we have a palindromic number.
Or 317:
- 317 reversed is 713.
- 317+713=1030.
- 1030 reversed is 301.
- 1030 + 301 = 1331, so we have a palindromic number after two steps.
You can play the same game in different number bases. For example, in base 8, we can start with 13 (11 in base 10): in one reverse+add step, it becomes 44 (36 in base 10).
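To make the base-8 example concrete, here’s a quick sketch of a single reverse+add step in an arbitrary base (the helper names are mine, invented for illustration – they don’t come from anywhere in particular):

```python
# One reverse+add step in an arbitrary base. Python's bignums make the
# arithmetic itself trivial; the only work is digit manipulation.
def to_digits(n, base):
    """Digits of n in the given base, most significant first."""
    digits = []
    while n:
        digits.append(n % base)
        n //= base
    return digits[::-1] or [0]

def reverse_add(n, base=10):
    """Add n to its digit-reversal in the given base."""
    reversed_n = 0
    for d in reversed(to_digits(n, base)):
        reversed_n = reversed_n * base + d
    return n + reversed_n

print(reverse_add(0o13, base=8) == 0o44)  # 13 + 31 = 44 in base 8: True
```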
For most numbers, you get a palindrome in just a few steps. But for some numbers, in some number bases, you don’t. If you can’t ever get to a palindrome by doing reverse+add starting from a number, then that number is called a Lychrel number.
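“Can’t ever get to a palindrome” isn’t something a program can test directly, so in practice searchers pick a step bound and give up past it. A minimal sketch of that idea (the bound of 500 is an arbitrary choice of mine):

```python
def is_lychrel_candidate(n, max_steps=500):
    """Reverse+add until we hit a palindrome or exhaust the step budget."""
    for _ in range(max_steps):
        n += int(str(n)[::-1])
        if str(n) == str(n)[::-1]:
            return False   # reached a palindrome: definitely not Lychrel
    return True            # survived the budget: a candidate, not a proof

print([n for n in range(10, 200) if is_lychrel_candidate(n)])  # [196]
```

Note that a True result is only ever suggestive: no finite bound can prove that a palindrome never appears.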
The process of exploring Lychrel numbers has picked up a lot of devotees, who’ve developed a whole language for talking about it: the sequence of numbers produced by repeated reverse+add steps is called a thread, the number that starts a thread is its seed, and a number that we suspect (but can’t prove) never produces a palindrome is a Lychrel candidate.
The question that haunts Lychrel enthusiasts is: will you always eventually get a palindrome, no matter where you start? Or do Lychrel numbers actually exist?
In base 2, we know the answer: not all numbers will produce a palindrome; there are base-2 Lychrel numbers. The smallest base-2 Lychrel number is 22 – or 10110 in binary. We can look at its reverse+add sequence, and see intuitively why it will never produce a palindrome:
- 10110
- 100011
- 1010100
- 1101001
- 10110100 (10, 11, 01, 00)
- 11100001
- 101101000
- 110010101
- 1011101000 (10, 111, 01, 000)
- 1101000101
- 10111010000
- 11000101101
- 101111010000 (10, 1111, 01, 0000)
Starting at step 5, we start seeing a pattern in the sequence: recurring values that have the shape 10, followed by 1s, followed by 01, followed by 0s. We’ve got a sequence that’s building larger and larger numbers, in a way that will never converge into a palindrome.
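As a sanity check, here’s a short Python sketch (Python rather than the Racket used later in the post, just for brevity) that replays the thread above:

```python
# Replay the base-2 thread from 10110 (22), printing each value so the
# recurring 10...1s...01...0s shape is visible.
n = 0b10110
print(bin(n)[2:])
for _ in range(12):
    n += int(bin(n)[2:][::-1], 2)   # reverse the bits, then add
    print(bin(n)[2:])
```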
We can find similar sequences in any power-of-two base – base 4, 8, 16, etc. So in power-of-two bases, there are Lychrel numbers. But: are there Lychrel numbers in our familiar base-10?
We think so, but we’re not sure – no one has been able to prove it either way. But we’ve got some numbers, which we call Lychrel candidates, that we think are probably Lychrel numbers. The smallest is 196 – which is why this whole discussion is sometimes called the 196 problem, or the 196 algorithm.
People have written programs that follow the Lychrel thread from 196, trying to see if it ever reaches a palindrome. So far, the record carries the thread through millions of iterations, producing a non-palindromic number with more than 6 million digits.
That’s pretty impressive, given that the longest thread for any seed smaller than 196 is the 24-step thread starting with 89, which ends at the palindromic number 8,813,200,023,188.
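That claim about 89 is easy to verify directly; a throwaway check:

```python
# Follow the thread from 89 until it hits a palindrome.
n, steps = 89, 0
while str(n) != str(n)[::-1]:
    n += int(str(n)[::-1])
    steps += 1
print(steps, n)  # prints: 24 8813200023188
```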
From my perspective, one thing that interests me about this is its nature as a computational problem. As a problem, it’s really easy to implement. For example, here’s a complete implementation in Racket, a Scheme-based programming language. (I used Racket because it features infinite-precision integers, which makes this easier to write.)
```racket
;; Reverse the decimal digits of n.
(define (reverse-number n)
  (string->number
   (list->string
    (reverse (string->list (number->string n))))))

;; A number is palindromic if it reads the same in both directions.
(define (palindromic? n)
  (equal? n (reverse-number n)))

;; One step of the process: add n to its reversal.
(define (reverse-add n)
  (+ n (reverse-number n)))

;; Follow the thread from seed, printing each candidate, until we hit a
;; palindrome. For a true Lychrel number, this would never terminate.
;; Example: (find-palindrome 196)
(define (find-palindrome seed)
  (define (noisy-find-palindrome n count)
    (if (palindromic? n)
        n
        (begin
          (printf "At iteration ~v, candidate=~v~n" count n)
          (noisy-find-palindrome (reverse-add n) (+ count 1)))))
  (noisy-find-palindrome seed 0))
```
I literally threw that together in less than five minutes. In that sense, this is a really, really easy problem to solve. But in another sense, it’s a very hard problem: there’s no way to really speed it up.
In modern computing, when we look at a problem that takes a lot of computation to solve, the way that we generally try to approach it is to throw more CPUs at it, and do it in parallel. For most problems that we come across, we can find some reasonable way to divide it into parts that can be run at the same time; then by having a bunch of computers work on those different parts, we can get a solution pretty quickly.
For example, back in the early days of this blog, I did some writing about the Mandelbrot set, and one variant of it that I find particularly interesting, called the Buddhabrot. The Buddhabrot is interesting because it’s a fractal visualization which isn’t naively zoomable. In a typical Mandelbrot set visualization, you can pick a small region that you want to see in more detail, and focus your computation on just that part, to get a zoomed-in view. In the Buddhabrot, due to the chaotic nature of the computation, you can’t. So you compute the Buddhabrot at a massive size, and then you compress it; when you want to see a region in more detail, you un-compress. To make that work, Buddhabrots are frequently computed at resolutions like 1 million by 1 million pixels, which translates into several trillion complex floating-point computations. That’s a big task. But in modern environments, it’s routine enough that a friend of mine at Google wrote a program, just for kicks, to compute a big Buddhabrot image, and ran it on an experimental cluster.
If that kind of computational capability can be exploited just for kicks, why is it that the best effort at exploring the Lychrel thread for 196 only covers 6 million digits?
The answer is that there’s a way of computing the Buddhabrot in parallel. You can throw 10,000 CPUs at it for a couple of days, and get an amazing Buddhabrot image. But for the Lychrel thread, there’s no good way to do it in parallel.
For each additional number in the thread, you need to reverse and add a couple of million digits. That’s a lot of work, but it’s not crazy. On any halfway decent computer, it won’t take long. To get a sense, I just whipped up a Python program that generated 1,000,000 random pairs of digits, and added them up. It took under 8 seconds – and that’s half-assed code written in a not-particularly-fast interpreted language on a not-particularly-speedy laptop. A single iteration of the Lychrel thread computation, even for a million-digit candidate, doesn’t take that long – it’s on the order of seconds.
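The post doesn’t include that benchmark script, but a reconstruction along the lines described might look like this (timing will obviously vary by machine):

```python
# Generate 1,000,000 random pairs of digits and add them all up,
# roughly simulating the digit-level work of one reverse+add step on a
# million-digit number.
import random
import time

pairs = [(random.randint(0, 9), random.randint(0, 9))
         for _ in range(1_000_000)]

start = time.time()
total = sum(a + b for a, b in pairs)
print(f"Added {len(pairs):,} digit pairs in {time.time() - start:.3f}s")
```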
The catch is that the process of searching a Lychrel thread is intrinsically serial: you can’t have different CPUs computing different thread elements, because you don’t know the next value until you’ve finished computing the previous one. So even if it took just one second to do the reverse+add for a million-digit number, exploring the space would take a long time: at 2 seconds per iteration, the next million steps of the thread would take around 3 weeks!
Even if you don’t waste time with a relatively slow interpreter like Python – even with carefully hand-optimized code and an efficient algorithm – it would take months, at the very least, to explore billions of elements of the 196 thread. And to get beyond what anyone has already done, you’d probably end up doing years of computation, even with a very fast modern computer – because you can’t parallelize it.
If you’re interested in this kind of thing, I recommend looking at Wade Van Landingham’s p196 site. Wade is the guy who named them Lychrel numbers, based on a rough reversal of his girlfriend’s (now wife’s) name, Cheryl. He’s got all sorts of data, including tracking some of the longest-running efforts to continue the 196 thread.
A couple of edits were made to this post, based on errors pointed out by commenters.
But you could compute the result with one sequence per processor, and reuse each processor when its sequence is done. This is, of course, obvious.
Yes, you could compute distinct sequences with different processors – but the problem that people are most interested in is seeing if any of our base-10 Lychrel candidates do eventually form palindromes. If they do, it’s in the multi-million digit range (or larger). So you still can’t explore that space very quickly.
“will you always eventually get a palindrome? That is, do Lychrel numbers actually exist? In base-2, we know the answer to that: No.”
is confusing, since a yes answer to the first question is a no answer to the second one.
You’re right – in the course of editing, I changed it from “Do all numbers eventually form palindromes”, and didn’t update the answer correctly.
The link to Wade’s site is busted (pointing back here).
And, while you might not be able to parallelize the problem, you could pipeline the heck out of it, which should get you a lot of the potential benefits of parallelization. The key is that if you use two processors to do each addition – one starting from the least significant digits, the other starting from the most significant digits, meeting up in the middle – then you can kick off another pair of processors on the next step as soon as the first and last few digits of the previous step are known. You don’t need to wait for the full addition to be complete before starting the next step. The only tricky part is that the processor starting from the MSDs needs to wait while it’s putting out a bunch of tentative “carry propagate” digits (9s, in base 10) before it can finalize the digit to the left of the carry propagates and pass it along to the next step. When a processor pair finishes its assigned addition, it checks the palindrome property (or does that during the addition), and then makes itself available to start a new step.
By the time you work up to million-digit numbers, you should be able to keep several thousand processors (at least) busy computing the next several thousand steps, as long as the memory interlocks don’t kill your performance. Do you see any obvious reason this wouldn’t work?
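For concreteness, here is one way to sketch the “tentative carry propagate” idea described above – single-threaded and in base 10; the function and its interface are invented purely for illustration. Each output digit is finalized as soon as the carry from below it is known, with runs of tentative 9s buffered until a non-9 pair sum resolves them:

```python
def msd_stream_reverse_add(digits):
    """Digits of N (most significant first) -> digits of N + reverse(N),
    emitted most-significant-first, finalizing each digit as soon as the
    carry from below it is known. Runs of tentative 9s are buffered."""
    L = len(digits)
    # Pair sums, top-down. A virtual leading 0 absorbs any final carry-out.
    raws = [0] + [digits[j] + digits[L - 1 - j] for j in range(L)]
    out, buf = [], []
    for s in raws:
        if s == 9:
            buf.append(9)          # tentative: a carry might flow through it
        else:
            if buf:                # s settles the carry for everything buffered
                c = 1 if s >= 10 else 0
                out.append((buf[0] + c) % 10)
                out.extend([(9 + c) % 10] * (len(buf) - 1))
            buf = [s]
    if buf:                        # bottom of the number: carry-in is 0
        out.append(buf[0] % 10)
        out.extend([9] * (len(buf) - 1))
    return out[1:] if out[0] == 0 else out   # drop the unused virtual digit

print(msd_stream_reverse_add([1, 9, 6]))     # 196 + 691 -> [8, 8, 7]
```

In a real pipeline, the pair sums would arrive incrementally from the previous stage, and each finalized digit could be handed to the next stage immediately.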
That works in theory, but only if you can actually get a couple of thousand processors working efficiently with shared memory.
That’s where that approach falls apart. With multiple CPUs, even if you can somehow make shared memory work, you end up needing to introduce coordination – and coordination never comes for free.
When you look at the speedup curves of programs as you increase the number of CPUs, you almost always see a curve whose slope gradually decreases, until it becomes negative.
My best guess, based on experience, is that you could get 4 CPUs working together on this – but beyond that, the coordination cost would exceed the benefits.
So yeah, some amount of shared memory parallelism might help – but it will be a small, linear improvement in a very slow process. It’s nice – potentially converting months to weeks – but even so, you’re still talking about having a program running full out for a year before you’re close to any new results.
Actually, I think you can do a lot better than that. If we follow along the lines of my comment below and use one processor interleaving MSD and LSD digits, then each processor is primarily communicating just with the previous and next processor in the pipeline via what are essentially FIFOs. We can imagine it like a bunch of processes connected in a Unix pipeline. With pipe/socket semantics, we can get rid of the # of digits & current digit fields I proposed below and just stream the interleaved digits, with some out-of-band characters to signal things like “end of number,” “here’s the associated step number,” and “abort the computation – I found a palindrome” (possibly followed by “here it is: …”).
There would be a bunch of performance tuning involved in getting maximum throughput out of a given number of processors, paying a lot of attention to keeping communication between adjacent processes as local as possible. Conceptually, you have a computation process with an input pipe of some kind and an output pipe, and you want to balance the speed of the computation with the speed of the I/O of the pipes (whatever their form). You want the time to read 1000 bytes from the input to be approximately the same as the average time to compute the next 1000 bytes of output and the time to write 1000 bytes on the output, with the compute time being a little bit faster to maximize the throughput on the pipes and avoid starving the following processes for data. If the computation is initially faster than the I/O (as I suspect it will be in a naive implementation), you can improve the CPU utilization by looping the computation a few times through the thread/process/core/processor/board/rack before writing to the output pipe at that level.
At the outer level of computation you would be communicating with the file system – read a million digit number from a file, compute the next several thousand steps using the pipeline, and write the resulting number to a file, which gives you natural checkpoints along the way. Doing something like encoding two digits plus control info per byte would double the effective bandwidth of the pipes relative to a naive ASCII encoding. You could squeeze out a bit more I/O bandwidth by compressing the data further, but I’m not sure how much more would be worth it. I’m sure further performance ideas would occur to folks who do this kind of thing for a living.
Given this kind of streaming architecture, I think you could get effective use out of hundreds or thousands of processors, if you could (e.g.) get the use of a rack in the corner of a data center somewhere for a weekend experiment. There would be overhead, and the speedup per processor would be less than one, but I think you could get a pretty high average utilization.
Parallel Lychrel computations:
http://www.dolbeau.name/dolbeau/p196/p196.html
And of course (following up on my previous comment), instead of a pair of processors, you could use a single processor that alternates between computing MSDs and LSDs, which may simplify the coordination issues. If you also store the numbers in this alternating format, you get a very natural data structure to pass from one stage of the pipeline to the next. Something like: # of digits, max digit computed, MSD0, LSD0, MSD1, LSD1, MSD2, LSD2, …. The receiver just needs to make sure not to read beyond the last digit currently computed. Storing the alternating MSDs and LSDs also makes the palindrome checking trivial (just make sure MSDi == LSDi for all i).
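A tiny sketch of that interleaved layout (the helper names are invented for illustration), showing how the palindrome check reduces to pairwise comparisons:

```python
# Store digits as [MSD0, LSD0, MSD1, LSD1, ...]; a palindrome check then
# just compares each MSD/LSD pair.
def interleave(digits):
    """Interleave a digit list (most significant first) as MSD0, LSD0, ..."""
    out = []
    i, j = 0, len(digits) - 1
    while i < j:
        out.extend([digits[i], digits[j]])
        i += 1
        j -= 1
    if i == j:            # odd length: the middle digit pairs with itself
        out.append(digits[i])
    return out

def is_palindrome_interleaved(inter):
    """With the interleaved layout, MSDi must equal LSDi for every pair."""
    return all(m == l for m, l in zip(inter[0::2], inter[1::2]))

print(is_palindrome_interleaved(interleave([1, 3, 3, 1])))  # True
print(is_palindrome_interleaved(interleave([1, 0, 3, 0])))  # False
```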
It seems as if there’s a missing link to Wade’s p196 site (the link I see in my browser actually points to this very post).
Any links to the Collatz conjecture..?
It doesn’t really seem like a problem that needs more computational power. Even if the 196 thread does end in a palindrome, it’s established that checking even one number can take unreasonable amounts of time. It seems, at this point, that the answer involves pure classical number theory – or at least number theory that transforms the problem into something more computable.
Next candidate Lychrels are 295, 394, 493, 592, 691, & 879.