I thought that it would be fun to stick with the “stack-based” theme of last week’s pathological post, but this time, to pick an utterly pointlessly twisted stack based language, but one that would be appreciated by the mascot of one of my fellow ScienceBlogs. Orac, this one’s for you! 🙂 Our target these week is the language “Enema”.
Enema is a remarkably simple stack based language. In fact, it mostly looks like a
thoroughly trivial language – something like False, only simpler. It has a small family
of simple commands:
- 0-9
- Push the value of the digit onto the stack.
- +
- Pop the top two values on the stack, add them, and push the result.
- –
- Pop the top two values on the stack, subtract the first from the second, and push the result.
- *
- Pop the top two values on the stack, multiply them, and push the result.
- /
- Pop the top two values on the stack, integer-divide the first by the second, and push the result.
- %
- Pop the top two values on the stack, integer-divide the first by the second, and
and push the remainder onto the stack. - &
- Pop the top two values on the stack, perform logical-and on them, and push the result.
- |
- Pop the top two values on the stack, perform logical or on them, and push the result.
- ^
- Pop the top two values on the stack, perform logical exclusive-or on them, and push the result.
- I
- Input a single character, and push it onto the stack.
- O
- Output the value on top of the stack as a character.
- X
- Discard the value on top of the stack.
- D
- Duplicated the value on top of the stack.
- S
- Swap the top two values on the stack.
- R
- Rotate the stack: take the top three values on the stack S1,S2,S3, and push them back as S3,S1,S2.
- G
- Pop the top-most stack value as a memory address and push the value stored at
that address. - P
- Pop the top value on the stack as a memory address, and the second value on the stack
as a number, and store them number in memory at the address - [
- Mark the beginning of a loop.
- ]
- Mark the end of a loop. When code reaches the “]” and the end of a loop, it jumps back
to the corresponding “[“. - B
- Exit (break out of) the current loop, starting execution at the instruction after the next “]”.
- Z
- Conditional execution: skip the next instruction if the value on top of the stack is greater than 0.
- “chars”
- Push the ascii codes of the characters between the quotes onto the stack.
- ?
- Push the current depth of the stack onto the top of the stack.
- #
- Push the maximum address of memory that has been used by the program onto the stack. (This is effectively the amount of memory that has been allocated by the program.)
- : char code :
- Define char as a new function, whose body is the code up to the next “:”.
- ! char
- Discard the function definition associated with the character.
- Q
- Return from function call.
- .
- End of program: halt.
So, for example, here’s a basic “Hello world” program:
052*"!dlroW ,olleH"[DZBO].
It pushes 0, 5, and 2 onto the stack; then multiplies 5 times 2, leaving 10 on top, and 0 beneath it. Then it pushes the characters of “Hello world” onto the stack – so the stack is “0/10/”!”/”d”/”l”/”r”/”o”/”W”/” “/”,”/”o”/”l”/”l”/”e”/”H”. Then it runs a loop containing the commands “DZBO”. “D” duplicates the top value on the stack; then “Z” skips the next instruction if the stack-top is not zero. So if the character on top of the stack was not zero, then it skips to “O”, and outputs the stack top, and then goes back to the next iteration. If the stack top was 0, then it executes “B”, which breaks out of the loop, and halts. So it will output each of the characters in “Hello, World!”, followed by character 10 (which is newline); and then it will halt. Very straightforward, if awkward.
Ok… So far, we’ve got a pretty typical (if ugly-ish) stack-based language. But there’s nothing particularly pathological about it. It’s just another stack language. So why does this justify a name like “Enema”? Why is it worth a coveted FPP slot? Here’s
a simple example to illustrate why:
23* 68*+ O
This is a very simple program. It multiplies 2*3, and then adds 48 to it (converting the numeric value to a digit character value), and outputs it. So it outputs “6”.
:23: 23* 68*+ O
This is the same program, except that we’ve added a function definition to the beginning. The program now outputs “9” – because we’ve redefined “2” to push the value “3” onto the stack.
“:” can redefine any command in the program. “:23:” will redefine 2 so that instead of pushing the value 2 onto the stack, it pushes the value 3! “::stuffQ” will redefine the “:” command so that instead of starting a definition, it will now be a command that does “stuff”.
That gives you some idea of how twisted this can get. Alas, it’s really hard to write programs when your primitives can be arbitrarily redefined out from under you! So the author just totally wimped out, and never uses this in any of their examples.
But for you, my readers, I’m willing to endure all sorts of terrible punishment. So here is my rendition of a new and extra-twisted version of an enema program to
compute the integer exponent of a number.
:qX: :39Q: :41Q: { output null-terminated string to stdout } :o[DZBO]qQ: { count n-th power of v (v n -> p) } :p4[SDZBRDR*SR1-S]qSqQ: { convert integer to null-terminated string (n -> 0 c1 c2 c3 ...) } :i0S[D34+%"0"+S34+/DZB]qQ: { read integer terminated with character < '0' } :r0[I"0"-D1+ZBS34+*+]qQ: { prompt user to enter data } :a0" :eulav"352*" atad retnE"or0" :rewop"3orQ: { main program } ?Zapio52*O.
All these stack-based languages. Go blow it out your….
Flattery will get you nowhere. In this case I’m sure that’s literally true.
This raises the interesting question:
Should languages arbitrarily restrict programmers because the result may not be *nice*?
C *does* (I just tried it!) allow you to use the preprocessor to redefine keywords, which could lead to a lot of pain and suffering right there.
Keith:
The way that the C preprocessor allows you to redefine keywords actually *does* cause serious problems in real code. I’ve been bitten by it more than once when people have done something ugly with it.
One open-source project I contributed to about 3-4 years ago had one person submitting code that used his own glorious new control structures defined with the C preprocessor. Instead of normal while loops, he used his own variant, defined by:
#define whilenn(x) while ( (x) != NULL)
and then used whilenn all over the place. And he’d write conditions so that they returned values that were comparable to null… so you’d see things like:
whilenn ( equals(x,y) ) {
/* do something */
}
Where “equals” was a function that returned a string if x equals y… He claimed that programming that way was good, because it made it easy to write simple debugging print statements. Because, you see, he had a flag which could redefine “NULL” as a string, so that all of his comparisons always returned string values, and then the debugging flag would convert his “whilenn” into something including a print statement… Something like:
#define whilenn(cond) while (print_debug(cond), (cond) != null)
I don’t necessarily endorse the idea of languages that get in the face of the programmer and stop them from doing things… But I also greatly question the wisdom of any language design that makes it easy to redefine language primitives on the fly. (To be fair, you can’t blame C for the abuses of the preprocessor. C really is a decent low-level language. The preprocessor is a separate tool, by design, so putting restrictions into it about C semantics doesn’t make sense.)
But I have used languages where you *could* redefine stuff on the fly. I once had to admin a lisp machine where you could redefine just about *anything*. It implemented things like “if” using a macro that expanded to “cond”, and “loop” to a macro that worked via named-let recursion. And it let programmers see *and change* those macros. So someone would naively try to edit those macros to add debugging stuff (why is it *always* debugging that leads to these nightmares?), and end up crashing the whole machine.
Gag.
The C preprocessor? Cause problems? No way, it’s lovely. How else would you be able to write code like this.
I recall seeing someone post a CLisp session where he redefines car to mean cdr and define to mean car.
Of course, after that second redefine, the fun kinda stops, since the redefinition operation suddenly is orphaned.
Eh, this is one of those weird things that might be useful in extremely limited circumstances, but is otherwise bad. Still, no need to eliminate it, because then you screw over the programmer who really does need it.
Just smack the idiots in the head who reprogram their language primitives needlessly.
David:
Ah yes, the old Let’s Make C look like Pascal trick. I almost mentioned that.
I think the reason that debugging is so prone to this sort of thing is best explained by Kernigham and Plauger:
“Everyone knows that debugging is twice as hard as writing a program in the first place. So if you are as clever as you can be when you write it, how will you ever debug it?”
I do like stack-based programming in general