Wednesday, April 8, 2009

LMI Flashback

Once I fixed the compiler bug that was causing variable aliasing, I ran into the second (of many) problems. I discovered that there was a rather small limit on the stack frame size. The nested procedures that I used quickly exhausted the local variable space and caused the compiler to error out. Unfortunately, the Lisp Machine macro code (the microcode implemented a stack-oriented virtual machine) only had a small field for the displacement value of stack instructions, and there was no easy way to fix this. Given the choice between rewriting my code and rewriting the compiler, I chose to wimp out.

Next problem: stack overflow. It didn't occur to me that the Lisp Machine, the pinacle of AI Lab engineering, wouldn't optimize tail-recursion. It's a pretty trivial thing to implement. If you ever find the compiler emitting a call instruction followed immediately by a return instruction, you simply change that to a jump (what's tricky is if your calling sequence is such that the compiler never emits a call followed by an immediate return). It's easy to do dynamically in the microcode as well. When you do a function call, you peek at the following instruction. If it is a return instruction and the stack is empty, just omit pushing the return address.

I asked RG about this. He said that there was some microcode to handle tail recursion, but it was commented out because it didn't work. He said I was free to look into it, but he didn't expect that I'd have any luck. He was right.

Although the concept is easy, the devil is in the details. At function call time, the stack is never empty. Most of the cruft on the stack is not in use, so it could be deallocated, but there are things that might be in use. These are things like the dynamic binding level (for special variables), pending catches and unwind-protects, data structures with dynamic extent, and even more esoteric things (microcode state!) that happened to end up on the stack because it was convenient. The return instruction can simply deallocate the stack space because everything that stores stuff on the stack doesn't expect it to last beyond the next return. But for tail recursion, we want to deallocate stuff as soon as possible.

The code was a true mess. There is a status word at the beginning of the stack frame and many of the routines that store stuff on the stack set a flag in the status word to indicate they need the frame to last. But not every routine was this well behaved. In the particular case of lexical closures, the lexical environment was optimistically allocated on the stack and only moved to the heap if pointers to the closure still existed at the time of return. The code that implemented tail-recursion had to grovel around on the stack and try to determine if the stack frame was truly needed. It was easy enough to be conservative and punt, but the particular use case I had in mind (where I used some continuation-passing-style) was the hard case.

I eventually worked around the problem by allocating a particularly deep stack.

Over the time I was at LMI, I occasionally revisited this code to see if I could get it working again, but I never had any luck.

1 comment:

Faré said...

Here's for retrofitting as an "optimization" what is (or isn't) an essential attribute of the evaluation model.

In the same vein, "Let's secure this application..."