Revisiting SECD and the power of Lisp

While visiting the ICFP Conference 2008, I was discussing my SECD microprocessor reimplementation project with my co-worker Kilian. The SECD is a classic virtual machine architecture supporting functional programming languages invented by Peter J. Landin in the early 1960ies. It is a relatively unusual architecture in that it is commonly described by transformations of the machine state represented as four registers containing cons cells. In fact, the virtual machine's memory conceptionally consists of linked cons cells, not of a vector of words that is addressed by integer addresses. Jia-Huai You of University of Alberta has a good description of how the SECD virtual machine works.

Quoting SECD: DESIGN ISSUES, here is the specification of the SECD virtual machine. There is one line for each machine instruction, describing the state of the four machine registers before and after the particular instruction has been executed:

While we were discussing my hardware project, Kilian suggested that we should implement a SECD in Common Lisp in order to bridge some time that we had until the next break. Given that Lisp supports lists nicely, we chose to use Lisp lists to represent the SECD memory. We first fooled around with imperative code that mimicked the required state transformations, but soon realised that it would be much easier to directly transform the specification for each machine instruction into executable Lisp. Following common practice, we started writing down how we wanted to type in the specification for each of the SECD instructions. As an example, the LD instruction would be defined by this:

(define-instr 1 LD
  (s e (LD (m . n) . c) d)
  `((,(locate m n e) . ,s) ,e ,c ,d))

The literal "1" is the op code, the "LD" is the symbolic name of the instruction, the first list is the state before the instruction is executed (with the four list elements representing the state of the four registers), and the rest specifying the code to execute for the instruction. The executed code again returns a four element list representing the new state of the machine registers.

We then wrote a macro to transform this specification syntax into Lisp code. Our DEFINE-INSTR macro uses the first list as an argument to DESTRUCTURING-BIND, binding all the names that we then use in the transformed state:

(defmacro define-instr (code name pre post)
  (let ((name* (intern (format nil "~A-INSTR" name))))
    `(progn
       (setf (gethash ,code *instr-hash*) ',name*)
       (defun ,name* ()
         (multiple-value-setq (s e c d)
           (destructuring-bind ,pre (list s e c d)
             (assert (= ,name ,code))
             (values-list ,post)))))))

We were quite happy with our virtual machine, and we were able to run several of the precompiled SECD "binaries" from the ancient LispKit software collection. Getting the macro right and implementing the 21 instructions took little time, and it was easy to, say, instrument the macro for tracing or augmenting it with checks to make sure that we got the original syntax right. It would certainly be possible to polish the macro more - for example, having to specify the mnemonic of the operation twice is not so pretty. We just meant to create something that we could use to explore the behavior of the SECD architecture, so we did not care polishing.

The next day, I attended an introductory Erlang workshop and as I had some prior knowledge of Erlang (Programming Erlang by Joe Armstrong is a good read), I decided to try implementing a SECD VM in Erlang instead of doing the exercises suggested by the lecturer. Erlang has built-in pattern matching, so I used a transliteration scheme of the VM specification very similar to what I used in Lisp. Also, Erlang has lists, so representing the SECD memory as Erlang lists should be a one to one match, too.

So, here is the CAR instruction implemented in Erlang:

secd(S = [ [ CAR | _ ] | _ ],
     E,
     [ ?CAR | C ],
     D) ->
    secd([ CAR | S ],
         E,
         C,
         D);

In Erlang, function dispatch is based on pattern matching, and there is total flexibility to match structures and bind variables to what has matched. For example, in the implementation of the car operation above, we bind the variable S to the previous stack contents, and at the same time bind the variable CAR to the car of the cons cell that is sitting on top of the stack.

The Erlang version of the VM grew well, and the source code looked as nice as the Lisp version. Yet, as I tried to implement the the RAP instruction, I faced a major stumbling block.

The RAP instruction is used to support recursive function calls and works by replacing a previously created dummy environment on the stack, called OMEGA, by one which contains all the functions that are visible in the recursive scope. The specification uses RPLACA to denote that replacement operation, and that is what we used in our Lisp implementation of SECD, too:

(define-instr 7 RAP
  (((c* . e*) v . s) (om . e) (rap . c) d)
  (progn
    (assert (equal om 'omega))
    `(nil ,(rplaca e* v) ,c* (,s ,e ,c . ,d))))

When trying to implement RAP in Erlang, I got stuck because there are no destructive operations on lists. Not in the standard API, and seemingly not in the system API either. So, the Erlang SECD looks nice, only it does not run. Frankly, our Lisp based SECD has bugs, too, and I am not totally sure whether the problem is not in the RAP instruction.

Curiously, getting the RAP instruction right is harder than one wants to expect: In the SECD hardware implementation that I worked on, I used the original microcode of the SECD chip developed at the University of Calgary as part of a verification project. I found two major bugs in this microcode which, ironically, was part of the file set that compromises the proof that the SECD hardware implementation was correct according to its abstract specification. One bug was the garbage collector which did not work at all, and the other was in the RAP instruction which did not work either. I have trouble not concluding something regarding academia in general from these findings.

Anyway.... this was another time when I realized how much I like Common Lisp and why it will propably not be replaced by another language anytime soon. Common Lisp is teh tool of choice for my exploratory programming, as it does not have a terrible lot of restrictions, yet supports some advanced programming abstractions. There are domains where Common Lisp does not have much to offer, though. Concurrent programming currently is the one area where the lack of support in Common Lisp hurts me most. Erlang may be my cure, and I hope to look into Lisp Flavoured Erlang in order to spare myself the arcane syntax that Erlang provides by default.

I have finally returned to debugging the SECD implementation on the FPGA. When I stopped with this, I had problems getting the dual ported RAM controller to run. The SECD CPU is designed as a coprocessor and relies on a host CPU to set up the program in its memory and to start itself. Once the program has finished to run, the host CPU reads out the results from the SECD memory. My host CPU is a 6809, based on System09 written by John Kent. This is a historic coincidence, too, as the original SECD implementation used an 6809 as host CPU, too, at least for the testing. The 6809 software that has originally been used is not available anymore, so I needed to find a way to quickly write some diagnosis and debugging routines. System09 is designed to run the FLEX operating system and there is a BASIC interpreter for Flex available, but somehow I did not like that too much. My first attempt to create a diagnostic monitor was to write a C program and compile it with gcc6809. This worked, but I did not like the turnaround times too much. Thus, I tried to find a Forth implementation for the 6809 that I could use as a diagnostics monitor. My search led me to Maisforth, which is a Forth implementation for a custom computer called the "mais kastje" which was built and used by a group of dutch Forth enthusiasts. I changed my System09 based 6809 system so that its memory map matches that of the "mais kastje" and hacked the Forth interpreter so that it properly initializes and uses the ACIA serial port in System09. Getting Maisforth to run was not too hard, but I certainly would not have finished that without the help of the excellent Logicport logic analyzer that I have bought earlier this year. Logicport is a PC based logic analyzer that retails at $389, with 34 channels, complex triggering and 500 Mhz sample rate. The Windows software is very usable, and it comes with interpreters for a few serial protocols (I2C, SPI, RS232) which helped me to track down my ACIA initialization problem in a short timespan. I'm a very happy customer of this and can recommend Logicport to anyone who can't afford a "real" logic analyzer. Now that I have Forth running on the FPGA, I am back to debugging the memory controller problems. I use Forth as a scriptable debug monitor, so in addition to examine and change memory contents, I can define symbolic addresses, write little test programs and thus automate debugging quite a bit. With Forth and the logic analyzer, I quickly found that my problem lies in how I set up the low and high byte select bits of the RAM chip, which has a 16 bit datapath.

netzhansa

Saturday, September 27, 2008

Revisiting SECD and the power of Lisp

Sunday, July 29, 2007

Back to SECD

Blog Archive