Sunday, August 19, 2018

Using the VMS PATCH utility to fix a VAX LISP issue

In my previous post about VAX LISP, I complained about a misfeature of the editor that prevented binding editor functions to the Ctrl-S and Ctrl-Q keys.  In the manual, it is stated that these keys cannot be bound due to "system restrictions". It appears, though, that it is a measure to prevent users from footing themselves in the foot because Ctrl-S and Ctrl-Q could be sent by terminals to implement software flow control.  Software flow control is optional, though; it can be switched on and off for each terminal line and it is not needed for virtual terminal connections (Telnet, DECNET) as flow control is implemented by the network layer and networked terminal sessions usually have enough buffer space.  Thus, reserving Ctrl-S and Ctrl-Q really only makes sense for asynchronous terminal lines that require software flow control, and there is no reason to reserve these characters for all terminal sessions, regardless of technology.
I am used to having Ctrl-S and Ctrl-Q be bound to certain functions from GNU Emacs, and I want to use the same bindings with VAX LISP as they've become part of my muscle memory.  Thus, I wanted to remove the explicit checks for these keys in the EDITOR::BIND-COMMAND and BIND-KEYBOARD-FUNCTION functions.  Lacking the source code to VAX LISP, I could not just remove the checks and recompile, so I needed a different way.
This is where the VMS PATCH utility comes into play.  It is a standard tool that is used to modify program executables, usually to remove bugs without requiring a full reinstallation of the software package that needs to be fixed.

Finding what to patch

In order to remove the checks for Ctrl-S/Q in EDITOR::BIND-COMMAND and BIND-KEYBOARD-FUNCTION, I disassembled the two function using the standard VAX LISP function DISASSEMBLE.  VAX assembler code is easy to read, so it was rather straightforward to find out how the checks work.  Here is the relevant extract of the disassembly output from the VAX LISP BIND-KEYBOARD-FUNCTION function:
000FC  MOVQ   BIND, -(STACK)
000FF  MOVL   -10(FRAME), -(STACK)
00103  MOVZBL #89, -(STACK)
00107  MOVZBL #99, -(STACK)
0010B  MOVAB  18(STACK), FRAME
0010F  MOVL   'CHAR/=, LEX
00113  MOVL   4(LEX), FUNC
00117  JMP    @4(FUNC)
0011A  CMPL   RES, SV
0011D  BEQL   121
0011F  BRB    145
00121  MOVL   FRAME, -(STACK)
00124  MOVL   #A28, -(STACK)
0012B  MOVQ   BIND, -(STACK)
0012E  MOVL   '"The character argument must be a control character (other than~%~
            CTRL/S or CTRL/Q): ~S", -(STACK)
Apparently, #89 and #99 refer to Ctrl-S and Ctrl-Q, and the CHAR/= function is called to check whether the argument is one of those characters.
The easiest way to get rid of the check seemed to me to replace the BEQL 121 instruction, which jumps to printing the error message and returning from the function, to BRB 145, which is the jump to the rest of the function, which performs the actual binding.

Locating the code

To make the actual modification using the PATCH utility, I now needed to know where in the executable file these instructions can be found - the addresses in the DISASSEMBLE output are only relative to the start of the function, and the LISP.EXE executable file does not include any symbols.  I thus picked out a few instructions that seemed characteristic and assembled them to machine code using the VMS macroassembler.  Then, I searched for this instruction sequence in LISP.EXE using a trivial perl program, which gave me the offset of the function that I needed to update.
Knowing the offsets of the locations in the binary file that I needed to change, I could now go about and use the PATCH utility in interactive mode to devise the actual patch.  The PATCH manual suggests that the VMS standard debugger would be used for patch development, but as I did not have the symbol table for LISP.EXE, using PATCH itself for development was more appropriate as it can work in /ABSOLUTE mode which uses file offsets for loctions.
First, I needed to find and disassemble the code using the locations I had devised.  As I only had the offset of the MOVZBL #89, -(STACK) instruction and VAX instructions have variable lengths, I needed to probe a bunch of addresses before I could come up with this:
PATCH>e/i 00403E68:00403E9A
00403E68:  MOVQ    R10,-(R7)
00403E6B:  MOVL    B^0F0(R8),-(R7)
00403E6F:  MOVZBL  #089,-(R7)
00403E73:  MOVZBL  #099,-(R7)
00403E77:  MOVAB   B^18(R7),R8
00403E7B:  MOVL    B^30(R11),R9
00403E7F:  MOVL    B^04(R9),R11
00403E83:  JMP     @B^04(R11)
00403E86:  CMPL    R6,AP
00403E89:  BEQL    00403E8D
00403E8B:  BRB     00403EB1
00403E8D:  MOVL    R8,-(R7)
00403E90:  MOVL    #00000A28,-(R7)
00403E97:  MOVQ    R10,-(R7)
00403E9A:  MOVL    B^38(R11),-(R7)
This is exactly the same instruction sequence that I disassembled using VAX LISP above. As you can see, it is significantly harder to read the disassembly produced by PATCH as it does not show symbolic register names, and it also does not translate data pointers like in the last instruction - VAX LISP showed the error message where PATCH prints B^38(R11).  The addresses in the disassembly, however, are absolute offsets into the LISP.EXE file, and that was what I needed.

Creating the actual patch

Equipped with this information, I could create the actual patch to change VAX LISP so that it no longer disallows binding of Ctrl-S and Ctrl-Q, which I reproduce below:
$ PATCH/ABSOLUTE LISP$SYSTEM:LISP.EXE/OUTPUT=LISP$SYSTEM:
REPLACE /INSTRUCTION
^X00403E5A
'BLEQ ^X00403E5E'
EXIT
'BLEQ ^X00403EB1'
EXIT
REPLACE /INSTRUCTION
^X002F3246
'BEQL ^X002F324A'
EXIT
'BRB ^X002F3265'
EXIT
UPDATE
EXIT
I used the REPLACE/INSTRUCTION command which requires the specification of an address, the current instruction(s) at the location and the new instruction(s) to put at the location.  PATCH would only perform the replacement if the instructions currently in the file match what is specified, otherwise it would abort with an error.  This is useful to prevent the patch from patching the wrong file or the wrong version of a file.

Better patches

As I did not have the source code or the symbol table for LISP.EXE when I developed this patch, I needed to resort to patching in absolute mode which is kind of brittle.  PATCH can also work in symbolic mode if a symbol table is available for the executable file, which allows patches to work even if the executable has been linked differently than the executable used to develop the patch (i.e. you could write a patch for a library function that'd work for executables that were linked statically against the library).  Also, PATCH allows patches to insert more instructions than were originally present in the executable.  It does this by adding additional disk space to the executable to hold the new, longer code, and then patch the required jump instructions to the patch area into the original location.

Closing thoughts

It is quite nice to be able to fix a bug in or remove a feature from a commercial program without having the source code, just by working with a bit of disassembly and on-board utilities. Back in the day, it was also useful to be able to distribute fixes to customers in the form of tiny patches. These could even be sent as a printed letter and typed in by the customer on a terminal, with reasonable effort and good chances of success.

Wednesday, August 15, 2018

Evaluating VAX LISP 3.1

The back story

Back in the day, when there still existed a number of different computer architectures and operating systems, I was a huge fan of the VAX/VMS operating system. VMS was designed together with the VAX architecture, and it used specific mechanisms offered by the processor to implement its I/O, security and other services. As it was still common at the end of the 1970ies, portability was not a design objective, and computer companies often had both hardware and software teams working together to implement computing solutions, from hardware to operating systems, development environments and applications.
I became a VMS fan mostly for three reasons: VMS machines were popular in the early packet switched networks that I could access, their default installation often left accounts with default passwords open, and VMS has an excellent online help system which helped me to learn and program VMS without having access to printed manuals.
I also liked the VAX for its orthogonal architecture, which made it somewhat similar to my favorite 8 bit microprocessor at the time, the Zilog Z80.  Back then, being able to program a machine in assembly language was still somewhat desirable, and an orthogonal instruction set with many general purpose registers and fancy addressing modes allowing complex programs to be written in assembly language appealed to me, too.
Frankly, though, I never wrote a lot of assembly language code. Instead, C quickly became my language of choice even on VMS, and when g++ 1.42 became available on VMS, I mostly became a full  time C++ programmer.  This was when object oriented programming just became the fad, with magazines like the C/C++ Users Journal and the Journal of Object Oriented Programming being the prime sources of inspiration. Around that time, I read Grady Booch's excellent book Object-Oriented Analysis and Design with Applications, which explained OO programming using examples in a variety of then-popular OO languages like Smalltalk, C++ and Object Pascal. It also had a chapter on using CLOS as the example language. To me, CLOS was completely out of reach and locked in the ivory tower, requiring utterly expensive machines and software licenses that I'd never hope to have access to, so I read the chapter more as science fiction than as something of practical relevance.
Thus, I stuck with C++ and later Perl on VMS, which was amusing enough, also for being a rather exotic blend.

Around 2001, I was introduced to Lisp by the way of VMS. At the time, my programming gigs were all on Unix or Windows and my affection to VMS became part of my hobby as a collector of vintage computers.
At some point, I brought a MicroVAX with me to a vintage computer convention, and a friend had his MacIvory Symbolics Lisp Machine on display. As the two systems were from the same era, we thought that we should try to bring up some network between them.  My VAX ran a fairly recent version of VMS, though, and when we tried to establish a DECNET connection, the Lisp machine would signal an error and put us into a debugger. What got me hooked on Lisp was that my friend could display the source code of the failing DECNET device driver, figure out that the problem was the too-new DECNET version number that the VAX had reported in the connection setup packet, fix the Lisp code in the DECNET driver of the Lisp Machine and then continue the file transfer operation that had signaled the problem.  I certainly understood that this would theoretically be possible back then, but I had never seen a system that would have that level of integration of operating system and programming language. I was intrigued.
Fortunately, machines had become much cheaper and faster in the decade since I had first read about CLOS, and the open source movement had made Unix and Common Lisp affordable. I went for CMUCL and eventually became a full time Common Lisp developer for several years.

The holy combination

I had been looking for VAX LISP for some time, as LISP and VMS sounded like a blend of very likeable technologies, so I was very happy when someone found the installation tapes and made them available through the comp.os.vms news group in 2017.  I played around with it a little, but unfortunately the manuals that were publicly available at the time were only for VAX LISP V2.1, and V3.1 was supposedly different enough to make me want the updated manual set.  It took a bit of research and emailing until I got in touch with the former architect of VAX LISP at digital equipment, and he was kind enough to borrow me the complete manual set for version 3.0, which I scanned and OCRed so that it is now available on bitsavers.  Thus, i felt equipped to give VAX LISP a test drive as a vacation project.

About VAX LISP

VAX LISP is an implementation of Common Lisp, but it implements it as described in the first version of “Common Lisp the Language” (CLtL1). This means that it feels like Common Lisp as it was standardized nine years later, but with several important features missing. Among these missing features are:
  • LOOP, the general purpose, well-hated macro for iteration.  You get simple LOOP, DOLIST, DOTIMES and of course the dreadful DO and DO* couple.
  • The Common Lisp Object System (CLOS), although there is a version of PCL coming with VAX LISP in the examples directory.  There also is a version of the Flavors object system included.
  • The condition system. VAX LISP allows you to bind the *UNIVERSAL-ERROR-HANDLER* special variable to a function that’d be called when ERROR, CERROR or WARN would be called and that could take a non-local jump using THROW.
  • DESTRUCTURING-BIND and lambda lists with &KEY, &REST etc.  Lambda lists like we know them today are only available in DEFUN
Furthermore, the package system is not yet where it is today, there are no generalized sequences and there are probably a several other bits and pieces that are missing.

Notable features of VAX LISP are:
  • Integrated, user-extensible editor with both EDT (VMS’s standard editor) and Emacs emulation modes, both with a terminal and a DECwindows backend. There is extensive documentation on programming the editor in LISP and I’ve added several features myself, which was a joy.
  • Interface to the standard DECwindows widget set. This required the implementation of a multithreading system so that DECwindows code, written in C, could be called from LISP and DECwindows handlers could be written in LISP and called from C.
  • A foreign function interface implementing the VMS procedure calling standard, making it possible to call VMS system services and library routines as well as code written in other VMS languages.
Additionally, it comes with a good bunch of example code that is quite helpful when trying to figure out how to use the language, although the quality of the samples vary.

A Project to Try It Out

I wanted to get a feel of how programming in VAX LISP works, so I decided to write a simple http server in it.  This would give me the opportunity to try out the system interface as I would have to call into the TCP/IP subsystem.  As I wanted to evaluate VAX LISP during a vacation trip, I used SIMH on my MacBook instead of a real VAX, running VMS V5.5-2.

The Development Experience

Initially, I had hoped that I could use the graphical version of VAX LISP for development, but I did not manage to set up a fully functional DECwindows workstation environment and fell back to the terminal version.  The downside of this is that I had to alternate between working in the editor and in the REPL, as the editor does not have a convenient way to evaluate expressions and display the results built-in.  It would probably not be very hard to add that, but I wanted to spend most of my time working on the web server.

The REPL

The terminal REPL of VAX LISP does not have a lot of convenience features.  LISP expressions are entered using the standard VMS line editor, which means that no completion is available and all you get is a way to recall the previously entered line.  Annoyingly, if a line is longer than whatever your current terminal width is, there is no way to edit text before the line break.  Obviously, no work has been put into making the terminal REPL nice.

The only notable convenience is that one can bind keys (ASCII codes, really) to call LISP functions, and one use case suggested in the manual is to bind, say, Ctrl-E to call the 'ED function.  That way, switching to the editor from the REPL is very quick and I use that feature all the time.  Beyond that, the REPL experience is rather bare-bones and only slightly better than, say, a bare SBCL or Clozure CL REPL because of VMS's standard line-editing facility, which gives you a one-line recall buffer.

The Editor

VAX LISP comes with an editor that is written in LISP itself and that is extensively documented so that users can extend and adapt it as desired.  By default, it tries to emulate EDT, which is the original VMS editor, but EDT is rather quirky and pretty much unusable without a DEC keyboard as it relies on certain function keys that are not present on modern keyboard layouts.  The VAX LISP editor also supports an Emacs compatibility mode, because even back in the late 1980ies, most LISP programmers would be used to Emacs and not be willing to use another editor.  In Emacs emulation mode, I found it rather easy to write and edit code and I even did some substantial refactorings of my http server code base with it without too much grief.  I spent a bit of time of adding a few features that I am used to from GNU Emacs.  Writing editor extensions in the same language and environment as the actual project was rather joyful.
The documentation to the editor is extensive, but it leaves a number of questions unanswered.  I also find its organization a bit odd, with a whole chapter on editor concepts that comes after all the nitty gritty details of editor customization explanations as separate "Part II" of the manual.  I'd expect the description of editor concepts to be coming first, before applying the concepts would be explained.  Also, the explanation how editor major and minor modes work is rather fuzzy.
One annoying misfeature of the editor is that it explicitly disallows binding Ctrl-S and Ctrl-Q, and the manual claims that this is due to "system restrictions".  While it is true that Ctrl-S and Ctrl-Q were used to implement software flow control on terminal connections that lacked hardware flow control, software flow control is an optional feature of the VMS terminal driver and can easily be switched off if out-of-band flow control is available for a given terminal connection.  This would be easy to fix if one hard the VAX LISP source code, but that seems to be lost, unfortunately.
Maybe not so unexpectedly for an editor written in the 1980ies, the VAX LISP editor does not have an "undo" facility.  This dampened my enthusiasm when I started writing code, but in practice, it has not hurt me too much.  When I made larger changes, I saved my buffers beforehand, and as the VMS file system automatically keeps old versions of files until they're purged, I could go back to the previous state if I wanted to roll back.

VMS and the Internet Protocols

My romanticized memories of VMS date back from a time when there was no Internet, and the Internet Protocols (IP) were just one set of protocols that were used in global networking.  DEC had its own protocol suite, DECNET, and was committed to making DECNET OSI compatible, as they believed that OSI would become the dominant wide-area networking protocol suite of the future.  DECNET was integrated into VMS quite nicely, and it was easy to implement peer to peer connections from any language, even from command line scripts (which enabled the creation of early networked malware like WANK).  Implementations of other networking protocols like RSCS (for BITNET/EARN) and IP (for Arpanet) were available for VMS, but DEC just treated them as secondary networking systems that did not deserve a lot of attention.
To implement my http server in VAX LISP, I needed an implementation of IP, so I installed "DEC TCP/IP Services for VMS" (traditionally referred to as UCX, "VMS/Ultrix Connection") as an add-on product to my base VMS installation.  UCX implements a standard set of IP clients and servers like FTP, Telnet, NFS etc. and supports two programming interfaces, one similar to the BSD socket calls meant to be used from programs written in C, and another one based on VMS system services, meant for programs written in other languages.  Boldly, I decided that I wanted to use the System Services interface, also in the hope that it would be as nice and convenient as the DECNET programming interface.  I was quite wrong.
Apparently, the System Services interface to UCX is a layer on top of the socket interface, so you get the "best" of both worlds:  BSD sockets with its quirky sequence of required calls to socket(), listen() and accept(), asking you to specify the length of the "listen queue" and to set the "reuse address" option in your server code because, well, because that is how it wants its chicken to be waved to work well.  And then the beautiful VMS $QIO interface with its opaque "function code" and the parameters named P1 to P6 which would be used differently based on the function code supplied.
DEC, in its lack of enthusiasm for supporting IP on VMS, really made sure that the combination of $QIO and sockets would be as ugly as possible.

So, the required sequence to listen to incoming TCP connections and then accept them requires the program to:
  • Assign a channel to the UCX network device
  • Set the channel to be a TCP/IP socket
  • Enable the REUSEADDR option, which is already inconvenient in C with the setsockopt interface, but with $QIO you have to create an additional descriptor to point to the socket option
  • Set the local address of the socket, again using an additional descriptor to point to the local address
  • Set the socket to listen, using another $QIO invocation, now passing the maximum queue length
These steps directly correspond to what you'd have to do when using the BSD socket interface, only that with the System Services interface, you need to call $QIO all the time.  What is actually done underneath is then determined by which of the P1 to P6 parameters you set and how, so if you call $QIO with the IO$_SETMODE function code and specify a non-zero P5 parameter, you are calling setsockopt() and if you specify a non-zero P3 parameter, you're calling listen() and so on.
Of course, there are few things that can't be syntactically improved with a bunch of LISP macros and in my http server, the whole sequence does not look so bad, but seriously:  This is one of the worst APIs that I have seen in my life and the experience put quite a damper on my romanticized view on VMS and DEC in general.

Serving Files

A http server commonly serves content from files, so I wanted mine to do so as well so that it could deliver a HTML page and some additional resources to the client.  On VMS, files are not the simple byte streams that we think about when talking about files nowadays.  Rather, the Record Managment System (RMS) maintains the file structure for applications and supports various file organizations like line-sequential and indexed with fixed and varying record lengths etc.  It is still possible to read the raw data stored in a file, but that will include the bookkeeping information of RMS.  This means that the http server also needs to know something about the structure of the file being served and possibly perform some conversion instead of just forwarding the file data to the client directly.

Binary Files

Serving binary files like images requires the http server to just read the files in a block-wise fashion, but as VAX LISP does not support block-wise reading, this needs to be done with direct calls to RMS.  This gave me an opportunity to learn working with RMS, as its API is quite different from, say, the POSIX file API with its simple open()/read()/write()/close() interface.
As RMS supports different file organizations, there are many parameters that an application program needs to specify when it wants to operate on a file.  Instead of putting all these different parameters into many functions, RMS uses in-memory structures (blocks) that describe the operations and parameters.  These blocks could be defined using the facilities of the application programming language (i.e. struct definitions and numeric constant macros in C), or they could be described using the File Definition Language (FDL).  FDL is a domain specific language that describes file and record structures.  Applications could invoke the FDL interpreter to turn textual file structure descriptions into binary RMS blocks that it could then use for accessing files.
As a result, the RMS API entry points only require specifying pointers to the blocks that describe the operation, which makes the call sites look rather clean.  As, the entry points themselves clearly describe the operation, this makes the LISP code to serve a binary file using RMS calls rather clean-looking.

Text Files

Serving text files like HTML or CSS files is rather more challenging in VAX LISP, considering that the on-disk data of RMS text files cannot simply be copied to the HTTP client.  This is because, depending on the file's RMS record format, text files are stored with the length of each line stored in the file in binary form before the actual characters in the line.  For that reason, I chose to use the text file I/O mechanism of VAX LISP to read and provide for some explicit buffering to avoid writing each line to the client in a separate $QIO call.  I am not sure whether that strategy really provides performance benefits or whether UCX's own buffering would be as good.  The resulting code does not really look terribly complex, though.

Roundup

In the end, I've now implemented a basic http server in VAX LISP, called rasselbock.  The development experience was quite good, and I found it to be fun to write some code in a Lisp that felt like the Common Lisp I've been using professionally for several years, but that lacks several of the features that I took for granted.  I think that in the end, VAX LISP provides enough tooling for serious programming work.  It was also an interesting experience to write some code in a language for which the Internet offers no support at all.  All I had was the manuals, some sample code and DESCRIBE, and that was sufficient to create something which (kind of) works.

My vacation comes to an end so I'm going to stop at this point for now.  In the future, I'm going to put rasselbock onto a real VAX, and I also want to spend some time working with the DECwindows version of VAX LISP as that will probably be a much nicer development experience.  Furthermore, I would like to try converting rasselbock to run asynchronously, as that one of the cool things in VMS is that all its system services can be used asynchronously.

I need to thank Walter van Roggen who was kind enough to borrow me the VAX LISP V3.0 manual set for scanning, and Michael Kukat who parted with his remaining VAX hardware so that I have a real machine to run VAX LISP and rasselbock on.  I'd also like to thank Richard Wells who pointed me at the vectorized version of the digital logo which led me to become a producer of t-shirts.

If you made it this far, you might also be interested in my follow-on post, in which I describe how I patched the editor so that I could bind Ctrl-S and Ctrl-Q.