Sunday, December 11, 2011

Learning Ruby, and Ruby vs. Lisp

The company I work for has a lot of legacy Ruby code, and as Ruby has become kind of a mainstream language, I decided to get a book about it and learn how it works.

As my learning resource, I chose The Ruby Programming Language by David Flanagan and Yukihiro Matsumoto, as it receives great customer reviews, covers Ruby 1.8.7 and 1.9, and is authoritative because the language creator is one of the authors.

The book makes a good read in general. There are plenty of code examples, but not so many that they obscure the prose. What I found interesting at first, and annoying later, was the frequent use of words like "complex", "complicated", "confusing", "surprising" or "advanced" to describe features of the language. I'd rather decide for myself whether such attributes apply to something that I've just learned.

Having spent so much time with Common Lisp, I almost forgot that programming languages usually evolve over the years. Ruby is no exception, and the fact that there are significant differences between Ruby 1.8.7 and Ruby 1.9 kind of bothers me - I'll probably never write code in Ruby 1.8.7, but the differences between the two versions seem to be rather subtle and I'm curious to see how much that is going to be a bother in the future, working with legacy code.

The common theme for Ruby seems to be succinctness. This comes at the expense of making the syntax rather complex, with several special case rules required to resolve ambiguities. I don't have enough practice to judge whether this is a problem, but from the book, it seems there are quite a few things to remember.

It seems that Ruby started off as a purely object oriented language and only later discovered that function-oriented programming is nice, too. The deep roots of object orientation made it rather hard to actually get free functions (which are not member functions of an object) integrated. Contrary to what I am used to, member functions are not a special case of free functions, but rather something quite different. It requires explicit conversion steps to convert a member function into a free function (called procs in Ruby), and invocation syntax is also different between the two. Again, the description may sound worse than it is in practice.
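To make the difference concrete, here is a minimal sketch of the conversion and the two invocation styles. The Greeter class and its greet method are made up for illustration; Object#method, Method#to_proc and Proc#call are the standard Ruby calls that I believe are meant here.

```ruby
# Hypothetical example class; only the conversion calls are standard Ruby.
class Greeter
  def initialize(greeting)
    @greeting = greeting
  end

  # An ordinary method ("member function"), always bound to a receiver.
  def greet(name)
    "#{@greeting}, #{name}!"
  end
end

greeter = Greeter.new("Hello")

# Plain method invocation: receiver, dot, name, arguments.
direct = greeter.greet("Matz")

# Explicit conversion: Object#method returns a Method object,
# and Method#to_proc turns that into a Proc.
greet_proc = greeter.method(:greet).to_proc

# A proc is not invoked like a method; it needs call (or the [] shorthand).
via_call = greet_proc.call("Matz")
via_brackets = greet_proc["Matz"]

# With &, the proc can be passed where a block is expected.
greetings = %w[Alice Bob].map(&greet_proc)
```

All three invocations produce the same string, but only the first one looks like an ordinary function call.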

What I really liked was the generalization of code blocks into fibers. Ruby does not have full coroutines, but the restricted form that is available is generalized well and seems like it could be useful for building pretty wild asynchronous systems. Also, it is nice that the bindings of closures can be accessed.
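As a small sketch of both features, assuming Ruby 1.9 or later: the first part uses a fiber as a generator, the second reads a variable out of a closure through its binding. The names counter, make_counter and count are invented for this example; Fiber.new, Fiber.yield, Fiber#resume and Proc#binding are core Ruby.

```ruby
# A fiber runs its block until Fiber.yield and continues from
# that point on the next resume.
counter = Fiber.new do
  n = 0
  loop do
    Fiber.yield n   # hand n to the caller and suspend here
    n += 1
  end
end

first_three = 3.times.map { counter.resume }

# A closure keeps its defining scope alive, and Proc#binding
# gives access to the variables captured in it.
def make_counter
  count = 10
  lambda { count += 1 }
end

increment = make_counter
increment.call
captured = increment.binding.eval("count")
```

Here first_three is [0, 1, 2], and captured is 11, the value of the closed-over variable after one call.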

But then, Ruby is an interpreted language and this fact is re-stated throughout the book. With Just In Time compilation, this could become a non-problem, but I'm not sure how well Ruby can be optimized due to its very dynamic nature. Just to see how fast it is compared to Common Lisp, I implemented the Sudoku solver from chapter 1 of the Ruby book in CL and gave the two implementations a puzzle to solve. It took the Ruby solver 0.890 CPU seconds (Ruby 1.9.2p290), whereas the Lisp solver (Clozure CL 1.7) used 0.087 CPU seconds to solve the puzzle. Ten times slower, whatever you'll make of that.

In the book, it is mentioned how little code the Sudoku solver actually uses. This is true, but then, the Lisp version is not longer. It does not seem as if adding syntax is actually the best way to add the possibility to write succinct programs to a language, and the price of the complex grammar is rather high.

Writing the CL solver, I found myself not writing tests again and then poking around in problems of my implementation without knowing what works and what does not. As I want to practice more TDD, I stepped back and added tests. This led me to solve a problem that I had with my previous attempts to practice TDD in Lisp - I do not want to export all the symbols that the tests exercise from the packages that I use, but I also don't want to import the unit testing library into my own library packages. Thus, I wrote a deftestpackage macro that creates a new package to contain the tests that I write and automatically imports all symbols from the package being tested. That way, I can easily keep tests and library source separate and don't need to qualify internal symbols in the tests.

My overall takeaway on Ruby is this: Ruby seems to be a language that has grown from being purely object oriented to supporting functional programming. That growth was not completely natural, and it seems that if Ruby is not used as a purely object oriented language, the syntax becomes rather messy and hard to grasp. This is similar to C++, which in its first versions was relatively nice (I hear you "ow"!), but has grown into an incomprehensible mess once people recognized how templates can be abused for metaprogramming.

I can see the appeal of Ruby, but there seems to be little it has to offer me that Common Lisp cannot provide. The lack of a formal specification and the ugly grammar put me off. Then again, I'm pretty sure that Ruby is more enjoyable than many other popular languages. I'm looking forward to seeing my theoretical conceptions shaken by actual practice.

Monday, December 5, 2011

Global Day of Code Retreat with Lisp

Last Saturday, I attended the Global Day of Code Retreat. I found out about the event in my Twitter stream, and when I signed up for it, I did not have much of an idea of what it would be about, except code. With some research, I found that it would be a day of using Test Driven Development practices to improve coding style and quality. Fair enough, I thought, got myself a book on TDD (which I have always wanted to do more of) and reserved my seat, which was free.

The event was hosted by immobilienscout24 and sponsored by Nokia. Sponsoring included catering, and as the day started at 8:00am, the availability of a breakfast was very welcome. The crowd consisted of some 30 hackers. An informal poll showed that most of them were using either Java or Ruby as their first language, and a few were into JavaScript. One guy said that C++ was his primary language, and I was the only Common Lisp hacker.

The format of the all-day event consisted of six sessions of 45 minutes. In every session, people would form new pairs and implement parts of Conway's Game of Life using TDD practices. For each session, the pair chooses a specific challenge or goal to solve. At the end of the session, every pair deletes their code, discusses what they learned and then joins the group to share their thoughts.

The focus of the goals and challenges was on TDD practices. For about half of the participants, TDD was already part of their daily work routine. The other half had heard of TDD, like myself, and joined to learn about it at the event. For me, this was one of the bigger takeaways. I found it very helpful to pair with people who practice TDD in order to learn how they'd go about writing a test before they'd write an implementation. It seems that the TDD school favors a very small test granularity. As was explained by the organizer of the meeting, one would initially write very small tests that exercise very small aspects of the production code, and then build on that.
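To illustrate the granularity, here is a sketch of what such small tests might look like for the Game of Life rules. It is written in Ruby, which I did not use in any session, and none of it is code from the event: alive_next?, neighbor_count and check are hypothetical helpers, and check stands in for a real unit testing library.

```ruby
require "set"

# Hypothetical rule function: a cell lives in the next generation if it
# has exactly three living neighbors, or is alive and has exactly two.
def alive_next?(alive, neighbors)
  neighbors == 3 || (alive && neighbors == 2)
end

# Hypothetical helper: counts the living cells among the eight neighbors
# of a cell. Cells are [x, y] arrays stored in a Set.
def neighbor_count(living, cell)
  x, y = cell
  offsets = [-1, 0, 1]
  offsets.product(offsets).count do |dx, dy|
    !(dx == 0 && dy == 0) && living.include?([x + dx, y + dy])
  end
end

# Stand-in for a test library: fail loudly if the condition is false.
def check(description, condition)
  raise "failed: #{description}" unless condition
end

# Each test exercises one very small aspect of the rules.
check("a lonely cell dies", !alive_next?(true, 1))
check("a cell with two neighbors survives", alive_next?(true, 2))
check("a dead cell with two neighbors stays dead", !alive_next?(false, 2))
check("a dead cell with three neighbors is born", alive_next?(false, 3))
check("an overcrowded cell dies", !alive_next?(true, 4))

# A horizontal row of three cells (a "blinker").
blinker = Set[[0, 1], [1, 1], [2, 1]]
check("the middle cell has two neighbors", neighbor_count(blinker, [1, 1]) == 2)
check("the cell above the middle has three", neighbor_count(blinker, [1, 0]) == 3)
```

In the sessions, one would write one such check, watch it fail, add just enough code to make it pass, and only then move on to the next.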

In the six sessions, I used Java, JavaScript and Common Lisp. For Java, the organizers had come up with a skeleton project that included a unit testing environment (JUnit?). For JavaScript, I had prepared a simple browser based environment with QUnit; for Common Lisp, I used Clozure CL and the simple unit-test library that I have used in the past.

Here are some observations:

The Java dudes travel with heavy baggage

IntelliJ was the most popular IDE among the Java folks that I talked to. I am split between admiration and disgust when it comes to what seems to be common practice in the Java world. I'm used to writing code by thinking about what I want to write, then typing the stuff that I've thought about. With Java and IntelliJ, the process is more interactive with the IDE, i.e. the programmer types a few characters, the IDE displays some menu to choose from or automatically adds code and so on. This is nice when the IDE automatically recognizes, say, that you are using the name of a class that has not yet been defined or imported and then offers either to import a package that defines a class of the name that you've typed, or to create a class skeleton (including all the boilerplate and ceremony that Java wants) for you.

While this appears to be convenient, it also does not always work. In both of my Java sessions, the IDE got confused in some way or another, which was not fatal, but still annoyed me. Also, the static nature of Java slowed us down. For one, even though Java does have all the nice data structures (we wanted to use a set of coordinates), we wasted a lot of the short session time converting data types and supplying boilerplate that allowed us to actually put coordinate objects into a set. Also, file based compilation took time - not minutes, but seconds. I was assured that one would use libraries and code generators in production code and that desktop machines were faster at compiling, but I still can't really relate Java to Agile, in the literal sense of the word.

Also, the use of code generators in IntelliJ makes me wonder how one maintains such code. How can one actually distinguish between what was carefully crafted and what was pasted, from templates, by the IDE? In my eyes, this is like copy and paste programming with the copy step optimized into templates. I'm not the first to say that and it does not come as a surprise either, but it was an interesting experience for me nevertheless.

Pair programming is fun

Being a remote consultant, I rarely have the chance to interact over code with other programmers. This was something I found enjoyable to practice, even in languages and environments that I'm not familiar with. It is amazing how vastly different the approaches to implementing a relatively simple thing can be, and compared to code reviews - in particular if they're done by email - it is much easier to make suggestions in a constructive way. In that respect, writing code that is supposed to be deleted also helps to concentrate on the essence, as no sense of code ownership is developed.

I found the pair programming experience significantly impacted by the fact that programmers use different coding environments. Working on an unfamiliar keyboard and with an unfamiliar IDE is a real productivity killer. I'd hope that in a team that pairs regularly, the work environments are more standardized than they were at this event. I was using Emacs for my Lisp and JavaScript sessions, and my partners had a hard time getting along. What I found rather interesting was that the guys who wrote JavaScript in my Emacs always mimicked what their IDE would do for them. Rather than writing "if (foo) { doSomething()", they'd type "if (foo) {}", then navigate into the braces with the cursor to code along. This is kind of curious because it seems that the balancing of parentheses, brackets and braces is much more of a chore in other languages, to the point where people slow themselves down a lot if not aided by an IDE to do the balancing for them. We Lispers, with only the parentheses to keep track of, have a much easier life, in particular with Emacs doing the indentation for us.

Is TDD the kool aid?

In some sense, I am sceptical about TDD. Testing is a great idea, and doing it in a disciplined fashion certainly helps to write better quality software. Also, automated tests are really the best way to prepare software for change. But, and this is where this Saturday could not convince me, I don't believe that spending time on writing very fine grained unit tests for every aspect of a program helps to prepare it better for refactoring and change. I think that testing against external requirements is the real key to writing programs that can be changed in the face of changing requirements, and that it should be possible to relate every test to some requirement. I must admit that one Saturday of TDD does not give me sufficient experience to judge, though.

Common Lisp and TDD

Some of the TDD discipline is probably owed to the development cycle in statically compiled languages. Where we Common Lispers have a rather incremental style and do our testing interactively in the REPL, developers in languages like Java or C++ write larger chunks of code before they plug them together to do something meaningful. Such environments give the developer less insight into the run-time behavior of a running system, and tests are a way to make sure that more of the interactions of the system components are actually exercised.

I am not claiming that an interactive development environment makes testing less useful. In such an environment, though, it is common to write and test a function in small, iterative cycles. Furthermore, through the use of a tracing facility, the dynamic behavior of a system can be observed at any time, without the need to touch or recompile the code.

In any case, this was a very nice experience and I'll go to a Code Retreat again, if I can. For that, I'll probably prepare myself for Java tools better in order to get more out of pair programming.