Unit tests with mockups in Lisp

One of the bigger practical problems with unit testing is isolating the test coverage. Say, you want to test a piece of code from the middle (business) layer. Let’s assume further the piece of code under consideration makes some calls to lower level code to retrieve some data. The problem of test coverage isolation is now that if you “simply” call your function, you are implicitly also testing the lower level code, which you shouldn’t: if that lower level code gets modified in an incorrect way, you would suddenly see your middle level code fail although there was no change made to it. Let’s explore ways to avoid the problems in Common Lisp.

There is a very good reason why you would also want to have such test dependencies to ensure your middle level code still works if the lower level code is extended or modified. But that is no longer unit testing: you are then doing so-called integration tests which are related, but still different beasts.

Now, I was facing exactly the typical dreaded situation: I extended an application right above the database access layer, which had not seen many tests yet. And of course, I didn’t want to go the long way (which I will eventually have to go anyway) and set up a test database with test data, write set-up and tear-down code for the db etc. The typical suggestion (from the xUnit crowd) is to use mock objects, which finally brings us on topic. I was wondering if there are any frameworks for testing with mock objects in Lisp, but a quick search didn’t turn up any results (please correct me if I’ve missed something). After giving the issue a little thought, it seemed quite clear why there aren’t any: probably because it’s easy enough to use home-grown solutions such as mine. I’ll use xlunit as the test framework, but that’s not relevant. Let’s look at some sample code we’ll want to test:

[geshi lang=lisp]
(defun compare-data (data &optional connection)
  ;; CONNECTION is assumed to be optional; it is not used in this sample.
  (declare (ignorable connection))
  (let ((dbdata (retrieve-data (id data))))
    (when (equal (some-value data) (some-db-specific-value dbdata))
      t)))
[/geshi]

The issue is with retrieve-data, which is our interface to the lower-level database access. And note that we’ll use some special functions on the results, too, even if they may just be accessors.
Let’s assume the following test code:

[geshi lang=lisp]
(use-package :xlunit)

(defclass comp-data-tc (test-case)
  ((testdata :accessor testdata :initform (make-test-data))))

(def-test-method comp-data-test ((tc comp-data-tc))
  (let ((result (compare-data (testdata tc))))
    (assert-equal result t)))
[/geshi]

Now the trouble is: given the code as it is now, the only way to make the test pass is to make sure that make-test-data returns an object whose values match values in the database you’re going to use when compare-data gets called. You’re ultimately tying your test code (especially the result of make-test-data) to a particular state of a particular database, which is clearly unfortunate. To overcome that problem, we’ll use mock objects and mock functions. Let’s define a mock object class mock-data and a function mock-retrieve-data, which will hand out mock objects instead of database rows.

[geshi lang=lisp]
(defclass mock-data ()
  ((id :accessor id :initarg :id :initform 0)
   (val :accessor some-db-specific-value :initarg :val :initform "foo-0")))

(defun mock-retrieve-data (testcase)
  (format t "Establish mock for retrieve-data~%")
  ;; Return a closure over TESTCASE that looks up a mock object by its id.
  (lambda (id)
    (format t "mock retrieve-data id: ~A~%" id)
    (find-if #'(lambda (elem) (equal (id elem) id))
             (testdbdata testcase))))
[/geshi]

Why mock-retrieve-data returns a closure will become clear in a second, after we’ve answered the question of how this differently named object and function can be of any help. The answer lies in CL’s facility to assign different values (or, better said, definitions) to variables (or, better said, to the function slots of symbols). What we’ll do is simply assign the function we’ve just created as the definition to use whenever retrieve-data is called. This happens in the set-up code of the test case:

[geshi lang=lisp]
(defclass comp-data-tc (test-case)
  ((testdata :accessor testdata :initform (make-test-data))
   ;; Both slots start out as NIL so that they can be read before being set.
   (testdbdata :accessor testdbdata :initform nil)
   (func :accessor old-retrieve-func :initform nil)))

(defmethod set-up ((tc comp-data-tc))
  ;; set up some test data
  (dotimes (number 9)
    (push (make-instance 'mock-data
                         :id number
                         :val (format nil "value-~A" number))
          (testdbdata tc)))
  ;; establish our mock function
  (when (fboundp 'retrieve-data)
    (setf (old-retrieve-func tc) (fdefinition 'retrieve-data)))
  (setf (fdefinition 'retrieve-data) (mock-retrieve-data tc)))

(defmethod tear-down ((tc comp-data-tc))
  ;; After the test has run, re-establish the old definition
  (when (old-retrieve-func tc)
    (setf (fdefinition 'retrieve-data) (old-retrieve-func tc))))
[/geshi]

You can now see why mock-retrieve-data returns a closure: this way, we can hand the data we establish for the test case down to the mock function without resorting to global variables.

Now, the accessor fdefinition comes in extremely handy here: we use it to assign a different function definition to the symbol retrieve-data which will then be called during the unit-test of compare-data.

..Establish mock for retrieve-data
mock retrieve-data id: 0
F
Time: 0.013


There was 1 failure: ...

There is also symbol-function which could be applied similarly and which might be used to tackle macros and special operators. However, the nice picture isn’t as complete as one would like it: methods aren’t covered, for instance. And it probably also won’t work if the function to mock is used inside a macro. There are probably many more edge cases not covered by the simple approach outlined above. Perhaps lispers smarter than me have found easy solutions for these, too, in which case I would like to learn more about them.
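The set-up/tear-down dance can also be folded into a macro. What follows is only a minimal sketch: with-mocked-function, the stand-in retrieve-data and compare-id are hypothetical names made up for illustration and are not part of xlunit or of the application above. It uses unwind-protect so that the old definition is restored even if the body signals an error, and it declaims the mocked function notinline, on the assumption that a compiler might otherwise inline calls to the original definition.

```lisp
;; Hypothetical helper: temporarily replace the global function NAME.
(defmacro with-mocked-function ((name replacement) &body body)
  "Run BODY with the global function NAME temporarily replaced by REPLACEMENT."
  (let ((old (gensym "OLD"))
        (had (gensym "HAD")))
    `(let* ((,had (fboundp ',name))
            (,old (when ,had (fdefinition ',name))))
       (unwind-protect
            (progn
              (setf (fdefinition ',name) ,replacement)
              ,@body)
         ;; Always restore the previous state, even on a non-local exit.
         (if ,had
             (setf (fdefinition ',name) ,old)
             (fmakunbound ',name))))))

;; Stand-in for the real database access function (hypothetical).
(declaim (notinline retrieve-data))
(defun retrieve-data (id)
  (declare (ignore id))
  (error "No database available"))

;; Stand-in for the middle-layer code under test (hypothetical).
(defun compare-id (id)
  (equal id (getf (retrieve-data id) :id)))

;; Inside the macro the mock is called; outside, the original is back.
(with-mocked-function (retrieve-data (lambda (id) (list :id id)))
  (compare-id 3))   ; => T
```

Like the fdefinition approach itself, this only covers ordinary global functions; methods and macros remain out of reach.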


Running Linux on Dell systems

Dell has been selling Ubuntu-equipped systems for about a year now and seems to be quite happy with it. Whatever that effectively means, at least I can tell that I’m quite happy with Linux on Dell systems, too.

Over the last five years, I’ve been using Linux on a number of Dell systems. Under my personal control there have been three laptops (Dell C610, D610 and a Latitude 640) and a desktop (Optiplex 755), on which I have been running Debian Sarge, Ubuntu Dapper, Feisty and now Hardy. We also had several Dell servers at work running more or less smoothly with Debian (sarge, etch). Using Linux wasn’t always without problems: I had trouble with built-in modems, PCMCIA ISDN cards and ACPI/hibernation. For example, on my private Latitude 640, I have trouble suspending at all, because of the ipw3945 driver for the WLAN. But the important thing to note is that basically all problems were really small and never of a size that would have required me to use some other OS.

The only real issue is not with Dell per se, but more with my favourite OS, Debian: over the years, and especially with the ever-lasting sarge release, getting Debian to run on a recent system got more and more difficult. That’s the main reason why I’ve been using Ubuntu on all recent hardware I had contact with: it’s more or less (more so than less) a Debian system but does run on modern hardware. Main issues here were graphics adapters, sata/scsi hostadapters and network/wifi cards, or to put it otherwise: too old kernels, too old X.org. Both problem sources can simply be solved by using a recent version of Ubuntu. Sorry, Debian, but your release cycle is just too long to be acceptable. Granted, all these problems are mostly an issue when installing a new system, but it’s not always possible to plug in some old disc with a working version of Linux.


Escaping from sql-reader-syntax in CL-SQL

This post is mainly a reference post about a particular topic whose solution wasn’t immediately obvious to me from the CL-SQL docs. Using CL-SQL with (enable-sql-reader-syntax), I had written a routine that looks basically like this:

[geshi lang=lisp]
(defun data-by-some-criteria (criteria &key (dbspec +db-spec+) (dbtype +db-type+))
  (with-database (db dbspec :database-type dbtype :if-exists :old)
    (let (dbresult)
      (if criteria
          (setq dbresult
                (select 'some-model 'other-model
                        :where [and [= [some.criteria] criteria]
                                    [= [some.foreign_id] [other.id]]]
                        :order-by '([other.name] [some.foreign_id] [year] [some.name])
                        :database db))
          (setq dbresult
                (select 'some-model 'other-model
                        :where [and [null [some.criteria]]
                                    [= [some.foreign_id] [other.id]]]
                        :order-by '([other.name] [some.foreign_id] [year] [some.name])
                        :database db)))
      (when dbresult
        (loop for (some other) in dbresult
              collect some)))))
[/geshi]

This is ugly because the only difference between those two select statements is the check for the criteria, but I had no idea how to combine the two select statements into one, because it’s not possible to embed lisp code (apart from symbols) into an sql-expression (i.e. the type of arguments for :where or :order etc.). With the next requirement things would become far worse: The order-by statement needs to get more flexible so that it is possible to sort results by year first. Given the approach shown above this would result in at least four select statements, which is horrible. So, naturally I wanted a single select statement with programmatically obtained :where and :order-by sql expressions.

Step 1: It occurred to me that it should be possible to have the arguments in a variable and simply refer to the variable. E.g., using a simpler example:

[geshi lang=lisp]
(let (where-arg)
  (if (exact-comp-needed)
      (setq where-arg '[= [column] someval])
      (setq where-arg '[like [column] someval]))
  (select 'model :where where-arg))
[/geshi]

So I could now have my two different where-args and two different order-args and use a single select statement. Main problem solved.

Step 2: But for the :where arg in my original problem, only a small fraction of the sql-expression differs. So how do I avoid hard coding the entire value of where-arg? How can I combine some variable part of an sql-expression with some fixed parts? I.e, ultimately I want something like:

[geshi lang=lisp]
(let (comp-op where-arg)
  (if (exact-comp-needed)
      (setq comp-op '=)
      (setq comp-op 'like))
  (setq where-arg '[ <put comp-op here> [column1] someval])
  (select 'model :where where-arg))
[/geshi]

But with CL-SQL modifying the reader, there seems to be no way to make <put comp-op here> work. I didn’t know how to get the usual variable evaluation into the sql-expression, or how to escape from CL-SQL’s sql-reader-syntax to normal lisp evaluation.

Somewhere in the back of my head there was that itch that CL-SQL might offer some low-level access to sql expressions. And indeed it does. There are two useful functions, sql-expression and sql-operation. sql-operation “returns an SQL expression constructed from the supplied SQL operator or function operator and its arguments args” (from the cl-sql docs), and we can supply the operator and its arguments from lisp, which is exactly what I want.

Now, the nice thing is that it’s easy to mix partly handcrafted sql expressions with CL-SQL special sql syntax constructs that will be automatically handled by the reader (if you enable it only via enable-sql-reader-syntax, of course). I.e., for <put comp-op here> we can use sql-operation, but the rest stays essentially the same:

[geshi lang=lisp]
(let (where-arg)
  (if (exact-comp-needed)
      (setq where-arg (sql-operation '= [column1] someval))
      (setq where-arg (sql-operation 'like [column1] someval)))
  (select 'model 'other-model :where where-arg))
[/geshi]

Now, coming back to my original problem, based on this approach I can split out the common part of the :where and :order arguments and combine those with the varying parts as needed and hand them down to a single select statement. Problem solved.

Categories: Lisp

Lisp golf

Some time ago, I was looking at splitting text with Elisp, Perl, Ruby and Common Lisp. Yesterday, when I again had to do quite the same thing, it occurred to me that the Common Lisp solution was unnecessarily complex and long. I’m not a Perl guru, but I believe the following is probably hard to beat even with Perl:


CL-USER> (format t "~{<li>~A</li>~%~}" (cl-ppcre:split "\\|" "Kim Wilde|Transvision Vamp|Ideal|Siouxsie and the Banshees|Nena|Iggy Pop"))
<li>Kim Wilde</li>
<li>Transvision Vamp</li>
<li>Ideal</li>
<li>Siouxsie and the Banshees</li>
<li>Nena</li>
<li>Iggy Pop</li>
NIL

For the uninitiated, it’s not the cl-ppcre library which is interesting here but the built-in iteration facilities of format. See the Hyperspec on the control-flow features of format for details. Now, I usually tend to avoid the mini languages that come with Common Lisp like the one of format or loop when writing real programs, but when using Lisp as a glorified shell they come in very handy.
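To see the iteration directive on its own, without cl-ppcre in the picture, here is a small sketch using only standard format. The function names items-to-li and join-with-commas are made up for this illustration. ~{ and ~} loop over a list argument, and ~^ leaves the loop early once the list is exhausted, which is the usual trick for separators.

```lisp
;; Hypothetical helper: wrap every element of ITEMS in an <li> element.
(defun items-to-li (items)
  (format nil "~{<li>~A</li>~%~}" items))

;; Hypothetical helper: join ITEMS with ", ".
;; ~^ stops the iteration after the last element, so no trailing separator.
(defun join-with-commas (items)
  (format nil "~{~A~^, ~}" items))

(join-with-commas '("Nena" "Ideal" "Kim Wilde"))
;; => "Nena, Ideal, Kim Wilde"
```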

Categories: Lisp

Now I'm all over the shop ... or Converting from RCS to Mercurial

For a long time I hadn’t looked closer at those modern distributed revision control systems like Git, Darcs or Mercurial. This was mainly due to two facts: as I’m currently neither involved in any major open source project which uses these systems nor in a project at work which requires the facilities offered by such systems, and as there was no easy access to them in XEmacs, the more traditional systems like Subversion, CVS and RCS are fine for me. However, there was this nagging feeling that I might be missing something, and as revision systems always have been somewhat of a pet peeve of mine, I eventually spent some time reading up more on them. I’ve read quite a lot of discussions on the web, and gathered that Mercurial might be worth a closer look, as it claims to be quite easy to handle, comparably well documented and quite fast. And then finally I’ve read on xemacs-beta that the new vc package (in Pre-Release) would support Mercurial as well.

Well, that’s where I am now: I have several pieces of code lying around which I sometimes develop on my main machine and sometimes on my laptop when moving around. This is the scenario where a server-based approach to revision control is not what you want: you won’t be able to access your server while you’re on the road and hence you can’t commit. Now, with RCS that’s not a problem, as there is no server involved. But of course, since RCS is a file-system local revision system, syncing is a major problem and you have to go to great pains to ensure you don’t overwrite changes you made locally in between syncs. I hope that a distributed version control system like mercurial will solve the problem, as I no longer have to decide which version is the current head version, instead cherry-picking change sets at will.

But of course, for this to happen, I have to convert my RCS repositories to Mercurial. This doesn’t seem to be a common problem: there are a lot of tools for conversion from CVS or Subversion (see the Mercurial Wiki, e.g. Tailor), but not from RCS. I ended up following the instructions given in the TWiki Mercurial Contribution page. I have some minor corrections, though, so here we go:

-1. (Step 6 in TWiki docs) Ensure all your files are checked in to RCS. I won’t copy the advice from the TWiki page here, because I believe in meaningful commit messages and would urge you to do a manual check.

0. You’ll need cvs20hg and rcsparse which you will find here. You’ll need to have Python development libraries installed, i.e. Python.h. For Debian systems, this is in package python-dev. Installation is as simple as two “./setup install” as root which will install the relevant libraries and Python scripts.

1. Create a new directory for your new mercurial repository (named REPO-HG, replace that name):

    mkdir REPO-HG

2. Initialize the repository:

   hg init REPO-HG

3. (Step 4 in the TWiki document) Create a new copy of your old RCS repository (named REPO here, replace that with the name containing your old RCS files), add a CVSROOT and a config file (mistake one in the TWiki docs: As with all CVS data, the “config” file needs to go to CVSROOT, not to CVSROOT/..). Of course, if you’re no longer interested in your old data, you may omit the initial copy.

    mkdir tmp
    cp -ar REPO tmp/REPO-old
    mkdir tmp/CVSROOT
    touch tmp/CVSROOT/config

4. Inside your directory with the old RCS data, move everything out of the RCS subdirectories (mistake two in the TWiki docs: the double-quotes need to go before the asterisk):

   find tmp/REPO-old -type d -name RCS -prune | while read r; do mv -i "$r"/* "$r/.."; rmdir "$r"; done

5. Run cvs20hg to copy your old repository to mercurial. If you don’t follow the directory scheme shown below, you’ll end up with your new mercurial repository missing the initial letter of the name of all top-level files and directories.

   cvs20hg tmp/REPO-old `basename tmp/REPO-old` REPO-HG

6. Check that everything looks like you would expect:

   cd REPO-HG
   hg log

7. If you had files in your old directory not under version control that you’d like to keep, copy them over. This might be a good time to think about whether they are worth having under revision control. Afterwards throw away any old directory you no longer need (i.e., your original REPO, tmp/*).


Splitting the dark side ...

For a review, I needed to get the track list of a given CD. As the track list wasn’t available via CDDB, I went to some large online store and found the tracklist. I need to convert it to XML, though. The original data I fetched looks like so:

1. Fox In A Box
2. Loaded Heart
3. All Grown Up
4. Pleasure Unit
...

whereas I need:

<li id="1">Fox In A Box</li>
<li id="2">Loaded Heart</li>
<li id="3">All Grown Up</li>
...

After pasting the original data into my Emacs, writing out a simple file and using Perl for that simple transformation seemed just gross. In the past, I’ve been an Emacs hacker. But no more, or so it seems, since it took me nearly half an hour just to come up with this simple function:

(defun tracklist-to-li (beg end)
  "Generate a string with <li>-elements containing tracks.
Assumes that on every line of the region, a track position
and the track name are given."
  (interactive "r")
  (save-excursion
    (goto-char beg)
    (let ((result ""))
      ;; Group 1 is the track number, group 2 the track name.
      ;; The dot has to be escaped with a double backslash inside the string.
      (while (re-search-forward "^\\([0-9]+\\)\\.[ \t]+\\(.*\\)$" end t)
        (setq result
              (concat result "<li id=\""
                      (match-string 1) "\">"
                      (match-string 2) "</li>\n")))
      ;; Pass the result as an argument so that a % in a track name
      ;; is not taken for a format directive.
      (message "%s" result))))

What took the most time was that I had forgotten to escape the grouping parentheses in the regular expression and that it took me a little while to accept that there is really no \d shorthand in Emacs regexps (a plain [0-9] or the POSIX class [[:digit:]] has to do). Which probably means that I’ve been doing too much in Perl, sed and the like. OTOH, it just may hint at the horror of regular expression handling in Emacs. What I also dislike is that whenever you want some result in Emacs and see it, too, you have to invoke an interactive operation like message. Of course, there is IELM, but this doesn’t really help you for interactive functions operating on regions.

And five minutes later, I realize I need to convert some string like “The (International) Noise Conspiracy|The Hi-Fives|Elastica” into a similar list structure. With a simple cut & paste and roughly 30 seconds later, I have

[bauhaus->~]perl -e '$a="The (International) Noise Conspiracy|The Hi-Fives|Elastica"; @a=split(/\|/,$a); foreach $b (sort @a) { print "<li>$b</li>\n"; }'
<li>Elastica</li>
<li>The (International) Noise Conspiracy</li>
<li>The Hi-Fives</li>

Hmm. Perhaps I’ve come quite a long way on the dark side already … On the other hand, in Ruby, this is just as simple (I’m using irb, the interactive ruby shell here):

irb(main):008:0> a="The (International) Noise Conspiracy|The Hi-Fives|Elastica"
=> "The (International) Noise Conspiracy|The Hi-Fives|Elastica"
irb(main):009:0> a.split("|").each {|string|
irb(main):010:1* print "<li>"
irb(main):011:1> print string
irb(main):012:1> print "</li>\n"
irb(main):013:1> }
<li>The (International) Noise Conspiracy</li>
<li>The Hi-Fives</li>
<li>Elastica</li>
=> ["The (International) Noise Conspiracy", "The Hi-Fives", "Elastica"]

The difference here is the implicit array Ruby generates, which of course in Perl you could hide in the array position of the foreach loop. Note the annoying misfeature of irb to always show the prompt even when you’re still continuing your current input line.

In Common Lisp we can do it just as briefly:

CL-USER> (let* ((a "The (International) Noise Conspiracy|The Hi-Fives|Elastica")
                (splits (ppcre:split "\\|" a)))
           (loop for string in splits
                 do (format t "<li>~S</li>~%" string)))
<li>"The (International) Noise Conspiracy"</li>
<li>"The Hi-Fives"</li>
<li>"Elastica"</li>
NIL

The same thing here: The result of the split could have been easily embedded in the loop.
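For a fixed single-character separator the regular expression library isn’t strictly needed either. The following is a rough sketch in plain Common Lisp with the split embedded directly in the loop; split-on-char and bands-to-li are hypothetical helper names invented for this illustration, not functions from cl-ppcre or any other library.

```lisp
;; Hypothetical helper: split STRING at every occurrence of CHAR.
(defun split-on-char (char string)
  "Return a list of the substrings of STRING delimited by CHAR."
  (loop with start = 0
        for pos = (position char string :start start)
        collect (subseq string start pos)   ; POS is NIL for the last piece
        while pos
        do (setf start (1+ pos))))

;; Hypothetical helper: the split is used directly in the loop,
;; without an intermediate variable.
(defun bands-to-li (string)
  (with-output-to-string (out)
    (loop for band in (split-on-char #\| string)
          do (format out "<li>~A</li>~%" band))))

(split-on-char #\| "The Hi-Fives|Elastica")
;; => ("The Hi-Fives" "Elastica")
```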

The lesson, of course, is that in the end this example only serves to show that things that are easy to achieve in a high-level language are indeed easy to achieve. Or, to put it another way, the use of regular expressions no longer discriminates between programming languages.


Explicit resource handling in Lisp

Recently, there was a discussion about "the rise of functional languages" over on <a href="http://lwn.net/">Linux weekly news</a>, in which one of the participants claimed that one of the major reasons why nobody uses functional languages in industrial settings would be the lack of explicit resource handling (where a resource is some supposedly "alien" object in the system, say a database handle or something like that).  What he was referring to was the inability to run code on allocating/deallocating a resource. Of course, some people pointed him to various solutions; in particular, I referred to the usual WITH-* style macros in which one would nest the access to the data while at the same time hiding everything one would do on allocation/de-allocation. His reply went something along the lines that such objects may need to be long-lived (thus a WITH-macro is inappropriate) and that the only resort would be the garbage collector and that there simply is no way of running code at a guaranteed (de-allocation) time. I have to admit that I have no idea how I could code around that problem in Common Lisp (garbage collection isn't even a defined term in the ANSI specification of CL, and I'm very sure I haven't seen any mention of allocation/deallocation in it).
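For readers who haven’t met the idiom: a WITH-* macro is usually little more than a wrapper around unwind-protect. Here is a minimal sketch, in which acquire-handle, release-handle, with-handle and *handle-log* are hypothetical stand-ins for a real resource API and the “resource” is merely logged. It shows the de-allocation code running at a well-defined time, but only for resources whose lifetime matches a dynamic extent, which is exactly the limitation raised in the discussion.

```lisp
;; Hypothetical log recording when the pretend resource is opened and closed.
(defvar *handle-log* '())

(defun acquire-handle (name)
  (push (list :open name) *handle-log*)
  name)

(defun release-handle (handle)
  (push (list :close handle) *handle-log*))

(defmacro with-handle ((var name) &body body)
  "Bind VAR to a handle for NAME around BODY and always release it."
  `(let ((,var (acquire-handle ,name)))
     (unwind-protect
          (progn ,@body)
       ;; Runs on normal return as well as on any non-local exit.
       (release-handle ,var))))

;; Even though the body leaves via THROW, the handle is released.
(catch 'abort
  (with-handle (h "db")
    (throw 'abort h)))

(reverse *handle-log*)
;; => ((:OPEN "db") (:CLOSE "db"))
```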

Now, some months later, there is a discussion in comp.lang.lisp on the topic of “portable finalizers” and Rainer Joswig pointed to this chapter in the Lisp machine manual which talks about explicit resource handling on the Lisp machine. From the excerpt, I can’t judge whether resources are first-class CLOS objects and hence the functions to handle them are generic functions, but if so that would actually allow running code on deallocating a resource, of course at the price of having to handle allocation/deallocation manually. I really wonder if any of today’s CL implementations offers the same or at least similar functionality?

Categories: Lisp
