Apr 4

So, ELS 2011 is over. It was the first conference I attended that was aimed solely at Lisp programmers. Overall I am quite happy with it, although not all talks were of the same quality. In particular, I wasn’t too excited about any of the three keynotes, although all had interesting topics. The first one, by Craig Zilles, about best-effort code optimization, covered things intelligent compilers could do. Very interesting stuff for sure, and I learned a lot about low-level software and hardware architectures, but there was no apparent direct relation to Lisp. A similar problem troubled the talk about Scala: perhaps it was due to my late arrival (I got on the right subway but in the wrong direction, not for the first time), but the part I attended left me wondering why Scala is relevant at a Lisp conference. Marc Battyani’s invited talk about his use of Common Lisp for programming FPGAs was nice, but it was difficult to follow (not due to the content), and not many details were given about how the specialized Lisp embedded DSLs get converted to the FPGA-specific code and what problems he had to overcome.

Now for some of the interesting regular talks: on Thursday, the report about porting SBCL to the Blue Gene/P supercomputer was nice and raised an interesting question: what can or needs to be rediscovered from old Lisp dialects for parallel programming, now that more and more parallel CPUs are becoming available to programmers? An issue that came up both in the talk about the futures implementation for ACL2 and in Nicolas Neuss’ initial digression about his experiences in parallelizing Femlisp was that garbage collection can get in the way of effective parallelization, up to the point where much of the expected speedup is lost. The motivation of the talk about Jobim, an actors framework for Clojure, fitted in nicely with a question that immediately came to my mind when I saw how Clojure connects to Java: what do you do so that Java’s semantics don’t leak into your application code? They seem to have found a nice way to abstract away the underlying Java libraries in their framework. In the last session of the first day, the lightning talk session, two things were interesting: Ralf Möller talked about using user-defined method combinations as a more powerful approach than design patterns, showing how one might implement HTML-specific print-object methods, and Didier Verna talked about user-extensible format directives, which he wants to submit as a CLRFI. Having done a lot of work in computational linguistics, I found the talk about S-NLTK, the Scheme toolkit for natural language processing, nice to see. Damir Cavar did a good job promoting the toolkit, which has a similar aim to the Python NLTK, although I would have liked to learn more about the API and implementation issues. Finally, Alec Berryman of ITA gave a last-minute presentation about what ITA learned about optimizing code for SBCL and about issues arising when adapting the old code to multi-threaded programming. Interestingly, they didn’t report any GC issues, but that may be related to their extensive use of caches of pre-allocated objects.

The final panel discussion with James Knight, Christophe Rhodes, James Anderson and Martin Simmons went back and forth about concurrency, distribution and efficiency vs. performance. The discussion took up several points from the talks, including GC and hardware issues. What I took away from it is that, unsurprisingly, there are a lot of open questions that people are aware of, while at the same time there doesn’t seem to be much momentum behind solving them. Given that the community isn’t that big, that isn’t surprising either.

Summing up, I liked it a lot. For one, it was very nice to meet people I only knew via the net. For another, I think the organizers made a good decision in selecting a main theme for the conference, and an important one, too. It really shaped the talks and the discussions, and hence nicely reached its goal. Generally speaking, the conference was well organized, so thanks for a pleasant time in Hamburg.

Posted by Holger Schauer

Mar 29

This has been a rather unpleasant month (don’t ask, I won’t tell), but right now I’m looking forward to its end for two reasons. For one, I’ll be in Hamburg for the European Lisp Symposium for the next two days; the program for the ELS has been published in the meantime, and I’m really looking forward to an interesting set of talks. For another, some patches to CL-SQL which add support for autoincrement behaviour for PostgreSQL are probably going to be released soon. To clarify, “autoincrement” is a column attribute in MySQL (among others) that automatically increments the value of the column when a new row is inserted and no value for the autoincrement column is given (cf. MySQL docs on AUTO_INCREMENT), a behaviour that PostgreSQL supports with the serial pseudo-type (cf. this wikibook on converting between MySQL and Postgres). Actually, this has been my first substantial amount of Common Lisp programming in the last two years, and it was triggered by an upgrade of my Debian system. This upgrade implied that an old application of mine would now use CL-SQL version 5.0, which in turn broke the app: I had simply specified a db-type of “serial” previously, but the new CL-SQL code wouldn’t recognize that it had to fetch the automatically generated value from the DB when inserting a new record. More details on the patches can be found on the CL-SQL mailing list.

The development of this addition was also the first time I worked with git in a real-world setup. In my own projects I use mercurial, so I was eager to learn a little bit more about the differences. It’s funny that a recent opinionated article, “Why I like mercurial better than git”, more or less talks only about the one point that I found confusing: branch handling. For more background information, I suggest reading the article “A guide to branching in mercurial”. Basically, in my current projects where I use mercurial, I’m using the “branching with clones” approach Steve describes there. When working on the patches for CL-SQL, I was working on the existing autoincrement branch, but when I was through I wanted to port my patches to the master branch. When using mercurial with the described approach, selecting (pulling or pushing) my patches and only my patches to the master branch is dead easy: you just issue a pull/push command restricted to the “right” changesets. Doing this is even supported by Subversion these days via svn cherry picking. Looking at the docs for git pull, fetch and merge, I wasn’t able to figure out what the corresponding “right” incantation for git might look like, if there is one at all. As I didn’t want to hose my “working copy” (sorry for the SVN term again), I resorted to git format-patch and git am, respectively, which worked fine. Please note that I’m not suggesting that it’s not possible with another approach; quite to the contrary, I would be happy to learn about it. One thing that I found rather useful is git’s stash command, which lets you safely set aside your current work and fall back to the last committed version, in order to be able to work on something that popped up in between (typically a minor unrelated problem you encounter while working on a larger set of changes). I understand that mercurial’s patch queues enable similar functionality, but I haven’t used them so far.
Another thing that I found very useful is git’s very easy way to correct (or in git terminology “amend”) a commit by just issuing “git commit --amend”. I also like the idea of the “index”, or more exactly that you have to explicitly “add” the changes you want to commit. A similar behaviour is possible with SVN’s “changelist” command, but the mere existence of a changelist is not automatically honoured by SVN’s commit.

Posted by Holger Schauer

Nov 11

One of the bigger practical problems with unit testing is isolating the test coverage. Say, you want to test a piece of code from the middle (business) layer. Let’s assume further the piece of code under consideration makes some calls to lower level code to retrieve some data. The problem of test coverage isolation is now that if you “simply” call your function, you are implicitly also testing the lower level code, which you shouldn’t: if that lower level code gets modified in an incorrect way, you would suddenly see your middle level code fail although there was no change made to it. Let’s explore ways to avoid the problems in Common Lisp.

There is a very good reason why you would also want to have such test dependencies to ensure your middle level code still works if the lower level code is extended or modified. But that is no longer unit testing: you are then doing so-called integration tests which are related, but still different beasts.

Now, I was facing exactly the typical dreaded situation: I extended an application right above the database access layer, which had not seen many tests yet. And of course, I didn’t want to go the long way (which I will eventually have to go anyway) and set up a test database with test data, write setup and tear-down code for the db etc. The typical suggestion (for the xUnit crowd) is to use mock objects, which finally brings us to the topic. I was wondering if there are any frameworks for testing with mock objects in Lisp, but a quick search didn’t turn up any results (please correct me if I’ve missed something). After giving the issue a little thought, it seemed quite clear why there aren’t any: probably because it’s easy enough to use home-grown solutions such as mine. I’ll use xlunit as the test framework, but that’s not relevant. Let’s look at some sample code we’ll want to test:

[geshi lang=lisp]
(defun compare-data (data &optional connection)
  ;; CONNECTION is accepted but not used in this simplified example.
  (declare (ignorable connection))
  ;; RETRIEVE-DATA is the call into the lower-level database layer.
  (let ((dbdata (retrieve-data (id data))))
    (when (equal (some-value data)
                 (some-db-specific-value dbdata))
      t)))
[/geshi]

The issue is with retrieve-data, which is our interface to the lower level database access. And note that we’ll use some special functions on the results, too, even if they may just be accessors. Let’s assume the following test code:

[geshi lang=lisp]
(use-package :xlunit)

(defclass comp-data-tc (test-case)
  ((testdata :accessor testdata :initform (make-test-data))))

(def-test-method comp-data-test ((tc comp-data-tc))
  (let ((result (compare-data (testdata tc))))
    (assert-equal result t)))
[/geshi]

Now the trouble is: given the code as it is now, the only way for the test to succeed is to make sure that make-test-data returns an object whose values match values in the database you’re going to use when compare-data gets called. You’re ultimately tying your test code (especially the result of make-test-data) to a particular state of a particular database, which is clearly unfortunate. To overcome that problem, we’ll use mock objects and mock functions. Let’s define a mock object class mock-data and a mock-retrieve-data function, which will simply return a matching mock object.

[geshi lang=lisp]
(defclass mock-data ()
  ((id :accessor id :initarg :id :initform 0)
   (val :accessor some-db-specific-value :initarg :val :initform "foo-0")))

(defun mock-retrieve-data (testcase)
  (format t "Establish mock for retrieve-data~%")
  ;; The closure captures TESTCASE, so the mock can see its test data.
  (lambda (id)
    (format t "mock retrieve-data id: ~A~%" id)
    (find-if #'(lambda (elem) (equal (id elem) id))
             (testdbdata testcase))))
[/geshi]

Why mock-retrieve-data returns a closure will become clear in a second, after we’ve answered the question of how this differently named object and function can be of any help. The answer lies in CL’s ability to assign different definitions to the function slots of symbols. What we’ll do is simply install the function we’ve just created as the definition to use when retrieve-data is called. This happens in the setup code of the test case:

[geshi lang=lisp]
(defclass comp-data-tc (test-case)
  ((testdata :accessor testdata :initform (make-test-data))
   (testdbdata :accessor testdbdata :initform nil)
   (func :accessor old-retrieve-func :initform nil)))

(defmethod set-up ((tc comp-data-tc))
  ;; set up some test data
  (dotimes (number 9)
    (push (make-instance 'mock-data
                         :id number
                         :val (format nil "value-~A" number))
          (testdbdata tc)))
  ;; establish our mock function, remembering the old definition if any
  (when (fboundp 'retrieve-data)
    (setf (old-retrieve-func tc) (fdefinition 'retrieve-data)))
  (setf (fdefinition 'retrieve-data) (mock-retrieve-data tc)))

(defmethod tear-down ((tc comp-data-tc))
  ;; After the test has run, re-establish the old definition
  (when (old-retrieve-func tc)
    (setf (fdefinition 'retrieve-data) (old-retrieve-func tc))))
[/geshi]

You can now see why mock-retrieve-data returns a closure: this way, we can hand the data we establish for the test case down to the mock function without resorting to global variables.

Now, the accessor fdefinition comes in extremely handy here: we use it to assign a different function definition to the symbol retrieve-data which will then be called during the unit-test of compare-data.

..Establish mock for retrieve-data
mock retrieve-data id: 0
F
Time: 0.013

There was 1 failure: ...

There is also symbol-function, which could be applied similarly and which might be used to tackle macros and special operators. However, the picture isn’t as complete as one would like it to be: methods aren’t covered, for instance. And it probably also won’t work if the function to mock is used inside a macro. There are probably many more edge cases not covered by the simple approach outlined above. Perhaps Lispers smarter than me have found easy solutions for these, too, in which case I would like to learn more about them.
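
The swap-and-restore steps from set-up and tear-down could also be wrapped in a small macro, so that the old definition comes back even if the test body signals an error. The following is only a sketch under assumptions: with-mocked-function, lookup-price, total-price and demo-mock are hypothetical names that do not come from xlunit or from the code above. The notinline declamation is there because a compiler is allowed to resolve calls between functions in the same file at compile time, in which case a later change to the function definition might go unnoticed.

```lisp
;; Sketch only: all names below are hypothetical illustrations.

(defmacro with-mocked-function ((name replacement) &body body)
  "Run BODY with the global function NAME temporarily replaced by REPLACEMENT."
  (let ((was-bound (gensym "WAS-BOUND"))
        (old (gensym "OLD")))
    `(let* ((,was-bound (fboundp ',name))
            (,old (when ,was-bound (fdefinition ',name))))
       (unwind-protect
            (progn
              (setf (fdefinition ',name) ,replacement)
              ,@body)
         ;; Cleanup runs even if BODY signals an error or exits non-locally.
         (if ,was-bound
             (setf (fdefinition ',name) ,old)
             (fmakunbound ',name))))))

;; Ask the compiler to always go through the current global definition.
(declaim (notinline lookup-price))

(defun lookup-price (item)
  "Stands in for a lower-level database call."
  (declare (ignore item))
  (error "No database available in this sketch."))

(defun total-price (items)
  "Middle-layer code under test: sums the prices of ITEMS."
  (loop for item in items sum (lookup-price item)))

(defun demo-mock ()
  "Call TOTAL-PRICE while LOOKUP-PRICE is mocked to return 10 for any item."
  (with-mocked-function (lookup-price (lambda (item)
                                        (declare (ignore item))
                                        10))
    (total-price '(a b c))))
```

Calling (demo-mock) returns 30, and afterwards lookup-price is the original, error-signalling function again. The caveats mentioned above still apply: methods and functions used inside macros are not covered by this sketch either.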

Posted by Holger Schauer

Feb 22

This post is mainly a reference post about a particular problem whose solution wasn’t immediately obvious to me from the CL-SQL docs. Using CL-SQL with (enable-sql-reader-syntax), I had written a routine that basically looks like this:

[geshi lang=lisp]
(defun data-by-some-criteria (criteria &key (dbspec +db-spec+) (dbtype +db-type+))
  (with-database (db dbspec :database-type dbtype :if-exists :old)
    (let (dbresult)
      (if criteria
          ;; criteria given: compare against it
          (setq dbresult
                (select 'some-model 'other-model
                        :where [and [= [some.criteria] criteria]
                                    [= [some.foreignid] [other.id]]]
                        :order-by '([other.name] [some.foreignid] [year] [some.name])
                        :database db))
          ;; no criteria: look for NULL instead
          (setq dbresult
                (select 'some-model 'other-model
                        :where [and [null [some.criteria]]
                                    [= [some.foreignid] [other.id]]]
                        :order-by '([other.name] [some.foreignid] [year] [some.name])
                        :database db)))
      (when dbresult
        (loop for (some other) in dbresult collect some)))))
[/geshi]

This is ugly because the only difference between those two select statements is the check for the criteria, but I had no idea how to combine the two select statements into one, because it’s not possible to embed lisp code (apart from symbols) into an sql-expression (i.e. the type of arguments for :where or :order etc.). With the next requirement things would become far worse: The order-by statement needs to get more flexible so that it is possible to sort results by year first. Given the approach shown above this would result in at least four select statements, which is horrible. So, naturally I wanted a single select statement with programmatically obtained :where and :order-by sql expressions.

Step 1: It occurred to me that it should be possible to have the arguments in a variable and simply refer to the variable. E.g., using a simpler example:

[geshi lang=lisp]
(let (where-arg)
  (if (exact-comp-needed)
      (setq where-arg '[= [column] someval])
      (setq where-arg '[like [column] someval]))
  (select 'model :where where-arg))
[/geshi]

So I could now have my two different where-args and two different order-args and use a single select statement. Main problem solved.

Step 2: But for the :where arg in my original problem, only a small fraction of the sql-expression differs. So how do I avoid hard-coding the entire value of where-arg? How can I combine some variable part of an sql-expression with some fixed parts? I.e., ultimately I want something like:

[geshi lang=lisp]
(let (comp-op where-arg)
  (if (exact-comp-needed)
      (setq comp-op '=)
      (setq comp-op 'like))
  (setq where-arg '[ <put comp-op here> [column1] someval])
  (select 'model :where where-arg))
[/geshi]

But with CL-SQL modifying the reader, there seems to be no way to make <put comp-op here> work. I didn’t know how to get the usual variable evaluation into the sql-expression, or how to escape from CL-SQL’s sql-reader-syntax to normal lisp evaluation.

Somewhere in the back of my head there was that itch that CL-SQL might offer some low-level access to sql expressions. And indeed it does. There are two useful functions, sql-expression and sql-operation. sql-operation “returns an SQL expression constructed from the supplied SQL operator or function operator and its arguments args” (from the cl-sql docs), and we can supply the operator and its arguments from lisp, which is exactly what I want.

Now, the nice thing is that it’s easy to mix partly handcrafted sql expressions with CL-SQL’s special sql syntax constructs, which are handled automatically by the reader (only if you enable it via enable-sql-reader-syntax, of course). I.e., for <put comp-op here> we can use sql-operation, but the rest stays essentially the same:

[geshi lang=lisp]
(let (where-arg)
  (if (exact-comp-needed)
      (setq where-arg (sql-operation '= [column1] someval))
      (setq where-arg (sql-operation 'like [column1] someval)))
  (select 'model 'other-model :where where-arg))
[/geshi]

Now, coming back to my original problem, based on this approach I can split out the common part of the :where and :order arguments and combine those with the varying parts as needed and hand them down to a single select statement. Problem solved.

Posted by Holger Schauer

Nov 9

Some time ago, I was looking at splitting text with Elisp, Perl, Ruby and Common Lisp. Yesterday, when I again had to do quite the same thing, it occurred to me that the Common Lisp solution was unnecessarily complex and long. I’m not a Perl guru, but I believe the following is probably hard to beat even with Perl:


CL-USER> (format t "~{<li>~A</li>~%~}" (cl-ppcre:split "\\|" "Kim Wilde|Transvision Vamp|Ideal|Siouxsie and the Banshees|Nena|Iggy Pop"))
<li>Kim Wilde</li>
<li>Transvision Vamp</li>
<li>Ideal</li>
<li>Siouxsie and the Banshees</li>
<li>Nena</li>
<li>Iggy Pop</li>
NIL

For the uninitiated, it’s not the cl-ppcre library which is interesting here but the built-in iteration facilities of format. See the Hyperspec on the control-flow features of format for details. Now, I usually tend to avoid the mini languages that come with Common Lisp, such as those of format or loop, when writing real programs, but when using Lisp as a glorified shell they come in very handy.
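
To show the iteration directive on its own, without cl-ppcre, here is a minimal sketch using only standard format. The function names html-list-items and comma-separated are made up for illustration. The ~{ ... ~} pair loops over a list argument, and ~^ leaves the loop when no elements remain, which is handy for separators.

```lisp
;; Sketch only: both function names are hypothetical.

(defun html-list-items (items)
  "Return a string with one <li> line per element of ITEMS."
  ;; ~{ ... ~} repeats its body once per list element; ~% emits a newline.
  (format nil "~{<li>~A</li>~%~}" items))

(defun comma-separated (items)
  "Return the elements of ITEMS joined by a comma and a space."
  ;; ~^ exits the iteration after the last element, so no trailing separator.
  (format nil "~{~A~^, ~}" items))
```

For example, (comma-separated '("Nena" "Ideal" "Iggy Pop")) returns "Nena, Ideal, Iggy Pop".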

Posted by Holger Schauer

Jul 4

Recently, there was a discussion about "the rise of functional languages" over on Linux Weekly News (http://lwn.net/), in which one of the participants claimed that one of the major reasons why nobody uses functional languages in industrial settings is the lack of explicit resource handling (where a resource is some supposedly "alien" object in the system, say a database handle or something like that). What he was referring to was the inability to run code on allocating or deallocating a resource. Of course, some people pointed him to various solutions; in particular, I referred to the usual WITH-* style macros, in which one nests the access to the data while hiding everything that has to be done on allocation and deallocation. His reply went something along the lines that such objects may need to be long-lived (thus a WITH-macro is inappropriate), that the only resort would be the garbage collector, and that there simply is no way of running code at a guaranteed (deallocation) time. I have to admit that I have no idea how I could code around that problem in Common Lisp (garbage collection isn't even a defined term in the ANSI specification of CL, and I'm very sure I haven't seen any mention of allocation/deallocation in it).
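
For readers who haven't seen the WITH-* style, here is a minimal sketch of such a macro built on unwind-protect. Everything in it is a hypothetical stand-in: open-handle and close-handle merely record events in a list instead of talking to a real database, and with-handle and demo-with-handle are made-up names.

```lisp
;; Sketch only: all names are hypothetical stand-ins for a real resource API.

(defvar *events* '()
  "Records allocation and release events, newest first.")

(defun open-handle (name)
  "Pretend to allocate a resource called NAME."
  (push (list :open name) *events*)
  name)

(defun close-handle (handle)
  "Pretend to release HANDLE."
  (push (list :close handle) *events*))

(defmacro with-handle ((var name) &body body)
  "Bind VAR to a freshly opened handle, run BODY, and always close the handle."
  `(let ((,var (open-handle ,name)))
     (unwind-protect
          (progn ,@body)
       ;; The cleanup form runs on normal return and on non-local exit alike.
       (close-handle ,var))))

(defun demo-with-handle ()
  "Use a handle in a body that fails; return the recorded events in order."
  (setf *events* '())
  (ignore-errors
    (with-handle (h "db")
      (error "Something went wrong while using ~A" h)))
  (reverse *events*))
```

(demo-with-handle) returns ((:open "db") (:close "db")), so the release code runs even though the body signalled an error. The release is tied to the dynamic extent of the macro body, though, which is exactly the limitation raised in the discussion for long-lived objects.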

Now, some months later, there is a discussion in comp.lang.lisp on the topic of “portable finalizers”, and Rainer Joswig pointed to this chapter in the Lisp machine manual, which talks about explicit resource handling on the Lisp machine. From the excerpt, I can’t judge whether resources are first-class CLOS objects and hence whether the functions to handle them are generic functions, but if so, that would actually allow running code on deallocating a resource, of course at the price of having to handle allocation and deallocation manually. I really wonder if any of today’s CL implementations offers the same or at least similar functionality.

Posted by Holger Schauer
