Report about the European Lisp Symposium 2011

So, ELS 2011 is over; it was the first conference I attended that was aimed solely at Lisp programmers. Overall I am quite happy with it, although not all talks were of the same quality. In particular, I wasn’t too excited about the three keynotes, although all had interesting topics. The first one, by Craig Zilles, about best-effort code optimization was about things intelligent compilers could do. Very interesting stuff for sure, and I learned a lot about low-level software and hardware architectures, but there was no apparent direct relation to Lisp. A similar problem troubled the talk about Scala: perhaps it was due to my late arrival (I got on the right subway but in the wrong direction, not for the first time), but the part I attended left me wondering why Scala is relevant at a Lisp conference. Marc Battyani’s invited talk about his use of Common Lisp for programming FPGAs was nice, but it was difficult to follow (not due to the content), and not many details were given about how the specialized embedded Lisp DSLs get converted to FPGA-specific code and what problems he had to overcome.

Now for some of the interesting regular talks: on Thursday, the report about porting SBCL to the Blue Gene/P supercomputer was nice and raised an interesting question: what can or needs to be rediscovered from old Lisp dialects for parallel programming, now that more and more parallel CPUs are becoming available to programmers? An issue that came up both in the talk about the futures implementation for ACL2 and in Nicolas Neuss’ initial digression about his experiences in parallelizing Femlisp was that garbage collection can get in the way of effective parallelization, up to the point where much of the expected speedup is lost. The motivation of the talk about Jobim, an actors framework for Clojure, fitted in nicely with an immediate question that came to my mind when I saw how Clojure connects to Java: what do you do so that Java’s semantics don’t leak into your application code? They seem to have found a nice way to abstract away the underlying Java libraries in their framework. In the last session of the first day, the lightning talk session, two things were interesting: Ralf Möller talked about using user-defined method combinations as a more powerful approach than design patterns, showing how one might implement HTML-specific print-object methods, and Didier Verna talked about user-extensible format directives, which he wants to submit as a CLRFI. Having done a lot of work in computational linguistics, I was glad to see the talk about S-NLTK, the Scheme toolkit for natural language processing. Damir Cavar did a good job promoting the toolkit, which has a similar aim to the Python NLTK, although I would have liked to learn more about the API and implementation issues. Finally, Alec Berryman of ITA gave a last-minute presentation about things ITA learned while optimizing their code for SBCL and about issues that arose when adapting the old code to multi-threaded programming. Interestingly, they didn’t report GC issues, but that may be related to their extensive use of caches of pre-allocated objects.

The final panel discussion with James Knight, Christophe Rhodes, James Anderson and Martin Simmons went back and forth about concurrency, distribution and efficiency vs. performance. The discussion took up several points from the talks, including GC and hardware issues. What I took away from it is that, unsurprisingly, there are a lot of open questions that people are well aware of, while at the same time there doesn’t seem to be much momentum toward solving them — which, given that the community isn’t that big, isn’t surprising either.

Summing up, I liked it a lot. For one, it was very nice to meet people you otherwise only know via the net. For another, I think the organizers made a good decision in selecting a main theme for the conference, and an important one, too: it really shaped the talks and the discussions and hence nicely served its purpose. Generally speaking, the conference was nicely organized, so thanks for a pleasant time in Hamburg.


Lisp, SQL, git and mercurial

This has been a rather unpleasant month (don’t ask, I won’t tell), but right now I’m looking forward to its end for two reasons. For one, I’ll be in Hamburg for the European Lisp Symposium for the next two days; the program for the ELS has been published in the meantime, and I’m really looking forward to an interesting set of talks. For another, some patches to CL-SQL which add support for autoincrement behaviour for PostgreSQL are probably going to be released soon. To clarify, “autoincrement” is a column constraint in MySQL (among others) that automatically increments the value of the column when a new row is inserted without a value for the autoincrement column (cf. MySQL docs on AUTOINCREMENT), a behaviour that PostgreSQL supports with the serial constraint (cf. this wikibook on converting between MySQL and Postgres). Actually, that has been my first substantial amount of Common Lisp programming in the last two years, triggered by an upgrade of my Debian system. The upgrade implied that an old application of mine would now use CL-SQL version 5.0, which in turn broke the app: I had simply specified a db-type of “serial” previously, but the new CL-SQL code wouldn’t recognize that it had to fetch the automatically generated value from the DB when inserting a new record. More details on the patches can be found on the CL-SQL mailing list.

The development of this addition was also the first time I had a real-world setup developing with git. In my own projects I use mercurial, so I was eager to learn a little bit more about the differences. It’s funny that a recent opinionated article, “Why I like mercurial better than git”, more or less talks only about the one point that I found confusing: branch handling. For more background information, I suggest reading the article “A guide to branching in mercurial”. Basically, in my current projects where I use mercurial, I’m using the “branching with clones” approach Steve describes there. When working on the patches for CL-SQL, I was working on the existing autoincrement branch, but when I was through I wanted to port my patches to the master branch. When using mercurial with the described approach, selecting (pulling or pushing) my patches and only my patches to the master branch is dead easy: you just issue a pull/push command restricted to the “right” changesets. Doing this is even supported by Subversion these days via svn cherry picking. Looking at the docs for git pull, fetch and merge, I wasn’t able to figure out what the corresponding “right” incantation for git might look like, if there is one at all. As I didn’t want to hose my “working copy” (sorry for the SVN term again), I resorted to git format-patch and git am, respectively, which worked fine. Please note that I’m not suggesting it isn’t possible with another approach; quite to the contrary, I would be happy to learn about one. One thing that I found rather useful is git’s stash command, which lets you safely set aside your current work and fall back to the last committed version, in order to work on something that popped up in between (typically a minor unrelated problem you encounter while working on a larger piece of changes). I understand that mercurial’s patch queues enable similar functionality, but I haven’t used them so far. Another thing that I found very useful is git’s very easy way to correct (or in git terminology “amend”) a commit by just issuing “git commit --amend”. I also like the idea of the “index”, or more exactly that you have to explicitly “add” the changes you want to commit. A similar behaviour is possible with SVN’s “changelist” command, but the mere existence of a changelist is not automatically honoured by SVN’s commit.


Coders at work

For the holidays I finally bought Peter Seibel’s Coders at Work, which is a very unusual book about programming: it consists solely of interviews with pretty well known programmers or “coders”. It’s an interesting constellation: on the one hand, Peter Seibel is well known in the Common Lisp community for his book Practical Common Lisp, which gives a modern view on Lisp: not only is it an introduction to the language but also to several libraries and the general setting of modern Lisp programming. On the other (fifteen) hands, there are people like Jamie Zawinski (XEmacs, Netscape), Don Knuth (TeX, The Art of Computer Programming), Guy Steele (Lisp, Scheme, Java), Peter Norvig (PAIP, Google), Brendan Eich (JavaScript) and Ken Thompson (Unix) — just to name the ones that are probably best known.

I had resisted the urge to buy the book because I’ve always felt that programming is a craft that ultimately forces you to make your own experiences. I mean, you can read all the books you like, but ultimately you have to get your own hands dirty to really understand the issues involved. So, what could I learn from other people’s experiences? On the other hand, as a lightweight (in terms of reading attention) holiday book it seemed about right, so I finally gave in.

Well, the book turned out to be a real page-turner for me. It’s a fascinating read because of the recurring topics Seibel addresses and the various opinions he gets. He covers topics you would expect, like preferred tools (e.g. editors), worst bugs, debugging techniques, assertions and verification, literate programming (which surprised me a little), design approaches and team work, but of course the main focus is the personal experiences and how the interviewees wound up with whatever made them known. One thing that I liked is that Seibel has a way of asking good follow-up questions to the responses he gets, without ever letting his own experiences or opinions get in the way, which I imagine made for pleasant interview situations (at least that’s the impression I take away). I wouldn’t have imagined beforehand that I would find the different stories of how the guys (and one woman) got into coding so interesting. There are very few people in this book whose experience doesn’t go back to teletypes and time-sharing systems. Of course, as a result these stories tend to be similar, but the details differ enough that it doesn’t get too boring. Having started with computers in the early 80s, I don’t have any experience with such systems, and frankly I don’t miss them at all after reading more about them. But just getting to this conclusion is interesting: the constant comparison with your own experiences and opinions that you can’t help making while reading this book is alone worth buying and reading it.

Overall, it’s hard to say which interviews I found the most interesting; essentially each has some unique point or other. That being said, the interviews with Joe Armstrong and Guy Steele made a big impression on me, whereas I’m a little disappointed by the one with Peter Norvig (though he had the funniest quotes), but I can’t really nail down why. I didn’t particularly like the interview with Brad Fitzpatrick; it didn’t seem to contain as much information as the others. And Joshua Bloch seemed to hype Java all the time, which I didn’t find very convincing — the idea that today’s larger context for programming contains quite a few different languages and approaches seems to elude him.

There are some points I took away from this book. For one, most of the interviewed people seem to be much more concerned with data types than I am, even the ones who have done extensive work on dynamically or weakly typed languages. I guess I should really take a closer look at that topic and, to make it more concrete, play around with e.g. Haskell. Another point is that concurrency or parallel programming is a topic that (IIRC) all of the interviewees have seen as being responsible for the worst bugs they encountered, and as a result they are interested in newer approaches like STM. So it might be worthwhile to look more closely into such developments, for example by playing with Clojure, Erlang, or the transaction monad, if I ever really play around with Haskell. A third point is that I realized that I’m not keeping up with academic research in CS and, not having TAoCP, might never have been up to date at all. I’m following a few online references like LtU, but not closely, and it’s pretty rare these days that I look deeply into a research paper. This is something else I should probably change, if time permits.


On password safety

I’ve been using computers for quite a while, and for fifteen or more years I’ve been using multi-user systems and the internet. Having a computer science and linguistics background, password security is not exactly a new issue for me. Quite the contrary, I’ve been telling a lot of my friends that using good passwords is important.

However, this week I stumbled upon some interesting pieces that shed new light on the topic for me. The first is a not exactly new article by Bruce Schneier on how secure passwords keep you safer, which contained quite a bit of information that was new to me. It’s a very interesting piece on possible approaches to breaking passwords as of three years ago. Now combine that with a more recent piece on using cloud computing to crack passwords and you get an idea of how important using really good passwords may have become today.

This brings me back to my current usage of passwords. Until now, I’ve considered my passwords reasonably hard to break. They contain all the usual stuff: a mix of upper and lower case letters, numbers and some symbols. However, reading Schneier’s article made it clear to me that not all my passwords may really be as secure as I thought they were, so it’s time to overhaul them.
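To get a feel for what a few additional characters buy you, here is a small back-of-the-envelope sketch in Python. The guess rate is a made-up assumption for illustration, not a figure from Schneier’s article or the cloud-cracking piece:

# Hypothetical attacker speed, purely for illustration: one billion
# guesses per second against an offline password hash.
GUESSES_PER_SECOND = 1e9

def worst_case_years(alphabet_size, length):
    """Time to enumerate every password of the given length, in years."""
    combinations = alphabet_size ** length
    seconds = combinations / GUESSES_PER_SECOND
    return seconds / (60 * 60 * 24 * 365)

# 8 characters from letters and digits (62 symbols) vs. 12 characters
# drawn from the ~95 printable ASCII characters.
print(worst_case_years(62, 8))    # on the order of a few days
print(worst_case_years(95, 12))   # on the order of tens of millions of years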

In addition, I’ve been using some passwords for more than one site (though not for critical/important ones) to ease the mental cost of having to remember a particular password for each site. On the other hand, some time ago I already started using a password manager to store my passwords, synchronizing the database over several installations. So the cost of using unique passwords is lower than it used to be, and I’m going to correct that error, too.

What this amounts to at the end of the day, though, is a really old lesson: security is a permanent issue, not a one-time effort. What once might have been a sufficient security measure may soon be totally insufficient.

Summary: Please use a password store like KeePassXC, which you can use on multiple devices and which can generate random passwords for you. The more characters you use, the better. Don’t use the same password on a second site.
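For illustration, a minimal sketch of generating such a random password in Python; the alphabet and the 20-character length are arbitrary choices of mine, not a recommendation from any particular password manager:

import secrets
import string

# Characters to draw from: letters, digits and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def random_password(length=20):
    """Return a randomly generated password of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(random_password())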


There's nothing left to see

Via Lambda the Ultimate I came across an interesting article, On data abstraction, revisited by William Cook, written for OOPSLA’09. It carefully distinguishes abstract data types from objects. All theoretical considerations that distinguish ADTs and objects aside, there is one common characteristic given by Cook: you can’t inspect the concrete representation of the data you’re abstracting. This is interesting in itself and reminded me of two rather practical things.

First of all, I was reminded of a section in Bob Martin’s Clean Code which discusses the idea that you should on the one hand follow the rule “Tell, don’t ask” and on the other hand have data access objects that don’t have much, if any, behaviour besides providing data. This is directly related to Cook’s article: if you want data abstraction, you shouldn’t provide any way for other objects/methods to access the internal representation. This more or less also forbids getters, as they are likely to lead to leaky abstractions: more often than not, programmers simply return the value of some data field, directly exposing the representation chosen. Now, please note that this does not necessarily follow from Cook’s article, as it is possible to design getters in such a way that you return whatever you want from a getter method, i.e., a desired return type or an object satisfying a particular interface. For me, the relevant point here is the way of thinking about the kind of object at hand: do I want some behaviour (aka Cook’s objects) or do I want a data sink? In the former case, and in line with what the Clean Code book suggests, it is arguably best to tell the object to do what is necessary rather than to inspect (get) the data it holds and act on it externally in some other object/method. But even in the latter case, I think it is important to pay great attention to hiding the internal representation from external access and to only allow very focussed access to the data itself. It could be and has been argued that restricting access to the stored data via getter methods is tedious (see e.g. the discussion in getters/setters/fuxors) and that allowing public access to members is all right, but looked at from a data abstraction point of view it simply boils down to the question of whether you want or need data abstraction or not.
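To make the distinction concrete, here is a small, hypothetical Python sketch (the class names are mine, not Cook’s or Bob Martin’s): the first version exposes its representation through a getter and lets the caller compute externally, while the second is told what to do and keeps its representation hidden.

# "Ask": the caller pulls out the internal data and computes externally.
class AccountData:
    def __init__(self, balance_cents):
        self._balance_cents = balance_cents

    def get_balance_cents(self):
        # Exposes the chosen representation (integer cents) to every caller.
        return self._balance_cents

def can_afford(account_data, price_cents):
    return account_data.get_balance_cents() >= price_cents

# "Tell": the object is told what to do; the representation stays hidden
# and could change without touching any caller.
class Account:
    def __init__(self, balance_cents):
        self._balance_cents = balance_cents

    def covers(self, price_cents):
        return self._balance_cents >= price_cents

print(can_afford(AccountData(500), 300))   # ask style
print(Account(500).covers(300))            # tell style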

Second, I’ve recently seen these two postings on the merits of the Zope Component Architecture: The emperor’s new clothes and the reply The success of the ZCA. Malthe asks why one should use the ZCA to override the use of a particular implementation with another instead of using some kind of reloading (or rather says that the latter is the preferable approach). Relating this to Cook’s article, Malthe could be paraphrased roughly as: we have ADTs all over the place and we should allow only one implementation per ADT (this is what the type system would guarantee in other systems). If you want another implementation (of some interface, as Cook shows for his objects), you should reload the object definition with the one you want. The use of the ZCA, however, is directly related to the very idea of object-oriented programming in the way Cook defines it: you only have interfaces as the relevant defining characteristics of objects (values), and hence the ZCA is the way to deal with multiple implementations in Zope (or Python). For me, all I can say is that I’m happy that the ZCA, and hence the ability to easily intermingle multiple implementations, is there (then again, reading computer-science-theoretic articles as I do, I’m arguably not of the angry web designer type for whose benefit Malthe is arguing).

There is another, more puzzling aspect of the article for me. After some consideration, I have to conclude that of all the OO languages I happen to know, it’s really only Java that seems to be object-oriented in Cook’s view of the world. This is because in Java, you can declare a method to return objects satisfying an interface. In dynamically typed languages like Python, Ruby, or CLOS, you could try to get away with duck typing, but it’s arguably only Python that takes it to heart (for instance, in CLOS most values you’re going to deal with are non-CLOS values, and you even have an ETYPECASE statement, which is a switch statement on type distinctions). Funnily enough, Cook finishes his Smalltalk analysis with the statement that “one conclusion you could draw from this analysis is that the untyped lambda calculus was the first object-oriented language”. But beside the question of how one language is “more OO” than another, there is also the point that in order to program in a truly object-oriented way, you shouldn’t (and in Cook’s world really can’t) rely on type checks, because the whole point of using objects as data abstractions is to rely on behaviour.
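A small, hypothetical Python sketch of that last point (my example, not Cook’s): the first function relies on behaviour only, while the second re-introduces type dispatch in the style of a typecase and has to be edited whenever a new implementation shows up.

class FileSink:
    def write(self, text):
        print("to file: " + text)

class NetworkSink:
    def write(self, text):
        print("to socket: " + text)

# Behaviour only: any object providing write() will do, including
# implementations written long after this function.
def log_behavioural(sink, message):
    sink.write(message)

# Type dispatch, roughly analogous to an ETYPECASE: every new sink type
# requires editing this function.
def log_typecase(sink, message):
    if isinstance(sink, FileSink):
        print("to file: " + message)
    elif isinstance(sink, NetworkSink):
        print("to socket: " + message)
    else:
        raise TypeError("unsupported sink: %r" % type(sink))

log_behavioural(FileSink(), "hello")
log_typecase(NetworkSink(), "hello")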


Version control and testing

What do version control and testing have to do with each other? Well, first of all, both are common virtues in the clean code community, and each is important in its own right: version control provides a safety net in that you can roll back to prior versions if you accidentally introduce problems in your code. Testing (automated unit tests) provides a safety net, too, because you can do regression testing when you work on your code. These are both fine goals, but they seemingly have little to do with each other.

But in reality they do. For the sake of argument, let’s take a step back and assume that you have to work in an environment of several developers where neither of these things exists. What will you likely see? What we all saw several years ago: commented-out code blocks, redundant and often misleading or outdated comments, timestamps with comments cluttered all over the code. And frightened developers who feared each minor change because of the myriad of subtle side effects it might have, let alone major changes to core components. It’s an environment in which refactorings as well as extensions are very hard and expensive, which results in frightened, overworked developers and frustrated managers.

So, what happens when you introduce only one of those virtues? Say we introduce version control. Now every change gets documented, except that documenting every change requires, from the developer’s point of view, documentation at the wrong point. They can’t see the documented changes and the reasons for those changes in the source; they see them only in the version control system — iff they add a change message with every change at all. Much more likely is that you will see commit messages such as “.” or “bug fix”, and the same old mess of timestamps, outdated comments and commented-out code as before. Why is that? Because your developers are now not as frightened as they used to be (they can rely on the version control system to fall back to older versions), but they still have the same need to understand and document the code. And the commit log is both “too far away” from the code and “outside its purpose” for this task: the commit log shouldn’t document what the code is supposed to do, only when something was implemented to behave in a particular way.

This is where a development (unit) test suite comes into the picture: you document every required behaviour in tests. With every change to the code, you also update the tests. As a developer, you can now look into the test suite to see what the code is supposed to do. Developers will likely become much more confident in their changes, because they can run the tests and see what happens (hopefully almost immediately) without time- and resource-consuming manual tests.
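As a minimal illustration of such a behaviour-documenting test (a hypothetical example of mine, not from any particular project), in Python:

import unittest

def normalize_username(name):
    """Usernames are stored lower-case and without surrounding whitespace."""
    return name.strip().lower()

class NormalizeUsernameTest(unittest.TestCase):
    # The test names spell out the required behaviour, so a developer can
    # read the suite instead of digging through the commit log.
    def test_surrounding_whitespace_is_removed(self):
        self.assertEqual(normalize_username("  alice "), "alice")

    def test_case_is_folded_to_lower(self):
        self.assertEqual(normalize_username("Alice"), "alice")

if __name__ == "__main__":
    unittest.main()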

But what about documenting the changes to the code? Well, you should simply document any changes in the commit message of your version control system, because it’s now no longer necessary to keep the entire version history in mind to understand what the current code state is supposed to do. You have the tests that tell you what the code should do. The commit log now only serves the purpose of documenting what has changed over time and is no longer required to understand what the code should do. So you don’t have to keep the clutter in your code, resulting in much cleaner source code files.

Summary: Taken together, version control and testing add up to more than the sum of their individual values.


Space vs. tabs vs. Emacs

Bertrand Mathieu’s blog article on cleaning up trailing whitespace comes in handy for me, as I’ve frequently run into problems with the Python test coverage tool stumbling over trailing whitespace. It also reminded me of an Emacs snippet I recently installed to detect a mix of spaces and tabs in my Python buffers — I do set (setq indent-tabs-mode nil) in my .emacs for python-mode, yet I still occasionally manage to insert some tabs into my source buffers. So I came up with the following snippet, which validates that a buffer in python-mode doesn’t contain any tabs. It’s hooked up to the very general write-file-hooks, as there is no Python-specific hook for saving buffers. In case the buffer does contain a tab, it will leave the point (the place where your cursor will be) at the found tab.


; code snippet GUID 32F94179-A86B-4780-8645-8A526BD8533A
(defun no-tab-validation (buffer)
  (interactive "bValidate buffer:")
  (save-excursion
    (unless (equal (buffer-name (current-buffer)) buffer)
      (switch-to-buffer buffer))
    ;; search the whole buffer, not just the part after point
    (goto-char (point-min))
    (if (re-search-forward "\t" nil t)
        (error "Buffer %s contains tabs" buffer))))

(defun py-tab-validate-on-save-hook ()
  (when (equal mode-name "Python")
    (no-tab-validation (buffer-name (current-buffer)))))

(add-hook 'write-file-hooks 'py-tab-validate-on-save-hook)

;;; inhibit tabs in some modes
(defun set-indent-tabs-mode (value)
  "Set indent-tabs-mode to value."
  (setq indent-tabs-mode value))

(defun toggle-tabs ()
  "Toggle `indent-tabs-mode'."
  (interactive)
  (set-indent-tabs-mode
   (not indent-tabs-mode)))

(defun disable-tabs-mode ()
  (set-indent-tabs-mode nil))

(add-hook 'sgml-mode-hook 'disable-tabs-mode)
(add-hook 'xml-mode-hook 'disable-tabs-mode)
(add-hook 'python-mode-hook 'disable-tabs-mode)


I write sins not tragedies

One of the nicer Mozilla extensions I use is It’s All Text. It allows you to call an external editor to edit text fields, which is really nice if you’re editing longer wiki pages, for instance.

Not surprisingly, my external editor of choice is XEmacs. But my XEmacs is heavily configured and loads a lot of stuff. Even worse, I usually run a beta version with debugging turned on, which makes it run extra slowly, so the gap between clicking that little “edit” button below a text field and the moment I could finally start typing quickly started to annoy me.

I remembered that some years ago I used to log in remotely to a running machine, fire up some script and have either my running XEmacs session connected or a new XEmacs process started. Connecting to the running XEmacs happens via gnuclient, so much was clear, but gnuclient doesn’t start a new XEmacs if there is none running (actually, the server process gnuserv will die when the XEmacs process terminates, so there isn’t much gnuclient can do). The Emacs wiki page linked to above already contains a number of scripts that eventually do what I need, but I found none of them convincing, so here is my version. It’s Linux-specific in its BSD-style call to ‘ps’ to determine the running processes, but should be portable sh otherwise. I could have sprinkled the fetch_procs function with OS-specific variants, but as I’m currently running my XEmacs on Linux only, I leave that as an exercise for the reader.

[geshi lang=bash]
#!/bin/sh

emacs=xemacs
gnuclient=gnuclient
gnuserv=gnuserv
user=`id -un`

gnuclientopts=""

fetch_procs () {
    ps xU $user
}

runninggnuserv=`fetch_procs | grep "[^]]$gnuserv"`
runningemacs=`fetch_procs | grep "[^]]$emacs"`
if [ "$?" -eq "0" ]; then
    gnuservpid=`echo $runninggnuserv | cut -f1 -d' '`
    emacspid=`echo $runningemacs | cut -f1 -d' '`
    echo "Using running gnuserv $gnuservpid of emacs $emacspid"
    $gnuclient $gnuclientopts "$@"
else
    echo "Starting new $emacs"
    $emacs "$@"
fi
[/geshi]

ObTitle: Panic at the disco, from the album “A fever you can’t sweat out”


Testing and terminology confusion

I’ve become quite addicted to writing tests during my development tasks. I had wanted to dig into test-driven development for quite some time, but it was the seamless integration of Test::Unit, Ruby’s unit testing module, into Eclipse that got me going initially. I then did some unit testing with Common Lisp packages and am currently heavily using pyunit and Python doctests (mostly in the context of Zope testing). Writing tests has become second nature in my development: it gives you that warm fuzzy feeling of having a little safety net while modifying code.

However, there are times when terminology comes along and gives you a headache. A distinction I’ve learned about during the last year is the one between unit tests, integration tests and functional tests (for an overview see Wikipedia on software testing). But as you can see, for instance, in this article on integration tests in Rails, it’s not always easy to agree on what means what — Jamis and/or the Rails community seem to have the integration/functional distinction entirely backwards from what, for instance, the Zope community (on testing) thinks.

Now, one might argue that terminology doesn’t matter much as long as you write tests at all, but it’s not so easy. For instance, if your “unit test” of a given class requires another class, is that still unit testing or is it integration testing? Does it even make sense to talk about unit-testing a class? A class on its own isn’t that interesting after all; it’s its integration and interoperation with collaborators where the semantics of a class and its methods become interesting. Hence, shouldn’t you rather test a specific behaviour, which probably involves those other classes? And what if your code only makes sense when run on top of a specific framework (Zope, Rails, you name it)? Michael Feathers argues convincingly in his set of unit testing rules that any such tests are probably something else.

Ultimately these questions pertain directly to two aspects: code granularity and code dependencies — and remember, test code is code after all. These are directly related, of course: if your code is very fine-grained, it’s much more likely that it will also be much more entangled (although a dependency might be abstracted with the help of interfaces or some such, you still have the dependency as such). As a consequence, your test code will have to mimic these dependencies. On the contrary, if your code blocks are more coarse-grained (i.e. cover a greater chunk of functionality), you might have fewer (inter-)dependencies, but you won’t be able to test functionality on a more fine-grained level. As Martin Fowler’s excellent article Mocks Aren’t Stubs discusses in detail, one way to loosen these connections between code and tests is to use mock objects or stubs. Fowler’s article also made clear to me that I had used the term “mock object” wrongly in my post on mock objects in Common Lisp: dynamically injecting an object/function/method (as a replacement for a collaborator required by the code under test) that returns an expected value means using a stub, not a mock — another sign of not clearly enough defined terminology (btw, the terminology Fowler uses is that of G. Meszaros’ xUnit patterns book).

It’s worth keeping these things apart because of their different impact on test behaviour: mocks force you to think about behaviour, whereas stubs focus on the ‘results’ of code calls (or object state, if you think in terms of the objects being substituted). As a result, when you change the behaviour of the code under test (say you’re changing code paths in order to optimize some code block), this might (mocks) or might not (stubs) result in changes to the test code.
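A small, hypothetical Python sketch of the difference, using unittest.mock (the names are mine, not Fowler’s or Meszaros’): the stub-style test only supplies a canned answer and checks the result, while the mock-style test verifies the interaction with the collaborator.

import unittest
from unittest import mock

def charge(gateway, amount):
    """Code under test: charge an amount via a payment gateway collaborator."""
    return gateway.charge(amount) == "ok"

class ChargeTest(unittest.TestCase):
    def test_stub_style_checks_the_result(self):
        # Stub: canned answer, the assertion is about the returned result.
        gateway = mock.Mock()
        gateway.charge.return_value = "ok"
        self.assertTrue(charge(gateway, 42))

    def test_mock_style_checks_the_interaction(self):
        # Mock: the assertion is about the behaviour, i.e. how the
        # collaborator was called.
        gateway = mock.Mock()
        gateway.charge.return_value = "ok"
        charge(gateway, 42)
        gateway.charge.assert_called_once_with(42)

if __name__ == "__main__":
    unittest.main()

Changing the internal code paths of charge would only break the mock-style test if the interaction with the gateway changes, which is exactly the trade-off described above.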

It’s also worth thinking about mocks and stubs because they shed new light on the question of test granularity: when you substitute real objects in either way, you’re on your way to much more fine-grained tests, which means you loosen the dependencies of your tests: you can now modify the code of a collaborator class without the test for your code under test breaking. Which brings us back full circle to the distinction between unit tests and integration tests: you might now have perfect unit tests, but you’re forced to additionally test the integration of all the bits and pieces. Otherwise all your unit tests might succeed while your integrated code still fails. Given this relationship, it seems immediately clear that 100% test coverage might not be the most important issue with unit tests: you might have 100% unit test success but 100% integration failure at the same time — if you don’t do continuous integration and integration tests, of course. Now, what’s interesting is that it might be possible to check test coverage on code paths, but it might not be easy to check integration coverage. I would be interested to learn about tools detailing such information.

Recently I had another aha moment with regard to testing terminology: Kevlin Henney’s presentation at this year’s German conference on object-oriented programming, the OOP 2009, on “know your units: TDD, DDT, POUTing and GUTs”. TDD is test-driven development, of course. The other ones might be less obvious: “GUTs” are just good unit tests and “POUT” is “plain old unit testing”. I saw myself doing TDD, but come to think of it, I’m mostly applying a combination of TDD, POUT (after-the-fact testing) and DDT: defect-driven testing. I find the introduction of a term for testing after the code has been written interesting, because it provides a way to talk about how to introduce testing in the first place. Especially defect-driven testing, the idea of writing a test to pinpoint and overcome an erroneous code path, might be a very powerful way to introduce the habit of regularly writing (some) tests for an existing large code base. That way you avoid the pitfall of never being able to test “all this lot of code because there is never the time for it”, and you might also motivate people to try writing tests before code. And at this level, it might at first not be that relevant to draw the distinction between integration and unit tests too sharply: start out with whatever is useful.
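A tiny, hypothetical illustration of what such a defect-driven test could look like in Python (the defect and all names are invented): the test is written first to reproduce the reported bug, and it stays in the suite as a regression guard.

import unittest

def parse_port(value, default=8080):
    """Return the configured port; an empty setting falls back to the default.

    An earlier version called int(value) unconditionally and crashed on "",
    which is exactly the defect the test below pins down.
    """
    if value == "":
        return default
    return int(value)

class ParsePortRegressionTest(unittest.TestCase):
    # Named after the defect, so the history stays visible in the test
    # suite rather than only in the commit log.
    def test_empty_setting_falls_back_to_default(self):
        self.assertEqual(parse_port(""), 8080)

if __name__ == "__main__":
    unittest.main()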


python-mode vs. slime

I’ve become a Python programmer, too, lately, due to a job change. Python is a fine language so far, although to me it’s mostly just like Ruby, though with even less functional flavour. However, just as with Ruby, I really miss slime, the superior Lisp interaction mode for Emacs, when hacking Python code. I could now start to write down a list of things I’m missing (which I had intended to do); however, Andy Wingo spares me the hassle, as he has just written an excellent article on slime from a Python programmer’s view.

However, I would like to elaborate a little on the main difference for me: the client/server socket approach of slime. Let me briefly recapitulate what this implies: slime consists of two parts, a client written in Emacs Lisp and a server written in Common Lisp (AFAIK there is at least also an implementation for Clojure, maybe also one for some Scheme implementation). To use slime in its full glory, it’s hence required that you have a Common Lisp process running which in turn runs the slime server part. If you now fire up slime, you get an interaction buffer through which you can access the REPL of the Lisp process, which in Python terms would be the interpreter prompt. You can then interact with the Lisp process, evaluating pieces of code from your Lisp source buffers directly in the connected Lisp process. What is incredibly useful for me is that you can not only start a new Lisp process but also connect to an already running one, given that it has the slime server started (this is obviously mainly useful if the Lisp implementation you use has multi-threading capabilities). I use it to connect to a running web server application, which I can then inspect, debug and modify. Modification includes redefinition of functions, macros and classes, which of course is also a particular highlight of Common Lisp. I would like to cite a comment the reddit user “fionbio” made regarding the linked article: “In fact, Python language wasn’t designed with lisp-style interactive development in mind. In CL, you can redefine (nearly) anything you want in a running program, and it just does the right thing. In Python, there are some problems, e.g. instances aren’t updated if you modify their class. Lisp programmers often, though not always, refer to various things (functions, classes, macros, etc.) using symbols, while Python programs usually operate with direct references, so when you update parts of your program you have much higher chances that there will be a lot of references to obsolete stuff around.”

To complement Bill Clementson’s excellent article series on slime a little, I’m going to describe how I’m using and configuring python-mode to make it match my expectations a little more closely. Essentially I would like to access my Python process just as I would with slime and Common Lisp, but that’s not possible. The reason, btw, is nearly unchanged: I need to work on a web server app (written for Zope) which may not even run on the same machine I’m developing on. Let’s first cover the simple stuff: to get a reasonable command interface to the Python interpreter, I require the ipython Emacs library. If the Python interpreter runs locally, I also use py-complete, so that I can complete my code at least a little. Unfortunately, this breaks when the Python interpreter doesn’t run locally, because py-complete needs to set up some things in the running Python process, which it does by writing to a local temp file and feeding it to the Python process. Unfortunately, the code in py-complete lacks customizability, i.e., you can’t specify where that temp file should be located — I should be able to come up with a small patch in the near future, which I will add below. Finally, I also require doctest-mode as support for writing doctests, but that’s not really relevant here.

Now, on to the more involved stuff: I introduce some new variables and a new function py-set-remote-python-environment, which uses those variables to do a remote call (via ssh) to Python. This at least allows me to do things like setting py-remote-python-command to “/home/schauer/zope/foo-project/bin/instance” and py-remote-python-command-args to “debug”, so that I can access a remote debug shell of my current Zope product. That alone only allows me to fire up and access the remote Python, so I could develop the code locally and have it executed remotely. More typical, though, is that you also want to keep the code on the remote machine: for this I use tramp, a package for remotely accessing files and directories from within Emacs. In combination, this allows me to edit and execute the code on the remote machine. It is still nowhere near what is possible with slime, but at least it allows me to pursue my habit of incremental and interactive development from within my usual Emacs installation (i.e., it doesn’t require me to deal with any Emacs-related hassle on the remote machine).


;;; python-stuff.el --- python specific configuration

(when (locate-library "ipython")
  (require 'ipython))

(when (locate-library "doctest-mode")
  (require 'doctest-mode))

(defvar py-remote-connect-command "ssh"
  "*Command for connecting to a remote python, typically \"ssh\"")
(defvar py-remote-connection-args '("user@remotemachine")
  "*List of strings of connection options.")
(defvar py-remote-python-command "python"
  "*Command to execute for python")
(defvar py-remote-python-command-args '("-i")
  "*List of strings arguments to be passed to `py-remote-python-command`.")
(defvar py-remote-python-used nil
  "Remember if remote python is used.")

(defun py-set-remote-python-environment ()
  (interactive)
  (let ((command-args (append py-remote-connection-args
                              (list py-remote-python-command)
                              py-remote-python-command-args)))
    (setq py-python-command py-remote-connect-command)
    (setq py-python-command-args command-args))
  (setq py-remote-python-used t))

(when (locate-library "py-complete")
  (autoload 'py-complete-init "py-complete")
  (defun my-py-complete-init ()
    "Init py-complete only if we're not using remote python"
    (if (not py-remote-python-used)
        (py-complete-init)))
  (add-hook 'python-mode-hook 'my-py-complete-init))
