Aug 8

For a review, I needed the track list of a given CD. As the track list wasn’t available via CDDB, I went to some large online store and found it there. I need it as XML, though. The original data I fetched looks like this:

1. Fox In A Box
2. Loaded Heart
3. All Grown Up
4. Pleasure Unit
...

whereas I need:

<li id="1">Fox In A Box</li>
<li id="2">Loaded Heart</li>
<li id="3">All Grown Up</li>
…

After pasting the original data into Emacs, writing it out to a file and using Perl for such a simple transformation seemed just gross. In the past, I’ve been an Emacs hacker. But no more, or so it seems, since it took me nearly half an hour just to come up with this simple function:

(defun tracklist-to-li (start end)
  "Generate a string with <li>-elements containing tracks.
Assumes that on every line of the region between START and END,
a track position and the track name are given."
  (interactive "r")
  (save-excursion
    (goto-char start)
    (let ((result ""))
      ;; Group 1 is the track number, group 2 the track name.
      ;; Inside a Lisp string, regexp backslashes must be doubled.
      (while (re-search-forward "^\\([0-9]+\\)\\.[ \t]+\\(.*\\)$" end t)
        (setq result
              (concat result "<li id=\""
                      (match-string 1) "\">"
                      (match-string 2) "</li>\n")))
      ;; Pass the result as an argument so that a % in a track name
      ;; is not taken as a format directive.
      (message "%s" result))))

What took the most time was that I had forgotten to escape the grouping parentheses in the regular expression, and that it took me a little while to accept that there is really no \d shorthand in Emacs regexps (the closest equivalent is the POSIX class [[:digit:]]). Which probably means that I’ve been doing too much in Perl, sed and the like. OTOH, it may just hint at the horror of regular expression handling in Emacs. What I also dislike is that whenever you want to compute some result in Emacs and see it, too, you have to call something like message. Of course, there is IELM, but this doesn’t really help with interactive functions operating on regions.
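For comparison, here is a minimal sketch of the same track list conversion in Ruby, where \d is available and grouping parentheses need no escaping. The method name tracklist_to_li and the sample input are made up for illustration; this is just one way to do it, not what I actually ran.

```ruby
# Hypothetical helper: turn lines like "1. Fox In A Box" into <li> elements.
def tracklist_to_li(text)
  # scan returns one [number, name] pair per matching line;
  # in Ruby, ^ and $ match at the start and end of each line.
  text.scan(/^(\d+)\.[ \t]+(.*)$/).map do |number, name|
    %(<li id="#{number}">#{name}</li>)
  end.join("\n")
end

tracks = <<~TRACKS
  1. Fox In A Box
  2. Loaded Heart
  3. All Grown Up
  4. Pleasure Unit
TRACKS

puts tracklist_to_li(tracks)
```

Lines that do not start with a number followed by a dot are simply skipped, which mirrors how the search loop in the Emacs function ignores non-matching lines.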

And five minutes later, I realize I need to convert some string like “The (International) Noise Conspiracy|The Hi-Fives|Elastica” into a similar list structure. With a simple cut & paste and roughly 30 seconds later, I have

[bauhaus->~]perl -e '$a="The (International) Noise Conspiracy|The Hi-Fives|Elastica"; @a=split(/\|/,$a); foreach $b (sort @a) { print "<li>$b</li>\n"; }'
<li>Elastica</li>
<li>The (International) Noise Conspiracy</li>
<li>The Hi-Fives</li>

Hmm. Perhaps I’ve come quite a long way on the dark side already … On the other hand, in Ruby, this is just as simple (I’m using irb, the interactive Ruby shell, here):

irb(main):008:0> a="The (International) Noise Conspiracy|The Hi-Fives|Elastica"
=> "The (International) Noise Conspiracy|The Hi-Fives|Elastica"
irb(main):009:0> a.split("|").each {|string|
irb(main):010:1* print "<li>"
irb(main):011:1> print string
irb(main):012:1> print "</li>\n"
irb(main):013:1> }
<li>The (International) Noise Conspiracy</li>
<li>The Hi-Fives</li>
<li>Elastica</li>
=> ["The (International) Noise Conspiracy", "The Hi-Fives", "Elastica"]

The difference here is the implicit array Ruby generates, which in Perl you could of course get as well by putting the split directly into the list position of the foreach loop. Note the annoying misfeature of irb of always showing the prompt, even when you’re still continuing your current input line.
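A more compact variant, sketched under the assumption that a single string is wanted rather than printed output, chains split, map and join. The method name names_to_li is made up for this example. One thing to watch: split takes a String delimiter literally, whereas a Regexp delimiter needs the pipe escaped.

```ruby
# Hypothetical helper: turn "A|B|C" into one <li> line per name.
def names_to_li(names)
  # A String argument to split is matched literally, so "|" needs no escaping.
  names.split("|").map { |name| "<li>#{name}</li>" }.join("\n")
end

bands = "The (International) Noise Conspiracy|The Hi-Fives|Elastica"
puts names_to_li(bands)

# With a Regexp delimiter the pipe must be escaped: a bare | is an
# alternation of two empty patterns and matches between all characters.
puts bands.split(/\|/).length
```

Returning the string instead of printing it also makes the helper easy to check from irb or to reuse for other pipe-separated lists.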

In Common Lisp we can do it just as briefly:

CL-USER> (let* ((a "The (International) Noise Conspiracy|The Hi-Fives|Elastica")
                (splits (ppcre:split "\\|" a)))
           (loop for string in splits
                 do (format t "<li>~S</li>~%" string)))
<li>"The (International) Noise Conspiracy"</li>
<li>"The Hi-Fives"</li>
<li>"Elastica"</li>
NIL

The same applies here: the result of the split could easily have been embedded in the loop.

The lesson, of course, is that in the end this example only serves to show that things that are easy to achieve in a high-level language are indeed easy to achieve. Or, to put it differently, that the use of regular expressions is no longer a distinguishing feature between programming languages.

Posted by Holger Schauer
