Refactoring (the name of) a python package

I had the need to refactor a python package, better said, I had to rename the package. The package in question is using namespaces, which complicates the matter, as this implies that there are two steps to the process: you have to exchange all references to the module name and you have to re-arrange the package structure. Assuming you’re at the toplevel of your (otherwise clean) package directory, the following shows the manual steps involved in the process.

First we rearrange the directories:


There's nothing left to see

Via Lambda the Ultimate I came across an interesting article On data abstraction, revisited by William Cook, written for OOPSLA’09. It carefully dissects abstract data types from objects. All theoretical considerations aside that distinguish ADTs and objects, there is one common characteristics given by Cook: you can’t inspect the concrete representation of the data you’re abstracting. This is in itself interesting and reminded me of two rather practical things.

First of all, I was reminded of a section in Bob Martins Clean code development which discussed the idea that you should on the one hand follow the rule “Tell, don’t ask” and on the other hand have data access objects that don’t have much, if any behaviour besides providing data. This is obviously directly related to Cooks article: if you want data abstraction, you shouldn’t really provide any way to allow other objects/methods to access the internal representation. This somewhat also forbids getters as this is likely to lead to leaky abstraction, since more often than not programmers simply return the value of some data field, directly exposing the representation chosen. Now, please note that this does not necessarily follow from Cooks article, as it is possible to design getters in such a way that you can return whatever you want for a getter method, i.e., you can return a desired return type or an object satisfying a particular interface. For me, the relevant point here is the way of thinking about the kind of object at hand: do I want some behaviour (aka Cooks objects) or do I want a data sink. In the former case, and in line with what is suggested in the clean code book, it is arguably the best way to tell the object to do what is necessary rather than to inspect (get) the data it holds and do it externally in some other object/method. But even in the latter case, I think it is important to give great attention to hiding the internal representation from external access and to only allow very focussed access to the data itself. It could and has been argued that restricting the access to the stored data via getter methods is tedious (see e.g. the discussion in getters/setters/fuxors) and that allowing public access to members is allright, but looking at the issue from a data abstraction point of view it simply boils down to the question whether you want or need data abstraction or not.

Second, I’ve recently seen these two postings on the merits of the Zope Component Architecture: The emperors new clothes and the reply The success of the ZCA. Malthe asks why one should use the ZCA to override the use of a particular implementation with another instead of using some kind of reloading (or rather says that the latter is the preferrable approach). Relating this to Cooks article, Malthe could be paraphrased roughly as: we have ADTs all over the place and we only should allow only one implementation per ADT (this is what the type system would guarantee in other systems). If you want another implementation (of some interface, as Cook shows for his objects), you should reload the object defintion with the one you want. The use of the ZCA, however, is directly related to the very idea of object oriented programming in the way Cook defines it: you only have interfaces that are the relevant defining characteristics of objects (values) and hence, the use of the ZCA is the way to deal with multiple implentations in Zope (or Python). For me, all I can say is that I’m happy that the ZCA and hence the ability to easily intermingle multiple implementations is there (then again, with me reading computer science theoretic articles I’m arguably not of the angry web designer type whose benefit Malthe is arguing for).

There is another, more puzzling aspect of the article to me. After some considerations, I have to conclude that of all OO languages I happen to know, it’s really only Java that seems to be object oriented in Cooks view of the world. This is because in Java, you can define a method to return objects satisfying an interface. In addition, in dynamically typed languages like Python, Ruby, or CLOS, you could try to come away with duck typing, but it’s arguably only Python which tries to take it to the heart (for instance in CLOS, most values you’re gonna deal with are non-CLOS values and you even have an ETYPECASE statement, which is a switch-statement on type distinction). Funny enough, Cook finishes his Smalltalk analysis with the statement that “one conclusion you could draw from this analysis is that the untyped lambda calculus was the first object-oriented language”. But besides the point how some language is “more OO” than another, there is also to the point that in order to program truly object-oriented, you shouldn’t (and in Cooks world really can’t) rely on type checks, because the whole point of using objects as data abstraction is to rely on behaviour.


Space vs. tabs vs. Emacs

Betrand Mathieus blog article on cleaning up trailing whitespace comes in handy for me as I’ve frequently run into problems with the Python test coverage tool stumbling over trailing whitespace. It also reminded me of an emacs snippet I recently installed to detect a mix of space and tabs in my Python buffers — I do set (setq indent-tabs-mode nil) in my .emacs for python-mode, however, I still occasionally somehow manage to insert some tabs in my source buffers. So I came up with the following snippet which validates that a buffer in python-mode doesn’t contain any tabs. It’s hooked up with the very general write-file-hook, but there is no python-specific hook on saving buffers. In case the buffer does contain any tab, it will leave the point (the place where your cursor will be) at the fount tab.


; code snippet GUID 32F94179-A86B-4780-8645-8A526BD8533A
(defun no-tab-validation (buffer)
  (interactive "bValidate buffer:")
  (save-excursion
    (unless (equal (buffer-name (current-buffer)) buffer)
      (switch-to-buffer buffer))
    (if (re-search-forward "\t" nil t)
    (error "Buffer %s contains tabs" buffer))))

(defun py-tab-validate-on-save-hook ()
  (when (equal mode-name "Python")
    (no-tab-validation (buffer-name (current-buffer)))))

(add-hook 'write-file-hook 'py-tab-validate-on-save-hook)

;;; inhibit tabs in some modes
(defun set-indent-tabs-mode (value)
  "Set indent-tabs-mode to value."
  (setq indent-tabs-mode value))

(defun toggle-tabs ()
  "Toggle `indent-tabs-mode'."
  (interactive)
  (set-indent-tabs-mode
   (not indent-tabs-mode)))

(defun disable-tabs-mode ()
  (set-indent-tabs-mode nil))

(add-hook 'sgml-mode-hook 'disable-tabs-mode)
(add-hook 'xml-mode-hook 'disable-tabs-mode)
(add-hook 'python-mode 'disable-tabs-mode)


python-mode vs. slime

I’ve become a python programmer, too, lately, due to a job change. Python is a fine language so far, although to me it’s mostly just like Ruby, though with even less functional flavour. However, just as with Ruby, I’m really missing slime, the superior lisp interaction mode for Emacs, when hacking python code. I could now start to write down a list of things I’m missing (which I’ve intended to do), however, Andy Wingo spares me the hassle, as he has just written an excellent article on slime from a python programmers view.

However, I would like to elaborate a little on the main difference for me: the client/server socket approach of slime. Let me briefly recapulate what this implies: slime consists of two parts, a client written in Emacs lisp and a server written in Common Lisp (AFAIK there is at least also an implementation for clojure, maybe also one for some scheme implementation). In order to use slime in it’s full glory, it’s hence required that you have a common lisp process running which in turn runs the slime server part. If you now fire up slime, you’ll get an interaction buffer over which you can access the REPL of the lisp process, which in python would be the interpreter prompt. You can then interact with the lisp process, evaluating pieces of code from your lisp source code buffer directly in the connected lisp process. What is incredibly useful for me is that you can not only start a new lisp process but also connect to an already running lisp process, given that it has the slime server started (this is obviously mainly useful if the lisp implementation you use has multi-threading capabilities). I use it to connect to a running web server application, which I can then inspect, debug and modify. Modification includes redefinition of functions, macros and classes, which of course is also a particular highlight of Common Lisp. I would like to cite a comment of the reddit user “fionbio” he made wrt. to the linked article: In fact, Python language wasn’t designed with lisp-style interactive development in mind. In CL, you can redefine (nearly) anything you want in a running program, and it just does the right thing. In Python, there are some problems, e.g. instances aren’t updated if you modify their class. Lisp programmers often, though not always, refer to various things (functions, classes, macros, etc.) using symbols, while Python programs usually operate with direct references, so when you update parts of your program you have much higher chances that there will be a lot of references to obsolete stuff around.

To complement Bill clementsons excellent article series on slime a little, I’m going to describe how I’m using/configuring python-mode to make it match my expectations a little closer. Essentially I would like to access my python process just as I would with slime/Common Lisp, but that’s not possible. The reason, btw., is nearly unchanged: I need to code on a web server app (written in Zope) which may not even run on the same machine I’m developing on. Let’s first cover the simple stuff: To enable a reasonable command interface to the python interpreter, I require the ipython emacs library. If the python interpreter runs locally, I also use py-complete, so that I can complete my code at least a little. Unfortunately, this breaks when the python interpreter doesn’t run locally, because the py-complete needs to setup some things in the running python process, which it does by writing to a local temp file and feeding it to the python process. Unfortunately, the code in py-complete lacks customizability, i.e., you can’t specify where that temp file should be located — I should be able to come up with a small patch in the near future, which I will add below. Finally, I also require doctest-mode as a support for writing doctests, but that’s not really relevant.

Now, on to the more involved stuff: I introduce some new variables and a new function py-set-remote-python-environment, which uses the those variables to do a remote call (via ssh) to python. This at least allows me to do things like setting py-remote-python-command to “/home/schauer/zope/foo-project/bin/instance” and py-remote-python-command-args to “debug”, so that I can access a remote debug shell of my current zope product. That alone will only allow me to fire up and access the remote python, so I could now develop the code locally, having it executed remote. More typical though is that you would also want to keep the code on the remote machine, too: for this I use tramp, a package for remotely accessing files/directories from within emacs. In combination, this allows me to edit and execute the code on the remote machine. It is still nowhere near what is possible with slime, but at least it allows me to persue my habit of incremental and interactive development from within my usual emacs installation (i.e., it doesn’t require me to deal with any Emacs related hassle on the remote machine).


;;; python-stuff.el --- python specific configuration

(when (locate-library "ipython")
  (require 'ipython))

(when (locate-library "doctest-mode")
  (require 'doctest-mode))

(defvar py-remote-connect-command "ssh""*Command for connecting to a remote python, typically \"ssh\"")
(defvar py-remote-connection-args '("user@remotemachine")"*List of strings of connection options.")
(defvar py-remote-python-command "python""*Command to execute for python")
(defvar py-remote-python-command-args '("-i")"*List of strings arguments to be passed to `py-remote-python-command`.")
(defvar py-remote-python-used nil"Remember if remote python is used.")

(defun py-set-remote-python-environment ()
  (interactive)
  (let ((command-args (append py-remote-connection-args 
                  (list py-remote-python-command)
                  py-remote-python-command-args)))
    (setq py-python-command py-remote-connect-command)
    (setq py-python-command-args command-args))
  (setq py-remote-python-used t))

(when (locate-library "py-complete")
  (autoload 'py-complete-init "py-complete")
  (defun my-py-complete-init ()"Init py-complete only if we're not using remote python"
    (if (not py-remote-python-used)
        (py-complete-init)))
  (add-hook 'python-mode-hook 'my-py-complete-init))

Page 1 of 1, totaling 4 entries