Programming | Entries from December 2014

Dependency inversion in Clojure

Posted by Holger Schauer on Tuesday, December 2. 2014

The problem

I was recently reading a nice German book on Effective Software Archictecture by Gernot Starke and stumbled upon a discussion of the dependency inversion principle, which got me thinking. Gernot Starke first discusses the problem with an allusion to traditional procedural programming (translation mine):

Classical designs for procedural languages show a very characteristic structure of dependencies. As (the following figure) illustrates, these dependencies typically start from a central point, for instance the main program or the central module of an application. At this high level, these modules typically implement abstract processes. However, they are directly depending on the implementation of the concrete units, i.e. the very functions. These dependencies are causing big problems in practice. They inhibit changing the implementation of concrete functions without causing impacts on the overall system.

Classical dependencies in procedural systems He then goes on to introduce the idea of programming against abstractions and introduces the idea of the dependency inversion principle, first coined in Bob Martin’s DIP article (see also another thorough discussion in Brett Schucherts article on DIP). Basically, the idea is that the integrating process refers only to abstractions (i.e. interfaces) which are then implemented by concrete elements (classes), cf. the next figure.

Integrate with abstractions When I take a look at some of my recent Clojure code or at some older code I’ve written in Common Lisp, I immediately recognize dependencies that correspond to those in a classical procedural system. Let’s go for an example and take a look at one specific function in kata 4, data munging:

(ns kata4-data-munging.core
    (:require [kata4-data-munging.parse :refer [parse-day]]
              [clojure.java.io :as 'io]))

(defn find-lowest-temperature
    "Return day in weatherfile with the smallest temperature spread"
    [weatherfile]
    (with-open [rdr (io/reader weatherfile)]
         (loop [lines (line-seq rdr) minday 0 minspread 0]
        (if (empty? lines)
            minday
            (let [{mnt :MnT mxt :MxT curday :day} (parse-day (first lines)) ;<-- dependency!
              curspread (when (and mnt mxt) (- mxt mnt))]
            (if (and curday curspread
                  (or (= minspread 0)
                  (< curspread minspread)))
               (recur (next lines) curday curspread)
               (recur (next lines) minday minspread)))))))

The dependency here is on the concrete implementation of parse-day, you can basically ignore the rest for the argument here. Given that this was a small coding kata, this is not unreasonable (and in the course of the kata, the code changes to be more general), but the issues here are obvious:

if we would like to parse a weather-file with a different structure, we have to change find-lowest-temperature to call out to a different function,
if the result of the new function differs, again we have to change the implementation of find-lowest-temperature,
we also have to change the namespace declaration, i.e. we probably want to require a different module.

Clojure’s built-in solutions

The application of the dependency inversion principle is typically shown in the context of object-oriented programming languages, like Java where you use interfaces and classes implementing those interfaces for breaking the dependency on concrete implementations, cf. the figure above again. But as we’ll see the principle can be applied independently of object-orientation. I’ll discuss higher-order functions, protocols and multimethods as potential solutions.

Higher order functions

For starters and probably painfully obvious is to make use of the fact that Clojure treats functions as first-class objects and supports higher-order functions. This simply means that we can pass the parsing function as an argument to find-lowest-temperature.

(defn find-lowest-temperature
    "Return day in weatherfile with the smallest temperature spread"
    [weatherfile parsefn] ; <-- function as parameter
    (with-open [rdr (io/reader weatherfile)]
         (loop [lines (line-seq rdr) minday 0 minspread 0]
        (if (empty? lines)
            minday
            (let [{mnt :MnT mxt :MxT curday :day}  (parsefn (first lines))
              curspread (when (and mnt mxt) (- mxt mnt))]
            (if (and curday curspread
                  (or (= minspread 0)
                  (< curspread minspread)))
               (recur (next lines) curday curspread)
               (recur (next lines) minday minspread)))))))

This way, we can simply call (find-lowest-temperature "myweatherfile" parse-day) and freely substitute whatever file format and accompanying parse function we need. What does this buy us?

We no longer have to modify find-lowest-temperature when we want to use a different parse-day function.
The namespace containing find-lowest-temperature also no longer requires the (namespace containing the) parse function.

But there is also a down-side: find-lowest-temperature assumes that all parsing functions it will get fed adhere to a signature that is entirely implicit: parsefn needs to take exactly one line and needs to return a map with given key-names. Higher-order functions don’t provide a solution for this per-se, so in order to solve the implicit signature issue we need to look elsewhere. This is nothing Clojure specific: Assuming you’ve passed in an object either as a method parameter or via Setter-Methods or Constructor-Injection (cf. dependency injection), Python’s or Ruby’s duck-typing basically works the same way: the caller of a method simply assumes that the callee offers a method with the right signature. It is the responsibility of the caller (of find-lowest-temperature) to provide a matching function for parse-fn.

However, this actually amounts to just move the problem from one level to the next: now some other level has to decide which concrete parse function needs to be used. This next level will have again the exact same problems: it will depend on both concrete implementations of find-lowest-temperature and parse-day (or any other parse function). If you think this through, it’s obvious that in general at one point or another, you have code that determines which function to call and which parameters to use. The question is only if we can use abstractions or whether we have to use concrete implementations. We’ll return to this issue, that now at some other level you need to handle the problem, later.