The Image Sourcerer

What's Good About Clojure?

What is Clojure?

Over the past few years we have seen a change in the Java scene: there are now many languages that, although not directly related to Java, run on the Java Virtual Machine (JVM). Some of these are scripting languages that interact with the JVM through the Java Scripting API (JSR-223) that was introduced with Java 6; examples include JavaScript/Rhino and BeanShell. Others are languages that compile down to bytecode - the 'machine language' of the JVM. Clojure belongs to the latter class of language although, as we shall see, it has a very dynamic flavour to its interaction with Java. In the rest of this article, I assume that most readers will have more experience of Java, which has been one of the most widely used programming languages for over a decade, than of LISP, which although much older (hailing from the 1950s) has always been a niche language.

Before proceeding any further, I should like to say something about what Clojure is not. Although Clojure is pronounced the same as 'closure', it is very different from the 'closure' feature of programming languages. Closures (with an 's') are functions whose free variables are bound in the lexical environment in which the closure was defined. I mention this because the benefits of closures are often discussed in the context of Java, where they have been proposed as a new feature for Java 7. For now, you can create only something like a poor man's closure in Java by combining the use of inner classes and locally scoped final variables. Don't worry if this is not clear - it is rather esoteric and not important for the discussion of the Clojure programming language - I just wanted to highlight that these are two completely separate topics.
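For readers who would like to see a closure (with an 's') in action, here is a minimal sketch written in Clojure itself; the syntax is explained later in the article. The names make-adder and add-five are made up for this illustration.

```clojure
;; make-adder is a hypothetical name chosen for this illustration.
;; The anonymous function created by fn "closes over" n: it can still
;; see the value of n after make-adder has returned.
(defn make-adder [n]
  (fn [x] (+ x n)))

(def add-five (make-adder 5))

(add-five 10) ; => 15
```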

Clojure is also different to Clozure, which is an implementation of LISP that is not based on the JVM. In fact Clozure is an evolution of Macintosh Common LISP, although it is now also available on other platforms.

Now that we have described what Clojure is not, let's say a little more about what it actually is: Clojure is a dialect of LISP with a bias towards the purer style of functional programming, as opposed to the more procedural state-changing style that Common LISP also permits. But before you start complaining about the number of parentheses in LISP, there are two important things to note: firstly, Clojure code contains fewer parentheses than Java for the same functionality; and secondly, Clojure has streamlined the syntax of Common LISP to make code appear simpler and more readable. It does take a little while to get used to if you are accustomed to reading Java, but the succinctness of LISP is part of its power and appeal.

LISP has a far more uniform syntax than most other languages, as all code (apart from a small number of special forms) takes the form:

(function arg₁ arg₂ ... argₙ)

So, for example, to add one and two, instead of writing 1+2 as in most programming languages, you would write:

(+ 1 2)

This looks odd to us in the case of arithmetic expressions, as we are so accustomed to seeing in-fix notation, but the pre-fix notation (where the functor precedes the arguments) looks very natural in most other cases. It also leads to a very uniform treatment of code, which is great when you want to save time by generalising and adding new layers of abstraction. Even in arithmetic, where at first we might think the approach is misplaced, it turns out to be very useful: arithmetic expressions written with a pre-fix notation do not require precedence rules. (The precedence rules allow us to omit parentheses in some cases; for example the expression 4 + 2 × 3 evaluates to 10 rather than 18 because multiplication binds arguments more strongly than addition. In other words, multiplication takes precedence over addition.)
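To make the point about precedence concrete, here is a small sketch of the two possible readings of 4 + 2 × 3 written in pre-fix notation. Nothing here goes beyond the core arithmetic functions.

```clojure
;; In pre-fix notation the nesting itself says which operation happens
;; first, so no precedence rules are needed.
(+ 4 (* 2 3))   ; multiply first, then add  => 10
(* (+ 4 2) 3)   ; add first, then multiply  => 18

;; Because + is an ordinary function, it also accepts more than two arguments.
(+ 1 2 3 4)     ; => 10
```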

Let's try writing a little Clojure code. Suppose we want to write a function to determine whether the input is a palindrome. A palindrome is a word that reads the same backwards as it does forwards, for example 'pip', 'noon' or 'madam'.

There are many possible ways of doing this. Here's how I might do it in Java:

public static boolean isPalindrome(String str) {
    for (int i = 0; i < str.length() / 2; i++) {
        if (str.charAt(i) != str.charAt(str.length() - 1 - i)) {
            return false;
        }
    }
    return true;
}

The basic idea should be clear: we take a string and iterate through it one character at a time, comparing the characters at the front of the string with those at the back. As soon as we find a mismatch we return false, as we then know that the string cannot be a palindrome. However, if we get as far as the centre of the string without finding a mismatch then the input string must be a palindrome, so we return true. Notice that if we have an even number of characters in the input string then we check the first half of those characters for a mismatch with the second half, but if we have an odd number of characters, there is no need to check the middle character against itself.

Here is a Clojure function to do the same thing:

(defn palindrome? [s]
  (or (<= (count s) 1)
    (and (= (first s) (last s))
      (palindrome? (rest (butlast s))))))

The defn defines a function, named palindrome?, which takes a single parameter, s. (The question mark at the end of the function name is merely a convention for functions that return a boolean value.) The body of the function then follows, which computes and returns a boolean value as the result of the 'or' expression. Note that you don't need to explicitly say 'return', as the last value computed in the body of the function will always be returned. The function definition is really a declarative statement of what a palindrome is: A word is a palindrome if it has a length of 1 or less, or its first and last letters match and the mid-section formed by removing the first and last letters is also a palindrome.

The Clojure definition of palindrome? is a recursive one, which is idiomatic LISP but not idiomatic Java. You can, of course, take the same approach in Java, but it doesn't have the same elegance as the Clojure solution:

public static boolean isPalindrome(String str) {
    return str.length() <= 1
        || (str.charAt(0) == str.charAt(str.length() - 1)
            && isPalindrome(str.substring(1, str.length() - 1)));
}

An important point to note is that although we wrote the palindrome function while thinking about words represented as strings, the Clojure version will work, without any code changes, for other types of sequence too. Here are some examples tested on the Clojure REPL (Read-Eval-Print Loop). They show that the function can be applied to strings, lists and vectors, which is not true of the Java equivalent.

user=> (palindrome? "noon")
true
user=> (palindrome? "midday")
false
user=> (palindrome? '(1 2 1))
true
user=> (palindrome? [1,2,2,1])
true
user=> (palindrome? [1,2,3,1])
false

Furthermore, we can now also use our palindrome? function as a filter and apply it to other sequences. This is natural in LISP, but in Java it would require wrapping the method in an object or using the reflection APIs, both of which are a little convoluted and therefore not for everyday use. Here is an example in which we supply a list of strings and ask which of them are palindromes:

user=> (filter palindrome? '("noon" "midday" "pop" "seven" "able was i ere i saw elba"))
("noon" "pop" "able was i ere i saw elba")

Key Features of Clojure

There are many LISP implementations available: some are commercial products with high price tags and others are freely available as open-source projects. So why would you choose Clojure?

Some key features of Clojure are:

Programs as Data
Concurrency Support
Lazy Sequences
Java Interoperability

Programs as Data

Clojure, like other LISPs, treats programs as data. This gives the language a dynamic nature to which many other languages can only aspire. For example, we can construct a list of data items consisting of the symbol '+', followed by the numbers 1 and 2. Then we can evaluate the list and treat the '+' as a function:

user=> (eval (list '+ 1 2))
3

To understand the significance of this, consider the implications if XML, a data exchange language, were also executable as a program. Actually, the use of eval is often discouraged as bad practice because, well, it's a bit too powerful and you don't quite know what you might be letting yourself in for when you evaluate some arbitrary expression. The encouraged idiomatic alternative is to use higher-order functions such as map, which takes a function as an argument and applies it to each member of a sequence. For example, the following REPL interaction shows first that (range 10) generates the numbers from 0 through to 9, and then that we can easily generate 1 through to 10 by mapping the function inc, which takes a number and increases it by 1, over that sequence.

user=> (range 10)
(0 1 2 3 4 5 6 7 8 9)
user=> (map inc (range 10))
(1 2 3 4 5 6 7 8 9 10)

Concurrency Support

When you first hear about Clojure, you doubtless hear about the support for concurrency. (This was not my driving concern, but it is nevertheless an important one.) Writing programs with concurrent threads is recognised as being difficult, but with a clean division of labour among concurrent threads there is a greater opportunity for the JVM to make good use of multi-core CPUs. A functional programming style discourages the sharing of program state, making it easier to identify sections of code that are parallelizable. Clojure data structures are immutable, which prevents one thread from changing state that is accessed by another. However, it is possible to arrange for threads to share state using Software Transactional Memory. This basically means that you can create and share references to data structures that are subject to change but, whenever you want to change the state, the change has to take place in a portion of code demarcated inside a transactional boundary. So you treat mutable state in the memory of the JVM in the same way that you would normally treat changes to a database.
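As a rough sketch of what such a transactional boundary looks like, the following uses Clojure's ref, dosync and alter. The account names and the transfer! function are hypothetical, invented only for this illustration.

```clojure
;; Hypothetical example: two account balances held in refs.
(def account-a (ref 100))
(def account-b (ref 0))

(defn transfer!
  "Moves amount from one ref to another inside a single transaction."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! account-a account-b 30)

@account-a ; => 70
@account-b ; => 30

;; Changing a ref outside a transaction is an error:
;; (alter account-a + 1) throws an IllegalStateException.
```

Both alterations happen inside one dosync, so another thread reading the two refs within its own transaction should never see the money deducted from one account but not yet added to the other.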

Lazy Sequences

Lazy sequences are particularly interesting for Clojure because they are a feature that, although available in some other functional languages, is not built into Common LISP. A lazy sequence specifies how to generate the next member of a sequence, but does not actually generate it until it is needed. So, for example, the sequence of counting numbers can be specified as:

(defn counting-numbers [] (iterate inc 1))

This is an infinite sequence, so to work with it you need to pick off a finite number of elements of interest. For example, to take the first ten, you would do the following:

user=> (take 10 (counting-numbers))
(1 2 3 4 5 6 7 8 9 10)

You can apply filters to infinite sequences too, so to take the first 10 even numbers, you can write:

user=> (take 10 (filter even? (counting-numbers)))
(2 4 6 8 10 12 14 16 18 20)

Lazy sequences have a magic feel about them because they enable you to model infinite data structures in a machine of limited capacity. In some cases this might help you to reason in a more rigorous way, because mathematical definitions of structures often deal with infinite data structures. Another practical advantage of laziness is the ability to delay a computation until it is needed.
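The idea of delaying a computation can be sketched by building a lazy sequence by hand with lazy-seq. The function name fib-seq is made up for this illustration, and the Fibonacci numbers are just a convenient example of a sequence defined in terms of its own earlier members.

```clojure
;; fib-seq is a hypothetical helper for this illustration.
;; lazy-seq wraps its body so that it runs only when a consumer
;; actually asks for the next element.
(defn fib-seq
  ([] (fib-seq 0 1))
  ([a b] (lazy-seq (cons a (fib-seq b (+ a b))))))

(take 10 (fib-seq))
;; => (0 1 1 2 3 5 8 13 21 34)

;; The sequence is realized only as far as the first value over 1000.
(first (filter #(> % 1000) (fib-seq)))
;; => 1597
```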

Java Interoperability

Given that LISP has been more of a niche language until now, one of the problems has always been the limited access to libraries compared to a mainstream language such as Java. Where libraries are available, they are usually proprietary and often expensive to use. Through the Java interoperability of Clojure, you have full access to the same range of libraries that are available to the Java programmer, as well as the cross-platform behaviour of Java.

Here are some illustrative examples that show the close interoperability between Clojure and Java.

Object Creation

First, to create an instance of a Java object, you simply use the special form new:

user=> (new java.util.Date)
#<Date Mon May 03 14:20:43 BST 2010>

There is also another, shorter, form that creates an instance of a Java object: you add a '.' to the end of the class name:

user=> (java.util.Date.)
#<Date Mon May 03 14:23:10 BST 2010>

Method Invocation

You invoke an object's methods by using the dot special form, in which the dot acts as an 'invoke' function. In the following, we create an instance of java.util.Random and bind it to the symbol r, then invoke the method nextDouble.

user=> (def r (java.util.Random.))
#'user/r
user=> (. r nextDouble)
0.3299915263686519

There is also a shorter, and more natural, form for method invocation, where the method becomes the functor:

user=> (.nextDouble r)
0.9000876116306945

Static Variables and Methods

You can access static methods by using a '/' after the class name. In the following example, we retrieve the system properties using a static method call, and then look up the java.runtime.version property:

user=> (.getProperty (System/getProperties) "java.runtime.version")
"1.6.0_20-b02"

You can access static variables in a similar way:

user=> Math/PI
3.141592653589793

Notice that the java.lang package is imported by default, so the reference to the Math class did not have to be package-qualified. In the other examples we have package-qualified the classes, but we can save the effort of doing this each time with an import:

user=> (import java.util.Random)
java.util.Random
user=> (def r (Random. (System/currentTimeMillis)))
#'user/r
user=> (.nextInt r 100)
20

In the example above, we import the Random class from the java.util package. Then we create an instance of Random, using the constructor that takes a long value - in this case the system clock time - as the randomizing seed. Once we have created the instance we can start generating random numbers; for example, using the nextInt() method.

Here, we have shown the basics of Java interoperability. There is more to discover than the scope of this article will allow, but my impression is that it is very well designed and, with some practice, quite natural.

There is one final point I would like to mention in this section: Clojure functions always implement the java.lang.Runnable interface, so you can easily pass Clojure functionality over to the Java side for execution, for instance as the body of a new Thread. Neat!
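Here is a minimal sketch of that last point. The names result and worker are hypothetical, and the atom (a simple mutable reference type not otherwise covered in this article) is used only as a convenient way to carry a value back from the thread.

```clojure
;; result is a hypothetical atom used to carry a value back from the thread.
(def result (atom nil))

;; A Clojure function is passed directly to the java.lang.Thread
;; constructor, which expects a Runnable.
(def worker (Thread. (fn [] (reset! result (* 6 7)))))

(.start worker)
(.join worker)   ; wait for the thread to finish

@result ; => 42

;; Any Clojure function passes the Runnable test.
(instance? Runnable (fn [])) ; => true
```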

Where's the Catch?

I'm a big fan of Clojure, and with its close relationship with Java I think it stands a good chance of bringing LISP more into the mainstream (and a better chance than, say, Armed Bear Common LISP, which is another great project that brings LISP to the JVM). However, it can't all be good news; otherwise everyone would be switching to Clojure straight away as their mainstream language, and I don't see that happening. So what are the down sides?

No Object Orientation

For Java programmers, the switch from an object-oriented to a functional paradigm is quite considerable, and it is not really made any easier by Clojure's lack of support for classes that are native to Clojure. Common LISP has CLOS, the Common LISP Object System, which is more powerful than most object-oriented languages, but Clojure does not have an equivalent feature. It supports a rich set of data structures including lists, sets, vectors and maps, but the way of creating anything like a new type (without resorting to Java interoperability) is by creating a struct. A struct allows you to associate named properties with an object (the property values could themselves be structs). Separately, Clojure allows you to impose a hierarchy on names with derive, with the predicate isa? for testing inheritance. However, I think that Java programmers are likely to feel there is something missing compared to the approaches they already use.
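To give a flavour of what this looks like in practice, here is a small sketch. The person struct and the ::employee and ::person keywords are invented for the example, and it assumes that the hierarchy is declared with derive before it is queried with isa?.

```clojure
;; person is a hypothetical struct with two named properties.
(defstruct person :name :age)

(def ada (struct person "Ada" 36))

(:name ada)   ; => "Ada"
(:age ada)    ; => 36

;; An ad-hoc hierarchy, declared with derive and queried with isa?.
;; The keywords are made up for this example.
(derive ::employee ::person)

(isa? ::employee ::person)  ; => true
(isa? ::person ::employee)  ; => false
```

Note that the hierarchy relates names (here, namespace-qualified keywords) rather than classes, which is part of why it can feel unfamiliar to a Java programmer.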

No Tail-Call Optimization

Functional languages make heavy use of recursion to iterate over data structures. Each recursion, as a new function call, generally requires a new allocation on the call stack. However, many languages recognise a common pattern of recursion and optimize for it, so that recursion can be used without the demands on the call stack. This is called tail-call optimization and in such cases the demand on the call stack remains constant regardless of the number of times the function recurses. Unfortunately, because the JVM does not provide tail-call optimization, Clojure cannot apply it automatically. It does, however, have a special construct called loop/recur, which the programmer can use to have the same effect as tail-call optimization.
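As an illustration, the palindrome check from earlier in this article could be rewritten with loop/recur along the following lines. This is only one possible sketch; the name palindrome-iter? is made up, and converting the input to a vector is my own choice so that the last element can be read cheaply with peek.

```clojure
;; palindrome-iter? is a hypothetical name for this illustration.
;; recur jumps back to the enclosing loop without growing the call stack.
(defn palindrome-iter? [s]
  (loop [v (vec s)]
    (cond
      (<= (count v) 1)          true
      (not= (first v) (peek v)) false
      ;; drop the first and last elements and go round again
      :else (recur (subvec v 1 (dec (count v)))))))

(palindrome-iter? "noon")    ; => true
(palindrome-iter? "midday")  ; => false
(palindrome-iter? [1 2 2 1]) ; => true
```

The recur has to appear in tail position, which is the case here because it is the last thing evaluated in the final branch of the cond.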

Difficult to learn

One of the criticisms levelled at LISP is that it is difficult to learn. This is partly because it demands a different way of thinking compared to programming in procedural languages, and partly because there is a whole new library of functions to learn and become accustomed to using. The different way of thinking certainly takes some getting used to, particularly the increased use of recursion. In this, the REPL is your saviour, because it is extremely easy to perform ad-hoc testing of your code so the process of code development becomes a tightly integrated loop of code-a-little, test-a-little, code-a-little, ...

Compared to Common LISP, Clojure has fewer functions so there is less to learn, and the Java programmer need not feel lost at sea (or should that be 'lost at C'?) with the plethora of Java libraries still at their fingertips.

Clojure Development Environments

There are now several development environments for Clojure, including plug-ins for the main Java IDEs. I tried out some of them (and read about others). Here are my thoughts:

Eclipse

Eclipse has a plug-in called CounterClockwise to support development in Clojure. You get syntax colouring, code completion and an integrated REPL. I tried it and it worked, but somehow the launch configurations and REPL integration didn't feel natural to me. Maybe I didn't give it enough time to work out how to use it properly.

NetBeans

NetBeans has a plug-in called Enclojure (which is without doubt the best name for any of the plug-ins). I didn't try this out, but read good reports of it.

IntelliJ

IntelliJ IDEA has a plug-in called La Clojure, which is the one that I prefer. It works with the free version of IntelliJ IDEA, so you don't have to pay for the extra features of the full version of the IDE if you intend to use it for Clojure only. La Clojure provides syntax colouring and code completion, but for me the clinching factor is that the integration with the REPL feels right. With CounterClockwise I found myself starting Clojure in a new REPL over and over, whereas La Clojure allowed me to continue interacting with the same REPL.

Emacs

Emacs (with the SLIME and SWANK extensions) is the hard-core alternative for real LISPers. Many don't like the Emacs user interface, as it takes a good while to get used to the keyboard bindings, but for code editing and REPL interaction it is very slick. Installation of Emacs and the extensions requires some effort, but if you are working with Windows there is a project called Clojure Box that installs all you need in a single, easy-to-use installation.

I liked the Emacs approach (and I have used Emacs for LISP coding in the past) but this time round I felt it was not so natural for project navigation compared to the modern IDEs. I know you can open a window and list the current buffers but a tree-structured project navigator seems more natural.

If you work on a Mac, I understand Aquamacs is good as a customised Emacs with the Aqua look and feel.

Vim

The Vim editor also has support for Clojure editing through the VimClojure project. If you like the idea of a lightweight editor rather than an IDE, but have no prior experience of Emacs, then this could be the way to go.

jEdit

Apparently, there is also a Clojure editing mode for jEdit. I haven't tried it and don't know if it extends to the inclusion of a REPL.

Conclusions

I am excited about Clojure. Many people think of LISP as some quirky language from a bygone era, but I think with the close integration to Java, it will be possible to write applications that demonstrate otherwise. LISP coders know how lucky they are to be coding in LISP, but with Clojure it could mean that LISP coding becomes more acceptable in the mainstream. That's a big thing and it will be a welcome change.

Simon White