This past week I’ve spent some time here and there improving my Weave Sync client for Google Chrome. It all started out as a proof of concept, so initially I only hacked together enough to get it to show some lights on the screen. I decided that if I were to take this any further, I should man up and write some tests for it, because refactoring code or adding features without knowing for sure your stuff still works freaks me out.

I guess nowadays you don’t have to defend the whole test malarkey anymore. I’ve never really minded writing tests as my code grows anyway. I’ve done TDD on a couple of projects before so I’m used to writing tests before or, when not in TDD mode, shortly after writing the code. When fixing a bug, for instance, I tend to always write the test first. This week, though, I was paying the price of graduating a toy project into something a bit more serious. Writing nothing but tests for a while is not that much fun.

QUnit, JSpec

I decided to go with QUnit as the testing framework. I’ve never really liked JUnit-based frameworks anyway, and especially in JavaScript the particular flavour of object-orientedness provided by JsUnit felt out of place. QUnit is from the jQuery folks; it’s used to test jQuery itself. I can highly recommend it: it’s a neat, nimble little library, much like jQuery itself.

The only other contender for me was JSpec, a JavaScript interpretation of RSpec. Their idea is basically to make your tests look like real language. It’s a bit like the doctest concept that exists in the Python world. Doctests combine documentation and unit test code which is great when you want to write well-documented tests that can also serve as an API example document to developers. RSpec and clones go further by merging the natural language and unit test into one. I find this approach intriguing and I want to try it out some time, but for now I decided against it. That said, JSpec also provides a JavaScript API which reads quite nicely. So maybe that’s a good stepping stone…

A mock XMLHttpRequest implementation

The Weave client is obviously all about talking to a server, so I needed a way to mock XMLHttpRequest. JsUnit has a mock implementation of it but it’s pretty much useless. JSpec has a better one, but it’s all tied into JSpec and it doesn’t support Progress Events handlers, something that Mozilla- and WebKit-based browsers do nowadays. Because frankly the readystatechange event is just silly.

So I decided to roll out my own mock XMLHttpRequest implementation, MockHttpRequest. It tries to follow the XMLHttpRequest and Progress Events specs where possible and practical. Of course it itself is 100% tested. Try it, fork it, improve it, critique it and let me know when you do! :)
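To give an idea of what such a mock does, here is a minimal sketch. It is not MockHttpRequest’s actual API: FakeRequest, respond() and fetchGreeting() are made-up names for illustration only. The fake records what the code under test sends and lets the test decide when and how the “server” answers, firing a Progress Events style onload handler.

```javascript
// A hand-rolled stand-in for XMLHttpRequest (hypothetical, illustration only).
function FakeRequest() {
    this.readyState = 0;
    this.status = 0;
    this.responseText = "";
    this.onload = null;
}
FakeRequest.prototype.open = function (method, url) {
    this.method = method;
    this.url = url;
    this.readyState = 1;
};
FakeRequest.prototype.send = function (body) {
    // Nothing goes over the wire; we only remember what was sent.
    this.requestBody = body === undefined ? null : body;
};
// Test-side helper (not part of the real XMLHttpRequest interface):
// pretend the server answered.
FakeRequest.prototype.respond = function (status, text) {
    this.status = status;
    this.responseText = text;
    this.readyState = 4;
    if (typeof this.onload === "function") {
        this.onload({type: "load", target: this});
    }
};

// Code under test takes the request constructor as a parameter so that a
// test can hand in the fake instead of the browser's XMLHttpRequest.
function fetchGreeting(Request, callback) {
    var req = new Request();
    req.open("GET", "/greeting");
    req.onload = function () {
        callback(req.status === 200 ? req.responseText : null);
    };
    req.send(null);
    return req;
}

var received;
var req = fetchGreeting(FakeRequest, function (text) { received = text; });
req.respond(200, "hello");
```

After respond() has been called, received holds the fake server’s answer and req.url tells the test which resource was asked for.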

Code coverage

When you’re writing tests for your code, it’s always a good idea to track code coverage. That way you can make sure your tests hit every code path. This is especially, but not only, useful when you’re writing tests after the fact.

Sadly it seems there aren’t that many tools for that out there. There’s an abandoned Firefox extension that hooks into SpiderMonkey (Mozilla’s JavaScript interpreter) and an abandoned Firebug extension that hooks into Firebug’s profiling functionality. Hooking into an interpreter or profiler makes a lot of sense since you don’t have to change the code in question. Hopefully one of these projects will be revived one day or a similar one will come along. If somebody wants to pick this up, I could perhaps be bothered to help. :)

For now your best shot for JavaScript code coverage seems to be JSCoverage. It takes your existing code tree and makes a copy of it, inserting code to keep track of coverage (essentially a counter that’s incremented every time a line of code runs). It’s crude but effective, so for now I’m happy to get a feel for how complete my tests are.
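Roughly, instrumentation of this kind turns each line into “bump a counter, then run the line.” The following is a hand-written sketch of the idea, not JSCoverage’s actual output; the coverage object, the abs() function and the line numbers are made up.

```javascript
// Hypothetical counter table: line number -> number of times executed.
var coverage = {2: 0, 3: 0, 5: 0};

// Original function:
//   1  function abs(x) {
//   2      if (x < 0) {
//   3          return -x;
//   4      }
//   5      return x;
//   6  }
// Instrumented copy: every executable line bumps its counter first.
function abs(x) {
    coverage[2]++;
    if (x < 0) {
        coverage[3]++;
        return -x;
    }
    coverage[5]++;
    return x;
}

// Stand-in for a test run that only ever passes positive numbers.
abs(4);
abs(7);

// Lines whose counter is still 0 were never reached.
var missed = Object.keys(coverage).filter(function (line) {
    return coverage[line] === 0;
});
```

Here missed ends up containing only line 3: nothing exercised the negative branch, which is exactly the kind of gap a coverage report is meant to reveal.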

In the excellent Coders at Work book, Doug Crockford advises programmers to rewrite their stuff every six months or so. He says rewrite, but I don’t think he actually means that. Developers love rewriting stuff and most of the time it’s absolutely pointless — I know, I’ve been there.

I think what he means is refactoring. Basically streamlining the good parts and getting rid of cruft. In an ideal world, refactoring can be done in small, atomic steps. It should create no or few incompatibilities. And most crucially, it shouldn’t in any way affect the product’s shipping date.

This past week I have given BarTab this treatment. Since its creation in late January, it has grown organically. After a few months of fixing bugs, adding features, releasing early and often, and observing it “in the wild,” it was time to step back and clean it up. And boy did it need cleaning up.

My precondition for this refactoring was that I wouldn’t add any new features. Nada. Zip. Even though it would’ve been very tempting at various stages. I did manage to fix a few lingering bugs, though. In the end I turned over almost every line of code, some even twice. It was a deeply satisfying experience, and I’m glad I stuck by my no-new-features rule. It wasn’t an ideal refactoring, though, in that the result isn’t completely backwards compatible. The old API was horrible; it is now much more symmetric and free of horrible puns.

So BarTab 2.0 (available now as beta) is leaner, meaner, less invasive (no eval() hacks!) and more compatible with other Firefox add-ons. In a lot of ways it’s the BarTab that I should always have written. But you know as well as I do, that’s not how it works. Very few people write perfect code the first time round.

To me, BarTab is the perfect example of why Release Early and Often and the occasional Refactoring works extremely well for small, self-contained pieces of code such as a library, plug-in or extension. It’s far from the first time I’ve done things this way, but it’s certainly turned out very nicely this time.

Today I had one of those it-came-to-me-as-I-was-under-the-shower moments. For some reason, a conversation from a couple of months ago popped back into my mind. Somebody had asked me about associative arrays in JavaScript. To which I simply replied something like, well, they’re just there, built into the language. Just use an object! You could technically even use an array (since JS arrays aren’t really arrays but hash tables with string keys whose strings happen to convert to integers), but that’s not a good idea for various reasons.

This morning I realized that my answer wasn’t entirely complete. Objects work fine as associative arrays a.k.a. hash tables a.k.a. dictionaries if your keys are strings or can be uniquely converted to strings (in which case you could write toString() methods for all the objects you want to use as keys). Python is a bit less restrictive with its dictionaries and only requires that your object be hashable. Immutable built-in types like strings and tuples are hashable, so as long as you can find a one-to-one mapping to those (and implement that in __hash__()), you’re good. The advantage of this system is that string representation and hashing aren’t mixed into one interface.
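To see the restriction in action, here is a small sketch (alice, bob and Point are made-up names). Plain objects used as keys are converted with the default toString(), so different objects end up sharing one slot unless you provide a unique string conversion yourself.

```javascript
var table = {};
var alice = {name: "Alice"};
var bob = {name: "Bob"};

// Both keys are converted to the string "[object Object]" ...
table[alice] = 1;
table[bob] = 2;
// ... so the second assignment overwrote the first.
var slots = Object.keys(table);

// With a toString() that maps each object to a unique string, it works.
function Point(x, y) {
    this.x = x;
    this.y = y;
}
Point.prototype.toString = function () {
    return "Point(" + this.x + "," + this.y + ")";
};

var distances = {};
distances[new Point(0, 0)] = 0;
distances[new Point(3, 4)] = 5;
// A different Point instance with equal coordinates finds the same entry.
var found = distances[new Point(3, 4)];
```

Note that the second half is lookup by value, not by identity: any two points with the same coordinates are the same key.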

The good news is that thanks to the concept of object identity (present in both JavaScript and Python), you can actually write an associative array that accepts arbitrary keys:

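function dict () {
    // Parallel arrays: values[i] belongs to keys[i].
    var keys = [];
    var values = [];

    return {
        get: function (key) {
            // indexOf() compares with ===, i.e. by object identity.
            // For an unknown key it returns -1, so we return undefined.
            return values[keys.indexOf(key)];
        },

        set: function (key, value) {
            var i = keys.indexOf(key);
            if (i === -1) {
                i = keys.length;
            }
            keys[i] = key;
            values[i] = value;
        },

        del: function (key) {
            var i = keys.indexOf(key);
            if (i === -1) {
                // Unknown key: splice(-1, 1) would remove the last entry.
                return;
            }
            keys.splice(i, 1);
            values.splice(i, 1);
        },

        keys: function () {
            return keys.slice();
        },

        values: function () {
            return values.slice();
        }
    };
}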

I’m not going to bother with a Python implementation because it would look almost identical (modulo syntax). Also, if you find yourself wanting to use something like this in Python, you’re probably doing something wrong. Python’s dictionary implementation is insanely fast and the range of immutable types (strings, tuples, frozensets, etc.) should be sufficient.

And come to think of it, so far I haven’t felt the need for something like this in JavaScript either. But it came to me under the shower, so I had to write it down.

A few months ago I wrote a post about JavaScript titled Curly braces are not the problem wherein I pointed out one of JavaScript’s biggest weaknesses: the new operator and how you have to spell an object constructor as well as the methods on the corresponding prototype. Some commenters mistook that post for a critique of the prototype model itself. Far from it: I think the prototype model is great, it’s just the spelling that is awful. Consider this:

function MyObject() {
    /* constructor here */
}
MyObject.prototype = {
    aMethod: function () {
        /* method here */
    }
};

which is alright until you want to inherit from this and add methods:

function YourObject() {
    /* constructor here */
}
YourObject.prototype = new MyObject();
YourObject.prototype.anotherMethod = function () {
    /* another method here */
};

There are several problems with this. First of all, because YourObject inherits from MyObject, it has to be spelled differently. Secondly, we can’t reuse the constructor, at least not without resorting to func.apply() tricks. Thirdly, we have to know what to pass to the constructor of MyObject at definition time.
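The second and third points are easy to overlook, so here is a small sketch of them (Animal, Dog and the log array are made up for illustration). Because the prototype of the “subclass” is an instance of the parent, the parent constructor runs once when the inheritance is set up, without any sensible arguments, and it only runs again for new instances if the child calls it explicitly.

```javascript
var log = [];

function Animal(name) {
    log.push("Animal constructor called with " + name);
    this.name = name;
}
Animal.prototype.describe = function () {
    return "I am " + this.name;
};

function Dog(name) {
    // Reusing the parent constructor requires the call/apply trick.
    Animal.call(this, name);
}
// Runs Animal() right here, at definition time, without a name.
Dog.prototype = new Animal();

var rex = new Dog("Rex");
var description = rex.describe();
```

After this runs, log has two entries: the first one records a call with undefined (from setting up the prototype), the second one the call for Rex.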

It turns out, Doug Crockford not only agrees with me on this but also has come up with a better way. Back in January I thought that we needed more syntax to fix this, but it turns out we need less (by which I mean ditching the new operator). In Vol. III of his excellent Crockford on JavaScript lectures, he defines a constructor maker:

function new_constructor (extend, initializer, methods) {
    var prototype = Object.create(extend && extend.prototype);

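    if (methods) {
        // Object.keys() lists the own enumerable property names;
        // plain objects have no keys() method of their own.
        Object.keys(methods).forEach(function (key) {
            prototype[key] = methods[key];
        });
    }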

    var func = function () {
        var that = Object.create(prototype);
        if (typeof initializer === 'function') {
            initializer.apply(that, arguments);
        }
        return that;
    };

    func.prototype = prototype;
    prototype.constructor = func;
    return func;
}

I’ll let you work out the details of this yourself and instead just show you how you would define the equivalent of the two cases above:

var new_my_object = new_constructor(Object, function () {
    /* constructor here */
}, {
    aMethod: function () {
        /* method here */
    }
});

var new_your_object = new_constructor(new_my_object, function () {
    /* constructor here */
}, {
    anotherMethod: function () {
        /* method here */
    }
});

See how symmetrical both forms are now? And if both object constructors really were to share the same initializer, I could easily define that as a separate function and reuse it.
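To make that last point concrete, here is a sketch that shares one initializer between two constructors. The new_constructor function is repeated (using Object.keys() to enumerate the methods) only so that the snippet runs on its own; init_name, new_animal and new_dog are made-up names.

```javascript
function new_constructor(extend, initializer, methods) {
    var prototype = Object.create(extend && extend.prototype);
    if (methods) {
        Object.keys(methods).forEach(function (key) {
            prototype[key] = methods[key];
        });
    }
    var func = function () {
        var that = Object.create(prototype);
        if (typeof initializer === 'function') {
            initializer.apply(that, arguments);
        }
        return that;
    };
    func.prototype = prototype;
    prototype.constructor = func;
    return func;
}

// One initializer, shared by both constructors.
function init_name(name) {
    this.name = name;
}

var new_animal = new_constructor(Object, init_name, {
    describe: function () {
        return "I am " + this.name;
    }
});

var new_dog = new_constructor(new_animal, init_name, {
    bark: function () {
        return this.name + " says woof";
    }
});

// No `new` needed: the constructors are ordinary function calls.
var rex = new_dog("Rex");
```

rex gets bark() from its own prototype and describe() from the one it extends, and since func.prototype is set, instanceof keeps working for both new_dog and new_animal.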

Btw, if you do any sort of web development, I highly recommend you watch the Crockford on JavaScript talks. They’re not only entertaining but also an excellent lesson in the history of all the technology that makes up the web.

A book on Grok!

February 10, 2010

Carlos de la Guardia has written a book on Grok, a Python web framework I helped kick off and contributed to quite a lot. I was honoured to be asked to write the foreword for it. Here’s what I wrote:

A little less than a year ago, Zope 4 was released. As the successor of the complex, verbose and certainly not agile Zope 3 (now called BlueBream), it was instantly welcomed with much cheer. Naturally I chimed in and announced without much hesitation that the forthcoming 4th edition of my book would already be based on lean and mean Zope 4.

Sadly these were all just April Fools’ jokes.

Sadly? No, I should say luckily. Because something much better came out of Zope land just a few months after we jokingly invented a new Zope version. Grok 1.0 was released. And thanks to Carlos’ tremendous effort, there’s now a book to accompany and celebrate this achievement, too.

There is much to say about this book, but I’m sure you’re eager to get started with your web application. So let me just tell you what I like best about it.

Carlos has managed to capture the Grok spirit in this book. It is concise, not too heavy and doesn’t beat about the bush. Yet it manages to hit all the bases of web development. With a spikey club, of course. It is as smashing as the framework itself and I hope you’ll enjoy both as much as I have.

Have fun!

Chris McDonough has written a book on repoze.bfg! repoze.bfg, or simply BFG, is a web framework written in Python. It’s of Zope pedigree but borrows many ideas from Pylons and Django as well. It’s simple yet powerful, agile, fast, has close to 100% test coverage and is extremely well documented.

Most importantly, however, BFG is true to its motto “pay only for what you eat.” You can use SQLAlchemy, Zope’s object store ZODB, or some other persistence mechanism. You can use Zope-style object traversal, Routes-style URL mapping or a combination of both. You can use Zope Page Templates, Genshi (both through the extremely fast Chameleon engine), Jinja, or your favourite templating language. You can write your application in an extensible manner using declarative configuration, or you can just use the Python API. You can deploy on Google App Engine, Apache and mod_wsgi or anywhere else that supports WSGI. And you needn’t worry about learning about the stuff you’re not using.

BFG is well worth a look, and so is Chris’s book. What’s more, he’s done what I didn’t do with my Zope book and published it under a CC license. That in itself is already a reason for buying one. :)

Programs must be written for people to read, and only incidentally for machines to execute.

Abelson & Sussman, SICP

There’s a programming paradigm which I shall call, for the lack of a better name, bail out early. It’s so trivial that it almost doesn’t deserve a name, not to mention a blog post. Yet I often come across code that would be so much clearer if bail out early was used. Consider some code like this:

function transmogrify(input) {
    if (input.someConditionIsMet()) {
        var result;
        result = processInput(input);
        if (result) {
            return result;
        } else {
            throw UNDEFINED_RESULT;
        }
    } else {
        throw INVALID_INPUT;
    }
}

There’s a lot of if/else going on here. In the bail out early paradigm, you would try to write this without any else clauses. The trick is to sort out the problematic case first:

function transmogrify(input) {
    if (!input.someConditionIsMet()) {
        throw INVALID_INPUT;
    }

    var result;
    result = processInput(input);
    if (!result) {
        throw UNDEFINED_RESULT;
    }
    return result;
}

See how much flatter the structure of that program is? There are some other advantages:

  • Because you no longer put the main flow inside if statements, your program is often easier to refactor. If for instance the sanity checks occur in multiple places of your program, you can simply factor them out into a utility function without messing up the structure of your code.
  • You generally have to indent less. And when you move code around, you have to reindent less — particularly pleasant when you’re coding in Python where indentation is significant.
  • In languages with curly braces like my fake JavaScript above, you don’t have to worry about blocks spanning many, many lines, thus putting the opening and closing brackets so far apart that you can no longer tell what the closing bracket is actually closing.
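As a sketch of the first point, the sanity check can move into a small helper that throws, and every caller stays flat. requireValid and frobnicate, the doubling “processing” step and the two error values are made up for illustration.

```javascript
var INVALID_INPUT = new Error("invalid input");
var UNDEFINED_RESULT = new Error("undefined result");

// The shared sanity check: bail out (by throwing) or return normally.
function requireValid(input) {
    if (!input || typeof input.value !== "number") {
        throw INVALID_INPUT;
    }
}

function transmogrify(input) {
    requireValid(input);

    // Stand-in for the real processing step.
    var result = input.value * 2;
    if (!result) {
        throw UNDEFINED_RESULT;
    }
    return result;
}

function frobnicate(input) {
    requireValid(input);
    return input.value + 1;
}
```

Had the check been the outer if of an if/else pyramid, pulling it out would have meant restructuring both functions; here it is a one-line change in each.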

JavaScript has picked up lots of pythonisms over the last few years, which is obviously a Good Thing(tm). Aza Raskin of Mozilla has now created Pyscript, a version of JavaScript sans curly braces. As a fellow Pythonista I too find curly braces aesthetically unpleasant. But I don’t think it’s the pressing issue. At the end of his post, Aza asks “What other ways can we make Javascript syntax prettier and more readable?” Let me tell you by pointing to the elephant in the room.

Writing a class/object in JavaScript, especially “subclassing,” weirds me out.

I just can’t make my peace with the functions-implicitly-become-object-constructors idea. I can see how it might make sense in a prototype world. But once you use functions to define objects, you must use the new operator for instantiation. This means there are some functions you call right away and some you don’t. I just don’t understand why it would be so bad to have a new language construct for defining objects.

The lack of such a one-and-only language construct leads to a plethora of ways to define object methods. Some like doing it this way:

function MyObject() {
  /* constructor here */
  this.aMethod = function() {
    /* method here */
  }
}

while others like to monkey-patch them in, like so:

function MyObject() {
  /* constructor here */
}
MyObject.prototype.aMethod = function() {
  /* method here */
}

or even:

MyObject.prototype = {
  aMethod: function() {
    /* method here */
  }
}

I’m sorry, that’s just too many ways of doing something all too common in object-oriented languages: defining objects. Not to mention the prototype-less one:

var MyObject = {
  aMethod: function() {
    /* method here */
  }
}

(Of course, when using Mozilla’s JavaScript engine you can also monkey patch a prototype into this via MyObject.__proto__ = {...}. In fact, the Mozilla folks like using __proto__ all the time to specify the object’s baseclass, uh, I mean baseprototype.)
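These spellings aren’t even equivalent. One practical difference, sketched below with made-up names: methods assigned inside the constructor are created anew for every instance, while methods on the prototype are shared by all instances.

```javascript
// Style 1: method created inside the constructor.
function PerInstance() {
    this.aMethod = function () {
        return "hello";
    };
}

// Style 2: method attached to the prototype.
function Shared() {
}
Shared.prototype.aMethod = function () {
    return "hello";
};

var a = new PerInstance();
var b = new PerInstance();
var c = new Shared();
var d = new Shared();

// Every PerInstance object carries its own copy of the function ...
var perInstanceSame = (a.aMethod === b.aMethod);   // false
// ... whereas all Shared objects find the same function via the prototype.
var sharedSame = (c.aMethod === d.aMethod);        // true
```

So the choice of spelling quietly decides memory use and whether patching the prototype later affects existing objects, which is one more reason a single blessed construct would help.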

I know what you’re going to say now. People could just settle for one way and impose that as a coding convention. But why hasn’t that happened? My suspicion is that it’s because none of the choices is truly great. The language itself doesn’t encourage a particular choice more than any other, and that’s bad. Arbitrariness like that is just one step away from Perl.

Coming back to Aza’s Pyscript, I think it’s a useful exercise because it shows how much you can improve JavaScript with just a few lines of, uh, JavaScript. Perhaps I should give it a whirl and come up with a language construct for creating prototype-based objects, including a decent inheritance syntax. What do you think that should look like?

Update: I’ve written a follow-up post that contains a solution.

Attention, attention! This is a service announcement!

I haven’t been blogging much about Python and Zope lately. In fact, as some of you may have noticed, I’m no longer involved in Zope at all. I continue to use Python, though. To keep your feed aggregators a.k.a. planets topical I suggest removing my blog feed from Zope-related planets and switching Python-related aggregators to my Python category feed.

If any of you readers have found my blog through one of the Python and Zope planets and still enjoyed my other posts, I suggest adding my general feed to your news reader so you continue to get updates. Thank you.

End of service announcement.

This is really just a “note to self” kind of post. I meant to write this down a while ago but I forgot. To prevent further forgetting, here it is:

I always compile Python myself. The Python that comes with OS X tends to get outdated pretty soon and it has outdated libraries in its site-packages directory. And beware of installing or updating anything in there: it might ruin core components of OS X (because they actually use this Python instead of one that’s not the user’s to modify). I know that MacPorts too has various Python versions, but then again it applies various patches to them and builds them in weird ways that I don’t understand (framework, etc.). So the best bet to get a clean and reliable Python installation is to self-compile (and then use virtualenv to prevent it from being messed up).

As it happens, when I compiled Python 2.5 or higher on OS X, it linked to either the OS X readline library or the MacPorts one. Which one I don’t know, but it was definitely hosed. So while the interpreter worked fine, the interpreter shell would crash with a Bus Error. So what I did was compile my own plain vanilla version of readline and installed it to /opt. Of course that didn’t work right away because readline wouldn’t build on OS X Leopard without applying a small patch to a build script.

After having installed readline, I configured Python with the (undocumented, but apparently existing) --with-readline-dir option:

./configure --prefix=/opt --with-readline-dir=/opt

and did the usual make && make install dance.
