Node.js: The Style of Non-Blocking

This post is part of a series of articles about my recent experience building Sled using Node.js.

Node is all about non-blocking, asynchronous architecture. This means any activity that takes a long time to finish, such as file access, network communication, or a database operation, is requested and put aside until the results are ready and returned via a callback function. Instead of asking to read a file and waiting for the operating system to come back with a file handle or buffer, a callback function is invoked when the operation is completed, freeing the server to handle additional requests in the meantime.
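As a rough illustration, here is a minimal sketch using Node's built-in `fs.readFile`. The scratch file name and the `events` array are my own assumptions, added only to make the ordering visible. The setup step writes the file synchronously just to keep the example self-contained.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, created synchronously only as setup for the demo.
const file = path.join(os.tmpdir(), 'non-blocking-demo.txt');
fs.writeFileSync(file, 'hello from disk');

const events = [];

// Request the file and move on; the callback runs once the data is ready.
fs.readFile(file, 'utf8', function (err, contents) {
    if (err) {
        throw err;
    }
    events.push('read: ' + contents);
    console.log(events.join('\n'));
});

// This line runs before the callback above is invoked.
events.push('request issued');
```

The request is recorded first and the file contents second, even though the read was started earlier in the source.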

What gives Node a bit of a negative reputation is how this architecture affects its style of programming and the difficulty some people are having in getting used to it. When I first started, I described this convoluted style of coding like scratching your right armpit with your left hand by twisting your left arm over the shoulder, behind the neck, and under the back of the right armpit. There are days when it still feels like that, but at least now my arms are much more flexible.

Consider the following code, used to fetch a database record and output the user name:

function getUser(id) {
    var user = db.query(id);
    return user;
}

console.log('Name: ' + getUser(432).name);

The function blocks until the database call is completed. This means the server is doing nothing but waiting until the function completes, ignoring other pending requests. In Node, the code is broken into two functions:

function getUser(id, callback) {
    db.query(id, callback);
}

function display(user) {
    console.log(user.name);
}

getUser(432, display);

However, using JavaScript anonymous functions, the code can be streamlined into:

function getUser(id, callback) {
    db.query(id, callback);
}

getUser(432, function (user) {
    console.log(user.name);
});

This nesting of function definitions inside function calls makes the code appear more linear and many find it easier to read. However, it can be tricky for new developers. For example:

getUser(432, function (user) {
    console.log(user.name);
});

console.log('Done');

is going to output ‘Done’ before it outputs the name, because the name output has to wait for the database call to return and place its results on the event queue, and then for the currently executing code (outputting ‘Done’) to finish.
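The ordering can be checked with a small self-contained sketch. The `db` object below is a hypothetical stand-in that replies after a short timer, and the user name is made up. A real database driver would behave the same way as far as ordering goes.

```javascript
// Hypothetical stand-in for the database: replies asynchronously after a short delay.
const db = {
    query: function (id, callback) {
        setTimeout(function () {
            callback({ id: id, name: 'Alice' });
        }, 10);
    }
};

function getUser(id, callback) {
    db.query(id, callback);
}

// Records the order in which the two outputs happen.
const output = [];

getUser(432, function (user) {
    output.push(user.name);
    console.log(output);
});

// Runs immediately, before the query callback fires.
output.push('Done');
```

By the time the callback logs `output`, ‘Done’ is already the first entry.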

It takes time to get used to this style of coding, especially when your execution isn’t completely linear, but forks based on changing conditions. For example, we have an API call to add an item to a list. The call supports an optional parameter to insert the item at a specific position. Because we store each list item in its own database document, we keep the list order separately. This means we sometimes make one database call, and sometimes several. At the conclusion, we perform additional processing and return the document id of the new list item.

This produces very simple blocking code:

function add(list, title, position) {
    // Add new item to 'items' store
    var item = db.insert('items', { list: list, title: title });
    // Set position if requested (!= null also covers an omitted argument)
    if (position != null) {
        var sort = db.get('items.sort', list);
        addToListAt(sort, item.id, position);
        db.update('items.sort', sort);
    }
    // Perform final processing
    var result = { id: item.id, status: 'ok', time: (new Date()).getTime() };
    return result;
}

But not so simple non-blocking code:

function add(list, title, position, callback) {
    // Add new item to 'items' store
    db.insert('items', { list: list, title: title }, function (item) {
        // Set position if requested (!= null also covers an omitted argument)
        if (position != null) {
            db.get('items.sort', list, function (sort) {
                addToListAt(sort, item.id, position);
                db.update('items.sort', sort, function () {
                    finish(item.id);
                });
            });
        }
        else {
            finish(item.id);
        }
    });
    function finish(id) {
        // Perform final processing
        callback({ id: id, status: 'ok', time: (new Date()).getTime() });
    }
}

By embedding the callback function as an argument to the non-blocking function, and separating each logical part of the process into smaller sub-functions, you can code as fast with Node as with any other platform, but it takes some getting used to.
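To make the idea of smaller sub-functions concrete, here is a sketch of the same `add` flow with each step pulled out into a named inner function. The in-memory `db`, the `store` object, the shape of the sort document, and `addToListAt` are all assumptions made for illustration. They are not the real Sled code or any particular database driver.

```javascript
// Hypothetical in-memory stand-in for the database; every call replies asynchronously.
const store = { items: {}, 'items.sort': {} };
let nextId = 1;
const db = {
    insert: function (collection, doc, callback) {
        const item = Object.assign({ id: nextId++ }, doc);
        store[collection][item.id] = item;
        setImmediate(callback, item);
    },
    get: function (collection, key, callback) {
        if (!store[collection][key]) {
            store[collection][key] = { id: key, order: [] };
        }
        setImmediate(callback, store[collection][key]);
    },
    update: function (collection, doc, callback) {
        store[collection][doc.id] = doc;
        setImmediate(callback);
    }
};

// Hypothetical helper: insert an item id into the sort document at a position.
function addToListAt(sort, itemId, position) {
    sort.order.splice(position, 0, itemId);
}

function add(list, title, position, callback) {
    let item;
    db.insert('items', { list: list, title: title }, onInserted);

    function onInserted(inserted) {
        item = inserted;
        if (position == null) {
            return finish();
        }
        db.get('items.sort', list, onSortLoaded);
    }

    function onSortLoaded(sort) {
        addToListAt(sort, item.id, position);
        db.update('items.sort', sort, finish);
    }

    function finish() {
        callback({ id: item.id, status: 'ok', time: Date.now() });
    }
}

// Usage: the second item is inserted at the front of the list order.
add('groceries', 'milk', 0, function (first) {
    add('groceries', 'eggs', 0, function (second) {
        console.log(store['items.sort'].groceries.order, first.id, second.id);
    });
});
```

Naming each step trades some of the visual nesting for a flatter layout. Whether that reads better than inline anonymous functions is largely a matter of taste.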

There are a few libraries out there that allow you to code using a blocking style, and turn that into non-blocking code automagically, but I found that to be counterproductive. There is tremendous value in seeing exactly what is happening at run-time and staying close to the execution flow.

Debugging non-blocking code is still very tricky. You can’t just put a breakpoint at the top of the function and step through it. What will happen is that you will break, complete the first request, and then move on to the next event in the queue, which is not likely to be the next step in your function. Instead, you need to put multiple breakpoints inside each nested function, and rely heavily on logging to analyze your code.
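One way to lean on logging is to wrap each callback so that it announces itself when it fires. The `trace` and `later` helpers below are hypothetical names for this sketch, not part of Node or of any library.

```javascript
const log = [];

// Hypothetical helper: wraps a callback so each invocation is logged with a label.
function trace(label, callback) {
    const started = Date.now();
    return function () {
        log.push(label);
        console.log('[' + label + '] fired after ' + (Date.now() - started) + 'ms');
        return callback.apply(this, arguments);
    };
}

// Hypothetical stand-in for any non-blocking call.
function later(value, callback) {
    setTimeout(function () {
        callback(value);
    }, 5);
}

later('first', trace('step 1', function (a) {
    later(a + ', second', trace('step 2', function (b) {
        console.log(b);
    }));
}));
```

The labels show the order in which the nested callbacks actually ran, which is often the first thing you need to know.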

Like any other platform, Node has its own unique style and it can be off-putting to some. But it really doesn’t take long to get used to and once you embrace non-blocking development, the benefits in performance are significant. It also forces you to think hard about your code and architecture, to make sure you are not doing something stupid that will fail-whale you later.

Continue reading from Couch to Mongo.

5 thoughts on “Node.js: The Style of Non-Blocking”

  1. You could mention another way to organize callbacks, like pipes in Unix. It is what Express does to organize code, and it also works for batching Ajax requests (I guess it’s becoming idiomatic :))

    Example (three functions with callbacks):
    function f1(next){next()}
    function f2(next){next()}
    function f3(next){next()}

    Instead of
    f1(function () { f2(function () { f3(function () {}); }); });

    to have a special function that builds these folded callbacks

    batch(f1,f2,f3)

    like in Express:
    app.del('/user/:id', loadUser, andRestrictTo('admin'), function(req, res){
        res.send('Deleted user ' + req.user.name);
    });
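A `batch` helper along these lines might look like the sketch below. This is only an assumption about what the comment has in mind, not how Express implements its middleware chain. The `calls` array is there just to make the order visible.

```javascript
const calls = [];

function f1(next) { calls.push('f1'); next(); }
function f2(next) { calls.push('f2'); next(); }
function f3(next) { calls.push('f3'); next(); }

// Hypothetical helper: runs each function in order, handing it a `next`
// callback that starts the following one.
function batch() {
    const steps = Array.prototype.slice.call(arguments);
    let index = 0;

    function next() {
        const step = steps[index++];
        if (step) {
            step(next);
        }
    }

    next();
}

batch(f1, f2, f3);
console.log(calls);
```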

  2. I understand about the non-blocking approach and the potential improvements. But in a “traditional” approach like, say, Apache/PHP, the server *is not* just sitting there while the blocking IO completes. Another process will be getting its CPU slice while the first process is waiting.

    I’m wondering if this non-blocking approach really makes a significant difference to throughput, or whether it’s more that node.js is sharing memory between requests and therefore, unlike say PHP, not having to repeatedly reload a lot of the same data into memory for every request?

    • Initializing each handler is one aspect. The ability of the web server to handle the load is another (which Apache isn’t very good at). Node is really good at serving HTTP requests which makes a huge difference.

  3. I’ve gotten a lot of mileage out of this lib: https://github.com/caolan/async

    I can see the benefit of staying close to the async metal, but there’s a lot of help in there for doing foreach-like things and sequencing calls. Would you say the problems outweigh the abstraction benefits with something like that?

    • When I hit my first for-each problem in node, I scratched my head and stared at the screen for a while. Having to iterate over an array of identifiers, and perform a database action on each in sequence is clearly something non-blocking coding is awkward at. But once you figure out the pattern using a recursive callback function, it is actually very helpful to have the full logic in front of you. YMMV.
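For readers who have not seen it, the recursive callback pattern can be sketched roughly as follows. `loadRecord` is a hypothetical stand-in for a non-blocking database call, and the record shape is invented for the example.

```javascript
// Hypothetical stand-in for a non-blocking database call.
function loadRecord(id, callback) {
    setImmediate(function () {
        callback({ id: id, title: 'Item ' + id });
    });
}

// Process the ids one at a time: each callback kicks off the next step.
function loadAll(ids, done) {
    const results = [];

    function step(index) {
        if (index === ids.length) {
            return done(results);
        }
        loadRecord(ids[index], function (record) {
            results.push(record);
            step(index + 1);
        });
    }

    step(0);
}

loadAll([7, 3, 9], function (records) {
    console.log(records.map(function (r) { return r.title; }));
});
```

Each database call starts only after the previous one has called back, so the results arrive in the same order as the ids.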
