Wednesday, November 12, 2014

Three Simple Rules for Escaping Callback Hell

A lot of newcomers to Node.js complain about "callback hell" and the "pyramid of doom" when they're getting started with the callback-driven continuation-passing style. It's confusing, and a lot of people reach for an async / flow-control module right away. Many people have settled on using Promises, a solution that brings some unfortunate problems along with it (performance, error-hiding anti-patterns, and illusory behavior, for example).

I prefer using some simple best practices for working with callbacks to keep my code clean and organized. These techniques don't require adding any extra modules to your code base, won't slow your program down, don't introduce error-hiding anti-patterns, and don't convey a false impression of synchronous execution. Best of all, they result in code that is actually more readable and concise, and once you see how simple they are, you might want to use them, too.

Here they are:
  1. use named functions for callbacks
  2. nest functions when you need to capture (enclose) variable scope
  3. use return when invoking the callback

The Pyramid of Doom

Here's a contrived example that uses typical Node.js callbacks with (err, result) arguments. It's a mess of nested functions: the so-called Pyramid of Doom. It keeps indenting, layer upon smothering layer, until it unwinds in a great cascading spasm of parentheses, braces and semicolons.
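A minimal sketch of such a pyramid might look like the following. The three async helpers (fetchGreeting, deliver, formatReceipt) are hypothetical stand-ins for real I/O and are assumptions made for illustration; only the name greet comes from the post.

```javascript
// Hypothetical async helpers standing in for real I/O (assumptions, not from the post).
function fetchGreeting(name, callback) {
  setImmediate(function () {
    if (!name) {
      callback(new Error('name is required'));
    } else {
      callback(null, 'Hello, ' + name + '!');
    }
  });
}

function deliver(message, callback) {
  setImmediate(function () {
    callback(null, { length: message.length });
  });
}

function formatReceipt(greeting, receipt, callback) {
  setImmediate(function () {
    callback(null, greeting + ' (' + receipt.length + ' chars sent)');
  });
}

// The pyramid: every step is an anonymous callback nested inside the previous one.
function greet(name, done) {
  fetchGreeting(name, function (err, greeting) {
    if (err) {
      done(err);
    } else {
      deliver(greeting, function (err, receipt) {
        if (err) {
          done(err);
        } else {
          formatReceipt(greeting, receipt, function (err, summary) {
            if (err) {
              done(err);
            } else {
              done(null, summary);
            }
          });
        }
      });
    }
  });
}

greet('Ada', function (err, summary) {
  if (err) {
    console.error(err.message);
  } else {
    console.log(summary); // prints "Hello, Ada! (11 chars sent)"
  }
});
```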

Named Callbacks

The Pyramid of Doom is often shown as a reason to use Promises, but most async libraries -- including and especially Promises -- don't really solve this nesting problem. We don't end up with deeply nested code like this because something is wrong with JavaScript. We get it because people write bad, messy code. Named callbacks solve this problem, very simply. Andrew Kelley wrote about this on his blog a while ago ("JavaScript Callbacks are Pretty Okay"). It's a great post with some simple ways of taming "callback hell" that get skipped over by a lot of Node newcomers.

Here's the above example re-written using named callback functions. Instead of a Russian doll of anonymous functions, every function that takes a callback is passed the name of the callback function to use. The callback function is defined immediately afterwards, greatly improving readability.
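As a sketch of what the named-callback version could look like: the callback names getGreeting, sendGreeting, and showResult follow the text, while the async helpers (fetchGreeting, deliver, formatReceipt) are hypothetical stand-ins assumed for illustration. Function declarations are hoisted, which is why each callback can be passed by name before the line where it is defined.

```javascript
// Hypothetical async helpers standing in for real I/O (assumptions, not from the post).
function fetchGreeting(name, callback) {
  setImmediate(function () {
    if (!name) {
      callback(new Error('name is required'));
    } else {
      callback(null, 'Hello, ' + name + '!');
    }
  });
}

function deliver(message, callback) {
  setImmediate(function () {
    callback(null, { length: message.length });
  });
}

function formatReceipt(greeting, receipt, callback) {
  setImmediate(function () {
    callback(null, greeting + ' (' + receipt.length + ' chars sent)');
  });
}

function greet(name, done) {
  // Pass the callback by name; its definition follows immediately.
  fetchGreeting(name, getGreeting);

  function getGreeting(err, greeting) {
    if (err) {
      done(err);
    } else {
      deliver(greeting, sendGreeting);
    }

    // Still nested: these two read `greeting` from getGreeting's scope.
    function sendGreeting(err, receipt) {
      if (err) {
        done(err);
      } else {
        formatReceipt(greeting, receipt, showResult);
      }
    }

    function showResult(err, summary) {
      if (err) {
        done(err);
      } else {
        done(null, summary);
      }
    }
  }
}
```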

Nest Only for Scope

We can do even better. Notice that two functions, sendGreeting and showResult, are still nested inside of the getGreeting function. Nested "inner" functions create a closure that encloses the callback function's own local variable scope, plus the variable scope of the function it's nested inside of. These nested callbacks can access variables from any enclosing function's scope. In our example, both sendGreeting and showResult use variables that were created earlier in the getGreeting function. They can access these variables from getGreeting because they're nested inside getGreeting and thus enclose its variable scope.

A lot of times this is totally unnecessary. You only need to nest functions if you need to refer to variables in the scope of the caller from within the callback function. Otherwise, simply put named functions on the same level as the caller. In our example, variables can be shared by moving them to the top-level scope of the greet function. Then, we can put all our named functions on the same level. No more nesting and indentation!
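A sketch of the flattened layout, under the same assumptions as before: the helpers fetchGreeting, deliver, and formatReceipt are hypothetical, and the shared variable is simply called greeting. The value that sendGreeting needs is stored in greet's own scope, so all three callbacks can sit at the same level.

```javascript
// Hypothetical async helpers standing in for real I/O (assumptions, not from the post).
function fetchGreeting(name, callback) {
  setImmediate(function () {
    if (!name) {
      callback(new Error('name is required'));
    } else {
      callback(null, 'Hello, ' + name + '!');
    }
  });
}

function deliver(message, callback) {
  setImmediate(function () {
    callback(null, { length: message.length });
  });
}

function formatReceipt(greeting, receipt, callback) {
  setImmediate(function () {
    callback(null, greeting + ' (' + receipt.length + ' chars sent)');
  });
}

function greet(name, done) {
  var greeting; // shared state lives in greet's scope, visible to every callback below

  fetchGreeting(name, getGreeting);

  function getGreeting(err, result) {
    if (err) {
      done(err);
    } else {
      greeting = result;
      deliver(greeting, sendGreeting);
    }
  }

  function sendGreeting(err, receipt) {
    if (err) {
      done(err);
    } else {
      formatReceipt(greeting, receipt, showResult);
    }
  }

  function showResult(err, summary) {
    if (err) {
      done(err);
    } else {
      done(null, summary);
    }
  }
}
```

Because greeting is declared inside greet rather than at module level, each call to greet gets its own copy, so concurrent calls do not overwrite each other's state.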

Return When Invoking a Callback

The last point to improve readability is more a stylistic preference, but if you make a habit of always returning from an error-handling clause, you can further minimize your code. In direct-style programming where function calls are meant to return a value, common wisdom says that returning from an if clause like this is bad practice that can lead to errors. With continuation-passing style, however, explicitly returning when you invoke the callback ensures that you don't accidentally execute additional code in the calling function after the callback has been invoked. For that reason, many Node developers consider it best practice. In trivial functions, it can improve readability by eliminating the else clause, and it is used by a number of popular JavaScript modules. I find a pragmatic approach is to return from error-handling clauses or other conditional if/else clauses, but sometimes leave off the explicit return on the last line in the function, in the interest of less code and better readability. Here's the updated example:
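A sketch of how all three rules might combine, again assuming hypothetical helpers (fetchGreeting, deliver, formatReceipt) that are not from the post. Each error branch returns as it invokes the callback, so the else clauses disappear, and the final call in each function is left without an explicit return.

```javascript
// Hypothetical async helpers standing in for real I/O (assumptions, not from the post).
function fetchGreeting(name, callback) {
  setImmediate(function () {
    if (!name) return callback(new Error('name is required'));
    callback(null, 'Hello, ' + name + '!');
  });
}

function deliver(message, callback) {
  setImmediate(function () {
    callback(null, { length: message.length });
  });
}

function formatReceipt(greeting, receipt, callback) {
  setImmediate(function () {
    callback(null, greeting + ' (' + receipt.length + ' chars sent)');
  });
}

function greet(name, done) {
  var greeting; // shared by the named callbacks below

  fetchGreeting(name, getGreeting);

  function getGreeting(err, result) {
    if (err) return done(err); // returning here guarantees nothing below runs
    greeting = result;
    deliver(greeting, sendGreeting);
  }

  function sendGreeting(err, receipt) {
    if (err) return done(err);
    formatReceipt(greeting, receipt, showResult);
  }

  function showResult(err, summary) {
    if (err) return done(err);
    done(null, summary);
  }
}

greet('Ada', function (err, summary) {
  if (err) return console.error(err.message);
  console.log(summary); // prints "Hello, Ada! (11 chars sent)"
});
```

Without the return in getGreeting, an error would call done(err) and then fall through to deliver with an undefined greeting, which is exactly the kind of accidental extra execution the rule guards against.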

Compare this example with the Pyramid of Doom at the beginning of the post. I think you'll agree that these simple rules result in cleaner, more readable code and provide a great escape from the Callback Hell we started out with.

Good luck and have fun!

1 comment:

David Ivey said...

That was an absolutely outstanding explanation of the coding style that should be used by all javascript programmers (not just node). I'm used to programming in languages that are blocking/synchronous. In order to accomplish a javascript programming task, I ended up in callback hell to ensure that functions A, B, and C happened in sequential order because these functions were setting or using data that was set up by a prior function. I didn't know it was called "callback hell" at the time, but I did know this was a horrible way to code, and I knew it would be a nightmare to try and support and debug my own code if it was written like this -- much less someone else's code. So, I went in search of an answer on how to code properly using javascript callbacks. I found lots of other programmers had this same complaint about javascript, and even gave the issue a name -- callback hell. There was universal condemnation, but the answers that described how to properly code using a non-callback hell style were simply awful, or they just assumed you were using node and recommended using an async module, await, promises, etc. I like your explanation MUCH better. There is no need for extensions to the javascript language. You just need to understand the pattern of how it should properly be coded using straight javascript.

Your explanation of where to declare your variables was the key for me. The async nature of javascript along with the closure mechanism in maintaining a variable's scope are two major components to programming in javascript. I'm still coming to grips with figuring out how these language components really work, and your post has gone a long way in helping me do that. Thank you!

One area not covered by your post is "for loops" used in conjunction with callback functions. Maybe this is addressed by declaring variables in the proper spot (i.e. using closures). Maybe using recursive functions is the proper approach? I'll need a little more work and experimenting with this to get a firmer understanding. In any event, thanks again for the insightful post.

By the way, who decided that javascript with callbacks was "async" while languages that didn't use callback functions were synchronous? Wouldn't it make more sense for it to be the other way around? In my mind, the language that processes functions in sequential order should be thought of as asynchronous. The language that kicks off several functions that run at the same time that the main programming thread (colloquially) is continuing to run should be thought of as synchronous.