Chapter 9: Modularity

¶ This chapter deals with the process of organising programs. In small programs, organisation rarely becomes a problem. As a program grows, however, it can reach a size where its structure and interpretation become hard to keep track of. All too easily, such a program starts to look like a bowl of spaghetti, an amorphous mass in which everything seems to be connected to everything else.

¶ When structuring a program, we do two things. We separate it into smaller parts, called modules, each of which has a specific role, and we specify the relations between these parts.

¶ In chapter 8, while developing a terrarium, we made use of a number of functions described in chapter 6. The chapter also defined a few new concepts that had nothing in particular to do with terraria, such as clone and the Dictionary type. All these things were haphazardly added to the environment. One way to split this program into modules would be:

A module FunctionalTools, which contains the functions from chapter 6, and depends on nothing.
Then ObjectTools, which contains things like clone and create, and depends on FunctionalTools.
Dictionary, containing the dictionary type, and depending on FunctionalTools.
And finally the Terrarium module, which depends on ObjectTools and Dictionary.

¶ When a module depends on another module, it uses functions or variables from that module, and will only work when this module is loaded.

¶ It is a good idea to make sure dependencies never form a circle. Not only do circular dependencies create a practical problem (if modules A and B depend on each other, which one should be loaded first?), they also make the relation between the modules less straightforward, and can result in a modularised version of the spaghetti I mentioned earlier.
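
¶ Whether a set of modules contains such a circle can be checked mechanically. The following is a minimal sketch, assuming the dependencies are written down as an object that maps each module name to an array of the names it depends on. The function name findCycle and the module names A, B, and C are made up for this illustration.

```javascript
// Returns an array naming one circular chain of modules,
// or null when the dependencies contain no circle.
function findCycle(dependencies) {
  var state = {};   // module name -> "visiting" or "done"
  var path = [];    // the chain of modules currently being followed

  function visit(name) {
    if (state[name] == "done")
      return null;
    if (state[name] == "visiting")   // we came back to a module on the path
      return path.slice(path.indexOf(name)).concat([name]);
    state[name] = "visiting";
    path.push(name);
    var needed = dependencies[name] || [];
    for (var i = 0; i < needed.length; i++) {
      var cycle = visit(needed[i]);
      if (cycle)
        return cycle;
    }
    path.pop();
    state[name] = "done";
    return null;
  }

  for (var name in dependencies) {
    if (Object.prototype.hasOwnProperty.call(dependencies, name)) {
      var found = visit(name);
      if (found)
        return found;
    }
  }
  return null;
}

var circular = {"A": ["B"], "B": ["C"], "C": ["A"]};
console.log(findCycle(circular).join(" -> "));   // A -> B -> C -> A
```

¶ Running such a check before loading anything gives a readable error message instead of a program that mysteriously fails halfway through loading.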

¶ Most modern programming languages have some kind of module system built in. Not JavaScript. Once again, we have to invent something ourselves. The most obvious way to start is to put every module in a different file. This makes it clear which code belongs to which module.

¶ Browsers load JavaScript files when they find a <script> tag with an src attribute in the HTML of the web-page. The extension .js is usually used for files containing JavaScript code. On the console, a shortcut for loading files is provided by the load function.

load("FunctionalTools.js");

¶ In some cases, giving load commands in the wrong order will result in errors. If a module tries to create a Dictionary object, but the Dictionary module has not been loaded yet, it will be unable to find the constructor, and will fail.

¶ One would imagine this to be easy to solve. Just put some calls to load at the top of the file for a module, to load all the modules it depends on. Unfortunately, because of the way browsers work, calling load does not immediately cause the given file to be loaded. The file will be loaded after the current file has finished executing. Which is too late, usually.

¶ In most cases, the practical solution is to just manage dependencies by hand: Put the script tags in your HTML documents in the right order.

¶ There are two ways to (partially) automate dependency management. The first is to keep a separate file with information about the dependencies between modules. This can be loaded first, and used to determine the order in which to load the files. The second way is to not use a script tag (load internally creates and adds such a tag), but to fetch the content of the file directly (see chapter 14), and then use the eval function to execute it. This makes script loading instantaneous, and thus easier to deal with.

¶ eval, short for 'evaluate', is an interesting function. You give it a string value, and it will execute the content of the string as JavaScript code.

eval("print(\"I am a string inside a string!\");");

¶ You can imagine that eval can be used to do some interesting things. Code can build new code, and run it. In most cases, however, problems that can be solved with creative uses of eval can also be solved with creative uses of anonymous functions, and the latter is less likely to cause strange problems.
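
¶ To make the comparison concrete, here is a small sketch that builds the same function twice, once by gluing together a string for eval and once with an ordinary anonymous function. Both function names are invented for this example.

```javascript
// Builds the code for a multiplier as a string, then evaluates it.
// The parentheses make eval treat the text as a function value.
function makeMultiplierWithEval(factor) {
  return eval("(function(number) { return number * " + factor + "; })");
}

// The same thing using an anonymous function that closes over factor.
function makeMultiplier(factor) {
  return function(number) {
    return number * factor;
  };
}

console.log(makeMultiplierWithEval(3)(7));   // 21
console.log(makeMultiplier(3)(7));           // 21
```

¶ The second version needs no string manipulation, and it keeps working when factor is something that does not survive being pasted into program text, such as an object.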

¶ When eval is called inside a function, all new variables will become local to that function. Thus, if a variation of load used eval internally, loading the Dictionary module would create a Dictionary constructor inside the load function, which would be lost as soon as the function returned. There are ways to work around this, but they are rather clumsy.
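
¶ The effect is easy to observe. In this sketch, loadWithEval is a hypothetical stand-in for an eval-based load, and the string passed to it plays the role of the content of a module file. To show what the evaluated code itself can see, a typeof check is appended to the string, and eval returns its result.

```javascript
// Evaluates the given module code inside a function, and reports
// whether Dictionary is defined from the point of view of that code.
function loadWithEval(code) {
  return eval(code + "\ntypeof Dictionary;");
}

var insideType = loadWithEval("function Dictionary() {}");
var outsideType = typeof Dictionary;

console.log(insideType);    // function
console.log(outsideType);   // undefined
```

¶ The constructor exists while the module code is running, but nothing of it remains at top level once loadWithEval has returned.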

¶ Let us quickly go over the first variant of dependency management. It requires a special file for dependency information, which could look something like this:

var dependencies =
  {"ObjectTools.js": ["FunctionalTools.js"],
   "Dictionary.js":  ["ObjectTools.js"],
   "TestModule.js":  ["FunctionalTools.js", "Dictionary.js"]};

¶ The dependencies object contains a property for each file that depends on other files. The values of the properties are arrays of file names. Note that we could not use a Dictionary object here, because we can not be sure that the Dictionary module has been loaded yet. Because all the property names in this object end in ".js", they are unlikely to interfere with hidden properties like __proto__ or hasOwnProperty, and a regular object will work fine.

¶ The dependency manager must do two things. Firstly it must make sure that files are loaded in the correct order, by loading a file's dependencies before the file itself. And secondly, it must make sure that no file is loaded twice. Loading the same file twice might cause problems, and is definitely a waste of time.

var loadedFiles = {};

function require(file) {
  if (dependencies[file]) {
    var files = dependencies[file];
    for (var i = 0; i < files.length; i++)
      require(files[i]);
  }
  if (!loadedFiles[file]) {
    loadedFiles[file] = true;
    load(file);
  }
}

¶ The require function can now be used to load a file and all its dependencies. Note how it recursively calls itself to take care of dependencies (and possible dependencies of that dependency).

require("TestModule.js");

test();
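
¶ To see the order in which files get requested without involving a browser, the following sketch replaces load with a stand-in that only records file names. The function is called requireFile here, an invented name, because some JavaScript environments already define a require of their own. Apart from that, the logic is the same as above.

```javascript
var dependencies =
  {"ObjectTools.js": ["FunctionalTools.js"],
   "Dictionary.js":  ["ObjectTools.js"],
   "TestModule.js":  ["FunctionalTools.js", "Dictionary.js"]};

// Stand-in for the real load: it only remembers what it was asked for.
var loadOrder = [];
function load(file) {
  loadOrder.push(file);
}

var loadedFiles = {};

function requireFile(file) {
  if (dependencies[file]) {
    var files = dependencies[file];
    for (var i = 0; i < files.length; i++)
      requireFile(files[i]);
  }
  if (!loadedFiles[file]) {
    loadedFiles[file] = true;
    load(file);
  }
}

requireFile("TestModule.js");
requireFile("Dictionary.js");   // already loaded, so nothing is added

console.log(loadOrder.join(", "));
// FunctionalTools.js, ObjectTools.js, Dictionary.js, TestModule.js
```

¶ Every file appears exactly once, and always after the files it depends on. Note that this approach relies on the dependency information being free of circles: if two files listed each other, the recursion would never end.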

¶ Building a program as a set of nice, small modules often means the program will use a lot of different files. When programming for the web, having lots of small JavaScript files on a page tends to make the page slower to load. This does not have to be a problem though. You can write and test your program as a number of small files, and put them all into a single big file when 'publishing' the program to the web.

¶ Just like an object type, a module has an interface. In simple collection-of-functions modules such as FunctionalTools, the interface usually consists of all the functions that are defined in the module. In other cases, the interface of the module is only a small part of the functions defined inside it. For example, our manuscript-to-HTML system from chapter 6 only needs an interface of a single function, renderFile. (The sub-system for building HTML would be a separate module.)

¶ For modules which only define a single type of object, such as Dictionary, the object's interface is the same as the module's interface.

¶ In JavaScript, 'top-level' variables all live together in a single place. In browsers, this place is an object that can be found under the name window. The name is somewhat odd, environment or top would have made more sense, but since browsers associate a JavaScript environment with a window (or 'frame'), someone decided that window was a logical name.

show(window);
show(window.print == print);
show(window.window.window.window.window);

¶ As the third line shows, the name window is merely a property of this environment object, pointing at itself.

¶ When much code is loaded into an environment, it will use many top-level variable names. Once there is more code than you can really keep track of, it becomes very easy to accidentally use a name that was already used for something else. This will break the code that used the original value. The proliferation of top-level variables is called name-space pollution, and it can be a rather severe problem in JavaScript: the language will not warn you when you redefine an existing variable.
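
¶ A small, contrived sketch of how this goes wrong. Imagine the two halves below living in different files, written by different people. The names clean and greet are made up for the example.

```javascript
// First file: clean strips whitespace from both ends of a string.
var clean = function(text) {
  return text.replace(/^\s+|\s+$/g, "");
};
function greet(name) {
  return "Hello, " + clean(name) + "!";
}

var before = greet("  Ada ");
console.log(before);   // Hello, Ada!

// Second file, loaded later: its author also liked the name clean,
// but uses it for dropping the first element of a list.
var clean = function(list) {
  return list.slice(1);
};

// No warning was given, but greet now calls the wrong function.
var after = greet("  Ada ");
console.log(after == before);   // false
```

¶ Nothing in the first file changed, yet its output did. Bugs like this are hard to find precisely because the code that broke is not the code that was edited.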

¶ There is no way to get rid of this problem entirely, but it can be greatly reduced by taking care to cause as little pollution as possible. For one thing, modules should not use top-level variables for values that are not part of their external interface.

¶ Not being able to define any internal functions and variables at all in your modules is, of course, not very practical. Fortunately, there is a trick to get around this. We write all the code for the module inside a function, and then finally add the variables that are part of the module's interface to the window object. Because they were created in the same parent function, all the functions of the module can see each other, but code outside of the module can not.

function buildMonthNameModule() {
  var names = ["January", "February", "March", "April",
               "May", "June", "July", "August", "September",
               "October", "November", "December"];
  function getMonthName(number) {
    return names[number];
  }
  function getMonthNumber(name) {
    for (var number = 0; number < names.length; number++) {
      if (names[number] == name)
        return number;
    }
  }

  window.getMonthName = getMonthName;
  window.getMonthNumber = getMonthNumber;
}
buildMonthNameModule();

show(getMonthName(11));

¶ This builds a very simple module for translating between month names and their number (as used by Date, where January is 0). But note that buildMonthNameModule is still a top-level variable that is not part of the module's interface. Also, we have to repeat the names of the interface functions three times. Ugh.

¶ The first problem can be solved by making the module function anonymous, and calling it directly. To do this, we have to add a pair of parentheses around the function value, or JavaScript will think it is a normal function definition, which can not be called directly.

¶ The second problem can be solved with a helper function, provide, which can be given an object containing the values that must be exported into the window object.

function provide(values) {
  forEachIn(values, function(name, value) {
    window[name] = value;
  });
}

¶ Using this, we can write a module like this:

(function() {
  var names = ["Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday"];
  provide({
    getDayName: function(number) {
      return names[number];
    },
    getDayNumber: function(name) {
      for (var number = 0; number < names.length; number++) {
        if (names[number] == name)
          return number;
      }
    }
  });
})();

show(getDayNumber("Wednesday"));

¶ I do not recommend writing modules like this right from the start. While you are still working on a piece of code, it is easier to just use the simple approach we have used so far, and put everything at top level. That way, you can inspect the module's internal values in your browser, and test them out. Once a module is more or less finished, it is not difficult to wrap it in a function.

¶ There are cases where a module will export so many variables that it is a bad idea to put them all into the top-level environment. In cases like this, you can do what the standard Math object does, and represent the module as a single object whose properties are the functions and values it exports. For example...

var HTML = {
  tag: function(name, content, properties) {
    return {name: name, properties: properties, content: content};
  },
  link: function(target, text) {
    return HTML.tag("a", [text], {href: target});
  }
  /* ... many more HTML-producing functions ... */
};

¶ When you need the content of such a module so often that it becomes cumbersome to constantly type HTML, you can always move it into the top-level environment using provide.

provide(HTML);
show(link("http://download.oracle.com/docs/cd/E19957-01/816-6408-10/object.htm",
          "This is how objects work."));

¶ You can even combine the function and object approaches, by putting the internal variables of the module inside a function, and having this function return an object containing its external interface.
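
¶ A minimal sketch of that combination, reusing the month names from before. The module name MonthNames and its property names are invented for this example.

```javascript
var MonthNames = (function() {
  // Internal to the module: only the functions below can see this.
  var names = ["January", "February", "March", "April",
               "May", "June", "July", "August", "September",
               "October", "November", "December"];

  // The returned object is the module's external interface.
  return {
    getName: function(number) {
      return names[number];
    },
    getNumber: function(name) {
      for (var number = 0; number < names.length; number++) {
        if (names[number] == name)
          return number;
      }
    }
  };
})();

console.log(MonthNames.getName(11));        // December
console.log(MonthNames.getNumber("March")); // 2
```

¶ Only one top-level variable is created, and the names array stays out of reach of other code.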

¶ When adding methods to standard prototypes, such as those of Array and Object, a similar problem to name-space pollution occurs. If two modules decide to add a map method to Array.prototype, you might have a problem. If these two versions of map have the precise same effect, things will continue to work, but only by sheer luck.

¶ Designing an interface for a module or an object type is one of the subtler aspects of programming. On the one hand, you do not want to expose too many details. They will only get in the way when using the module. On the other hand, you do not want to be too simple and general, because that might make it impossible to use the module in complex or specialised situations.

¶ Sometimes the solution is to provide two interfaces, a detailed 'low-level' one for complicated things, and a simple 'high-level' one for straightforward situations. The second one can usually be built very easily using the tools provided by the first one.

¶ In other cases, you just have to find the right idea around which to base your interface. Compare this to the various approaches to inheritance we saw in chapter 8. By making prototypes the central concept, rather than constructors, we managed to make some things considerably more straightforward.

¶ The best way to learn the value of good interface design is, unfortunately, to use bad interfaces. Once you get fed up with them, you'll figure out a way to improve them, and learn a lot in the process. Try not to assume that a lousy interface is 'just the way it is'. Fix it, or wrap it in a new interface that is better (we will see an example of this in chapter 12).

¶ There are functions which require a lot of arguments. Sometimes this means they are just badly designed, and can easily be remedied by splitting them into a few more modest functions. But in other cases, there is no way around it. Typically, some of these arguments have a sensible 'default' value. We could, for example, write yet another extended version of range.

function range(start, end, stepSize, length) {
  if (stepSize == undefined)
    stepSize = 1;
  if (end == undefined)
    end = start + stepSize * (length - 1);

  var result = [];
  for (; start <= end; start += stepSize)
    result.push(start);
  return result;
}

show(range(0, undefined, 4, 5));

¶ It can get hard to remember which argument goes where, not to mention the annoyance of having to pass undefined as a second argument when a length argument is used. We can make passing arguments to this function more comprehensible by wrapping them in an object.

function defaultTo(object, values) {
  forEachIn(values, function(name, value) {
    if (!object.hasOwnProperty(name))
      object[name] = value;
  });
}

function range(args) {
  defaultTo(args, {start: 0, stepSize: 1});
  if (args.end == undefined)
    args.end = args.start + args.stepSize * (args.length - 1);

  var result = [];
  for (; args.start <= args.end; args.start += args.stepSize)
    result.push(args.start);
  return result;
}

show(range({stepSize: 4, length: 5}));

¶ The defaultTo function is useful for adding default values to an object. It copies the properties of its second argument into its first argument, skipping those that already have a value.
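
¶ One thing to be aware of is that defaultTo, and the range function built on it, modify the object they are given. When the caller wants to reuse that object, this can come as a surprise. Below is a sketch of a variant that copies instead. The names withDefaults and rangeWithDefaults are invented here, and forEachIn is written out as a stand-in that is assumed to behave like the helper used above, calling its second argument with the name and value of each of the object's own properties.

```javascript
// Stand-in for the forEachIn helper (assumed behaviour, see above).
function forEachIn(object, action) {
  for (var name in object) {
    if (Object.prototype.hasOwnProperty.call(object, name))
      action(name, object[name]);
  }
}

// Returns a new object holding the defaults, overridden by whatever
// is present in args. Neither argument is changed.
function withDefaults(args, defaults) {
  var result = {};
  forEachIn(defaults, function(name, value) { result[name] = value; });
  forEachIn(args, function(name, value) { result[name] = value; });
  return result;
}

function rangeWithDefaults(args) {
  var options = withDefaults(args, {start: 0, stepSize: 1});
  var end = options.end;
  if (end == undefined)
    end = options.start + options.stepSize * (options.length - 1);

  var result = [];
  for (var current = options.start; current <= end; current += options.stepSize)
    result.push(current);
  return result;
}

var request = {stepSize: 4, length: 5};
console.log(rangeWithDefaults(request).join(" "));   // 0 4 8 12 16
console.log("start" in request);                     // false
```

¶ The price is a little extra copying, which for objects holding a handful of arguments is not worth worrying about.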

¶ A module or group of modules that can be useful in more than one program is usually called a library. For many programming languages, there is a huge set of quality libraries available. This means programmers do not have to start from scratch all the time, which can make them a lot more productive. For JavaScript, unfortunately, the number of available libraries is not very large.

¶ But recently this seems to be improving. There are a number of good libraries with 'basic' tools, things like map and clone. Other languages tend to provide such obviously useful things as built-in standard features, but with JavaScript you'll have to either build a collection of them for yourself or use a library. Using a library is recommended: It is less work, and the code in a library has usually been tested more thoroughly than the things you wrote yourself.

¶ Covering these basics, there are (among others) the 'lightweight' libraries prototype, mootools, jQuery, and MochiKit. There are also some larger 'frameworks' available, which do a lot more than just provide a set of basic tools. YUI (by Yahoo), and Dojo seem to be the most popular ones in that genre. All of these can be downloaded and used free of charge. My personal favourite is MochiKit, but this is mostly a matter of taste. When you get serious about JavaScript programming, it is a good idea to quickly glance through the documentation of each of these, to get a general idea about the way they work and the things they provide.

¶ The fact that a basic toolkit is almost indispensable for any non-trivial JavaScript program, combined with the fact that there are so many different toolkits, causes a bit of a dilemma for library writers. You either have to make your library depend on one of the toolkits, or write the basic tools yourself and include them with the library. The first option makes the library hard to use for people who are using a different toolkit, and the second option adds a lot of non-essential code to the library. This dilemma might be one of the reasons why there are relatively few good, widely used JavaScript libraries. It is possible that, in the future, new versions of ECMAScript and changes in browsers will make toolkits less necessary, and thus (partially) solve this problem.