Chapter 9: Modularity
¶ This chapter deals with the process of organising programs. In small programs, organisation rarely becomes a problem. As a program grows, however, it can reach a size where its structure and interpretation become hard to keep track of. Easily enough, such a program starts to look like a bowl of spaghetti, an amorphous mass in which everything seems to be connected to everything else.
¶ When structuring a program, we do two things. We separate it into smaller parts, called modules, each of which has a specific role, and we specify the relations between these parts.
¶ In chapter 8, while developing a terrarium, we made use of a number of
functions described in chapter 6. The chapter also defined a few new
concepts that had nothing in particular to do with terraria, such as
clone
and the Dictionary
type. All these things were haphazardly
added to the environment. One way to split this program into modules
would be:
- A module
FunctionalTools
, which contains the functions from chapter 6, and depends on nothing. - Then
ObjectTools
, which contains things likeclone
andcreate
, and depends onFunctionalTools
. Dictionary
, containing the dictionary type, and depending onFunctionalTools
.- And finally the
Terrarium
module, which depends onObjectTools
andDictionary
.
¶ When a module depends on another module, it uses functions or variables from that module, and will only work when this module is loaded.
¶ It is a good idea to make sure dependencies never form a circle. Not
only do circular dependencies create a practical problem (if module
A
and B
depend on each other, which one should be loaded first?),
it also makes the relation between the modules less straightforward,
and can result in a modularised version of the spaghetti I mentioned
earlier.
¶ Most modern programming languages have some kind of module system built in. Not JavaScript. Once again, we have to invent something ourselves. The most obvious way to start is to put every module in a different file. This makes it clear which code belongs to which module.
¶ Browsers load JavaScript files when they find a <script>
tag with an src
attribute in the HTML of the web-page. The extension
.js
is usually used for files containing JavaScript code. On the
console, a shortcut for loading files is provided by the load
function.
load("FunctionalTools.js");
¶ In some cases, giving load commands in the wrong order will result in
errors. If a module tries to create a Dictionary
object, but the
Dictionary
module has not been loaded yet, it will be unable to find
the constructor, and will fail.
¶ One would imagine this to be easy to solve. Just put some calls to
load
at the top of the file for a module, to load all the modules it
depends on. Unfortunately, because of the way browsers work, calling
load
does not immediately cause the given file to be loaded. The
file will be loaded after the current file has finished executing.
Which is too late, usually.
¶ In most cases, the practical solution is to just manage dependencies
by hand: Put the script
tags in your HTML documents in the right
order.
¶ There are two ways to (partially) automate dependency management. The
first is to keep a separate file with information about the
dependencies between modules. This can be loaded first, and used to
determine the order in which to load the files. The second way is to
not use a script
tag (load
internally creates and adds such a
tag), but to fetch the content of the file directly (see chapter 14), and
then use the eval
function to execute it. This makes script loading
instantaneous, and thus easier to deal with.
¶ eval
, short for 'evaluate', is an interesting function. You give
it a string value, and it will execute the content of the string as
JavaScript code.
eval("print(\"I am a string inside a string!\");");
¶ You can imagine that eval
can be used to do some interesting things.
Code can build new code, and run it. In most cases, however, problems
that can be solved with creative uses of eval
can also be solved
with creative uses of anonymous functions, and the latter is less
likely to cause strange problems.
¶ When eval
is called inside a function, all new variables will become
local to that function. Thus, when a variation of the load
would use
eval
internally, loading the Dictionary
module would create a
Dictionary
constructor inside of the load
function, which would be
lost as soon as the function returned. There are ways to work around
this, but they are rather clumsy.
¶ Let us quickly go over the first variant of dependency management. It requires a special file for dependency information, which could look something like this:
var dependencies = {"ObjectTools.js": ["FunctionalTools.js"], "Dictionary.js": ["ObjectTools.js"], "TestModule.js": ["FunctionalTools.js", "Dictionary.js"]};
¶ The dependencies
object contains a property for each file that
depends on other files. The values of the properties are arrays of
file names. Note that we could not use a Dictionary
object here,
because we can not be sure that the Dictionary
module has been
loaded yet. Because all the properties in this object will end in
".js"
, they are unlikely to interfere with hidden properties like
__proto__
or hasOwnProperty
, and a regular object will work fine.
¶ The dependency manager must do two things. Firstly it must make sure that files are loaded in the correct order, by loading a file's dependencies before the file itself. And secondly, it must make sure that no file is loaded twice. Loading the same file twice might cause problems, and is definitely a waste of time.
var loadedFiles = {}; function require(file) { if (dependencies[file]) { var files = dependencies[file]; for (var i = 0; i < files.length; i++) require(files[i]); } if (!loadedFiles[file]) { loadedFiles[file] = true; load(file); } }
¶ The require
function can now be used to load a file and all its
dependencies. Note how it recursively calls itself to take care of
dependencies (and possible dependencies of that dependency).
require("TestModule.js");
test();
¶ Building a program as a set of nice, small modules often means the program will use a lot of different files. When programming for the web, having lots of small JavaScript files on a page tends to make the page slower to load. This does not have to be a problem though. You can write and test your program as a number of small files, and put them all into a single big file when 'publishing' the program to the web.
¶ Just like an object type, a module has an interface. In simple
collection-of-functions modules such as FunctionalTools
, the
interface usually consists of all the functions that are defined in
the module. In other cases, the interface of the module is only a
small part of the functions defined inside it. For example, our
manuscript-to-HTML system from chapter 6 only needs an interface of a
single function, renderFile
. (The sub-system for building HTML would
be a separate module.)
¶ For modules which only define a single type of object, such as
Dictionary
, the object's interface is the same as the module's
interface.
¶ In JavaScript, 'top-level' variables all live together in a single
place. In browsers, this place is an object that can be found under
the name window
. The name is somewhat odd, environment
or top
would have made more sense, but since browsers associate a JavaScript
environment with a window (or 'frame'), someone decided that window
was a logical name.
show(window); show(window.print == print); show(window.window.window.window.window);
¶ As the third line shows, the name window
is merely a property of
this environment object, pointing at itself.
¶ When much code is loaded into an environment, it will use many top-level variable names. Once there is more code than you can really keep track of, it becomes very easy to accidentally use a name that was already used for something else. This will break the code that used the original value. The proliferation of top-level variables is called name-space pollution, and it can be a rather severe problem in JavaScript ― the language will not warn you when you redefine an existing variable.
¶ There is no way to get rid of this problem entirely, but it can be greatly reduced by taking care to cause as little pollution as possible. For one thing, modules should not use top-level variables for values that are not part of their external interface.
¶ Not being able to define any internal functions and variables at all
in your modules is, of course, not very practical. Fortunately, there
is a trick to get around this. We write all the code for the module
inside a function, and then finally add the variables that are part of
the module's interface to the window
object. Because they were
created in the same parent function, all the functions of the module
can see each other, but code outside of the module can not.
function buildMonthNameModule() { var names = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]; function getMonthName(number) { return names[number]; } function getMonthNumber(name) { for (var number = 0; number < names.length; number++) { if (names[number] == name) return number; } } window.getMonthName = getMonthName; window.getMonthNumber = getMonthNumber; } buildMonthNameModule(); show(getMonthName(11));
¶ This builds a very simple module for translating between month names
and their number (as used by Date
, where January is 0
). But note
that buildMonthNameModule
is still a top-level variable that is not
part of the module's interface. Also, we have to repeat the names of
the interface functions three times. Ugh.
¶ The first problem can be solved by making the module function anonymous, and calling it directly. To do this, we have to add a pair of parentheses around the function value, or JavaScript will think it is a normal function definition, which can not be called directly.
¶ The second problem can be solved with a helper function, provide
,
which can be given an object containing the values that must be
exported into the window
object.
function provide(values) { forEachIn(values, function(name, value) { window[name] = value; }); }
¶ Using this, we can write a module like this:
(function() { var names = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]; provide({ getDayName: function(number) { return names[number]; }, getDayNumber: function(name) { for (var number = 0; number < names.length; number++) { if (names[number] == name) return number; } } }); })(); show(getDayNumber("Wednesday"));
¶ I do not recommend writing modules like this right from the start. While you are still working on a piece of code, it is easier to just use the simple approach we have used so far, and put everything at top level. That way, you can inspect the module's internal values in your browser, and test them out. Once a module is more or less finished, it is not difficult to wrap it in a function.
¶ There are cases where a module will export so many variables that it
is a bad idea to put them all into the top-level environment. In cases
like this, you can do what the standard Math
object does, and
represent the module as a single object whose properties are the
functions and values it exports. For example...
var HTML = { tag: function(name, content, properties) { return {name: name, properties: properties, content: content}; }, link: function(target, text) { return HTML.tag("a", [text], {href: target}); } /* ... many more HTML-producing functions ... */ };
¶ When you need the content of such a module so often that it becomes
cumbersome to constantly type HTML
, you can always move it into the
top-level environment using provide
.
provide(HTML); show(link("http://download.oracle.com/docs/cd/E19957-01/816-6408-10/object.htm", "This is how objects work."));
¶ You can even combine the function and object approaches, by putting the internal variables of the module inside a function, and having this function return an object containing its external interface.
¶ When adding methods to standard prototypes, such as those of Array
and Object
a similar problem to name-space pollution occurs. If two
modules decide to add a map
method to Array.prototype
, you might
have a problem. If these two versions of map
have the precise same
effect, things will continue to work, but only by sheer luck.
¶ Designing an interface for a module or an object type is one of the subtler aspects of programming. On the one hand, you do not want to expose too many details. They will only get in the way when using the module. On the other hand, you do not want to be too simple and general, because that might make it impossible to use the module in complex or specialised situations.
¶ Sometimes the solution is to provide two interfaces, a detailed 'low-level' one for complicated things, and a simple 'high-level' one for straightforward situations. The second one can usually be built very easily using the tools provided by the first one.
¶ In other cases, you just have to find the right idea around which to base your interface. Compare this to the various approaches to inheritance we saw in chapter 8. By making prototypes the central concept, rather than constructors, we managed to make some things considerably more straightforward.
¶ The best way to learn the value of good interface design is, unfortunately, to use bad interfaces. Once you get fed up with them, you'll figure out a way to improve them, and learn a lot in the process. Try not to assume that a lousy interface is 'just the way it is'. Fix it, or wrap it in a new interface that is better (we will see an example of this in chapter 12).
¶ There are functions which require a lot of arguments. Sometimes this
means they are just badly designed, and can easily be remedied by
splitting them into a few more modest functions. But in other cases,
there is no way around it. Typically, some of these arguments have a
sensible 'default' value. We could, for example, write yet another
extended version of range
.
function range(start, end, stepSize, length) { if (stepSize == undefined) stepSize = 1; if (end == undefined) end = start + stepSize * (length - 1); var result = []; for (; start <= end; start += stepSize) result.push(start); return result; } show(range(0, undefined, 4, 5));
¶ It can get hard to remember which argument goes where, not to mention
the annoyance of having to pass undefined
as a second argument when
a length
argument is used. We can make passing arguments to this
function more comprehensive by wrapping them in an object.
function defaultTo(object, values) { forEachIn(values, function(name, value) { if (!object.hasOwnProperty(name)) object[name] = value; }); } function range(args) { defaultTo(args, {start: 0, stepSize: 1}); if (args.end == undefined) args.end = args.start + args.stepSize * (args.length - 1); var result = []; for (; args.start <= args.end; args.start += args.stepSize) result.push(args.start); return result; } show(range({stepSize: 4, length: 5}));
¶ The defaultTo
function is useful for adding default values to an
object. It copies the properties of its second argument into its first
argument, skipping those that already have a value.
¶ A module or group of modules that can be useful in more than one program is usually called a library. For many programming languages, there is a huge set of quality libraries available. This means programmers do not have to start from scratch all the time, which can make them a lot more productive. For JavaScript, unfortunately, the amount of available libraries is not very large.
¶ But recently this seems to be improving. There are a number of good
libraries with 'basic' tools, things like map
and clone
. Other
languages tend to provide such obviously useful things as built-in
standard features, but with JavaScript you'll have to either build a
collection of them for yourself or use a library. Using a library is
recommended: It is less work, and the code in a library has usually
been tested more thoroughly than the things you wrote yourself.
¶ Covering these basics, there are (among others) the 'lightweight' libraries prototype, mootools, jQuery, and MochiKit. There are also some larger 'frameworks' available, which do a lot more than just provide a set of basic tools. YUI (by Yahoo), and Dojo seem to be the most popular ones in that genre. All of these can be downloaded and used free of charge. My personal favourite is MochiKit, but this is mostly a matter of taste. When you get serious about JavaScript programming, it is a good idea to quickly glance through the documentation of each of these, to get a general idea about the way they work and the things they provide.
¶ The fact that a basic toolkit is almost indispensable for any non-trivial JavaScript programs, combined with the fact that there are so many different toolkits, causes a bit of a dilemma for library writers. You either have to make your library depend on one of the toolkits, or write the basic tools yourself and include them with the library. The first option makes the library hard to use for people who are using a different toolkit, and the second option adds a lot of non-essential code to the library. This dilemma might be one of the reasons why there are relatively few good, widely used JavaScript libraries. It is possible that, in the future, new versions of ECMAScript and changes in browsers will make toolkits less necessary, and thus (partially) solve this problem.