The browser is a really hostile programming environment.
Web technology has, from the start, been decentralized, not just technically but also in the way it has evolved. Various browser vendors have added new functionality in ad hoc and sometimes poorly thought out ways, which then sometimes ended up being adopted by others and finally set down as a standard.
This is both a blessing and a curse. On the one hand, it is empowering to not have a central party control a system but have it be improved by various parties working in loose collaboration (or, occasionally, open hostility). On the other hand, the haphazard way in which the Web was developed means that the resulting system is not exactly a shining example of internal consistency. In fact, some parts of it are downright messy and confusing.
Networks and the Internet
Computer networks have been around since the 1950s. If you put cables between two or more computers and allow them to send data back and forth through these cables, you can do all kinds of wonderful things.
If connecting two machines in the same building allows us to do wonderful things, connecting machines all over the planet should be even better. The technology to start implementing this vision was developed in the 1980s, and the resulting network is called the Internet. It has lived up to its promise.
A computer can use this network to spew bits at another computer. For any effective communication to arise out of this bit-spewing, the computers at both ends must know what the bits are supposed to represent. The meaning of any given sequence of bits depends entirely on the kind of thing that it is trying to express and on the encoding mechanism used.
A network protocol describes a style of communication over a network. There are protocols for sending email, for fetching email, for sharing files, or even for controlling computers that happen to be infected by malicious software.
For example, a simple chat protocol might consist of one computer sending the bits that represent the text “CHAT?” to another machine and the other responding with “OK!” to confirm that it understands the protocol. They can then proceed to send each other strings of text, read the text sent by the other from the network, and display whatever they receive on their screens.
Most protocols are built on top of other protocols. Our example chat protocol treats the network as a streamlike device into which you can put bits and have them arrive at the correct destination in the correct order. Ensuring those things is already a rather difficult technical problem.
The Transmission Control Protocol (TCP) is a protocol that solves this problem. All Internet-connected devices “speak” it, and most communication on the Internet is built on top of it.
A TCP connection works as follows: one computer must be waiting, or listening, for other computers to start talking to it. To be able to listen for different kinds of communication at the same time on a single machine, each listener has a number (called a port) associated with it. Most protocols specify which port should be used by default. For example, when we want to send an email using the SMTP protocol, the machine through which we send it is expected to be listening on port 25.
Another computer can then establish a connection by connecting to the target machine using the correct port number. If the target machine can be reached and is listening on that port, the connection is successfully created. The listening computer is called the server, and the connecting computer is called the client.
Such a connection acts as a two-way pipe through which bits can flow—the machines on both ends can put data into it. Once the bits are successfully transmitted, they can be read out again by the machine on the other side. This is a convenient model. You could say that TCP provides an abstraction of the network.
The World Wide Web (not to be confused with the Internet as a whole) is a set of protocols and formats that allow us to visit web pages in a browser. The “Web” part in the name refers to the fact that such pages can easily link to each other, thus connecting into a huge mesh that users can move through.
To add content to the Web, all you need to do is connect a machine to the Internet, and have it listen on port 80, using the Hypertext Transfer Protocol (HTTP). This protocol allows other computers to request documents over the network.
Each document on the Web is named by a Uniform Resource Locator (URL), which looks something like this:
The first part tells us that this URL uses the HTTP protocol (as opposed to, for example, encrypted HTTP, which would be https://). Then comes the part that identifies which server we are requesting the document from. Last is a path string that identifies the specific document (or resource) we are interested in.
Each machine connected to the Internet gets a unique IP address,
which looks something like
22.214.171.124. You can use these directly
as the server part of a URL. But lists of more or less random
numbers are hard to remember and awkward to type, so you can instead
register a domain name to point toward a specific machine or
IP address of a machine I control and can thus use that domain name
to serve web pages.
We will take a closer look at the HTTP protocol in Chapter 17.
HTML, which stands for Hypertext Markup Language, is the document format used for web pages. An HTML document contains text, as well as tags that give structure to the text, describing things such as links, paragraphs, and headings.
A simple HTML document looks like this:
The tags, wrapped in angle brackets (
>), provide information about the structure of the
document. The other text is just plain text.
The document starts with
which tells the browser to interpret it as modern HTML, as opposed
to various dialects that were in use in the past.
HTML documents have a head and a body.
The head contains information about the document, and the body
contains the document itself. In this case, we first declared that the
title of this document is “My home page” and then gave a document
containing a heading (
<h1>, meaning “heading 1”—
produce more minor headings) and two paragraphs (
Tags come in several forms. An
element, such as the body, a paragraph, or a link, is started by
an opening tag like
<p> and ended by a closing tag like
</p>. Some opening tags, such as the one for the link (
contain extra information in the form of
name="value" pairs. These
are called attributes. In this case, the destination of the link
is indicated with
stands for “hypertext reference”.
kinds of tags do not enclose anything and thus do not need to be
closed. An example of this would be
src="http://example.com/image.jpg">, which will display the image
found at the given source URL.
To be able to include angle brackets in
the text of a document, even though they have a special meaning in
HTML, yet another form of special notation has to be introduced. A
plain opening angle bracket is written as
< (“less than”), and
a closing bracket is written as
> (“greater than”). In HTML, an ampersand
&) character followed by a word and a semicolon (
;) is called an
entity, and will be replaced by the character it encodes.
This is analogous to the way backslashes are used in
special meaning, too, those need to be escaped as
&. Inside an
attribute, which is wrapped in double quotes,
" can be used to
insert an actual quote character.
HTML is parsed in a remarkably error-tolerant way. When tags that should be there are missing, the browser reconstructs them. The way in which this is done has been standardized, and you can rely on all modern browsers to do it in the same way.
The following document will be treated just like the one shown previously:
are gone completely. The browser knows that
<title> belongs in a
head, and that
<h1> in a body. Furthermore, I am no longer explicitly
closing the paragraphs since opening a new paragraph or ending the
document will close them implicitly. The quotes around the link target
are also gone.
This book will usually omit the
from examples to keep them short and free of clutter. But I will
close tags and include quotes around attributes.
I will also usually omit the doctype. This is not to be taken as an encouragement to omit doctype declarations. Browsers will often do ridiculous things when you forget them. You should consider doctypes implicitly present in examples, even when they are not actually shown in the text.
In the context of this
book, the most important HTML tag is
<script>. This tag allows
<h1>Testing alert</h1> <script>alert("hello!");</script>
Such a script will run as soon as
<script> tag is encountered as the browser reads the HTML. The
page shown earlier will pop up an
alert dialog when opened.
Including large programs directly in HTML documents
is often impractical. The
<script> tag can be given an
attribute in order to fetch a script file (a text file containing a
<h1>Testing alert</h1> <script src="code/hello.js"></script>
The code/hello.js file included here contains the same simple program,
alert("hello!"). When an HTML page references other URLs as part of
itself, for example an image file or a script—web browsers will
retrieve them immediately and include them in the page.
A script tag must always be
</script>, even if it refers to a script file and
doesn’t contain any code. If you forget this, the rest of the page
will be interpreted as part of the script.
Some attributes can also
<button> tag shown next (which shows up
as a button) has an
onclick attribute, whose content will be run
whenever the button is clicked.
<button onclick="alert('Boom!');">DO NOT PRESS</button>
Note that I had to
use single quotes for the string in the
onclick attribute because
double quotes are already used to quote the whole attribute. I could
also have used
", but that’d make the program harder to read.
In the sandbox
Running programs downloaded from the Internet is potentially dangerous. You do not know much about the people behind most sites you visit, and they do not necessarily mean well. Running programs by people who do not mean well is how you get your computer infected by viruses, your data stolen, and your accounts hacked.
Isolating a programming environment in this way is called sandboxing, the idea being that the program is harmlessly playing in a sandbox. But you should imagine this particular kind of sandbox as having a cage of thick steel bars over it, which makes it somewhat different from your typical playground sandbox.
The hard part of sandboxing is allowing the programs enough room to be useful yet at the same time restricting them from doing anything dangerous. Lots of useful functionality, such as communicating with other servers or reading the content of the copy-paste clipboard, can also be used to do problematic, privacy-invading things.
Every now and then, someone comes up with a new way to circumvent the limitations of a browser and do something harmful, ranging from leaking minor private information to taking over the whole machine that the browser runs on. The browser developers respond by fixing the hole, and all is well again—that is, until the next problem is discovered, and hopefully publicized, rather than secretly exploited by some government or mafia.
Compatibility and the browser wars
In the early stages of the Web, a browser called Mosaic dominated the market. After a few years, the balance had shifted to Netscape, which was then, in turn, largely supplanted by Microsoft’s Internet Explorer. At any point where a single browser was dominant, that browser’s vendor would feel entitled to unilaterally invent new features for the Web. Since most users used the same browser, websites would simply start using those features—never mind the other browsers.
This was the dark age of compatibility, often called the browser wars. Web developers were left with not one unified Web but two or three incompatible platforms. To make things worse, the browsers in use around 2003 were all full of bugs, and of course the bugs were different for each browser. Life was hard for people writing web pages.
Mozilla Firefox, a not-for-profit offshoot of Netscape, challenged Internet Explorer’s hegemony in the late 2000s. Because Microsoft was not particularly interested in staying competitive at the time, Firefox took quite a chunk of market share away from it. Around the same time, Google introduced its Chrome browser, and Apple’s Safari browser gained popularity, leading to a situation where there were four major players, rather than one.
The new players had a more serious attitude toward standards and better engineering practices, leading to less incompatibility and fewer bugs. Microsoft, seeing its market share crumble, came around and adopted these attitudes. If you are starting to learn web development today, consider yourself lucky. The latest versions of the major browsers behave quite uniformly and have relatively few bugs.
That is not to say that the situation is perfect just yet. Some of the people using the Web are, for reasons of inertia or corporate policy, stuck with very old browsers. Until those browsers die out entirely, writing websites that work for them will require a lot of arcane knowledge about their shortcomings and quirks. This book is not about those quirks. Rather, it aims to present the modern, sane style of web programming.