Chapter 14: HTTP requests

¶ As mentioned in chapter 11, communication on the World Wide Web happens over the HTTP protocol. A simple request might look like this:

GET /files/fruit.txt HTTP/1.1
Host: eloquentjavascript.net
User-Agent: The Imaginary Browser

¶ Which asks for the file files/fruit.txt from the server at eloquentjavascript.net. In addition, it specifies that this request uses version 1.1 of the HTTP protocol ― version 1.0 is also still in use, and works slightly differently. The Host and User-Agent lines follow a pattern: They start with a word that identifies the information they contain, followed by a colon and the actual information. These are called 'headers'. The User-Agent header tells the server which browser (or other kind of program) is being used to make the request. Other kinds of headers are often sent along, for example to state the types of documents that the client can understand, or the language that it prefers.

¶ When given the above request, the server might send the following response:

HTTP/1.1 200 OK
Last-Modified: Mon, 23 Jul 2007 08:41:56 GMT
Content-Length: 24
Content-Type: text/plain

apples, oranges, bananas

¶ The first line indicates again the version of the HTTP protocol, followed by the status of the request. In this case the status code is 200, meaning 'OK, nothing out of the ordinary happened, I am sending you the file'. This is followed by a few headers, indicating (in this case) the last time the file was modified, its length, and its type (plain text). After the headers you get a blank line, followed by the file itself.

¶ Apart from requests starting with GET, which indicates the client just wants to fetch a document, the word POST can also be used to indicate some information will be sent along with the request, which the server is expected to process in some way.1

¶ When you click a link, submit a form, or in some other way encourage your browser to go to a new page, it will do an HTTP request and immediately unload the old page to show the newly loaded document. In typical situations, this is just what you want ― it is how the web traditionally works. Sometimes, however, a JavaScript program wants to communicate with the server without re-loading the page. The 'Load' button in the console, for example, can load files without leaving the page.

¶ To be able to do things like that, the JavaScript program must make the HTTP request itself. Contemporary browsers provide an interface for this. As with opening new windows, this interface is subject to some restrictions. To prevent a script from doing anything scary, it is only allowed to make HTTP requests to the domain that the current page came from.

¶ An object used to make an HTTP request can, on most browsers, be created by doing new XMLHttpRequest(). Older versions of Internet Explorer, which originally invented these objects, require you to do new ActiveXObject("Msxml2.XMLHTTP") or, on even older versions, new ActiveXObject("Microsoft.XMLHTTP"). ActiveXObject is Internet Explorer's interface to various kinds of browser add-ons. We are already used to writing incompatibility-wrappers by now, so let us do so again:

function makeHttpObject() {
  try {return new XMLHttpRequest();}
  catch (error) {}
  try {return new ActiveXObject("Msxml2.XMLHTTP");}
  catch (error) {}
  try {return new ActiveXObject("Microsoft.XMLHTTP");}
  catch (error) {}

  throw new Error("Could not create HTTP request object.");
}

show(typeof(makeHttpObject()));

¶ The wrapper tries to create the object in all three ways, using try and catch to detect which ones fail. If none of the ways work, which might be the case on older browsers or browsers with strict security settings, it raises an error.

¶ Now why is this object called an XML HTTP request? This is a bit of a misleading name. XML is a way to store textual data. It uses tags and attributes like HTML, but is more structured and flexible ― to store your own kinds of data, you may define your own types of XML tags. These HTTP request objects have some built-in functionality for dealing with retrieved XML documents, which is why they have XML in their name. They can also handle other types of documents, though, and in my experience they are used just as often for non-XML requests.

¶ Now that we have our HTTP object, we can use it to make a request similar the example shown above.

var request = makeHttpObject();
request.open("GET", "files/fruit.txt", false);
request.send(null);
print(request.responseText);

¶ The open method is used to configure a request. In this case we choose to make a GET request for our fruit.txt file. The URL given here is relative, it does not contain the http:// part or a server name, which means it will look for the file on the server that the current document came from. The third parameter, false, will be discussed in a moment. After open has been called, the actual request can be made with the send method. When the request is a POST request, the data to be sent to the server (as a string) can be passed to this method. For GET requests, one should just pass null.

¶ After the request has been made, the responseText property of the request object contains the content of the retrieved document. The headers that the server sent back can be inspected with the getResponseHeader and getAllResponseHeaders functions. The first looks up a specific header, the second gives us a string containing all the headers. These can occasionally be useful to get some extra information about the document.

print(request.getAllResponseHeaders());
show(request.getResponseHeader("Last-Modified"));

¶ If, for some reason, you want to add headers to the request that is sent to the server, you can do so with the setRequestHeader method. This takes two strings as arguments, the name and the value of the header.

¶ The response code, which was 200 in the example, can be found under the status property. When something went wrong, this cryptic code will indicate it. For example, 404 means the file you asked for did not exist. The statusText contains a slightly less cryptic description of the status.

show(request.status);
show(request.statusText);

¶ When you want to check whether a request succeeded, comparing the status to 200 is usually enough. In theory, the server might in some situations return the code 304 to indicate that the older version of the document, which the browser has stored in its 'cache', is still up to date. But it seems that browsers shield you from this by setting the status to 200 even when it is 304. Also, if you are doing a request over a non-HTTP protocol2, such as FTP, the status will not be usable because the protocol does not use HTTP status codes.

¶ When a request is done as in the example above, the call to the send method does not return until the request is finished. This is convenient, because it means the responseText is available after the call to send, and we can start using it immediately. There is a problem, though. When the server is slow, or the file is big, doing a request might take quite a while. As long as this is happening, the program is waiting, which causes the whole browser to wait. Until the program finishes, the user can not do anything, not even scroll the page. Pages that run on a local network, which is fast and reliable, might get away with doing requests like this. Pages on the big great unreliable Internet, on the other hand, should not.

¶ When the third argument to open is true, the request is set to be 'asynchronous'. This means that send will return right away, while the request happens in the background.

request.open("GET", "files/fruit.xml", true);
request.send(null);
show(request.responseText);

¶ But wait a moment, and...

print(request.responseText);

¶ 'Waiting a moment' could be implemented with setTimeout or something like that, but there is a better way. A request object has a readyState property, indicating the state it is in. This will become 4 when the document has been fully loaded, and have a smaller value before that3. To react to changes in this status, you can set the onreadystatechange property of the object to a function. This function will be called every time the state changes.

request.open("GET", "files/fruit.xml", true);
request.send(null);
request.onreadystatechange = function() {
  if (request.readyState == 4)
    show(request.responseText.length);
};

¶ When the file retrieved by the request object is an XML document, the request's responseXML property will hold a representation of this document. This representation works like the DOM objects discussed in chapter 12, except that it doesn't have HTML-specific functionality, such as style or innerHTML. responseXML gives us a document object, whose documentElement property refers to the outer tag of the XML document.

var catalog = request.responseXML.documentElement;
show(catalog.childNodes.length);

¶ Such XML documents can be used to exchange structured information with the server. Their form ― tags contained inside other tags ― is often very suitable to store things that would be tricky to represent as simple flat text. The DOM interface is rather clumsy for extracting information though, and XML documents are notoriously wordy: The fruit.xml document looks like a lot, but all it says is 'apples are red, oranges are orange, and bananas are yellow'.

¶ As an alternative to XML, JavaScript programmers have come up with something called JSON. This uses the basic notation of JavaScript values to represent 'hierarchical' information in a more minimalist way. A JSON document is a file containing a single JavaScript object or array, which in turn contains any number of other objects, arrays, strings, numbers, booleans, or null values. For an example, look at fruit.json:

request.open("GET", "files/fruit.json", true);
request.send(null);
request.onreadystatechange = function() {
  if (request.readyState == 4)
    print(request.responseText);
};

¶ Such a piece of text can be converted to a normal JavaScript value by using the eval function. Parentheses should be added around it before calling eval, because otherwise JavaScript might interpret an object (enclosed by braces) as a block of code, and produce an error.

function evalJSON(json) {
  return eval("(" + json + ")");
}
var fruit = evalJSON(request.responseText);
show(fruit);

¶ When running eval on a piece of text, you have to keep in mind that this means you let the piece of text run whichever code it wants. Since JavaScript only allows us to make requests to our own domain, you will usually know exactly what kind of text you are getting, and this is not a problem. In other situations, it might be unsafe.

Ex. 14.1

¶ Write a function called serializeJSON which, when given a JavaScript value, produces a string with the value's JSON representation. Simple values like numbers and booleans can be simply given to the String function to convert them to a string. Objects and arrays can be handled by recursion.

¶ Recognizing arrays can be tricky, since its type is "object". You can use instanceof Array, but that only works for arrays that were created in your own window ― others will use the Array prototype from other windows, and instanceof will return false. A cheap trick is to convert the constructor property to a string, and see whether that contains "function Array".

¶ When converting a string, you have to take care to escape special characters inside it. If you use double-quotes around the string, the characters to escape are \", \\, \f, \b, \n, \t, \r, and \v4.

function serializeJSON(value) {
  function isArray(value) {
    return /^\s*function Array/.test(String(value.constructor));
  }

  function serializeArray(value) {
    return "[" + map(serializeJSON, value).join(", ") + "]";
  }
  function serializeObject(value) {
    var properties = [];
    forEachIn(value, function(name, value) {
      properties.push(serializeString(name) + ": " +
                      serializeJSON(value));
    });
    return "{" + properties.join(", ") + "}";
  }
  function serializeString(value) {
    var special =
      {"\"": "\\\"", "\\": "\\\\", "\f": "\\f", "\b": "\\b",
       "\n": "\\n", "\t": "\\t", "\r": "\\r", "\v": "\\v"};
    var escaped = value.replace(/[\"\\\f\b\n\t\r\v]/g,
                                function(c) {return special[c];});
    return "\"" + escaped + "\"";
  }

  var type = typeof value;
  if (type == "object" && isArray(value))
    return serializeArray(value);
  else if (type == "object")
    return serializeObject(value);
  else if (type == "string")
    return serializeString(value);
  else
    return String(value);
}

print(serializeJSON(fruit));

¶ The trick used in serializeString is similar to what we saw in the escapeHTML function in chapter 10. It uses an object to look up the correct replacements for each of the characters. Some of them, such as "\\\\", look quite weird because of the need to put two backslashes for every backslash in the resulting string.

¶ Also note that the names of properties are quoted as strings. For some of them, this is not necessary, but for property names with spaces and other strange things in them it is, so the code just takes the easy way out and quotes everything.

¶ When making lots of requests, we do, of course, not want to repeat the whole open, send, onreadystatechange ritual every time. A very simple wrapper could look like this:

function simpleHttpRequest(url, success, failure) {
  var request = makeHttpObject();
  request.open("GET", url, true);
  request.send(null);
  request.onreadystatechange = function() {
    if (request.readyState == 4) {
      if (request.status == 200)
        success(request.responseText);
      else if (failure)
        failure(request.status, request.statusText);
    }
  };
}

simpleHttpRequest("files/fruit.txt", print);

¶ The function retrieves the url it is given, and calls the function it is given as a second argument with the content. When a third argument is given, this is used to indicate failure ― a non-200 status code.

¶ To be able to do more complex requests, the function could be made to accept extra parameters to specify the method (GET or POST), an optional string to post as data, a way to add extra headers, and so on. When you have so many arguments, you'd probably want to pass them as an arguments-object as seen in chapter 9.

¶ Some websites make use of intensive communication between the programs running on the client and the programs running on the server. For such systems, it can be practical to think of some HTTP requests as calls to functions that run on the server. The client makes request to URLs that identify the functions, giving the arguments as URL parameters or POST data. The server then calls the function, and puts the result into JSON or XML document that it sends back. If you write a few convenient support functions, this can make calling server-side functions almost as easy as calling client-side ones... except, of course, that you do not get their results instantly.

These are not the only types of requests. There is also HEAD, to request just the headers for a document, not its content, PUT, to add a document to a server, and DELETE, to delete a document. These are not used by browsers, and often not supported by web-servers, but ― if you add server-side programs to support them ― they can be useful.
Not only the 'XML' part of the XMLHttpRequest name is misleading ― the object can also be used for request over protocols other than HTTP, so Request is the only meaningful part we have left.
0 ('uninitialized') is the state of the object before open is called on it. Calling open moves it to 1 ('open'). Calling send makes it proceed to 2 ('sent'). When the server responds, it goes to 3 ('receiving'). Finally, 4 means 'loaded'.
We already saw \n, which is a newline. \t is a tab character, \r a 'carriage return', which some systems use before or instead of a newline to indicate the end of a line. \b (backspace), \v (vertical tab), and \f (form feed) are useful when working with old printers, but less so when dealing with Internet browsers.