The Art of Writing Software

Thoughts on Hypermedia APIs

Tags [ architecture, hypermedia, REST, REST API, RESTful web services, XHTML ]

The REST architectural style is defined in Roy Fielding’s thesis, primarily in chapter 5, where it is described as a set of architectural constraints. In brief, those constraints are: client-server, stateless, cache, uniform interface, layered system, and (optionally) code-on-demand.

Achieving a RESTful architecture with XHTML

Mike Amundsen proposed using XHTML as the media type of choice for web APIs, rather than the ubiquitous Atom or the application-specific XML or JSON representations commonly seen. By using XHTML profiles, we are able to define the semantics of the data contained within a particular document, as well as the semantics of the link relations and form types it contains.

Now, let’s throw a few simple rules into the system:

  1. all domain objects (including collections of domain objects) are resources and get assigned a URL

  2. beyond an HTTP GET to the API’s “home page”, a client simply follows standard XHTML semantics from returned documents; namely, doing a GET to follow a link, and constructing a GET or POST request by filling out and submitting a form.

  3. retrieval (read) of resource state should be accomplished by GET, and modification of resource state should happen with POST (via a form).

Interestingly, this means that in addition to programmatic clients being able to parse XHTML (as a subset of XML) and apply standard XHTML semantics for interactions, it is possible for a human to use a browser to interact with the resources (or, as my colleague Karl Martino put it, “you can surf an API!”).
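To make the rules above a little more concrete, here is a minimal sketch of what a programmatic client might do with a returned XHTML document: find a link by its relation, and turn a form into a request. Everything specific in it is an assumption made for illustration: the `HOME_PAGE` markup, the profile URL, the `orders` link relation, the `create-order` form class, the field names, and the `api.example.org` base URL are all hypothetical, and a real client would fetch the document over HTTP rather than read it from a string.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, urljoin

NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical "home page" representation. The profile URL, rel value,
# form class, and field names are invented for this sketch.
HOME_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://example.org/profiles/orders">
    <title>Example API</title>
  </head>
  <body>
    <a rel="orders" href="/orders">All orders</a>
    <form class="create-order" method="post" action="/orders">
      <input type="text" name="item" value="" />
      <input type="text" name="quantity" value="1" />
    </form>
  </body>
</html>"""


def find_link(doc, rel, base_url):
    """Return the absolute URL of the first <a> whose rel includes `rel`."""
    for a in doc.iterfind(".//x:a", NS):
        if rel in a.get("rel", "").split():
            return urljoin(base_url, a.get("href", ""))
    return None


def fill_form(doc, form_class, values, base_url):
    """Return (method, url, body) for the first form with the given class.

    Fields the caller does not supply keep the defaults from the document.
    For a GET form the encoded fields go in the query string and body is None.
    """
    for form in doc.iterfind(".//x:form", NS):
        if form_class not in form.get("class", "").split():
            continue
        fields = {
            i.get("name"): i.get("value", "")
            for i in form.iterfind(".//x:input", NS)
            if i.get("name")
        }
        unknown = set(values) - set(fields)
        if unknown:
            raise KeyError(f"form has no fields named {sorted(unknown)}")
        fields.update(values)
        method = form.get("method", "get").upper()
        url = urljoin(base_url, form.get("action", ""))
        encoded = urlencode(fields)
        if method == "GET":
            return method, f"{url}?{encoded}", None
        return method, url, encoded
    return None


doc = ET.fromstring(HOME_PAGE)
base = "http://api.example.org/"  # hypothetical API root
print(find_link(doc, "orders", base))
print(fill_form(doc, "create-order", {"item": "widget", "quantity": "2"}, base))
```

Note that the client never hard-codes `/orders`; it only knows the link relation and form class it is looking for, which is the sort of knowledge a profile would document.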


So how well does this match up against the REST constraints? By leveraging HTTP directly as an application protocol, we can get a lot of constraints for free, namely: client-server, statelessness, caching, layered system, and self-descriptive messages.

We also get a uniform interface: all of our domain objects are modelled as resources with identifiers, reads are accomplished by retrieving XHTML documents as representations, and writes are accomplished by sending form-encoded inputs as representations. Because a client accomplishes its goals by “clicking links and submitting forms”, the hypermedia features of XHTML let us present the available state transitions to the client, which can then choose what to do next and knows how to follow the chosen transition. Also, because an update to a resource is modelled as a POST to the same URL we would use to GET its state, this plays nicely and naturally with standard HTTP/1.1 cache semantics (a cache invalidates its stored copy when a write passes through it).
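The cache interaction can be illustrated with a toy model. The sketch below is not how any real HTTP cache is implemented: `TinyCache` and `FakeOrigin` are hypothetical names, freshness and `Cache-Control` handling are ignored entirely, and the only behaviour modelled is that a successful write to a URL (a POST here) drops the stored GET response for that same URL.

```python
class FakeOrigin:
    """Stand-in for the origin server; counts how often it is reached."""

    def __init__(self):
        self.version = 0
        self.calls = 0

    def __call__(self, method, url, body=None):
        self.calls += 1
        if method != "GET":
            self.version += 1  # pretend the write changed the resource
        return 200, f"state-{self.version}"


class TinyCache:
    """Toy intermediary: stores GET responses, invalidates on writes."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def request(self, method, url, body=None):
        if method == "GET" and url in self.store:
            return self.store[url]  # served without touching the origin
        status, payload = self.origin(method, url, body)
        if method == "GET":
            if status == 200:
                self.store[url] = (status, payload)
        elif 200 <= status < 400:
            # A successful write passed through: forget the stored copy
            # for the same URL so the next GET goes back to the origin.
            self.store.pop(url, None)
        return status, payload


origin = FakeOrigin()
cache = TinyCache(origin)
cache.request("GET", "/orders/1")             # reaches the origin
cache.request("GET", "/orders/1")             # answered from the cache
cache.request("POST", "/orders/1", "qty=2")   # write invalidates the entry
print(cache.request("GET", "/orders/1"), origin.calls)
```

This only works because reads and writes share one URL; if updates went to a separate "action" URL, the cache would have no way to know which stored representation had become stale.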

Finally, we are not using code-on-demand in our case, although we could include JavaScript with our XHTML representations to provide additional functionality for a human “surfing” our API; a programmatic client would simply ignore it. In any case, code-on-demand is listed as an optional constraint.

Coming soon…

This is an intentionally high-level post, intended as the first in a series that will go over specific examples and examine some practical considerations and useful implementation patterns. Hopefully, we’ll also be able to illustrate some of the architectural strengths and weaknesses that the REST architectural style is purported to have. Stay tuned!