Thoughts on Hypermedia APIs

The REST architectural style is defined in Roy Fielding’s thesis, primarily chapter 5, where the style is described as a set of architectural constraints. A quick summary of these constraints is:

  • client-server
    • The system is divided into client and server portions.
  • stateless
    • Each request from client to server must contain all of the information necessary to understand the request.
  • cache
    • Response data is implicitly or explicitly marked as cacheable or non-cacheable.
  • uniform interface
    • All interactions through the system happen via a standard, common interface. This is achieved by adhering to four sub-constraints:
    1. identification of resources
      • Domain objects are assigned resource identifiers (e.g. URIs).
    2. manipulation via representations
      • Actions occur by exchanging representations of current or intended resource state.
    3. self-descriptive messages
      • Messages include control data (e.g. cache-related), resource metadata (e.g. alternates), and representation metadata (e.g. media type) in addition to a representation itself.
    4. hypermedia as the engine of application state
      • Clients move from one state to the next by selecting and following state transitions described in the current set of representations.
  • layered system
    • Components can only “see” the component with which they are directly interacting.
  • code-on-demand (optional)
    • Clients can be dynamically extended by downloading and running code.

Achieving a RESTful architecture with XHTML

Mike Amundsen proposed using XHTML as the media type of choice for web APIs, rather than the ubiquitous Atom or the application-specific XML or JSON representations commonly seen. By using XHTML profiles, we are able to define the semantics of the data contained within a particular document, as well as the semantics of contained link relations and form types.
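
To make that a little more concrete, here is a minimal sketch of what a profiled representation might look like and how a programmatic client could read it. Everything specific here is an assumption made up for illustration: the profile URI, the "order" resource, and the class and rel names ("status", "customer", "cancel-order") are hypothetical, and the profile document itself is assumed to define what they mean.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical representation of one "order" resource. The profile URI and
# the class/rel vocabulary are invented for this sketch.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://example.com/profiles/orders">
    <title>Order 42</title>
  </head>
  <body>
    <div class="order">
      <span class="status">open</span>
      <a rel="customer" href="/customers/7">Customer 7</a>
      <form class="cancel-order" method="post" action="/orders/42">
        <input type="hidden" name="status" value="cancelled"/>
        <input type="submit" value="Cancel"/>
      </form>
    </div>
  </body>
</html>"""


def describe(xhtml: str) -> dict:
    """Pull out the pieces a profile would give meaning to."""
    root = ET.fromstring(xhtml)
    head = root.find("x:head", NS)
    return {
        # which vocabulary this document claims to follow
        "profile": head.get("profile"),
        # data items, keyed by their class name
        "data": {el.get("class"): el.text
                 for el in root.iterfind(".//x:span[@class]", NS)},
        # link relations and where they point
        "links": {a.get("rel"): a.get("href")
                  for a in root.iterfind(".//x:a[@rel]", NS)},
        # form types the client may submit
        "forms": [f.get("class")
                  for f in root.iterfind(".//x:form[@class]", NS)],
    }
```

Calling `describe(SAMPLE)` gives the client the profile URI, the data items, the link relations, and the form types, which is everything it needs to decide what to do next without any out-of-band knowledge of URLs.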

Now, let’s throw a few simple rules into the system:

  1. all domain objects (including collections of domain objects) are resources and get assigned a URL

  2. beyond an HTTP GET to the API’s “home page”, a client simply follows standard XHTML semantics from returned documents; namely, doing a GET to follow a link, and constructing a GET or POST request by filling out and submitting a form.

  3. retrieval (read) of resource state should be accomplished by GET, and modification of resource state should happen with POST (via a form).
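
As a rough sketch of what rules 2 and 3 ask of a programmatic client, the code below turns a link or a form from a returned document into the request the client should send next. It only builds a description of the request rather than touching the network, and the home page markup, the base URL, and the "orders" and "create-order" names are all hypothetical. It also glosses over parts of real form handling (select boxes, checkboxes, an action URL that already carries a query string).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, urljoin

NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical API "home page" used to exercise the two helpers below.
HOME = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
  <a rel="orders" href="/orders">Orders</a>
  <form class="create-order" method="post" action="/orders">
    <input type="text" name="item"/>
    <input type="text" name="quantity" value="1"/>
    <input type="submit" value="Create"/>
  </form>
</body></html>"""


def link_request(doc, base_url, rel):
    """Following a link is always a GET to its (resolved) href."""
    for a in doc.iterfind(".//x:a[@rel]", NS):
        if rel in a.get("rel").split():
            return ("GET", urljoin(base_url, a.get("href")), None)
    raise LookupError(f"no link with rel={rel!r}")


def form_request(doc, base_url, form_class, values):
    """Fill out a form and describe the GET or POST that submits it."""
    for form in doc.iterfind(".//x:form[@class]", NS):
        if form_class not in form.get("class").split():
            continue
        # Start from the defaults the server supplied, then overlay our input.
        fields = {i.get("name"): i.get("value", "")
                  for i in form.iterfind(".//x:input[@name]", NS)}
        fields.update(values)
        method = form.get("method", "get").upper()
        url = urljoin(base_url, form.get("action", ""))
        body = urlencode(fields)
        if method == "GET":
            # Reads: the form fields travel in the query string.
            return ("GET", f"{url}?{body}", None)
        # Writes: the form-encoded fields are the request body.
        return ("POST", url, body)
    raise LookupError(f"no form with class={form_class!r}")
```

With `doc = ET.fromstring(HOME)`, a call to `link_request(doc, "http://api.example.com/", "orders")` describes a GET to the orders collection, and `form_request(doc, "http://api.example.com/", "create-order", {"item": "widget"})` describes a POST whose body is the filled-in form. The client never constructs a URL on its own; it only resolves what the document handed it.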

Interestingly, this means that in addition to programmatic clients being able to parse XHTML (as a subset of XML) and apply standard XHTML semantics for interactions, it is possible for a human to use a browser to interact with the resources (or, as my colleague Karl Martino put it, “you can surf an API!”).


So how well does this match up against the REST constraints? By leveraging HTTP directly as an application protocol, we can get a lot of constraints for free, namely: client-server, statelessness, caching, layered system, and self-descriptive messages.

Now, we also get a uniform interface, because all of our domain objects are modelled as resources with identifiers, reads are accomplished by retrieving XHTML documents as representations, and writes are accomplished by sending form-encoded inputs as representations. Finally, because a client accomplishes its goals by “clicking links and submitting forms”, the hypermedia features of XHTML let us describe the available state transitions to the client, who can then select what to do next and know how to follow one of the available transitions. Also, because an update to a resource is modelled as a POST to the same URL we would use to GET its state, this plays nicely and naturally with standard HTTP/1.1 cache semantics (invalidation on write-through).
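
The cache interaction is easy to see with a toy model. The sketch below is not how a real HTTP cache works in full (it ignores freshness lifetimes, validators, and response status codes); it only illustrates the one behaviour mentioned above, where a write to a URL through the cache invalidates whatever was stored for that same URL. `TinyCache` and `FakeOrigin` are hypothetical names invented for this example.

```python
class FakeOrigin:
    """Stand-in origin server holding the state of one resource."""

    def __init__(self):
        self.state = {"/orders/42": "open"}
        self.hits = 0  # how many requests actually reached the origin

    def __call__(self, method, url, body=None):
        self.hits += 1
        if method == "POST":
            self.state[url] = body
        return self.state[url]


class TinyCache:
    """Toy in-memory cache keyed by URL, sitting in front of an origin."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def request(self, method, url, body=None):
        if method == "GET":
            # Serve from the store when we can; otherwise fetch and remember.
            if url not in self.store:
                self.store[url] = self.origin(method, url, body)
            return self.store[url]
        # Unsafe method: pass it through, then drop the stored response for
        # this URL so the next GET sees the updated state.
        response = self.origin(method, url, body)
        self.store.pop(url, None)
        return response
```

Two GETs to `/orders/42` reach the origin once; a POST to the same URL goes through and evicts the stored copy, so the following GET is fetched fresh. If the write had gone to some other URL (say, a separate "update" endpoint), the cache would have had no way to know the stored representation was stale.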

Finally, we’re not using code-on-demand in our case, although we could include JavaScript with our XHTML representations to provide additional functionality for a human “surfing” our API, even if a programmatic client would ignore the JavaScript. However, code-on-demand is listed as an optional constraint anyway.

Coming soon…

This is an intentionally high-level post that I intend to be the first in a series; later posts will go over specific examples and examine some practical considerations and useful implementation patterns. Hopefully, we’ll also be able to illustrate some of the architectural strengths and weaknesses that the REST architectural style is purported to have. Stay tuned!