Cloud APIs REST Style Guide

From FedoraProject

Revision as of 11:00, 27 May 2010 by Markmc


Introduction to REST

REST (REpresentational State Transfer) is a distributed computing style formalized by Roy Fielding in his doctoral dissertation. Fielding studied the architectural successes of the web and generalized his findings into a set of design principles that can be applied to other distributed hypermedia systems.

REST proposes several principles including:

  1. Each resource is addressable by a URI
  2. Resources are manipulated via their representations (e.g. XML, HTML, JSON, YAML etc.)
  3. A uniform interface is exposed for all resources (GET, PUT, POST, DELETE etc. in the case of HTTP)
  4. Messages are self-descriptive - i.e. each message contains everything necessary for a stateless server to process it
  5. Hypermedia as the engine of application state (HATEOAS) - state transitions (e.g. changing a resource's state or examining a related resource) are driven by the hypermedia representation of the resource itself

By applying these principles to web services in general, it is hoped that web services will gain some of the desirable properties of the web, such as horizontal scalability, decoupled clients and servers, reduced latency through the use of proxies, and real interoperability.

Another very real feature of RESTful APIs is their simplicity and familiarity of use. With any RESTful API, it should be relatively trivial to point a simple client (e.g. curl in a shell script) at a URI for the API and quickly build something useful. This makes a refreshing change from heavyweight middleware frameworks such as CORBA or WS-*.

A Word on HATEOAS

Hypermedia as the engine of application state is one of the REST principles which folks can struggle to get to grips with.

HATEOAS means that the hypermedia guides the clients' interactions. In other words, the hypermedia responses from the API contain links which the client can follow in order to make further requests. This also implies that the URI structure is not part of the API so the client does not manually construct URIs and the server is free to change its URI structure in future.

If you find that you are documenting the URI structure and explaining how clients should construct a URI, you are likely breaking this principle.

Purpose of this Style Guide

This document has two purposes:

  1. Give straightforward answers to some of the questions every RESTful API designer gets bogged down in.
  2. Document consensus amongst the various API projects related to Red Hat's cloud stack.

The idea is that rather than having each of these projects individually trip over the many hurdles associated with RESTful API design, we will collectively address each of the issues and come to some form of rough consensus (gasp!). And hopefully, not only will we save ourselves a few ulcers, but we will also end up with a set of APIs that look somewhat consistent.

This is a wiki, so feel free to edit it. If you want to make a more controversial change, or there is something here you don't like, fire off an email to the rest-practices mailing list.

Guidelines

Each of the sections below tries to advise you, the API designer, on the consensus style for our cloud-related APIs.

Where appropriate, there will be a link to a relevant discussion on the rest-practices mailing list. Where the subject is still under discussion, that will be called out. Where multiple acceptable approaches have been proposed with no great controversy, then the approach which matches any of our existing APIs will be used.

XML Style

Okay, let's start with something easy (hah!) - naming conventions for elements and attributes on our XML representation of resources.

We have three options:

  1. CamelCase/studlyCaps - Java coding style
  2. lowercase_underscore - Python coding style
  3. lowercase-hyphen - perhaps more common in XML

The last option isn't so great where the identifiers are mapped into programming language identifiers. CamelCase is nice for Java implementations, but lowercase_underscore just looks better in XML/JSON/etc.

Example:

<customer_order product_type="book">
  <book href="..."/>
  ...
</customer_order>

URL Style

URLs should be alllowercasewithnopunctuation e.g.

http://foo.com/dublintours/guinnessstorehouse/

with the exception of query parameters which should be lower_case:

http://foo.com/dublintours/guinnessstorehouse/?start_date=today
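
The convention is easy to check mechanically, e.g. in a server's test suite. Here is a minimal sketch in Python; the function name follows_url_style and the exact regular expressions are assumptions made for illustration. It only looks at the naming of path segments and query parameters, so opaque identifiers such as UUIDs would need their own rule.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Path segments: lowercase letters and digits only, no punctuation.
_SEGMENT = re.compile(r"^[a-z0-9]+$")
# Query parameter names: lowercase words separated by underscores.
_PARAM = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")


def follows_url_style(url):
    """Return True if the path segments and query parameter names match the style."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if not all(_SEGMENT.match(s) for s in segments):
        return False
    names = [name for name, _ in parse_qsl(parts.query, keep_blank_values=True)]
    return all(_PARAM.match(n) for n in names)


print(follows_url_style("http://foo.com/dublintours/guinnessstorehouse/?start_date=today"))
print(follows_url_style("http://foo.com/Dublin_Tours/"))
```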

Collections

URLs and XML elements for collections are the plural of the resource name e.g.

<departments>
  <department/>
  <department/>
</departments>

Entry Point

As implied by HATEOAS, each API should strive for a single entry point URI.

Where there is a natural top-level object or collection represented by the API, this can be the resource addressed by the entry point. That resource will then have links with which the client can navigate to all other resources.

In other cases, the top-level URI should merely return an "api" resource with links to the other resources or collection of resources in the API e.g.

GET / HTTP/1.1
Host: {host}

HTTP/1.1 200 OK
Content-Type: application/xml
Content-Length: {length}

<api>
  <link rel="books" href="/books"/>
  <link rel="music" href="/music"/>
  <link rel="orders" href="/orders"/>
</api>

The client uses its knowledge of the API's link relation types to decide which URI it needs.
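
As a rough illustration of that lookup, here is a minimal sketch in Python using only the standard library. The function name resolve_link and the host example.com are assumptions for the example; the XML is the entry point document shown above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Entry point response body, copied from the example above.
API_DOC = """
<api>
  <link rel="books" href="/books"/>
  <link rel="music" href="/music"/>
  <link rel="orders" href="/orders"/>
</api>
"""


def resolve_link(entry_point_uri, api_xml, rel):
    """Return the absolute URI for the given link relation, or None if absent."""
    root = ET.fromstring(api_xml)
    for link in root.findall("link"):
        if link.get("rel") == rel:
            # href may be relative, so resolve it against the entry point.
            return urljoin(entry_point_uri, link.get("href"))
    return None


print(resolve_link("http://example.com/", API_DOC, "orders"))
```

Note that the client hard-codes only the entry point and the relation name "orders", never the path /orders itself.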

XML Schema

RESTful services should be representation-oriented. This means you should pay close attention to the representation of the service's resources.

The default representation of resources in our APIs is XML. Each API should have an XML schema (or RNG schema) which clients and the server alike can use to validate their XML output in their test suites. Deployed servers should not validate client input using this schema so that newer clients may continue to work with older servers.

In the case of a Java server using JAX-RS, it makes good sense to start with a schema of the representation and use xjc to generate JAX-B annotated classes. This ensures your design focus is on the XML representation rather than the object model in the code.

JSON

Describe our mapping of XML to JSON.

YAML

Describe our mapping of XML to YAML.

Compatibility and Versioning

Once released, all cloud APIs should make an API stability guarantee. Resource representations may be extended, but it must be in a backwards compatible manner - i.e. an old client works with a new server.

If at any point in the API evolution a backwards incompatible change must be made, then a new link relation type should be added to support the new incompatible representation and the old relation type should be retained e.g.

<api>
  <link rel="books_v2" href="/books/v2/"/>
  <link rel="books" href="/books"/>
  <link rel="music" href="/music"/>
  <link rel="orders" href="/orders"/>
</api>
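
From the client side, this might look like the following Python sketch: a newer client asks for the relation types it understands in order of preference and falls back to the old one. The function name pick_link and the two sample documents are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def pick_link(api_xml, preferred_rels):
    """Return (rel, href) for the first relation in preferred_rels the server offers."""
    root = ET.fromstring(api_xml)
    offered = {link.get("rel"): link.get("href") for link in root.findall("link")}
    for rel in preferred_rels:
        if rel in offered:
            return rel, offered[rel]
    return None


NEW_SERVER = """<api>
  <link rel="books_v2" href="/books/v2/"/>
  <link rel="books" href="/books"/>
</api>"""

OLD_SERVER = """<api>
  <link rel="books" href="/books"/>
</api>"""

# A newer client prefers books_v2 but still works against an older server.
print(pick_link(NEW_SERVER, ["books_v2", "books"]))
print(pick_link(OLD_SERVER, ["books_v2", "books"]))
```

An old client that only knows "books" keeps working against the new server because the old relation type is retained.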

FIXME: discuss the option of using content type negotiation based on versioned media types for this.

Documentation

APIs should be documented. Suggest a documentation style.


Identifiers

Resources may typically have three types of identifiers associated with them:

  1. An opaque, server-generated identifier like a UUID. This identifier should be relatively permanent and suitable for clients to store in their own database. Given an entry point URI and this identifier, the client should be able to navigate to the resource.
  2. A URI, which is also opaque and server-generated, but less stable. The server hostname may change, the API entry point may move or the URI structure may change. For these reasons, clients should only use URIs during a single session.
  3. An optional human-readable name, most likely assigned by the user. It should be possible to find the resource using its name, but clients should be aware that the name can be changed at any time by the user.

Some guidelines fall naturally from those observations:

  • When a server response includes a reference to a resource, the most natural identifier to supply is the URI. However, the primary id could also be supplied for convenience e.g.
GET /groups/ HTTP/1.1
<group id="666" href="/groups/666">
  ...
  <members>
    <user id="101" href="/users/101"/>
    <user id="202" href="/users/202"/>
  </members>
</group>
  • When a client must supply a reference to a resource, it should only be required to supply the primary ID e.g.
POST /vms/ HTTP/1.1
<vm>
  ...
  <template id="67e2aa74-2b84-4d50-96e2-d1ec5b961c24"/>
</vm>
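
To illustrate the difference in stability between the two identifiers, here is a minimal Python sketch of a client that stores only the id and looks up the current URI from a server response each session. The function name href_for_id is an assumption; the XML is the group example from above.

```python
import xml.etree.ElementTree as ET

GROUP_DOC = """
<group id="666" href="/groups/666">
  <members>
    <user id="101" href="/users/101"/>
    <user id="202" href="/users/202"/>
  </members>
</group>
"""


def href_for_id(xml_text, tag, resource_id):
    """Find the element with the given tag and id and return its current href."""
    root = ET.fromstring(xml_text)
    for element in root.iter(tag):
        if element.get("id") == resource_id:
            return element.get("href")
    return None


# The client persisted only the id "202"; the URI is looked up afresh.
print(href_for_id(GROUP_DOC, "user", "202"))
```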

Links

Resources must often reference other resources. One option is to use a partial representation of the referenced resource e.g.

<user href="/users/101"/>

or

<order id="364782"/>

Another option is to use an Atom link:

<actions>
  <link rel="reboot" href="/vms/1234/reboot"/>
  <link rel="shutdown" href="/vms/1234/shutdown"/>
</actions>

Both options have their uses, so decide which is most suitable based on these guidelines:

  • If the referenced resource is one of the main objects in the API, then <resource href="..."/> is probably appropriate
  • If the reference is to one of a number of resources of the same type and the Atom link's rel attribute could be used to allow the client to pick between them, then use an Atom link
  • Prefer a href attribute over a <link rel="self" href="..."/> link
  • If you want to also expose the same link in the HTTP headers, then perhaps use an Atom link in the body for consistency

Link Headers

It can be useful to use Link headers to expose some links related to a resource in the HTTP headers. This allows clients to obtain those links without parsing the response body, or to use a HEAD request and avoid fetching the body at all.

However, they are appropriate only when the entity body can only contain a single link of a given relationship type - e.g. it is probably not appropriate in a response containing a collection of objects.

Also, some clients and servers have difficulty with multiple headers of the same name. To avoid this issue, combine multiple links into a single Link header, separating them with commas.
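
The comma-separated form can be sketched in Python as follows. The helper names and the VM action URIs are assumptions for illustration, and the parser deliberately handles only the simple `<uri>; rel="name"` form it produces itself, not every parameter the Link header syntax allows.

```python
import re


def build_link_header(links):
    """Join (uri, rel) pairs into one comma-separated Link header value."""
    return ", ".join('<{}>; rel="{}"'.format(uri, rel) for uri, rel in links)


# Matches one `<uri>; rel="name"` link value (simple form only).
_LINK_VALUE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value):
    """Return a dict mapping each rel to its URI."""
    return {rel: uri for uri, rel in _LINK_VALUE.findall(value)}


header = build_link_header([
    ("/vms/1234/reboot", "reboot"),
    ("/vms/1234/shutdown", "shutdown"),
])
print("Link: " + header)
print(parse_link_header(header))
```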

The draft HTTP Link header specification requires that any custom relations defined by an application be identified by a URI. This is inconvenient, not yet common practice and not fully approved, so we avoid it for now.

URI Templates

Where query parameters are used in an API, it makes sense to use URI templates to avoid leaking detailed knowledge of the URI structure onto the client side.

The API should document the subset of the URI template spec it will use and what substitution variables will be used in the API's templates.
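
As an example of how small such a subset can be, here is a minimal Python sketch that supports only plain {name} expressions, with each value percent-encoded. The template and the variable names name and max_results are hypothetical, not taken from any of our APIs.

```python
import re
from urllib.parse import quote


def expand(template, variables):
    """Expand simple {name} expressions, percent-encoding each value."""
    def substitute(match):
        # Undefined variables expand to the empty string.
        value = variables.get(match.group(1), "")
        return quote(str(value), safe="")
    return re.sub(r"\{([A-Za-z0-9_]+)\}", substitute, template)


# Hypothetical template that a server might advertise in a link.
template = "/vms?name={name}&max_results={max_results}"
print(expand(template, {"name": "web server", "max_results": 10}))
```

The client only needs to know the variable names; the rest of the URI structure stays under the server's control.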

Media Types

Media types encode more detail of the application protocol in the headers, rather than in the request/response bodies.

If we use media types, we should continue to support application/xml, application/json, application/x-yaml.

Use of media types for versioning?

Scope of the media types? Take a look at e.g. the Sun Cloud API's set of media types.

Naming of our media types.

Use of +json and +yaml modifiers.

Read-Only Fields

  1. Ignore changes to read-only fields
  2. Return an error if PUT/POSTed doc includes a read-only field
  3. Return an error if PUT/POSTed doc includes a change to a read-only field

(1) means we're not being clear on semantics.

(2) is too restrictive on clients that want to do a GET, make a minor change with XPath and then PUT the result.

(3) works, except for an often changing read-only value (e.g. free disk space).

Our pragmatic approach combines (1) and (3).
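
One possible reading of that combination, sketched in Python: changes to stable read-only fields are rejected as in (3), while often-changing read-only values are silently ignored as in (1). This interpretation, the field names and the exception class are all assumptions made for illustration.

```python
# Hypothetical field classification for a single resource type.
STABLE_READ_ONLY = {"id"}
VOLATILE_READ_ONLY = {"free_disk_space"}


class ReadOnlyFieldError(Exception):
    """Raised when a client tries to change a stable read-only field."""


def apply_update(current, submitted):
    """Return the updated resource, mixing options (1) and (3)."""
    updated = dict(current)
    for field, value in submitted.items():
        if field in VOLATILE_READ_ONLY:
            # Option (1): ignore it, the value may have changed since the GET.
            continue
        if field in STABLE_READ_ONLY:
            # Option (3): unchanged values are fine, changes are an error.
            if value != current.get(field):
                raise ReadOnlyFieldError(field)
            continue
        updated[field] = value
    return updated


current = {"id": "12345", "name": "foo", "free_disk_space": 900}
# A GET-modify-PUT round trip with a stale free_disk_space value still succeeds.
print(apply_update(current, {"id": "12345", "name": "bar", "free_disk_space": 1000}))
```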

Query Parameters

Allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations?

CRUD

The basic CRUD (Create, Read, Update, Delete) operations should be modelled using the HTTP POST, GET, PUT and DELETE methods respectively.

In order to list the contents of a collection, you GET the collection URI:

GET /resources/ HTTP/1.1

HTTP/1.1 200 OK
Content-Type: application/xml

<resources>
  <resource id="12345" href="/resources/12345">
    <name>foo</name>
    ...
  </resource>
  <resource ...>
    ...
  </resource>
</resources>

Note, the server may choose to return a partial representation of the resources when listing a collection. All the client should rely on being available is the individual resource URIs.

To fetch the complete representation of an individual resource, you GET the resource URI:

GET /resources/12345 HTTP/1.1

HTTP/1.1 200 OK
Content-Type: application/xml

<resource id="12345" href="/resources/12345">
  ...
</resource>

To create a new resource, you POST to the collection URI. The URI of the newly created resource is returned using the Location header and, optionally, the representation of the resource is returned in the response body:

POST /resources HTTP/1.1
Content-Type: application/xml

<resource>
  <name>foo</name>
</resource>

HTTP/1.1 201 Created
Location: /resources/54321
Content-Type: application/xml
Content-Length: {length}

<resource id="54321" href="/resources/54321">
  <name>foo</name>
  ...
</resource>

To modify a resource, you PUT to the resource URI:

PUT /resources/54321 HTTP/1.1
Content-Type: application/xml

<resource>
  <name>bar</name>
</resource>

HTTP/1.1 200 OK
Content-Type: application/xml

<resource id="54321" href="/resources/54321">
  <name>bar</name>
  ...
</resource>

Finally, to delete a resource, you DELETE the resource URI:

DELETE /resources/54321 HTTP/1.1

HTTP/1.1 204 No Content
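
The mapping of CRUD operations onto HTTP methods can be sketched from the client side using Python's standard library. The helper build_request and the host example.com are assumptions for illustration; the sketch builds the request without sending it.

```python
import urllib.request

# CRUD operation -> HTTP method, as described at the start of this section.
CRUD_METHODS = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


def build_request(operation, uri, xml_body=None):
    """Build (but do not send) the HTTP request for a CRUD operation."""
    headers = {"Accept": "application/xml"}
    data = None
    if xml_body is not None:
        data = xml_body.encode("utf-8")
        headers["Content-Type"] = "application/xml"
    return urllib.request.Request(
        uri, data=data, headers=headers, method=CRUD_METHODS[operation]
    )


# example.com is a placeholder host; pass the request to
# urllib.request.urlopen() to actually send it.
request = build_request(
    "create", "http://example.com/resources", "<resource><name>foo</name></resource>"
)
print(request.get_method(), request.full_url)
```

In a real client the URIs passed in would come from links in earlier responses (and the Location header after a create) rather than being typed in by hand.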

Modelling Operations

Async Operations

Caching

Updates Monitoring

Errors

WADL

Security

Lots of topics here - users, roles, groups, authentication, authorization, encryption ...

The resource representations seen by a given user should depend on what that user has permission to access. Some examples:

  1. There should be no 'reboot' action link in a VM representation if the user does not have permission to reboot that VM
  2. There should be no 'HR' department included in the departments collection if that user does not have permission to view the 'HR' department
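
The first example could be sketched on the server side like this, in Python. The action table, the idea that a permission is named after the action it allows, and the function name vm_representation are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical action links for a VM, keyed by the permission each requires.
VM_ACTIONS = {
    "reboot": "/vms/1234/reboot",
    "shutdown": "/vms/1234/shutdown",
}


def vm_representation(vm_id, user_permissions):
    """Build a VM document containing only the actions this user may perform."""
    vm = ET.Element("vm", {"id": vm_id, "href": "/vms/" + vm_id})
    actions = ET.SubElement(vm, "actions")
    for rel, href in VM_ACTIONS.items():
        if rel in user_permissions:
            ET.SubElement(actions, "link", {"rel": rel, "href": href})
    return ET.tostring(vm, encoding="unicode")


# This user may shut the VM down but not reboot it.
print(vm_representation("1234", {"shutdown"}))
```

A client can then decide whether to offer an action simply by checking whether the corresponding link is present.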

Language/Platform Considerations

Most of our APIs are implemented using:

  • Java, JBoss, JAX-RS, RESTeasy
  • Ruby, Rails, Sinatra, ActiveResource, ...

REST makes the interoperability question mostly straightforward, but we should be mindful about not making things too awkward for a certain core set of clients:

  • Shell script using e.g. curl
  • Java (which client API do we recommend?)
  • Ruby
  • Python

Java

RESTeasy's Client Proxy Framework has the advantage that the server-side JAX-RS annotated interfaces can be re-used, but this also presents a problem. By using the proxy framework, a client embeds knowledge of the URI structure and, as such, is susceptible to future changes in the URI structure. For that reason, using this framework is discouraged except for quick n' dirty clients that aren't worried about future changes to the API.

Further Resources

Books

Blogs

Talks

RESTful Cloud APIs

Other RESTful APIs

Mailing Lists