Discovery and HTTP

Discovery is the process by which machines find out information about web resources, which enables them to interact with previously unknown services. It centers on locating and retrieving the resource metadata and parsing it. The challenge is making this workflow consistent with the web architecture and the HTTP protocol, while at the same time addressing key scalability and efficiency requirements.

Put simply: a server is trying to interact with an unfamiliar resource (identified by a URL). The server must first find out where the resource’s metadata resides, fetch it, parse it, and learn how to interact with the resource. This definition of discovery makes a clear distinction between the process used to find the metadata and the format used to provide it. First find it, then parse it – and to find it, start from the resource’s URL.

Separating the Road from the Destination

Agreeing on the latter, the document format for describing resources, is not something I expect to find an industry-wide agreement on. And it is not necessarily a bad thing. Different document schemas offer varying levels of complexity and features and are created to address different use cases. XRDS, POWDER, and even robots.txt, offer significantly different approaches to encoding resource metadata. They each define a different schema for describing resources, sharing some general concepts, but each with a different focus and approach.

XRDS came out of the XRI world and is focused on service endpoint selection – connecting a list of concrete resources to an abstract identifier. When extending XRDS to HTTP URLs – non-abstract web resources – it provides a simple format for tagging resources and related services. These tags, called types, are URI-formatted strings which inform machines about the capabilities and characteristics of the resource they are associated with.

POWDER, drawing much from the RDF world, defines a richer set of elements for describing resources and enables a more complex hierarchy of relationships between resources. While XRDS is focused on providing a (mostly flat) list of linked resources, POWDER provides a great deal of flexibility in terms of creating resource associations and hierarchies. For the use cases I am working on, I find the XRDS approach (via XRDS-Simple) much more suitable, but I have to admit to having only a very superficial familiarity with POWDER at the moment.

Debating the suitability of these schemas without a concrete application is futile. The key to this discussion is that each of these schemas offers a different balance between complexity and functionality, and it is the market’s job to decide which one is the most suitable for each application. The XRDS, POWDER, and other communities should not try to merge their work into a single solution, nor should either one try to dismiss or ignore the other. Instead, they should focus on where they are in agreement, and where there is no value in competing approaches.

The mechanism used to get from point A (a resource URL) to point B (the metadata document or its location) is common to both specifications, and they share the same set of requirements. This is also the part of discovery that is operating within an existing and well-established protocol: HTTP. Regardless of what document format is used, locating it is still an unsolved problem.


Identifying Requirements

Getting from a resource URL to its metadata document can be implemented in many ways. The problem is that none of the current solutions addresses all the requirements presented by the common use cases. The requirements are simple, but the more of them we try to address, the less elegant the solution becomes. While working on the XRDS-Simple specification and talking to companies and individuals about it, the following requirements crystallized:

  • Resource Declaration – allow resources to declare the availability of metadata information and its location. When a resource is accessed, it should have a way to tell the discovering application (consumer) that it supports the discovery protocol and to indicate how to retrieve its metadata. This is useful when the consumer is already able to interact with the resource but can enhance that interaction with additional data. For example, accessing an ATOM feed can be enhanced if the feed endpoint also supports the ATOM Publishing protocol.
  • Direct Metadata Access – enable direct retrieval of metadata without interaction with the resource itself. Before a resource is accessed, the consumer should have a way to fetch the resource’s metadata without accessing the resource. This is important for two reasons. First, accessing an unknown resource may have undesirable consequences. After all, the information contained in the metadata is supposed to inform the consumer how to interact with the resource. The second is efficiency – removing the need to interact with the resource in order to get its metadata (which can reduce HTTP round-trips, network bandwidth, application latency, and overall waste).
  • Web Compliant – work with existing web infrastructure. This may sound trivial but it is in fact the most complex requirement. Deploying new extensions to HTTP is a complicated endeavor. Besides getting applications to support a new header, verb, or content negotiation method, the existing caches and proxies must be enhanced to properly handle these requests, and they must not fail to perform their normal duties without such enhancement. For example, a new content negotiation method may cause an existing cache to serve the wrong data to a non-discovery consumer due to its inability to distinguish the metadata request from a request for the resource itself.
  • Scale Agnostic – support large and small providers. Any solution must work for a small hosted website as well as the world’s largest portal. It must be flexible enough to allow developers with restricted access to the full HTTP protocol (such as limited access to request headers) to both provide and consume metadata information. It should also cache well and allow reuse of code and data.
  • Extendable – whatever we create must accommodate future enhancements. It should support the existing set of discovery documents such as XRDS and POWDER, as well as new metadata relationships we might have in the future. In addition, the solution should not depend on the metadata schema itself and work equally well with any document format – it should aim to keep the road and destination separate.

Solutions Matrix

The following is an inventory of the proposals and implementations trying to address metadata discovery. Each solution is reviewed for its compliance with the above requirements.

Discovery Comparison

HTTP Response Header – when a resource is accessed, typically using HTTP GET, the server includes in the response a header pointing to the location of the metadata document. For example, POWDER uses the ‘Link’ response header to create an association between the resource and its metadata. XRDS (based on the Yadis protocol) uses a similar approach, but since the ‘Link’ header was not available when Yadis was first drafted, it defines a custom header (‘X-XRDS-Location’) which serves a similar but less generic purpose.

[+] Resource Declaration – using the ‘Link’ header, any resource can point to its metadata document.
[-] Direct Metadata Access – the header is only accessible when interacting with the resource itself via a GET request. While HTTP GET is meant to be a safe operation, it is still possible for some resources to have side-effects.
[+] Web Compliant – uses the ‘Link’ header, which is an IETF standards-track draft and is consistent with HTTP design.
[-] Scale Agnostic – since discovery accounts for a small percentage of resource requests, sending the extra header on every response is wasteful. For some hosted servers, access to HTTP headers is limited, which will prevent implementing this solution.
[+] Extendable – the ‘Link’ header provides built-in extendability by allowing new link relationships.

Minimum roundtrips to retrieve metadata: 2
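
To make the consumer side of this concrete, here is a minimal Python sketch that pulls metadata locations out of a ‘Link’ header value. The function name find_links, the example URLs, and the ‘describedby’ relation are assumptions chosen for illustration. The parser only handles simple, well-formed header values and is not a complete implementation of the ‘Link’ header grammar.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s;,]+))*)'
)
# One parameter: ;name="quoted value" or ;name=token
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def find_links(header_value, rel):
    """Return the targets in a Link header value whose rel list contains `rel`."""
    targets = []
    for target, params in _LINK_VALUE.findall(header_value):
        for name, quoted, bare in _PARAM.findall(params):
            # A rel parameter may hold several space-separated relation types.
            if name.lower() == "rel" and rel in (quoted or bare).split():
                targets.append(target)
    return targets


# Example header value (assumed, not taken from a real server).
header = (
    '<http://example.com/resource.xrds>; rel="describedby"; '
    'type="application/xrds+xml", '
    '<http://example.com/style.css>; rel=stylesheet'
)
metadata_locations = find_links(header, "describedby")
```

The first roundtrip is the GET that returns this header; the second is the GET on whatever location find_links returns.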

HTTP Response Header over HEAD – same as the HTTP Response Header solution but used with the HTTP HEAD method. The idea of using the HEAD method is to solve the wasteful overhead of including the ‘Link’ header in every reply. By limiting the appearance of the ‘Link’ header only to HEAD requests, typical GET requests are not encumbered by the extra link bytes.

[+] Resource Declaration – see HTTP Response Header.
[-] Direct Metadata Access – see HTTP Response Header.
[-] Web Compliant – HTTP HEAD should return the exact same response as HTTP GET with the sole exception that the response body is omitted. By adding headers only to the HEAD response, this solution violates the HTTP protocol and might not work properly with proxies, as they can return the headers of the cached GET response.
[+] Scale Agnostic – solves the wasted bandwidth associated with the HTTP Response Header solution, but still suffers from the limitation imposed by requiring access to HTTP headers.
[+] Extendable – see HTTP Response Header.

Minimum roundtrips to retrieve metadata: 2
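
The proxy problem with the HEAD-only variant can be shown with a toy model. The sketch below is not how any particular proxy works: TinyCache and origin are hypothetical names, and the cache simply assumes (as HTTP allows it to) that a HEAD response carries the same headers as the GET response it has already stored.

```python
class TinyCache:
    """Toy cache that answers HEAD from the headers of a stored GET response."""

    def __init__(self, origin):
        self.origin = origin
        self.stored = {}  # url -> headers of the last GET response

    def request(self, method, url):
        if method == "GET":
            headers = self.origin(method, url)
            self.stored[url] = headers
            return headers
        if method == "HEAD" and url in self.stored:
            # Legitimate shortcut: HEAD should equal GET minus the body.
            return self.stored[url]
        return self.origin(method, url)


def origin(method, url):
    """Hypothetical origin server that adds 'Link' only to HEAD responses."""
    headers = {"Content-Type": "text/html"}
    if method == "HEAD":
        headers["Link"] = '<http://example.com/resource.xrds>; rel="describedby"'
    return headers


url = "http://example.com/resource"
cache = TinyCache(origin)
cache.request("GET", url)                  # an ordinary visitor warms the cache
via_cache = cache.request("HEAD", url)     # the discovery consumer arrives later
direct = origin("HEAD", url)               # what the origin would have said
```

Here the consumer behind the cache never sees the ‘Link’ header, even though the origin server is sending it on every HEAD response.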

HTTP Header Negotiation – similar to HTTP Content Negotiation, this solution uses a custom HTTP request header to inform the server of the consumer’s discovery intentions. The server responds by serving the same resource (for HTTP GET) or header (for HTTP HEAD) with the relevant ‘Link’ headers. It attempts to solve the HTTP Response Header waste issue by allowing the consumer to explicitly request the inclusion of ‘Link’ headers. Protocol Z, an illustration-only protocol discussed on the XRDS-Simple list, uses a new ‘Request-links’ header to tell the server that the consumer would like it to include certain ‘Link’ headers in its reply. A similar idea was proposed on the W3C TAG list: to use the ‘Link’ header as a request header in a similar capacity (‘Link’ is currently a response header only).

[+] Resource Declaration – same as HTTP Response Header with the option of selective inclusion.
[-] Direct Metadata Access – does not address.
[-] Web Compliant – HTTP does not include any mechanism for header negotiation and any custom solution will break existing caches.
[+-] Scale Agnostic – requires advanced access to HTTP headers on both the consumer and provider sides, but solves the bandwidth waste issue of the HTTP Response Header solution.
[+] Extendable – builds on top of the ‘Link’ header extendability.

Minimum roundtrips to retrieve metadata: 2

HTML Link Element – embeds the location of the metadata document within the HTML representation, using the HTML head section (as opposed to the HTTP headers) to store the metadata link information. It applies to HTML resources or to similar XML-based schemas with support for a ‘Link’-like element – the ATOM syndication schema being one example. POWDER uses the HTML ‘link’ element in this manner, while XRDS uses a ‘meta’ element with an ‘http-equiv’ attribute (to create an embedded version of the ‘X-XRDS-Location’ header).

[+] Resource Declaration – similar to HTTP Response Header but limited to HTML resources.
[-] Direct Metadata Access – the element requires fetching the entire resource in order to obtain the metadata location. In addition, it requires changing the resource HTML representation which makes discovery an intrusive process.
[+] Web Compliant – uses the ‘Link’ element as designed.
[+] Scale Agnostic – while this solution requires direct interaction with the resource and manipulation of its content, it is extremely accessible in many platforms.
[-] Extendable – extendability is restricted to HTML resources or similar with their own header (embedded metadata) and support for a ‘Link’-like element.

Minimum roundtrips to retrieve metadata: 2
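
As a rough sketch of what a consumer has to do here, the following Python uses the standard library HTML parser to look for both embedded forms. The class name MetadataLinkFinder, the sample page, and the ‘describedby’ relation value are assumptions for illustration. Note that the whole document has to be fetched before any of this can run.

```python
from html.parser import HTMLParser


class MetadataLinkFinder(HTMLParser):
    """Collect metadata locations embedded in an HTML document."""

    def __init__(self):
        super().__init__()
        self.locations = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values may be None.
        attrs = dict(attrs)
        if tag == "link":
            rels = (attrs.get("rel") or "").lower().split()
            if "describedby" in rels and attrs.get("href"):
                self.locations.append(attrs["href"])
        elif tag == "meta":
            equiv = (attrs.get("http-equiv") or "").lower()
            if equiv == "x-xrds-location" and attrs.get("content"):
                self.locations.append(attrs["content"])


# Sample page (assumed) carrying both forms.
page = """
<html><head>
  <title>Example</title>
  <link rel="describedby" href="http://example.com/powder.xml">
  <meta http-equiv="X-XRDS-Location" content="http://example.com/resource.xrds">
</head><body><p>Hello</p></body></html>
"""

finder = MetadataLinkFinder()
finder.feed(page)
locations = finder.locations
```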

HTTP Content Negotiation – using the ‘Accept’ request header, the consumer informs the server it is interested in the metadata and not the resource itself, to which the server responds with the metadata document or its location. In XRDS, the consumer sends an HTTP GET (or HEAD) request to the resource URL with an ‘Accept’ header and content-type ‘application/xrds+xml’. This informs the server of the consumer’s discovery interest, which in turn may reply with the discovery document itself, redirect to it, or return its location via the ‘X-XRDS-Location’ (or ‘Link’) response header.

[-] Resource Declaration – does not address as it focuses on the consumer declaring its intentions.
[+] Direct Metadata Access – provides a simple method for directly requesting the metadata document.
[-] Web Compliant – while some argue that the metadata can be considered another representation of the resource, it is very much external to it. Using the ‘Accept’ header to request a separate resource (as opposed to a different representation of the same resource) violates the HTTP protocol. It also prevents using the discovery content-type as a valid (self-standing) web resource having its own metadata.
[-] Scale Agnostic – requires access to HTTP request and response headers, as well as the registration of multiple handlers for the same resource URL based on the ‘Accept’ header. In addition, improper use or implementation of the ‘Vary’ header in conjunction with the ‘Accept’ header will cause proxies to serve the metadata document instead of the resource itself – a great concern to large providers with frequently visited front-pages.
[-] Extendable – limited to a single content-type for metadata, and does not accommodate existing schemas (which already have their own well-known content-types).

Minimum roundtrips to retrieve metadata: 1
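
The ‘Vary’ concern can be illustrated with a toy model of how a shared cache picks its storage key. This is a simplified sketch, not the behavior of any specific proxy: cache_key is a hypothetical helper, and real caches apply more elaborate matching rules.

```python
def cache_key(url, request_headers, vary=""):
    """Build the key a toy shared cache would use to store a response.

    `vary` is the value of the response's Vary header; an empty string
    models a server that did not send one.
    """
    lowered = {name.lower(): value for name, value in request_headers.items()}
    varied = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return (url,) + tuple((name, lowered.get(name, "")) for name in varied)


url = "http://example.com/"
browser = {"Accept": "text/html"}
consumer = {"Accept": "application/xrds+xml"}

# Without Vary, both requests map to one cache entry, so whichever
# response is stored first is served to everyone.
collides = cache_key(url, browser) == cache_key(url, consumer)

# With "Vary: Accept", the two responses are stored separately.
separated = cache_key(url, browser, "Accept") != cache_key(url, consumer, "Accept")
```

In the colliding case, a discovery request that reaches the cache first could leave ordinary visitors receiving the metadata document in place of the front page.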

HTTP OPTIONS Method – the OPTIONS method is used to interact with the server with regard to its capabilities, and communication-related information about the resource. While all the previous solutions require direct interaction with the resource, as they all imply making HTTP GET requests to the resource, OPTIONS does not. The OPTIONS method, together with an optional request header, can be used to request both the metadata location and metadata content itself.

[-] Resource Declaration – does not address.
[+] Direct Metadata Access – provides a clean mechanism for requesting metadata information about a resource without interacting with it.
[+] Web Compliant – uses an existing HTTP feature.
[-] Scale Agnostic – requires consumer and provider access to the OPTIONS HTTP method. Also does not support caching which makes this solution inefficient.
[+] Extendable – built into the OPTIONS method.

Minimum roundtrips to retrieve metadata: 1

WebDAV PROPFIND Method – similar to the OPTIONS method, the PROPFIND method can be used to request resource specific properties, one of which can hold the location of the metadata document. PROPFIND, unlike OPTIONS, cannot return the metadata itself, unless it is returned in the required PROPFIND schema (a multistatus XML element). Other options include URIQA, an HTTP extension which defines a method called MGET, and ARK (Archival Resource Key) – a method similar to PROPFIND that allows the retrieval of resource attributes using keys (which describe the resource).

[-] Resource Declaration – does not address.
[+-] Direct Metadata Access – does not require interaction with the resource, but does require at least two requests to get the metadata (get location, get document).
[+-] Web Compliant – uses HTTP extensions with less support than standard HTTP, but still based on published standards.
[-] Scale Agnostic – same as the HTTP OPTIONS Method.
[+-] Extendable – uses extendable protocols but at the same time depends on solutions that have already gone beyond the standard HTTP protocol, which makes further extensions more complex and unsupported.

Minimum roundtrips to retrieve metadata: 2

Custom HTTP Method – similar to the HTTP OPTIONS Method, a new method can be defined (such as DISCOVER) to return (or redirect to) the metadata document. The new method can allow caching.

[-] Resource Declaration – does not address.
[+] Direct Metadata Access – same as the HTTP OPTIONS Method.
[-] Web Compliant – depends heavily on extending every platform to support the extension. Unlikely to be supported by existing proxy services and caches.
[-] Scale Agnostic – same as HTTP OPTIONS Method with the additional burden on smaller sites requiring access to the new protocol.
[+] Extendable – new protocol that can extend as needed.

Minimum roundtrips to retrieve metadata: 1

Static Resource Mapping – instead of using HTTP facilities to access the metadata, this solution defines a template to convert any URL to the metadata document URL. This can be done by adding a prefix or suffix to the resource URL, which turns it into a new resource URL. The new URL points to the metadata document. For example, to fetch the metadata document for http://example.com/resource, the consumer makes an HTTP GET request to http://xrds.example.com/resource, http://example.com/resource;discovery or http://example.com/xrds?http://example.com/resource. The idea is to define a static map from any URL to its metadata information, avoiding any dependencies on the HTTP protocol.

[-] Resource Declaration – does not address.
[+] Direct Metadata Access – creates a unique URL for the metadata document.
[+] Web Compliant – uses basic HTTP facilities.
[+-] Scale Agnostic – depending on the static mapping chosen, some hosted environments will have a problem gaining access to the mapped URL.
[-] Extendable – provides a very specific and limited method to map between resources and their metadata.

Minimum roundtrips to retrieve metadata: 1
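
The three example mappings above are simple enough to write down directly. The Python below is a minimal sketch: the function names are made up for illustration, and the query form embeds the resource URL unencoded only because that is how the example is written (a real deployment would probably percent-encode it).

```python
from urllib.parse import urlsplit, urlunsplit


def subdomain_mapping(url, label="xrds"):
    """http://example.com/resource -> http://xrds.example.com/resource"""
    p = urlsplit(url)
    return urlunsplit((p.scheme, f"{label}.{p.netloc}", p.path, p.query, p.fragment))


def suffix_mapping(url, suffix=";discovery"):
    """http://example.com/resource -> http://example.com/resource;discovery"""
    p = urlsplit(url)
    return urlunsplit((p.scheme, p.netloc, p.path + suffix, p.query, p.fragment))


def query_mapping(url, endpoint="/xrds"):
    """http://example.com/resource -> http://example.com/xrds?http://example.com/resource"""
    p = urlsplit(url)
    return urlunsplit((p.scheme, p.netloc, endpoint, url, ""))


resource = "http://example.com/resource"
candidates = [
    subdomain_mapping(resource),
    suffix_mapping(resource),
    query_mapping(resource),
]
```

Each function is a pure string transformation, which is why this approach needs only one roundtrip: the consumer can compute the metadata URL without asking anyone.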

Dynamic Resource Mapping – same as static mapping but with the ability for each domain to specify its own discovery mapping template. This is usually done by placing a configuration file at a known location (such as robots.txt or ) which contains the template needed to perform the URL mapping. The consumer first obtains the configuration document (which may be cached using normal HTTP facilities), parses it, then uses that information to access the metadata document.

[+-] Resource Declaration – does not address individual resources, but allows entire domains to declare their support (and how to use it).
[+] Direct Metadata Access – once the mapping template has been obtained, metadata can be accessed directly.
[+] Web Compliant – uses an existing design pattern (robots.txt) and standard HTTP facilities.
[+-] Scale Agnostic – works well at the URI authority level (domain) but is inefficient at the URI path level (resource path). It is harder to implement when different paths within the same domain need to use different templates. With the decreasing cost of custom domains and sub-domains, this will not be an issue for most services, but it does require sharing configuration at the domain/sub-domain level.
[+-] Extendable – can be, depending on the schema used to format the configuration document. At the same time, if the configuration document uses one of the current discovery schemas such as XRDS or POWDER, it will create a (mostly political) barrier to adoption by the other community. If it doesn’t, it means creating yet another schema (or at least requiring consumers to parse another schema).

Minimum roundtrips to retrieve metadata: initial 2, 1 after caching
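
A short sketch shows where the "initial 2, 1 after caching" figure comes from. Everything specific in it is hypothetical: the configuration path /discovery.txt, the Metadata-Template field, the {uri} placeholder, and the Discoverer class are invented for illustration and do not come from any existing specification. The fetch function is injected so the example runs without a network.

```python
from urllib.parse import quote, urlsplit

CONFIG_PATH = "/discovery.txt"        # hypothetical well-known location
TEMPLATE_FIELD = "metadata-template"  # hypothetical field name


def parse_template(config_text):
    """Return the mapping template from a config document, or None."""
    for line in config_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == TEMPLATE_FIELD:
            return value.strip()
    return None


class Discoverer:
    """Map resource URLs to metadata URLs, caching one template per authority."""

    def __init__(self, fetch):
        self.fetch = fetch          # callable: url -> response body text
        self.templates = {}         # "scheme://host" -> template or None
        self.config_requests = 0    # how many config documents were fetched

    def metadata_url(self, resource_url):
        parts = urlsplit(resource_url)
        authority = f"{parts.scheme}://{parts.netloc}"
        if authority not in self.templates:
            self.config_requests += 1
            config = self.fetch(authority + CONFIG_PATH)
            self.templates[authority] = parse_template(config)
        template = self.templates[authority]
        if template is None:
            return None
        return template.replace("{uri}", quote(resource_url, safe=""))


def fake_fetch(url):
    # Stand-in for an HTTP GET of the configuration document.
    return "Metadata-Template: http://example.com/meta?uri={uri}\n"


discoverer = Discoverer(fake_fetch)
first = discoverer.metadata_url("http://example.com/a")
second = discoverer.metadata_url("http://example.com/b")
```

Only the first lookup for a given authority costs an extra request; the in-memory dictionary stands in for ordinary HTTP caching of the configuration document.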


Putting it Together

The intention of this post is not to identify a single solution, but to show what solutions have been discussed in a single comprehensive list. It does however show that some combination of the ‘Link’ header with a dynamic mapping approach to metadata location will produce the closest match to the list of requirements.

I would like to see the XRI, XRDS-Simple, OpenID, OAuth, and POWDER communities begin a dialog around this that will move us quickly towards a single road to discovery. We can continue to disagree on the destination.

4 thoughts on “Discovery and HTTP”

  1. Speaking as a member of many/most of these communities, I’m fairly agnostic.
    I *am* partial to any solution that can leverage caching (to reduce two requests to “less than two but more than one”) on average.
    My guess is that the one that will win is the one that is the simplest to get to from where we are today. That would be the static resource mapping with probably a suffix (add /.metadata/media-type to the URI, for example). This doesn’t address the “what descriptor type(s) is/are available” problem, but maybe that’s not an issue if we assume clients have an idea of what they are looking for, and we don’t mind 404’s and have good caching… Isn’t that the web way? Ideally, we’d like to agree on a “master descriptor” (a la RDDL, etc) but it doesn’t seem like that’s gonna happen, as you point out.

  2. Eran, great post. Best written analysis on this issue I’ve seen – and that’s saying a lot considering I reviewed through most of the good links on http://esw.w3.org/topic/FindingResourceDescriptions last week.
    I agree that Link headers can help, but the fact they require 2 roundtrips always is IMHO a reason they will only be a bandaid.
    Static and dynamic mapping are the two solutions I believe have the most potential. I like the elegance of dynamic mapping but I agree with Gabe’s comments above that static mapping is in the vast majority of cases much simpler and more direct, and can work with the vast majority of http(s) URIs.
    Two questions:
    1) Why are you leaning towards dynamic mapping vs. static?
    2) Must the two be mutually exclusive, i.e., can’t you have both?
    =Drummond

  3. “begin a dialog around this that will move us quickly towards a single road to discovery. We can continue to disagree on the destination.”
    This is a great analysis (made the Cover Pages News too), but I had to react to this statement.
    Ordinarily, one wants to have a consensus on destination or else the conflicting agendas and interpretations of the proposed roadway will kill you.
    The real world example of that has to do with transportation initiatives and the referenda to curtail taxation for transportation funding. We’re in that death spiral here in Washington State.
    There are so many examples of this idea going awry in Information Technology that I hesitate to pick any one.
    And the analysis is great. I must subscribe to this blog.

  4. It seems to me that some information with respect to PROPFIND is misleading.
    Resource Declaration – availability of PROPFIND can be determined by inspecting the resource with OPTIONS (Allow header), or by simply trying and checking the response code. Of course that reveals just support for the method, and not for specific metadata.
    Direct Metadata Access – this seems like a single request to me (one PROPFIND), unless the PROPFIND response does not include the relevant information but just a link to it. Furthermore, WebDAV DeltaV (RFC 3253) defines a way to expand properties on linked resources in-line, potentially avoiding additional requests.
    Extendable – “uses extendable protocols but at the same time depends on solutions that have already gone beyond the standard HTTP protocol, which makes further extensions more complex and unsupported” — I’m not sure what the last statement tries to say. The simplest extension point for PROPFIND is defining new properties, which is totally supported and not complex at all; just mint a new unique namespace/localname combination.
