Metalink/HTTP

Metalink is an XML format for describing downloads. Publishers pack information about a download, such as mirrors and checksums, into a Metalink XML file to overcome common download problems like a server going down or file corruption. Other useful information can be included as well.

Metalink/HTTP, or mirrors & hashes in HTTP Headers, is another way currently being developed to improve the download situation. It relies on Web Linking (recently approved for RFC publication as a proposed standard) for mirrors and Instance Digests (RFC 3230) for cryptographic hashes.

The nice thing is that it relies on existing standards (using a newly proposed “duplicate” relation type) with the addition of FTP HASH – a way to request the hash of a file over FTP.

Metalink/HTTP can use Metalink/XML too, for certain features like partial file hashes that would be too verbose over HTTP headers. At this stage, Metalink/HTTP only has a few implementations, but existing Metalink/XML clients can be converted to support it quickly.

Metalink/HTTP clients begin a download with a standard HTTP GET request to the Metalink server. Alternatively, they can use a HEAD request to the Metalink server to discover mirrors via Link headers. After that, the client follows with a GET request to the desired mirrors.

> GET /distribution/example.ext HTTP/1.1
> Host: www.example.com

The Metalink server responds with the data and these headers:

< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Content-Length: 14867603
< Content-Type: application/x-cd-image
< Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
< Link: <http://www2.example.com/example.ext>; rel="duplicate"; pref=1
< Link: <ftp://ftp.example.com/example.ext>; rel="duplicate"
< Link: <http://example.com/example.ext.torrent>; rel="describedby"; type="application/x-bittorrent"
< Link: <http://example.com/example.ext.metalink>; rel="describedby"; type="application/metalink4+xml"
< Link: <http://example.com/example.ext.asc>; rel="describedby"; type="application/pgp-signature"
< Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlODYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==

From the Metalink server response, the client learns some or all of the following metadata about the requested object while also starting to receive the object itself:

  • File size.
  • ETag.
  • Mirror profile link, which may describe the mirror’s priority, whether it shares the ETag policy of the originating Metalink server, geographical location, and mirror depth.
  • Peer-to-peer information.
  • Metalink/XML, which can include partial file checksums to repair a file.
  • Digital signature.
  • Instance Digest, which is the whole file checksum.
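As a rough illustration of how a client might pull this metadata out of the response headers, here is a minimal Python sketch. The function names (parse_link_header, classify_links, parse_digest_header) are made up for this example, and the parsing is deliberately naive: it assumes one link per Link header and one algorithm per Digest header, as in the response above, and it does not handle a ";" inside a quoted parameter value.

```python
import re


def parse_link_header(value):
    """Split one Link header value into (uri, params).

    Assumes the simple form <uri>; name=value; name="value".
    """
    match = re.match(r'\s*<([^>]*)>(.*)', value)
    if not match:
        raise ValueError(f"not a Link header value: {value!r}")
    uri, rest = match.groups()
    params = {}
    for part in rest.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, val = part.partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    return uri, params


def classify_links(link_values):
    """Sort links into mirrors (rel="duplicate") and
    descriptions (rel="describedby"), keyed by media type."""
    mirrors = []
    described_by = {}
    for value in link_values:
        uri, params = parse_link_header(value)
        rel = params.get("rel")
        if rel == "duplicate":
            mirrors.append(uri)
        elif rel == "describedby":
            described_by[params.get("type")] = uri
    return mirrors, described_by


def parse_digest_header(value):
    """Split a Digest header value such as 'SHA-256=...' into
    (algorithm, encoded_digest). Only the first '=' separates the
    two, because the encoded digest may end in '=' padding."""
    algorithm, _, encoded = value.partition("=")
    return algorithm.strip().upper(), encoded.strip()
```

With the Link values from the response above, classify_links would return the two mirror URLs in one list and a dictionary mapping types like "application/metalink4+xml" to their URLs.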

A client request to a mirror server, using the Range header:

> GET /example.ext HTTP/1.1
> Host: www2.example.com
> Range: bytes=7433802-
> If-Match: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
> Referer: http://www.example.com/distribution/example.ext

The mirror servers respond with a 206 Partial Content HTTP status code and appropriate “Content-Length” and “Content-Range” header fields. The mirror server response, with data, to the above request:

< HTTP/1.1 206 Partial Content
< Accept-Ranges: bytes
< Content-Length: 7433801
< Content-Range: bytes 7433802-14867602/14867603
< Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
< Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlODYzYjRjYzFhOTlmNTQwYzI2M2QwM2U2MQ==
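A client would probably want to sanity-check a partial response like this before using the data. The following Python sketch shows one way to do that; the function names are hypothetical, the headers are assumed to arrive as a plain dictionary keyed exactly as in the trace above, and only the "bytes first-last/total" form of Content-Range is handled.

```python
import re


def parse_content_range(value):
    """Parse 'bytes first-last/total' into three integers."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", value.strip())
    if not match:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last, total = (int(group) for group in match.groups())
    return first, last, total


def check_partial_response(headers, expected_etag, expected_total):
    """Return True if a 206 response is consistent with what the
    Metalink server told us about the file."""
    first, last, total = parse_content_range(headers["Content-Range"])
    # The total size must match the Content-Length of the original file.
    if total != expected_total:
        return False
    # Byte ranges are inclusive, so the body length is last - first + 1.
    if int(headers["Content-Length"]) != last - first + 1:
        return False
    # A coordinated mirror should report the same ETag as the origin.
    return headers.get("Etag") == expected_etag
```

For the response above, the range 7433802-14867602 is inclusive at both ends, which is why the Content-Length is 7433801 rather than 7433800.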

Partial file checksums are especially useful for large files, because a client can find and re-download only the damaged part instead of the whole file. HTTP mirrors can be coordinated, meaning they share the same ETag policy, which allows for early detection of file mismatches.
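To make the repair idea concrete, here is a small Python sketch of how partial file checksums could be used to find the byte ranges that need to be fetched again. It is only an illustration under assumptions of our own choosing: the piece hashes are taken to be SHA-256 hex strings over fixed-length pieces, the downloaded data is assumed to have the full expected length, and the function names are hypothetical.

```python
import hashlib


def piece_hashes(data, piece_length):
    """Compute a SHA-256 hex digest for each fixed-length piece.
    The last piece may be shorter than piece_length."""
    return [
        hashlib.sha256(data[start:start + piece_length]).hexdigest()
        for start in range(0, len(data), piece_length)
    ]


def find_bad_ranges(data, expected_hashes, piece_length):
    """Return inclusive (first, last) byte ranges whose piece hash
    does not match the expected value. Assumes data is full length."""
    bad = []
    for index, expected in enumerate(expected_hashes):
        start = index * piece_length
        piece = data[start:start + piece_length]
        if hashlib.sha256(piece).hexdigest() != expected:
            end = min(start + piece_length, len(data)) - 1
            bad.append((start, end))
    return bad
```

Each returned pair could be turned into a Range request to a mirror (for example, bytes=10-19), so only the damaged pieces are downloaded again.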

One of the nice things about Metalink/HTTP is that there is no dependency on XML, unless you want partial file hashes. Drawbacks include requiring changes to server software, whereas the XML version can even be created by users and requires no server-side changes.

Metalink/HTTP is also bound to HTTP: unlike Metalink/XML, it won’t be usable by FTP or P2P clients unless they also support HTTP.

We’ve been working on Metalink/HTTP for about 7 months and it’s still experimental, so if you have any ideas or comments, please help us make it better.