INTERNET-DRAFT                                          Edward Hardie
Expires: January 18, 1998                                    NASA NIC

Scenarios for the Delivery of Negotiated Content using HTTP

1. Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).

Distribution of this memo is unlimited. Please send comments to the author.

2. Abstract

This document describes various problems which have arisen in attempts to deliver negotiated content using HTTP and examines several scenarios by which improvements in delivery might be accomplished. This document does not discuss how HTTP might be used to negotiate the use of other protocols to deliver content.

3. Problems with content negotiation in HTTP

Content providers on the World Wide Web would like to present their material in the form most appropriate to each individual user. In order to do this effectively, the content provider and content user must be able to negotiate among variants when more than one representation of a resource is available. This negotiation entails the exchange of information between the provider and user about their respective capabilities and preferences.

The Hypertext Transfer Protocol (HTTP) has since version 1.0 [1] used certain request headers to indicate a list of the responses preferred by the user agent making the request.
The Accept request header is used to indicate media types, Accept-Charset a list of preferred character sets, Accept-Encoding supported content encodings, and Accept-Language the set of natural languages preferred. In theory, a server receiving these headers can use them to determine which of the available variants of a particular item is most appropriate to the requesting user agent.

Problems have emerged with this approach. First, the explosion of content types in use on the World Wide Web has meant that Accept headers would have to be extraordinarily long in order to state the user's preferences adequately. Increasing the length of the Accept headers in order to negotiate consumes bandwidth for every request, whether the resource has variants or not. Increasing their complexity in order to account for all of the possible axes of variation can also consume excessive amounts of server processing power.

Second, even if using a full list of Accept preferences did not present resource problems, it could create privacy problems. A full list of Accept preferences creates a user profile; in some situations it could be combined with other information to track a user's interaction with a server or set of servers even if the user has taken steps to eliminate that possibility.

Many content providers have attempted to circumvent the problems associated with negotiation based on Accept headers. The most popular method has been to examine the User-Agent header and to create content tailored to the typical capabilities of that user agent. Because many web user agents are both configurable and extensible, this method has serious drawbacks. A user agent capable of using graphics but configured to display only text may be presented with completely unusable content. Conversely, a user agent which has been extended to handle new media types may never be presented with them.
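The drawbacks of User-Agent examination described above can be illustrated with a minimal sketch. The product names, the capability table, and the function names below are hypothetical and chosen only for illustration; they do not describe any particular server or client.

```python
# Hypothetical table mapping a User-Agent product name to the media types
# the server ASSUMES that product can present. The names are illustrative.
ASSUMED_CAPABILITIES = {
    "GraphicalBrowser": ["image/gif", "text/html"],
    "LineBrowser": ["text/plain"],
}
DEFAULT_TYPES = ["text/plain"]  # assumption used for unknown products


def product_token(user_agent):
    """Return the first product name from a User-Agent string.

    For "GraphicalBrowser/2.0 (X11)" this returns "GraphicalBrowser".
    """
    first = user_agent.split()[0] if user_agent.strip() else ""
    return first.split("/")[0]


def variant_for(user_agent, available):
    """Pick the first available media type the product is assumed to handle.

    Returns None when no assumed type is available for this resource.
    """
    assumed = ASSUMED_CAPABILITIES.get(product_token(user_agent), DEFAULT_TYPES)
    for media_type in assumed:
        if media_type in available:
            return media_type
    return None
```

In this sketch, a request from "GraphicalBrowser/2.0 (X11)" for a resource available as text/plain and image/gif is answered with image/gif even if the user has configured the client to display only text, because that configuration is not visible in the User-Agent string. Likewise, a "LineBrowser" that has been extended to handle image/gif is never offered it.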
The usefulness of this method has been further degraded by some client authors, who have chosen to use the User-Agent strings associated with other clients in order to indicate compatibility; commonly, this has been done even when the two clients are not, in fact, fully compatible.

Content negotiation based on Accept headers or User-Agent strings also creates difficulties for caching proxies, as they have no way to determine when the form of content delivered to two different user agents would be the same. The delivery of inappropriate content can be prevented with the appropriate use of expiration times and cache control directives, but only at the cost of serious losses in cache efficiency.

4. Requirements for effective content negotiation

Content negotiation methods should improve the likelihood that the best available resource variant is served to each user. If no available resource variant is appropriate for a particular user, the methods should furnish a facility for the content provider to indicate that lack.

Content negotiation methods should also accomplish their task without substantially increasing the bandwidth or number of network round trips required to deliver the negotiated variant over that required to deliver similar invariant content. Some increase must be expected, but the smallest possible increase is preferred, and any overhead should be limited to those resources for which variants are available.

As a rule of thumb, any method that increases bandwidth requirements by more than 50% is unacceptable. For example, a method that delivered all available variants to a client and expected the client to discard those that were not used might be very successful in providing the best available resource and in limiting overhead to those resources for which variants existed; it would, nonetheless, be unacceptable for deployment on the global Internet because of the wastage associated with each delivery.
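As a rough illustration of the bandwidth rule of thumb above, the following sketch computes the overhead of the "deliver every variant" method. The variant sizes and all names are assumptions made for the example, not measurements.

```python
MAX_OVERHEAD = 0.5  # the 50% rule of thumb stated above


def bandwidth_overhead(variant_sizes, used_index):
    """Return the fractional increase in bytes sent when every variant is
    delivered, relative to sending only the variant the client uses."""
    needed = variant_sizes[used_index]
    sent = sum(variant_sizes)
    return (sent - needed) / needed


# Hypothetical sizes, in bytes, of three variants of one resource.
sizes = [40000, 42000, 38000]

overhead = bandwidth_overhead(sizes, 0)   # (120000 - 40000) / 40000 = 2.0
acceptable = overhead <= MAX_OVERHEAD     # False: 200% exceeds 50%
```

Under these assumed sizes the method triples the bytes transferred for the resource, which is an increase of 200% and well beyond the 50% threshold.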
Similarly, any method which adds more than two network round trips to the content delivery is unacceptable. While some negotiation exchanges might be appropriately conceived as a series of increasingly detailed "questions" and "answers" about the respective capabilities and preferences of the parties involved, the increase in latency such extended exchanges would entail makes them inappropriate for deployment on the global Internet.

In addition, content negotiation methods should interact well with caching proxies. They must prevent the delivery of inappropriate content to proxy users. If possible, they should allow caches to store sufficient information to participate in the negotiation process.

Lastly, content negotiation methods used in this context will be preferred if they are extensible to other MIME transport protocols that have the need to negotiate over the same space. Not all aspects of negotiation needed for HTTP fit within the MIME context, however, and content negotiation should not be limited to just those aspects which do.

5. Scenarios

5.1 Improvements to the current system

Content negotiation based on Accept headers or User-Agent strings is undertaken by the server based on information present in all requests from clients. The current problems with this method could be ameliorated in a number of ways. Registration of feature bundles could, for example, allow a client to indicate its capabilities without the increase in packet size that individual enumeration of capabilities requires. The use of registered feature bundles does, however, require a fair amount of maintenance for both servers and clients, especially given their dynamic extensibility.

Content providers could also improve the cachability of the variants served by issuing redirects to unique URLs specific to each variant; proxy users with equivalent feature bundles or user agents would then be able to retrieve the cached variant once they received a redirect.
Some content providers already use this method to good effect.

5.2 Client-based negotiation

Improvements in the current method tend, however, to increase demands on the processing power of the server, which raises serious problems of scalability. Shifting all or part of the processing involved in the negotiation to the client may improve scalability, and it shifts the locus of negotiation to the party most aware of both the current and potential possibilities for presentation to the user. Client-based content negotiation can take two basic forms: elective negotiation and "active content" negotiation.

5.2.1 Elective negotiation

Elective negotiation relies on the idea that the first response to a request for a resource which has variants should be a statement naming the choices and/or the axes along which they vary. The simplest form of elective negotiation occurs when a content provider creates a meta-resource with links to several variants and explanations of the features needed to present each variant. This basic form presumes a least common denominator for presentation of the links and an active user selecting among the presentations. More complex versions of elective negotiation might use HTTP protocol elements to present the axes of variation for a particular resource ("Transparent Content Negotiation in HTTP" [2] presents one proposal along these lines).

Elective negotiation limits the process of selecting among variants to the axes along which they vary. This eliminates the need for clients to enumerate their capabilities and user preferences in every request, which both saves bandwidth and increases privacy. This process may, however, require a user's intervention to select among the resources provided, which reduces perceived efficiency and increases perceived latency.
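The client side of elective negotiation can be sketched as follows. This is only an illustration under stated assumptions: the variant list, its attribute names, the URLs, and both function names are hypothetical, and the sketch does not reproduce the syntax or the selection algorithm of the proposal in [2].

```python
# Hypothetical variant list, standing in for the first response a server
# might give for a resource that has variants.
VARIANTS = [
    {"url": "/report.en.html", "type": "text/html", "language": "en"},
    {"url": "/report.fr.html", "type": "text/html", "language": "fr"},
    {"url": "/report.en.pdf", "type": "application/pdf", "language": "en"},
]


def axes_of_variation(variants):
    """Return, sorted by name, the attributes whose values differ."""
    keys = set().union(*(v.keys() for v in variants)) - {"url"}
    return sorted(k for k in keys if len({v.get(k) for v in variants}) > 1)


def select_variant(variants, preferences):
    """Return the URL of the best variant, or None if none is acceptable.

    `preferences` maps an axis name to a list of acceptable values, most
    preferred first. It never leaves the client. An axis missing from
    `preferences` is treated as "no preference". Ties go to the variant
    listed first.
    """
    axes = axes_of_variation(variants)
    best_url, best_score = None, None
    for variant in variants:
        score = []
        for axis in axes:
            if axis not in preferences:
                score.append(0)
                continue
            ranked = preferences[axis]
            if variant.get(axis) not in ranked:
                break  # unacceptable on this axis; skip the variant
            score.append(ranked.index(variant[axis]))
        else:
            # Lower rank positions are better; lists compare axis by axis.
            if best_score is None or score < best_score:
                best_url, best_score = variant["url"], score
    return best_url
```

With the list above, a client holding the preferences {"language": ["fr", "en"], "type": ["text/html"]} would select /report.fr.html without having sent those preferences to the server. A client that accepts only German would obtain None and could then report that no variant is appropriate.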
5.2.2 "Active content" negotiation

"Active content" negotiation takes place when the request for a resource which has variants returns content for execution in the client context rather than for presentation to the user. This active content determines the capabilities of the user agent and the preferences of the user and then retrieves the best available variant or provides an error status if no variants are appropriate. This method has several advantages: the negotiation takes place without the need for user intervention; the processing for selection of variants is moved completely off the server; and the active content itself should be cachable.

There are, however, several limitations to active content negotiation. The most severe problem is that there is not yet a cross-platform standard for active content. Without such a standard, servers may have to engage in Accept header or User-Agent negotiation prior to using active content in order to identify the correct form of active content to send in response to a particular request. Some user agents will also be unable to use active content, either because they have limited processing power or because of security considerations (see Section 7). For these user agents, a completely different method must be provided. As with server-side negotiation, each variant will require a unique URL to ensure cache correctness.

6. Summary

Content negotiation using HTTP currently presents problems in accuracy, scalability, and privacy. Amelioration of the problems in the current design is possible, but other content negotiation methods may provide significant improvements. The type of content negotiation most appropriate to a specific context will depend on the type of variable content being served and the capabilities of the parties to the negotiation.
Before deployment on the global Internet, however, any client-based content negotiation method MUST meet the following tests:

1) It MUST provide at least as good a content negotiation facility as the current Accept header and User-Agent methods, and it SHOULD increase the likelihood that the best available variant is served to each client.

2) It MUST NOT significantly increase the bandwidth required to deliver the content.

3) It MUST NOT significantly increase the number of network round trips required to deliver the content.

4) It MUST NOT allow inappropriate content to be delivered to the users of caching proxies. It SHOULD reduce overall bandwidth requirements by providing cachable content and allowing proxies to participate in the negotiation process.

5) It SHOULD recognize HTTP's relationship to MIME and provide methods which are extensible to other MIME transports when the space over which negotiation takes place is the same.

Experimentation with various content negotiation methods may be necessary to determine their usefulness for different types of resources and contexts. Interoperability will be best maintained, however, if a single method of each type is made standard as soon as experience permits.

7. Security Considerations

"Active content" negotiation presumes that clients will execute code provided by servers and that those clients will allow that code to examine user preferences and user agent capabilities in order to select among variants. Any such system would have to be tightly bounded to prevent the active content from executing arbitrary code on the client. The examination of user preferences and user agent capabilities also presents a possible privacy problem, especially as the selection of a particular variant provides an opportunity for the active content to communicate its discoveries.

8. Acknowledgements

Koen Holtman, Graham Klyne, Scott Lawrence, and Larry Masinter provided valuable feedback on the initial draft of this document.
Discussions within the HTTP working group and the IETF-FAX mailing list also provided insights into the scope of negotiation needed in the HTTP context.

9. References

[1] Tim Berners-Lee, Roy Fielding, and Henrik Frystyk. Hypertext Transfer Protocol -- HTTP/1.0. RFC 1945. May 1996.

[2] Koen Holtman and Andrew Mutz. Transparent Content Negotiation in HTTP. Internet-Draft draft-ietf-http-negotiation-02.txt, HTTP Working Group, May 26, 1997.

10. Author's Address

Edward Hardie
NASA Network Information Center
MS 204-14
Moffett Field, CA 94035-1000
USA
+1 415 604 0134
hardie@nasa.gov