1 |
Hypertext Transfer Protocol (HTTP) Tim Berners-Lee, CERN |
2 |
Internet Draft 5 Nov 1993 |
3 |
draft-ietf-iiir-http-00.txt Expires 5 May 1994 |
4 |
|
5 |
|
6 |
|
7 |
Hypertext Transfer Protocol (HTTP) |
8 |
|
9 |
A Stateless Search, Retrieve and Manipulation Protocol |
10 |
|
11 |
|
12 |
|
13 |
|
14 |
|
15 |
Status of this memo |
16 |
|
17 |
This document is an Internet Draft. Internet Drafts are working |
18 |
documents of the Internet Engineering Task Force (IETF), its Areas, |
19 |
and its Working Groups. Note that other groups may also distribute |
20 |
working documents as Internet Drafts. |
21 |
|
22 |
Internet Drafts are working documents valid for a maximum of six |
23 |
months. Internet Drafts may be updated, replaced, or obsoleted by |
24 |
other documents at any time. It is not appropriate to use Internet |
25 |
Drafts as reference material or to cite them other than as a |
26 |
"working draft" or "work in progress". |
27 |
|
28 |
This document is a DRAFT specification of a protocol in use on the |
29 |
internet and to be proposed as an Internet standard. Discussion |
30 |
of this protocol takes place on the www-talk@info.cern.ch mailing |
31 |
list -- to subscribe mail to www-talk-request@info.cern.ch. |
32 |
Distribution of this memo is unlimited. |
33 |
|
34 |
Abstract |
35 |
|
36 |
HTTP is a protocol with the lightness and speed necessary for a
distributed collaborative hypermedia information system. It is a
generic stateless object-oriented protocol, which may be used for
many similar tasks such as name servers, and distributed
object-oriented systems, by extending the commands, or "methods",
used. A feature of HTTP is the negotiation of data representation,
allowing systems to be built independently of the development of
new advanced representations.
44 |
|
45 |
Note: This specification |
46 |
|
47 |
This HTTP protocol is an upgrade on the original protocol as |
48 |
implemented in the earliest WWW releases. It is back-compatible |
49 |
with that more limited protocol. |
50 |
|
51 |
This specification includes the following parts: |
52 |
|
53 |
The Request |
54 |
|
55 |
|
56 |
|
57 |
T. Berners-Lee 1 |
58 |
|
59 |
Methods |
60 |
|
61 |
A list of headers in the request message |
62 |
|
63 |
The response |
64 |
|
65 |
Status codes |
66 |
|
67 |
A list of headers on any object transmitted |
68 |
|
69 |
The content of any object content transmitted |
70 |
|
71 |
Format negotiation algorithm |
72 |
|
73 |
The HTTP Registration Authority |
74 |
|
75 |
Security Considerations |
76 |
|
77 |
Unresolved points |
78 |
|
79 |
References |
80 |
|
81 |
The following notes form recommended practice not part of the |
82 |
specification: |
83 |
|
84 |
Servers tolerating clients |
85 |
|
86 |
Clients tolerating servers |
87 |
|
88 |
Purpose |
89 |
|
90 |
When many sources of networked information are available to a |
91 |
reader, and when a discipline of reference between different |
92 |
sources exists, it is possible to rapidly follow references |
93 |
between units of information which are provided at different remote |
94 |
locations. As response times should ideally be of the order of |
95 |
100ms in, for example, a hypertext jump, this requires a fast, |
96 |
stateless, information retrieval protocol. |
97 |
|
98 |
Practical information systems require more functionality than
simple retrieval, including search, front-end update and
annotation. This protocol allows an open-ended set of methods to be
used. It builds on the discipline of reference provided by the
Universal Resource Identifier (URI), which, as a name (URN, RFCxxxx)
or address (URL, RFCxxxx), allows the object of the method to be
specified.
105 |
|
106 |
Reference is made to the Multipurpose Internet Mail Extensions |
107 |
(MIME, RFC1341) which are used to allow objects to be transmitted |
108 |
in an open variety of representations. |
109 |
|
110 |
Overall operation |
111 |
|
112 |
|
113 |
|
114 |
|
115 |
116 |
|
117 |
On the internet, the communication takes place over a TCP/IP |
118 |
connection. This does not preclude this protocol being implemented |
119 |
over any other protocol on the internet or other networks. In |
120 |
these cases, the mapping of the HTTP request and response |
121 |
structures onto the transport data units of the protocol in |
122 |
question is outside the scope of this specification. It should not |
123 |
however be at all complicated. |
124 |
|
125 |
The protocol is basically stateless, a transaction consisting of |
126 |
|
127 |
Connection      The establishment of a connection by the
                client to the server - when using TCP/IP
                port 80 is the well-known port, but other
                non-reserved ports may be specified in the
                URL;

Request         The sending, by the client, of a request
                message to the server;

Response        The sending, by the server, of a response to
                the client;

Close           The closing of the connection by either or
                both parties.
141 |
|
142 |
The format of the request and response parts is defined in this |
143 |
specification. Whilst header information defined in this |
144 |
specification is sent in ISO Latin-1 character set in CRLF |
145 |
terminated lines, object transmission in binary is possible. |
146 |
|
147 |
Character sets |
148 |
|
149 |
In all cases in HTTP where RFC822 characters are allowed, these may |
150 |
be extended to use the full ISO Latin 1 character set. 8-bit |
151 |
transmission is always used. |
152 |
|
153 |
154 |
|
155 |
REQUEST |
156 |
|
157 |
The request is sent with a first line containing the method to be |
158 |
applied to the object requested, the identifier of the object, and |
159 |
the protocol version in use, followed by further information |
160 |
encoded in the RFC822 header style. The format of the request is: |
161 |
|
162 |
|
163 |
    Request         = SimpleRequest | FullRequest

    SimpleRequest   = GET <uri> CrLf

    FullRequest     = Method URI ProtocolVersion CrLf
                      [*<HTRQ Header>]
                      [<CrLf> <data>]

    <Method>        = <InitialAlpha>

    ProtocolVersion = HTTP/1.0

    uri             = <as defined in URL spec>

    <HTRQ Header>   = <Fieldname> : <Value> <CrLf>

    <data>          = MIME-conforming-message
184 |
|
185 |
|
186 |
The URI is the Uniform Resource Locator (URL) as defined in the
URL specification, or may be a Uniform Resource Name (URN), once a
specification for URNs is settled, for servers which support URN
resolution.
190 |
|
191 |
Unless the server is being used as a gateway, a partial URL shall
be given, with the assumptions of the protocol (HTTP:) and server
(the server) being obvious.
194 |
|
195 |
The URI should be encoded using the escaping scheme described in
the URL specification to a level such that (at least) spaces and
control characters (decimal 0-31 and 128-159) do not appear
unescaped.
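
As an illustration only, the minimum level of escaping described
above might be sketched in Python as follows. The function name
escape_uri is hypothetical, and the sketch assumes that each
character of the URI is a single ISO Latin-1 code point. It does not
attempt the remaining rules of the URL specification.

```python
def escape_uri(uri: str) -> str:
    """Escape spaces and control characters (0-31, 128-159) as %XX.

    Assumption: every character is an ISO Latin-1 code point, so one
    character maps to exactly one two-digit hexadecimal escape.
    """
    out = []
    for ch in uri:
        code = ord(ch)
        if ch == " " or code <= 31 or 128 <= code <= 159:
            out.append(f"%{code:02X}")
        else:
            out.append(ch)
    return "".join(out)
```

For example, escape_uri("/my file") gives "/my%20file", while a URI
that contains none of these characters is returned unchanged.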
199 |
|
200 |
Note. The rest of an HTTP url after the host name and optional port |
201 |
number is completely opaque to the client: The client may make no |
202 |
deductions about the object from its URL. |
203 |
|
204 |
Protocol Version |
205 |
|
206 |
The Protocol/Version field defines the format of the rest of the
request. At the moment only HTRQ is defined.
208 |
|
209 |
If the protocol version is not specified, the server assumes that |
210 |
the browser uses HTTP version 0.9. |
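
A server might distinguish the two request forms along the following
lines. This is a sketch under stated assumptions, not a normative
parser: the function name parse_request_line and the returned
dictionary are hypothetical, fields are assumed to be separated by
spaces, and an optional leading "V" in the version is tolerated as a
leniency.

```python
def parse_request_line(line: str) -> dict:
    """Split a request line into method, URI and protocol version."""
    parts = line.rstrip("\r\n").split()
    if len(parts) == 2 and parts[0] == "GET":
        # SimpleRequest: no version given, so HTTP 0.9 is assumed.
        return {"method": "GET", "uri": parts[1], "version": "0.9"}
    if len(parts) == 3 and parts[2].startswith("HTTP/"):
        # FullRequest: method names are case sensitive, so the method
        # is kept exactly as sent.
        version = parts[2][len("HTTP/"):].lstrip("V")  # leniency (assumption)
        return {"method": parts[0], "uri": parts[1], "version": version}
    raise ValueError(f"malformed request line: {line!r}")
```

Because the URI contains no blanks, splitting on white space is
sufficient to separate the three fields.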
211 |
|
212 |
Uniform Resource Identifier |
213 |
|
214 |
This is a string identifying the object. It contains no blanks. It |
215 |
may be a Uniform Resource Locator [ URL ] defining the address of |
216 |
an object as described in RFCxxxx, or it may be a representation of |
217 |
the name of an object (URN, Universal Resource Name) where that |
218 |
object has been registered in some name space. At the time of |
219 |
writing, no suitable naming system exists, but this protocol will |
220 |
accept such names so long as they are distinguishable from the |
221 |
existing URL name spaces. |
222 |
|
223 |
Methods |
224 |
|
225 |
The Method field indicates the method to be performed on the object
identified by the URL. More details are given with the list of
method names below.
228 |
|
229 |
|
230 |
|
231 |
232 |
|
233 |
Request Headers |
234 |
|
235 |
These are RFC822 format headers with special field names given in
the list below, as well as any other HTTP object headers or MIME
headers.
238 |
|
239 |
Object Body |
240 |
|
241 |
The content of an object is sent (depending on the method) with the |
242 |
request and/or the reply. |
243 |
|
244 |
Methods |
245 |
|
246 |
The Method field in HTTP indicates the method to be performed on
the object identified by the URL. The method GET below is always
supported. The list of other methods acceptable by the object is
returned in response to either of these two requests.
250 |
|
251 |
This list may be extended from time to time by a process of |
252 |
registration with the design authority. Method names are case |
253 |
sensitive. Currently specified methods are as follows: |
254 |
|
255 |
GET             means retrieve whatever data is identified
                by the URI, so where the URI refers to a
                data-producing process, or a script which can
                be run by such a process, it is this data
                which will be returned, and not the source
                text of the script or process. Also used for
                searches.

HEAD            is the same as GET but returns only HTTP
                headers and no document body.

CHECKOUT        Similar to GET but locks the object against
                update by other people. The lock may be
                broken by a higher authority or on timeout:
                in this case a future CHECKIN will fail.
                (Phase out?)

SHOWMETHOD      Returns a description (perhaps a form) for a
                given method when applied to the given
                object. The method name is specified in a
                For-Method: field. (TBS)

PUT             specifies that the data in the body section
                is to be stored under the supplied URL. The
                URL must already exist. The new contents of
                the document are the data part of the
                request. POST and REPLY should be used for
                creating new documents.

DELETE          Requests that the server delete the
                information corresponding to the given URL.
                After a successful DELETE method, the URL
                becomes invalid for any future methods.

POST            Creates a new object linked to the specified
                object. The message-id field of the new
                object may be set by the client or else will
                be given by the server. A URL will be
                allocated by the server and returned to the
                client. The new document is the data part of
                the request. It is considered to be
                subordinate to the specified object, in the
                way that a file is subordinate to a directory
                containing it, or a news article is
                subordinate to a newsgroup to which it is
                posted.

LINK            Links an existing object to the specified
                object.

UNLINK          Removes link (or other meta-) information
                from an object.

CHECKIN         Similar to PUT, but releases the lock set on
                the object. Fails if no lock has been set by
                CHECKOUT. Suggestion: phase out this
                (rcs-like) model in favor of the PUT
                (cvs-like, non-locking) model of code
                management.

TEXTSEARCH      The object may be queried with a text
                string. The search form of the GET method is
                used to query the object.

SPACEJUMP       The object will accept a query whose terms
                are the coordinates of a point within the
                object. The method is implemented using GET
                with a derived URL.

SEARCH          Proposed only. The index (etc) identified
                by the URL is to be searched for something
                matching in some sense the enclosed message.
                How does the client know what message formats
                are acceptable to the server? (Suggestion of
                Fred Williams)
335 |
|
336 |
(Some of these methods require more detailed specification) |
337 |
|
338 |
Note: case sensitivity of method names |
339 |
|
340 |
Although lack of case sensitivity in method names would be a
tolerant approach with a limited method set, we require the
extensibility of HTTP to cover an arbitrary underlying object
system. In such a system, method names may be case sensitive, and
so we must preserve case in HTTP.
350 |
|
351 |
GET |
352 |
|
353 |
A representation of the object is transferred to the client. |
354 |
|
355 |
|
356 |
|
357 |
Some URIs refer to specific variants of an object, and some refer |
358 |
to objects with many variants. In the latter case, the |
359 |
representations, encodings, and languages acceptable may be |
360 |
specified in the header request fields, and may affect the |
361 |
particular value which is returned. |
362 |
|
363 |
Other possible replies allow a set of URIs to be returned to the |
364 |
client, who may use them to retrieve the object. This allows name |
365 |
servers to be implemented using HTTP, and also forwarding address |
366 |
to be given when objects have been moved. |
367 |
|
368 |
Annotation replies |
369 |
|
370 |
Some servers keep "parallel webs" -- separate stores of information
about other servers' objects. Typically this includes annotation
links. This information may be retrieved using the GET method,
but a special reply is returned.
374 |
|
375 |
Searching using GET |
376 |
|
377 |
When the TEXTSEARCH or SPACEJUMP methods are supported, their |
378 |
implementation is in fact by means of the GET method, by |
379 |
constructing a derived URL. |
380 |
|
381 |
The URL used with GET is the URL of the object, suffixed by a "?" |
382 |
character and the text to be searched. If the object being searched |
383 |
is in fact itself the result of a search (ie the URL contains a |
384 |
"?"), then those search terms are first stripped off, so the search |
385 |
is performed on the original object. |
386 |
|
387 |
In the URL, keywords are separated with plus signs ("+"). Real
spaces, plus signs, and other illegal characters [who defines
illegal characters?] are represented as hexadecimal ASCII escape
sequences (%##, e.g., space = %20, plus = %2B).
391 |
|
392 |
Browsers accepting text input should define keywords as separated
by spaces (and therefore map them to plus signs). More advanced
browsers may allow separate keyword input and, therefore, a finer
level of control over the content without users needing to
understand the underlying mechanisms (e.g., that they must use %20
to get a real space).
398 |
|
399 |
Examples |
400 |
|
401 |
The following is a request on an object supporting TEXTSEARCH: |
402 |
|
403 |
|
404 |
|
405 |
406 |
|
407 |
GET /indexes/botany?annual+plants HTTP/1.0 |
408 |
|
409 |
This request on an object supporting SPACEJUMP selects point |
410 |
(0.2,3.8) within a map: |
411 |
|
412 |
GET /maps/uk/dorset/winspit/0.2+3.8 HTTP/1.0 |
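
The construction of a derived URL for a text search, as described
above, might be sketched in Python as follows. The function names
are hypothetical. Escaping "%" itself as %25 is an assumption of
this sketch, made so that escape sequences stay unambiguous; other
illegal characters are not handled here.

```python
def escape_keyword(word: str) -> str:
    """Escape real spaces and plus signs inside a single keyword."""
    word = word.replace("%", "%25")  # assumption: keep escapes unambiguous
    word = word.replace(" ", "%20")
    return word.replace("+", "%2B")


def keywords_from_input(text: str) -> list:
    """Simple browsers treat white space in user input as a separator."""
    return text.split()


def search_url(object_url: str, keywords: list) -> str:
    """Build the derived URL used with GET for a TEXTSEARCH query."""
    # If the object is itself the result of a search, strip the old
    # search terms so the search applies to the original object.
    base = object_url.split("?", 1)[0]
    return base + "?" + "+".join(escape_keyword(k) for k in keywords)
```

With these definitions, search_url("/indexes/botany",
keywords_from_input("annual plants")) yields the URL used in the
TEXTSEARCH example above.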
413 |
|
414 |
|
415 |
LINK |
416 |
|
417 |
The link method of HTTP adds meta information (Object header |
418 |
information) to an object, without touching the object's content. |
419 |
For example, it requests the creation of a link from the specified |
420 |
object to another object. |
421 |
|
422 |
The request is followed by a set of object headers which are to be |
423 |
added. |
424 |
|
425 |
In cases in which a new header is added and the semantics of the
header do not allow it to coexist with a similar header, then the
previous header is deleted. For example, an object may only have
one title, so specifying a title overwrites any preexisting title.
429 |
|
430 |
Link Types |
431 |
|
432 |
The link type, unless void, is specified in the WWW-Link object
field for each link. If not specified, void is assumed.
434 |
|
435 |
Unresolved points |
436 |
|
437 |
As this is generalised to allow any metainformation to be added, a |
438 |
better name might be DESCRIBE or attribute (as a verb). I don't |
439 |
like "describe" as it does not suggest alteration. |
440 |
"Bestow"/"Rescind" might be better. For example, one could bestow a |
441 |
title or author on something previously title-less. We are looking |
442 |
at a general data model behind here a little like a relational |
443 |
database. This function adds records. "Set" and "Reset"? |
444 |
|
445 |
Related Methods |
446 |
|
447 |
LINK is like POST except that |
448 |
|
449 |
no storage is requested for the destination object, and |
450 |
|
451 |
the destination is not assumed to be subordinate to the |
452 |
specified object. |
453 |
|
454 |
See also: UNLINK. |
455 |
|
456 |
457 |
|
458 |
POST |
459 |
|
460 |
|
461 |
|
462 |
|
463 |
464 |
|
465 |
This method of HTTP creates a new object linked to and subordinate |
466 |
to the specified object. The content of the new object is |
467 |
enclosed as the body of the request. |
468 |
|
469 |
The POST method is designed to allow a uniform function to cover

Annotation of existing documents;
472 |
|
473 |
Posting a message to a bulletin board topic, newsgroup, mailing |
474 |
list, or similar group of articles; |
475 |
|
476 |
Adding a file to a directory; |
477 |
|
478 |
Extending a document during authorship. |
479 |
|
480 |
The client may not assume any postconditions of the method in terms |
481 |
of web topology. For example, if a POST is accepted, the effect may |
482 |
be delayed or overruled by human moderation, batch processing, etc. |
483 |
The client should not be surprised if a link is not immediately |
484 |
(or never) created. |
485 |
|
486 |
However, the semantics are a request for a link to be made from the |
487 |
object whose URL is quoted to the new, enclosed, object. |
488 |
|
489 |
If no URL for the NEW object is given by the client, the server is
requested to see to the storage of the new object. That is, the
server does not have to store it but will have to return a URL by
which it can later be retrieved. The semantics of this method
(currently) imply nothing of any undertakings by the server to
maintain the availability of the object.
495 |
|
496 |
If the client gives a URL, then the server is not obliged to store |
497 |
the object, but may take a copy and may in that case issue a new |
498 |
URL. |
499 |
|
500 |
Return object headers |
501 |
|
502 |
The method shall return a set (possibly empty) of object headers |
503 |
for the newly posted object. If a URL has been assigned by the |
504 |
server, then that may be included. Similarly, if a URN has been |
505 |
assigned, then that shall be returned. Other things which may be |
506 |
returned include for example the expiry-date if any. The server may |
507 |
return the entire metadata for the object (as in the HEAD command), |
508 |
or a subset of it. |
509 |
|
510 |
The object body shall not be returned, so the transaction shall end |
511 |
with the blank line terminating the headers. |
512 |
|
513 |
Link type |
514 |
|
515 |
The link type may be specified by explicitly giving a (reverse) |
516 |
link in the object header of the linked object. If a link or links |
517 |
between the two objects are present in those headers, then that |
518 |
|
519 |
|
520 |
|
521 |
522 |
|
523 |
link is used. If no such link is specified, then the server should |
524 |
generate a link. The link type in this case is determined by the |
525 |
server. The server may perform other operations as a result of the |
526 |
new object being added: lists and indexes might be updated, for |
527 |
example. |
528 |
|
529 |
Submission |
530 |
|
531 |
When articles are submitted, the analogy of being added to a body
of knowledge by being linked is close. When a form is submitted,
this can be done with POST, though in this case side-effects will
be expected.
535 |
|
536 |
(Should submission for action have a different method -- see
showmethod -- or should it be just POST for simplicity? When
interfacing with other systems such as news and mail, the
distinction is not made as the system does not have the ability to
distinguish different methods. We now have a possibility of making
a separate action, though.)
542 |
|
543 |
Unresolved points |
544 |
|
545 |
The client has no way of knowing what data formats the server is |
546 |
prepared to accept. This may not be a serious problem, but it may |
547 |
be if the server has some restrictions. This applies to all |
548 |
submissions of object content from client to server, for all |
549 |
methods. |
550 |
|
551 |
Related methods. |
552 |
|
553 |
When a new URL has been returned by the server, it may in general
(typically, but not necessarily) be usable as the argument of
DELETE, GET, PUT, etc. methods.
556 |
|
557 |
To make a link between two existing objects, see LINK. |
558 |
|
559 |
560 |
|
561 |
SEARCH |
562 |
|
563 |
Some correspondence about SEARCH:
564 |
|
565 |
Q |
566 |
|
567 |
How did the client know that the server could search for TeX? Maybe
this server can only search for text and GIF patterns, even though
it gave out TeX. How can the client know that a GIF search is
possible?
570 |
|
571 |
Answer |
572 |
|
573 |
I assume the following has happened: |
574 |
|
575 |
client requests with ACCEPT: text/x-TeX |
576 |
|
577 |
|
578 |
|
579 |
580 |
|
581 |
If the server, which it should if possible, replies with the doc |
582 |
requested in ``TeX''. |
583 |
|
584 |
Therefore a server can obviously do an ``internal representation'' to |
585 |
``TeX'' conversion. |
586 |
|
587 |
Since the server responded in a particular representation (ie TeX),
the client should be able to search in ``TeX'' provided the
``SEARCH'' method is in the required ALLOWED: section of the
response.

The server should only respond with ALLOWED: SEARCH ... if it can do a
TeX to ``internal representation'' conversion for searching.
593 |
|
594 |
Fred Williams fwilliam@ccs.carleton.ca |
595 |
|
596 |
SHOWMETHOD |
597 |
|
598 |
When an object can support more operations than are defined in this |
599 |
specification, SHOWMETHOD allows a client to understand the |
600 |
interface to that operation sufficiently to allow the user to |
601 |
perform it interactively. |
602 |
|
603 |
Required parameter field |
604 |
|
605 |
For-Method:     This field contains only the method name
                about which the client is inquiring.
607 |
|
608 |
Preconditions |
609 |
|
610 |
The method name specified in the For-Method field must have been
previously issued in an "Allowed:" field returned with the given
object.
613 |
|
614 |
The client should specify an Accept: field which includes at least
one form language if it wants to be able to interpret the result.

Postcondition
618 |
|
619 |
SHOWMETHOD returns, if possible, a form in a representation |
620 |
acceptable to the client. This form will contain instructions for |
621 |
ordering the operation, and fields for the parameters. |
622 |
|
623 |
A suitable language for the form might be HTTP2, but any language |
624 |
may be used which is acceptable to the client. |
625 |
|
626 |
SPACEJUMP |
627 |
|
628 |
This method is similar to the TEXTSEARCH method, but instead of the |
629 |
search criterion being a text string, it is a set of coordinates |
630 |
defining a point within the image. The semantics of the operation |
631 |
are not defined here. Typically, the user clicks on a point within |
632 |
the image with a mouse or other pointing device. |
633 |
|
634 |
|
635 |
|
636 |
|
637 |
638 |
|
639 |
Two or more coordinates are supplied, in the order x, y, z, t. All
coordinates are scaled so that 0 represents the bottom left hand
point and 1.0 represents the top right hand point.

The z axis direction follows the normal right-hand rule, that is,
it extends toward the viewer when the x and y axes are flat as in
the normal two-dimensional representation.

In the case of a time-occupying object, 0 represents the starting
instant, and 1.0 represents the finishing instant.
649 |
|
650 |
The method is implemented using GET with a derived URL.
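
The scaling of a mouse click to the coordinates described above
might be sketched as follows. The function name and its parameters
are hypothetical. The sketch assumes that the display reports pixel
positions with the origin at the top left hand corner, which is
common on window systems but is not stated in this specification, so
the y coordinate is inverted.

```python
def click_to_spacejump(px: float, py: float, width: float, height: float):
    """Scale a pixel position to SPACEJUMP (x, y) coordinates.

    Assumption: (px, py) is measured from the top left hand corner of
    an image of the given width and height, both greater than zero.
    """
    x = px / width
    y = 1.0 - py / height  # 0 is the bottom edge, 1.0 the top edge
    return x, y
```

A click at the bottom left hand corner of the image maps to (0.0,
0.0) and a click at the top right hand corner maps to (1.0, 1.0).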
651 |
|
652 |
TEXTSEARCH |
653 |
|
654 |
This is a simple form of search. The text is assumed to derive from |
655 |
the requesting user, and is in no special format. |
656 |
|
657 |
The exact algorithm to be applied is not defined in this |
658 |
specification, but techniques such as vocabulary proximity matching |
659 |
between the request data portion and the contents or titles of |
660 |
documents, keyword matching, stemming, and the use of a thesaurus |
661 |
are quite appropriate. |
662 |
|
663 |
Whilst this method name is given as a flag to specify that the |
664 |
function is available, the search form of the GET method is in fact |
665 |
used to query the object. |
666 |
|
667 |
UNLINK |
668 |
|
669 |
This method deletes metainformation about an object. The request
contains object headers which are to be removed. Only headers
exactly matching the headers given are removed.
672 |
|
673 |
Obviously the operation may be used for unlinking objects. It may |
674 |
also be used for removing other metainformation such as object |
675 |
title, expiry date, etc. |
676 |
|
677 |
Related methods |
678 |
|
679 |
See LINK. |
680 |
|
681 |
682 |
|
683 |
HTTP Request fields |
684 |
|
685 |
These header lines are sent by the client in a HTTP protocol |
686 |
transaction. All lines are RFC822 format headers. The list of |
687 |
headers is terminated by an empty line. |
688 |
|
689 |
FROM: |
690 |
|
691 |
In Internet mail format, this gives the name of the requesting
user. This field may be used for logging purposes and an insecure
form of access protection. The interpretation of this field is
that the request is being performed on behalf of the person given,
who accepts responsibility for the method performed.
701 |
|
702 |
The Internet mail address in this field does not have to correspond |
703 |
to the internet host which issued the request. (For example, when a |
704 |
request is passed through a gateway, then the original issuer's |
705 |
address should be used). |
706 |
|
707 |
The mail address should, if possible, be a valid mail address, |
708 |
whether or not it is in fact an internet mail address or the |
709 |
internet mail representation of an address on some other mail |
710 |
system. |
711 |
|
712 |
|
713 |
|
714 |
ACCEPT: |
715 |
|
716 |
This field contains a comma-separated list of representation |
717 |
schemes (Content-Type metainformation values) which will be |
718 |
accepted in the response to this request. |
719 |
|
720 |
The set given may of course vary from request to request from the |
721 |
same user. |
722 |
|
723 |
This field may be wrapped onto several lines according to RFC822,
and also more than one occurrence of the field is allowed with the
significance being the same as if all the entries had been in one
field. The format of each entry in the list is (/ meaning "or")

    <field>  = Accept: <entry> *[ ; <entry> ]
    <entry>  = <content type> *[ , <param> ]
    <param>  = <attr> = <float>
    <attr>   = q / mxs / mxb
    <float>  = <ANSI-C floating point text representation>
733 |
|
734 |
See the appendix on the negotiation algorithm as a function and |
735 |
penalty model. |
736 |
|
737 |
If no Accept: field is present, then it is assumed that text/plain |
738 |
and text/html are accepted. |
739 |
|
740 |
Example |
741 |
|
742 |
    Accept: text/plain; text/html
    Accept: text/x-dvi, q=.8, mxb=100000, mxt=5.0; text/x-c
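
As a hedged illustration, the Accept: field values of a request
might be split into entries and parameters as follows. The function
name parse_accept and the list-of-tuples result are hypothetical.
The sketch takes the field values with the "Accept:" name already
removed and any wrapped lines already joined, accepts any attribute
name, and applies the default stated above when no field is present.

```python
# Default when no Accept: field is present.
DEFAULT_ACCEPT = [("text/plain", {}), ("text/html", {})]


def parse_accept(values: list) -> list:
    """Parse every Accept: field value of one request.

    Several occurrences of the field are treated as if all of their
    entries had been given in a single field.
    """
    entries = []
    for value in values:
        for raw_entry in value.split(";"):       # entries are ";" separated
            fields = [f.strip() for f in raw_entry.split(",")]
            if not fields[0]:
                continue
            params = {}
            for field in fields[1:]:             # parameters are "," separated
                if not field:
                    continue
                attr, _, number = field.partition("=")
                params[attr.strip()] = float(number)
            entries.append((fields[0], params))
    return entries or list(DEFAULT_ACCEPT)
```

Each entry is returned as a pair of the content type and a
dictionary of its parameters, in the order in which the entries were
sent.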
745 |
|
746 |
Wildcards |
747 |
|
748 |
In order to save time, and also allow clients to receive content
types of which they may not be aware, an asterisk "*" may be used
in place of either the second half of the content-type value, or
both halves. This only applies to the Accept: field, and not to
the content-type field of course.
758 |
|
759 |
Example |
760 |
|
761 |
    Accept: */*, q=0.1
    Accept: audio/*, q=0.2
    Accept: audio/basic, q=1
764 |
|
765 |
may be interpreted as "if you have basic audio, send it; otherwise |
766 |
send me some other audio, or failing that, just give me what |
767 |
you've got." |
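
One possible reading of this interpretation is sketched below. It is
not the negotiation algorithm of this specification, which is
described in the appendix: it assumes that the most specific
matching entry decides the q value of a content type, that the
wildcard for both halves is written "*/*", and that a type matching
no entry is not acceptable. The function names are hypothetical.

```python
def quality_for(content_type: str, accepted: list) -> float:
    """Return the q value of the most specific entry matching a type.

    accepted is a list of (pattern, q) pairs taken from Accept: fields.
    """
    major = content_type.partition("/")[0]
    best_rank, best_q = -1, 0.0
    for pattern, q in accepted:
        if pattern == content_type:
            rank = 2                  # exact match
        elif pattern == major + "/*":
            rank = 1                  # second half is a wildcard
        elif pattern == "*/*":
            rank = 0                  # both halves are wildcards
        else:
            continue
        if rank > best_rank:
            best_rank, best_q = rank, q
    return best_q


def choose_type(available: list, accepted: list):
    """Pick the available content type with the highest q, or None."""
    best = max(available, key=lambda t: quality_for(t, accepted), default=None)
    if best is None or quality_for(best, accepted) <= 0.0:
        return None
    return best
```

With the three example entries above, a server holding both
audio/basic and some other audio type would send audio/basic, and a
server holding only text/plain would send that.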
768 |
|
769 |
ACCEPT-ENCODING: |
770 |
|
771 |
Similar to Accept, but lists the Content-Encoding types which are
acceptable in the response.

    <field>  = Accept-Encoding: <entry> *[ ; <entry> ]
    <entry>  = <content transfer encoding> *[ , <param> ]
776 |
|
777 |
Example |
778 |
|
779 |
Accept-Encoding: x-compress; x-zip |
780 |
|
781 |
|
782 |
ACCEPT-LANGUAGE: |
783 |
|
784 |
Similar to Accept, but lists the Language values which are
preferable in the response. A response in an unspecified language
is not illegal. See also: Language.
787 |
|
788 |
Language coding TBS. (ISO standard xxxx) |
789 |
|
790 |
USER-AGENT: |
791 |
|
792 |
This line if present gives the software program used by the |
793 |
original client. This is for statistical purposes and the tracing |
794 |
of protocol violations. It should be included. The first white |
795 |
space delimited word must be the software product name, with an |
796 |
optional slash and version designator. Other products which form |
797 |
part of the user agent may be put as separate words. |
798 |
|
799 |
<field> = User-Agent: <product>+ |
800 |
<product> = <word> [/<version>] |
801 |
<version> = <word> |
802 |
|
803 |
Example: |
804 |
|
805 |
    User-Agent: LII-Cello/1.0 libwww/2.5
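
A server collecting statistics might split the field value into
products as sketched below. The function name parse_user_agent is
hypothetical, and the value is assumed to be the text following the
field name.

```python
def parse_user_agent(value: str) -> list:
    """Split a User-Agent value into (product, version) pairs.

    Products are separated by white space; the version, if any,
    follows a slash. The first pair is the software product itself.
    """
    products = []
    for word in value.split():
        name, _, version = word.partition("/")
        products.append((name, version or None))
    return products
```

For the example above this gives the product LII-Cello at version
1.0 followed by libwww at version 2.5.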
806 |
|
807 |
REFERER: |
808 |
|
809 |
|
810 |
|
811 |
812 |
|
813 |
This optional header field allows the client to specify, for the |
814 |
server's benefit, the address ( URI ) of the document (or element |
815 |
within the document) from which the URI in the request was |
816 |
obtained. |
817 |
|
818 |
This allows a server to generate lists of back-links to documents, |
819 |
for interest, logging, etc. It allows bad links to be traced for |
820 |
maintenance. |
821 |
|
822 |
If a partial URI is given, then it should be parsed relative to the |
823 |
URI of the object of the request. |
824 |
|
825 |
Example: |
826 |
|
827 |
    Referer: http://info.cern.ch/hypertext/DataSources/Overview.html
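
Parsing a partial URI relative to the URI of the object of the
request might be sketched with the Python standard library as
follows. The function name resolve_referer is hypothetical, and the
sketch assumes that the server has already expanded the request URI
to a full URL, since the request itself normally carries only a
partial one.

```python
from urllib.parse import urljoin


def resolve_referer(request_url: str, referer: str) -> str:
    """Resolve a Referer value against the full URL of the request.

    A full URI in the Referer field is returned unchanged; a partial
    one is interpreted relative to request_url.
    """
    return urljoin(request_url, referer)
```

For instance, a request for the hypothetical object
http://info.cern.ch/hypertext/WWW/TheProject.html carrying the
partial value ../DataSources/Overview.html resolves to the full URL
shown in the example above.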
829 |
|
830 |
AUTHORIZATION: |
831 |
|
832 |
If this line is present it contains authorization information. The |
833 |
format is To Be Specified (TBS). The format of this field is in |
834 |
extensible form. The first word is a specification of the |
835 |
authorisation system in use. |
836 |
|
837 |
Proposals have been as follows: (see the specification for current |
838 |
one implemented by AL Sep 1993) |
839 |
|
840 |
User/Password scheme |
841 |
|
842 |
Authorization: user fred:mypassword |
843 |
|
844 |
The scheme name is "user". The second word is a user name |
845 |
(typically derived from a USER environment variable or prompted |
846 |
for), with an optional password separated by a colon (as in the URL |
847 |
syntax for FTP). Without a password, this provides very low-level
security. With the password, it provides the low-level security
used by unmodified FTP, Telnet, etc.
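
The following sketch shows one way a server might split the value
of an Authorization field into its scheme word and the remainder,
and decode the "user" proposal. Since the format is still to be
specified, everything here is an assumption: the function name, the
dictionary layout, and the lower-casing of the scheme word.

```python
def parse_authorization(value):
    """Split an Authorization field value into scheme and parameters.

    The first word names the authorization scheme. For the "user"
    scheme the second word is a user name with an optional password
    after a colon. Other schemes are returned undecoded.
    """
    scheme, _, rest = value.strip().partition(" ")
    scheme = scheme.lower()  # assumption: scheme names are case-insensitive
    rest = rest.strip()
    if scheme == "user":
        user, sep, password = rest.partition(":")
        return {"scheme": scheme, "user": user,
                "password": password if sep else None}
    return {"scheme": scheme, "params": rest}


if __name__ == "__main__":
    print(parse_authorization("user fred:mypassword"))
```

Returning unknown schemes undecoded keeps the field extensible, as
required above.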
850 |
|
851 |
Kerberos |
852 |
|
853 |
Authorization: kerberos kerberosauthenticationsparameters
855 |
|
856 |
The format of the kerberosauthenticationsparameters is to be |
857 |
specified. |
858 |
|
859 |
CHARGETO: |
860 |
|
861 |
This line if present contains account information for the costs of
the application of the method requested. The format is TBS. The
format of this field must be in extensible form. The first word
starts with a specification of the namespace in which the account
is. (This is similar to extensible URL definition.) No namespaces
are currently defined. Namespaces will be registered with the
registration authority.
873 |
|
874 |
The format of the rest of the line is a function of the charging |
875 |
system, but it is recommended that this include a maximum cost |
876 |
whose payment is authorized by the client for this transaction, and |
877 |
a cost unit. |
878 |
|
879 |
Note: Server tolerance of bad clients |
880 |
|
881 |
Whilst it is seen appropriate for testing parsers to check full |
882 |
conformance to this specification, it is recommended that |
883 |
operational parsers be tolerant of deviations. |
884 |
|
885 |
In particular, lines should be regarded as terminated by the Line |
886 |
Feed, and the preceding Carriage Return character ignored.
887 |
|
888 |
Any HTTP Header Field Name which is not recognised should be |
889 |
ignored in operational parsers. |
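
The two tolerance rules above (lines end at Line Feed with any
preceding Carriage Return ignored, and unrecognised field names are
skipped) might be sketched as follows. The set KNOWN_FIELDS is only
an illustrative subset chosen for this example, and the function
name is hypothetical.

```python
# Illustrative subset of request field names; a real server would list
# every field it understands.
KNOWN_FIELDS = {"accept", "user-agent", "referer", "authorization", "chargeto"}


def parse_header_lines(raw):
    """Tolerantly parse a header block into (name, value) pairs.

    Lines are split on Line Feed, a trailing Carriage Return is
    dropped, and fields with unrecognised names are ignored.
    Parsing stops at the first empty line.
    """
    fields = []
    for line in raw.split("\n"):
        line = line.rstrip("\r")
        if not line:
            break  # an empty line ends the header block
        name, sep, value = line.partition(":")
        if not sep:
            continue  # tolerate a malformed line by skipping it
        key = name.strip().lower()
        if key in KNOWN_FIELDS:
            fields.append((key, value.strip()))
    return fields


if __name__ == "__main__":
    print(parse_header_lines("User-Agent: X/1.0\r\nX-Odd: 1\nReferer: http://a/b\n\n"))
```

Skipping a malformed line rather than rejecting the request is a
design choice of this sketch, in the spirit of the recommendation.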
890 |
|
891 |
It is recommended that servers use URIs free of "variant" |
892 |
characters whose representation differs in some of the national |
893 |
variant character sets, punctuation characters, and spaces. This |
894 |
will make URIs easier to handle by humans when the need (such as |
895 |
debugging, or transmission through non hypertext systems) arises. |
896 |
|
897 |
RESPONSE |
898 |
|
899 |
The response from the server shall start with the following syntax |
900 |
(See also: note on client tolerance):
901 |
|
902 |
<status line> ::= <http version> <status code> <reason line> |
903 |
<CrLf> |
904 |
<http version> ::= 3*<digit> |
905 |
<status code> ::= 3*<digit> |
906 |
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
907 |
<reason line> ::= * <printable> |
908 |
|
909 |
|
<http version>  identifies the HyperText Transfer Protocol
                version being used by the server. For the
                version described by this document it
                is "HTTP/1.0" (without the quotes).

<status code>   gives the coded results of the attempt to
                understand and satisfy the request. A three
                digit ASCII decimal number.

<reason line>   gives an explanation for a human reader,
                except where noted for particular status
                codes.
922 |
|
923 |
Fields on the status line are delimited by a single blank (parsers
should accept any amount of white space). The possible values of
the status code are listed below.
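
A tolerant status line parser along these lines could look like the
sketch below. It treats the version field as an opaque token, checks
that the status code is three ASCII digits, and accepts any amount
of white space between fields. The function name and the choice of
ValueError for malformed lines are assumptions of this example.

```python
def parse_status_line(line):
    """Parse a status line into (version, status_code, reason)."""
    line = line.rstrip("\r\n")
    parts = line.split(None, 2)  # any run of white space delimits fields
    if len(parts) < 2:
        raise ValueError("status line needs a version and a status code")
    version, code = parts[0], parts[1]
    reason = parts[2] if len(parts) > 2 else ""
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("status code must be three decimal digits")
    return version, int(code), reason


if __name__ == "__main__":
    print(parse_status_line("HTTP/1.0 200 OK\r\n"))
```

The reason text is returned as received; it is meant for a human
reader and the sketch does not interpret it.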
931 |
|
932 |
Response headers |
933 |
|
934 |
The headers on returned objects are those RFC822 format headers
listed as object headers, as well as any MIME conforming headers,
notably the Content-Type field. Note that this specification does
not define any headers particular to the response which are not
also appropriate to any transmission of an object with a request.
939 |
|
940 |
Response data |
941 |
|
942 |
Additional information may follow, in the format of a MIME message |
943 |
body. The significance of the data depends on the status code. |
944 |
|
945 |
The Content-Type used for the data may be any Content-Type which |
946 |
the client has expressed his ability to accept, or text/plain, or |
947 |
text/html. That is, one can always assume that the client can |
948 |
handle text/plain and text/html. |
949 |
|
950 |
Status codes |
951 |
|
952 |
The values of the numeric status code to HTTP requests are as |
953 |
follows. The data sections of Error, Forward and redirection
responses may be used to contain human-readable diagnostic
information.
956 |
|
957 |
SUCCESS 2XX |
958 |
|
959 |
These codes indicate success. The body section if present is the
object returned by the request. It is in MIME format, and may only
be in text/plain, text/html or one of the formats specified as
acceptable in the request.
963 |
|
964 |
OK 200 |
965 |
|
966 |
The request was fulfilled. |
967 |
|
968 |
CREATED 201 |
969 |
|
970 |
Following a POST command, this indicates success, but the textual |
971 |
part of the response line indicates the URI by which the newly |
972 |
created document should be known. |
973 |
|
974 |
Accepted 202 |
975 |
|
976 |
The request has been accepted for processing, but the processing |
977 |
has not been completed. The request may or may not eventually be |
978 |
acted upon, as it may be disallowed when processing actually takes |
979 |
place. There is no facility for status returns from asynchronous
980 |
operations such as this. |
981 |
|
987 |
Partial Information 203 |
988 |
|
989 |
When received in the response to a GET command, this indicates that |
990 |
the returned metainformation is not a definitive set of the object |
991 |
from a server with a copy of the object, but is from a private |
992 |
overlaid web. This may include annotation information about the |
993 |
object, for example.
994 |
|
995 |
ERROR 4XX, 5XX |
996 |
|
997 |
The 4xx codes are intended for cases in which the client seems to |
998 |
have erred, and the 5xx codes for the cases in which the server is |
999 |
aware that the server has erred. It is impossible to distinguish |
1000 |
these cases in general, so the difference is only informational. |
1001 |
|
1002 |
The body section may contain a document describing the error in |
1003 |
human readable form. The document is in MIME format, and may only |
1004 |
be in text/plain, text/html or one of the formats specified as
1005 |
acceptable in the request. |
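
Because the class of a response is carried by the first digit of
the status code, a client can branch on it without knowing every
individual code. The sketch below is illustrative; the function
name and the class labels are not defined by this specification.

```python
def status_class(code):
    """Return a label for the class of a three digit status code."""
    classes = {
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 isolates the first digit of the code.
    return classes.get(code // 100, "unknown")


if __name__ == "__main__":
    for code in (200, 302, 404, 501):
        print(code, status_class(code))
```

Since the 4xx and 5xx distinction is only informational, a simple
client may treat both labels the same way.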
1006 |
|
1007 |
Bad request 400 |
1008 |
|
1009 |
The request had bad syntax or was inherently impossible to be |
1010 |
satisfied. |
1011 |
|
1012 |
Unauthorized 401 |
1013 |
|
1014 |
The parameter to this message gives a specification of |
1015 |
authorization schemes which are acceptable. The client should |
1016 |
retry the request with a suitable Authorization header. |
1017 |
|
1018 |
PaymentRequired 402 |
1019 |
|
1020 |
The parameter to this message gives a specification of charging |
1021 |
schemes acceptable. The client may retry the request with a |
1022 |
suitable ChargeTo header. |
1023 |
|
1024 |
Forbidden 403 |
1025 |
|
1026 |
The request is for something forbidden. Authorization will not |
1027 |
help. |
1028 |
|
1029 |
Not found 404 |
1030 |
|
1031 |
The server has not found anything matching the URI given.
1032 |
|
1033 |
Internal Error 500 |
1034 |
|
1035 |
The server encountered an unexpected condition which prevented it |
1036 |
from fulfilling the request. |
1037 |
|
1038 |
Not implemented 501 |
1039 |
|
1040 |
|
1041 |
|
1042 |
|
1043 |
T. Berners-Lee 18 |
1044 |
|
1045 |
The server does not support the facility required. |
1046 |
|
1047 |
REDIRECTION 3XX |
1048 |
|
1049 |
The codes in this section indicate action to be taken (normally |
1050 |
automatically) by the client in order to fulfill the request. |
1051 |
|
1052 |
Moved 301 |
1053 |
|
1054 |
The data requested has been assigned a new URI; the change is
permanent. (N.B. this is an optimisation, which must,
pragmatically, be included in this definition. Browsers with link
editing capability should automatically relink to the new
reference, where possible.)
1059 |
|
1060 |
The response contains one or more header lines of the form |
1061 |
|
1062 |
URI: <url> String CrLf |
1063 |
|
1064 |
which specify alternative addresses for the object in question.
The String is an optional comment field. If the response is to
indicate a set of variants which each correspond to the requested
URI, then the multipart/alternative wrapping may be used to
distinguish different sets.
1069 |
|
1070 |
Found 302 |
1071 |
|
1072 |
The data requested actually resides under a different URL;
however, the redirection may be altered on occasion (when making
links to these kinds of document, the browser should default to
using the URI of the redirection document, but have the option of
linking to the final document) as for "Forward".
1077 |
|
1078 |
The response format is the same as for Moved.
1079 |
|
1080 |
Method 303 |
1081 |
|
1082 |
Method: <method> <url> |
1083 |
body-section |
1084 |
|
1085 |
Like the found response, this suggests that the client go try |
1086 |
another network address. In this case, a different method may be |
1087 |
used too, rather than GET. |
1088 |
|
1089 |
The body-section contains the parameters to be used for the method. |
1090 |
This allows a document to be a pointer to a complex query |
1091 |
operation. |
1092 |
|
1093 |
@@TBS in more detail. |
1094 |
|
1095 |
The body may be preceded by the following additional fields as |
1096 |
listed below.
1097 |
|
1098 |
|
1099 |
|
1100 |
|
1101 |
T. Berners-Lee 19 |
1102 |
|
1103 |
Object MetaInformation |
1104 |
|
1105 |
The header fields given with or in relation to objects in HTTP are |
1106 |
as follows. All are optional. These headers specify |
1107 |
metainformation: that is, information about the object, not the |
1108 |
information which is contained in the object. |
1109 |
|
1110 |
There is no reason for limiting these fields to HTTP use, as any |
1111 |
other system which requires metainformation is encouraged to use |
1112 |
them. |
1113 |
|
1114 |
The order of header lines within the HTTP header has no
1115 |
significance. However, those fields which are not MIME fields |
1116 |
should occur before the MIME fields, so that the MIME fields and |
1117 |
following form a valid MIME document. This is not mandatory. |
1118 |
|
1119 |
Any header fields which are not understood should be ignored. |
1120 |
|
1121 |
(TBS in more detail) |
1122 |
|
1123 |
ALLOWED: *METHOD |
1124 |
|
1125 |
Lists the set of requests which the requesting user is allowed to |
1126 |
issue for this URL. If this header line is omitted, the default |
1127 |
allowed methods are "GET HEAD".
1128 |
|
1129 |
Example of use: |
1130 |
|
1131 |
Allow: GET HEAD PUT |
1132 |
|
1133 |
PUBLIC: *METHOD |
1134 |
|
1135 |
As "Allow" but lists those requests which anyone may use. If |
1136 |
omitted, the default is "GET" only. |
1137 |
|
1138 |
Example of use: |
1139 |
|
1140 |
Public: GET HEAD TEXTSEARCH |
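
The defaults for these two fields can be applied as in the sketch
below. It assumes the header fields have already been collected
into a dictionary keyed by lower-case field name, and that the
field names on the wire are "Allow" and "Public" as in the examples
of use. The function name is hypothetical.

```python
def permitted_methods(headers):
    """Return (allowed, public) method sets for an object.

    A missing Allow field defaults to GET and HEAD; a missing
    Public field defaults to GET only.
    """
    allow = headers.get("allow")
    public = headers.get("public")
    allowed = set(allow.split()) if allow is not None else {"GET", "HEAD"}
    anyone = set(public.split()) if public is not None else {"GET"}
    return allowed, anyone


if __name__ == "__main__":
    print(permitted_methods({"allow": "GET HEAD PUT"}))
```

A field that is present but empty yields an empty set, which this
sketch treats as "no methods" rather than falling back to the
default.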
1141 |
|
1142 |
CONTENT-LENGTH: INT |
1143 |
|
1144 |
Implies that the body is binary and should be read directly from
the communications link, without parsing lines, etc. When the
data is part of the request, this prevents the escaping and
de-escaping of the termination sequence.
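
Reading a body of known length directly from the link, with no line
parsing, might be sketched as follows. The stream is any object
with a read(n) method returning bytes, such as a socket file
object; io.BytesIO stands in for the network here. The function
name and the use of EOFError are assumptions of this example.

```python
import io


def read_body(stream, content_length):
    """Read exactly content_length bytes from a binary stream."""
    chunks = []
    remaining = content_length
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("connection closed before the body was complete")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    link = io.BytesIO(b"ab\r\n.\r\ncd and whatever follows")
    print(read_body(link, 9))
```

Bytes that look like a line terminator or a termination sequence
pass through untouched, because only the byte count is consulted.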
1148 |
|
1149 |
@@@ This should be part of the MIME header, as it applies to any |
1150 |
binary encoded part. Note HTTP is the first internet protocol to
1151 |
allow MIME "binary" encoding. In MIME, the use of Content-Length is |
1152 |
currently allowed only for external messages. |
1153 |
|
1154 |
CONTENT-TYPE: |
1155 |
|
1161 |
As defined in MIME, except where noted here. |
1162 |
|
1163 |
Extra non-MIME types |
1164 |
|
1165 |
It is reasonable to put strict limits on transfer formats for mail, |
1166 |
where there is no guarantee that the receiver will understand a |
1167 |
weird format. However, in HTTP one knows that the receiver will be |
1168 |
able to receive it because it will have been sent in the Accept: |
1169 |
field. There is therefore a lot to be gained from a very complete |
1170 |
registry of well-defined types for HTTP which may nevertheless not |
1171 |
be recommended for mail. In this case, the content-type list for |
1172 |
HTTP may be a superset of the MIME list. |
1173 |
|
1174 |
The x- convention for experimental types is of course still |
1175 |
available as well. |
1176 |
|
1177 |
Type parameters |
1178 |
|
1179 |
Parameters on the content type are extremely useful for describing |
1180 |
resolutions, colour depths, etc. They will allow a client to |
1181 |
specify in the Accept: field the resolution of its device. This may |
1182 |
allow the server to economise greatly on transmission time by |
1183 |
reducing the resolution of an image, for example.
1184 |
|
1185 |
These parameters are to be specified when types are registered.
1186 |
@@ TBS. |
1187 |
|
1188 |
Multipart types |
1189 |
|
1190 |
MIME provides for a number of "multipart" types. These are |
1191 |
encapsulations of several body parts in the one message. In HTTP, |
1192 |
Multipart types may be returned on the condition that the client |
1193 |
has indicated acceptability (using Accept:) of the multipart type
and also of the content types of each constituent body part.
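
The condition above amounts to a simple membership test, sketched
here. For brevity the example compares type names literally and
ignores type parameters and any wildcard forms of Accept:, which
is a simplification made for illustration. The function name is
hypothetical.

```python
def may_send_multipart(accepted, multipart_type, part_types):
    """Check whether a multipart response is acceptable to the client.

    accepted is the collection of content types named in Accept:
    fields. Both the multipart type itself and the type of every
    constituent body part must have been accepted.
    """
    accepted = {t.lower() for t in accepted}
    wanted = [multipart_type] + list(part_types)
    return all(t.lower() in accepted for t in wanted)


if __name__ == "__main__":
    accept = {"multipart/related", "text/html", "image/gif"}
    print(may_send_multipart(accept, "multipart/related",
                             ["text/html", "image/gif"]))
```

If any one part is of a type the client did not list, the server
would fall back to some other form of response.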
1195 |
|
1196 |
The body parts (unlike in MIME) MAY contain HTTP metainformation |
1197 |
header fields which ARE significant. |
1198 |
|
1199 |
Multipart/alternative |
1200 |
|
1201 |
This is normally used in mail to send different content-type |
1202 |
variants when the receiver's capabilities are not known. This is |
1203 |
not the case with HTTP. Multipart/alternative may however be used
to provide metainformation on many instances of an object, in the
case of an indirection response. This allows, for example, pointers
1206 |
to be returned by a name server to a set of instances of an object. |
1207 |
|
1208 |
Multipart/related |
1209 |
|
1210 |
This is the type to be used when the first body part contains |
1211 |
references to other parts which the server wishes to send at the |
1212 |
same time. For example, the first part could be an HTML document, |
1213 |
and the included body parts could be the inline images mentioned
within the text.
1220 |
|
1221 |
The body parts may have URI: fields if the body parts have URIs, |
1222 |
and so they may be referred to by these URIs in the body-parts. If |
1223 |
the body-parts are transient (as in speech insertions in mail |
1224 |
messages) then the [proposed] "cid:" URI type may be used to refer
1225 |
to them by content-identifier. |
1226 |
|
1227 |
Multipart/mixed |
1228 |
|
1229 |
This may be used to simply transfer an unrelated unstructured set |
1230 |
of objects. |
1231 |
|
1232 |
Multipart/parallel |
1233 |
|
1234 |
This may be used as in MIME to indicate simultaneous presentation. |
1235 |
[It is the author's belief that this is a trivial case of a |
1236 |
compound presentation which in general should be described by a |
1237 |
script which would be the first bodypart of a multipart/related
1238 |
document]. |
1239 |
|
1240 |
DATE: DATE |
1241 |
|
1242 |
Creation date of object. (or last modified, and separately have a |
1243 |
Created: field?) Format as in RFC850 but GMT MUST BE USED. |
1244 |
|
1245 |
EXPIRES: DATE |
1246 |
|
1247 |
Gives the date after which the information given ceases to be valid |
1248 |
and should be retrieved again. This allows control of caching |
1249 |
mechanisms, and also allows for the periodic refreshing of displays |
1250 |
of volatile data. Format as for Date:. This does NOT imply that |
1251 |
the original object will cease to exist. |
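
A cache might test an Expires value as in the sketch below. It
assumes the RFC850 layout "Weekday, DD-Mon-YY HH:MM:SS GMT" and
English day and month names (Python's strptime is locale
dependent). The function names and the sample date are
illustrative.

```python
from datetime import datetime, timezone

# RFC850 style date, always in GMT as this specification requires.
RFC850_FORMAT = "%A, %d-%b-%y %H:%M:%S GMT"


def parse_http_date(text):
    """Parse an RFC850 style GMT date into an aware datetime."""
    parsed = datetime.strptime(text.strip(), RFC850_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def is_stale(expires, now):
    """True if the information should be retrieved again."""
    return now > parse_http_date(expires)


if __name__ == "__main__":
    expires = "Friday, 05-Nov-93 12:00:00 GMT"
    print(is_stale(expires, datetime.now(timezone.utc)))
```

A stale copy only means the information should be fetched again; it
says nothing about whether the original object still exists.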
1252 |
|
1253 |
LAST-MODIFIED: DATE |
1254 |
|
1255 |
Last time object was modified, i.e. the date of this version if the |
1256 |
document is a "living document". Format as for Date:. |
1257 |
|
1258 |
MESSAGE-ID: URI |
1259 |
|
1260 |
A unique identifier for the message. As in RFC850, except that the
1261 |
unlimited lifetime of HTTP objects requires that the Message-ID be |
1262 |
unique in all time, not just in two years. |
1263 |
|
1264 |
A document may only have one Message-ID. |
1265 |
|
1266 |
No two documents, even if different versions of the same live |
1267 |
document, may have the same Message-id. |
1268 |
|
1269 |
Note: Unlike the URI field, this does not give a way of accessing
the document, so the Message-Id cannot be used to refer to the
document. In the case of NNTP articles, the message-id may in fact
be used within the URI for retrieval using NNTP.
1278 |
|
1279 |
URI: 1*URI |
1280 |
|
1281 |
This gives a URI with which the object may be found. There is no |
1282 |
guarantee that the object can be retrieved using the URI specified. |
1283 |
However, it is guaranteed that if an object is successfully |
1284 |
retrieved using that URI it will be to a certain given degree the |
1285 |
same object as this one. |
1286 |
|
1287 |
If the URI is used to refer to a set of variants, then the |
1288 |
dimensions in which the variants may differ must be given with the
1289 |
"vary" parameter: |
1290 |
|
1291 |
Syntax      URI: <uri> [ ; vary = dimension [ , dimension ]* ]
dimension   content-type | language | version
1294 |
|
1295 |
|
1296 |
If no "vary" parameters are given, then the URI may not return |
1297 |
anything other than the same bit stream as this object. |
1298 |
|
1299 |
Multiple occurrences of this field give alternative access names or
1300 |
addresses for the object. |
1301 |
|
1302 |
Examples |
1303 |
|
1304 |
URI: http://info.cern.ch/pub/www/doc/url6.multi; vary=content-type
1306 |
|
1307 |
This indicates that retrieval given the URI will return the same |
1308 |
document, never an updated version, but optionally in a different |
1309 |
rendition. |
1310 |
|
1311 |
URI: http://info.cern.ch/pub/www/doc/url.multi; |
1312 |
vary=content-type, language, version |
1313 |
|
1314 |
This indicates that the URI will return the same document, possibly
in a different rendition, possibly updated, and without excluding
the provision of translations into different languages.

URI: http://info.cern.ch/pub/www/doc/url6.ps vary=content-type

This indicates that accessing the URI in question will return exactly
the same bitstream.
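
A receiver could separate the URI from its "vary" parameter as in
the following sketch, which follows the syntax given above with a
semicolon before the parameter. Splitting at the first semicolon
assumes the URI itself contains none, and the function name is
hypothetical.

```python
DIMENSIONS = {"content-type", "language", "version"}


def parse_uri_field(value):
    """Split a URI field value into (uri, list_of_vary_dimensions).

    An empty list means no "vary" parameter was given, so the URI
    may only return the same bit stream as this object.
    """
    uri, sep, params = value.partition(";")
    dimensions = []
    if sep:
        name, eq, dims = params.partition("=")
        if name.strip().lower() != "vary" or not eq:
            raise ValueError("expected a vary parameter")
        dimensions = [d.strip().lower() for d in dims.split(",") if d.strip()]
        unknown = set(dimensions) - DIMENSIONS
        if unknown:
            raise ValueError("unknown dimension: " + ", ".join(sorted(unknown)))
    return uri.strip(), dimensions


if __name__ == "__main__":
    print(parse_uri_field(
        "http://info.cern.ch/pub/www/doc/url.multi; "
        "vary=content-type, language, version"))
```

Rejecting unknown dimensions is a strict choice made for the
example; a tolerant parser might prefer to ignore them.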
1320 |
|
1321 |
VERSION: |
1322 |
|
1323 |
This is a string defining the version of an evolving object. Its
format is currently undefined, and so it should be treated as
opaque to the reader, defined by the information provider. The
version field is used to indicate evolution along a single path of
a particular work. It is NOT used to indicate derived works (use a
link), translations, or renditions in different representations.
1329 |
|
1330 |
|
1331 |
|
1332 |
|
1333 |
T. Berners-Lee 23 |
1334 |
|
1335 |
Note: It would be useful to have sufficient semantics to be able |
1336 |
to deduce whether one version predated or postdated another. |
1337 |
However, it may also be useful to be able to insert a particular |
1338 |
local code management system's own version stamp in this field. |
1339 |
Typically, publishers will have quite complex version information |
1340 |
containing hidden local semantics, giving value to the idea of this |
1341 |
field being opaque to other readers of the document.
1342 |
|
1343 |
DERIVED-FROM: |
1344 |
|
1345 |
When an edited object is resubmitted, using PUT for example, this
field gives the value of the Version field of the object from which
it was derived. This typically allows a server to check that two
concurrent modifications by different parties will not be lost, and
to use established version management techniques to merge both
modifications.
1351 |
|
1352 |
LANGUAGE: CODE |
1353 |
|
1354 |
The language code is the ISO code for the language in which the
document is written. If the language is not known, this field
should be omitted.

The language code is an ISO 639 language code with an optional
ISO 3166 country code to specify a national variant.
1360 |
|
1361 |
Example |
1362 |
|
1363 |
Language: en_UK |
1364 |
|
1365 |
means that the content of the message is in British English, while |
1366 |
|
1367 |
Language: en |
1368 |
|
1369 |
means that the language is English in one of its forms. (@@ If a |
1370 |
document is in more than one language, for example requires both |
1371 |
Greek Latin and French to be understood, should this be |
1372 |
representable?) |
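
Splitting a Language value into its language and optional national
variant is straightforward, as this sketch shows. It assumes the
underscore separator used in the example above; the function name
and the normalisation of case are choices made for illustration.

```python
def parse_language(value):
    """Split a Language value into (language, variant or None)."""
    language, _, country = value.strip().partition("_")
    return language.lower(), (country.upper() or None)


if __name__ == "__main__":
    print(parse_language("en_UK"))
    print(parse_language("en"))
```

A receiver interested only in the language can compare the first
element and ignore the variant.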
1373 |
|
1374 |
See also: Accept-Language.
1375 |
|
1376 |
COST: TBS |
1377 |
|
1378 |
The cost of retrieving the object is given. This is the cost of |
1379 |
access of a copyright work. Format of units to be specified. |
1380 |
Currently refers to an unspecified charging scheme to be agreed out |
1381 |
of band between parties. |
1382 |
|
1383 |
WWW-LINK: |
1384 |
|
1385 |
Note. It is proposed that any HTML metainformation element (allowed
within the HEAD as opposed to BODY element of the document) be a
valid candidate for an HTTP object header. WWW-Link is a required
example. The suggestion was that the isomorphism should be realized
by prepending "WWW-" to the HTML element name to make the HTTP
header name, and the HTML attributes imply identically named
semicolon-separated MIME-style header parameters. Other clear
candidates include WWW-Title.
1398 |
|
1399 |
It is open to discussion whether the "WWW-" should be removed. |
1400 |
|
1401 |
This is semantically equivalent to the LINK element in an HTML |
1402 |
document which should be consulted for a full explanation. |
1403 |
|
1404 |
Examples |
1405 |
|
1406 |
WWW-Link: href="http://info.cern.ch/a/b/c"; rel="includes" |
1407 |
WWW-Link: href="mailto:timbl@info.cern.ch"; rev=made |
1408 |
|
1409 |
The first example indicates that this object includes the specified |
1410 |
/a/b/c object. The second indicates that the author of the object |
1411 |
is identified by the given mail address. |
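
The semicolon-separated parameters of a WWW-Link value could be
collected into a dictionary as sketched below. The example assumes
that no semicolon occurs inside a quoted value, which holds for the
examples above but is not guaranteed in general. The function name
is hypothetical.

```python
def parse_www_link(value):
    """Parse a WWW-Link field value into a dict of attributes.

    Attribute names are lower-cased and surrounding double quotes
    are removed from values.
    """
    params = {}
    for part in value.split(";"):
        name, sep, val = part.strip().partition("=")
        if not sep:
            continue  # skip anything that is not name=value
        params[name.strip().lower()] = val.strip().strip('"')
    return params


if __name__ == "__main__":
    print(parse_www_link('href="http://info.cern.ch/a/b/c"; rel="includes"'))
```

The resulting names match the attributes of the HTML LINK element,
which is the isomorphism the proposal aims for.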
1412 |
|
1413 |
Note: Client tolerance of bad servers |
1414 |
|
1415 |
Servers not implementing the specification as written are not HTTP
compliant. Servers should always be made completely compliant.
1417 |
However, clients should also tolerate deviant servers where |
1418 |
possible. |
1419 |
|
1420 |
BACK COMPATIBILITY |
1421 |
|
1422 |
In order that clients using the HTTP protocol should be able to |
1423 |
communicate with servers using the protocol originally implemented |
1424 |
in the W3 data model, clients should tolerate responses which do |
1425 |
not start with a numeric version number and response codes. |
1426 |
|
1427 |
In this case, they should assume that the rest of the response is a |
1428 |
document body in type text/html. |
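
A client could apply this rule by inspecting the first line of the
response, as in the sketch below. Recognising a status line by the
"HTTP/" prefix followed by a dotted version number is an assumption
of this example; the function name and the shape of the returned
dictionary are likewise illustrative.

```python
import re

# Assumed shape of a status line: version token, three digit code, reason.
STATUS_LINE = re.compile(
    r"^HTTP/\d+\.\d+[ \t]+(\d{3})(?:[ \t]+(.*))?$", re.ASCII)


def interpret_response(raw):
    """Classify a raw response (bytes) as new style or old style.

    If the first line is not a status line, the whole response is
    taken to be a document body of type text/html.
    """
    first, _, rest = raw.partition(b"\n")
    match = STATUS_LINE.match(first.rstrip(b"\r").decode("latin-1"))
    if match is None:
        return {"status": None, "content_type": "text/html", "body": raw}
    return {"status": int(match.group(1)), "rest": rest}


if __name__ == "__main__":
    print(interpret_response(b"<TITLE>Old server</TITLE>\n"))
```

For a new style response the remaining bytes (headers and body) are
returned unparsed under "rest".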
1429 |
|
1430 |
WHITE SPACE |
1431 |
|
1432 |
Clients should be tolerant in parsing response status lines, in |
1433 |
particular they should accept any sequence of white space (SP and |
1434 |
TAB) characters between fields. |
1435 |
|
1436 |
Lines should be regarded as terminated by the Line Feed, and the |
1437 |
preceding Carriage Return character ignored.
1438 |
|
1439 |
HTTP NEGOTIATION ALGORITHM
1440 |
|
1441 |
This note defines the significance of the q, mxb and mxs values |
1442 |
optionally sent in the Accept: field of the HTTP protocol request |
1443 |
message. |
1444 |
|
1445 |
It is assumed that there is a certain value of the presentation of
the document, optimally rendered using all the information
available in its original source.
1453 |
|
1454 |
It is further assumed that one can allocate a number between 0 and |
1455 |
1 to represent the loss of value which occurs when a document is |
1456 |
rendered into a representation with loss of information. Whilst |
1457 |
this is a very subjective measurement, and in fact largely a |
1458 |
function of the document in question, the approximation is made |
1459 |
that one can define this "degradation" figure as a function of |
1460 |
merely the representation involved. |
1461 |
|
1462 |
The next assumption is that the other cost to the user of viewing |
1463 |
the document is a function of the time taken for presentation. We |
1464 |
first assume that the cost is linear in time, and then assume that |
1465 |
the time is linear in the size of the message. |
1466 |
|
1467 |
The final net value to the user can therefore be written |
1468 |
|
1469 |
presented_value = initial_value * total_degradation - a - b * size
1471 |
|
1472 |
for a document in a given incoming representation. Suppose we |
1473 |
normalize the initial value of the document to be 1. The server |
1474 |
may judge that the value in a particular format is less than 1 if a
1475 |
conversion on the server side has lost information. The total |
1476 |
degradation is then the product of any degradation due to |
1477 |
conversions internal to the server, and the degradation "q" sent in |
1478 |
the Accept field. If q is not sent, it defaults to 1. |
1479 |
|
1480 |
The values of a and b have components from processing time on the |
1481 |
server, network delays, and processing time on the client. These |
1482 |
delays are not additive as a good system will pipeline the |
1483 |
processing, and whilst the result may be linear in message size, |
1484 |
calculation of it in advance is not simple. The amount of |
1485 |
pipelining and the loads on machines and network are all difficult |
1486 |
to predict, so a very rough assumption must be made. |
1487 |
|
1488 |
We make the client responsible for taking into account network |
1489 |
delays. The client will in fact be in a better position to do this, |
1490 |
as the client will after one transaction be aware of the round-trip |
1491 |
time. |
1492 |
|
1493 |
We assume that the delays imposed by the server and by the client |
1494 |
(including network) are additive. We assume that the client's |
1495 |
delay is proportional to message size. |
1496 |
|
1497 |
The three parameters given by the client to the server are |
1498 |
|
1499 |
q       The degradation (quality) factor between 0
        and 1. If omitted, 1 is assumed.

mxb     The size of message (in bytes) which, even if
        immediately available from the server, will
        cause the value to the reader to become zero.

mxs     The delay (in seconds) which, even for a very
        small message with no length-related penalty,
        will cause the value to the reader to become
        zero.
1515 |
|
1516 |
These parameters are chosen in part because they are easy to |
1517 |
visualize as the largest tolerable delay and size. If not sent, |
1518 |
they default to infinity. |
1519 |
|
1520 |
The server may optimize the presented value for the user when |
1521 |
deciding what to return. The hope is that fine decisions will not |
1522 |
have to be made, as in most cases the results for different formats |
1523 |
will be very different, and there will be a clear winner. |
1524 |
|
1525 |
A suitable algorithm is that the assumed value v of a document of |
1526 |
initial value u delivered to the network after a delay t whose |
1527 |
transfer length on the net is b bytes is |
1528 |
|
1529 |
v = u * q - b/mxb - t/mxs |
1530 |
|
1531 |
Note that t is the time from the arrival of the request to the |
1532 |
first byte being available on the net. [[See also: Design issues |
1533 |
discussions around this point.]] |
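
The formula v = u * q - b/mxb - t/mxs can be turned into a small
selection routine, sketched below. The representation of variants
and of the client's Accept parameters as dictionaries, and the
function names, are assumptions of this example rather than part
of the protocol.

```python
import math


def presented_value(u, q=1.0, b=0, t=0.0, mxb=math.inf, mxs=math.inf):
    """Value v = u*q - b/mxb - t/mxs; mxb and mxs default to infinity."""
    return u * q - b / mxb - t / mxs


def choose_variant(variants, accept):
    """Pick the variant with the highest presented value.

    variants: dicts with "type", "u" (value after any server side
    conversion), "bytes" and "delay" in seconds.
    accept: maps each acceptable content type to a dict that may
    hold "q", "mxb" and "mxs".
    Types the client did not list are never chosen.
    """
    best, best_value = None, -math.inf
    for v in variants:
        params = accept.get(v["type"])
        if params is None:
            continue
        value = presented_value(
            v["u"], params.get("q", 1.0), v["bytes"], v["delay"],
            params.get("mxb", math.inf), params.get("mxs", math.inf))
        if value > best_value:
            best, best_value = v, value
    return best


if __name__ == "__main__":
    variants = [
        {"type": "application/postscript", "u": 1.0, "bytes": 400000, "delay": 0.0},
        {"type": "text/plain", "u": 0.8, "bytes": 20000, "delay": 0.0},
    ]
    accept = {
        "application/postscript": {"q": 1.0, "mxb": 500000},
        "text/plain": {"q": 0.5},
    }
    print(choose_variant(variants, accept)["type"])
```

In this made-up case the large PostScript file scores about 0.2 and
the plain text about 0.4, so the smaller rendition wins despite its
lower quality factor.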
1534 |
|
1535 |
HTTP Negotiation algorithm
1536 |
|
1537 |
This note defines the significance of the d, a and b values |
1538 |
optionally sent in the Accept: field of the HTTP protocol request |
1539 |
message. |
1540 |
|
1541 |
It is assumed that there is a certain value of the presentation of |
1542 |
the document, optimally rendered using all the information |
1543 |
available in its original source. |
1544 |
|
1545 |
It is further assumed that one can allocate a number between 0 and |
1546 |
1 to represent the loss of value which occurs when a document is |
1547 |
rendered into a representation with loss of information. Whilst
1548 |
this is a very subjective measurement, and in fact largely a |
1549 |
function of the document in question, the approximation is made |
1550 |
that one can define this "degradation" figure as a function of |
1551 |
merely the representation involved. |
1552 |
|
1553 |
The next assumption is that the other cost to the user of
viewing the document is a function of the time taken for
1555 |
presentation. We first assume that the cost is linear in time, and |
1556 |
then assume that the time is linear in the size of the message. |
1557 |
|
1558 |
The final net value to the user can therefore be written |
1559 |
|
1560 |
presented_value = initial_value * total_degradation - a - b * size
1562 |
|
1563 |
|
1564 |
|
1565 |
T. Berners-Lee 27 |
1566 |
|
1567 |
for a document in a given incoming representation. Suppose we
1568 |
normalize the initial value of the document to be 1. The total |
1569 |
degradation is the product of any degradation due to conversions |
1570 |
internal to the server, and the degradation "d" sent in the Accept |
1571 |
field. The values of a and b are also sent by the protocol. If not |
1572 |
sent, they default to no cost (d=1, a=b=0). |
1573 |
|
1574 |
The server may optimize the presented value for the user when |
1575 |
deciding what to return. The hope is that fine decisions will not |
1576 |
have to be made, as in most cases the results for different formats |
1577 |
will be very different, and there will be a clear winner. |
1578 |
|
1579 |
See also: Design issues discussions around this point. |
1580 |
|
1581 |
Tim BL |
1582 |
|
1583 |
Note: The cost of retrieval time |
1584 |
|
1585 |
The assumption that the cost to the user associated with a certain
retrieval time is linear in that time is wildly inaccurate. The
real function could be very dependent on circumstances (for
example, going to infinity at a deadline).
1589 |
|
1590 |
A better general approximation might be logarithmic for large time
delays, and linear for small ones, like a*log(b*t+1) which has two
parameters.
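
As a sketch of such a two parameter cost function, the code below
uses the form a*log(1 + b*t), which is an assumption about the
intended shape: it is approximately a*b*t for small t and grows
logarithmically for large t. The function name is hypothetical.

```python
import math


def retrieval_cost(t, a, b):
    """Cost of waiting t seconds: a * log(1 + b*t).

    log1p keeps the result accurate when b*t is very small.
    """
    return a * math.log1p(b * t)


if __name__ == "__main__":
    for t in (0.001, 1.0, 100.0):
        print(t, retrieval_cost(t, a=2.0, b=3.0))
```

A deadline, where the cost goes to infinity, is not captured by
this form and would need separate handling.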
1593 |
|
1594 |
REGISTRATION AUTHORITY |
1595 |
|
1596 |
The HTTP Registration Authority is responsible for maintaining |
1597 |
lists of: |
1598 |
|
1599 |
Charge account name spaces (see ChargeTo: field above) |
1600 |
|
1601 |
Authorization schemes (see Authorization: field above) |
1602 |
|
1603 |
Data format names (as MIME Content-Types) |
1604 |
|
1605 |
Data encoding names (as MIME Content-Encoding)
1606 |
|
1607 |
It is proposed that the Internet Assigned Numbers Authority or |
1608 |
their successors take this role. |
1609 |
|
1610 |
Unregistered values may be used for experimental purposes if they |
1611 |
start with "X-".
1612 |
|
1613 |
SECURITY CONSIDERATIONS |
1614 |
|
1615 |
The HTTP protocol allows requests to be communicated to a remote
server machine, and all the expected security considerations for
client-server systems apply, including
1618 |
|
1619 |
Authentication of requests |
1620 |
|
   Authentication of servers

   Privacy of request and response

   Abuse of server features

   Abuse of servers by exploiting server bugs

   Unwitting actions on the net

   Abuse of log information

The bulk of these are well-known problems, tackled in part by some
features of this protocol. Some aspects particular to HTTP are
mentioned below.
|
Unwitting actions on the net

The writers of client software should be aware that the software
represents the user in his interactions over the net, and should be
careful to make the user aware of any actions he may take which
others may interpret as having an unexpected significance.
|
TCP PORT NUMBERS

Clients should prompt a user before allowing HTTP access to
reserved ports other than the port reserved for HTTP (port 80).
Otherwise, the user may unwittingly cause a transaction to occur in
some other (present or future) protocol.
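
A client-side check along these lines might look like the sketch
below. It assumes that "reserved ports" means TCP ports below 1024;
that threshold, the constant names and the function name
port_needs_confirmation are assumptions of this example and are not
defined by the protocol.

```python
from urllib.parse import urlsplit

HTTP_PORT = 80
# Assumption: "reserved" is taken to mean the ports below 1024.
RESERVED_PORT_LIMIT = 1024


def port_needs_confirmation(url: str) -> bool:
    """Return True if the client should prompt the user before
    opening an HTTP connection to the port named in the URL."""
    port = urlsplit(url).port
    if port is None:
        # No explicit port in the URL: the HTTP default applies.
        port = HTTP_PORT
    return port < RESERVED_PORT_LIMIT and port != HTTP_PORT


if __name__ == "__main__":
    for url in ("http://example.org/",
                "http://example.org:25/",
                "http://example.org:8080/"):
        print(url, port_needs_confirmation(url))
```

Under these assumptions, a URL naming port 25 would trigger a prompt,
while port 80 and unreserved ports such as 8080 would not.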
|
IDEMPOTENT METHODS

The convention should be established that the GET and HEAD methods
never have the significance of taking an action. A link "click
here to subscribe" which works by causing a special "magic"
document to be read is open to abuse by others, who can make a link
to the same document labelled "click here to see a pretty picture".
These methods should be considered "safe" and should not have side
effects. This allows the client software to represent other methods
(such as POST) in a special way, so that the user is aware of the
fact that an action is being requested.
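
One way a client could apply this convention is sketched below. The
set of safe methods comes from the text; the function names and the
way an action is flagged to the user are hypothetical, and method
names are assumed to be written in upper case as they are here.

```python
# Methods which, by the convention above, never request an action.
SAFE_METHODS = frozenset({"GET", "HEAD"})


def is_safe(method: str) -> bool:
    """Return True if the method should have no side effects."""
    return method in SAFE_METHODS


def link_label(method: str, text: str) -> str:
    """Hypothetical rendering rule: leave safe links unchanged and
    mark any other link so the user knows an action is requested."""
    if is_safe(method):
        return text
    return f"{text} [action: {method}]"


if __name__ == "__main__":
    print(link_label("GET", "See a pretty picture"))
    print(link_label("POST", "Subscribe"))
```

A real client would more likely use a distinct button or a
confirmation step than a text suffix; the point is only that the
decision depends on the method and not on the wording of the link.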
|
Abuse of log information

A server is in a position to save large amounts of personal data
about information requested by different readers. This information
is clearly confidential in nature, and its handling may be
constrained by law in certain countries. Server providers shall
ensure that such material is not distributed without the
permission of any people or groups of people mentioned in the
results published.
|
A feature which increases the amount of personal data transferred
is the Referer: field. This allows reading patterns to be studied
and reverse links to be drawn, and so is very useful. Its power can
of course be abused if user details are not separated from the
Referee-Referer pairs. Even when the personal information has been
removed, the Referer field may in fact be a secure document's URI,
whose revelation is itself a breach of security. A method of
suppressing the Referer information in such cases may be the
subject of further study.
|
REFERENCES

RFC 822        "Standard for the Format of ARPA Internet Text
               Messages", David H. Crocker. Describes the Internet
               mail message format.

RFC 850        "Standard for Interchange of USENET Messages". This
               RFC uses some field names in common with this
               specification, and is relevant reading.

RFC 977        "Network News Transfer Protocol", Kantor and Lapsley.

RFC 1341       "Multipurpose Internet Mail Extensions (MIME)",
               Nathaniel Borenstein and Ned Freed, 1992. Now
               obsoleted by RFC 1521.

RFC 1521       MIME. Not available in hypertext form yet.

RFC 1522       K. Moore, "MIME: Part Two: Message Header Extensions
               for Non-ASCII Text".

RFC 1523       MIME text/enriched.

RFC 1524       MIME...

URL            Universal Resource Locators. RFCxxx. Currently
               available by anonymous FTP from info.cern.ch as
               /pub/ietf/url3.{ps,txt}.

MIME and PEM   Internet Draft only
|
OBJECT CONTENTS

The data (if any) sent with an HTTP request or reply is in a format
and encoding defined by the object header fields, the default being
"text/plain" type with "8bit" encoding. Note that while all the
other information in the request (just as in the reply) is in ISO
Latin-1 with lines delimited by Carriage Return/Line Feed pairs, the
data may contain 8-bit binary data.
|
Termination

The delimiting of the message is determined by the Content-Length:
field. If this is present, then the message contains the specified
number of bytes.

Failing that, the Content-Type field may contain a "boundary"
attribute which gives the boundary string with exactly the same
syntax as for a MIME multipart message.

Failing either of the above conditions, the data is terminated by
the closing of the connection by the sending party. Note that this
method cannot be used for data sent with the request.
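
The order of precedence described above can be sketched as follows.
This is an illustration only: the function name delimiting_rule, the
returned labels, and the assumption that header field names have
already been lower-cased are choices made for this example. The
boundary parsing is deliberately simplified and does not cover the
full MIME parameter syntax.

```python
import re
from typing import Mapping, Tuple, Union

# Simplified matcher for a quoted or unquoted boundary parameter.
_BOUNDARY = re.compile(r'boundary\s*=\s*(?:"([^"]*)"|([^;\s]+))',
                       re.IGNORECASE)


def delimiting_rule(headers: Mapping[str, str],
                    is_request: bool) -> Tuple[str, Union[int, str, None]]:
    """Decide how the end of the data is found.

    headers maps lower-cased field names to their values.
    """
    # 1. Content-Length, if present, gives the number of bytes.
    length = headers.get("content-length")
    if length is not None:
        return ("length", int(length.strip()))

    # 2. Otherwise a boundary attribute on Content-Type is used.
    match = _BOUNDARY.search(headers.get("content-type", ""))
    if match:
        quoted, bare = match.group(1), match.group(2)
        return ("boundary", quoted if quoted is not None else bare)

    # 3. Otherwise the sender closes the connection. This cannot be
    #    used for a request, so this sketch assumes that such a
    #    request carries no data at all.
    if is_request:
        return ("none", None)
    return ("close", None)


if __name__ == "__main__":
    print(delimiting_rule({"content-length": "42"}, False))
    print(delimiting_rule({"content-type": "text/plain"}, False))
```

A reply carrying only a Content-Type of text/plain therefore falls
through to the third rule and is read until the server closes the
connection.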
|
See also: note on server tolerance for back-compatibility, etc.