
Contents of /webroot/www/2004/id/draft-ietf-iiir-http-00.txt



Revision 1.1
Tue Jun 15 08:04:06 2004 UTC (19 years, 11 months ago) by wakaba
Branch: MAIN
CVS Tags: HEAD
File MIME type: text/plain
New

1 Hypertext Transfer Protocol (HTTP) Tim Berners-Lee, CERN
2 Internet Draft 5 Nov 1993
3 draft-ietf-iiir-http-00.txt Expires 5 May 1994
4
5
6
7 Hypertext Transfer Protocol (HTTP)
8
9 A Stateless Search, Retrieve and Manipulation Protocol
10
11
12
13
14
15 Status of this memo
16
17 This document is an Internet Draft. Internet Drafts are working
18 documents of the Internet Engineering Task Force (IETF), its Areas,
19 and its Working Groups. Note that other groups may also distribute
20 working documents as Internet Drafts.
21
22 Internet Drafts are working documents valid for a maximum of six
23 months. Internet Drafts may be updated, replaced, or obsoleted by
24 other documents at any time. It is not appropriate to use Internet
25 Drafts as reference material or to cite them other than as a
26 "working draft" or "work in progress".
27
28 This document is a DRAFT specification of a protocol in use on the
29 internet and to be proposed as an Internet standard. Discussion
30 of this protocol takes place on the www-talk@info.cern.ch mailing
31 list -- to subscribe mail to www-talk-request@info.cern.ch.
32 Distribution of this memo is unlimited.
33
34 Abstract
35
36 HTTP is a protocol with the lightness and speed necessary for a
37 distributed collaborative hypermedia information system. It is a
38 generic stateless object-oriented protocol, which may be used for
39 many similar tasks such as name servers, and distributed
40 object-oriented systems, by extending the commands, or "methods",
41 used. A feature of HTTP is the negotiation of data representation,
42 allowing systems to be built independently of the development of
43 new advanced representations.
44
45 Note: This specification
46
47 This HTTP protocol is an upgrade on the original protocol as
48 implemented in the earliest WWW releases. It is back-compatible
49 with that more limited protocol.
50
51 This specification includes the following parts:
52
53 The Request
54
55
56
57 T. Berners-Lee 1
58
59 Methods
60
61 A list of headers in the request message
62
63 The response
64
65 Status codes
66
67 A list of headers on any object transmitted
68
69 The content of any object content transmitted
70
71 Format negotiation algorithm
72
73 The HTTP Registration Authority
74
75 Security Considerations
76
77 Unresolved points
78
79 References
80
81 The following notes form recommended practice not part of the
82 specification:
83
84 Servers tolerating clients
85
86 Clients tolerating servers
87
88 Purpose
89
90 When many sources of networked information are available to a
91 reader, and when a discipline of reference between different
92 sources exists, it is possible to rapidly follow references
93 between units of information which are provided at different remote
94 locations. As response times should ideally be of the order of
95 100ms in, for example, a hypertext jump, this requires a fast,
96 stateless, information retrieval protocol.
97
98 Practical information systems require more functionality than
99 simple retrieval, including search, front-end update and
100 annotation. This protocol allows an open-ended set of methods to be
101 used. It builds on the discipline of reference provided by the
102 Universal Resource Identifier (URI), which, as a name (URN, RFCxxxx)
103 or address (URL, RFCxxxx), allows the object of the method to be
104 specified.
105
106 Reference is made to the Multipurpose Internet Mail Extensions
107 (MIME, RFC1341) which are used to allow objects to be transmitted
108 in an open variety of representations.
109
110 Overall operation
111
112
113
114
116
117 On the internet, the communication takes place over a TCP/IP
118 connection. This does not preclude this protocol being implemented
119 over any other protocol on the internet or other networks. In
120 these cases, the mapping of the HTTP request and response
121 structures onto the transport data units of the protocol in
122 question is outside the scope of this specification. It should not
123 however be at all complicated.
124
125 The protocol is basically stateless, a transaction consisting of:
126
127 Connection The establishment of a connection by the
128 client to the server - when using TCP/IP
129 port 80 is the well-known port, but other
130 non-reserved ports may be specified in the
131 URL;
132
133 Request The sending, by the client, of a request
134 message to the server;
135
136 Response The sending, by the server, of a response to
137 the client;
138
139 Close The closing of the connection by either or both
140 parties.
141
142 The format of the request and response parts is defined in this
143 specification. Whilst header information defined in this
144 specification is sent in the ISO Latin-1 character set in CRLF
145 terminated lines, object transmission in binary is possible.
146
147 Character sets
148
149 In all cases in HTTP where RFC822 characters are allowed, these may
150 be extended to use the full ISO Latin 1 character set. 8-bit
151 transmission is always used.
152
154
155 REQUEST
156
157 The request is sent with a first line containing the method to be
158 applied to the object requested, the identifier of the object, and
159 the protocol version in use, followed by further information
160 encoded in the RFC822 header style. The format of the request is:
161
162
163 Request = SimpleRequest | FullRequest
164
165 SimpleRequest = GET <uri> CrLf
166
167 FullRequest = Method URI ProtocolVersion CrLf
168 [*<HTRQ Header>]
169 [<CrLf> <data>]
170
171
172
174
175 <Method> = <InitialAlpha>
176
177 ProtocolVersion = HTTP/1.0
178
179 uri = <as defined in URL spec>
180
181 <HTRQ Header> = <Fieldname> : <Value> <CrLf>
182
183 <data> = MIME-conforming-message
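The grammar above can be exercised with a small serialiser. This is a minimal sketch, not part of the specification: the function name build_request and its parameters are hypothetical, the version token is written "HTTP/1.0" as in the examples of this memo, and the sketch assumes that an empty line always closes the header list.

```python
CRLF = "\r\n"


def build_request(method, uri, headers=None, body=None, simple=False):
    """Serialise a request as a string (hypothetical helper)."""
    if simple:
        # SimpleRequest = GET <uri> CrLf : no version, no headers, no body.
        if method != "GET" or headers or body is not None:
            raise ValueError("a SimpleRequest is a bare GET")
        return "GET " + uri + CRLF
    # Method names are case sensitive, so the method is sent unchanged.
    lines = [method + " " + uri + " HTTP/1.0"]
    for name, value in (headers or []):
        lines.append(name + ": " + value)
    # Assumption: an empty line always closes the header list.
    text = CRLF.join(lines) + CRLF + CRLF
    if body is not None:
        text += body
    return text
```

For example, build_request("GET", "/doc", simple=True) gives the one-line form understood by the earliest servers, while passing a list of (name, value) pairs produces a FullRequest.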
184
185
186 The URI is the Uniform Resource Locator (URL) as defined in the
187 URL specification, or, for servers which support URN resolution,
188 a Uniform Resource Name (URN) once a specification for this is
189 settled.
190
191 Unless the server is being used as a gateway, a partial URL shall
192 be given with the assumptions of the protocol (HTTP:) and server
193 (the server) being obvious.
194
195 The URI should be encoded using the escaping scheme described in
196 the URL specification to a level such that (at least) spaces and
197 control characters (decimal 0-31 and 128-159) do not appear
198 unescaped.
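The minimum level of escaping described above might be sketched as follows. The function name escape_uri is hypothetical, and the sketch assumes the input is an ISO Latin-1 string whose existing "%" escapes are already correct, so only spaces and the two control ranges are rewritten.

```python
def escape_uri(uri):
    """Escape spaces and control characters (0-31, 128-159) as %XX."""
    out = []
    for ch in uri:
        code = ord(ch)
        if ch == " " or code <= 31 or 128 <= code <= 159:
            out.append("%{:02X}".format(code))
        else:
            # Everything else is passed through untouched (assumption:
            # the caller has already dealt with other reserved characters).
            out.append(ch)
    return "".join(out)
```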
199
200 Note. The rest of an HTTP URL after the host name and optional port
201 number is completely opaque to the client: The client may make no
202 deductions about the object from its URL.
203
204 Protocol Version
205
206 The Protocol/Version field defines the format of the rest of the
207 request. At the moment only HTRQ is defined.
208
209 If the protocol version is not specified, the server assumes that
210 the browser uses HTTP version 0.9.
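A server-side reading of the first request line, including the fallback to version 0.9, could look like the sketch below. The name parse_request_line is hypothetical, and the sketch assumes the version token has the form "HTTP/" followed by the version number, as in the examples of this memo.

```python
def parse_request_line(line):
    """Split a request line into (method, uri, version)."""
    parts = line.rstrip("\r\n").split()
    if len(parts) == 2:
        # No protocol version given: assume an HTTP 0.9 client.
        method, uri = parts
        version = "0.9"
    elif len(parts) == 3:
        method, uri, token = parts
        if not token.startswith("HTTP/"):
            raise ValueError("unrecognised protocol version: " + token)
        version = token[len("HTTP/"):]
    else:
        raise ValueError("malformed request line")
    # The method is returned unchanged because method names are case sensitive.
    return method, uri, version
```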
211
212 Uniform Resource Identifier
213
214 This is a string identifying the object. It contains no blanks. It
215 may be a Uniform Resource Locator [ URL ] defining the address of
216 an object as described in RFCxxxx, or it may be a representation of
217 the name of an object (URN, Universal Resource Name) where that
218 object has been registered in some name space. At the time of
219 writing, no suitable naming system exists, but this protocol will
220 accept such names so long as they are distinguishable from the
221 existing URL name spaces.
222
223 Methods
224
225 The Method field indicates the method to be performed on the object
226 identified by the URL. More details are with the list of method
227 names below.
228
229
230
232
233 Request Headers
234
235 These are RFC822 format headers with special field names given in
236 the list below, as well as any other HTTP object headers or MIME
237 headers.
238
239 Object Body
240
241 The content of an object is sent (depending on the method) with the
242 request and/or the reply.
243
244 Methods
245
246 The Method field in HTTP indicates the method to be performed on
247 the object identified by the URL. The method GET below is always
248 supported. The list of other methods acceptable to the object is
249 returned in response to either of these two requests.
250
251 This list may be extended from time to time by a process of
252 registration with the design authority. Method names are case
253 sensitive. Currently specified methods are as follows:
254
255 GET means retrieve whatever data is identified
256 by the URI, so where the URI refers to a
257 data-producing process, or a script which can
258 be run by such a process, it is this data
259 which will be returned, and not the source
260 text of the script or process. Also used for
261 searches.
262
263 HEAD is the same as GET but returns only HTTP
264 headers and no document body.
265
266 CHECKOUT Similar to GET but locks the object against
267 update by other people. The lock may be
268 broken by a higher authority or on timeout:
269 in this case a future CHECKIN will fail.
270 (Phase out?)
271
272 SHOWMETHOD Returns a description (perhaps a form) for a
273 given method when applied to the given
274 object. The method name is specified in a
275 For-Method: field. (TBS)
276
277 PUT specifies that the data in the body section
278 is to be stored under the supplied URL. The
279 URL must already exist. The new contents of
280 the document are the data part of the
281 request. POST and REPLY should be used for
282 creating new documents.
283
284 DELETE Requests that the server delete the
285 information corresponding to the given URL.
286
287
288
290
291 After a successful DELETE method, the URL
292 becomes invalid for any future methods.
293
294 POST Creates a new object linked to the specified
295 object. The message-id field of the new
296 object may be set by the client or else will
297 be given by the server. A URL will be
298 allocated by the server and returned to the
299 client. The new document is the data part of
300 the request. It is considered to be
301 subordinate to the specified object, in the
302 way that a file is subordinate to a directory
303 containing it, or a news article is
304 subordinate to a newsgroup to which it is
305 posted.
306
307 LINK Links an existing object to the specified
308 object.
309
310 UNLINK Removes link (or other meta-) information
311 from an object.
312
313 CHECKIN Similar to PUT, but releases the lock set on
314 the object. Fails if no lock has been set by
315 CHECKOUT. Suggestion : phase out this
316 (rcs-like) model in favor of the PUT
317 (cvs-like, non-locking) model of code
318 management.
319
320 TEXTSEARCH The object may be queried with a text
321 string. The search form of the GET method is
322 used to query the object.
323
324 SPACEJUMP The object will accept a query whose terms
325 are the coordinates of a point within the
326 object. The method is implemented using GET
327 with a derived URL.
328
329 SEARCH Proposed only. The index (etc) identified
330 by the URL is to be searched for something
331 matching in some sense the enclosed message.
332 How does the client know what message formats
333 are acceptable to the server? (Suggestion of
334 Fred Williams)
335
336 (Some of these methods require more detailed specification)
337
338 Note: case sensitivity of method names
339
340 Although lack of case sensitivity in method names would be a
341 tolerant approach with a limited method set, we require the
342 extensibility of HTTP to cover an arbitrary underlying object
343 system. In such a system, method names may be case sensitive, and
344
345
346
348
349 so we must preserve case in HTTP.
350
351 GET
352
353 A representation of the object is transferred to the client.
354
355
356
357 Some URIs refer to specific variants of an object, and some refer
358 to objects with many variants. In the latter case, the
359 representations, encodings, and languages acceptable may be
360 specified in the header request fields, and may affect the
361 particular value which is returned.
362
363 Other possible replies allow a set of URIs to be returned to the
364 client, who may use them to retrieve the object. This allows name
365 servers to be implemented using HTTP, and also forwarding address
366 to be given when objects have been moved.
367
368 Annotation replies
369
370 Some servers keep "parallel webs" -- separate stores of information
371 about other servers' objects. Typically this includes annotation
372 links. This information may be retrieved using the GET method,
373 but a special reply is returned.
374
375 Searching using GET
376
377 When the TEXTSEARCH or SPACEJUMP methods are supported, their
378 implementation is in fact by means of the GET method, by
379 constructing a derived URL.
380
381 The URL used with GET is the URL of the object, suffixed by a "?"
382 character and the text to be searched. If the object being searched
383 is in fact itself the result of a search (ie the URL contains a
384 "?"), then those search terms are first stripped off, so the search
385 is performed on the original object.
386
387 In the URL, keywords are separated with plus signs ("+"). Real
388 spaces, plus signs, and other illegal characters [who defines
389 illegal characters?] are represented as hexadecimal ASCII escape
390 sequences (%##, e.g., space = %20, plus = %2B).
391
392 Browsers accepting text input should define keywords as separated
393 by spaces (and therefore map them to plus signs). More advanced
394 browsers may allow separate keyword input and, therefore, a finer
395 level of control over the content without users needing to
396 understand the underlying mechanisms (e.g., that they must use %20
397 to get a real space).
398
399 Examples
400
401 The following is a request on an object supporting TEXTSEARCH:
402
403
404
406
407 GET /indexes/botany?annual+plants HTTP/1.0
408
409 This request on an object supporting SPACEJUMP selects point
410 (0.2,3.8) within a map:
411
412 GET /maps/uk/dorset/winspit/0.2+3.8 HTTP/1.0
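The construction of a derived search URL can be sketched as follows. The helper names are hypothetical. Because the set of "illegal characters" is left open in the text, the sketch assumes it covers space, plus, percent, control characters and anything outside printable ASCII, and it assumes keywords are ISO Latin-1 strings.

```python
def escape_keyword(word):
    """Escape one keyword so that '+' can safely act as the separator."""
    out = []
    for byte in word.encode("latin-1"):
        # Assumption: these are the characters that need %XX escaping.
        if byte in b" +%" or byte <= 31 or byte >= 127:
            out.append("%{:02X}".format(byte))
        else:
            out.append(chr(byte))
    return "".join(out)


def derive_search_url(object_url, keywords):
    """Build the GET URL for a search on object_url."""
    # Strip the terms of any earlier search so the original object is searched.
    base = object_url.split("?", 1)[0]
    return base + "?" + "+".join(escape_keyword(k) for k in keywords)


def keywords_from_text(text):
    """A simple browser treats typed spaces as keyword separators."""
    return text.split()
```

With these helpers, derive_search_url("/indexes/botany", keywords_from_text("annual plants")) reproduces the URL of the TEXTSEARCH example.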
413
414
415 LINK
416
417 The link method of HTTP adds meta information (Object header
418 information) to an object, without touching the object's content.
419 For example, it requests the creation of a link from the specified
420 object to another object.
421
422 The request is followed by a set of object headers which are to be
423 added.
424
425 In cases in which a new header is added and the semantics of the
426 header do not allow it to coexist with a similar header, then the
427 previous header is deleted. For example, an object may only have
428 one title, so specifying a title overwrites any preexisting title.
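The replacement rule for headers that cannot coexist might be modelled as below. This is only a sketch under stated assumptions: object headers are represented as a list of (name, value) pairs, the function name apply_link is hypothetical, and the set of single-valued header names is an assumption containing just the title mentioned above.

```python
# Assumption: only "title" is treated as unable to coexist with itself.
SINGLE_VALUED = {"title"}


def apply_link(existing, added):
    """Return the object headers after a LINK adds the given headers."""
    result = list(existing)
    for name, value in added:
        if name.lower() in SINGLE_VALUED:
            # The previous header of this kind is deleted.
            result = [(n, v) for n, v in result if n.lower() != name.lower()]
        result.append((name, value))
    return result
```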
429
430 Link Types
431
432 The link type unless void is specified in the WWW-Link object field
433 for each link. If not specified, void is assumed.
434
435 Unresolved points
436
437 As this is generalised to allow any metainformation to be added, a
438 better name might be DESCRIBE or attribute (as a verb). I don't
439 like "describe" as it does not suggest alteration.
440 "Bestow"/"Rescind" might be better. For example, one could bestow a
441 title or author on something previously title-less. We are looking
442 at a general data model behind here a little like a relational
443 database. This function adds records. "Set" and "Reset"?
444
445 Related Methods
446
447 LINK is like POST except that
448
449 no storage is requested for the destination object, and
450
451 the destination is not assumed to be subordinate to the
452 specified object.
453
454 See also: UNLINK.
455
456 Tim BL
457
458 POST
459
460
461
462
464
465 This method of HTTP creates a new object linked to and subordinate
466 to the specified object. The content of the new object is
467 enclosed as the body of the request.
468
469 The POST method is designed to allow a uniform function to cover
470
471 Annotation of existing documents;
472
473 Posting a message to a bulletin board topic, newsgroup, mailing
474 list, or similar group of articles;
475
476 Adding a file to a directory;
477
478 Extending a document during authorship.
479
480 The client may not assume any postconditions of the method in terms
481 of web topology. For example, if a POST is accepted, the effect may
482 be delayed or overruled by human moderation, batch processing, etc.
483 The client should not be surprised if a link is not immediately
484 (or is never) created.
485
486 However, the semantics are a request for a link to be made from the
487 object whose URL is quoted to the new, enclosed, object.
488
489 If no URL for the NEW object is given by the client, the server is
490 requested to see to the storage of the new object. That is, the
491 server does not have to store it but will have to return a URL by
492 which it can later be retrieved. The semantics of this method
493 (currently) imply nothing of any undertakings by the server to
494 maintain the availability of the object.
495
496 If the client gives a URL, then the server is not obliged to store
497 the object, but may take a copy and may in that case issue a new
498 URL.
499
500 Return object headers
501
502 The method shall return a set (possibly empty) of object headers
503 for the newly posted object. If a URL has been assigned by the
504 server, then that may be included. Similarly, if a URN has been
505 assigned, then that shall be returned. Other things which may be
506 returned include for example the expiry-date if any. The server may
507 return the entire metadata for the object (as in the HEAD command),
508 or a subset of it.
509
510 The object body shall not be returned, so the transaction shall end
511 with the blank line terminating the headers.
512
513 Link type
514
515 The link type may be specified by explicitly giving a (reverse)
516 link in the object header of the linked object. If a link or links
517 between the two objects are present in those headers, then that
518
519
520
522
523 link is used. If no such link is specified, then the server should
524 generate a link. The link type in this case is determined by the
525 server. The server may perform other operations as a result of the
526 new object being added: lists and indexes might be updated, for
527 example.
528
529 Submission
530
531 When articles are submitted, the analogy of being added to a body
532 of knowledge by being linked is close. When a form is submitted,
533 this can be done with POST, though in this case side-effects will
534 be expected.
535
536 (Should submission for action have a different method -- see
537 showmethod -- or should it be just POST for simplicity? When
538 interfacing with other systems such as news and mail, the
539 distinction is not made as the system does not have the ability to
540 distinguish different methods. We now have a possibility of making
541 a separate action, though.)
542
543 Unresolved points
544
545 The client has no way of knowing what data formats the server is
546 prepared to accept. This may not be a serious problem, but it may
547 be if the server has some restrictions. This applies to all
548 submissions of object content from client to server, for all
549 methods.
550
551 Related methods.
552
553 When a new URL has been returned by the server, it may in general
554 (typically, but not necessarily) be usable as the argument of
555 DELETE, GET, PUT, etc, methods.
556
557 To make a link between two existing objects, see LINK.
558
559 Tim BL
560
561 SEARCH
562
563 Some correspondence about SEARCH:
564
565 Q
566
567 How did the client know that the server could search for TeX? Maybe
568 this server can only search for text and GIF patterns, even though it
569 gave out TeX. How can the client know that a GIF search is possible?
570
571 Answer
572
573 I assume the following has happened:
574
575 client requests with ACCEPT: text/x-TeX
576
577
578
580
581 The server, as it should if possible, replies with the doc
582 requested in ``TeX''.
583
584 Therefore a server can obviously do an ``internal representation'' to
585 ``TeX'' conversion.
586
587 Since the server responded in a particular representation (ie TeX),
588 the client should be able to search in ``TeX'' provided the ``SEARCH''
589 method is in the required ALLOWED: section of the response.
590
591 The server should only respond with ALLOWED: SEARCH ... if it can do a
592 TeX to ``internal representation'' conversion for searching.
593
594 Fred Williams fwilliam@ccs.carleton.ca
595
596 SHOWMETHOD
597
598 When an object can support more operations than are defined in this
599 specification, SHOWMETHOD allows a client to understand the
600 interface to that operation sufficiently to allow the user to
601 perform it interactively.
602
603 Required parameter field
604
605 For-Method: This field contains only the method name
606 about which the client is inquiring.
607
608 Preconditions
609
610 The method name specified in the For-Method field must have been
611 previously issued in an "Allowed:" field returned with the given
612 object.
613
614 The client should specify an Accept: field which includes at least
615 one form language if it wants to be able to interpret the result.
616
617 Postcondition
618
619 SHOWMETHOD returns, if possible, a form in a representation
620 acceptable to the client. This form will contain instructions for
621 ordering the operation, and fields for the parameters.
622
623 A suitable language for the form might be HTML2, but any language
624 may be used which is acceptable to the client.
625
626 SPACEJUMP
627
628 This method is similar to the TEXTSEARCH method, but instead of the
629 search criterion being a text string, it is a set of coordinates
630 defining a point within the image. The semantics of the operation
631 are not defined here. Typically, the user clicks on a point within
632 the image with a mouse or other pointing device.
633
634
635
636
638
639 Two or more coordinates are supplied, in the order x, y, z, t. All
640 coordinates are scaled so that 0 represents the bottom left hand
641 point and 1.0 represents the top right hand point.
642
643 The z axis direction follows the normal right-hand rule, that is
644 extends toward the viewer when the x and y axes are flat as in the
645 normal two-dimensional representation.
646
647 In the case of a time-occupying object, 0 represents the starting
648 instant, and 1.0 represents the finishing instant.
649
650 The method is implemented using GET with a derived URL.
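The coordinate scaling can be illustrated for the common case of a mouse click on a displayed image. This is a sketch under assumptions that are not part of the specification: the function name normalise_point is hypothetical, pixel positions are measured from the top left corner of the image (as most window systems do), and the image's full width and height are taken as the extent that maps to 1.0.

```python
def normalise_point(px, py, width, height):
    """Map a pixel position to SPACEJUMP coordinates in the range 0 to 1.0."""
    x = px / width
    # Flip the vertical axis: window systems count down from the top,
    # whereas 0 here represents the bottom left hand point.
    y = 1.0 - py / height
    return x, y
```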
651
652 TEXTSEARCH
653
654 This is a simple form of search. The text is assumed to derive from
655 the requesting user, and is in no special format.
656
657 The exact algorithm to be applied is not defined in this
658 specification, but techniques such as vocabulary proximity matching
659 between the request data portion and the contents or titles of
660 documents, keyword matching, stemming, and the use of a thesaurus
661 are quite appropriate.
662
663 Whilst this method name is given as a flag to specify that the
664 function is available, the search form of the GET method is in fact
665 used to query the object.
666
667 UNLINK
668
669 This method deletes metainformation about an object. The request
670 contains object headers which are to be removed. Only headers
671 exactly matching the headers given are removed.
672
673 Obviously the operation may be used for unlinking objects. It may
674 also be used for removing other metainformation such as object
675 title, expiry date, etc.
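The exact-match rule might be modelled as below. This is a sketch under assumptions: object headers are represented as (name, value) pairs, the function name apply_unlink is hypothetical, and "exactly matching" is taken to mean that both the field name and the value are identical strings.

```python
def apply_unlink(existing, removed):
    """Return the object headers left after an UNLINK request."""
    targets = set(removed)
    # Only headers identical in both name and value are removed.
    return [header for header in existing if header not in targets]
```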
676
677 Related methods
678
679 See LINK.
680
681 Tim BL
682
683 HTTP Request fields
684
685 These header lines are sent by the client in an HTTP protocol
686 transaction. All lines are RFC822 format headers. The list of
687 headers is terminated by an empty line.
688
689 FROM:
690
691 In Internet mail format, this gives the name of the requesting
692
693
694
696
697 user. This field may be used for logging purposes and an insecure
698 form of access protection. The interpretation of this field is
699 that the request is being performed on behalf of the person given,
700 who accepts responsibility for the method performed.
701
702 The Internet mail address in this field does not have to correspond
703 to the internet host which issued the request. (For example, when a
704 request is passed through a gateway, then the original issuer's
705 address should be used).
706
707 The mail address should, if possible, be a valid mail address,
708 whether or not it is in fact an internet mail address or the
709 internet mail representation of an address on some other mail
710 system.
711
712
713
714 ACCEPT:
715
716 This field contains a comma-separated list of representation
717 schemes (Content-Type metainformation values) which will be
718 accepted in the response to this request.
719
720 The set given may of course vary from request to request from the
721 same user.
722
723 This field may be wrapped onto several lines according to RFC822,
724 and also more than one occurrence of the field is allowed with the
725 significance being the same as if all the entries had been in one
726 field. The format of each entry in the list is (/ meaning "or")
727
728 <field> = Accept: <entry> *[ ; <entry> ]
729 <entry> = <content type> *[ , <param> ]
730 <param> = <attr> = <float>
731 <attr> = q / mxs / mxb
732 <float> = <ANSI-C floating point text representation>
733
734 See the appendix on the negotiation algorithm as a function and
735 penalty model.
736
737 If no Accept: field is present, then it is assumed that text/plain
738 and text/html are accepted.
739
740 Example
741
742 Accept: text/plain; text/html
743 Accept: text/x-dvi, q=.8, mxb=100000, mxt=5.0; text/x-c
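A parser for the field value can be sketched directly from the grammar and examples above, in which entries are separated by ";" and the parameters of an entry by ",". The function name parse_accept is hypothetical, and the value passed in is assumed to be the text after "Accept:" with any wrapped lines already joined.

```python
def parse_accept(value):
    """Parse an Accept field value into (content type, params) pairs."""
    entries = []
    for raw_entry in value.split(";"):
        parts = [part.strip() for part in raw_entry.split(",")]
        content_type = parts[0]
        if not content_type:
            continue
        params = {}
        for part in parts[1:]:
            if not part:
                continue
            attr, _, number = part.partition("=")
            # float() accepts the C-style text used here, e.g. ".8".
            params[attr.strip()] = float(number)
        entries.append((content_type, params))
    return entries
```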
745
746 Wildcards
747
748 In order to save time, and also allow clients to receive content
749 types of which they may not be aware, an asterisk "*" may be used
750
751
752
754
755 in place of either the second half of the content-type value, or
756 both halves. This only applies to the Accept: field, and not to
757 the content-type field of course.
758
759 Example
760
761 Accept: */*, q=0.1
762 Accept: audio/*, q=0.2
763 Accept: audio/basic, q=1
764
765 may be interpreted as "if you have basic audio, send it; otherwise
766 send me some other audio, or failing that, just give me what
767 you've got."
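The interpretation above can be reproduced with a small matcher. This is a sketch only: the actual negotiation algorithm is defined in the appendix, and the rule used here (the most specific matching pattern supplies the q value) is an assumption, as are the function names and the spelling "*/*" for a wildcard in both halves.

```python
def specificity(pattern, content_type):
    """2 for an exact match, 1 for type/*, 0 for */*, None for no match."""
    if pattern == "*/*":
        return 0
    p_type, _, p_sub = pattern.partition("/")
    c_type, _, c_sub = content_type.partition("/")
    if p_type != c_type:
        return None
    if p_sub == "*":
        return 1
    return 2 if p_sub == c_sub else None


def quality_for(entries, content_type):
    """q value of the most specific pattern matching content_type."""
    best_rank, best_q = -1, 0.0
    for pattern, q in entries:
        rank = specificity(pattern, content_type)
        if rank is not None and rank > best_rank:
            best_rank, best_q = rank, q
    return best_q


def choose(entries, available):
    """Pick the available content type the client prefers, or None."""
    best = max(available, key=lambda t: quality_for(entries, t), default=None)
    if best is None or quality_for(entries, best) == 0.0:
        return None
    return best
```

Given the three entries of the example, a server holding both text/html and audio/basic would send the basic audio, and one holding only text/html would send that.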
768
769 ACCEPT-ENCODING:
770
771 Similar to Accept, but lists the Content-Encoding types which are
772 acceptable in the response.
773
774 <field> = Accept-Encoding: <entry> *[ , <entry> ]
775 <entry> = <content transfer encoding> *[ , <param> ]
776
777 Example
778
779 Accept-Encoding: x-compress; x-zip
780
781
782 ACCEPT-LANGUAGE:
783
784 Similar to Accept, but lists the Language values which are
785 preferable in the response. A response in an unspecified language
786 is not illegal. See also: Language.
787
788 Language coding TBS. (ISO standard xxxx)
789
790 USER-AGENT:
791
792 This line if present gives the software program used by the
793 original client. This is for statistical purposes and the tracing
794 of protocol violations. It should be included. The first white
795 space delimited word must be the software product name, with an
796 optional slash and version designator. Other products which form
797 part of the user agent may be put as separate words.
798
799 <field> = User-Agent: <product>+
800 <product> = <word> [/<version>]
801 <version> = <word>
802
803 Example:
804
805 User-Agent: LII-Cello/1.0 libwww/2.5
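Splitting the field value into products follows directly from the grammar. A minimal sketch, with the hypothetical name parse_user_agent, taking the text after the field name as input:

```python
def parse_user_agent(value):
    """Return a list of (product, version) pairs; version may be None."""
    products = []
    for word in value.split():
        # A product is a word with an optional slash and version designator.
        name, slash, version = word.partition("/")
        products.append((name, version if slash else None))
    return products
```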
806
807 REFERER:
808
809
810
812
813 This optional header field allows the client to specify, for the
814 server's benefit, the address ( URI ) of the document (or element
815 within the document) from which the URI in the request was
816 obtained.
817
818 This allows a server to generate lists of back-links to documents,
819 for interest, logging, etc. It allows bad links to be traced for
820 maintenance.
821
822 If a partial URI is given, then it should be parsed relative to the
823 URI of the object of the request.
824
825 Example:
826
827 Referer: http://info.cern.ch/hypertext/DataSources/Overview.html
829
830 AUTHORIZATION:
831
832 If this line is present it contains authorization information. The
833 format is To Be Specified (TBS). The format of this field is in
834 extensible form. The first word is a specification of the
835 authorisation system in use.
836
837 Proposals have been as follows: (see the specification for current
838 one implemented by AL Sep 1993)
839
840 User/Password scheme
841
842 Authorization: user fred:mypassword
843
844 The scheme name is "user". The second word is a user name
845 (typically derived from a USER environment variable or prompted
846 for), with an optional password separated by a colon (as in the URL
847 syntax for FTP). Without a password, this provides very low-level
848 security. With the password, it provides low-level security as
849 used by unmodified FTP, Telnet, etc.
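Since the first word names the scheme, a server can separate it from the rest before interpreting the remainder. The sketch below does this for the proposed "user" scheme only; the function names are hypothetical and the field format itself is still to be specified.

```python
def parse_authorization(value):
    """Split the field value into (scheme, scheme-specific remainder)."""
    scheme, _, rest = value.strip().partition(" ")
    return scheme, rest.strip()


def parse_user_credentials(rest):
    """For the "user" scheme: a user name with an optional password."""
    user, colon, password = rest.partition(":")
    return user, (password if colon else None)
```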
850
851 Kerberos
852
853 Authorization: kerberos kerberosauthenticationsparameters
855
856 The format of the kerberosauthenticationsparameters is to be
857 specified.
858
859 CHARGETO:
860
861 This line if present contains account information for the costs of
862 the application of the method requested. The format is TBS. The
863 format of this field must be in extensible form. The first word
864 starts with a specification of the namespace in which the account
865 is. (This is similar to extensible URL definition.) No namespaces
866
867
868
870
871 are currently defined. Namespaces will be registered with the
872 registration authority.
873
874 The format of the rest of the line is a function of the charging
875 system, but it is recommended that this include a maximum cost
876 whose payment is authorized by the client for this transaction, and
877 a cost unit.
878
879 Note: Server tolerance of bad clients
880
881 Whilst it is seen as appropriate for testing parsers to check full
882 conformance to this specification, it is recommended that
883 operational parsers be tolerant of deviations.
884
885 In particular, lines should be regarded as terminated by the Line
886 Feed, and the preceding Carriage Return character ignored.
887
888 Any HTTP Header Field Name which is not recognised should be
889 ignored in operational parsers.
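A tolerant operational parser along these lines might look like the sketch below. The function name parse_headers and the KNOWN set are hypothetical; the set simply lists the request fields described in this memo. Field names are compared without regard to case, as RFC822 allows, and wrapped lines are joined to the field they continue.

```python
# Request field names described in this memo, in lower case.
KNOWN = {"from", "accept", "accept-encoding", "accept-language",
         "user-agent", "referer", "authorization", "chargeto"}


def parse_headers(raw):
    """Return the recognised (name, value) pairs from a header block."""
    fields = []
    for line in raw.split("\n"):          # lines end at the Line Feed
        if line.endswith("\r"):
            line = line[:-1]              # ignore the preceding Carriage Return
        if line == "":
            break                         # an empty line ends the headers
        if line[0] in " \t" and fields:
            # RFC822 continuation line: append to the previous field.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
            continue
        name, colon, value = line.partition(":")
        if not colon:
            continue                      # tolerate a malformed line
        fields.append((name.strip().lower(), value.strip()))
    # Unrecognised field names are ignored.
    return [(n, v) for n, v in fields if n in KNOWN]
```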
890
891 It is recommended that servers use URIs free of "variant"
892 characters whose representation differs in some of the national
893 variant character sets, punctuation characters, and spaces. This
894 will make URIs easier to handle by humans when the need (such as
895 debugging, or transmission through non hypertext systems) arises.
896
897 RESPONSE
898
899 The response from the server shall start with the following syntax
900 (See also: note on client tolerance ):
901
902 <status line> ::= <http version> <status code> <reason line> <CrLf>
904 <http version> ::= 3*<digit>
905 <status code> ::= 3*<digit>
906 <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
907 <reason line> ::= * <printable>
908
909
910 <http version> identifies the HyperText Transfer Protocol
911 version being used by the server. For the
912 version described by this document it
913 is "HTTP/1.0" (without the quotes).
914
915 <status code> gives the coded results of the attempt to
916 understand and satisfy the request. A three
917 digit ASCII decimal number.
918
919 <reason line> gives an explanation for a human reader,
920 except where noted for particular status
921 codes.
922
923 Fields on the status line are delimited by a single blank (parsers
929 should accept any amount of white space). The possible values of
930 the status code are listed below.
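
A tolerant reading of the status line might look like the Python sketch below. It follows the prose (a version token such as "HTTP/1.0", a three digit code, and free text) rather than the literal grammar; the function name parse_status_line is hypothetical.

```python
def parse_status_line(line: str):
    """Split a status line into (version, status code, reason), accepting
    any amount of white space between the fields."""
    parts = line.rstrip("\r\n").split(None, 2)
    if len(parts) < 2:
        raise ValueError("status line needs a version and a status code")
    version, code = parts[0], parts[1]
    reason = parts[2] if len(parts) > 2 else ""   # the reason text may be empty
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("status code must be three ASCII decimal digits")
    return version, int(code), reason
```

For example, parse_status_line("HTTP/1.0 404 Not found") yields the version, the integer 404, and the reason text as three separate values.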
931
932 Response headers
933
934 The headers on returned objects are those RFC822 format headers listed
935 as object headers, as well as any MIME conforming headers, notably
936 the Content-Type field. Note that this specification does not
937 define any headers particular to the response which are not also
938 appropriate to any transmission of an object with a request.
939
940 Response data
941
942 Additional information may follow, in the format of a MIME message
943 body. The significance of the data depends on the status code.
944
945 The Content-Type used for the data may be any Content-Type which
946 the client has expressed his ability to accept, or text/plain, or
947 text/html. That is, one can always assume that the client can
948 handle text/plain and text/html.
949
950 Status codes
951
952 The values of the numeric status code to HTTP requests are as
953 follows. The data sections of Error, Forward and
954 redirection responses may be used to contain human-readable
955 diagnostic information.
956
957 SUCCESS 2XX
958
959 These codes indicate success. The body section if present is the
960 object returned by the request. It is
961 in MIME format, and may only be in text/plain, text/html or one of
962 the formats specified as acceptable in the request.
963
964 OK 200
965
966 The request was fulfilled.
967
968 CREATED 201
969
970 Following a POST command, this indicates success, but the textual
971 part of the response line indicates the URI by which the newly
972 created document should be known.
973
974 Accepted 202
975
976 The request has been accepted for processing, but the processing
977 has not been completed. The request may or may not eventually be
978 acted upon, as it may be disallowed when processing actually takes
979 place. There is no facility for status returns from asynchronous
980 operations such as this.
981
987 Partial Information 203
988
989 When received in the response to a GET command, this indicates that
990 the returned metainformation is not a definitive set of the object
991 from a server with a copy of the object, but is from a private
992 overlaid web. This may include annotation information about the
993 object, for example.
994
995 ERROR 4XX, 5XX
996
997 The 4xx codes are intended for cases in which the client seems to
998 have erred, and the 5xx codes for the cases in which the server is
999 aware that the server has erred. It is impossible to distinguish
1000 these cases in general, so the difference is only informational.
1001
1002 The body section may contain a document describing the error in
1003 human readable form. The document is in MIME format, and may only
1004 be in text/plain, text/html or one of the formats specified as
1005 acceptable in the request.
1006
1007 Bad request 400
1008
1009 The request had bad syntax or was inherently impossible to be
1010 satisfied.
1011
1012 Unauthorized 401
1013
1014 The parameter to this message gives a specification of
1015 authorization schemes which are acceptable. The client should
1016 retry the request with a suitable Authorization header.
1017
1018 PaymentRequired 402
1019
1020 The parameter to this message gives a specification of charging
1021 schemes acceptable. The client may retry the request with a
1022 suitable ChargeTo header.
1023
1024 Forbidden 403
1025
1026 The request is for something forbidden. Authorization will not
1027 help.
1028
1029 Not found 404
1030
1031 The server has not found anything matching the URI given.
1032
1033 Internal Error 500
1034
1035 The server encountered an unexpected condition which prevented it
1036 from fulfilling the request.
1037
1038 Not implemented 501
1039
1045 The server does not support the facility required.
1046
1047 REDIRECTION 3XX
1048
1049 The codes in this section indicate action to be taken (normally
1050 automatically) by the client in order to fulfill the request.
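
The grouping of codes by their first digit can be sketched in Python as follows. The function name status_class and the returned labels are illustrative choices, not terms defined by this specification.

```python
def status_class(code: int) -> str:
    """Map a three digit status code to the class its first digit names."""
    if 200 <= code <= 299:
        return "success"
    if 300 <= code <= 399:
        return "redirection"      # client normally acts automatically
    if 400 <= code <= 499:
        return "client error"     # the client seems to have erred
    if 500 <= code <= 599:
        return "server error"     # the server is aware that it has erred
    return "unknown"
```

A client that does not recognise a particular code can still fall back on the class, for example treating any 3XX response as a request for further action.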
1051
1052 Moved 301
1053
1054 The data requested has been assigned a new URI; the change is
1055 permanent. (N.B. this is an optimisation, which must,
1056 pragmatically, be included in this definition. Browsers with link
1057 editing capability should automatically relink to the new reference,
1058 where possible.)
1059
1060 The response contains one or more header lines of the form
1061
1062 URI: <url> String CrLf
1063
1064 which specify alternative addresses for the object in question.
1065 The String is an optional comment field. If the response is to
1066 indicate a set of variants which each correspond to the requested
1067 URI, then the multipart/alternative wrapping may be used to
1068 distinguish different sets.
1069
1070 Found 302
1071
1072 The data requested actually resides under a different URL; however,
1073 the redirection may be altered on occasion. When making links to
1074 these kinds of document, the browser should default to using the
1075 URI of the redirection document, but have the option of linking to
1076 the final document, as for "Forward".
1077
1078 The response format is the same as for Moved.
1079
1080 Method 303
1081
1082 Method: <method> <url>
1083 body-section
1084
1085 Like the found response, this suggests that the client go try
1086 another network address. In this case, a different method may be
1087 used too, rather than GET.
1088
1089 The body-section contains the parameters to be used for the method.
1090 This allows a document to be a pointer to a complex query
1091 operation.
1092
1093 @@TBS in more detail.
1094
1095 The body may be preceded by the following additional fields as
1096 listed.
1097
1103 Object MetaInformation
1104
1105 The header fields given with or in relation to objects in HTTP are
1106 as follows. All are optional. These headers specify
1107 metainformation: that is, information about the object, not the
1108 information which is contained in the object.
1109
1110 There is no reason for limiting these fields to HTTP use, as any
1111 other system which requires metainformation is encouraged to use
1112 them.
1113
1114 The order of header lines within the HTTP header has no
1115 significance. However, those fields which are not MIME fields
1116 should occur before the MIME fields, so that the MIME fields and
1117 following form a valid MIME document. This is not mandatory.
1118
1119 Any header fields which are not understood should be ignored.
1120
1121 (TBS in more detail)
1122
1123 ALLOWED: *METHOD
1124
1125 Lists the set of requests which the requesting user is allowed to
1126 issue for this URL. If this header line is omitted, the default
1127 allowed methods are "GET HEAD".
1128
1129 Example of use:
1130
1131 Allow: GET HEAD PUT
1132
1133 PUBLIC: *METHOD
1134
1135 As "Allow" but lists those requests which anyone may use. If
1136 omitted, the default is "GET" only.
1137
1138 Example of use:
1139
1140 Public: GET HEAD TEXTSEARCH
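
The defaults for these two fields can be sketched in Python as below. The sketch assumes the field names "Allow" and "Public" as spelled in the examples, and a headers mapping keyed by those names; both helper names are hypothetical.

```python
def allowed_methods(headers: dict) -> set:
    """Methods the requesting user may issue; "GET HEAD" if the field is omitted."""
    return set(headers.get("Allow", "GET HEAD").split())

def public_methods(headers: dict) -> set:
    """Methods anyone may issue; "GET" only if the field is omitted."""
    return set(headers.get("Public", "GET").split())
```

A client could use these sets to decide, for instance, whether to offer an editing command that relies on PUT.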
1141
1142 CONTENT-LENGTH: INT
1143
1144 Implies that the body is binary and should be read directly from
1145 the communications link, without parsing lines, etc. When the
1146 data is part of the request, prevents the escaping and de-escaping
1147 of the termination sequence.
1148
1149 @@@ This should be part of the MIME header, as it applies to any
1150 binary encoded part. Note HTTP is the first internet protocol to
1151 allow MIME "binary" encoding. In MIME, the use of Content-Length is
1152 currently allowed only for external messages.
1153
1154 CONTENT-TYPE:
1155
1161 As defined in MIME, except where noted here.
1162
1163 Extra non-MIME types
1164
1165 It is reasonable to put strict limits on transfer formats for mail,
1166 where there is no guarantee that the receiver will understand a
1167 weird format. However, in HTTP one knows that the receiver will be
1168 able to receive it because it will have been sent in the Accept:
1169 field. There is therefore a lot to be gained from a very complete
1170 registry of well-defined types for HTTP which may nevertheless not
1171 be recommended for mail. In this case, the content-type list for
1172 HTTP may be a superset of the MIME list.
1173
1174 The x- convention for experimental types is of course still
1175 available as well.
1176
1177 Type parameters
1178
1179 Parameters on the content type are extremely useful for describing
1180 resolutions, colour depths, etc. They will allow a client to
1181 specify in the Accept: field the resolution of its device. This may
1182 allow the server to economise greatly on transmission time by
1183 reducing the resolution of an image, for example.
1184
1185 These parameters are to be specified when types are registered.
1186 @@ TBS.
1187
1188 Multipart types
1189
1190 MIME provides for a number of "multipart" types. These are
1191 encapsulations of several body parts in the one message. In HTTP,
1192 Multipart types may be returned on the condition that the client
1193 has indicated acceptability (using Accept:) of the multipart type
1194 and also of the content types of each constituent body part.
1195
1196 The body parts (unlike in MIME) MAY contain HTTP metainformation
1197 header fields which ARE significant.
1198
1199 Multipart/alternative
1200
1201 This is normally used in mail to send different content-type
1202 variants when the receiver's capabilities are not known. This is
1203 not the case with HTTP. Multipart/alternative may however be used to
1204 provide meta information of many instances of an object, in the
1205 case of a redirection response. This allows, for example, pointers
1206 to be returned by a name server to a set of instances of an object.
1207
1208 Multipart/related
1209
1210 This is the type to be used when the first body part contains
1211 references to other parts which the server wishes to send at the
1212 same time. For example, the first part could be an HTML document,
1213 and the included bodyparts could be the inline images mentioned
1219 within the text.
1220
1221 The body parts may have URI: fields if the body parts have URIs,
1222 and so they may be referred to by these URIs in the body-parts. If
1223 the body-parts are transient (as in speech insertions in mail
1224 messages) then the [proposed] "cid:" URI type may be used to refer
1225 to them by content-identifier.
1226
1227 Multipart/mixed
1228
1229 This may be used to simply transfer an unrelated unstructured set
1230 of objects.
1231
1232 Multipart/parallel
1233
1234 This may be used as in MIME to indicate simultaneous presentation.
1235 [It is the author's belief that this is a trivial case of a
1236 compound presentation which in general should be described by a
1237 script which would be the first bodypart of a multipart/related
1238 document].
1239
1240 DATE: DATE
1241
1242 Creation date of object. (or last modified, and separately have a
1243 Created: field?) Format as in RFC850 but GMT MUST BE USED.
1244
1245 EXPIRES: DATE
1246
1247 Gives the date after which the information given ceases to be valid
1248 and should be retrieved again. This allows control of caching
1249 mechanisms, and also allows for the periodic refreshing of displays
1250 of volatile data. Format as for Date:. This does NOT imply that
1251 the original object will cease to exist.
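
A cache could apply the Expires rule along the lines of the Python sketch below. It assumes the RFC850 layout "Weekday, DD-Mon-YY HH:MM:SS GMT" and, as a further assumption of the sketch, maps two digit years to 19YY. The function names and the example date are illustrative.

```python
from datetime import datetime, timezone

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

def parse_rfc850_date(value: str) -> datetime:
    """Parse e.g. "Friday, 05-Nov-93 12:00:00 GMT" into an aware datetime."""
    _weekday, date_part, time_part, zone = value.split()
    if zone != "GMT":
        raise ValueError("GMT must be used")
    day, month, year = date_part.split("-")
    hour, minute, second = (int(x) for x in time_part.split(":"))
    # Assumption of this sketch: two digit years mean 19YY.
    return datetime(1900 + int(year), MONTHS[month], int(day),
                    hour, minute, second, tzinfo=timezone.utc)

def is_expired(expires_value: str, now: datetime) -> bool:
    """True once the information should be retrieved again."""
    return now > parse_rfc850_date(expires_value)
```

Expiry here only marks the cached copy as stale; it says nothing about whether the original object still exists.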
1252
1253 LAST-MODIFIED: DATE
1254
1255 Last time object was modified, i.e. the date of this version if the
1256 document is a "living document". Format as for Date:.
1257
1258 MESSAGE-ID: URI
1259
1260 A unique identifier for the message. As in RFC850, except that the
1261 unlimited lifetime of HTTP objects requires that the Message-ID be
1262 unique in all time, not just in two years.
1263
1264 A document may only have one Message-ID.
1265
1266 No two documents, even if different versions of the same live
1267 document, may have the same Message-id.
1268
1269 Note: Unlike the URI field, this does not give a way of accessing
1270 the document, so the Message-Id cannot be used to refer to the
1271 document. In the case of NNTP articles, the message-id may in fact
1277 be used within the URI for retrieval using NNTP.
1278
1279 URI: 1*URI
1280
1281 This gives a URI with which the object may be found. There is no
1282 guarantee that the object can be retrieved using the URI specified.
1283 However, it is guaranteed that if an object is successfully
1284 retrieved using that URI it will be to a certain given degree the
1285 same object as this one.
1286
1287 If the URI is used to refer to a set of variants, then the
1288 dimensions in which the variants may differ must be given with the
1289 "vary" parameter:
1290
1291 Syntax URI: <uri> [ ; vary = dimension [ , dimension ]* ]
1293 dimension ::= content-type | language | version
1294
1295
1296 If no "vary" parameters are given, then the URI may not return
1297 anything other than the same bit stream as this object.
1298
1299 Multiple occurrences of this field give alternative access names or
1300 addresses for the object.
1301
1302 Examples
1303
1304 URI: http://info.cern.ch/pub/www/doc/url6.multi; vary=content-type
1306
1307 This indicates that retrieval given the URI will return the same
1308 document, never an updated version, but optionally in a different
1309 rendition.
1310
1311 URI: http://info.cern.ch/pub/www/doc/url.multi;
1312 vary=content-type, language, version
1313
1314 This indicates that the URI will return the same document, possibly
1315 in a different rendition, possibly updated, and without excluding
1316 the provision of translations into different languages.

1317 URI: http://info.cern.ch/pub/www/doc/url6.ps vary=content-type

1318 This indicates that accessing the URI in question will return exactly
1319 the same bitstream.
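
Reading the "vary" parameter of a URI field might be sketched in Python as follows. The sketch assumes the URI itself contains no semicolon and that at most one parameter is present; the function names are hypothetical.

```python
def parse_uri_field(value: str):
    """Return (uri, set of vary dimensions) from the value of a URI: field."""
    uri, _, params = value.partition(";")     # assumption: no ";" inside the URI
    dimensions = set()
    name, _, dims = params.partition("=")
    if name.strip().lower() == "vary":
        dimensions = {d.strip() for d in dims.split(",") if d.strip()}
    return uri.strip(), dimensions

def same_bitstream_guaranteed(value: str) -> bool:
    """With no vary dimensions the URI may only return this same bit stream."""
    return not parse_uri_field(value)[1]
```

A cache might use same_bitstream_guaranteed to decide whether a stored copy can stand in for any later retrieval of that URI.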
1320
1321 VERSION:
1322
1323 This is a string defining the version of an evolving object. Its
1324 format is currently undefined, and so it should be treated as
1325 opaque to the reader, defined by the information provider. The
1326 version field is used to indicate evolution along a single path of
1327 a particular work. It is NOT used to indicate derived works (use a
1328 link), translations, or renditions in different representations.
1329
1335 Note: It would be useful to have sufficient semantics to be able
1336 to deduce whether one version predated or postdated another.
1337 However, it may also be useful to be able to insert a particular
1338 local code management system's own version stamp in this field.
1339 Typically, publishers will have quite complex version information
1340 containing hidden local semantics, giving value to the idea of this
1341 field being opaque to other readers of the document.
1342
1343 DERIVED-FROM:
1344
1345 When an edited object is resubmitted using PUT for example, this
1346 field gives the value of the Version. This typically allows a
1347 server to check for example that two concurrent modifications by
1348 different parties will not be lost, and for example to use
1349 established version management techniques to merge both
1350 modifications.
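
The check that Derived-From makes possible can be sketched in Python as below. Because the Version string is opaque, the sketch compares it only for equality. The function name check_put and the three outcome labels are illustrative, and what a server does on a mismatch (reject or merge) is left open.

```python
def check_put(current_version: str, derived_from):
    """Decide how to treat a PUT, given the stored Version and the
    Derived-From value sent with the edited object (None if absent)."""
    if derived_from is None:
        return "unchecked"    # no Derived-From sent; nothing to compare
    if derived_from == current_version:
        return "apply"        # the edit was based on the current version
    return "conflict"         # the object changed after the client fetched it
```

On "conflict" a server with version management could attempt to merge both modifications rather than lose one of them.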
1351
1352 LANGUAGE: CODE
1353
1354 The language code is the ISO code for the language in which the
1355 document is written. If the language is not known, this field
1356 should be omitted, of course.
1357
1358 The language code is an ISO 639 language code with an optional
1359 ISO 3166 country code to specify a national variant.
1360
1361 Example
1362
1363 Language: en_UK
1364
1365 means that the content of the message is in British English, while
1366
1367 Language: en
1368
1369 means that the language is English in one of its forms. (@@ If a
1370 document is in more than one language, for example requires both
1371 Greek, Latin and French to be understood, should this be
1372 representable?)
1373
1374 See also: Accept-Language.
1375
1376 COST: TBS
1377
1378 The cost of retrieving the object is given. This is the cost of
1379 access of a copyright work. Format of units to be specified.
1380 Currently refers to an unspecified charging scheme to be agreed out
1381 of band between parties.
1382
1383 WWW-LINK:
1384
1385 Note. It is proposed that any HTML metainformation element (allowed
1386 within the HEAD as opposed to BODY element of the document) be a
1387 valid candidate for an HTTP object header. WWW-Link is a required
1393 example. The suggestion was that the isomorphism should be realized
1394 by prepending "WWW-" to the HTML element name to make the HTTP
1395 header name, and the HTML attributes imply identically named
1396 semicolon-separated MIME-style header parameters. Other clear
1397 candidates include WWW-Title.
1398
1399 It is open to discussion whether the "WWW-" should be removed.
1400
1401 This is semantically equivalent to the LINK element in an HTML
1402 document which should be consulted for a full explanation.
1403
1404 Examples
1405
1406 WWW-Link: href="http://info.cern.ch/a/b/c"; rel="includes"
1407 WWW-Link: href="mailto:timbl@info.cern.ch"; rev=made
1408
1409 The first example indicates that this object includes the specified
1410 /a/b/c object. The second indicates that the author of the object
1411 is identified by the given mail address.
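
The semicolon-separated parameters of such a header can be read with a short Python sketch like the one below. It assumes that no semicolon occurs inside a quoted value; the function name parse_www_link is hypothetical.

```python
def parse_www_link(value: str) -> dict:
    """Split 'name=value; name=value' into a dict, dropping optional quotes."""
    params = {}
    for part in value.split(";"):       # assumption: no ";" inside quoted values
        name, sep, val = part.partition("=")
        if sep:
            params[name.strip().lower()] = val.strip().strip('"')
    return params
```

Applied to the second example above, this yields an href of "mailto:timbl@info.cern.ch" and a rev of "made".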
1412
1413 Note: Client tolerance of bad servers
1414
1415 Servers not implementing the specification as written are not HTTP
1416 compliant. Servers should always be made completely compliant.
1417 However, clients should also tolerate deviant servers where
1418 possible.
1419
1420 BACK COMPATIBILITY
1421
1422 In order that clients using the HTTP protocol should be able to
1423 communicate with servers using the protocol originally implemented
1424 in the W3 data model, clients should tolerate responses which do
1425 not start with a numeric version number and response codes.
1426
1427 In this case, they should assume that the rest of the response is a
1428 document body in type text/html.
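
A client might make this decision on the first line it reads, as in the Python sketch below. The test for a version token beginning "HTTP/" is an assumption of the sketch, and the function name interpret_response is hypothetical.

```python
def interpret_response(first_line: str):
    """Return ("http", code) when the line looks like a status line,
    or ("legacy", "text/html") when the response is a bare document."""
    parts = first_line.split()
    if (len(parts) >= 2
            and parts[0].startswith("HTTP/")    # assumption: version token shape
            and len(parts[1]) == 3
            and parts[1].isascii() and parts[1].isdigit()):
        return ("http", int(parts[1]))
    return ("legacy", "text/html")
```

In the legacy case the first line is itself part of the document body and must not be discarded.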
1429
1430 WHITE SPACE
1431
1432 Clients should be tolerant in parsing response status lines, in
1433 particular they should accept any sequence of white space (SP and
1434 TAB) characters between fields.
1435
1436 Lines should be regarded as terminated by the Line Feed, and the
1437 preceding Carriage Return character ignored.
1438
1439 HTTP NEGOTIATION ALGORITHM
1440
1441 This note defines the significance of the q, mxb and mxs values
1442 optionally sent in the Accept: field of the HTTP protocol request
1443 message.
1444
1445 It is assumed that there is a certain value of the presentation of
1451 the document, optimally rendered using all the information
1452 available in its original source.
1453
1454 It is further assumed that one can allocate a number between 0 and
1455 1 to represent the loss of value which occurs when a document is
1456 rendered into a representation with loss of information. Whilst
1457 this is a very subjective measurement, and in fact largely a
1458 function of the document in question, the approximation is made
1459 that one can define this "degradation" figure as a function of
1460 merely the representation involved.
1461
1462 The next assumption is that the other cost to the user of viewing
1463 the document is a function of the time taken for presentation. We
1464 first assume that the cost is linear in time, and then assume that
1465 the time is linear in the size of the message.
1466
1467 The final net value to the user can therefore be written
1468
1469 presented_value = initial_value * total-degradation - a - b *
1470 size
1471
1472 for a document in a given incoming representation. Suppose we
1473 normalize the initial value of the document to be 1. The server
1474 may judge that the value in a particular format is less than 1 if a
1475 conversion on the server side has lost information. The total
1476 degradation is then the product of any degradation due to
1477 conversions internal to the server, and the degradation "q" sent in
1478 the Accept field. If q is not sent, it defaults to 1.
1479
1480 The values of a and b have components from processing time on the
1481 server, network delays, and processing time on the client. These
1482 delays are not additive as a good system will pipeline the
1483 processing, and whilst the result may be linear in message size,
1484 calculation of it in advance is not simple. The amount of
1485 pipelining and the loads on machines and network are all difficult
1486 to predict, so a very rough assumption must be made.
1487
1488 We make the client responsible for taking into account network
1489 delays. The client will in fact be in a better position to do this,
1490 as the client will after one transaction be aware of the round-trip
1491 time.
1492
1493 We assume that the delays imposed by the server and by the client
1494 (including network) are additive. We assume that the client's
1495 delay is proportional to message size.
1496
1497 The three parameters given by the client to the server are
1498
1499 q The degradation (quality) factor between 0
1500 and 1. If omitted, 1 is assumed.
1501
1502 mxb The size of message (in bytes) which even if
1503 immediately available from the server will
1509 cause the value to the reader to become zero.
1510
1511 mxs The delay (in seconds) which, even for a very
1512 small message with no length-related penalty,
1513 will cause the value to the reader to become
1514 zero.
1515
1516 These parameters are chosen in part because they are easy to
1517 visualize as the largest tolerable delay and size. If not sent,
1518 they default to infinity.
1519
1520 The server may optimize the presented value for the user when
1521 deciding what to return. The hope is that fine decisions will not
1522 have to be made, as in most cases the results for different formats
1523 will be very different, and there will be a clear winner.
1524
1525 A suitable algorithm is that the assumed value v of a document of
1526 initial value u delivered to the network after a delay t whose
1527 transfer length on the net is b bytes is
1528
1529 v = u * q - b/mxb - t/mxs
1530
1531 Note that t is the time from the arrival of the request to the
1532 first byte being available on the net. [[See also: Design issues
1533 discussions around this point.]]
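
The formula v = u * q - b/mxb - t/mxs can be exercised with the Python sketch below. The variant records, their field names ("type", "size", "delay", "server_degradation") and the function names are hypothetical; choosing the variant with the largest v is one way a server may optimize, not a required procedure.

```python
import math

def presented_value(q, size_bytes, delay_s, mxb=math.inf, mxs=math.inf, u=1.0):
    """v = u * q - b/mxb - t/mxs; mxb and mxs default to infinity."""
    return u * q - size_bytes / mxb - delay_s / mxs

def best_variant(variants, accept):
    """Pick the acceptable variant with the highest presented value."""
    best, best_value = None, -math.inf
    for variant in variants:
        params = accept.get(variant["type"])
        if params is None:
            continue                  # type not listed in the Accept: field
        # Total degradation: the server's own loss times the client's q.
        q = params.get("q", 1.0) * variant.get("server_degradation", 1.0)
        value = presented_value(q, variant["size"], variant["delay"],
                                params.get("mxb", math.inf),
                                params.get("mxs", math.inf))
        if value > best_value:
            best, best_value = variant, value
    return best, best_value
```

With mxb of one million bytes, a 500000 byte PostScript variant at q = 1 scores 0.5, while a 20000 byte HTML variant at q = 0.8 scores 0.78, so the smaller rendition is the clear winner.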
1534
1535 HTTP Negotiation algorithm
1536
1537 This note defines the significance of the d, a and b values
1538 optionally sent in the Accept: field of the HTTP protocol request
1539 message.
1540
1541 It is assumed that there is a certain value of the presentation of
1542 the document, optimally rendered using all the information
1543 available in its original source.
1544
1545 It is further assumed that one can allocate a number between 0 and
1546 1 to represent the loss of value which occurs when a document is
1547 rendered into a representation with loss of information. Whilst
1548 this is a very subjective measurement, and in fact largely a
1549 function of the document in question, the approximation is made
1550 that one can define this "degradation" figure as a function of
1551 merely the representation involved.
1552
1553 The next assumption is that the other cost to the user of
1554 viewing the document is a function of the time taken for
1555 presentation. We first assume that the cost is linear in time, and
1556 then assume that the time is linear in the size of the message.
1557
1558 The final net value to the user can therefore be written
1559
1560 presented_value = initial_value * total-degradation - a - b *
1561 size
1562
1567 for a document in a given incoming representation. Suppose we
1568 normalize the initial value of the document to be 1. The total
1569 degradation is the product of any degradation due to conversions
1570 internal to the server, and the degradation "d" sent in the Accept
1571 field. The values of a and b are also sent by the protocol. If not
1572 sent, they default to no cost (d=1, a=b=0).
1573
1574 The server may optimize the presented value for the user when
1575 deciding what to return. The hope is that fine decisions will not
1576 have to be made, as in most cases the results for different formats
1577 will be very different, and there will be a clear winner.
1578
1579 See also: Design issues discussions around this point.
1580
1581 Tim BL
1582
1583 Note: The cost of retrieval time
1584
1585 The assumption that the cost to the user associated with a certain
1586 retrieval time is linear in that time is wildly inaccurate. The
1587 real function could be very dependent on circumstances (for example,
1588 going to infinity at a deadline).
1589
1590 A better general approximation might be logarithmic for large time
1591 delays, and linear for small ones, like a*log(b*t+1) which has two
1592 parameters.
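
The behaviour described can be checked numerically with the Python sketch below, assuming the two-parameter form a*log(1 + b*t), which is zero at t = 0, close to a*b*t for small t, and grows only logarithmically for large t. The function name delay_cost and the default parameter values are illustrative.

```python
import math

def delay_cost(t, a=1.0, b=1.0):
    """Cost of waiting t seconds, assuming the form a * log(1 + b*t)."""
    return a * math.log1p(b * t)
```

For a = 2 and b = 3, a delay of 0.001 seconds costs very nearly a*b*t = 0.006, whereas with a = b = 1 a delay of 1000 seconds costs less than 7.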
1593
1594 REGISTRATION AUTHORITY
1595
1596 The HTTP Registration Authority is responsible for maintaining
1597 lists of:
1598
1599 Charge account name spaces (see ChargeTo: field above)
1600
1601 Authorization schemes (see Authorization: field above)
1602
1603 Data format names (as MIME Content-Types)
1604
1605 Data encoding names (as MIME Content-Encoding)
1606
1607 It is proposed that the Internet Assigned Numbers Authority or
1608 their successors take this role.
1609
1610 Unregistered values may be used for experimental purposes if they
1611 start with "X-".
1612
1613 SECURITY CONSIDERATIONS
1614
1615 The HTTP protocol allows requests to be communicated to a remote
1616 server machine, and all the expected security considerations for
1617 client-server systems apply, including
1618
1619 Authentication of requests
1620
1625 Authentication of servers
1626
1627 Privacy of request and response
1628
1629 Abuse of server features
1630
1631 Abuse of servers by exploiting server bugs
1632
1633 Unwitting actions on the net
1634
1635 Abuse of log information
1636
1637 The bulk of these are well known problems, tackled in part by some
1638 features of this protocol. Some aspects particular to HTTP are
1639 mentioned below.
1640
1641 Unwitting actions on the net
1642
1643 The writers of client software should be aware that the software
1644 represents the user in his interactions over the net, and should be
1645 careful to allow the user to be aware of any actions he may take
1646 which may be taken as having an unexpected significance by others.
1647
1648
1649 TCP PORT NUMBERS
1650
1651 Clients should prompt a user before allowing HTTP access to
1652 reserved ports other than the port reserved for HTTP (port 80).
1653 Otherwise, the user may unwittingly cause a transaction to occur in
1654 some other (present or future) protocol.
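
The prompting rule can be sketched in Python as follows. The sketch assumes that "reserved" means port numbers below 1024; that limit, the constant names and the function name are assumptions of the sketch rather than statements of this specification.

```python
HTTP_PORT = 80
RESERVED_PORT_LIMIT = 1024    # assumption: "reserved" means ports below 1024

def needs_user_prompt(port: int) -> bool:
    """True when the client should ask the user before connecting."""
    return port < RESERVED_PORT_LIMIT and port != HTTP_PORT
```

Under this assumption a link to port 25 would be confirmed with the user, while links to port 80 or port 8001 would not.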
1655
1656 IDEMPOTENT METHODS
1657
1658 The convention should be established that the GET and HEAD methods
1659 never have the significance of taking an action. The link "click
1660 here to subscribe" causing the reading of a special "magic"
1661 document is open to abuse by others making a link "click here to
1662 see a pretty picture". These methods should be considered "safe"
1663 and should not have side effects. This allows the client software
1664 to represent other methods (such as POST) in a special way, so
1665 that the user is aware of the fact that an action is being
1666 requested.
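
Client software might encode this convention as in the Python sketch below. The names SAFE_METHODS and needs_confirmation are hypothetical, and treating every method other than GET and HEAD as an action is an assumption of the sketch.

```python
SAFE_METHODS = {"GET", "HEAD"}    # methods that should have no side effects

def needs_confirmation(method: str) -> bool:
    """True when following the link requests an action, so the user
    interface should present it differently from plain retrieval."""
    return method.upper() not in SAFE_METHODS
```

A browser could, for example, show a distinct button or warning for any link where needs_confirmation returns True.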
1667
1668 Abuse of log information
1669
1670 A server is in the position to save large amounts of personal data
1671 about information requested by different readers. This information
1672 is clearly confidential in nature, and its handling may be
1673 constrained by law in certain countries. Server providers shall
1674 ensure that such material is not distributed without the
1675 permission of any people or groups of people mentioned in the
1676 results published.
1677
1683 A feature which increases the amount of personal data transferred
1684 is the Referer: field. This allows reading patterns to be studied,
1685 reverse links drawn, and so is very useful. Its power can be
1686 abused of course if user details are not separated from the
1687 Referee-Referer pairs. Even when the personal information has been
1688 removed, the Referer field may in fact be a secure document's URI,
1689 whose revelation itself is a breach of security. A method of
1690 suppressing the Referer information in such cases may be the
1691 subject of further study.
1692
1693 REFERENCES
1694
1695 RFC 822 "Standard for ARPA Internet Text Messages".
1696 David H. Crocker, describes Internet mail
1697 message format.
1698
1699 RFC850 "Standard for Interchange of USENET
1700 Messages" This RFC uses some field names in
1701 common with this specification, and is
1702 relevant reading.
1703
1704 RFC977 "Network News Transfer Protocol", Kantor and
1705 Lapsley.
1706
1707 RFC 1341 Multipurpose Internet Mail Extensions
1708 (MIME), Nathaniel Borenstein and Ned Freed,
1709 Internet RFC 1341, 1992. Now obsoleted by
1710 RFC1521:
1711
1712 RFC 1521 MIME. Not available in hypertext form yet.
1713
1714 RFC 1522 K Moore. "MIME: Part Two: Message Header
1715 Extensions for Non-ASCII Text".
1716
1717 RFC1523 MIME text/enriched.
1718
1719 RFC 1524 MIME...
1720
1721 URL Universal Resource Locators. RFCxxx.
1722 Currently available by anonymous FTP from
1723 info.cern.ch as /pub/ietf/url3.{ps,txt}.
1724
1725 MIME and PEM Internet Draft only
1726
1727 OBJECT CONTENTS
1728
1729 The data (if any) sent with an HTTP request or reply is in a format
1730 and encoding defined by the object header fields, the default being
1731 "text/plain" type with "8bit" encoding. Note that while all the
1732 other information in the request (just as in the reply) is in ISO
1733 Latin1 with lines delimited by Carriage Return/Line Feed pairs, the
1734 data may contain 8-bit binary data.
1735
1741 Termination
1742
1743 The delimiting of the message is determined by the Content-Length:
1744 field. If this is present, then the message contains the specified
1745 number of bytes.
1746
1747 Failing that, the content-type field may contain a "boundary"
1748 attribute which gives the boundary string with exactly the same
1749 syntax as for a MIME multipart message.
1750
1751 Failing either of the above conditions, the data is terminated by
1752 the closing of the connection by the sending party. Note that this
1753 method can not be used for data sent with the request.
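
The order of precedence just described can be sketched in Python as below. The sketch assumes a headers mapping with lower-case field names and does not handle a semicolon inside a quoted boundary; the function name body_termination and the returned labels are hypothetical.

```python
def body_termination(headers: dict, is_request: bool):
    """Return how the end of the message body is to be found."""
    length = headers.get("content-length")
    if length is not None:
        return ("length", int(length))        # read exactly this many bytes
    content_type = headers.get("content-type", "")
    for param in content_type.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "boundary":
            return ("boundary", value.strip().strip('"'))
    if is_request:
        # Closing the connection cannot delimit data sent with a request.
        raise ValueError("request body needs Content-Length or a boundary")
    return ("close", None)                    # read until the sender closes
```

Content-Length is checked first, so it takes priority even when a boundary attribute is also present.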
1754
1755 See also: note on server tolerance for back-compatibility, etc.
1756