HTTP Working Group                               T. Berners-Lee, MIT/LCS
INTERNET-DRAFT                                    R. Fielding, UC Irvine
<draft-ietf-http-v10-spec-03.txt>                    H. Frystyk, MIT/LCS
Expires March 4, 1996                                  September 4, 1995


                Hypertext Transfer Protocol -- HTTP/1.0


Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in
progress".

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

Distribution of this document is unlimited. Please send comments to
the HTTP working group at <http-wg@cuckoo.hpl.hp.com>. Discussions
of the working group are archived at
<URL:http://www.ics.uci.edu/pub/ietf/http/>. General discussions
about HTTP and the applications which use HTTP should take place on
the <www-talk@w3.org> mailing list.

Abstract

The Hypertext Transfer Protocol (HTTP) is an application-level
protocol with the lightness and speed necessary for distributed,
collaborative, hypermedia information systems. It is a generic,
stateless, object-oriented protocol which can be used for many
tasks, such as name servers and distributed object management
systems, through extension of its request methods (commands). A
feature of HTTP is the typing and negotiation of data
representation, allowing systems to be built independently of the
data being transferred.

HTTP has been in use by the World-Wide Web global information
initiative since 1990. This specification reflects common usage of
the protocol referred to as "HTTP/1.0".

Table of Contents

1.  Introduction
    1.1  Purpose
    1.2  Overall Operation
    1.3  Terminology

2.  Notational Conventions and Generic Grammar
    2.1  Augmented BNF
    2.2  Basic Rules

3.  Protocol Parameters
    3.1  HTTP Version
    3.2  Uniform Resource Identifiers
         3.2.1  General Syntax
         3.2.2  http URL
    3.3  Date/Time Formats
    3.4  Character Sets
    3.5  Content Codings
    3.6  Media Types
         3.6.1  Canonicalization and Text Defaults
         3.6.2  Multipart Types
    3.7  Product Tokens

4.  HTTP Message
    4.1  Message Types
    4.2  Message Headers
    4.3  General Message Header Fields

5.  Request
    5.1  Request-Line
    5.2  Method
         5.2.1  GET
         5.2.2  HEAD
         5.2.3  POST
    5.3  Request-URI
    5.4  Request Header Fields

6.  Response
    6.1  Status-Line
    6.2  Status Codes and Reason Phrases
         6.2.1  Informational 1xx
         6.2.2  Successful 2xx
         6.2.3  Redirection 3xx
         6.2.4  Client Error 4xx
         6.2.5  Server Errors 5xx
    6.3  Response Header Fields

7.  Entity
    7.1  Entity Header Fields
    7.2  Entity Body
         7.2.1  Type
         7.2.2  Length

8.  Header Field Definitions
    8.1  Allow
    8.2  Authorization
    8.3  Content-Encoding
    8.4  Content-Length
    8.5  Content-Type
    8.6  Date
    8.7  Expires
    8.8  From
    8.9  If-Modified-Since
    8.10 Last-Modified
    8.11 Location
    8.12 MIME-Version
    8.13 Pragma
    8.14 Referer
    8.15 Server
    8.16 User-Agent
    8.17 WWW-Authenticate

9.  Access Authentication
    9.1  Basic Authentication Scheme

10. Security Considerations
    10.1  Authentication of Clients
    10.2  Safe Methods
    10.3  Abuse of Server Log Information
    10.4  Transfer of Sensitive Information

11. Acknowledgments

12. References

13. Authors' Addresses

Appendix A. Internet Media Type message/http

Appendix B. Tolerant Applications

Appendix C. Relationship to MIME
    C.1  Conversion to Canonical Form
         C.1.1  Representation of Line Breaks
         C.1.2  Default Character Set
    C.2  Conversion of Date Formats
    C.3  Introduction of Content-Encoding
    C.4  No Content-Transfer-Encoding


1. Introduction

1.1 Purpose

The Hypertext Transfer Protocol (HTTP) is an application-level
protocol with the lightness and speed necessary for distributed,
collaborative, hypermedia information systems. HTTP has been in use
by the World-Wide Web global information initiative since 1990.
This specification reflects common usage of the protocol referred
to as "HTTP/1.0". This specification is not intended to become an
Internet standard; rather, it defines those features of the HTTP
protocol that can reasonably be expected of any implementation
which claims to be using HTTP/1.0.

Practical information systems require more functionality than
simple retrieval, including search, front-end update, and
annotation. HTTP/1.0 allows an open-ended set of methods to be used
to indicate the purpose of a request. It builds on the discipline
of reference provided by the Uniform Resource Identifier (URI) [2],
as a location (URL) [4] or name (URN) [16], for indicating the
resource on which a method is to be applied. Messages are passed in
a format similar to that used by Internet Mail [7] and the
Multipurpose Internet Mail Extensions (MIME) [5].

HTTP/1.0 is also used for communication between user agents and
various gateways, allowing hypermedia access to existing Internet
protocols like SMTP [12], NNTP [11], FTP [14], Gopher [1], and
WAIS [8]. HTTP/1.0 is designed to allow such gateways, via proxy
servers, without any loss of the data conveyed by those earlier
protocols.

1.2 Overall Operation

The HTTP protocol is based on a request/response paradigm. A
requesting program (termed a client) establishes a connection with
a receiving program (termed a server) and sends a request to the
server in the form of a request method, URI, and protocol version,
followed by a MIME-like message containing request modifiers,
client information, and possible body content. The server responds
with a status line, including its protocol version and a success or
error code, followed by a MIME-like message containing server
information, entity metainformation, and possible body content. It
should be noted that a given program may be capable of being both a
client and a server; our use of those terms refers only to the role
being performed by the program during a particular connection,
rather than to the program's purpose in general.

On the Internet, the communication generally takes place over a
TCP/IP connection. The default port is TCP 80 [15], but other ports
can be used. This does not preclude the HTTP/1.0 protocol from
being implemented on top of any other protocol on the Internet, or
on other networks. HTTP only presumes a reliable transport; any
protocol that provides such guarantees can be used, and the mapping
of the HTTP/1.0 request and response structures onto the transport
data units of the protocol in question is outside the scope of this
specification.

Current practice requires that the connection be established by the
client prior to each request and closed by the server after sending
the response. Both clients and servers must be capable of handling
cases where either party closes the connection prematurely, due to
user action, automated time-out, or program failure. In any case,
the closing of the connection by either or both parties always
terminates the current request, regardless of its status.
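
The exchange described above can be illustrated with a minimal
Python sketch. It is not part of the specification: the function
names build_request and split_response are hypothetical, the exact
Request-Line, Status-Line, and header syntax is defined in later
sections, and the sketch assumes the whole response has already been
read from the connection (that is, until the server closed it).
Header names are lower-cased on the assumption that they are
case-insensitive.

```python
def build_request(method, uri, headers=None, body=b""):
    """Serialize a request: request line, header lines, blank line, body."""
    lines = [f"{method} {uri} HTTP/1.0"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("iso-8859-1") + body


def split_response(raw):
    """Split a complete response into
    (version, status_code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # Pad so that a missing reason phrase does not break unpacking.
    version, code, reason = (status_line.split(" ", 2) + ["", ""])[:3]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body
```

A client would send the bytes from build_request over a freshly
opened connection, read until end-of-stream, and pass the result to
split_response. Premature closure is not handled here; a real client
would need to treat a truncated response as a failed request.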

1.3 Terminology

This specification uses a number of terms to refer to the roles
played by participants in, and objects of, the HTTP communication.

connection

    A virtual circuit established between two parties for the
    purpose of communication.

message

    A structured sequence of octets transmitted via the connection
    as the basic component of communication.

request

    An HTTP request message (as defined in Section 5).

response

    An HTTP response message (as defined in Section 6).

resource

    A network data object or service which can be identified by a
    URI (Section 3.2).

entity

    A particular representation or rendition of a resource that may
    be enclosed within a request or response message. An entity
    consists of metainformation in the form of entity headers and
    content in the form of an entity body.

client

    A program that establishes connections for the purpose of
    sending requests.

user agent

    The client program which is closest to the user and which
    initiates requests on the user's behalf. These are often
    browsers, editors, spiders (web-traversing robots), or other
    end user tools.

server

    A program that accepts connections in order to service requests
    by sending back responses.

origin server

    The server on which a given resource resides or is to be created.

proxy

    An intermediary program which acts as both a server and a client
    for the purpose of forwarding requests. Proxies are often used
    to act as a portal through a network firewall. A proxy server
    accepts requests from other clients and services them either
    internally or by passing them, with possible translation, on to
    other servers. A caching proxy is a proxy server with a local
    cache of server responses -- some requested resources can be
    serviced from the cache rather than from the origin server. Some
    proxy servers also act as origin servers.

gateway

    A proxy which services HTTP requests by translation into
    protocols other than HTTP. The reply sent from the remote server
    to the gateway is likewise translated into HTTP before being
    forwarded to the user agent.

2. Notational Conventions and Generic Grammar

2.1 Augmented BNF

All of the mechanisms specified in this document are described in
both prose and an augmented Backus-Naur Form (BNF) similar to that
used by RFC 822 [7]. Implementors will need to be familiar with the
notation in order to understand this specification. The augmented
BNF includes the following constructs:

name = definition

    The name of a rule is simply the name itself (without any
    enclosing "<" and ">") and is separated from its definition by
    the equal character "=". Whitespace is only significant in that
    indentation of continuation lines is used to indicate a rule
    definition that spans more than one line. Certain basic rules
    are in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc.
    Angle brackets are used within definitions whenever their
    presence will facilitate discerning the use of rule names.

"literal"

    Quotation marks surround literal text. Unless stated otherwise,
    the text is case-insensitive.

rule1 | rule2

    Elements separated by a bar ("|") are alternatives,
    e.g., "yes | no" will accept yes or no.

(rule1 rule2)

    Elements enclosed in parentheses are treated as a single
    element. Thus, "(elem (foo | bar) elem)" allows the token
    sequences "elem foo elem" and "elem bar elem".

*rule

    The character "*" preceding an element indicates repetition. The
    full form is "<n>*<m>element" indicating at least <n> and at
    most <m> occurrences of element. Default values are 0 and
    infinity so that "*(element)" allows any number, including zero;
    "1*element" requires at least one; and "1*2element" allows one
    or two.

[rule]

    Square brackets enclose optional elements; "[foo bar]" is
    equivalent to "*1(foo bar)".

N rule

    Specific repetition: "<n>(element)" is equivalent to
    "<n>*<n>(element)"; that is, exactly <n> occurrences of
    (element). Thus 2DIGIT is a 2-digit number, and 3ALPHA is a
    string of three alphabetic characters.

#rule

    A construct "#" is defined, similar to "*", for defining lists
    of elements. The full form is "<n>#<m>element" indicating at
    least <n> and at most <m> elements, each separated by one or
    more commas (",") and optional linear whitespace (LWS). This
    makes the usual form of lists very easy; a rule such as
    "( *LWS element *( *LWS "," *LWS element ))" can be shown as
    "1#element". Wherever this construct is used, null elements are
    allowed, but do not contribute to the count of elements present.
    That is, "(element), , (element)" is permitted, but counts as
    only two elements. Therefore, where at least one element is
    required, at least one non-null element must be present. Default
    values are 0 and infinity so that "#(element)" allows any
    number, including zero; "1#element" requires at least one; and
    "1#2element" allows one or two.

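
The counting rule for "#" lists (null elements are permitted but not
counted) can be sketched in Python as follows. This is an
illustration only: parse_list is a hypothetical helper, and it
assumes that no element contains a comma inside a quoted-string,
which a complete parser would have to handle.

```python
def parse_list(field_value, minimum=0, maximum=None):
    """Split a "<n>#<m>element" list on commas, dropping null elements."""
    # Strip the optional linear whitespace around each element.
    elements = [item.strip(" \t\r\n") for item in field_value.split(",")]
    # Null elements are allowed but do not contribute to the count.
    elements = [item for item in elements if item]
    if len(elements) < minimum:
        raise ValueError("too few elements")
    if maximum is not None and len(elements) > maximum:
        raise ValueError("too many elements")
    return elements
```

For example, parse_list("a, , b") yields two elements, while
parse_list(" , ", minimum=1) is rejected because no non-null element
is present.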
; comment

    A semi-colon, set off some distance to the right of rule text,
    starts a comment that continues to the end of line. This is a
    simple way of including useful notes in parallel with the
    specifications.

implied *LWS

    The grammar described by this specification is word-based.
    Except where noted otherwise, zero or more linear whitespace
    (LWS) can be included between any two adjacent words (token or
    quoted-string), and between adjacent tokens and delimiters
    (tspecials), without changing the interpretation of a field.
    However, applications should attempt to follow "common form"
    when generating HTTP constructs, since there exist some
    implementations that fail to accept anything beyond the common
    forms.

2.2 Basic Rules

The following rules are used throughout this specification to
describe basic parsing constructs. The US-ASCII coded character set
is defined by [17].

    OCTET          = <any 8-bit sequence of data>
    CHAR           = <any US-ASCII character (octets 0 - 127)>
    UPALPHA        = <any US-ASCII uppercase letter "A".."Z">
    LOALPHA        = <any US-ASCII lowercase letter "a".."z">
    ALPHA          = UPALPHA | LOALPHA
    DIGIT          = <any US-ASCII digit "0".."9">
    CTL            = <any US-ASCII control character
                     (octets 0 - 31) and DEL (127)>
    CR             = <US-ASCII CR, carriage return (13)>
    LF             = <US-ASCII LF, linefeed (10)>
    SP             = <US-ASCII SP, space (32)>
    HT             = <US-ASCII HT, horizontal-tab (9)>
    <">            = <US-ASCII double-quote mark (34)>

HTTP/1.0 defines the octet sequence CR LF as the end-of-line marker
for all protocol elements except the Entity-Body (see Appendix B
for tolerant applications). The end-of-line marker within an
Entity-Body is defined by its associated media type, as described
in Section 3.6.

    CRLF           = CR LF

HTTP/1.0 headers may be folded onto multiple lines if the
continuation lines begin with linear whitespace characters. All
linear whitespace, including folding, has the same semantics as SP.

    LWS            = [CRLF] 1*( SP | HT )

However, folding of header lines is not expected by some
applications, and should not be generated by HTTP/1.0 applications.
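
A recipient that wishes to tolerate folded headers can unfold them
before further parsing. The Python sketch below is one possible
approach, not a required algorithm: the helper name unfold is
hypothetical, and it assumes the header block uses CRLF line endings
as defined above.

```python
import re

# A fold is a CRLF followed by one or more SP or HT characters.
_FOLD = re.compile(r"\r\n[ \t]+")


def unfold(header_block):
    """Replace each fold with a single SP, since all linear
    whitespace has the same semantics as SP."""
    return _FOLD.sub(" ", header_block)
```

After unfolding, each header field occupies exactly one line, so the
block can be split on CRLF without special cases.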

Many HTTP/1.0 header field values consist of words separated by LWS
or special characters. These special characters must be in a quoted
string to be used within a parameter value.

    word           = token | quoted-string

    token          = 1*<any CHAR except CTLs or tspecials>

    tspecials      = "(" | ")" | "<" | ">" | "@"
                   | "," | ";" | ":" | "\" | <">
                   | "/" | "[" | "]" | "?" | "="
                   | "{" | "}" | SP | HT

Comments may be included in some HTTP header fields by surrounding
the comment text with parentheses. Comments are only allowed in
fields containing "comment" as part of their field value definition.

    comment        = "(" *( ctext | comment ) ")"
    ctext          = <any text excluding "(" and ")">

A string of text is parsed as a single word if it is quoted using
double-quote marks.

    quoted-string  = ( <"> *(qdtext) <"> )

    qdtext         = <any CHAR except <"> and CTLs,
                     but including LWS>

Single-character quoting using the backslash ("\") character is not
permitted in HTTP/1.0.
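
The token and quoted-string rules translate directly into character
tests. The following Python sketch uses hypothetical helper names
(is_token, is_quoted_string, to_word) and assumes that any header
folding has already been removed, so the only linear whitespace
inside a quoted-string is SP or HT.

```python
# The tspecials set from the grammar above, including SP and HT.
TSPECIALS = set('()<>@,;:\\"/[]?={} \t')


def is_token(s):
    """True if s is 1*<any CHAR except CTLs or tspecials>."""
    return bool(s) and all(
        32 < ord(c) < 127 and c not in TSPECIALS for c in s
    )


def is_quoted_string(s):
    """True if s is <"> *(qdtext) <"> with no embedded double-quote."""
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        return False
    return all(
        c == "\t" or (32 <= ord(c) < 127 and c != '"') for c in s[1:-1]
    )


def to_word(value):
    """Return value as a token if possible, else as a quoted-string."""
    if is_token(value):
        return value
    quoted = '"' + value + '"'
    if not is_quoted_string(quoted):
        # Backslash quoting is not permitted, so a value containing
        # a double-quote or control character cannot be represented.
        raise ValueError("value cannot be written as a word")
    return quoted
```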

The text rule is only used for descriptive field contents and
values that are not intended to be interpreted by the message
parser. Words of *text may contain octets from character sets other
than US-ASCII.

    text           = <any OCTET except CTLs,
                     but including LWS>

Recipients of header field text containing octets outside the
US-ASCII character set may assume that they represent ISO-8859-1
characters.

3. Protocol Parameters

3.1 HTTP Version

HTTP uses a "<major>.<minor>" numbering scheme to indicate versions
of the protocol. The protocol versioning policy is intended to
allow the sender to indicate the format of a message and its
capacity for understanding further HTTP communication, rather than
the features obtained via that communication. No change is made to
the version number for the addition of message components which do
not affect communication behavior or which only add to extensible
field values. The <minor> number is incremented when the changes
made to the protocol add features which do not change the general
message parsing algorithm, but which may add to the message
semantics and imply additional capabilities of the sender. The
<major> number is incremented when the format of a message within
the protocol is changed.

The version of an HTTP message is indicated by an HTTP-Version
field in the first line of the message. If the protocol version is
not specified, the recipient must assume that the message is in the
simple HTTP/0.9 format.

    HTTP-Version   = "HTTP" "/" 1*DIGIT "." 1*DIGIT

Note that the major and minor numbers should be treated as separate
integers and that each may be incremented higher than a single
digit. Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in
turn is lower than HTTP/12.3. Leading zeros should be ignored by
recipients and never generated by senders.
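
Because the two numbers are separate integers, versions should be
compared as integer pairs rather than as strings or decimal
fractions. A minimal Python sketch, with parse_version as a
hypothetical helper name:

```python
import re

# "HTTP" is matched case-insensitively, following the BNF convention
# that literal text is case-insensitive unless stated otherwise.
_VERSION = re.compile(r"HTTP/([0-9]+)\.([0-9]+)\Z", re.IGNORECASE)


def parse_version(text):
    """Return (major, minor) as integers; leading zeros are ignored."""
    match = _VERSION.match(text)
    if match is None:
        raise ValueError("not an HTTP-Version: %r" % text)
    return int(match.group(1)), int(match.group(2))
```

Tuples compare element by element, so
parse_version("HTTP/2.4") < parse_version("HTTP/2.13") holds, whereas
a comparison of the strings or of the floats 2.4 and 2.13 would give
the opposite answer.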

This document defines both the 0.9 and 1.0 versions of the HTTP
protocol. Applications sending Full-Request or Full-Response
messages, as defined by this specification, must include an
HTTP-Version of "HTTP/1.0".

HTTP/1.0 servers must:

    o  recognize the format of the Request-Line for HTTP/0.9 and
       HTTP/1.0 requests;

    o  understand any valid request in the format of HTTP/0.9 or
       HTTP/1.0;

    o  respond appropriately with a message in the same protocol
       version used by the client.

HTTP/1.0 clients must:

    o  recognize the format of the Status-Line for HTTP/1.0 responses;

    o  understand any valid response in the format of HTTP/0.9 or
       HTTP/1.0.

Proxies must be careful in forwarding requests that are received in
a format different from that of the proxy's native version. Since
the protocol version indicates the protocol capability of the
sender, a proxy must never send a message with a version indicator
which is greater than its native version; if a higher version
request is received, the proxy must either downgrade the request
version or respond with an error. Requests with a version lower
than that of the proxy's native format may be upgraded by the proxy
before being forwarded; the proxy's response to that request must
follow the normal server requirements.

3.2 Uniform Resource Identifiers

URIs have been known by many names: WWW addresses, Universal
Document Identifiers, Universal Resource Identifiers [2], and
finally the combination of Uniform Resource Locators (URL) [4] and
Names (URN) [16]. As far as HTTP is concerned, Uniform Resource
Identifiers are simply formatted strings which identify--via name,
location, or any other characteristic--a network resource.

3.2.1 General Syntax

URIs in HTTP/1.0 can be represented in absolute form or relative to
some known base URI [9], depending upon the context of their use.
The two forms are differentiated by the fact that absolute URIs
always begin with a scheme name followed by a colon.

    URI            = ( absoluteURI | relativeURI ) [ "#" fragment ]

    absoluteURI    = scheme ":" *( uchar | reserved )

    relativeURI    = net_path | abs_path | rel_path

    net_path       = "//" net_loc [ abs_path ]
    abs_path       = "/" rel_path
    rel_path       = [ path ] [ ";" params ] [ "?" query ]

    path           = fsegment *( "/" segment )
    fsegment       = 1*pchar
    segment        = *pchar

    params         = param *( ";" param )
    param          = *( pchar | "/" )

    scheme         = 1*( ALPHA | DIGIT | "+" | "-" | "." )
    net_loc        = *( pchar | ";" | "?" )
    query          = *( uchar | reserved )
    fragment       = *( uchar | reserved )

    pchar          = uchar | ":" | "@" | "&" | "="
    uchar          = unreserved | escape
    unreserved     = ALPHA | DIGIT | safe | extra | national

    escape         = "%" hex hex
    hex            = "A" | "B" | "C" | "D" | "E" | "F"
                   | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT

    reserved       = ";" | "/" | "?" | ":" | "@" | "&" | "="
    safe           = "$" | "-" | "_" | "." | "+"
    extra          = "!" | "*" | "'" | "(" | ")" | ","
    national       = <any OCTET excluding CTLs, SP,
                     ALPHA, DIGIT, reserved, safe, and extra>

For definitive information on URL syntax and semantics, see
RFC 1738 [4] and RFC 1808 [9]. The BNF above includes national
characters not allowed in valid URLs as specified by RFC 1738,
since HTTP servers are not restricted in the set of unreserved
characters allowed to represent the rel_path part of addresses, and
HTTP proxies may receive requests for URIs not defined by RFC 1738.
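
As an illustration of the escape rule, the Python sketch below
replaces each "%" hex hex triplet with the octet it encodes, which is
the interpretation given by RFC 1738. The helper name unescape is
hypothetical. It works on bytes because national characters may be
arbitrary octets, and it leaves a "%" that is not followed by two hex
digits untouched; that tolerance is a choice made for this sketch,
not something the grammar requires.

```python
import re

_ESCAPE = re.compile(rb"%([0-9A-Fa-f]{2})")


def unescape(data):
    """Decode every "%" hex hex escape in a bytes object."""
    return _ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), data)
```

Decoding should normally be applied to individual components after
the URI has been split, since an escaped reserved character such as
"%2F" must not be confused with a literal "/" delimiter.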

3.2.2 http URL

The "http" scheme is used to locate network resources via the HTTP
protocol. This section defines the scheme-specific syntax and
semantics for http URLs.

    http_URL       = "http:" "//" host [ ":" port ] abs_path

    host           = <FQDN or IP address, as defined in RFC 1738>
    port           = *DIGIT

If the port is empty or not given, port 80 is assumed. The
semantics are that the identified resource is located at the server
listening for TCP connections on that port of that host, and the
Request-URI for the resource is abs_path. If the abs_path is not
present in the URL, it must be given as "/" when used as a
Request-URI.

The canonical form for "http" URLs is obtained by converting any
UPALPHA characters in host to their LOALPHA equivalent (hostnames
are case-insensitive), eliding the [ ":" port ] if the port is 80,
and replacing an empty abs_path with "/".
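
The three canonicalization steps can be sketched in Python as shown
below. The function name canonical_http_url is hypothetical, the
regular expression is a simplification of the grammar (it does not
validate host or abs_path characters), and emitting the scheme in
lowercase is an assumption of this sketch rather than a stated
requirement.

```python
import re

_HTTP_URL = re.compile(
    r"http://([^/:?#]+)(?::([0-9]*))?(/.*)?\Z",
    re.IGNORECASE | re.DOTALL,
)


def canonical_http_url(url):
    """Lowercase the host, drop a default or empty port, and
    replace an empty abs_path with "/"."""
    match = _HTTP_URL.match(url)
    if match is None:
        raise ValueError("not an http URL: %r" % url)
    host = match.group(1).lower()
    port = match.group(2)
    abs_path = match.group(3) or "/"
    if port and int(port) != 80:
        host = host + ":" + port
    return "http://" + host + abs_path
```

Only the host is case-folded; the abs_path is left untouched because
nothing in this section says that paths are case-insensitive.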

3.3 Date/Time Formats

HTTP/1.0 applications have historically allowed three different
formats for the representation of date/time stamps:

    Sun, 06 Nov 1994 08:49:37 GMT    ; RFC 822, updated by RFC 1123
    Sunday, 06-Nov-94 08:49:37 GMT   ; RFC 850, obsoleted by RFC 1036
    Sun Nov  6 08:49:37 1994         ; ANSI C's asctime() format

The first format is preferred as an Internet standard and
represents a fixed-length subset of that defined by RFC 1123 [6]
(an update to RFC 822 [7]). The second format is in common use, but
is based on the obsolete RFC 850 [10] date format and lacks a
four-digit year. HTTP/1.0 clients and servers that parse the date
value should accept all three formats, though they must never
generate the third (asctime) format.

    Note: Recipients of date values are encouraged to be robust
    in accepting date values that may have been generated by
    non-HTTP applications, as is sometimes the case when
    retrieving or posting messages via gateways to SMTP or NNTP.

All HTTP/1.0 date/time stamps must be represented in Universal Time
(UT), also known as Greenwich Mean Time (GMT), without exception.
This is indicated in the first two formats by the inclusion of
"GMT" as the three-letter abbreviation for time zone, and should be
assumed when reading the asctime format.

    HTTP-date      = rfc1123-date | rfc850-date | asctime-date

    rfc1123-date   = wkday "," SP date1 SP time SP "GMT"
    rfc850-date    = weekday "," SP date2 SP time SP "GMT"
    asctime-date   = wkday SP date3 SP time SP 4DIGIT

    date1          = 2DIGIT SP month SP 4DIGIT
                     ; day month year (e.g., 02 Jun 1982)
    date2          = 2DIGIT "-" month "-" 2DIGIT
                     ; day-month-year (e.g., 02-Jun-82)
    date3          = month SP ( 2DIGIT | ( SP 1DIGIT ))
                     ; month day (e.g., Jun  2)

    time           = 2DIGIT ":" 2DIGIT ":" 2DIGIT
                     ; 00:00:00 - 23:59:59

    wkday          = "Mon" | "Tue" | "Wed"
                   | "Thu" | "Fri" | "Sat" | "Sun"

    weekday        = "Monday" | "Tuesday" | "Wednesday"
                   | "Thursday" | "Friday" | "Saturday" | "Sunday"

    month          = "Jan" | "Feb" | "Mar" | "Apr"
                   | "May" | "Jun" | "Jul" | "Aug"
                   | "Sep" | "Oct" | "Nov" | "Dec"

    Note: HTTP/1.0 requirements for the date/time stamp format
    apply only to their usage within the protocol stream.
    Clients and servers are not required to use these formats
    for user presentation, request logging, etc.
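
A recipient can accept all three formats by trying each in turn, and
a sender can always emit the preferred RFC 1123 form. The Python
sketch below uses hypothetical helper names and rests on two
assumptions: that strptime runs in the default "C" locale, so that
English day and month names match, and that the library's own pivot
for two-digit years is acceptable for the RFC 850 form. Returned
values are naive datetime objects understood to be in GMT.

```python
from datetime import datetime

_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",   # RFC 822, updated by RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",   # RFC 850 (two-digit year)
    "%a %b %d %H:%M:%S %Y",        # asctime(); GMT is assumed
)
_WKDAY = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
_MONTH = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def parse_http_date(text):
    """Parse any of the three HTTP-date forms into a datetime."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognized HTTP-date: %r" % text)


def format_http_date(moment):
    """Format a GMT datetime in the preferred rfc1123-date form."""
    return "%s, %02d %s %04d %02d:%02d:%02d GMT" % (
        _WKDAY[moment.weekday()], moment.day, _MONTH[moment.month - 1],
        moment.year, moment.hour, moment.minute, moment.second,
    )
```

The formatter builds the string from fixed name tables instead of
strftime so that its output does not depend on the process locale.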

3.4 Character Sets

HTTP uses the same definition of the term "character set" as that
described for MIME:

    The term "character set" is used in this document to
    refer to a method used with one or more tables to convert
    a sequence of octets into a sequence of characters. Note
    that unconditional conversion in the other direction is
    not required, in that not all characters may be available
    in a given character set and a character set may provide
    more than one sequence of octets to represent a
    particular character. This definition is intended to
    allow various kinds of character encodings, from simple
    single-table mappings such as US-ASCII to complex table
    switching methods such as those that use ISO 2022's
    techniques. However, the definition associated with a
    MIME character set name must fully specify the mapping to
    be performed from octets to characters. In particular,
    use of external profiling information to determine the
    exact mapping is not permitted.

HTTP character sets are identified by case-insensitive tokens. The
complete set of tokens is defined by the IANA Character Set
registry [15]. However, because that registry does not define a
single, consistent token for each character set, we define here the
preferred names for those character sets most likely to be used
with HTTP entities. These character sets include those registered
by RFC 1521 [5] -- the US-ASCII [17] and ISO-8859 [18] character
sets -- and other names specifically recommended for use within MIME
charset parameters.

    charset = "US-ASCII"
            | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3"
            | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6"
            | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9"
            | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR"
            | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8"
            | token

Although HTTP allows an arbitrary token to be used as a charset
value, any token that has a predefined value within the IANA
Character Set registry [15] must represent the character set
defined by that registry. Applications are encouraged, but not
required, to limit their use of character sets to those defined by
the IANA registry.

    Note: This use of the term "character set" is more commonly
    referred to as a "character encoding." However, since HTTP
    and MIME share the same registry, it is important that the
    terminology also be shared.
725 |
|
726 |
3.5 Content Codings |
727 |
|
728 |
Content coding values are used to indicate an encoding |
729 |
transformation that has been or can be applied to a resource. |
730 |
Content codings are primarily used to allow a document to be |
731 |
compressed or encrypted without losing the identity of its |
732 |
underlying media type. Typically, the resource is stored in this |
733 |
encoding and only decoded before rendering or analogous usage. |
734 |
|
735 |
content-coding = "x-gzip" | "x-compress" | token |
736 |
|
737 |
Note: For future compatibility, HTTP/1.0 applications should |
738 |
consider "gzip" and "compress" to be equivalent to "x-gzip" |
739 |
and "x-compress", respectively. |
740 |
|
741 |
All content-coding values are case-insensitive. HTTP/1.0 uses |
742 |
content-coding values in the Content-Encoding (Section 8.3) header |
743 |
field. Although the value describes the content-coding, what is |
744 |
more important is that it indicates what decoding mechanism will be |
745 |
required to remove the encoding. Note that a single program may be |
746 |
capable of decoding multiple content-coding formats. Two values are |
747 |
defined by this specification: |
748 |
|
749 |
x-gzip |
750 |
An encoding format produced by the file compression program |
751 |
"gzip" (GNU zip) developed by Jean-loup Gailly. This format is |
752 |
typically a Lempel-Ziv coding (LZ77) with a 32 bit CRC. Gzip is |
753 |
available from the GNU project at |
754 |
<URL:ftp://prep.ai.mit.edu/pub/gnu/>. |
755 |
|
756 |
x-compress |
757 |
The encoding format produced by the file compression program |
758 |
"compress". This format is an adaptive Lempel-Ziv-Welch coding |
759 |
(LZW). |
760 |
|
761 |
Note: Use of program names for the identification of |
762 |
encoding formats is not desirable and should be discouraged |
763 |
for future encodings. Their use here is representative of |
764 |
historical practice, not good design. |
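
As an illustration of the matching rules in this section, a recipient
might normalize a content-coding value before choosing a decoder. The
following is a minimal sketch in Python, not part of the protocol; the
function name normalize_content_coding is hypothetical.

```python
# Aliases named in the compatibility note of this section; any other
# token passes through unchanged apart from case folding.
_EQUIVALENT_CODINGS = {"gzip": "x-gzip", "compress": "x-compress"}


def normalize_content_coding(value: str) -> str:
    """Return a canonical, lowercase content-coding token."""
    token = value.strip().lower()  # content-coding values are case-insensitive
    return _EQUIVALENT_CODINGS.get(token, token)
```

Mapping both spellings to one key lets a single decoder table serve
current and future senders.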
765 |
|
766 |
3.6 Media Types |
767 |
|
768 |
HTTP uses Internet Media Types [13] (formerly referred to as MIME |
769 |
Content-Types [5]) in order to provide open and extensible data |
770 |
typing and type negotiation. For mail applications, where there is |
771 |
no type negotiation between sender and receiver, it is reasonable |
772 |
to put strict limits on the set of allowed media types. With HTTP, |
773 |
where the sender and recipient can communicate directly, |
774 |
applications are allowed more freedom in the use of non-registered |
775 |
types. The following grammar for media types is a superset of that |
776 |
for MIME because it does not restrict itself to the official IANA |
777 |
and x-token types. |
778 |
|
779 |
media-type = type "/" subtype *( ";" parameter ) |
780 |
type = token |
781 |
subtype = token |
782 |
|
783 |
Parameters may follow the type/subtype in the form of |
784 |
attribute/value pairs. |
785 |
|
786 |
parameter = attribute "=" value |
787 |
attribute = token |
788 |
value = token | quoted-string |
789 |
|
790 |
The type, subtype, and parameter attribute names are |
791 |
case-insensitive. Parameter values may or may not be |
792 |
case-sensitive, depending on the semantics of the parameter name. |
793 |
LWS must not be generated between the type and subtype, nor between |
794 |
an attribute and its value. |
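
The grammar and case rules above can be sketched as a small parser.
This is an illustrative Python sketch under stated assumptions, not a
normative algorithm: the name parse_media_type is hypothetical, a
quoted-string is treated as any run of characters between double
quotes, and optional whitespace is tolerated around ";" only.

```python
import re

# token characters: printable US-ASCII minus the tspecials.
_TOKEN = r"[!#$%&'*+\-.0-9A-Z^_`a-z|~]+"
_TYPE = re.compile(rf"({_TOKEN})/({_TOKEN})")
_PARAM = re.compile(rf'[ \t]*;[ \t]*({_TOKEN})=(?:({_TOKEN})|"([^"]*)")')


def parse_media_type(value):
    """Split a media-type value into (type, subtype, parameters)."""
    text = value.strip()
    head = _TYPE.match(text)
    if head is None:
        raise ValueError(f"not a media type: {value!r}")
    params = {}
    pos = head.end()
    while pos < len(text):
        param = _PARAM.match(text, pos)
        if param is None:
            raise ValueError(f"bad parameter in: {value!r}")
        name, token_value, quoted_value = param.groups()
        # Attribute names are case-insensitive; values are kept as sent
        # because their case sensitivity depends on the parameter.
        params[name.lower()] = (
            token_value if token_value is not None else quoted_value
        )
        pos = param.end()
    # Type and subtype are case-insensitive, so fold them to lowercase.
    return head.group(1).lower(), head.group(2).lower(), params
```

Because no whitespace is accepted around "/" or "=", a value that
violates the LWS rule above is rejected rather than silently repaired.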
795 |
|
796 |
Many current applications do not recognize media type parameters. |
797 |
Since parameters are a fundamental aspect of media types, this must |
798 |
be considered an error in those applications. Nevertheless, |
799 |
HTTP/1.0 applications should only use media type parameters when |
800 |
they are necessary to define the content of a message. |
801 |
|
802 |
If a given media-type value has been registered by the IANA, any |
803 |
use of that value must be indicative of the registered data format. |
804 |
Although HTTP allows the use of non-registered media types, such |
805 |
usage must not conflict with the IANA registry. Data providers are |
806 |
strongly encouraged to register their media types with IANA via the |
807 |
procedures outlined in RFC 1590 [13]. |
808 |
|
809 |
All media types registered by IANA must be preferred over |
810 |
extension tokens. However, HTTP does not limit conforming |
811 |
applications to the use of officially registered media types, nor |
812 |
does it encourage the use of an "x-" prefix for unofficial types |
813 |
outside of explicitly short experimental use between consenting |
814 |
applications. |
815 |
|
816 |
3.6.1 Canonicalization and Text Defaults |
817 |
|
818 |
Media types are registered in a canonical form. In general, entity |
819 |
bodies transferred via HTTP must be represented in the appropriate |
820 |
canonical form prior to transmission. If the body has been encoded |
821 |
via a Content-Encoding, the data must be in canonical form prior to |
822 |
that encoding. However, HTTP modifies the canonical form |
823 |
requirements for media of primary type "text" and for "application" |
824 |
types consisting of text-like records. |
825 |
|
826 |
HTTP redefines the canonical form of text media to allow multiple |
827 |
octet sequences to indicate a text line break. In addition to the |
828 |
preferred form of CRLF, HTTP applications must accept a bare CR or |
829 |
LF alone as representing a single line break in text media. |
830 |
Furthermore, if the text media is represented in a character set |
831 |
which does not use octets 13 and 10 for CR and LF respectively, as |
832 |
is the case for some multi-byte character sets, HTTP allows the use |
833 |
of whatever octet sequence(s) is defined by that character set to |
834 |
represent the equivalent of CRLF, bare CR, and bare LF. It is |
835 |
assumed that any recipient capable of using such a character set |
836 |
will know the appropriate octet sequence for representing line |
837 |
breaks within that character set. |
838 |
|
839 |
Note: This interpretation of line breaks applies only to the |
840 |
contents of an Entity-Body and only after any |
841 |
Content-Encoding has been removed. All other HTTP constructs |
842 |
use CRLF exclusively to indicate a line break. Content |
843 |
codings define their own line break requirements. |
844 |
|
845 |
A recipient of an HTTP text entity should translate the received |
846 |
entity line breaks to the local line break conventions before |
847 |
saving the entity external to the application and its cache; |
848 |
whether this translation takes place immediately upon receipt of |
849 |
the entity, or only when prompted by the user, is entirely up to |
850 |
the individual application. |
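
For character sets that use octets 13 and 10 for CR and LF, the
translation described above might look like the following Python
sketch. The function name to_local_line_breaks is hypothetical, and the
sketch assumes any Content-Encoding has already been removed from the
body.

```python
import os
import re

# CRLF is listed first so that it counts as one line break, not two.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def to_local_line_breaks(body: bytes,
                         local: bytes = os.linesep.encode("ascii")) -> bytes:
    """Rewrite CRLF, bare CR, and bare LF as the local line break."""
    # A callable replacement avoids any escape processing of `local`.
    return _LINE_BREAK.sub(lambda _match: local, body)
```

Character sets that encode line breaks with other octet sequences
would need their own pattern; this sketch does not attempt that.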
851 |
|
852 |
HTTP also redefines the default character set for text media in an |
853 |
entity body. If a textual media type defines a charset parameter |
854 |
with a registered default value of "US-ASCII", HTTP changes the |
855 |
default to be "ISO-8859-1". Since the ISO-8859-1 [18] character set |
856 |
is a superset of US-ASCII [17], this has no effect upon the |
857 |
interpretation of entity bodies which only contain octets within |
858 |
the US-ASCII set (0 - 127). The presence of a charset parameter |
859 |
value in a Content-Type header field overrides the default. |
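
The default described above can be sketched as a lookup on the
Content-Type value. This Python sketch is a simplification under
stated assumptions: the name effective_charset is hypothetical, every
"text" type is assumed to have a registered default of "US-ASCII", and
parameters are split on ";" without regard to quoted strings.

```python
from typing import Optional


def effective_charset(content_type: str) -> Optional[str]:
    """Return the charset to assume for an entity body, if any."""
    media_type, _, rest = content_type.partition(";")
    for param in rest.split(";"):
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "charset":
            # An explicit charset parameter overrides the default.
            return value.strip().strip('"')
    if media_type.strip().lower().startswith("text/"):
        return "ISO-8859-1"  # HTTP default for text media
    return None  # no charset applies to this sketch's non-text types
```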
860 |
|
861 |
It is recommended that the character set of an entity body be |
862 |
labelled as the lowest common denominator of the character codes |
863 |
used within a document, with the exception that no label is |
864 |
preferred over the labels US-ASCII or ISO-8859-1. |
865 |
|
866 |
3.6.2 Multipart Types |
867 |
|
868 |
MIME provides for a number of "multipart" types -- encapsulations of |
869 |
several entities within a single message's Entity-Body. The |
870 |
multipart types registered by IANA [15] do not have any special |
871 |
meaning for HTTP/1.0, though user agents may need to understand |
872 |
each type in order to correctly interpret the purpose of each |
873 |
body-part. Ideally, an HTTP user agent should follow the same or |
874 |
similar behavior as a MIME user agent does upon receipt of a |
875 |
multipart type. |
876 |
|
877 |
As in MIME [5], all multipart types share a common syntax and must |
878 |
include a boundary parameter as part of the media type value. The |
879 |
message body is itself a protocol element and must therefore use |
880 |
only CRLF to represent line breaks between body-parts. Unlike in |
881 |
MIME, multipart body-parts may contain HTTP header fields which are |
882 |
significant to the meaning of that part. |
883 |
|
884 |
3.7 Product Tokens |
885 |
|
886 |
Product tokens are used to allow communicating applications to |
887 |
identify themselves via a simple product token, with an optional |
888 |
slash and version designator. Most fields using product tokens also |
889 |
allow subproducts which form a significant part of the application |
890 |
to be listed, separated by whitespace. By convention, the products |
891 |
are listed in order of their significance for identifying the |
892 |
application. |
893 |
|
894 |
product = token ["/" product-version] |
895 |
product-version = token |
896 |
|
897 |
Examples: |
898 |
|
899 |
User-Agent: CERN-LineMode/2.15 libwww/2.17b3 |
900 |
|
901 |
Server: Apache/0.8.4 |
902 |
|
903 |
Product tokens should be short and to the point -- use of them for |
904 |
advertising or other non-essential information is explicitly |
905 |
forbidden. Although any token character may appear in a |
906 |
product-version, this token should only be used for a version |
907 |
identifier (i.e., successive versions of the same product should |
908 |
only differ in the product-version portion of the product value). |
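
A recipient could split a list of product tokens as in the following
Python sketch. The name parse_products is hypothetical, and the sketch
assumes the field value contains only whitespace-separated product
tokens, with no other constructs mixed in.

```python
def parse_products(field_value):
    """Return (product, version) pairs in order of significance."""
    products = []
    for item in field_value.split():
        name, slash, version = item.partition("/")
        # The version designator is optional; None marks its absence.
        products.append((name, version if slash else None))
    return products
```

Applied to the User-Agent example above, this yields the pair for
CERN-LineMode first and the pair for libwww second, preserving the
order in which the sender listed them.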
909 |
|
910 |
4. HTTP Message |
911 |
|
912 |
4.1 Message Types |
913 |
|
914 |
HTTP messages consist of requests from client to server and |
915 |
responses from server to client. |
916 |
|
917 |
HTTP-message = Simple-Request ; HTTP/0.9 messages |
918 |
| Simple-Response |
919 |
| Full-Request ; HTTP/1.0 messages |
920 |
| Full-Response |
921 |
|
922 |
Full-Request and Full-Response use the generic message format of |
923 |
RFC 822 [7] for transferring entities. Both messages may include |
924 |
optional header fields (a.k.a. "headers") and an entity body. The |
925 |
entity body is separated from the headers by a null line (i.e., a |
926 |
line with nothing preceding the CRLF). |
927 |
|
928 |
Full-Request = Request-Line ; Section 5.1 |
929 |
*( General-Header ; Section 4.3 |
930 |
| Request-Header ; Section 5.4 |
931 |
| Entity-Header ) ; Section 7.1 |
932 |
CRLF |
933 |
[ Entity-Body ] ; Section 7.2 |
934 |
|
935 |
Full-Response = Status-Line ; Section 6.1 |
936 |
*( General-Header ; Section 4.3 |
937 |
| Response-Header ; Section 6.3 |
938 |
| Entity-Header ) ; Section 7.1 |
939 |
CRLF |
940 |
[ Entity-Body ] ; Section 7.2 |
941 |
|
942 |
Simple-Request and Simple-Response do not allow the use of any |
943 |
header information and are limited to a single request method (GET). |
944 |
|
945 |
Simple-Request = "GET" SP Request-URI CRLF |
946 |
|
947 |
Simple-Response = [ Entity-Body ] |
948 |
|
949 |
Use of the Simple-Request format is discouraged because it prevents |
950 |
the server from identifying the media type of the returned entity. |
951 |
|
952 |
4.2 Message Headers |
953 |
|
954 |
HTTP header fields, which include General-Header (Section 4.3), |
955 |
Request-Header (Section 5.4), Response-Header (Section 6.3), and |
956 |
Entity-Header (Section 7.1) fields, follow the same generic format |
957 |
as that given in Section 3.1 of RFC 822 [7]. Each header field |
958 |
consists of a name followed immediately by a colon (":"), a single |
959 |
space (SP) character, and the field value. Field names are |
960 |
case-insensitive. Header fields can be extended over multiple lines |
961 |
by preceding each extra line with at least one LWS, though this is |
962 |
not recommended. |
963 |
|
964 |
HTTP-header = field-name ":" [ field-value ] CRLF |
965 |
|
966 |
field-name = 1*<any CHAR, excluding CTLs, SP, and ":"> |
967 |
field-value = *( field-content | LWS ) |
968 |
|
969 |
field-content = <the OCTETs making up the field-value |
970 |
and consisting of either *text or combinations |
971 |
of token, tspecials, and quoted-string> |
972 |
|
973 |
The order in which header fields are received is not significant. |
974 |
However, it is "good practice" to send General-Header fields first, |
975 |
followed by Request-Header or Response-Header fields prior to the |
976 |
Entity-Header fields. |
977 |
|
978 |
Multiple HTTP-header fields with the same field-name may be present |
979 |
in a message if and only if the entire field-value for that header |
980 |
field is defined as a comma-separated list [i.e., #(values)]. It |
981 |
must be possible to combine the multiple header fields into one |
982 |
"field-name: field-value" pair, without changing the semantics of |
983 |
the message, by appending each subsequent field-value to the first, |
984 |
each separated by a comma. |
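
The header rules of this section (case-insensitive names, continuation
lines, and comma-joining of repeated fields) can be sketched as
follows. This is an illustrative Python sketch; parse_headers is a
hypothetical name, and the sketch joins every repeated field without
checking whether that field is actually defined as a list.

```python
def parse_headers(raw: str) -> dict:
    """Parse a header block into a dict keyed by lowercase field name."""
    unfolded = []
    for line in raw.split("\r\n"):
        if not line:
            break  # the null line ends the headers
        if line[0] in " \t" and unfolded:
            # A line starting with LWS continues the previous field.
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)

    headers = {}
    for line in unfolded:
        name, colon, value = line.partition(":")
        if not colon or not name:
            raise ValueError(f"malformed header line: {line!r}")
        key = name.lower()  # field names are case-insensitive
        value = value.strip()
        # Append each later field-value to the first, separated by a comma.
        headers[key] = f"{headers[key]}, {value}" if key in headers else value
    return headers
```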
985 |
|
986 |
4.3 General Message Header Fields |
987 |
|
988 |
There are a few header fields which have general applicability for |
989 |
both request and response messages, but which do not apply to the |
990 |
communicating parties or the content being transferred. These |
991 |
headers apply only to the message being transmitted. |
992 |
|
993 |
General-Header = Date ; Section 8.6 |
994 |
| MIME-Version ; Section 8.12 |
995 |
| Pragma ; Section 8.13 |
996 |
|
997 |
General header field names can be extended only via a change in the |
998 |
protocol version. Unknown header fields are treated as |
999 |
Entity-Header fields. |
1000 |
|
1001 |
5. Request |
1002 |
|
1003 |
A request message from a client to a server includes, within the |
1004 |
first line of that message, the method to be applied to the |
1005 |
resource requested, the identifier of the resource, and the |
1006 |
protocol version in use. For backwards compatibility with the more |
1007 |
limited HTTP/0.9 protocol, there are two valid formats for an HTTP |
1008 |
request: |
1009 |
|
1010 |
Request = Simple-Request | Full-Request |
1011 |
|
1012 |
Simple-Request = "GET" SP Request-URI CRLF |
1013 |
|
1014 |
Full-Request = Request-Line ; Section 5.1 |
1015 |
*( General-Header ; Section 4.3 |
1016 |
| Request-Header ; Section 5.4 |
1017 |
| Entity-Header ) ; Section 7.1 |
1018 |
CRLF |
1019 |
[ Entity-Body ] ; Section 7.2 |
1020 |
|
1021 |
If an HTTP/1.0 server receives a Simple-Request, it must respond |
1022 |
with an HTTP/0.9 Simple-Response. An HTTP/1.0 client capable of |
1023 |
receiving a Full-Response should never generate a Simple-Request. |
1024 |
|
1025 |
5.1 Request-Line |
1026 |
|
1027 |
The Request-Line begins with a method token, followed by the |
1028 |
Request-URI and the protocol version, and ending with CRLF. The |
1029 |
elements are separated by SP characters. No CR or LF is allowed |
1030 |
except in the final CRLF sequence. |
1031 |
|
1032 |
Request-Line = Method SP Request-URI SP HTTP-Version CRLF |
1033 |
|
1034 |
Note that the difference between a Simple-Request and the |
1035 |
Request-Line of a Full-Request is the presence of the HTTP-Version |
1036 |
field and the availability of methods other than GET. |
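
A server might distinguish the two request formats as in the following
Python sketch. The name parse_request_line and the returned dictionary
layout are hypothetical, and the sketch assumes the first line has
already been read from the connection, including its final CRLF.

```python
import re

_HTTP_VERSION = re.compile(r"HTTP/([0-9]+)\.([0-9]+)\Z")


def parse_request_line(line: str) -> dict:
    """Classify the first request line as a Simple- or Full-Request."""
    if not line.endswith("\r\n"):
        raise ValueError("request line must end with CRLF")
    content = line[:-2]
    if "\r" in content or "\n" in content:
        raise ValueError("CR or LF outside the final CRLF")
    parts = content.split(" ")  # elements are separated by single SP
    if len(parts) == 2 and parts[0] == "GET" and parts[1]:
        # No HTTP-Version field: an HTTP/0.9 Simple-Request.
        return {"kind": "simple", "method": "GET",
                "uri": parts[1], "version": (0, 9)}
    if len(parts) == 3 and parts[0] and parts[1]:
        version = _HTTP_VERSION.match(parts[2])
        if version is not None:
            return {"kind": "full", "method": parts[0], "uri": parts[1],
                    "version": (int(version.group(1)),
                                int(version.group(2)))}
    raise ValueError(f"malformed request line: {line!r}")
```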
1037 |
|
1038 |
5.2 Method |
1039 |
|
1040 |
The Method token indicates the method to be performed on the |
1041 |
resource identified by the Request-URI. The method is |
1042 |
case-sensitive. |
1043 |
|
1044 |
Method = "GET" | "HEAD" | "POST" |
1045 |
| extension-method |
1046 |
|
1047 |
extension-method = token |
1048 |
|
1049 |
The list of methods accepted by a specific resource can change |
1050 |
dynamically; the client is notified through the return code of the |
1051 |
response if a method is not allowed on a resource. Servers should |
1052 |
return the status code 501 (not implemented) if the method is |
1053 |
unknown or not implemented. |
1054 |
|
1055 |
The set of common methods for HTTP/1.0 is described below. Although |
1056 |
this set can be easily expanded, additional methods cannot be |
1057 |
assumed to share the same semantics for separately extended clients |
1058 |
and servers. |
1059 |
|
1060 |
5.2.1 GET |
1061 |
|
1062 |
The GET method means retrieve whatever information (in the form of |
1063 |
an entity) is identified by the Request-URI. If the Request-URI |
1064 |
refers to a data-producing process, it is the produced data which |
1065 |
shall be returned as the entity in the response and not the source |
1066 |
text of the process, unless that text happens to be the output of |
1067 |
the process. |
1068 |
|
1069 |
The semantics of the GET method change to a "conditional GET" if |
1070 |
the request message includes an If-Modified-Since header field. A |
1071 |
conditional GET method requests that the identified resource be |
1072 |
transferred only if it has been modified since the date given by |
1073 |
the If-Modified-Since header, as described in Section 8.9. The |
1074 |
conditional GET method is intended to reduce network usage by |
1075 |
allowing cached entities to be refreshed without requiring multiple |
1076 |
requests or transferring unnecessary data. |
1077 |
|
1078 |
5.2.2 HEAD |
1079 |
|
1080 |
The HEAD method is identical to GET except that the server must not |
1081 |
return any Entity-Body in the response. The metainformation |
1082 |
contained in the HTTP headers in response to a HEAD request should |
1083 |
be identical to the information sent in response to a GET request. |
1084 |
This method can be used for obtaining metainformation about the |
1085 |
resource identified by the Request-URI without transferring the |
1086 |
Entity-Body itself. This method is often used for testing hypertext |
1087 |
links for validity, accessibility, and recent modification. |
1088 |
|
1089 |
There is no "conditional HEAD" request analogous to the conditional |
1090 |
GET. If an If-Modified-Since header field is included with a HEAD |
1091 |
request, it should be ignored. |
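
The interaction between GET, HEAD, and If-Modified-Since can be
sketched as a status selection. This Python sketch is illustrative
only: select_status is a hypothetical name, both dates are assumed to
be already parsed into comparable datetime values, and the further
conditions of Section 8.9 are not modelled.

```python
from datetime import datetime
from typing import Optional


def select_status(method: str,
                  last_modified: datetime,
                  if_modified_since: Optional[datetime] = None) -> int:
    """Return 304 for an unmodified conditional GET, otherwise 200."""
    # Only GET becomes conditional; the header is ignored on HEAD.
    if method == "GET" and if_modified_since is not None:
        if last_modified <= if_modified_since:
            return 304  # not modified since the given date
    return 200
```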
1092 |
|
1093 |
5.2.3 POST |
1094 |
|
1095 |
The POST method is used to request that the destination server |
1096 |
accept the entity enclosed in the request as a new subordinate of |
1097 |
the resource identified by the Request-URI in the Request-Line. |
1098 |
POST is designed to allow a uniform method to cover the following |
1099 |
functions: |
1100 |
|
1101 |
o Annotation of existing resources; |
1102 |
|
1103 |
o Posting a message to a bulletin board, newsgroup, mailing list, |
1104 |
or similar group of articles; |
1105 |
|
1106 |
o Providing a block of data, such as the result of submitting a |
1107 |
form [3], to a data-handling process; |
1108 |
|
1109 |
o Extending a database through an append operation. |
1110 |
|
1111 |
The actual function performed by the POST method is determined by |
1112 |
the server and is usually dependent on the Request-URI. The posted |
1113 |
entity is subordinate to that URI in the same way that a file is |
1114 |
subordinate to a directory containing it, a news article is |
1115 |
subordinate to a newsgroup to which it is posted, or a record is |
1116 |
subordinate to a database. |
1117 |
|
1118 |
A successful POST does not require that the entity be created as a |
1119 |
resource on the origin server or made accessible for future |
1120 |
reference. That is, the action performed by the POST method might |
1121 |
not result in a resource that can be identified by a URI. In this |
1122 |
case, either 200 (ok) or 204 (no content) is the appropriate |
1123 |
response status, depending on whether or not the response includes |
1124 |
an entity that describes the result. |
1125 |
|
1126 |
If a resource has been created on the origin server, the response |
1127 |
should be 201 (created) and contain an entity (preferably of type |
1128 |
"text/html") which describes the status of the request and refers |
1129 |
to the new resource. |
1130 |
|
1131 |
A valid Content-Length is required on all HTTP/1.0 POST requests. |
1132 |
An HTTP/1.0 server should respond with a 400 (bad request) message |
1133 |
if it cannot determine the length of the request message's content. |
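
A server-side check for the Content-Length requirement might look like
the following Python sketch. The name post_content_length and the
(length, status) return convention are hypothetical, and the headers
are assumed to be held in a dict keyed by lowercase field name.

```python
import re


def post_content_length(headers: dict):
    """Return (length, None) if valid, else (None, 400)."""
    raw = headers.get("content-length", "").strip()
    # Accept only a non-empty run of ASCII digits.
    if re.fullmatch(r"[0-9]+", raw) is None:
        return None, 400  # bad request: length cannot be determined
    return int(raw), None
```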
1134 |
|
1135 |
Caching intermediaries must not cache responses to a POST request. |
1136 |
|
1137 |
5.3 Request-URI |
1138 |
|
1139 |
The Request-URI is a Uniform Resource Identifier (Section 3.2) and |
1140 |
identifies the resource upon which to apply the request. |
1141 |
|
1142 |
Request-URI = absoluteURI | abs_path |
1143 |
|
1144 |
The two options for Request-URI are dependent on the nature of the |
1145 |
request. |
1146 |
|
1147 |
The absoluteURI form is only allowed when the request is being made |
1148 |
to a proxy server. The proxy is requested to forward the request |
1149 |
and return the response. If the request is GET or HEAD and a |
1150 |
response is cached, the proxy may use the cached message if it |
1151 |
passes any restrictions in the Expires header field. Note that the |
1152 |
proxy may forward the request on to another proxy or directly to |
1153 |
the origin server specified by the absoluteURI. In order to avoid |
1154 |
request loops, a proxy must be able to recognize all of its server |
1155 |
names, including any aliases, local variations, and the numeric IP |
1156 |
address. An example Request-Line would be: |
1157 |
|
1158 |
GET http://www.w3.org/hypertext/WWW/TheProject.html HTTP/1.0 |
1159 |
|
1160 |
The most common form of Request-URI is that used to identify a |
1161 |
resource on an origin server. In this case, only the absolute path |
1162 |
of the URI is transmitted (see Section 3.2.1, abs_path). For |
1163 |
example, a client wishing to retrieve the resource above directly |
1164 |
from the origin server would create a TCP connection to port 80 of |
1165 |
the host "www.w3.org" and send the line: |
1166 |
|
1167 |
GET /hypertext/WWW/TheProject.html HTTP/1.0 |
1168 |
|
1169 |
followed by the remainder of the Full-Request. Note that the |
1170 |
absolute path cannot be empty; if none is present in the original |
1171 |
URI, it must be given as "/" (the server root). |
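
The two Request-URI forms can be sketched from the client side as
follows. This Python sketch is illustrative; request_line is a
hypothetical name, and it relies on the standard urllib.parse.urlsplit
function to separate the path and query from the rest of the URL.

```python
from urllib.parse import urlsplit


def request_line(url: str, via_proxy: bool = False,
                 method: str = "GET") -> str:
    """Build the Request-Line of a Full-Request for the given URL."""
    if via_proxy:
        target = url  # absoluteURI form, only for requests to a proxy
    else:
        parts = urlsplit(url)
        # The absolute path cannot be empty; use "/" for the server root.
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return f"{method} {target} HTTP/1.0\r\n"
```

With the URL used in the examples above, the default call produces the
abs_path form and via_proxy=True produces the absoluteURI form.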
1172 |
|
1173 |
5.4 Request Header Fields |
1174 |
|
1175 |
The request header fields allow the client to pass additional |
1176 |
information about the request, and about the client itself, to the |
1177 |
server. All header fields are optional and conform to the generic |
1178 |
HTTP-header syntax. |
1179 |
|
1180 |
Request-Header = Authorization ; Section 8.2 |
1181 |
| From ; Section 8.8 |
1182 |
| If-Modified-Since ; Section 8.9 |
1183 |
| Referer ; Section 8.14 |
1184 |
| User-Agent ; Section 8.16 |
1185 |
|
1186 |
Request-Header field names can be extended only via a change in the |
1187 |
protocol version. Unknown header fields are treated as |
1188 |
Entity-Header fields. |
1189 |
|
1190 |
6. Response |
1191 |
|
1192 |
After receiving and interpreting a request message, a server |
1193 |
responds in the form of an HTTP response message. |
1194 |
|
1195 |
Response = Simple-Response | Full-Response |
1196 |
|
1197 |
Simple-Response = [ Entity-Body ] |
1198 |
|
1199 |
Full-Response = Status-Line ; Section 6.1 |
1200 |
*( General-Header ; Section 4.3 |
1201 |
| Response-Header ; Section 6.3 |
1202 |
| Entity-Header ) ; Section 7.1 |
1203 |
CRLF |
1204 |
[ Entity-Body ] ; Section 7.2 |
1205 |
|
1206 |
A Simple-Response should only be sent in response to an HTTP/0.9 |
1207 |
Simple-Request or if the server only supports the more limited |
1208 |
HTTP/0.9 protocol. If a client sends an HTTP/1.0 Full-Request and |
1209 |
receives a response that does not begin with a Status-Line, it |
1210 |
should assume that the response is a Simple-Response and parse it |
1211 |
accordingly. Note that the Simple-Response consists only of the |
1212 |
entity body and is terminated by the server closing the connection. |
1213 |
|
1214 |
6.1 Status-Line |
1215 |
|
1216 |
The first line of a Full-Response message is the Status-Line, |
1217 |
consisting of the protocol version followed by a numeric status |
1218 |
code and its associated textual phrase, with each element separated |
1219 |
by SP characters. No CR or LF is allowed except in the final CRLF |
1220 |
sequence. |
1221 |
|
1222 |
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF |
1223 |
|
1224 |
Since a status line always begins with the protocol version and |
1225 |
status code |
1226 |
|
1227 |
"HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP |
1228 |
|
1229 |
(e.g., "HTTP/1.0 200 "), the presence of that expression is |
1230 |
sufficient to differentiate a Full-Response from a Simple-Response. |
1231 |
Although the Simple-Response format may allow such an expression to |
1232 |
occur at the beginning of an entity body, and thus cause a |
1233 |
misinterpretation of the message if it was given in response to a |
1234 |
Full-Request, most HTTP/0.9 servers are limited to responses of |
1235 |
type "text/html" and therefore would never generate such a response. |
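
The expression above translates directly into a test a client could
apply to the first octets of a response. The following Python sketch
is illustrative; is_full_response is a hypothetical name, and the
caller is assumed to have buffered enough octets to cover the prefix.

```python
import re

# "HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP
_STATUS_PREFIX = re.compile(rb"HTTP/[0-9]+\.[0-9]+ [0-9]{3} ")


def is_full_response(first_octets: bytes) -> bool:
    """True if the response begins like a Status-Line."""
    return _STATUS_PREFIX.match(first_octets) is not None
```

Anything that fails this test would be parsed as a Simple-Response,
that is, as an entity body ended by the server closing the connection.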
1236 |
|
1237 |
6.2 Status Codes and Reason Phrases |
1238 |
|
1239 |
The Status-Code element is a 3-digit integer result code of the |
1240 |
attempt to understand and satisfy the request. The Reason-Phrase is |
1241 |
intended to give a short textual description of the Status-Code. |
1242 |
The Status-Code is intended for use by automata and the |
1243 |
Reason-Phrase is intended for the human user. The client is not |
1244 |
required to examine or display the Reason-Phrase. |
1245 |
|
1246 |
The first digit of the Status-Code defines the class of response. |
1247 |
The last two digits do not have any categorization role. There are |
1248 |
5 values for the first digit: |
1249 |
|
1250 |
o 1xx: Informational - Not used, but reserved for future use |
1251 |
|
1252 |
o 2xx: Success - The action was successfully received, |
1253 |
understood, and accepted. |
1254 |
|
1255 |
o 3xx: Redirection - Further action must be taken in order to |
1256 |
complete the request |
1257 |
|
1258 |
o 4xx: Client Error - The request contains bad syntax or cannot |
1259 |
be fulfilled |
1260 |
|
1261 |
o 5xx: Server Error - The server failed to fulfill an apparently |
1262 |
valid request |
1263 |
|
1264 |
The individual values of the numeric status codes defined for |
1265 |
HTTP/1.0, and an example set of corresponding Reason-Phrases, are |
1266 |
presented below. The reason phrases listed here are only |
1267 |
recommended -- they may be replaced by local equivalents without |
1268 |
affecting the protocol. |
1269 |
|
1270 |
Status-Code = "200" ; OK |
1271 |
| "201" ; Created |
1272 |
| "202" ; Accepted |
1273 |
| "204" ; No Content |
1274 |
| "301" ; Moved Permanently |
1275 |
| "302" ; Moved Temporarily |
1276 |
| "304" ; Not Modified |
1277 |
| "400" ; Bad Request |
1278 |
| "401" ; Unauthorized |
1279 |
| "403" ; Forbidden |
1280 |
| "404" ; Not Found |
1281 |
| "500" ; Internal Server Error |
1282 |
| "501" ; Not Implemented |
1283 |
| "502" ; Bad Gateway |
1284 |
| "503" ; Service Unavailable |
1285 |
| extension-code |
1286 |
|
1287 |
extension-code = 3DIGIT |
1288 |
|
1289 |
Reason-Phrase = *<text, excluding CR, LF> |
1290 |
|
1291 |
HTTP status codes are extensible, but the above codes are the only |
1292 |
ones generally recognized in current practice. HTTP applications |
1293 |
are not required to understand the meaning of all registered status |
1294 |
codes, though such understanding is obviously desirable. However, |
1295 |
applications must understand the class of any status code, as |
1296 |
indicated by the first digit, and treat any unknown response as |
1297 |
being equivalent to the x00 status code of that class. For example, |
1298 |
if an unknown status code of 421 is received by the client, it can |
1299 |
safely assume that there was something wrong with its request and |
1300 |
treat the response as if it had received a 400 status code. In such |
1301 |
cases, user agents are encouraged to present the entity returned |
1302 |
with the response to the user, since that entity is likely to |
1303 |
include human-readable information which will explain the unusual |
1304 |
status. |
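
The fallback rule for unknown status codes can be sketched as follows.
This Python sketch is illustrative; effective_status is a hypothetical
name, and the set of known codes is copied from the Status-Code
grammar above.

```python
KNOWN_STATUS_CODES = {
    200, 201, 202, 204,
    301, 302, 304,
    400, 401, 403, 404,
    500, 501, 502, 503,
}


def effective_status(code: int) -> int:
    """Map an unknown status code to the x00 code of its class."""
    if not 100 <= code <= 599:
        raise ValueError(f"status code outside the five classes: {code}")
    if code in KNOWN_STATUS_CODES:
        return code
    return (code // 100) * 100  # the first digit selects the class
```

For the example given above, an unknown 421 is handled as 400.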
1305 |
|
1306 |
Each Status-Code is described below, including a description of |
1307 |
which method(s) it can follow and any metainformation required in |
1308 |
the response. |
1309 |
|
1310 |
6.2.1 Informational 1xx |
1311 |
|
1312 |
This class of status code indicates a provisional response, |
1313 |
consisting only of the Status-Line and optional headers, and is |
1314 |
terminated by an empty line. HTTP/1.0 does not define any 1xx |
1315 |
status codes and they are not a valid response to an HTTP/1.0 |
1316 |
request. However, they may be useful for experimental applications |
1317 |
which are outside the scope of this specification. |
1318 |
|
1319 |
6.2.2 Successful 2xx |
1320 |
|
1321 |
This class of status code indicates that the client's request was |
1322 |
successfully received, understood, and accepted. |
1323 |
|
1324 |
200 OK |
1325 |
|
1326 |
The request has succeeded. The information returned with the |
1327 |
response is dependent on the method used in the request, as follows: |
1328 |
|
1329 |
GET an entity corresponding to the requested resource is being |
1330 |
sent in the response; |
1331 |
|
1332 |
HEAD the response must only contain the header information and |
1333 |
no Entity-Body; |
1334 |
|
1335 |
POST an entity describing or containing the result of the action. |
1336 |
|
1337 |
201 Created |
1338 |
|
1339 |
The request has been fulfilled and resulted in a new resource being |
1340 |
created. The newly created resource can be referenced by the URI(s) |
1341 |
returned in the entity of the response. The origin server is |
1342 |
encouraged, but not obliged, to actually create the resource before |
1343 |
using this Status-Code. If the action cannot be carried out |
1344 |
immediately, or within a clearly defined timeframe, the server |
1345 |
should respond with 202 (accepted) instead. |
1346 |
|
1347 |
Of the methods defined by this specification, only POST can create |
1348 |
a resource. |
1349 |
|
1350 |
202 Accepted |
1351 |
|
1352 |
The request has been accepted for processing, but the processing |
1353 |
has not been completed. The request may or may not eventually be |
1354 |
acted upon, as it may be disallowed when processing actually takes |
1355 |
place. There is no facility for re-sending a status code from an |
1356 |
asynchronous operation such as this. |
1357 |
|
1358 |
The 202 response is intentionally non-committal. Its purpose is to |
1359 |
allow a server to accept a request for some other process (perhaps |
1360 |
a batch-oriented process that is only run once per day) without |
1361 |
requiring that the user agent's connection to the server persist |
1362 |
until the process is completed. The entity returned with this |
1363 |
response should include an indication of the request's current |
1364 |
status and either a pointer to a status monitor or some estimate of |
1365 |
when the user can expect the request to be fulfilled. |
1366 |
|
1367 |
204 No Content |
1368 |
|
1369 |
The server has fulfilled the request but there is no new |
1370 |
information to send back. If the client is a user agent, it should |
1371 |
not change its document view from that which caused the request to |
1372 |
be generated. This response is primarily intended to allow input |
1373 |
for scripts or other actions to take place without causing a change |
1374 |
to the user agent's active document view. The response may include |
1375 |
new metainformation in the form of entity headers, which should |
1376 |
apply to the document currently in the user agent's active view. |
1377 |
|
1378 |
6.2.3 Redirection 3xx |
1379 |
|
1380 |
This class of status code indicates that further action needs to be |
1381 |
taken by the user agent in order to fulfill the request. The action |
1382 |
required can sometimes be carried out by the user agent without |
1383 |
interaction with the user, but it is strongly recommended that this |
1384 |
only take place if the method used in the request is GET or HEAD. A |
1385 |
user agent should never automatically redirect a request more than |
1386 |
5 times, since such redirections usually indicate an infinite loop. |
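
The recommendations above can be sketched as a user agent policy
check. This Python sketch is illustrative and covers only the 301 and
302 codes; the names MAX_REDIRECTS and may_redirect_automatically are
hypothetical.

```python
MAX_REDIRECTS = 5  # more than this usually indicates an infinite loop


def may_redirect_automatically(method: str, status: int,
                               redirects_so_far: int) -> bool:
    """Decide whether to follow a redirect without asking the user."""
    return (
        status in (301, 302)
        and method in ("GET", "HEAD")  # other methods need confirmation
        and redirects_so_far < MAX_REDIRECTS
    )
```

A user agent would call this before each hop, passing the number of
redirections it has already followed for the current request.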
1387 |
|
1388 |
300 Multiple Choices |
1389 |
|
1390 |
This response code is not directly used by HTTP/1.0 applications, |
1391 |
but serves as the default for interpreting the 3xx class of |
1392 |
responses. |
1393 |
|
1394 |
The requested resource is available at one or more locations. |
1395 |
Unless it was a HEAD request, the response should include an entity |
1396 |
containing a list of resource characteristics and locations from |
1397 |
which the user or user agent can choose the one most appropriate. |
1398 |
If the server has a preferred choice, it should include the URL in |
1399 |
a Location field; user agents may use the Location value for |
1400 |
automatic redirection. |
1401 |
|
1402 |
301 Moved Permanently |
1403 |
|
1404 |
The requested resource has been assigned a new permanent URL and |
1405 |
any future references to this resource should be done using that |
1406 |
URL. Clients with link editing capabilities are encouraged to |
1407 |
automatically relink references to the Request-URI to the new |
1408 |
reference returned by the server, where possible. |
1409 |
|
1410 |
The new URL must be given by the Location field in the response. |
1411 |
Unless it was a HEAD request, the Entity-Body of the response |
1412 |
should contain a short note with a hyperlink to the new URL. |
1413 |
|
1414 |
If the 301 status code is received in response to a request using |
1415 |
the POST method, the user agent must not automatically redirect the |
1416 |
request unless it can be confirmed by the user, since this might |
1417 |
change the conditions under which the request was issued. |
1418 |
|
1419 |
302 Moved Temporarily |
1420 |
|
1421 |
The requested resource resides temporarily under a different URL. |
1422 |
Since the redirection may be altered on occasion, the client should |
1423 |
continue to use the Request-URI for future requests. |
1424 |
|
1425 |
The URL must be given by the Location field in the response. Unless |
1426 |
it was a HEAD request, the Entity-Body of the response should |
1427 |
contain a short note with a hyperlink to the new URI(s). |
1428 |
|
1429 |
If the 302 status code is received in response to a request using |
1430 |
the POST method, the user agent must not automatically redirect the |
1431 |
request unless it can be confirmed by the user, since this might |
1432 |
change the conditions under which the request was issued. |
1433 |
|
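The difference between 301 and 302 lies in which URL the client keeps
for later use. The sketch below illustrates that choice; the function
name and parameters are hypothetical and only serve the example.

```python
def url_for_future_requests(status, request_uri, location):
    """Pick the URL a client should remember after a redirect.

    status      -- status code of the redirect response (301 or 302)
    request_uri -- URL that was originally requested
    location    -- URL given by the Location field of the response
    """
    if status == 301:
        # Moved Permanently: future references should use the new URL.
        return location
    # Moved Temporarily (302): keep using the original Request-URI.
    return request_uri
```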
1434 |
304 Not Modified |
1435 |
|
1436 |
If the client has performed a conditional GET request and access is |
1437 |
allowed, but the document has not been modified since the date and |
1438 |
time specified in the If-Modified-Since field, the server shall |
1439 |
respond with this status code and not send an Entity-Body to the |
1440 |
client. Header fields contained in the response should only include |
1441 |
information which is relevant to cache managers and which may have |
1442 |
changed independently of the entity's Last-Modified date. Examples |
1443 |
of relevant header fields include: Date, Server, and Expires. |
1444 |
|
1445 |
6.2.4 Client Error 4xx |
1446 |
|
1447 |
The 4xx class of status code is intended for cases in which the |
1448 |
client seems to have erred. If the client has not completed the |
1449 |
request when a 4xx code is received, it should immediately cease |
1450 |
sending data to the server. Except when responding to a HEAD |
1451 |
request, the server is encouraged to include an entity containing |
1452 |
an explanation of the error situation, and whether it is a |
1453 |
temporary or permanent condition. These status codes are applicable |
1454 |
to any request method. |
1455 |
|
1456 |
Note: If the client is sending data, server implementations |
1457 |
on TCP should be careful to ensure that the client |
1458 |
acknowledges receipt of the packet(s) containing the |
1459 |
response prior to closing the input connection. If the |
1460 |
client continues sending data to the server after the close, |
1461 |
the server's controller will send a reset packet to the |
1462 |
client, which may erase the client's unacknowledged input |
1463 |
buffers before they can be read and interpreted by the HTTP |
1464 |
application. |
1465 |
|
1466 |
400 Bad Request |
1467 |
|
1468 |
The request could not be understood by the server due to malformed |
1469 |
syntax. The client is discouraged from repeating the request |
1470 |
without modifications. |
1471 |
|
1472 |
401 Unauthorized |
1473 |
|
1474 |
The request requires user authentication. The response must include |
1475 |
a WWW-Authenticate header field (Section 8.17) containing a |
1476 |
challenge applicable to the requested resource. The client may |
1477 |
repeat the request with a suitable Authorization header field. If |
1478 |
the request already included Authorization credentials, then the |
1479 |
401 response indicates that authorization has been refused for |
1480 |
those credentials. If the 401 response contains the same challenge |
1481 |
as the prior response, and the user agent has already attempted |
1482 |
authentication at least once, then the user should be presented the |
1483 |
entity that was given in the response, since that entity may |
1484 |
include relevant diagnostic information. HTTP access authentication |
1485 |
is explained in Section 9. |
1486 |
|
1487 |
403 Forbidden |
1488 |
|
1489 |
The server understood the request, but is refusing to perform the |
1490 |
request for an unspecified reason. Authorization will not help and |
1491 |
the request should not be repeated. This status code can be used if |
1492 |
the server does not want to make public why the request has not |
1493 |
been fulfilled. |
1494 |
|
1495 |
404 Not Found |
1496 |
|
1497 |
The server has not found anything matching the Request-URI. No |
1498 |
indication is given of whether the condition is temporary or |
1499 |
permanent. If the server does not wish to make this information |
1500 |
available to the client, the status code 403 (forbidden) can be |
1501 |
used instead. |
1502 |
|
1503 |
6.2.5 Server Errors 5xx |
1504 |
|
1505 |
Response status codes beginning with the digit "5" indicate cases |
1506 |
in which the server is aware that it has erred or is incapable of |
1507 |
performing the request. If the client has not completed the request |
1508 |
when a 5xx code is received, it should immediately cease sending |
1509 |
data to the server. Except when responding to a HEAD request, the |
1510 |
server is encouraged to include an entity containing an explanation |
1511 |
of the error situation, and whether it is a temporary or permanent |
1512 |
condition. These response codes are applicable to any request |
1513 |
method and there are no required header fields. |
1514 |
|
1515 |
500 Internal Server Error |
1516 |
|
1517 |
The server encountered an unexpected condition which prevented it |
1518 |
from fulfilling the request. |
1519 |
|
1520 |
501 Not Implemented |
1521 |
|
1522 |
The server does not support the functionality required to fulfill |
1523 |
the request. This is the appropriate response when the server does |
1524 |
not recognize the request method and is not capable of supporting |
1525 |
it for any resource. |
1526 |
|
1527 |
502 Bad Gateway |
1528 |
|
1529 |
The server received an invalid response from the gateway or |
1530 |
upstream server it accessed in attempting to fulfill the request. |
1531 |
|
1532 |
503 Service Unavailable |
1533 |
|
1534 |
The server is currently unable to handle the request due to a |
1535 |
temporary overloading or maintenance of the server. The implication |
1536 |
is that this is a temporary condition which will be alleviated |
1537 |
after some delay. |
1538 |
|
1539 |
Note: The existence of the 503 status code does not imply |
1540 |
that a server must use it when becoming overloaded. Some |
1541 |
servers may wish to simply refuse the connection. |
1542 |
|
1543 |
6.3 Response Header Fields |
1544 |
|
1545 |
The response header fields allow the server to pass additional |
1546 |
information about the response which cannot be placed in the |
1547 |
Status-Line. These header fields are not intended to give |
1548 |
information about an Entity-Body returned in the response, but |
1549 |
about the server itself. |
1550 |
|
1551 |
Response-Header = Location ; Section 8.11 |
1552 |
| Server ; Section 8.15 |
1553 |
| WWW-Authenticate ; Section 8.17 |
1554 |
|
1555 |
Response-Header field names can be extended only via a change in |
1556 |
the protocol version. Unknown header fields are treated as |
1557 |
Entity-Header fields. |
1558 |
|
1559 |
7. Entity |
1560 |
|
1561 |
Full-Request and Full-Response messages may transfer an entity |
1562 |
within some requests and responses. An entity consists of |
1563 |
Entity-Header fields and (usually) an Entity-Body. In this section, |
1564 |
both sender and recipient refer to either the client or the server, |
1565 |
depending on who sends and who receives the entity. |
1566 |
|
1567 |
7.1 Entity Header Fields |
1568 |
|
1569 |
Entity-Header fields define optional metainformation about the |
1570 |
Entity-Body or, if no body is present, about the resource |
1571 |
identified by the request. |
1572 |
|
1573 |
Entity-Header = Allow ; Section 8.1 |
1574 |
| Content-Encoding ; Section 8.3 |
1575 |
| Content-Length ; Section 8.4 |
1576 |
| Content-Type ; Section 8.5 |
1577 |
| Expires ; Section 8.7 |
1578 |
| Last-Modified ; Section 8.10 |
1579 |
| extension-header |
1580 |
|
1581 |
extension-header = HTTP-header |
1582 |
|
1583 |
The extension-header mechanism allows additional Entity-Header fields to |
1584 |
be defined without changing the protocol, but these fields cannot |
1585 |
be assumed to be recognizable by the recipient. Unknown header |
1586 |
fields should be ignored by the recipient and forwarded by proxies. |
1587 |
|
1588 |
7.2 Entity Body |
1589 |
|
1590 |
The entity-body (if any) sent with an HTTP/1.0 request or response |
1591 |
is in a format and encoding defined by the Entity-Header fields. |
1592 |
|
1593 |
Entity-Body = *OCTET |
1594 |
|
1595 |
An entity-body is included with a request message only when the |
1596 |
request method calls for one. This specification defines one |
1597 |
request method, POST, that allows an entity-body. In general, the |
1598 |
presence of an entity-body in a request is signaled by the |
1599 |
inclusion of a Content-Length header field in the request message |
1600 |
headers. HTTP/1.0 requests containing content must include a valid |
1601 |
Content-Length header field. |
1602 |
|
1603 |
For response messages, whether or not an entity-body is included |
1604 |
with a message is dependent on both the request method and the |
1605 |
response code. All responses to the HEAD request method must not |
1606 |
include a body, even though the presence of content header fields |
1607 |
may lead one to believe they do. The responses 204 (no content) and |
1608 |
304 (not modified) must not include a message body. |
1609 |
|
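The rules for response messages can be summarized in a small
predicate. This is a sketch under the assumption that the caller
knows the request method and the response status code; the function
name is illustrative.

```python
def response_may_have_body(request_method, status_code):
    """Return True if a response may carry an Entity-Body."""
    if request_method == "HEAD":
        # Responses to HEAD never include a body, whatever the headers say.
        return False
    # 204 (no content) and 304 (not modified) must not include a body.
    return status_code not in (204, 304)
```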
1610 |
7.2.1 Type |
1611 |
|
1612 |
When an Entity-Body is included with a message, the data type of |
1613 |
that body is determined via the header fields Content-Type and |
1614 |
Content-Encoding. These define a two-layer, ordered encoding model: |
1615 |
|
1616 |
entity-body := Content-Encoding( Content-Type( data ) ) |
1617 |
|
1618 |
A Content-Type specifies the media type of the underlying data. A |
1619 |
Content-Encoding may be used to indicate any additional content |
1620 |
coding applied to the type, usually for the purpose of data |
1621 |
compression, that is a property of the resource requested. The |
1622 |
default for the content encoding is none (i.e., the identity |
1623 |
function). |
1624 |
|
1625 |
The Content-Type header field has no default value. If and only if |
1626 |
the media type is not given by a Content-Type header, as is always |
1627 |
the case for Simple-Response messages, the receiver may attempt to |
1628 |
guess the media type via inspection of its content and/or the name |
1629 |
extension(s) of the URL used to specify the resource. If the media |
1630 |
type remains unknown, the receiver should treat it as type |
1631 |
"application/octet-stream". |
1632 |
|
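One possible way for a receiver to apply this fallback is sketched
below. It guesses only from the name extension of the URL, using the
extension table of Python's standard mimetypes module; inspection of
the content itself is left out. The function name is an assumption
made for this example.

```python
import mimetypes


def effective_media_type(content_type, url):
    """Return the media type a receiver should assume for an entity.

    content_type -- value of the Content-Type header, or None if absent
    url          -- URL that was used to request the resource
    """
    if content_type is not None:
        # Guessing is allowed if and only if no Content-Type was given.
        return content_type
    guessed, _encoding = mimetypes.guess_type(url)
    # If the type remains unknown, treat it as arbitrary binary data.
    return guessed or "application/octet-stream"
```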
1633 |
7.2.2 Length |
1634 |
|
1635 |
When an Entity-Body is included with a message, the length of that |
1636 |
body may be determined in one of several ways. If a Content-Length |
1637 |
header field is present, its value in bytes represents the length |
1638 |
of the Entity-Body. Otherwise, the body length is determined by the |
1639 |
closing of the connection by the server. |
1640 |
|
1641 |
Closing the connection cannot be used to indicate the end of a |
1642 |
request body, since it leaves no possibility for the server to send |
1643 |
back a response. Therefore, HTTP/1.0 requests containing content |
1644 |
must include a valid Content-Length header field. If a request |
1645 |
contains an entity body and Content-Length is not specified, and |
1646 |
the server does not recognize or cannot calculate the length from |
1647 |
other fields, then the server should send a 400 (bad request) |
1648 |
response. |
1649 |
|
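The two cases above might be handled as in the following sketch. It
assumes that headers are held in a dictionary keyed by the exact
field name and that the connection is exposed as a binary file-like
object; both function names are illustrative. The first function
trusts the Content-Length value, which a cautious client may not
want to do.

```python
import io
import re


def read_response_body(headers, stream):
    """Read a response Entity-Body from a binary stream."""
    length = headers.get("Content-Length")
    if length is not None:
        # Content-Length gives the body length in bytes.
        return stream.read(int(length))
    # Otherwise the body ends when the server closes the connection.
    return stream.read()


def check_request_length(method, headers):
    """Return 400 for a POST without a usable Content-Length, else None."""
    value = headers.get("Content-Length", "").strip()
    if method == "POST" and re.fullmatch(r"[0-9]+", value) is None:
        return 400  # bad request: the server cannot find the end of the body
    return None
```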
1650 |
Note: Some older servers supply an invalid Content-Length |
1651 |
when sending a document that contains server-side includes |
1652 |
dynamically inserted into the data stream. It must be |
1653 |
emphasized that this will not be tolerated by future |
1654 |
versions of HTTP. Unless the client knows that it is |
1655 |
receiving a response from a compliant server, it should not |
1656 |
depend on the Content-Length value being correct. |
1657 |
|
1658 |
8. Header Field Definitions |
1659 |
|
1660 |
This section defines the syntax and semantics of all standard |
1661 |
HTTP/1.0 header fields. For Entity-Header fields, both sender and |
1662 |
recipient refer to either the client or the server, depending on |
1663 |
who sends and who receives the entity. |
1664 |
|
1665 |
8.1 Allow |
1666 |
|
1667 |
The Allow header field lists the set of methods supported by the |
1668 |
resource identified by the Request-URI. The purpose of this field |
1669 |
is strictly to inform the recipient of valid methods associated |
1670 |
with the resource. The Allow header field is not permitted in a |
1671 |
request using the POST method, and thus should be ignored if it is |
1672 |
received as part of a POST entity. |
1673 |
|
1674 |
Allow = "Allow" ":" 1#method |
1675 |
|
1676 |
Example of use: |
1677 |
|
1678 |
Allow: GET, HEAD |
1679 |
|
1680 |
This field cannot prevent a client from trying other methods. |
1681 |
However, the indications given by the Allow field value should be |
1682 |
followed. This field has no default value; if left undefined, the |
1683 |
set of allowed methods is defined by the origin server at the time |
1684 |
of each request. |
1685 |
|
1686 |
A proxy must not modify the Allow header even if it does not |
1687 |
understand all the methods specified, since the user agent may have |
1688 |
other means of communicating with the origin server. |
1689 |
|
1690 |
The Allow header field does not indicate what methods are |
1691 |
implemented by the server. |
1692 |
|
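A recipient could turn the comma-separated field value into a list of
methods as sketched below. The function name is an assumption; empty
list elements are simply skipped.

```python
def parse_allow(value):
    """Split an Allow field value such as "GET, HEAD" into method names."""
    return [item.strip() for item in value.split(",") if item.strip()]
```

For the example given above, parse_allow("GET, HEAD") yields the two
methods GET and HEAD in the order listed.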
1693 |
8.2 Authorization |
1694 |
|
1695 |
A user agent that wishes to authenticate itself with a |
1696 |
server--usually, but not necessarily, after receiving a 401 |
1697 |
response--may do so by including an Authorization header field with |
1698 |
the request. The Authorization field value consists of credentials |
1699 |
containing the authentication information of the user agent for the |
1700 |
realm of the resource being requested. |
1701 |
|
1702 |
Authorization = "Authorization" ":" credentials |
1703 |
|
1704 |
HTTP access authentication is described in Section 9. If a request |
1705 |
is authenticated and a realm specified, the same credentials should |
1706 |
be valid for all other requests within this realm. |
1707 |
|
1708 |
Proxies must not cache the response to a request containing an |
1709 |
Authorization field. |
1710 |
|
1711 |
8.3 Content-Encoding |
1712 |
|
1713 |
The Content-Encoding header field is used as a modifier to the |
1714 |
media-type. When present, its value indicates what additional |
1715 |
content coding has been applied to the resource, and thus what |
1716 |
decoding mechanism must be applied in order to obtain the |
1717 |
media-type referenced by the Content-Type header field. The |
1718 |
Content-Encoding is primarily used to allow a document to be |
1719 |
compressed without losing the identity of its underlying media type. |
1720 |
|
1721 |
Content-Encoding = "Content-Encoding" ":" content-coding |
1722 |
|
1723 |
Content codings are defined in Section 3.5. An example of its use is |
1724 |
|
1725 |
Content-Encoding: x-gzip |
1726 |
|
1727 |
The Content-Encoding is a characteristic of the resource identified |
1728 |
by the Request-URI. Typically, the resource is stored with this |
1729 |
encoding and is only decoded before rendering or analogous usage. |
1730 |
|
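The decoding step can be illustrated for the "x-gzip" coding used in
the example above. This sketch relies on Python's standard gzip
module and handles only that one coding; the function name is
hypothetical.

```python
import gzip


def decode_entity_body(body, content_encoding=None):
    """Undo the content coding so that the result has the Content-Type."""
    if content_encoding is None:
        # No Content-Encoding header: the body is already in its media type.
        return body
    if content_encoding == "x-gzip":
        return gzip.decompress(body)
    # This sketch does not implement any other content coding.
    raise ValueError("unsupported content coding: " + content_encoding)
```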
1731 |
8.4 Content-Length |
1732 |
|
1733 |
The Content-Length header field indicates the size of the |
1734 |
Entity-Body, in decimal number of octets, sent to the recipient or, |
1735 |
in the case of the HEAD method, the size of the Entity-Body that |
1736 |
would have been sent had the request been a GET. |
1737 |
|
1738 |
Content-Length = "Content-Length" ":" 1*DIGIT |
1739 |
|
1740 |
An example is |
1741 |
|
1742 |
Content-Length: 3495 |
1743 |
|
1744 |
Although it is not required, applications are strongly encouraged |
1745 |
to use this field to indicate the size of the Entity-Body to be |
1746 |
transferred, regardless of the media type of the entity. |
1747 |
|
1748 |
Any Content-Length greater than or equal to zero is a valid value. |
1749 |
Section 7.2.2 describes how to determine the length of an |
1750 |
Entity-Body if a Content-Length is not given. |
1751 |
|
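A recipient might validate the field value against the 1*DIGIT
grammar before using it, as in this sketch. The function name is an
assumption, and surrounding white space is stripped before matching.

```python
import re


def parse_content_length(value):
    """Return the length as an int, or None if the value is not 1*DIGIT."""
    value = value.strip()
    if re.fullmatch(r"[0-9]+", value) is None:
        return None
    return int(value)
```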
1752 |
Note: The meaning of this field is significantly different |
1753 |
from the corresponding definition in MIME, where it is an |
1754 |
optional field used within the "message/external-body" |
1755 |
content-type. In HTTP, it should be used whenever the |
1756 |
entity's length can be determined prior to being transferred. |
1757 |
|
1758 |
8.5 Content-Type |
1759 |
|
1760 |
The Content-Type header field indicates the media type of the |
1761 |
Entity-Body sent to the recipient or, in the case of the HEAD |
1762 |
method, the media type that would have been sent had the request |
1763 |
been a GET. |
1764 |
|
1765 |
Content-Type = "Content-Type" ":" media-type |
1766 |
|
1767 |
Media types are defined in Section 3.6. An example of the field is |
1768 |
|
1769 |
Content-Type: text/html |
1770 |
|
1771 |
The Content-Type header field has no default value. Further |
1772 |
discussion of methods for identifying the media type of an entity |
1773 |
is provided in Section 7.2.1. |
1774 |
|
1775 |
8.6 Date |
1776 |
|
1777 |
The Date header represents the date and time at which the message |
1778 |
was originated, having the same semantics as orig-date in RFC 822. |
1779 |
The field value is an HTTP-date, as described in Section 3.3. |
1780 |
|
1781 |
Date = "Date" ":" HTTP-date |
1782 |
|
1783 |
An example is |
1784 |
|
1785 |
Date: Tue, 15 Nov 1994 08:12:31 GMT |
1786 |
|
1787 |
If a message is received via direct connection with the user agent |
1788 |
(in the case of requests) or the origin server (in the case of |
1789 |
responses), then the default date can be assumed to be the current |
1790 |
date at the receiving end. However, since the date--as it is |
1791 |
believed by the origin--is important for evaluating cached |
1792 |
responses, origin servers should always include a Date header. |
1793 |
Clients should only send a Date header field in messages that |
1794 |
include an entity body, as in the case of the POST request, and |
1795 |
even then it is optional. A received message which does not have a |
1796 |
Date header field should be assigned one by the receiver if and |
1797 |
only if the message will be cached by that receiver or gatewayed |
1798 |
via a protocol which requires a Date. |
1799 |
|
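The last rule above, assigning a Date to a message that lacks one
before it is cached or gatewayed, could look like the following
sketch. It assumes headers are kept in a dictionary keyed by field
name and uses format_datetime from Python's email.utils module to
produce the fixed-length date form shown in the example; the function
name is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def ensure_date(headers, now=None):
    """Add a Date header to a received message if it has none."""
    if "Date" not in headers:
        # The receiver's current time is the assumed default date.
        now = now or datetime.now(timezone.utc)
        headers["Date"] = format_datetime(now, usegmt=True)
    return headers
```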
1800 |
Only one Date header field is allowed per message. In theory, the |
1801 |
date should represent the moment just before the entity is |
1802 |
generated. In practice, the date can be generated at any time |
1803 |
during the message origination without affecting its semantic value. |
1804 |
|
1805 |
Note: An earlier version of this document incorrectly |
1806 |
specified that this field should contain the creation date |
1807 |
of the enclosed Entity-Body. This has been changed to |
1808 |
reflect actual (and proper) usage. |
1809 |
|
1810 |
8.7 Expires |
1811 |
|
1812 |
The Expires field gives the date/time after which the entity should |
1813 |
be considered stale. This allows information providers to suggest |
1814 |
the volatility of the resource. Caching clients, including proxies, |
1815 |
must not cache this copy of the resource beyond the date given, |
1816 |
unless its status has been updated by a later check of the origin |
1817 |
server. The presence of an Expires field does not imply that the |
1818 |
original resource will change or cease to exist at, before, or |
1819 |
after that time. However, information providers that know or even |
1820 |
suspect that a resource will change by a certain date are strongly |
1821 |
encouraged to include an Expires header with that date. The format |
1822 |
is an absolute date and time as defined by HTTP-date in Section 3.3. |
1823 |
|
1824 |
Expires = "Expires" ":" HTTP-date |
1825 |
|
1826 |
An example of its use is |
1827 |
|
1828 |
Expires: Thu, 01 Dec 1994 16:00:00 GMT |
1829 |
|
1830 |
The Expires field has no default value. If the date given is equal |
1831 |
to or earlier than the value of the Date header, the recipient must |
1832 |
not cache the enclosed entity. If a resource is dynamic by nature, |
1833 |
as is the case with many data-producing processes, copies of that |
1834 |
resource should be given an appropriate Expires value which |
1835 |
reflects that dynamism. |
1836 |
|
1837 |
The Expires field cannot be used to force a user agent to refresh |
1838 |
its display or reload a resource; its semantics apply only to |
1839 |
caching mechanisms, and such mechanisms need only check a |
1840 |
resource's expiration status when a new request for that resource |
1841 |
is initiated. |
1842 |
|
1843 |
User agents often have history mechanisms, such as "Back" buttons |
1844 |
and history lists, which can be used to redisplay an entity |
1845 |
retrieved earlier in a session. By default, the Expires field does |
1846 |
not apply to history mechanisms. If the entity is still in storage, |
1847 |
a history mechanism should display it even if the entity has |
1848 |
expired, unless the user has specifically configured the agent to |
1849 |
refresh expired history documents. |
1850 |
|
1851 |
Note: Applications are encouraged to be tolerant of bad or |
1852 |
misinformed implementations of the Expires header. A value |
1853 |
of zero (0) or an invalid date format should be considered |
1854 |
equivalent to "expires immediately." Although these |
1855 |
values are not legitimate for HTTP/1.0, a robust |
1856 |
implementation is always desirable. |
1857 |
|
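The caching rules of this section, including the tolerant handling
of invalid values, can be sketched as follows. The function names are
assumptions, the parsing relies on parsedate_to_datetime from
Python's email.utils module, and a date without a time zone is
treated as GMT for the purpose of the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_http_date(value):
    """Parse an HTTP-date; return None if the value is invalid."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: GMT
    return parsed


def may_cache(expires_value, date_value):
    """Return False if an entity with this Expires must not be cached."""
    expires = parse_http_date(expires_value)
    if expires is None:
        return False  # "0" or a bad format means "expires immediately"
    date = parse_http_date(date_value)
    # An Expires equal to or earlier than Date forbids caching.
    return date is None or expires > date


def is_stale(expires_value, now):
    """Return True if a cached copy is past its Expires date/time."""
    expires = parse_http_date(expires_value)
    return expires is None or now > expires
```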
1858 |
8.8 From |
1859 |
|
1860 |
The From header field, if given, should contain an Internet e-mail |
1861 |
address for the human user who controls the requesting user agent. |
1862 |
The address should be machine-usable, as defined by mailbox in |
1863 |
RFC 822 [7] (as updated by RFC 1123 [6]): |
1864 |
|
1865 |
From = "From" ":" mailbox |
1866 |
|
1867 |
An example is: |
1868 |
|
1869 |
From: webmaster@w3.org |
1870 |
|
1871 |
This header field may be used for logging purposes and as a means |
1872 |
for identifying the source of invalid or unwanted requests. It |
1873 |
should not be used as an insecure form of access protection. The |
1874 |
interpretation of this field is that the request is being performed |
1875 |
on behalf of the person given, who accepts responsibility for the |
1876 |
method performed. In particular, robot agents should include this |
1877 |
header so that the person responsible for running the robot can be |
1878 |
contacted if problems occur on the receiving end. |
1879 |
|
1880 |
The Internet e-mail address in this field may be separate from the |
1881 |
Internet host which issued the request. For example, when a request |
1882 |
is passed through a proxy, the original issuer's address should be |
1883 |
used. |
1884 |
|
1885 |
Note: The client should not send the From header field |
1886 |
without the user's approval, as it may conflict with the |
1887 |
user's privacy interests or their site's security policy. It |
1888 |
is strongly recommended that the user be able to disable, |
1889 |
enable, and modify the value of this field at any time prior |
1890 |
to a request. |
1891 |
|
1892 |
8.9 If-Modified-Since |
1893 |
|
1894 |
The If-Modified-Since header field is used with the GET method to |
1895 |
make it conditional: if the requested resource has not been |
1896 |
modified since the time specified in this field, a copy of the |
1897 |
resource will not be returned from the server; instead, a 304 (not |
1898 |
modified) response will be returned without any Entity-Body. |
1899 |
|
1900 |
If-Modified-Since = "If-Modified-Since" ":" HTTP-date |
1901 |
|
1902 |
An example of the field is: |
1903 |
|
1904 |
If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT |
1905 |
|
1906 |
A conditional GET method requests that the identified resource be |
1907 |
transferred only if it has been modified since the date given by |
1908 |
the If-Modified-Since header. The algorithm for determining this |
1909 |
includes the following cases: |
1910 |
|
1911 |
a) If the request would normally result in anything other than |
1912 |
a 200 (ok) status, or if the passed If-Modified-Since date |
1913 |
is invalid, the response is exactly the same as for a |
1914 |
normal GET. A date which is later than the server's current |
1915 |
time is invalid. |
1916 |
|
1917 |
b) If the resource has been modified since the |
1918 |
If-Modified-Since date, the response is exactly the same as |
1919 |
for a normal GET. |
1920 |
|
1921 |
c) If the resource has not been modified since a valid |
1922 |
If-Modified-Since date, the server shall return a 304 (not |
1923 |
modified) response. |
1924 |
|
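Cases a) through c) can be sketched as a single function. The
function name and its parameters are illustrative assumptions: the
caller supplies the status a normal GET would produce, the raw
If-Modified-Since value, and the resource's modification time and the
server's current time as time-zone-aware datetime objects.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(normal_status, if_modified_since,
                           last_modified, now):
    """Return the status code for a GET carrying If-Modified-Since."""
    if normal_status != 200:
        return normal_status  # case a: not a 200 response anyway
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return normal_status  # case a: unparseable date
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)  # assumption: GMT
    if since > now:
        return normal_status  # case a: a date in the future is invalid
    if last_modified > since:
        return normal_status  # case b: modified, send as a normal GET
    return 304  # case c: not modified, no Entity-Body is sent
```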
1925 |
The purpose of this feature is to allow efficient updates of cached |
1926 |
information with a minimum amount of transaction overhead. |
1927 |
|
1928 |
8.10 Last-Modified |
1929 |
|
1930 |
The Last-Modified header field indicates the date and time at which |
1931 |
the sender believes the resource was last modified. The exact |
1932 |
semantics of this field are defined in terms of how the receiver |
1933 |
should interpret it: if the receiver has a copy of this resource |
1934 |
which is older than the date given by the Last-Modified field, that |
1935 |
copy should be considered stale. |
1936 |
|
1937 |
Last-Modified = "Last-Modified" ":" HTTP-date |
1938 |
|
1939 |
An example of its use is |
1940 |
|
1941 |
Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT |
1942 |
|
1943 |
The exact meaning of this header field depends on the |
1944 |
implementation of the sender and the nature of the original |
1945 |
resource. For files, it may be just the file system last-modified |
1946 |
time. For entities with dynamically included parts, it may be the |
1947 |
most recent of the set of last-modify times for its component |
1948 |
parts. For database gateways, it may be the last-update timestamp |
1949 |
of the record. For virtual objects, it may be the last time the |
1950 |
internal state changed. |
1951 |
|
1952 |
An origin server must not send a Last-Modified date which is later |
1953 |
than the server's time of message origination. In such cases, where |
1954 |
the resource's last modification would indicate some time in the |
1955 |
future, the server must replace that date with the message |
1956 |
origination date. |
1957 |
|
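The replacement rule amounts to taking the earlier of the two times.
A minimal sketch, with a hypothetical function name and hypothetical
example values, is:

```python
from datetime import datetime, timezone


def last_modified_to_send(resource_mtime, origination_time):
    """Clamp a future modification time to the message origination time."""
    return min(resource_mtime, origination_time)


# Hypothetical usage: a resource stamped in the future is clamped.
now = datetime(1994, 11, 15, 12, 45, 26, tzinfo=timezone.utc)
future = datetime(1995, 1, 1, tzinfo=timezone.utc)
clamped = last_modified_to_send(future, now)  # equals now
```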
1958 |
8.11 Location |
1959 |
|
1960 |
The Location response header field defines the exact location of |
1961 |
the resource that was identified by the Request-URI. For 3xx |
1962 |
responses, the location must indicate the server's preferred URL |
1963 |
for automatic redirection to the resource. Only one absolute URL is |
1964 |
allowed. |
1965 |
|
1966 |
Location = "Location" ":" absoluteURI |
1967 |
|
1968 |
An example is |
1969 |
|
1970 |
Location: http://www.w3.org/hypertext/WWW/NewLocation.html |
1971 |
|
1972 |
8.12 MIME-Version |
1973 |
|
1974 |
HTTP is not a MIME-conformant protocol (see Appendix C). However, |
1975 |
HTTP/1.0 messages may include a single MIME-Version header field to |
1976 |
indicate what version of the MIME protocol was used to construct |
1977 |
the message. Use of the MIME-Version header field should indicate |
1978 |
that the message is in full compliance with the MIME protocol (as |
1979 |
defined in [5]). Unfortunately, current versions of HTTP/1.0 |
1980 |
clients and servers use this field indiscriminately, and thus |
1981 |
receivers must not take it for granted that the message is indeed |
1982 |
in full compliance with MIME. Gateways are responsible for ensuring |
1983 |
this compliance (where possible) when exporting HTTP messages to |
1984 |
strict MIME environments. Future HTTP/1.0 applications must use |
1985 |
MIME-Version only when the message is intended to be MIME-conformant. |
1986 |
|
1987 |
MIME-Version = "MIME-Version" ":" 1*DIGIT "." 1*DIGIT |
1988 |
|
1989 |
MIME version "1.0" is the default for use in HTTP/1.0. However, |
1990 |
HTTP/1.0 message parsing and semantics are defined by this document |
1991 |
and not the MIME specification. |
1992 |
|
1993 |
8.13 Pragma |
1994 |
|
1995 |
The Pragma message header field is used to include |
1996 |
implementation-specific directives that may apply to any recipient |
1997 |
along the request/response chain. The directives typically specify |
1998 |
behavior intended to prevent intermediate proxies or caches from |
1999 |
adversely interfering with the request or response. All pragma |
2000 |
directives specify optional behavior from the viewpoint of the |
2001 |
protocol; however, some systems may require that behavior be |
2002 |
consistent with the directives. HTTP/1.0 only defines semantics for |
2003 |
the "no-cache" directive on request messages. |
2004 |
|
2005 |
Pragma = "Pragma" ":" 1#pragma-directive |
2006 |
|
2007 |
pragma-directive = "no-cache" | extension-pragma |
2008 |
extension-pragma = token [ "=" word ] |
2009 |
|
2010 |
When the "no-cache" directive is present in a request message, a |
2011 |
caching intermediary should forward the request toward the origin |
2012 |
server even if it has a cached copy of what is being requested. |
2013 |
This allows a client to insist upon receiving an authoritative |
2014 |
response to its request. It also allows a client to refresh a |
2015 |
cached copy which is known to be corrupted or stale. |
2016 |
|
2017 |
Pragma directives must be passed through by a proxy, regardless of |
2018 |
their significance to that proxy, since the directives may be |
2019 |
applicable to all recipients along the request/response chain. It |
2020 |
is not possible to specify a pragma for a specific recipient; |
2021 |
however, any pragma directive not relevant to a recipient should be |
2022 |
ignored by that recipient. |
2023 |
|
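A caching intermediary might examine the field as sketched below.
The function names are assumptions, headers are assumed to be held in
a dictionary keyed by field name, and the simple split on commas does
not handle a quoted value that itself contains a comma.

```python
def parse_pragma(value):
    """Split a Pragma field value into a dict of directive -> value."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, argument = item.partition("=")
        # Directives without "=" (such as no-cache) map to None.
        directives[name.strip()] = argument.strip() or None
    return directives


def may_answer_from_cache(request_headers):
    """Return False if the request insists on reaching the origin server."""
    pragma = request_headers.get("Pragma", "")
    return "no-cache" not in parse_pragma(pragma)
```

Whatever this check decides, the sketch leaves the Pragma field in
place so that it is still passed on when the request is forwarded.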
2024 |
8.14 Referer |
2025 |
|
2026 |
The Referer request header field allows the client to specify, for |
2027 |
the server's benefit, the address (URI) of the resource from which |
2028 |
the Request-URI was obtained. This allows a server to generate |
2029 |
lists of back-links to resources for interest, logging, optimized |
2030 |
caching, etc. It also allows obsolete or mistyped links to be |
2031 |
traced for maintenance. The Referer field must not be sent if the |
2032 |
Request-URI was obtained from a source that does not have its own |
2033 |
URI, such as input from the user keyboard. |
2034 |
|
2035 |
Referer = "Referer" ":" ( absoluteURI | relativeURI ) |
2036 |
|
2037 |
Example: |
2038 |
|
2039 |
Referer: http://www.w3.org/hypertext/DataSources/Overview.html |
2040 |
|
2041 |
If a partial URI is given, it should be interpreted relative to the |
2042 |
Request-URI. The URI must not include a fragment. |
2043 |
|
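A server that logs back-links could normalize the field as in this
sketch, which uses urljoin and urldefrag from Python's urllib.parse
module. It assumes the Request-URI is available as an absolute URL;
the function name is hypothetical, and the fragment is dropped
defensively in case a client sent one.

```python
from urllib.parse import urldefrag, urljoin


def resolve_referer(referer, request_uri):
    """Return the Referer as an absolute URL without a fragment."""
    # A partial URI is interpreted relative to the Request-URI.
    absolute = urljoin(request_uri, referer)
    return urldefrag(absolute).url
```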
2044 |
Note: Because the source of a link may be private |
2045 |
information or may reveal an otherwise private information |
2046 |
source, it is strongly recommended that the user be able to |
2047 |
select whether or not the Referer field is sent. For |
2048 |
example, a browser client could have a toggle switch for |
2049 |
browsing openly/anonymously, which would respectively |
2050 |
enable/disable the sending of Referer and From information. |
2051 |
|
2052 |
8.15 Server |
2053 |
|
2054 |
The Server response header field contains information about the |
2055 |
software used by the origin server to handle the request. The field |
2056 |
can contain multiple product tokens (Section 3.7) and comments |
2057 |
identifying the server and any significant subproducts. By |
2058 |
convention, the product tokens are listed in order of their |
2059 |
significance for identifying the application. |
2060 |
|
2061 |
Server = "Server" ":" 1*( product | comment ) |
2062 |
|
2063 |
Example: |
2064 |
|
2065 |
Server: CERN/3.0 libwww/2.17 |
2066 |
|
2067 |
If the response is being forwarded through a proxy, the proxy |
2068 |
application should not add its data to the product list. |
2069 |
|
2070 |
Note: Revealing the specific software version of the server |
2071 |
may allow the server machine to become more vulnerable to |
2072 |
attacks against software that is known to contain security |
2073 |
holes. Server implementors are encouraged to make this field |
2074 |
a configurable option. |
2075 |

8.16 User-Agent

The User-Agent field contains information about the user agent
originating the request. This is for statistical purposes, the
tracing of protocol violations, and automated recognition of user
agents for the sake of tailoring responses to avoid particular user
agent limitations. Although it is not required, user agents should
always include this field with requests. The field can contain
multiple product tokens (Section 3.7) and comments identifying the
agent and any subproducts which form a significant part of the user
agent. By convention, the product tokens are listed in order of
their significance for identifying the application.

    User-Agent = "User-Agent" ":" 1*( product | comment )

Example:

    User-Agent: CERN-LineMode/2.15 libwww/2.17b3

The User-Agent field may include additional information within
comments.
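
The Server and User-Agent values share the form 1*( product | comment ).
A recipient that wants to examine them could split the value roughly as
sketched below. The function name is hypothetical, comments are assumed
to be parenthesized and possibly nested, and no other quoting rules are
handled.

```python
def split_products(value):
    """Return ("product", text) and ("comment", text) items in field order."""
    items = []
    i, n = 0, len(value)
    while i < n:
        ch = value[i]
        if ch in " \t":
            i += 1
        elif ch == "(":
            # Scan to the matching close parenthesis, allowing nesting.
            depth, start = 1, i
            i += 1
            while i < n and depth:
                if value[i] == "(":
                    depth += 1
                elif value[i] == ")":
                    depth -= 1
                i += 1
            if depth:
                raise ValueError("unterminated comment")
            items.append(("comment", value[start + 1:i - 1]))
        else:
            start = i
            while i < n and value[i] not in " \t(":
                i += 1
            items.append(("product", value[start:i]))
    return items
```

Because the products are listed in order of significance, the first
"product" item is the one most applications would key on.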

    Note: Some current proxy applications append their product
    information to the list in the User-Agent field. This is not
    recommended, since it makes machine interpretation of these
    fields ambiguous.

8.17 WWW-Authenticate

The WWW-Authenticate header field must be included in 401
(unauthorized) response messages. The field value consists of at
least one challenge that indicates the authentication scheme(s) and
parameters applicable to the Request-URI.

    WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge

The HTTP access authentication process is described in Section 9.
User agents must take special care in parsing the WWW-Authenticate
field value if it contains more than one challenge, or if more than
one WWW-Authenticate header field is provided, since the contents
of a challenge may itself contain a comma-separated list of
authentication parameters.
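
One way to exercise the "special care" mentioned above is to split the
field value on commas that lie outside quoted strings, and to treat an
item of the form "scheme name=value" as the start of a new challenge
while a bare "name=value" continues the current one. The sketch below
is a heuristic under those assumptions, not a definitive parser; the
function names are hypothetical, and quoted strings are assumed to
contain no escaped quote characters.

```python
import re

_NEW_CHALLENGE = re.compile(r'^([^\s=,"]+)[ \t]+([^\s=,"]+)\s*=\s*(.*)$', re.S)
_PARAM = re.compile(r'^([^\s=,"]+)\s*=\s*(.*)$', re.S)


def _split_outside_quotes(value):
    """Split on commas that are not inside a quoted string."""
    parts, buf, in_quotes = [], [], False
    for ch in value:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == "," and not in_quotes:
            parts.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf).strip())
    return [part for part in parts if part]


def parse_challenges(field_value):
    """Return a list of {"scheme": ..., "params": {...}} dictionaries."""
    challenges = []
    for part in _split_outside_quotes(field_value):
        match = _NEW_CHALLENGE.match(part)
        if match:
            scheme, name, raw = match.groups()
            challenges.append({"scheme": scheme, "params": {}})
        else:
            match = _PARAM.match(part)
            if not match or not challenges:
                raise ValueError("cannot interpret %r" % part)
            name, raw = match.groups()
        # Attribute names are case-insensitive; values keep their case.
        challenges[-1]["params"][name.lower()] = raw.strip().strip('"')
    return challenges
```

When several WWW-Authenticate fields are present, each field value can
be passed to parse_challenges separately and the results concatenated.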

9. Access Authentication

HTTP provides a simple challenge-response authentication mechanism
which may be used by a server to challenge a client request and by
a client to provide authentication information. It uses an
extensible, case-insensitive token to identify the authentication
scheme, followed by a comma-separated list of attribute-value pairs
which carry the parameters necessary for achieving authentication
via that scheme.

    auth-scheme = token

    auth-param  = token "=" quoted-string

The 401 (unauthorized) response message is used by an origin server
to challenge the authorization of a user agent. This response must
include a WWW-Authenticate header field containing at least one
challenge applicable to the requested resource.

    challenge   = auth-scheme 1*SP realm *( "," auth-param )

    realm       = "realm" "=" realm-value
    realm-value = quoted-string

The realm attribute (case-insensitive) is required for all
authentication schemes which issue a challenge. The realm value
(case-sensitive), in combination with the canonical root URL of the
server being accessed, defines the protection space. These realms
allow the protected resources on a server to be partitioned into a
set of protection spaces, each with its own authentication scheme
and/or authorization database. The realm value is a string,
generally assigned by the origin server, which may have additional
semantics specific to the authentication scheme.

A user agent that wishes to authenticate itself with a
server--usually, but not necessarily, after receiving a 401
response--may do so by including an Authorization header field with
the request. The Authorization field value consists of credentials
containing the authentication information of the user agent for the
realm of the resource being requested.

    credentials = basic-credentials
                | ( auth-scheme #auth-param )

The domain over which credentials can be automatically applied by a
user agent is determined by the protection space. If a prior
request has been authorized, the same credentials may be reused for
all other requests within that protection space for a period of
time determined by the authentication scheme, parameters, and/or
user preference. Unless otherwise defined by the authentication
scheme, a single protection space cannot extend outside the scope
of its server.
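
A user agent could track reusable credentials by keying them on the
protection space, that is, the pair of canonical root URL and realm
value. The following is a minimal sketch: the class and function names
are hypothetical, the "canonical root" is assumed to be scheme, lower
case host, and port (defaulting to 80), and the expiry period mentioned
above is not modelled.

```python
from urllib.parse import urlsplit


def canonical_root(url):
    """Reduce a URL to scheme, host, and port (an assumed canonical form)."""
    parts = urlsplit(url)
    port = parts.port or 80  # assumption: plain http on its default port
    return f"{parts.scheme}://{parts.hostname}:{port}/"


class CredentialStore:
    """Remembers credentials per protection space (root URL plus realm)."""

    def __init__(self):
        self._by_space = {}

    def remember(self, url, realm, credentials):
        # The realm value is case-sensitive, so it is stored unchanged.
        self._by_space[(canonical_root(url), realm)] = credentials

    def lookup(self, url, realm):
        """Return stored credentials for this space, or None."""
        return self._by_space.get((canonical_root(url), realm))
```

Credentials remembered for one server are never returned for another
host, which mirrors the rule that a protection space does not extend
outside the scope of its server.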

If the server does not wish to accept the credentials sent with a
request, it should return a 403 (forbidden) response.

The HTTP protocol does not restrict applications to this simple
challenge-response mechanism for access authentication. Additional
mechanisms may be used at the transport level, via message
encapsulation, and/or with additional header fields specifying
authentication information. However, these additional mechanisms
are not defined by this specification.

Proxies must be completely transparent regarding user agent
authentication. That is, they must forward the WWW-Authenticate and
Authorization headers untouched, and must not cache the response to
a request containing Authorization. HTTP/1.0 does not provide a
means for a client to be authenticated with a proxy.

9.1 Basic Authentication Scheme

The "basic" authentication scheme is based on the model that the
user agent must authenticate itself with a user-ID and a password
for each realm. The realm value should be considered an opaque
string which can only be compared for equality with other realms on
that server. The server will authorize the request only if it can
validate the user-ID and password for the protection space of the
Request-URI. There are no optional authentication parameters.

Upon receipt of an unauthorized request for a URI within the
protection space, the server should respond with a challenge like
the following:

    WWW-Authenticate: Basic realm="WallyWorld"

where "WallyWorld" is the string assigned by the server to identify
the protection space of the Request-URI.

To receive authorization, the client sends the user-ID and
password, separated by a single colon (":") character, within a
base64 [5] encoded string in the credentials.

    basic-credentials = "Basic" SP basic-cookie

    basic-cookie      = <base64 [5] encoding of userid-password,
                         except not limited to 76 char/line>

    userid-password   = [ token ] ":" *text

If the user agent wishes to send the user-ID "Aladdin" and password
"open sesame", it would use the following header field:

    Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
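
The encoding step can be reproduced with a few lines of Python. This
is a sketch using the standard base64 module; the function names are
hypothetical, and encoding the text as ISO-8859-1 before base64 is an
assumption of this example.

```python
import base64


def basic_credentials(user_id, password):
    """Build the credentials value for an Authorization header field."""
    if ":" in user_id:
        raise ValueError("the user-ID cannot contain a colon")
    raw = f"{user_id}:{password}".encode("iso-8859-1")
    # b64encode emits a single line, so the 76 char/line limit never applies.
    return "Basic " + base64.b64encode(raw).decode("ascii")


def parse_basic_credentials(value):
    """Recover (user_id, password) from a credentials value."""
    scheme, _, cookie = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a basic credentials value")
    decoded = base64.b64decode(cookie.strip()).decode("iso-8859-1")
    # Only the first colon separates the fields; the password may hold more.
    user_id, separator, password = decoded.partition(":")
    if not separator:
        raise ValueError("missing colon separator")
    return user_id, password
```

Calling basic_credentials("Aladdin", "open sesame") reproduces the
value shown in the header field above. Since decoding is equally
simple, the cookie offers no confidentiality.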

The basic authentication scheme is a non-secure method of filtering
unauthorized access to resources on an HTTP server. It is based on
the assumption that the connection between the client and the
server can be regarded as a trusted carrier. As this is not
generally true on an open network, the basic authentication scheme
should be used accordingly. In spite of this, clients are
encouraged to implement the scheme in order to communicate with
servers that use it.

10. Security Considerations

This section is meant to inform application developers, information
providers, and users of the security limitations in HTTP/1.0 as
described by this document. The discussion does not include
definitive solutions to the problems revealed, though it does make
some suggestions for reducing security risks.

10.1 Authentication of Clients

As mentioned in Section 9.1, the Basic authentication scheme is not
a secure method of user authentication, nor does it prevent the
Entity-Body from being transmitted in clear text across the
physical network used as the carrier. HTTP/1.0 does not prevent
additional authentication schemes and encryption mechanisms from
being employed to increase security.

10.2 Safe Methods

The writers of client software should be aware that the software
represents the user in their interactions over the Internet, and
should be careful to allow the user to be aware of any actions they
may take which may have an unexpected significance to themselves or
others.

In particular, the convention has been established that the GET and
HEAD methods should never have the significance of taking an action
other than retrieval. These methods should be considered "safe."
This allows user agents to represent other methods, such as POST,
in a special way, so that the user is made aware of the fact that a
possibly unsafe action is being requested.

Naturally, it is not possible to ensure that the server does not
generate side-effects as a result of performing a GET request; in
fact, some dynamic resources consider that a feature. The important
distinction here is that the user did not request the side-effects,
and therefore cannot be held accountable for them.

10.3 Abuse of Server Log Information

A server is in the position to save personal data about a user's
requests which may identify their reading patterns or subjects of
interest. This information is clearly confidential in nature and
its handling may be constrained by law in certain countries. People
using the HTTP protocol to provide data are responsible for
ensuring that such material is not distributed without the
permission of any individuals that are identifiable by the
published results.

10.4 Transfer of Sensitive Information

Like any generic data transfer protocol, HTTP cannot regulate the
content of the data that is transferred, nor is there any a priori
method of determining the sensitivity of any particular piece of
information within the context of any given request. Therefore,
applications are encouraged to supply as much control over this
information as possible to the provider of that information. Three
header fields are worth special mention in this context: Server,
Referer and From.

Revealing the specific software version of the server may allow the
server machine to become more vulnerable to attacks against
software that is known to contain security holes. Implementors are
encouraged to make the Server header field a configurable option.

The Referer field allows reading patterns to be studied and reverse
links drawn. Although it can be very useful, its power can be
abused if user details are not separated from the information
contained in the Referer. Even when the personal information has
been removed, the Referer field may indicate a private document's
URI whose publication would be inappropriate.

The information sent in the From field might conflict with the
user's privacy interests or their site's security policy, and hence
it should not be transmitted without the user being able to
disable, enable, and modify the contents of the field. The user
must be able to set the contents of this field within a user
preference or application defaults configuration.

We suggest, though do not require, that a convenient toggle
interface be provided for the user to enable or disable the sending
of From and Referer information.

11. Acknowledgments

This specification makes heavy use of the augmented BNF and generic
constructs defined by David H. Crocker for RFC 822 [7]. Similarly,
it reuses many of the definitions provided by Nathaniel Borenstein
and Ned Freed for MIME [5]. We hope that their inclusion in this
specification will help reduce past confusion over the relationship
between HTTP/1.0 and Internet mail message formats.

The HTTP protocol has evolved considerably over the past three
years. It has benefited from a large and active developer
community--the many people who have participated on the www-talk
mailing list--and it is that community which has been most
responsible for the success of HTTP and of the World-Wide Web in
general. Marc Andreessen, Robert Cailliau, Daniel W. Connolly,
Bob Denny, Jean-Francois Groff, Phillip M. Hallam-Baker,
Hakon W. Lie, Ari Luotonen, Rob McCool, Lou Montulli, Dave Raggett,
Tony Sanders, and Marc VanHeyningen deserve special recognition for
their efforts in defining aspects of the protocol for early versions
of this specification.

This document has benefited greatly from the comments of all those
participating in the HTTP-WG. In addition to those already
mentioned, the following individuals have contributed to this
specification:

    Gary Adams                   Harald Tveit Alvestrand
    Keith Ball                   Brian Behlendorf
    Paul Burchard                Maurizio Codogno
    Mike Cowlishaw               Roman Czyborra
    Michael A. Dolan             John Franks
    Jim Gettys                   Marc Hedlund
    Koen Holtman                 Alex Hopmann
    Bob Jernigan                 Shel Kaphan
    Martijn Koster               Dave Kristol
    Daniel LaLiberte             Paul Leach
    Albert Lunde                 John C. Mallery
    Larry Masinter               Mitra
    Gavin Nicol                  Bill Perry
    Jeffrey Perry                Owen Rees
    David Robinson               Marc Salomon
    Rich Salz                    Jim Seidman
    Chuck Shotton                Eric W. Sink
    Simon E. Spero               Robert S. Thau
    Francois Yergeau             Mary Ellen Zurko

12. References

[1]  F. Anklesaria, M. McCahill, P. Lindner, D. Johnson, D. Torrey,
     and B. Alberti. "The Internet Gopher Protocol: A distributed
     document search and retrieval protocol." RFC 1436, University
     of Minnesota, March 1993.

[2]  T. Berners-Lee. "Universal Resource Identifiers in WWW:
     A Unifying Syntax for the Expression of Names and Addresses of
     Objects on the Network as used in the World-Wide Web."
     RFC 1630, CERN, June 1994.

[3]  T. Berners-Lee and D. Connolly. "HyperText Markup Language
     Specification - 2.0." Work in Progress
     (draft-ietf-html-spec-05.txt), MIT/W3C, August 1995.

[4]  T. Berners-Lee, L. Masinter, and M. McCahill. "Uniform Resource
     Locators (URL)." RFC 1738, CERN, Xerox PARC, University of
     Minnesota, December 1994.

[5]  N. Borenstein and N. Freed. "MIME (Multipurpose Internet Mail
     Extensions) Part One: Mechanisms for Specifying and Describing
     the Format of Internet Message Bodies." RFC 1521, Bellcore,
     Innosoft, September 1993.

[6]  R. Braden. "Requirements for Internet hosts - application and
     support." STD 3, RFC 1123, IETF, October 1989.

[7]  D. H. Crocker. "Standard for the Format of ARPA Internet Text
     Messages." STD 11, RFC 822, UDEL, August 1982.

[8]  F. Davis, B. Kahle, H. Morris, J. Salem, T. Shen, R. Wang,
     J. Sui, and M. Grinbaum. "WAIS Interface Protocol Prototype
     Functional Specification." (v1.5), Thinking Machines
     Corporation, April 1990.

[9]  R. Fielding. "Relative Uniform Resource Locators." RFC 1808,
     UC Irvine, June 1995.

[10] M. Horton and R. Adams. "Standard for interchange of USENET
     messages." RFC 1036 (Obsoletes RFC 850), AT&T Bell
     Laboratories, Center for Seismic Studies, December 1987.

[11] B. Kantor and P. Lapsley. "Network News Transfer Protocol:
     A Proposed Standard for the Stream-Based Transmission of News."
     RFC 977, UC San Diego, UC Berkeley, February 1986.

[12] J. Postel. "Simple Mail Transfer Protocol." STD 10, RFC 821,
     USC/ISI, August 1982.

[13] J. Postel. "Media Type Registration Procedure." RFC 1590,
     USC/ISI, March 1994.

[14] J. Postel and J. K. Reynolds. "File Transfer Protocol (FTP)."
     STD 9, RFC 959, USC/ISI, October 1985.

[15] J. Reynolds and J. Postel. "Assigned Numbers." STD 2, RFC 1700,
     USC/ISI, October 1994.

[16] K. Sollins and L. Masinter. "Functional Requirements for
     Uniform Resource Names." RFC 1737, MIT/LCS, Xerox Corporation,
     December 1994.

[17] US-ASCII. Coded Character Set - 7-Bit American Standard Code
     for Information Interchange. Standard ANSI X3.4-1986, ANSI,
     1986.

[18] ISO-8859. International Standard -- Information Processing --
     8-bit Single-Byte Coded Graphic Character Sets --
     Part 1: Latin Alphabet No. 1, ISO 8859-1:1987.
     Part 2: Latin alphabet No. 2, ISO 8859-2, 1987.
     Part 3: Latin alphabet No. 3, ISO 8859-3, 1988.
     Part 4: Latin alphabet No. 4, ISO 8859-4, 1988.
     Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988.
     Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987.
     Part 7: Latin/Greek alphabet, ISO 8859-7, 1987.
     Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988.
     Part 9: Latin alphabet No. 5, ISO 8859-9, 1990.

13. Authors' Addresses

    Tim Berners-Lee
    Director, W3 Consortium
    MIT Laboratory for Computer Science
    545 Technology Square
    Cambridge, MA 02139, U.S.A.
    Tel: +1 (617) 253 5702
    Fax: +1 (617) 258 8682
    Email: timbl@w3.org

    Roy T. Fielding
    Department of Information and Computer Science
    University of California
    Irvine, CA 92717-3425, U.S.A.
    Tel: +1 (714) 824-4049
    Fax: +1 (714) 824-4056
    Email: fielding@ics.uci.edu

    Henrik Frystyk Nielsen
    W3 Consortium
    MIT Laboratory for Computer Science
    545 Technology Square
    Cambridge, MA 02139, U.S.A.
    Tel: +1 (617) 258 8143
    Fax: +1 (617) 258 8682
    Email: frystyk@w3.org

Appendices

These appendices are provided for informational reasons only -- they
do not form a part of the HTTP/1.0 specification.

A. Internet Media Type message/http

In addition to defining the HTTP/1.0 protocol, this document serves
as the specification for the Internet media type "message/http".
The following is to be registered with IANA [13].

    Media Type name:         message

    Media subtype name:      http

    Required parameters:     none

    Optional parameters:     version, msgtype

        version: The HTTP-Version number of the enclosed message
                 (e.g., "1.0"). If not present, the version can be
                 determined from the first line of the body.

        msgtype: The message type -- "request" or "response". If
                 not present, the type can be determined from the
                 first line of the body.

    Encoding considerations: only "7bit", "8bit", or "binary" are
                             permitted

    Security considerations: none

B. Tolerant Applications

Although this document specifies the requirements for the
generation of HTTP/1.0 messages, not all applications will be
correct in their implementation. We therefore recommend that
operational applications be tolerant of deviations whenever those
deviations can be interpreted unambiguously.

Clients should be tolerant in parsing the StatusLine and servers
tolerant when parsing the RequestLine. In particular, they should
accept any amount of SP or HT characters between fields, even
though only a single SP is required.

The line terminator for HTTP-header fields is the sequence CRLF.
However, we recommend that applications, when parsing such headers,
recognize a single LF as a line terminator and ignore the leading CR.
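
Both recommendations are easy to honour in a parser. The sketch below
splits header lines on LF while discarding a preceding CR, and accepts
any run of SP or HT between the fields of a status line. The function
names and the regular expression are assumptions of this example, not
definitions taken from the specification.

```python
import re

# Version, three-digit status code, and an optional reason phrase,
# separated by one or more SP or HT characters.
_STATUS = re.compile(rb"^(HTTP/\d+\.\d+)[ \t]+(\d{3})(?:[ \t]+(.*))?$")


def split_header_lines(raw):
    """Split on LF and drop a CR that immediately precedes it."""
    lines = raw.split(b"\n")
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]


def parse_status_line(line):
    """Return (version, status code, reason phrase) from a status line."""
    match = _STATUS.match(line)
    if not match:
        raise ValueError("not a recognizable status line")
    version, code, reason = match.groups()
    return version.decode("ascii"), int(code), (reason or b"").decode("iso-8859-1")
```

A request line could be treated the same way, with the pattern adapted
to its method, URI, and version fields.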

C. Relationship to MIME

HTTP/1.0 reuses many of the constructs defined for Internet Mail
(RFC 822 [7]) and the Multipurpose Internet Mail Extensions
(MIME [5]) to allow entities to be transmitted in an open variety
of representations and with extensible mechanisms. However, HTTP is
not a MIME-conforming application. HTTP's performance requirements
differ substantially from those of Internet mail. Since it is not
limited by the restrictions of existing mail protocols and
gateways, HTTP does not obey some of the constraints imposed by
RFC 822 and MIME for mail transport.

This appendix describes specific areas where HTTP differs from
MIME. Gateways to MIME-compliant protocols must be aware of these
differences and provide the appropriate conversions where necessary.

C.1 Conversion to Canonical Form

MIME requires that an entity be converted to canonical form prior
to being transferred, as described in Appendix G of RFC 1521 [5].
Although HTTP does require media types to be transferred in
canonical form, it changes the definition of "canonical form" for
text-based media types as described in Section 3.6.1.

C.1.1 Representation of Line Breaks

MIME requires that the canonical form of any text type represent
line breaks as CRLF and forbids the use of CR or LF outside of line
break sequences. Since HTTP allows CRLF, bare CR, and bare LF (or
the octet sequence(s) to which they would be translated for the
given character set) to indicate a line break within text content,
recipients of an HTTP message cannot rely upon receiving
MIME-canonical line breaks in text.

Where it is possible, a gateway from HTTP to a MIME-conformant
protocol should translate all line breaks within text/* media types
to the MIME canonical form of CRLF. However, this may be
complicated by the presence of a Content-Encoding and by the fact
that HTTP allows the use of some character sets which do not use
octets 13 and 10 to represent CR and LF, as is the case for some
multi-byte character sets. If canonicalization is performed, the
Content-Length header field value must be updated to reflect the
new body length.
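
For the simple case, the translation and the Content-Length update
might look like the sketch below. It assumes a text/* body in a
character set where octets 13 and 10 really are CR and LF, leaves
content-encoded bodies untouched, and uses a plain dictionary for the
header fields; the function name is hypothetical.

```python
import re

# CRLF is listed first so an existing CRLF is not expanded twice.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def canonicalize_text_body(headers, body):
    """Return (headers, body) with line breaks rewritten as CRLF."""
    if any(name.lower() == "content-encoding" for name in headers):
        # An encoded body cannot be rewritten without decoding it first.
        return dict(headers), body
    new_body = _LINE_BREAK.sub(b"\r\n", body)
    new_headers = dict(headers)
    new_headers["Content-Length"] = str(len(new_body))
    return new_headers, new_body
```

The length is recomputed from the rewritten body, since every bare CR
or LF grows by one octet.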

C.1.2 Default Character Set

MIME requires that all subtypes of the top-level Content-Type
"text" have a default character set of US-ASCII [17]. In contrast,
HTTP defines the default character set for "text" to be
ISO-8859-1 [18] (a superset of US-ASCII). Therefore, if a text/*
media type given in the Content-Type header field does not already
include an explicit charset parameter, the parameter

    ;charset="iso-8859-1"

should be added by the gateway if the entity contains any octets
greater than 127.
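
A gateway could apply this rule with a check along the following
lines. This is a sketch only: the function name is hypothetical, and
the Content-Type value is split naively on semicolons without handling
quoted parameter values that themselves contain a semicolon.

```python
def add_default_charset(content_type, body):
    """Return the Content-Type value, adding the HTTP default if needed."""
    pieces = content_type.split(";")
    media_type = pieces[0].strip().lower()
    if not media_type.startswith("text/"):
        return content_type
    # Leave the value alone when a charset parameter is already present.
    for parameter in pieces[1:]:
        if parameter.split("=", 1)[0].strip().lower() == "charset":
            return content_type
    # Only bodies with octets above 127 need the explicit parameter.
    if any(octet > 127 for octet in body):
        return content_type + ';charset="iso-8859-1"'
    return content_type
```

Bodies that contain only 7-bit octets are left unlabelled, because
they are equally valid under the MIME default of US-ASCII.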

C.2 Conversion of Date Formats

HTTP/1.0 uses a restricted subset of date formats to simplify the
process of date comparison. Gateways from other protocols should
ensure that any Date header field present in a message conforms to
one of the HTTP/1.0 formats and rewrite the date if necessary.
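
As an illustration, a gateway receiving a mail-style date with a
numeric zone offset could rewrite it in the fixed-length GMT form.
The sketch relies on Python's standard email.utils module; the
function name is hypothetical, and treating a date without zone
information as GMT is an assumption of this example.

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_http_date(date_value):
    """Rewrite an RFC 822 style date as a GMT date with a four-digit year."""
    parsed = parsedate_to_datetime(date_value)
    if parsed.tzinfo is None:
        # Assumption: a date without zone information is taken to be GMT.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # usegmt=True emits "GMT" rather than a numeric "+0000" offset.
    return format_datetime(parsed.astimezone(timezone.utc), usegmt=True)
```

A date that is already in the GMT form passes through unchanged, so
the function can be applied to every Date field the gateway forwards.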

C.3 Introduction of Content-Encoding

MIME does not include any concept equivalent to HTTP's
Content-Encoding header field. Since this acts as a modifier on the
media type, gateways to MIME-conformant protocols must either
change the value of the Content-Type header field or decode the
Entity-Body before forwarding the message.

    Note: Some experimental applications of Content-Type for
    Internet mail have used a media-type parameter of
    ";conversions=<content-coding>" to perform an equivalent
    function as Content-Encoding. However, this parameter is not
    part of the MIME specification at the time of this writing.

C.4 No Content-Transfer-Encoding

HTTP does not use the Content-Transfer-Encoding (CTE) field of
MIME. Gateways from MIME-compliant protocols must remove any
non-identity CTE ("quoted-printable" or "base64") encoding prior to
delivering the response message to an HTTP client. Gateways to
MIME-compliant protocols are responsible for ensuring that the
message is in the correct format and encoding for safe transport on
that protocol, where "safe transport" is defined by the limitations
of the protocol being used. At a minimum, the CTE field of

    Content-Transfer-Encoding: binary

should be added by the gateway if it is unwilling to apply a
content transfer encoding.
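
For the inbound direction (a gateway from a MIME-compliant protocol to
HTTP), removing a non-identity CTE could be sketched as follows. The
function name and the dictionary-based header representation are
assumptions of this example; the decoding itself uses Python's
standard base64 and quopri modules.

```python
import base64
import quopri


def remove_cte(headers, body):
    """Decode a MIME transfer encoding before delivery to an HTTP client."""
    cte = headers.get("Content-Transfer-Encoding", "binary").strip().lower()
    if cte == "base64":
        body = base64.b64decode(body)
    elif cte == "quoted-printable":
        body = quopri.decodestring(body)
    elif cte not in ("7bit", "8bit", "binary"):
        raise ValueError("unknown Content-Transfer-Encoding: " + cte)
    # HTTP does not use the CTE field, so it is dropped from the result.
    new_headers = {
        name: value
        for name, value in headers.items()
        if name != "Content-Transfer-Encoding"
    }
    new_headers["Content-Length"] = str(len(body))
    return new_headers, body
```

The identity encodings ("7bit", "8bit", "binary") need no decoding, so
only the header field is removed in those cases.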

An HTTP client may include a Content-Transfer-Encoding as an
extension Entity-Header in a POST request when it knows the
destination of that request is a gateway to a MIME-compliant
protocol.