INTERNET-DRAFT                                            Larry Masinter
<draft-ietf-urlreg-guide-03.txt>                    Harald T. Alvestrand
August 7, 1998                                               Dan Zigmond
                                                              Rich Petke


Guidelines for new URL Schemes


Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its
areas, and its working groups. Note that other groups may also
distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other than as
"work in progress."

To view the entire list of current Internet-Drafts, please check
the "1id-abstracts.txt" listing contained in the Internet-Drafts
Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
(Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
(Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
(US West Coast).

Distribution of this memo is unlimited.

This Internet Draft expires February 7, 1999.

Copyright Notice

Copyright (C) The Internet Society (1998). All Rights Reserved.

Abstract

A Uniform Resource Locator (URL) is a compact string representation
of the location for a resource that is available via the Internet.
This document provides guidelines for the definition of new URL
schemes.


1. Introduction

A Uniform Resource Locator (URL) is a compact string representation
of the location for a resource that is available via the Internet.
RFC [URI-SYNTAX] [1] defines the general syntax and semantics of URIs,
and, by inclusion, URLs. URLs are designated by including a
"<scheme>:" and then a "<scheme-specific-part>". Many URL schemes
are already defined.

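As an illustration of the "<scheme>:" and "<scheme-specific-part>"
division, the following Python sketch splits a URL at its first colon.
The function name split_url and the scheme-name pattern (a letter
followed by letters, digits, "+", "-" or ".") are assumptions of this
sketch, taken from the generic syntax rather than from this document.

```python
import re

# Assumed scheme-name pattern: a letter, then letters, digits, "+", "-", ".".
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*$")


def split_url(url):
    """Return (scheme, scheme_specific_part), or raise ValueError."""
    scheme, colon, rest = url.partition(":")
    if not colon or not SCHEME_RE.match(scheme):
        raise ValueError("no valid '<scheme>:' prefix in %r" % url)
    # Scheme names are compared case-insensitively, so normalize them.
    return scheme.lower(), rest


print(split_url("http://www.example.com/index.html"))
print(split_url("mailto:user@example.com"))
```

Everything after the first colon is left untouched, because its
interpretation belongs to the individual scheme.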
This document provides guidelines for the definition of new URL
schemes, for consideration by those who are defining, registering,
or evaluating such schemes.

The process by which new URL schemes are registered is defined in
RFC [URL-PROCESS] [2].


2. Guidelines for new URL schemes

Because new URL schemes potentially complicate client software, new
schemes must have demonstrable utility and operability, as well as
compatibility with existing URL schemes. This section elaborates
these criteria.


2.1 Syntactic compatibility

New URL schemes should follow the same syntactic conventions as
existing schemes when appropriate.


2.1.1 Improper use of "//" following "<scheme>:"

Contrary to some examples set in past years, the use of double
slashes as the first component of the <scheme-specific-part> of a
URL is not simply an artistic indicator that what follows is a URL.
Double slashes are used ONLY when the syntax of the URL's
<scheme-specific-part> contains a hierarchical structure as
described in RFC [URI-SYNTAX]. In URLs from such schemes, the use
of double slashes indicates that what follows is the top
hierarchical element for a naming authority. (See section 3 of RFC
[URI-SYNTAX] for more details.) URL schemes which do not contain a
conformant hierarchical structure in their <scheme-specific-part>
should not use double slashes following the "<scheme>:" string.

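The distinction can be seen with a generic parser. The sketch below
uses Python's urllib.parse.urlsplit; the helper has_authority is a
name invented for this example. Only the URL whose scheme-specific
part begins with "//" yields a naming authority; the others are
parsed as an opaque path.

```python
from urllib.parse import urlsplit


def has_authority(url):
    """True if the scheme-specific part starts with "//"."""
    scheme, colon, rest = url.partition(":")
    return bool(colon) and rest.startswith("//")


for url in ("http://www.example.com/a/b",
            "mailto:user@example.com",
            "news:comp.infosystems.www"):
    parts = urlsplit(url)
    # netloc holds the naming authority; it is empty without "//".
    print(url, has_authority(url), repr(parts.netloc), repr(parts.path))
```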

2.1.2 Compatibility with relative URLs

URL schemes should use the generic URL syntax if they are intended
to be used with relative URLs. A description of the allowed
relative forms should be included in the scheme's definition.
Many applications use relative URLs extensively. Specifically,
consider the following questions:

o Can the scheme be parsed according to RFC [URI-SYNTAX] - that is,
  if the tokens "//", "/", ";", "?" and "#" are used, do they have
  the meaning given in RFC [URI-SYNTAX]?

o Does it make sense to use the scheme in relative URLs like those
  RFC [URI-SYNTAX] specifies?

o If the scheme syntax is designed to be broken into pieces, does
  the documentation for the scheme's syntax specify what those
  pieces are, why it should be broken in this way, and why the
  breaks aren't where RFC [URI-SYNTAX] says that they usually should
  be?

o If the scheme has a hierarchy, does it go left-to-right and with
  slash separators like RFC [URI-SYNTAX]? If not, why not?

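To show what relative forms look like for a scheme that follows the
generic syntax, the sketch below resolves a few relative references
with Python's urllib.parse.urljoin. The base URL is an arbitrary
example chosen for this sketch, not one taken from this document.

```python
from urllib.parse import urljoin

BASE = "http://a/b/c/d;p?q"  # example base URL (an assumption)

# Each relative reference is resolved against the hierarchical base.
for ref in ("g", "../g", "/g", "?y"):
    print("%-5s -> %s" % (ref, urljoin(BASE, ref)))
```

A scheme whose "/" or "?" tokens carry some other meaning would not
resolve sensibly this way, which is why the questions above matter.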

2.2 Is the scheme well defined?

It is important that the semantics of the "resource" that a URL
"locates" be well defined. This might mean different things
depending on the nature of the URL scheme.


2.2.1 Clear mapping from other name spaces

In many cases, new URL schemes are defined as ways to translate
other protocols and name spaces into the general framework of
URLs. The "ftp" URL scheme translates from the FTP protocol, while
the "mid" URL scheme translates from the Message-ID field of
messages.

In either case, the description of the mapping must be complete,
must describe how character sets get encoded or not in URLs, must
describe exactly how all legal values of the base standard can be
represented using the URL scheme, and exactly which modifiers,
alternate forms and other artifacts from the base standards are
included or not included. These requirements are elaborated
below.

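A mapping of this kind can be sketched for the "mid" case. The code
below is only an illustration: the function names are invented, it
handles only the message-id part, and the choice to leave "@"
unencoded while %HH-encoding every other reserved character is an
assumption of this sketch, not a statement of the scheme's definition.

```python
from urllib.parse import quote, unquote


def message_id_to_mid(message_id):
    """Map a Message-ID field value such as "<id@host>" to a mid URL."""
    value = message_id.strip()
    if not (value.startswith("<") and value.endswith(">")):
        raise ValueError("expected a Message-ID in angle brackets")
    # Drop the brackets; keep "@" readable and %HH-encode the rest.
    return "mid:" + quote(value[1:-1], safe="@")


def mid_to_message_id(url):
    """Inverse mapping: recover the bracketed Message-ID."""
    scheme, colon, rest = url.partition(":")
    if scheme.lower() != "mid" or not colon:
        raise ValueError("not a mid URL")
    return "<" + unquote(rest) + ">"


print(message_id_to_mid("<950124.162336@XIson.com>"))
```

The mapping is complete in the sense asked for above only if the
round trip returns the original value for every legal Message-ID.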

2.2.2 URL schemes associated with network protocols

Most new URL schemes are associated with network resources that
have one or several network protocols that can access them. The
'ftp', 'news', and 'http' schemes are of this nature. For such
schemes, the specification should completely describe how URLs are
translated into protocol actions in sufficient detail to make the
access of the network resource unambiguous. If an implementation
of the URL scheme requires some configuration, the configuration
elements must be clearly identified. (For example, the 'news'
scheme, if implemented using NNTP, requires configuration of the
NNTP server.)

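The following sketch suggests the level of detail meant by
"translated into protocol actions". It loosely follows the way the
existing 'ftp' scheme maps path segments onto CWD and RETR commands.
The function name, the "CONNECT" pseudo-step, and the default user
and password are assumptions of this sketch; the defaults also stand
in for the kind of configuration element that must be identified.

```python
from urllib.parse import urlsplit, unquote


def ftp_actions(url):
    """List the steps a client might perform for an ftp URL."""
    parts = urlsplit(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError("not an ftp URL with a host: %r" % url)
    # Configuration elements: defaults used when the URL omits them.
    user = unquote(parts.username or "anonymous")
    password = unquote(parts.password or "guest@example.invalid")
    port = parts.port or 21
    # Split on "/" before decoding, so an encoded "%2F" stays
    # inside a single path segment.
    segments = [unquote(seg) for seg in parts.path[1:].split("/")]
    *dirs, name = segments
    actions = ["CONNECT %s:%d" % (parts.hostname, port),
               "USER " + user,
               "PASS " + password]
    actions += ["CWD " + d for d in dirs]
    # An empty final segment is treated here as a directory listing.
    actions.append("RETR " + name if name else "LIST")
    return actions


for step in ftp_actions("ftp://ftp.example.org/pub/docs/readme.txt"):
    print(step)
```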

2.2.3 Character encoding

When describing URL schemes in which (some of) the elements of
the URL are actually representations of sequences of characters,
care should be taken not to introduce unnecessary variety in the
ways in which characters are encoded into octets and then into
URL characters. Unless there is some compelling reason for a
particular scheme to do otherwise, translating character sequences
into UTF-8 (RFC 2044) [3] and then subsequently using the %HH
encoding for unsafe octets is recommended.

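The recommended two-step encoding can be sketched with Python's
urllib.parse.quote, which encodes a string as UTF-8 and then writes
each unsafe octet as %HH. The helper names are invented for this
example, and treating every character outside letters, digits, "_",
".", "-" and "~" as unsafe is a choice of this sketch.

```python
from urllib.parse import quote, unquote


def encode_component(text):
    """Characters -> UTF-8 octets -> %HH for each unsafe octet."""
    return quote(text, safe="", encoding="utf-8")


def decode_component(encoded):
    """%HH -> octets -> characters, rejecting invalid UTF-8."""
    return unquote(encoded, encoding="utf-8", errors="strict")


encoded = encode_component("na\u00efve caf\u00e9")
print(encoded)                    # na%C3%AFve%20caf%C3%A9
print(decode_component(encoded))
```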

2.2.4 Definition of non-protocol URL schemes

In some cases, URL schemes do not have particular network protocols
associated with them, because their use is limited to contexts
where the access method is understood. This is the case, for
example, with the "cid" and "mid" URL schemes. For these URL
schemes, the specification should describe the notation of the
scheme and a complete mapping of the locator from its source.


2.2.5 Definition of URL schemes not associated with data resources

Most URL schemes locate Internet resources that correspond
to data objects that can be retrieved or modified. This is the
case with "ftp" and "http", for example. However, some URL schemes
do not; for example, the "mailto" URL scheme corresponds to an
Internet mail address.

If a new URL scheme does not locate resources that are data
objects, the properties of names in the new space must be clearly
defined.


2.2.6 Definition of operations

In some contexts (for example, HTML forms) it is possible to
specify any one of a list of operations to be performed on a
specific URL. (Outside forms, it is generally assumed to be
something you GET.)

The URL scheme definition should describe all well-defined
operations on the URL identifier, and what they are supposed to
do.

Some URL schemes (for example, "telnet") provide location
information for hooking onto bi-directional data streams, and don't
fit the "infoaccess" paradigm of most URLs very well; this should
be documented.

NOTE: It is perfectly valid to say that "no operation apart from
GET is defined for this URL". It is also valid to say that "there's
only one operation defined for this URL, and it's not very
GET-like". The important point is that what is defined on this type
is described.


2.3 Demonstrated utility

URL schemes should have demonstrated utility. New URL schemes are
expensive things to support. Often they require special code in
browsers, proxies, and/or servers. Having a lot of ways to say the
same thing needlessly complicates these programs without adding
value to the Internet.

The kinds of things that are useful include:

o Things that cannot be referred to in any other way.

o Things where it is much easier to get at them using this scheme
  than (for instance) a proxy gateway.


2.3.1 Proxy into HTTP/HTML

One way to provide a demonstration of utility is via a gateway
which provides objects in the new scheme for clients using an
existing protocol. It is much easier to deploy gateways to a new
service than it is to deploy browsers that understand the new URL
object.

Things to look for when thinking about a proxy are:

o Is there a single global resolution mechanism whereby any proxy
  can find the referenced object?
o If not, is there a way in which the user can find any object of
  this type, and "run his own proxy"?
o Are the operations mappable one-to-one (or possibly using
  modifiers) to HTTP operations?
o Is the type of returned objects well defined?
  - as MIME content-types?
  - as something that can be translated to HTML?
o Is there running code for a proxy?

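One possible shape for such a gateway is sketched below: a URL in
the new scheme is carried as a query parameter of an ordinary HTTP
URL, so that existing clients can fetch it with GET. The gateway
address, the "url" parameter name, and the scheme name
"example-scheme" are all hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

GATEWAY = "http://gateway.example/fetch"  # hypothetical gateway address


def to_gateway_url(url):
    """Wrap a new-scheme URL in an HTTP URL served by the gateway."""
    return GATEWAY + "?" + urlencode({"url": url})


def from_gateway_url(gateway_url):
    """Recover the original URL on the gateway side."""
    query = parse_qs(urlsplit(gateway_url).query)
    return query["url"][0]


wrapped = to_gateway_url("example-scheme:some/object?rev=2")
print(wrapped)
print(from_gateway_url(wrapped))
```

The wrapping is reversible, so the gateway loses no information; the
questions about resolution, operations, and returned types still have
to be answered by the scheme itself.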

2.4 Are there security considerations?

Above and beyond the security considerations of the base mechanism
a scheme builds upon, one must think of things that can happen in
the normal course of URL usage.

In particular:

o Does the user need to be warned that such a thing is happening
  without an explicit request (GET for the source of an IMG tag,
  for instance)? This has implications for the design of a proxy
  gateway, of course.

o Is it possible to fake URLs of this type that point to different
  things in a dangerous way?

o Are there mechanisms for identifying the requester that can be
  used or need to be used with this mechanism (the From: field in a
  mailto: URL, or the Kerberos login required for AFS access in the
  AFS: URL, for instance)?

o Does the mechanism contain passwords or other security
  information that are passed inside the referring document in the
  clear (as in the "ftp" URL, for instance)?

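The last question can be made concrete. The sketch below detects a
password carried in the clear in a URL's authority part and removes
it before the URL is displayed or logged. The function names are
invented for this example, and dropping the password while keeping
the user name is one possible policy, not a recommendation of this
document.

```python
from urllib.parse import urlsplit, urlunsplit


def carries_password(url):
    """True if the URL embeds a password in its authority part."""
    return urlsplit(url).password is not None


def redact_password(url):
    """Return the URL with any embedded password removed."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    userinfo, _, hostport = parts.netloc.rpartition("@")
    user = userinfo.partition(":")[0]
    return urlunsplit(parts._replace(netloc=user + "@" + hostport))


print(redact_password("ftp://user:secret@ftp.example.org/file.txt"))
```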

2.5 Does it start with UR?

Any scheme starting with the letters "U" and "R", in particular if
it attaches any of the meanings "uniform", "universal" or
"unifying" to the first letter, is going to cause intense debate,
and generate much heat (but maybe little light).

Any such proposal should either make sure that there is a large
consensus behind it that it will be the only scheme of its type, or
pick another name.


2.6 Non-considerations

Some issues that are often raised but are not relevant to new URL
schemes include the following.


2.6.1 Are all objects accessible?

Can all objects in the world that are validly identified by a
scheme be accessed by any UA implementing it?

Sometimes the answer will be yes and sometimes no; often it will
depend on factors (like firewalls or client configuration) not
directly related to the scheme itself.


3. Security considerations

New URL schemes are required to address all security considerations
in their definitions.


4. IANA considerations

The process by which URL scheme names are registered is specified
in RFC [URL-PROCESS].


5. References

[1] Berners-Lee, T., Fielding, R., Masinter, L., "Uniform Resource
    Identifiers (URI): Generic Syntax", RFC [URI-SYNTAX], August 1998.

[2] Petke, R., "Registration Procedures for URL Scheme Names",
    RFC [URL-PROCESS], August 1998.

[3] Yergeau, F., "UTF-8, A Transformation Format of Unicode and ISO
    10646", RFC 2044, October 1996.


6. Authors' Addresses

Larry Masinter
Xerox Corporation
Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, CA 94304
Fax: +1-415-812-4333
EMail: masinter@parc.xerox.com

Harald Tveit Alvestrand
Maxware, Pirsenteret
N-7005 Trondheim
NORWAY
Voice: +47 73 54 57 00
EMail: harald.alvestrand@maxware.no

Dan Zigmond
WebTV Networks, Inc.
305 Lytton Avenue
Palo Alto, CA 94301
USA
Voice: +1-650-614-6071
EMail: djz@corp.webtv.net

Rich Petke
WorldCom Advanced Networks
5000 Britton Road
P. O. Box 5000
Hilliard, OH 43026-5000
Voice: +1-614-723-4157
Fax: +1-614-723-1333
EMail: rpetke@compuserve.net
