1
2 INTERNET-DRAFT Larry Masinter
3 <draft-ietf-urlreg-guide-02.txt> Harald T. Alvestrand
4 May 8, 1998 Dan Zigmond
5 Rich Petke
6
7
8 Guidelines for new URL Schemes
9
10
11 Status of this Memo
12
13 This document is an Internet-Draft. Internet-Drafts are working
14 documents of the Internet Engineering Task Force (IETF), its areas,
15 and its working groups. Note that other groups may also distribute
16 working documents as Internet-Drafts.
17
18 Internet-Drafts are draft documents valid for a maximum of six
19 months and may be updated, replaced, or obsoleted by other
20 documents at any time. It is inappropriate to use Internet-Drafts
21 as reference material or to cite them other than as ``work in
22 progress.''
23
24 To view the entire list of current Internet-Drafts, please check
25 the "1id-abstracts.txt" listing contained in the Internet-Drafts
26 Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
27 (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
28 (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
29 (US West Coast).
30
31 Abstract
32
33 A Uniform Resource Locator (URL) is a compact string representation
34 of the location for a resource that is available via the Internet.
35 This document provides guidelines for the definition of new URL
36 schemes.
37
38
39 1. Introduction
40
41 A Uniform Resource Locator (URL) is a compact string representation
42 of the location for a resource that is available via the Internet.
43 RFC [URI-SYNTAX] defines the general syntax and semantics of URIs,
44 and, by inclusion, URLs. URLs are designated by including a
45 "scheme" and then a "scheme-specific part". Many URL schemes are
46 already defined.
47
48 This document provides guidelines for the definition of new URL
49 schemes, for consideration by those who define and register new
50 schemes and by those who evaluate such definitions.
51
52 The process by which new URL schemes are registered is defined in
53 RFC [URL-PROCESS].
54
55
56 2. Guidelines for new URL schemes
57
58 Because new URL schemes potentially complicate client software, new
59 schemes must have demonstrable utility and operability, as well as
60 compatibility with existing URL schemes. This section elaborates
61 these criteria.
62
63
64 2.1 Syntactic compatibility
65
66 New URL schemes should follow the same syntactic conventions as
67 existing schemes when appropriate.
68
69
70 2.1.1 Use of initial "//" for top level
71
72 Many proposed new URL schemes seem to use "://" as a kind of
73 indicator that what follows is a URL. However, the use of "//"
74 indicates a "top level" for schemes that support relative
75 URLs, and is not necessary (and just confusing) for schemes
76 that have no relative forms. URL schemes without relative
77 forms (such as mailto, cid, mid) do not use an initial "//".
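
As an illustration of the distinction above, the following minimal
Python sketch uses the standard library's urllib.parse.urlsplit to
show that a scheme with a "top level" carries a component after the
"//", while a scheme such as mailto has none. The example URLs and
variable names are hypothetical and are not taken from any scheme
definition.

```python
from urllib.parse import urlsplit

# A scheme that supports relative forms: "//" introduces the top level.
hier = urlsplit("http://www.example.com/docs/guide.txt")

# A scheme without relative forms: no "//" and therefore no top level.
flat = urlsplit("mailto:user@example.com")

print(hier.scheme, repr(hier.netloc), hier.path)
print(flat.scheme, repr(flat.netloc), flat.path)
```

For the mailto URL the parser reports an empty top-level component;
everything after the scheme name is the scheme-specific part.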
78
79
80 2.1.2 Compatibility with relative URLs
81
82 URL schemes should use the generic URL syntax if they are intended
83 to be used with relative URLs. A description of the allowed
84 relative forms should be included in the scheme's definition.
85 Many applications use relative URLs extensively. Specifically,
86
87 o Can the scheme be parsed according to RFC [URI-SYNTAX] - that is,
88 if the tokens "//", "/", ";", "?" and "#" are used, do they have
89 the meaning given in RFC [URI-SYNTAX]?
90
91 o Does it make sense to use the scheme in relative URLs like those
92 RFC [URI-SYNTAX] specifies?
93
94 o If the scheme syntax is designed to be broken into pieces, does
95 the documentation for the scheme's syntax specify what those
96 pieces are, why it should be broken in this way, and why the
97 breaks aren't where RFC [URI-SYNTAX] says that they usually should
98 be?
99
100 o If the scheme has a hierarchy, does it go left-to-right and with
101 slash separators like RFC [URI-SYNTAX]? If not, why not?
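
The questions above can be explored with a small experiment. The
sketch below, which assumes Python's urllib.parse.urljoin as one
implementation of generic relative resolution, resolves a few
relative forms against a base URL that has a left-to-right,
slash-separated hierarchy. The base URL and the chosen relative
references are illustrative only.

```python
from urllib.parse import urljoin

# Hypothetical base URL with a slash-separated hierarchy.
base = "http://a/b/c/d"

# Relative references and the absolute URLs they resolve to.
resolved = {ref: urljoin(base, ref) for ref in ("g", "../g", "/g", "//g")}

for ref, absolute in resolved.items():
    print(f"{ref!r:8} -> {absolute}")
```

A scheme whose hierarchy does not run left to right, or does not use
"/" as its separator, would not resolve sensibly under these rules,
which is why its definition should say so explicitly.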
102
103
104 2.2 Is the scheme well defined?
105
106 It is important that the semantics of the "resource" that a URL
107 "locates" be well defined. This might mean different things
108 depending on the nature of the URL scheme.
109
110
111 2.2.1 Clear mapping from other name spaces
112
113 In many cases, new URL schemes are defined as ways to translate
114 other protocols and name spaces into the general framework of
115 URLs. The "ftp" URL scheme translates from the FTP protocol, while
116 the "mid" URL scheme translates from the Message-ID field of
117 messages.
118
119 In either case, the description of the mapping must be complete,
120 must describe how character sets get encoded or not in URLs, must
121 describe exactly how all legal values of the base standard can be
122 represented using the URL scheme, and exactly which modifiers,
123 alternate forms and other artifacts from the base standards are
124 included or not included. These requirements are elaborated
125 below.
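
To make the idea of a complete mapping concrete, here is a rough
sketch of a two-way translation between a Message-ID field value and
a "mid" URL. It assumes that the mapping consists of removing the
enclosing angle brackets, applying %HH encoding, and prefixing
"mid:"; the function names and the choice of which characters are
left unencoded are assumptions made for this example, and the "mid"
scheme definition remains the authority.

```python
from urllib.parse import quote, unquote

def message_id_to_mid(message_id):
    """Map a Message-ID value such as <id@host> to a mid URL (sketch)."""
    if not (message_id.startswith("<") and message_id.endswith(">")):
        raise ValueError("expected a Message-ID enclosed in angle brackets")
    # Assumption: only "@" (plus unreserved characters) stays unencoded.
    return "mid:" + quote(message_id[1:-1], safe="@")

def mid_to_message_id(url):
    """Inverse mapping: decode %HH and restore the angle brackets."""
    scheme, sep, rest = url.partition(":")
    if sep != ":" or scheme.lower() != "mid":
        raise ValueError("not a mid URL")
    return "<" + unquote(rest) + ">"

print(message_id_to_mid("<960830.1639@example.com>"))
print(mid_to_message_id("mid:a%2Fb%25c@example.com"))
```

A mapping is complete in the sense used above when every legal value
of the base standard survives the round trip unchanged.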
126
127
128 2.2.2 URL schemes associated with network protocols
129
130 Most new URL schemes are associated with network resources that
131 have one or several network protocols that can access them. The
132 'ftp', 'news', and 'http' schemes are of this nature. For such
133 schemes, the specification should completely describe how URLs are
134 translated into protocol actions in sufficient detail to make the
135 access of the network resource unambiguous. If an implementation
136 of the URL scheme requires some configuration, the configuration
137 elements must be clearly identified. (For example, the 'news'
138 scheme, if implemented using NNTP, requires configuration of the
139 NNTP server.)
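
The kind of translation meant here can be sketched for the 'ftp'
scheme. The example below assumes the conventional interpretation in
which each path segment but the last becomes a CWD command and the
last segment is retrieved with RETR. The function name ftp_actions
and the "CONNECT" pseudo-step are hypothetical, password handling is
omitted, and the 'ftp' scheme definition is the authority on the
actual mapping.

```python
from urllib.parse import urlsplit, unquote

def ftp_actions(url):
    """List the protocol actions implied by an ftp URL (sketch only)."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp URL")
    # Assumption: with no user name in the URL, log in as "anonymous".
    user = unquote(parts.username) if parts.username else "anonymous"
    # The first "/" only separates the host from the path.
    path = parts.path[1:] if parts.path.startswith("/") else parts.path
    *dirs, name = [unquote(segment) for segment in path.split("/")]
    # "CONNECT" is not an FTP command; it stands for opening the connection.
    actions = [f"CONNECT {parts.hostname}:{parts.port or 21}", f"USER {user}"]
    actions += [f"CWD {d}" for d in dirs]
    if name:
        actions.append(f"RETR {name}")
    return actions

for step in ftp_actions("ftp://ftp.example.com/pub/docs/file.txt"):
    print(step)
```

Writing the translation down at this level of detail exposes the
configuration elements as well: here, the default port and the
default user name.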
140
141
142 2.2.3 Character encoding
143
144 When describing URL schemes in which (some of) the elements of
145 the URL are actually representations of sequences of characters,
146 care should be taken not to introduce unnecessary variety in the
147 ways in which characters are encoded into octets and then into
148 URL characters. Unless there is some compelling reason for a
149 particular scheme to do otherwise, translating character sequences
150 into UTF-8 [RFC 2044] and then subsequently using the %HH encoding
151 for unsafe octets is recommended.
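
The recommended two-step encoding can be sketched as follows, using
Python's urllib.parse.quote and unquote. The sample string and the
decision to treat every character outside the unreserved set as
unsafe (safe="") are assumptions made for this example; a real scheme
would state its own set of safe characters.

```python
from urllib.parse import quote, unquote

text = "r\u00e9sum\u00e9"          # character sequence containing U+00E9

# Step 1: characters -> octets, using UTF-8.
octets = text.encode("utf-8")

# Step 2: octets -> URL characters, using %HH for unsafe octets.
encoded = quote(octets, safe="")

# Reverse direction: %HH -> octets -> characters (UTF-8 by default).
decoded = unquote(encoded)

print(encoded)
print(decoded == text)
```

Because both steps are fixed, two independent implementations of the
scheme produce the same URL for the same character sequence.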
152
153
154 2.2.4 Definition of non-protocol URL schemes
155
156 In some cases, URL schemes do not have particular network protocols
157 associated with them, because their use is limited to contexts
158 where the access method is understood. This is the case, for
159 example, with the "cid" and "mid" URL schemes. For these URL
160 schemes, the specification should describe the notation of the
161 scheme and a complete mapping of the locator from its source.
162
163
164 2.2.5 Definition of URL schemes not associated with data resources
165
166 Most URL schemes locate Internet resources that correspond
167 to data objects that can be retrieved or modified. This is the
168 case with "ftp" and "http", for example. However, some URL schemes
169 do not; for example, the "mailto" URL scheme corresponds to an
170 Internet mail address.
171
172 If a new URL scheme does not locate resources that are data
173 objects, the properties of names in the new space must be clearly
174 defined.
175
176
177 2.2.6 Definition of operations
178
179 In some contexts (for example, HTML forms) it is possible to
180 specify any one of a list of operations to be performed on a
181 specific URL. (Outside forms, it is generally assumed to be
182 something you GET.)
183
184 The URL scheme definition should describe all well-defined
185 operations on the URL identifier, and what they are supposed to
186 do.
187
188 Some URL schemes (for example, "telnet") provide location
189 information for hooking onto bi-directional data streams, and don't
190 fit the "infoaccess" paradigm of most URLs very well; this should
191 be documented.
192
193 NOTE: It is perfectly valid to say that "no operation apart from
194 GET is defined for this URL". It is also valid to say that "there's
195 only one operation defined for this URL, and it's not very
196 GET-like". The important point is that what is defined on this type
197 is described.
198
199
200 2.3 Demonstrated utility
201
202 URL schemes should have demonstrated utility. New URL schemes are
203 expensive things to support. Often they require special code in
204 browsers, proxies, and/or servers. Having a lot of ways to say the
205 same thing needlessly complicates these programs without adding value
206 to the Internet.
207
208 The kinds of things that are useful include:
209
210 o Things that cannot be referred to in any other way.
211
212 o Things where it is much easier to get at them using this
213 scheme than (for instance) a proxy gateway.
214
215
216 2.3.1 Proxy into HTTP/HTML
217
218 One way to provide a demonstration of utility is via a gateway
219 which provides objects in the new scheme for clients using an
220 existing protocol. It is much easier to deploy gateways to a new
221 service than it is to deploy browsers that understand the new URL
222 object.
223
224 Things to look for when thinking about a proxy are:
225
226 o Is there a single global resolution mechanism whereby any proxy
227 can find the referenced object?
228 o If not, is there a way in which the user can find any object of
229 this type, and "run his own proxy"?
230 o Are the operations mappable one-to-one (or possibly using
231 modifiers) to HTTP operations?
232 o Is the type of returned objects well defined?
233 * as MIME content-types?
234 * as something that can be translated to HTML?
235 o Is there running code for a proxy?
236
237
238 2.4 Are there security considerations?
239
240 Above and beyond the security considerations of the base mechanism
241 a scheme builds upon, one must think of things that can happen in
242 the normal course of URL usage.
243
244 In particular:
245
246 o Does the user need to be warned when a URL of this scheme is
247 accessed without an explicit request (GET for the source of an
248 IMG tag, for instance)? This has implications for the design of
249 a proxy gateway, of course.
250
251 o Is it possible to fake URLs of this type that point to different
252 things in a dangerous way?
253
254 o Are there mechanisms for identifying the requester that can be
255 used or need to be used with this mechanism (the From: field in a
256 mailto: URL, or the Kerberos login required for AFS access in the
257 AFS: URL, for instance)?
258
259 o Does the mechanism contain passwords or other security
260 information that are passed inside the referring document in the
261 clear (as in the "ftp" URL, for instance)?
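
The last point can be illustrated with a short sketch. It shows that
a password embedded in an "ftp" URL is trivially recoverable by
anyone who sees the referring document, and one way a client might
mask it before displaying or logging the URL. The helper name
redact_password and the "***" placeholder are hypothetical; this is
not a prescribed mitigation.

```python
from urllib.parse import urlsplit, urlunsplit

def redact_password(url):
    """Return the URL with any embedded password replaced by '***'."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    # Split "user:password@host:port" at the last "@".
    userinfo, _, hostport = parts.netloc.rpartition("@")
    user = userinfo.split(":", 1)[0]
    netloc = f"{user}:***@{hostport}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )

url = "ftp://user:secret@ftp.example.com/pub/file.txt"
print(urlsplit(url).password)   # the password is readable in the clear
print(redact_password(url))
```

Masking on display does not protect the password in transit or in
the document itself, so a scheme definition still needs to document
the exposure.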
262
263
264 2.5 Does it start with UR?
265
266 Any scheme starting with the letters "U" and "R", in particular if
267 it attaches any of the meanings "uniform", "universal" or
268 "unifying" to the first letter, is going to cause intense debate,
269 and generate much heat (but maybe little light).
270
271 Any such proposal should either make sure that there is a large
272 consensus behind it that it will be the only scheme of its type, or
273 pick another name.
274
275
276 2.6 Non-considerations
277
278 Some issues that are often raised but are not relevant to new URL
279 schemes include the following.
280
281
282 2.6.1 Are all objects accessible?
283
284 Can all objects in the world that are validly identified by a
285 scheme be accessed by any UA implementing it?
286
287 Sometimes the answer will be yes and sometimes no; often it will
288 depend on factors (like firewalls or client configuration) not
289 directly related to the scheme itself.
290
291
292 3. Security considerations
293
294 New URL schemes are required to address all security considerations
295 in their definitions.
296
297
298 4. IANA considerations
299
300 The process by which URL scheme names are registered is specified
301 in RFC [URL-PROCESS].
302
303
304 5. References
305
306 RFC 2044 F. Yergeau, "UTF-8, A Transformation Format of Unicode
307 and ISO 10646", Alis Technologies, October 1996.
308
309 RFC [URI-SYNTAX] T. Berners-Lee, R. Fielding, L. Masinter, "Uniform
310 Resource Identifiers (URI): Generic Syntax and Semantics",
311 <draft-fielding-uri-syntax-*.txt>.
312
313 RFC [URL-PROCESS] R. Petke, "Registration Procedures for URL Scheme
314 Names", <draft-ietf-urlreg-procedures-*.txt>.
315
316
317 6. Authors' Addresses
318
319 Larry Masinter
320 Xerox Corporation
321 Palo Alto Research Center
322 3333 Coyote Hill Road
323 Palo Alto, CA 94304
324 Fax: +1-415-812-4333
325 EMail: masinter@parc.xerox.com
326
327 Harald Tveit Alvestrand
328 Maxware, Pirsenteret
329 N-7005 Trondheim
330 NORWAY
331 Voice: +47 73 54 57 00
332 EMail: harald.alvestrand@maxware.no
333
334 Dan Zigmond
335 WebTV Networks, Inc.
336 305 Lytton Avenue
337 Palo Alto, CA 94301
338 USA
339 Voice: +1-650-614-6071
340 EMail: djz@corp.webtv.net
341
342 Rich Petke
343 WorldCom Advanced Networks
344 5000 Britton Road
345 P. O. Box 5000
346 Hilliard, OH 43026-5000
347 Voice: +1-614-723-4157
348 Fax: +1-614-723-1333
349 EMail: rpetke@compuserve.net
350
