Contents of /webroot/www/2004/id/draft-ietf-urlreg-guide-03.txt

Revision 1.1
Tue Jun 15 08:04:06 2004 UTC (20 years, 10 months ago) by wakaba
Branch: MAIN
CVS Tags: HEAD
File MIME type: text/plain
New

2     INTERNET-DRAFT Larry Masinter
3     <draft-ietf-urlreg-guide-03.txt> Harald T. Alvestrand
4     August 7, 1998 Dan Zigmond
5     Rich Petke
6    
7    
8     Guidelines for new URL Schemes
9    
10    
11     Status of this Memo
12    
13     This document is an Internet-Draft. Internet-Drafts are working
14     documents of the Internet Engineering Task Force (IETF), its
15     areas, and its working groups. Note that other groups may also
16     distribute working documents as Internet-Drafts.
17    
18     Internet-Drafts are draft documents valid for a maximum of six
19     months and may be updated, replaced, or obsoleted by other
20     documents at any time. It is inappropriate to use Internet-
21     Drafts as reference material or to cite them other than as
22     "work in progress."
23    
24     To view the entire list of current Internet-Drafts, please check
25     the "1id-abstracts.txt" listing contained in the Internet-Drafts
26     Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
27     (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
28     (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
29     (US West Coast).
30    
31     Distribution of this memo is unlimited.
32    
33     This Internet Draft expires February 7, 1999.
34    
35     Copyright Notice
36    
37     Copyright (C) The Internet Society (1998). All Rights Reserved.
38    
39     Abstract
40    
41     A Uniform Resource Locator (URL) is a compact string representation
42     of the location for a resource that is available via the Internet.
43     This document provides guidelines for the definition of new URL
44     schemes.
45    
46    
47     1. Introduction
48    
49     A Uniform Resource Locator (URL) is a compact string representation
50     of the location for a resource that is available via the Internet.
51     RFC [URI-SYNTAX] [1] defines the general syntax and semantics of URIs,
52     and, by inclusion, URLs. URLs are designated by including a
53     "<scheme>:" and then a "<scheme-specific-part>". Many URL schemes
54     are already defined.
55    
56     This document provides guidelines for the definition of new URL
57     schemes, for consideration by those who are defining and
58     registering or evaluating those definitions.
59    
60     The process by which new URL schemes are registered is defined in
61     RFC [URL-PROCESS] [2].
62    
63    
64     2. Guidelines for new URL schemes
65    
66     Because new URL schemes potentially complicate client software, new
67     schemes must have demonstrable utility and operability, as well as
68     compatibility with existing URL schemes. This section elaborates
69     these criteria.
70    
71    
72     2.1 Syntactic compatibility
73    
74     New URL schemes should follow the same syntactic conventions as
75     existing schemes when appropriate.
76    
77    
78     2.1.1 Improper use of "//" following "<scheme>:"
79    
80     Contrary to some examples set in past years, the use of double
81     slashes as the first component of the <scheme-specific-part> of a
82     URL is not simply an artistic indicator that what follows is a URL:
83     Double slashes are used ONLY when the syntax of the URL's
84     <scheme-specific-part> contains a hierarchical structure as
85     described in RFC [URI-SYNTAX]. In URLs from such schemes, the use
86     of double slashes indicates that what follows is the top
87     hierarchical element for a naming authority. (See section 3 of RFC
88     [URI-SYNTAX] for more details.) URL schemes which do not contain a
89     conformant hierarchical structure in their <scheme-specific-part>
90     should not use double slashes following the "<scheme>:" string.
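
As an illustration only (it is not part of these guidelines), the
distinction can be sketched with Python's standard urllib.parse module.
A URL whose <scheme-specific-part> is hierarchical carries a naming
authority after "//", while a URL such as a "mailto" URL has none. The
helper name has_authority and the example URLs are assumptions made
for this sketch.

```python
from urllib.parse import urlsplit


def has_authority(url: str) -> bool:
    """Return True if the URL carries a naming authority introduced by '//'."""
    return urlsplit(url).netloc != ""


# Hierarchical scheme: "//" introduces the naming authority.
hier = urlsplit("http://www.example.com/docs/guide.txt")

# Non-hierarchical scheme: no "//", so no authority component at all.
flat = urlsplit("mailto:user@example.com")

print(hier.netloc, hier.path)
print(repr(flat.netloc), flat.path)
```

A scheme without this structure would gain nothing from "//", since a
generic parser would try to read whatever follows it as an authority.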
91    
92    
93     2.1.2 Compatibility with relative URLs
94    
95     URL schemes should use the generic URL syntax if they are intended
96     to be used with relative URLs. A description of the allowed
97     relative forms should be included in the scheme's definition.
98     Many applications use relative URLs extensively. Specifically,
99    
100     o Can the scheme be parsed according to RFC [URI-SYNTAX] - that is,
101     if the tokens "//", "/", ";", "?" and "#" are used, do they have
102     the meaning given in RFC [URI-SYNTAX]?
103    
104     o Does it make sense to use the scheme in relative URLs like those
105     RFC [URI-SYNTAX] specifies?
106    
107     o If the scheme syntax is designed to be broken into pieces, does
108     the documentation for the scheme's syntax specify what those
109     pieces are, why it should be broken in this way, and why the
110     breaks aren't where RFC [URI-SYNTAX] says that they usually should
111     be?
112    
113     o If the scheme has a hierarchy, does it go left-to-right and with
114     slash separators like RFC [URI-SYNTAX]? If not, why not?
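
The questions above can be made concrete with a small sketch of
relative URL resolution for a scheme that follows the generic syntax.
It uses urljoin from Python's standard urllib.parse module; the base
URL, the relative references and the helper name resolve are
assumptions chosen for illustration.

```python
from urllib.parse import urljoin

# Hypothetical base document; any hierarchical URL would do.
BASE = "http://www.example.com/a/b/c.html"


def resolve(relative: str, base: str = BASE) -> str:
    """Resolve a relative reference against a base URL of the same scheme."""
    return urljoin(base, relative)


# Each reference relies on "/", "//" or "#" having its generic meaning.
for ref in ("d.html", "../e.html", "/f.html", "//other.example.org/g", "#top"):
    print(f"{ref!r:26} -> {resolve(ref)}")
```

A scheme that gives "/" or "#" a different meaning could not be
resolved this way, which is why its definition needs to state which
relative forms, if any, are allowed.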
115    
116    
117     2.2 Is the scheme well defined?
118    
119     It is important that the semantics of the "resource" that a URL
120     "locates" be well defined. This might mean different things
121     depending on the nature of the URL scheme.
122    
123    
124     2.2.1 Clear mapping from other name spaces
125    
126     In many cases, new URL schemes are defined as ways to translate
127     other protocols and name spaces into the general framework of
128     URLs. The "ftp" URL scheme translates from the FTP protocol, while
129     the "mid" URL scheme translates from the Message-ID field of
130     messages.
131    
132     In either case, the description of the mapping must be complete,
133     must describe how character sets get encoded or not in URLs, must
134     describe exactly how all legal values of the base standard can be
135     represented using the URL scheme, and exactly which modifiers,
136     alternate forms and other artifacts from the base standards are
137     included or not included. These requirements are elaborated
138     below.
139    
140    
141     2.2.2 URL schemes associated with network protocols
142    
143     Most new URL schemes are associated with network resources that
144     have one or several network protocols that can access them. The
145     'ftp', 'news', and 'http' schemes are of this nature. For such
146     schemes, the specification should completely describe how URLs are
147     translated into protocol actions in sufficient detail to make the
148     access of the network resource unambiguous. If an implementation
149     of the URL scheme requires some configuration, the configuration
150     elements must be clearly identified. (For example, the 'news'
151     scheme, if implemented using NNTP, requires configuration of the
152     NNTP server.)
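
To suggest the level of detail meant here, the following sketch turns
an "ftp" URL into an ordered list of protocol actions. It is a
simplified illustration, not the normative "ftp" mapping: the function
name ftp_actions, the fixed binary transfer type and the anon_password
parameter (standing in for a configuration element) are assumptions.

```python
from urllib.parse import unquote, urlsplit


def ftp_actions(url: str, anon_password: str = "guest@example.com") -> list[str]:
    """Translate an ftp URL into an ordered list of protocol actions (sketch)."""
    parts = urlsplit(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError("expected ftp://host/path")

    # Absent user information means an anonymous login; the password used
    # in that case is a configuration element of the implementation.
    user = unquote(parts.username) if parts.username else "anonymous"
    password = unquote(parts.password) if parts.password else anon_password

    # The "/" after the host is a separator, so drop exactly one slash,
    # then decode each path segment separately.
    *dirs, name = [unquote(s) for s in parts.path[1:].split("/")]

    actions = [f"connect {parts.hostname}:{parts.port or 21}",
               f"USER {user}", f"PASS {password}"]
    actions += [f"CWD {d}" for d in dirs]
    # Assumption: binary transfer; an empty final segment lists the directory.
    actions += ["TYPE I", f"RETR {name}" if name else "LIST"]
    return actions


for step in ftp_actions("ftp://ftp.example.com/pub/docs/readme.txt"):
    print(step)
```

Writing the mapping out step by step exposes the points a specification
has to settle, such as the default login and the transfer type.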
153    
154    
155     2.2.3 Character encoding
156    
157     When describing URL schemes in which (some of) the elements of
158     the URL are actually representations of sequences of characters,
159     care should be taken not to introduce unnecessary variety in the
160     ways in which characters are encoded into octets and then into
161     URL characters. Unless there is some compelling reason for a
162     particular scheme to do otherwise, translating character sequences
163     into UTF-8 (RFC 2044) [3] and then subsequently using the %HH
164     encoding for unsafe octets is recommended.
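
The recommended two-step encoding can be sketched as follows:
characters are first translated into UTF-8 octets, and every octet
outside a safe set is then written as %HH. The set of octets left
unescaped here (letters, digits and "-", "_", ".", "~") is an
assumption chosen to match Python's urllib.parse.quote; a scheme
definition would state its own set. The name encode_component is
hypothetical.

```python
from urllib.parse import quote

# Assumed safe set: octets that are emitted without %HH escaping.
SAFE_OCTETS = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~"
)


def encode_component(text: str) -> str:
    """Encode characters as UTF-8 octets, then escape unsafe octets as %HH."""
    octets = text.encode("utf-8")                     # characters -> octets
    return "".join(
        chr(o) if o in SAFE_OCTETS else f"%{o:02X}"   # octets -> URL characters
        for o in octets
    )


sample = "caf\u00e9 menu"
print(encode_component(sample))
# The standard library performs the same two steps in one call.
print(quote(sample, safe=""))
```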
165    
166    
167     2.2.4 Definition of non-protocol URL schemes
168    
169     In some cases, URL schemes do not have particular network protocols
170     associated with them, because their use is limited to contexts
171     where the access method is understood. This is the case, for
172     example, with the "cid" and "mid" URL schemes. For these URL
173     schemes, the specification should describe the notation of the
174     scheme and a complete mapping of the locator from its source.
175    
176    
177     2.2.5 Definition of URL schemes not associated with data resources
178    
179     Most URL schemes locate Internet resources that correspond
180     to data objects that can be retrieved or modified. This is the
181     case with "ftp" and "http", for example. However, some URL schemes
182     do not; for example, the "mailto" URL scheme corresponds to an
183     Internet mail address.
184    
185     If a new URL scheme does not locate resources that are data
186     objects, the properties of names in the new space must be clearly
187     defined.
188    
189    
190     2.2.6 Definition of operations
191    
192     In some contexts (for example, HTML forms) it is possible to
193     specify any one of a list of operations to be performed on a
194     specific URL. (Outside forms, it is generally assumed to be
195     something you GET.)
196    
197     The URL scheme definition should describe all well-defined
198     operations on the URL identifier, and what they are supposed to
199     do.
200    
201     Some URL schemes (for example, "telnet") provide location
202     information for hooking onto bi-directional data streams, and don't
203     fit the "infoaccess" paradigm of most URLs very well; this should
204     be documented.
205    
206     NOTE: It is perfectly valid to say that "no operation apart from
207     GET is defined for this URL". It is also valid to say that "there's
208     only one operation defined for this URL, and it's not very
209     GET-like". The important point is that the operations defined on this
210     type of URL are described.
211    
212    
213     2.3 Demonstrated utility
214    
215     URL schemes should have demonstrated utility. New URL schemes are
216     expensive things to support. Often they require special code in
217     browsers, proxies, and/or servers. Having a lot of ways to say the
218     same thing needlessly complicates these programs without adding value
219     to the Internet.
220    
221     The kinds of things that are useful include:
222    
223     o Things that cannot be referred to in any other way.
224    
225     o Things where it is much easier to get at them using this scheme
226     than (for instance) a proxy gateway.
227    
228    
229     2.3.1 Proxy into HTTP/HTML
230    
231     One way to provide a demonstration of utility is via a gateway
232     which provides objects in the new scheme for clients using an
233     existing protocol. It is much easier to deploy gateways to a new
234     service than it is to deploy browsers that understand the new URL
235     object.
236    
237     Things to look for when thinking about a proxy are:
238    
239     o Is there a single global resolution mechanism whereby any proxy
240     can find the referenced object?
241     o If not, is there a way in which the user can find any object of
242     this type, and "run his own proxy"?
243     o Are the operations mappable one-to-one (or possibly using
244     modifiers) to HTTP operations?
245     o Is the type of returned objects well defined?
246     - as MIME content-types?
247     - as something that can be translated to HTML?
248     o Is there running code for a proxy?
249    
250    
251     2.4 Are there security considerations?
252    
253     Above and beyond the security considerations of the base mechanism
254     a scheme builds upon, one must think of things that can happen in
255     the normal course of URL usage.
256    
257     In particular:
258    
259     o Does the user need to be warned that an access is happening
260     without an explicit request (GET for the source of an IMG tag,
261     for instance)? This has implications for the design of a proxy
262     gateway, of course.
263    
264     o Is it possible to fake URLs of this type that point to different
265     things in a dangerous way?
266    
267     o Are there mechanisms for identifying the requester that can be
268     used or need to be used with this mechanism (the From: field in a
269     mailto: URL, or the Kerberos login required for AFS access in the
270     AFS: URL, for instance)?
271    
272     o Does the mechanism contain passwords or other security
273     information that are passed inside the referring document in the
274     clear (as in the "ftp" URL, for instance)?
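
The last question can be illustrated with a short check for passwords
carried in the clear inside a URL. This is a sketch using Python's
standard urllib.parse module; the function names exposes_password and
redact_password, the "****" placeholder and the example URL are
assumptions, not part of any scheme definition.

```python
from urllib.parse import urlsplit, urlunsplit


def exposes_password(url: str) -> bool:
    """Return True if the URL embeds a password in its user information."""
    return urlsplit(url).password is not None


def redact_password(url: str) -> str:
    """Return the URL with any embedded password replaced by '****'."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    # The netloc has the form user:password@host[:port]; keep all but the password.
    userinfo, _, hostport = parts.netloc.rpartition("@")
    user = userinfo.split(":", 1)[0]
    return urlunsplit(parts._replace(netloc=f"{user}:****@{hostport}"))


url = "ftp://alice:secret@ftp.example.com/file.txt"
print(exposes_password(url))
print(redact_password(url))
```

Anyone who can read the referring document can read such a password,
so a scheme that allows it should say so in its security
considerations.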
275    
276    
277     2.5 Does it start with UR?
278    
279     Any scheme starting with the letters "U" and "R", in particular if
280     it attaches any of the meanings "uniform", "universal" or
281     "unifying" to the first letter, is going to cause intense debate,
282     and generate much heat (but maybe little light).
283    
284     Any such proposal should either make sure that there is a large
285     consensus behind it that it will be the only scheme of its type, or
286     pick another name.
287    
288    
289     2.6 Non-considerations
290    
291     Some issues that are often raised but are not relevant to new URL
292     schemes include the following.
293    
294    
295     2.6.1 Are all objects accessible?
296    
297     Can all objects in the world that are validly identified by a
298     scheme be accessed by any UA implementing it?
299    
300     Sometimes the answer will be yes and sometimes no; often it will
301     depend on factors (like firewalls or client configuration) not
302     directly related to the scheme itself.
303    
304    
305     3. Security considerations
306    
307     New URL schemes are required to address all security considerations
308     in their definitions.
309    
310    
311     4. IANA considerations
312    
313     The process by which URL scheme names are registered is specified
314     in RFC [URL-PROCESS].
315    
316    
317     5. References
318    
319     [1] Berners-Lee, T., Fielding, R., Masinter, L., "Uniform Resource
320     Identifiers (URI): Generic Syntax", RFC [URI-SYNTAX], August 1998
321    
322     [2] Petke, R., "Registration Procedures for URL Scheme Names",
323     RFC [URL-PROCESS], August 1998
324    
325     [3] Yergeau, F., "UTF-8, A Transformation Format of Unicode and ISO
326     10646", RFC 2044, October 1996.
327    
328    
329     6. Authors' Addresses
330    
331     Larry Masinter
332     Xerox Corporation
333     Palo Alto Research Center
334     3333 Coyote Hill Road
335     Palo Alto, CA 94304
336     Fax: +1-415-812-4333
337     EMail: masinter@parc.xerox.com
338    
339     Harald Tveit Alvestrand
340     Maxware, Pirsenteret
341     N-7005 Trondheim
342     NORWAY
343     Voice: +47 73 54 57 00
344     EMail: harald.alvestrand@maxware.no
345    
346     Dan Zigmond
347     WebTV Networks, Inc.
348     305 Lytton Avenue
349     Palo Alto, CA 94301
350     USA
351     Voice: +1-650-614-6071
352     EMail: djz@corp.webtv.net
353    
354     Rich Petke
355     WorldCom Advanced Networks
356     5000 Britton Road
357     P. O. Box 5000
358     Hilliard, OH 43026-5000
359     Voice: +1-614-723-4157
360     Fax: +1-614-723-1333
361     EMail: rpetke@compuserve.net
362    
