INTERNET-DRAFT                                          Larry Masinter
<draft-ietf-urlreg-guide-01.txt>                  Harald T. Alvestrand
December 9, 1997                                           Dan Zigmond

                    Guidelines for new URL Schemes

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as ``work in
progress.''

To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts
Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
(Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
Coast), or ftp.isi.edu (US West Coast).

Abstract

A Uniform Resource Locator (URL) is a compact string representation
of the location for a resource that is available via the Internet.
This document provides guidelines for the definition of new URL
schemes.

1. Introduction

A Uniform Resource Locator (URL) is a compact string representation
of the location for a resource that is available via the Internet.
[URI-SYNTAX] defines the general syntax and semantics of URIs, and,
by inclusion, URLs. URLs are designated by including a "scheme"
and then a "scheme-specific part". Many URL schemes are already
defined.

This document provides guidelines for the definition of new URL
schemes, for consideration by those who are defining, registering,
or evaluating such definitions.

The process by which new URL schemes are registered or defined
is not defined here.

2. Guidelines for new URL schemes

Because new URL schemes potentially complicate client software, new
schemes must have demonstrable utility and operability, as well as
compatibility with existing URL schemes. This section elaborates
these criteria.

2.1 Syntactic compatibility

New URL schemes should follow the same syntactic conventions as
existing schemes when appropriate.

2.1.1 Use of initial "//" for top level

Many proposed new URL schemes seem to use "://" as a kind of
indicator that what follows is a URL. However, the use of "//"
indicates a "top level" for schemes that support relative
URLs, and is unnecessary (and confusing) for schemes
that have no relative forms. URL schemes without relative
forms (such as mailto, cid, mid) do not use an initial "//".
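
The distinction can be illustrated with a minimal sketch. It assumes
Python's standard urllib.parse module as the parser (not something
this document specifies), and the helper name describe() and the
example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL and report whether it uses a "//" top level."""
    parts = urlsplit(url)
    scheme_specific = url.partition(":")[2]
    return {
        "scheme": parts.scheme,
        # True only when the scheme-specific part starts with "//".
        "top_level": scheme_specific.startswith("//"),
        # The component introduced by "//"; empty when it is absent.
        "authority": parts.netloc,
        "path": parts.path,
    }


if __name__ == "__main__":
    for example in ("http://www.example.com/a/b",
                    "mailto:user@example.com",
                    "cid:part1@example.com"):
        print(describe(example))
```

In this sketch the "http" URL reports a top level and an authority,
while the "mailto" and "cid" URLs report neither; everything after
their colon is treated as an opaque path.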

2.1.2 Compatibility with relative URLs

URL schemes should use the generic-URL syntax if they are intended
to be used with relative URLs. A description of the allowed
relative forms should be included in the scheme's definition.
Many applications use relative URLs extensively.

o Can it be parsed according to [URI-SYNTAX] - that is, if the
  tokens "//", "/", ";", "?" and "#" are used, do they have the
  meaning given in [URI-SYNTAX]?

o Does it make sense to use it in relative URLs like those
  [URI-SYNTAX] specifies?

o If something is designed to be broken into pieces, does it
  document what those pieces are, why it should be broken in this
  way, and why the breaks aren't where [URI-SYNTAX] says that they
  usually should be?

o If it has a hierarchy, does it go left-to-right and with slash
  separators like [URI-SYNTAX]? If not, why not?
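
As an illustration of the relative forms in question, the following
sketch resolves a few relative references against a base URL. It
assumes Python's urllib.parse.urljoin, which follows a later revision
of the generic syntax than the [URI-SYNTAX] draft cited here; the base
URL and the helper name resolve_all() are hypothetical.

```python
from urllib.parse import urljoin

BASE = "http://www.example.com/a/b/c.html"


def resolve_all(base, references):
    """Resolve each relative reference against the base URL."""
    return {ref: urljoin(base, ref) for ref in references}


if __name__ == "__main__":
    resolved = resolve_all(BASE, ["e.html", "../d.html", "/top.html",
                                  "//other.example.org/x"])
    for ref, absolute in resolved.items():
        print(ref, "->", absolute)
```

A scheme whose "/" and "//" carry some other meaning could not be
resolved this way, which is why the questions above ask for any
departure from the generic syntax to be documented.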

2.2 Is the scheme well defined?

It is important that the semantics of the "resource" that a URL
"locates" be well defined. This might mean different things
depending on the nature of the URL scheme.

2.2.1 Clear mapping from other name spaces

In many cases, new URL schemes are defined as ways to translate
other protocols and name spaces into the general framework of
URLs. The "ftp" URL scheme translates from the FTP protocol, while
the "mid" URL scheme translates from the Message-ID field of
messages.

In either case, the description of the mapping must be complete,
must describe how character sets get encoded or not in URLs, must
describe exactly how all legal values of the base standard can be
represented using the URL scheme, and exactly which modifiers,
alternate forms and other artifacts from the base standards are
included or not included. These requirements are elaborated
below.
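
A complete mapping is one that can be applied in both directions
without loss. The sketch below illustrates this for a Message-ID
value. It assumes that the mapping drops the enclosing angle brackets
and %HH-encodes everything except letters, digits, "-", ".", "_",
"~" and "@"; these details, the function names, and the example
Message-ID are assumptions for illustration and are not taken from
the "mid" scheme definition.

```python
from urllib.parse import quote, unquote


def message_id_to_mid(message_id):
    """Map a Message-ID field value such as "<id@host>" to a mid URL."""
    if not (message_id.startswith("<") and message_id.endswith(">")):
        raise ValueError("expected a Message-ID in angle brackets")
    # Assumed encoding rule: keep "@" literal, escape other unsafe octets.
    return "mid:" + quote(message_id[1:-1], safe="@")


def mid_to_message_id(url):
    """Invert message_id_to_mid()."""
    if url[:4].lower() != "mid:":
        raise ValueError("not a mid URL")
    return "<" + unquote(url[4:]) + ">"


if __name__ == "__main__":
    original = "<950124.162336@example.com>"
    url = message_id_to_mid(original)
    print(url)
    print(mid_to_message_id(url) == original)
```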

2.2.2 URL schemes associated with network protocols

Most new URL schemes are associated with network resources that
have one or several network protocols that can access them. The
'ftp', 'news', and 'http' schemes are of this nature. For such
schemes, the specification should completely describe how URLs are
translated into protocol actions in sufficient detail to make the
access of the network resource unambiguous. If an implementation
of the URL scheme requires some configuration, the configuration
elements must be clearly identified. (For example, the 'news'
scheme, if implemented using NNTP, requires configuration of the
NNTP server.)
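
The kind of translation meant here can be sketched for the 'ftp'
scheme. The example is loosely modelled on the conventional reading
of ftp URLs (one CWD per directory segment, then RETR for the final
segment) and is not a complete or authoritative mapping. The function
name ftp_actions(), the "CONNECT" pseudo-action, and the default user
"anonymous" are assumptions made for illustration.

```python
from urllib.parse import unquote, urlsplit


def ftp_actions(url):
    """Translate an ftp URL into an ordered list of protocol actions."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp URL: " + url)
    # urlsplit leaves ";type=<code>" inside the path, so split it off.
    path, _, typecode = parts.path.partition(";type=")
    # The "/" after the host is a delimiter, not part of a segment.
    segments = [unquote(s) for s in path[1:].split("/")]
    *directories, name = segments
    # "CONNECT" is a hypothetical label for opening the control
    # connection; it is not an FTP command.
    actions = ["CONNECT %s:%d" % (parts.hostname, parts.port or 21)]
    actions.append("USER " + unquote(parts.username or "anonymous"))
    if parts.password is not None:
        actions.append("PASS " + unquote(parts.password))
    actions += ["CWD " + d for d in directories]
    if typecode:
        actions.append("TYPE " + typecode.upper())
    if name:
        actions.append("RETR " + name)
    return actions


if __name__ == "__main__":
    for action in ftp_actions(
            "ftp://ftp.example.com/pub/docs/file.txt;type=i"):
        print(action)
```

A scheme definition that leaves steps like these implicit forces each
implementor to guess, which is what the requirement above is meant to
prevent.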

2.2.3 Character encoding

When describing URL schemes in which (some of) the elements of
the URL are actually representations of sequences of characters,
care should be taken not to introduce unnecessary variety in the
ways in which characters are encoded into octets and then into
URL characters. Unless there is some compelling reason for a
particular scheme to do otherwise, translating character sequences
into UTF-8 [RFC2044] and then subsequently using the %HH encoding
for unsafe octets is recommended.
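
The recommended two-step encoding can be sketched as follows. The
sketch assumes Python's urllib.parse.quote, whose set of octets left
unescaped (letters, digits, "-", ".", "_" and "~") may differ from
the set a particular scheme considers safe; the function names are
hypothetical.

```python
from urllib.parse import quote, unquote_to_bytes


def encode_component(text):
    """Encode characters as UTF-8 octets, then %HH-escape unsafe ones."""
    octets = text.encode("utf-8")   # step 1: characters -> octets
    return quote(octets, safe="")   # step 2: octets -> URL characters


def decode_component(encoded):
    """Reverse the two steps: %HH -> octets -> characters."""
    return unquote_to_bytes(encoded).decode("utf-8")


if __name__ == "__main__":
    print(encode_component("caf\u00e9"))
    print(decode_component("caf%C3%A9") == "caf\u00e9")
```

Here the single character U+00E9 becomes the two UTF-8 octets C3 A9,
each of which is then written as a %HH triple.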

2.2.4 Definition of non-protocol URL schemes

In some cases, URL schemes do not have particular network protocols
associated with them, because their use is limited to contexts
where the access method is understood. This is the case, for
example, with the "cid" and "mid" URL schemes. For these URL
schemes, the specification should describe the notation of the
scheme and a complete mapping of the locator from its source.

2.2.5 Definition of URL schemes not associated with data resources

Most URL schemes locate Internet resources that correspond
to data objects that can be retrieved or modified. This is the
case with "ftp" and "http", for example. However, some URL schemes
do not; for example, the "mailto" URL scheme corresponds to an
Internet mail address.

If a new URL scheme does not locate resources that are data
objects, the properties of names in the new space must be clearly
defined.

2.2.6 Definition of operations

In some contexts (for example, HTML forms) it is possible to
specify any one of a list of operations to be performed on a
specific URL. (Outside forms, it is generally assumed to be
something you GET.)

The URL scheme definition should describe all well-defined
operations on the URL identifier, and what they are supposed to
do.

Some URL schemes (for example, "telnet") provide location
information for hooking onto bidirectional data streams, and don't
fit the "infoaccess" paradigm of most URLs very well; this should
be documented.

NOTE: It is perfectly valid to say that "no operation apart from
GET is defined for this URL". It is also valid to say that "there's
only one operation defined for this URL, and it's not very
GET-like". The important point is that what is defined on this type
is described.

2.3 Demonstrated utility

URL schemes should have demonstrated utility. New URL schemes are
expensive things to support. Often they require special code in
browsers, proxies, and/or servers. Having a lot of ways to say the
same thing needlessly complicates these programs without adding
value to the Internet.

The kinds of things that are useful include:

o Things that cannot be referred to in any other way.

o Things where it is much easier to get at them using this
  scheme than (for instance) a proxy gateway.

2.3.1 Proxy into HTTP/HTML

One way to provide a demonstration of utility is via a gateway
which provides objects in the new scheme for clients using an
existing protocol. It is much easier to deploy gateways to a new
service than it is to deploy browsers that understand the new URL
object.

Things to look for when thinking about a proxy are:

o Is there a single global resolution mechanism whereby any proxy
  can find the referenced object?
o If not, is there a way in which the user can find any object of
  this type, and "run his own proxy"?
o Are the operations mappable one-to-one (or possibly using
  modifiers) to HTTP operations?
o Is the type of returned objects well defined?
  * as MIME content-types?
  * as something that can be translated to HTML?
o Is there running code for a proxy?

2.4 Are there security considerations?

Above and beyond the security considerations of the base mechanism
a scheme builds upon, one must think of things that can happen in
the normal course of URL usage.

In particular:

o Does the user need to be warned when a URL of this type is
  accessed without an explicit request (GET for the source of an
  IMG tag, for instance)? This has implications for the design of a
  proxy gateway, of course.

o Is it possible to fake URLs of this type that point to different
  things in a dangerous way?

o Are there mechanisms for identifying the requester that can be
  used or need to be used with this mechanism (the From: field in a
  mailto: URL, or the Kerberos login required for AFS access in the
  AFS: URL, for instance)?

o Does the mechanism contain passwords or other security
  information that are passed inside the referring document in the
  clear (as in the "ftp" URL, for instance)?
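
The last question can be made concrete with a short sketch showing
that a password embedded in an "ftp" URL is readable by anything that
sees the referring document. It assumes Python's urllib.parse; the
function names, host name, and credentials are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def exposed_password(url):
    """Return the password carried in the clear by a URL, or None."""
    return urlsplit(url).password


def strip_credentials(url):
    """Return the URL with any "user:password@" prefix removed."""
    parts = urlsplit(url)
    # Everything after the last "@" in the authority is host[:port].
    host_port = parts.netloc.rpartition("@")[2]
    return urlunsplit((parts.scheme, host_port, parts.path,
                       parts.query, parts.fragment))


if __name__ == "__main__":
    url = "ftp://alice:secret@ftp.example.com/pub/file.txt"
    print(exposed_password(url))
    print(strip_credentials(url))
```

Stripping the credentials before a URL is logged or displayed is one
possible mitigation; it does not help once the URL has already been
published in a document.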

2.5 Does it start with UR?

Any scheme starting with the letters "U" and "R", in particular if
it attaches any of the meanings "uniform", "universal" or
"unifying" to the first letter, is going to cause intense debate,
and generate much heat (but maybe little light).

Any such proposal should either make sure that there is a large
consensus behind it that it will be the only scheme of its type, or
pick another name.

2.6 Non-considerations

Some issues that are often raised but are not relevant to new URL
schemes include the following.

2.6.1 Are all objects accessible?

Can all objects in the world that are validly identified by a
scheme be accessed by any UA implementing it?

Sometimes the answer will be yes and sometimes no; often it will
depend on factors (like firewalls or client configuration) not
directly related to the scheme itself.

3. Security considerations

New URL schemes are required to address all security considerations
in their definitions.

4. IANA considerations

The process by which URL schemes are defined and registered
is not defined in this document.

5. References

[RFC2044] F. Yergeau, "UTF-8, A Transformation Format of Unicode
and ISO 10646", RFC 2044, Alis Technologies, October 1996.

[URI-SYNTAX] T. Berners-Lee, R. Fielding, L. Masinter, "Uniform
Resource Identifiers (URI): Generic Syntax and Semantics",
<draft-fielding-uri-syntax-*>.

6. Authors' Addresses

Larry Masinter
Xerox Corporation
Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, CA 94304
Fax: +1-415-812-4333
EMail: masinter@parc.xerox.com

Harald T. Alvestrand
UNINETT A/S
Postboks 6683 Elgeseter 7002
Trondheim, Norway
Tel: +47 73 59 70 94
EMail: Harald.T.Alvestrand@uninett.no

Dan Zigmond
Wink Communications
1001 Marina Village Parkway
Alameda CA 94610
Fax: +1-510-337-2960
Phone: +1-510-337-6359
EMail: dan.zigmond@wink.com