1 INTERNET-DRAFT
2
3 The Path URN Specification
4 **************************
5
6 draft-ietf-uri-urn-path-01.txt
7 Expires 17 Jan 96
8
9 Daniel LaLiberte <liberte@ncsa.uiuc.edu>
10 Michael Shapiro <mshapiro@ncsa.uiuc.edu>
11
12 Status of this memo
13 ===================
14
15 This document is an Internet-Draft. Internet-Drafts are working
16 documents of the Internet Engineering Task Force (IETF), its
17 areas, and its working groups. Note that other groups may also
18 distribute working documents as Internet-Drafts.
19
20 Internet-Drafts are draft documents valid for a maximum of six
21 months and may be updated, replaced, or obsoleted by other
22 documents at any time. It is inappropriate to use Internet-Drafts as
23 reference material or to cite them other than as "work in progress."
24
25 To learn the current status of any Internet-Draft, please check the
26 "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
27 Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
28 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
29 ftp.isi.edu (US West Coast).
30
31 This Internet Draft expires 17 Jan 96.
32
33 Last modified: Wed Jul 26 09:56:05 CDT 1995
34
35 This document is also available in HTML at:
36
37 <URL: http://union.ncsa.uiuc.edu/~liberte/www/path.html>
38
39 Modifications of that document relative to the internet draft are
40 shown in italic font.
41
42 Abstract
43 ========
44
45 A new "path" URN scheme is proposed that defines a uniformly
46 hierarchical name space. This URN scheme supports dynamic
47 relocation and replication of resources. Existing DNS technology is
48 used to resolve a path into sets of equivalent URLs, and then one
49 URL is resolved into the named resource.
50
51 Introduction
52 ============
53
54 The path scheme defines a uniformly hierarchical name space
55 where a path URN is a sequence of components and an optional
56 opaque string. An example path URN is:
57
58 path:/A/B/C/doc.html
59
60 The path is /A/B/C and the opaque string is doc.html.
61
62 The significant features of the path URN scheme include the
63 following:
64
65 Highly Scalable
66
67 The resolution process is highly scalable due to several
68 factors. Resolution is distributed as much as the named
69 resources themselves are. (This also permits the resolution
70 of names to be handled by servers that are motivated to
71 maintain the service because they also serve the named
72 resources.) The public hierarchy enables clients to make
73 use of caches of resolver locations.
74
75 Dynamically Reconfigurable
76
77 The resolution process is reconfigurable to support
78 additional scalability and persistence of names in the event
79 of relocations. The responsibility for resolution of a part of a
80 name space may be delegated to another resolver or
81 several parts of the name space may be recombined and
82 resolved by a single server.
83
84 Built-in Fallback Mechanism
85
86 The resolution process has a built-in fallback mechanism in
87 case the original resolver is uncooperative in forwarding
88 references to resources that have moved.
89
90 Easily Deployed
91
92 The resolution and name assignment mechanisms are easily
93 deployable since they use existing DNS technology and URL
94 resolution schemes such as HTTP and FTP. Only a small
95 amount of path-specific code is added to clients or proxy
96 servers. Existing URLs may be automatically mapped to path
97 URNs.
98
99 Resolves to Resource
100
101 A path resolves first into a list of sets of equivalent URLs,
102 and then second, that list is resolved into the named
103 resource using one of the URLs. The type of the resource is
104 identified by the protocol of the particular URL that is used; if
105 metadata for the resource is desired instead, the particular
106 URL scheme may provide it. The path URN scheme does not
107 depend on URCs.
108
109 In this document, we first describe the name assignment and
110 resolution process conceptually. This is followed by a more detailed
111 description of the protocol, the encoding rules, and the compliance
112 to URN requirements.
113
114 Name Assignment
115 ===============
116
117 Names of resources are assigned by naming authorities that are
118 responsible for a subtree of the name space, and naming
119 authorities may delegate naming responsibility to sub-authorities.
120 The top-most naming authority in the hierarchy is known as the root
121 naming authority. Each naming authority corresponds to a name
122 resolution service; a name resolution service may be shared by
123 several naming authorities.
124
125 A naming authority may create any new name for a resource as
126 long as the encoding rules described below are met. Once a name
127 has been assigned, it should never be assigned again for a different
128 resource, as per the URN requirements. Naming authorities are
129 responsible for meeting this uniqueness requirement.
130
131 A path name may be declared by the appropriate naming authority
132 as the name of a collection of resources. Such a name must end
133 with a final "/". The resource that a collection name resolves into is
134 undefined by the path scheme protocol. Not all prefixes of path
135 names are guaranteed to be names of collections.
136
137 An automatic mapping from most FTP and HTTP URLs to path
138 URNs is feasible and will speed deployment. However, the
139 generated names may not be appropriate for some HTTP URLs due
140 to encoding requirements or misleading semantics, so some
141 manual intervention or customization of the generation process will
142 be required. Since the process is repeatable, the same generator
143 service may be used as a URN lookup service given URLs. The
144 generator service is not described in this document.
145
146 The Name Resolution Process
147 ===========================
148
149 The resolution process is described in two steps. The first step
150 resolves the name into an ordered list of URL-sets. The second
151 step attempts to resolve URLs from successive sets in the list until
152 the resolution succeeds or the list is exhausted.
153
154 The first step in the resolution process involves traversing the
155 components of the path, left to right. Each component in the path
156 (except the final opaque string) has two functions. One function is to
157 provide a context for resolving the remainder of the path. The
158 context for resolving the first component is the resolver for the root
159 naming authority. The other function is to optionally provide a set of
160 equivalent URLs (called a URL-set) constructed from URL-prefixes
161 and the remainder of the path. All URLs in a set are equivalent in
162 that each should resolve to the "same" resource, if it resolves at all.
163
164 The first step ends when no more URL-sets are found. The result is
165 a list of URL-sets ordered from most-specific to least-specific, that
166 is, in the reverse of the order in which they were discovered.
167
168 The second step is to resolve the list of URL-sets to the named
169 resource. A URL (which may be a URN) is selected from the first set
170 (e.g., randomly) and resolution of the URL is attempted. Any of the
171 URLs may be URNs, even other path URNs. If the resolution fails
172 because the URL service is unavailable (e.g. connection failure),
173 another URL is selected from the set, until none are left; retries with
174 exponential backoff may then follow, or the path resolution process
175 may be declared a failure. Alternatively, if the resolution of a URL
176 fails because the URL is unknown, then the process is repeated
177 with the next set in the list. The process is repeated until the
178 resolution succeeds or the list is exhausted (which implies
179 resolution failure).
180
181 If the resolution of a URL results in a redirection to yet another URL,
182 then that redirection should be followed to determine if it succeeds
183 before declaring that the first URL has been resolved. A failure to
184 resolve the redirection should be treated as the same kind of failure
185 as a failure to resolve the first URL.
186
187 Reconfiguration of the Resolution Service
188 +++++++++++++++++++++++++++++++++++++++++
189
190 The resolution process may be dynamically reconfigured in a
191 number of ways to meet the requirements of scalability and
192 persistence.
193
194 o Resolvers may delegate part of their resolution service to
195 sub-resolvers. Since the most specific URL-set is used
196 first, the sub-resolver will have the first chance to resolve
197 the URLs. Note, however, that a sub-resolver can only be
198 created where there is already a corresponding sub-naming
199 authority; that is, the name space must have already been
200 subdivided.
201
202 Administrators of a resolution service may want to delegate
203 resolution to sub-resolvers for one of two reasons: to
204 reduce the load on a resolver, or to allow a sub-resolver to
205 be located elsewhere on the internet.
206
207 o Responsibility for resolution of a set of path names may fall
208 back to higher level resolution services in the event that a
209 lower level resolution service declines to either resolve the
210 paths to the resources or provide redirects.
211
212 Examples
213 ++++++++
214
215 In the following partial tree diagram, the nodes marked with * have
216 URL-sets associated with them.
217
218                            /
219                            |
220             -------------------------------
221             A1                            A2
222             |                             |
223     --------------------------
224     B1*                      B2*
225     |                      |
226 ----------                 |
227 C1      C2*                C
228                            |
229                            D*
230
231 /A/B1 names resources under /A/B1 except those under
232 /A/B1/C2
233 /A/B2 names resources under /A/B2 except those under
234 /A/B2/C/D
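As a rough illustration of the diagram, the following Python sketch lists the marked nodes that lie on the way to a name, most specific first. The `MARKED` set simply transcribes the starred nodes, written under /A as in the two descriptions above; the function name `marked_ancestors` is hypothetical and not part of the path scheme.

```python
# Nodes that carry URL-sets (the starred nodes in the diagram), written as paths.
MARKED = {"/A/B1", "/A/B1/C2", "/A/B2", "/A/B2/C/D"}


def marked_ancestors(name):
    """Return the marked prefixes of `name`, most specific first."""
    components = [c for c in name.split("/") if c]
    found = []
    for depth in range(1, len(components) + 1):
        prefix = "/" + "/".join(components[:depth])
        if prefix in MARKED:
            found.insert(0, prefix)  # the deepest match so far goes first
    return found


print(marked_ancestors("/A/B1/C2/doc.html"))   # ['/A/B1/C2', '/A/B1']
print(marked_ancestors("/A/B1/C1/doc.html"))   # ['/A/B1']
print(marked_ancestors("/A/B2/C/D/doc.html"))  # ['/A/B2/C/D', '/A/B2']
```

A name under /A/B1/C2 is therefore offered to the URL-set at C2 first and falls back to the one at B1, which is the sense in which /A/B1 names everything under it except what C2 takes over.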
235
236 Details of the Resolution Process
237 =================================
238
239 This section describes more details of the path scheme resolution
240 process using existing capabilities of the Domain Name System
241 (DNS) [3]. In principle, the path scheme protocol could use any
242 global, hierarchical name system that provides the necessary
243 functionality, but it is necessary to specify one protocol so clients
244 and servers can communicate. The main reason for using DNS is
245 that it is widely deployed and relatively stable.
246
247 The path name space may use the existing DNS name space, or a
248 newly created name space within DNS devoted to the path name
249 space, or some combination of both. (This draft does not specify
250 which will be used.)
251
252 A small amount of new code is required on the client side to drive
253 the resolution process, but generic proxy mechanisms available in
254 many WWW browsers may be used with a path proxy server to
255 share the process across a number of clients.
256
257 Resolving the Name into URL-sets
258 +++++++++++++++++++++++++++++++++
259
260 The implementation uses DNS TXT records that are typed, based
261 on the information they contain. At present, there is one type of path
262 TXT record beginning with "path-u". TXT records that begin with
263 "path-" are reserved for future extensions.
264
265 The "path-u" TXT record is followed by a single URL-prefix. Note
266 that a URL-prefix is not necessarily a full URL; it specifies a
267 resolution service and it is used to construct a full URL during
268 resolution. There may be multiple "path-u" TXT records for a single
269 DNS name, and each should logically specify equivalent resolution
270 services.
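A minimal sketch of how a client might pick the URL-prefixes out of the TXT records returned for one DNS name. It assumes the record type and the URL-prefix are separated by whitespace, which matches the examples in this document but is not stated as a rule; the function name `url_prefixes` is hypothetical.

```python
PATH_U = "path-u"


def url_prefixes(txt_records):
    """Extract the URL-prefixes from the "path-u" TXT records of one DNS name."""
    prefixes = []
    for record in txt_records:
        fields = record.split(None, 1)  # record type, then the rest of the text
        if len(fields) == 2 and fields[0] == PATH_U:
            prefixes.append(fields[1].strip())
        # Anything else, including other reserved "path-" types, is skipped.
    return prefixes


records = [
    "path-u http://www.org:70/docs",
    "path-u http://w3c.org/docs/www",
    "some unrelated TXT record",
]
print(url_prefixes(records))
# ['http://www.org:70/docs', 'http://w3c.org/docs/www']
```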
271
272 The DNS step of the resolution process proceeds as follows.
273
274 1. The list of URL-sets is initialized to the empty list.
275
276 2. The entire path URN, except the scheme and the opaque
277 string, is converted to lowercase and then to DNS names
278 (one name for each component of the path). For example,
279
280 path:/A/B2/C1/doc.html is converted to
281 /a/b2/c1/doc.html and then to the DNS names
282
283 . (the root of DNS)
284 a.
285 b2.a.
286 c1.b2.a.
287
288 3. For each of the DNS names, in order of the shortest name to
289 longest name, all TXT records associated with it are
290 requested using DNS resolvers.
291
292 If there are any "path-u" TXT records for a particular DNS
293 name, then a URL-set is constructed from the URL-prefixes
294 in the TXT records and the set is added to the head of the
295 list. The URLs in a URL-set are constructed by appending
296 the remaining components of the path and the opaque string
297 to each URL-prefix.
298
299 For example, suppose that while resolving
300 path:/A/B2/C1/doc.html, we discover that the TXT record
301 corresponding to the DNS name b2.a. is
302 "path-u http://ietf.org/path/docs"
303 Since b2.a. corresponds to /A/B2, and the remainder of the
304 path is "/c1/doc.html", the URL for this URL-prefix
305 would be
306 http://ietf.org/path/docs/c1/doc.html
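Step 2 can be sketched in Python as follows. The sketch writes each DNS name with the most specific label first (b2.a. for /A/B2), which is the form used in the TXT record example above; the function name `dns_names` is hypothetical.

```python
def dns_names(path_urn):
    """Return (DNS names, shortest first; opaque string) for a path URN."""
    scheme, _, name = path_urn.partition(":")
    if scheme.lower() != "path" or not name.startswith("/"):
        raise ValueError("not a path URN: %r" % path_urn)
    # Everything after the last "/" is the opaque string; it is not converted.
    *components, opaque = name[1:].split("/")
    labels = [c.lower() for c in components]
    names = ["."]  # the root of DNS
    for depth in range(1, len(labels) + 1):
        # DNS puts the most specific label first, so reverse the prefix.
        names.append(".".join(reversed(labels[:depth])) + ".")
    return names, opaque


print(dns_names("path:/A/B2/C1/doc.html"))
# (['.', 'a.', 'b2.a.', 'c1.b2.a.'], 'doc.html')
```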
307
308 To clarify the above algorithm, some examples are presented. The
309 examples use the partial document tree specified previously. The
310 DNS entries for this partial tree are:
311
312 DNS name     TXT
313 a.           -none-
314 b1.a.        "path-u http://ietf.org/path/docs"
315 c2.b1.a.     "path-u http://www.org:70/docs"
316 b2.a.        "path-u http://ietf.org/path/docs"
317 d.c.b2.a.    "path-u http://www.org:70/docs"
318              "path-u http://w3c.org/docs/www"
319
320 Example lookups
321
322 /A/B1/C1/doc.html
323
324 a. no "path-u" record
325 repeat with b1.a.
326 b1.a. URL http://ietf.org/path/docs/c1/doc.html
327 repeat with c1.b1.a.
328 c1.b1.a. unknown DNS name - done
329
330 List of URL-sets is
331
332 {http://ietf.org/path/docs/c1/doc.html}
333
334 /A/B2/C/D/doc.html
335
336 a. no "path-u" record
337 repeat with b2.a.
338 b2.a. URL http://ietf.org/path/docs/c/d/doc.html
339 repeat with c.b2.a.
340 c.b2.a. no "path-u" record
341 repeat with d.c.b2.a.
342 d.c.b2.a. URL http://www.org:70/docs/doc.html
343 URL http://w3c.org/docs/www/doc.html
344 done
345
346 List of URL-sets is
347
348 {http://www.org:70/docs/doc.html, http://w3c.org/docs/www/doc.html}
349 {http://ietf.org/path/docs/c/d/doc.html}
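The two lookups can be replayed with a short Python sketch in which a dictionary stands in for DNS. The URL-prefixes are transcribed from the DNS table above. Two things are assumptions of the sketch: the last entry is keyed as d.c.b2.a., the name the second lookup queries, and c.b2.a. is present with no records, as that lookup implies. The query for the DNS root is omitted, and the names `TXT` and `url_sets` are hypothetical.

```python
# Stand-in for DNS: name -> TXT records. A name missing from the table is
# treated as an unknown DNS name, which ends the DNS step.
TXT = {
    "a.": [],
    "b1.a.": ["path-u http://ietf.org/path/docs"],
    "c2.b1.a.": ["path-u http://www.org:70/docs"],
    "b2.a.": ["path-u http://ietf.org/path/docs"],
    "c.b2.a.": [],
    "d.c.b2.a.": ["path-u http://www.org:70/docs",
                  "path-u http://w3c.org/docs/www"],
}


def url_sets(path_urn, txt=TXT):
    """Return the list of URL-sets for a path URN, most specific first."""
    name = path_urn.split(":", 1)[1]
    *components, opaque = name[1:].split("/")
    labels = [c.lower() for c in components]  # the opaque string keeps its case
    result = []
    for depth in range(1, len(labels) + 1):
        dns_name = ".".join(reversed(labels[:depth])) + "."
        if dns_name not in txt:
            break  # unknown DNS name: done
        prefixes = [r.split(None, 1)[1] for r in txt[dns_name]
                    if r.startswith("path-u ")]
        if prefixes:
            # Append the remaining components and the opaque string.
            rest = "/".join(labels[depth:] + [opaque])
            result.insert(0, [p + "/" + rest for p in prefixes])
    return result


for urn in ("path:/A/B1/C1/doc.html", "path:/A/B2/C/D/doc.html"):
    print(urn)
    for url_set in url_sets(urn):
        print("   ", url_set)
```

Inserting each new set at the head of the list is what produces the most-specific-first ordering, since deeper names are queried later.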
350
351
352 Resolving the URL-sets into the Resource
353 +++++++++++++++++++++++++++++++++++++++++
354
355 After the list of URL-sets is constructed, it must be resolved into the
356 named resource. The list of URL-sets could itself be an object that
357 may be passed back from proxy servers to clients or cached for
358 later use. But here we describe the resolution of the list of URL-sets
359 into the named resource independent of which agent resolves it or
360 whether it is a first class object.
361
362 A URL is selected from the first set (e.g., randomly) and resolution of
363 the URL is attempted. Any of the URLs may be URNs, even other
364 path URNs. The appropriate protocol, as indicated by the scheme of
365 the URL and user preference, is used to resolve it. For example, if
366 the URL were http://ietf.org/path/docs/c1/doc.html, then
367 the HTTP protocol is typically used to resolve that URL using the
368 GET method.
369
370 If the resolution fails because the URL service is unavailable,
371 another URL is selected from the set, until none are left; retries with
372 exponential backoff may then follow, or the path resolution process
373 may be declared a failure. Resolution may fail because the server
374 doesn't exist, or the connection times out before or after it is made,
375 or the server returns an error code indicating that the service is
376 unavailable.
377
378 Alternatively, if the resolution of a URL fails because the URL is
379 unknown, then the process is repeated with the next set in the list.
380 The process is repeated until the resolution succeeds or the list is
381 exhausted (which implies resolution failure).
382
383 If the resolution of a URL results in a redirection to yet another URL
384 (which may be a URN), then that redirection should be followed to
385 determine if it succeeds before declaring that the first URL has been
386 resolved.
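The fallback rules of this section can be sketched in Python. The sketch makes several assumptions: `fetch` is a caller-supplied function that performs the actual URL resolution, follows any redirection itself, and reports one of three hypothetical status values. When every URL in a set is unavailable, the sketch declares failure rather than retrying with backoff, which is one of the two options the text allows.

```python
import random

OK, UNAVAILABLE, UNKNOWN = "ok", "unavailable", "unknown"


def resolve(url_sets, fetch, rng=random):
    """Try the URL-sets in order; return (url, resource), or None on failure.

    `fetch(url)` must return a (status, resource) pair, where status is
    OK, UNAVAILABLE or UNKNOWN.
    """
    for url_set in url_sets:
        candidates = list(url_set)
        rng.shuffle(candidates)  # e.g. random selection within a set
        for url in candidates:
            status, resource = fetch(url)
            if status == OK:
                return url, resource
            if status == UNKNOWN:
                break  # go on to the next, less specific set
            # UNAVAILABLE: try another equivalent URL from the same set
        else:
            # No URL in this set was reachable; this sketch gives up here.
            return None
    return None  # list exhausted


# Usage with a canned fetch function in place of real network access.
answers = {
    "http://www.org:70/docs/doc.html": (UNAVAILABLE, None),
    "http://w3c.org/docs/www/doc.html": (UNKNOWN, None),
    "http://ietf.org/path/docs/c/d/doc.html": (OK, "<the document>"),
}
sets = [
    ["http://www.org:70/docs/doc.html", "http://w3c.org/docs/www/doc.html"],
    ["http://ietf.org/path/docs/c/d/doc.html"],
]
print(resolve(sets, answers.get))
# ('http://ietf.org/path/docs/c/d/doc.html', '<the document>')
```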
387
388 Management Issues
389 =================
390
391 (This section will describe what administrators of naming authorities
392 and resolvers need to do to manage their portion of the path name
393 space.)
394
395 Encoding Syntax
396 ===============
397
398 The encoding rules may vary depending on the underlying
399 implementation, but, again, we assume DNS is used. Therefore, the
400 components of a path must be compatible with DNS <label> names.
401 Hex encodings must be used for uppercase characters in the name
402 that are to be distinguished from the corresponding lowercase
403 characters. Hex encoding is also required for dot ("."), the DNS
404 component separator, and slash ("/"), if it is used within a
405 component name or the opaque string. Here is a BNF description of
406 the encoding rules.
407
408 <path-urn> ::= "path:" <name>
409 <name> ::= <path> "/" [ <final-part> ]
410 <path> ::= "" | "/" <label> [ <path> ]
411
412 <final-part> ::= one or more ASCII characters other than "/"
413
414 <label> ::= one or more ASCII characters other than "/" or "."
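A minimal Python sketch of a checker for this grammar, reading a label and a final part as runs of ASCII characters. It does not decode or validate hex encodings, because the notation for them is not given here, and it does not enforce the further DNS restrictions on labels. The function name `parse_path_urn` is hypothetical.

```python
def parse_path_urn(urn):
    """Split a path URN into (labels, final_part), or raise ValueError."""
    scheme, sep, name = urn.partition(":")
    if sep != ":" or scheme != "path":
        raise ValueError("missing 'path:' scheme")
    if not name.startswith("/"):
        raise ValueError("name must begin with '/'")
    # <name> is <path> "/" [<final-part>]: the text after the last "/" is
    # the final part (possibly empty); everything before it is labels.
    *labels, final_part = name[1:].split("/")
    for label in labels:
        if not label or "." in label or not label.isascii():
            raise ValueError("bad label: %r" % label)
    if not final_part.isascii():
        raise ValueError("final part must be ASCII")
    return labels, final_part


print(parse_path_urn("path:/A/B/C/doc.html"))      # (['A', 'B', 'C'], 'doc.html')
print(parse_path_urn("path:/isbn/1-884777-01-5"))  # (['isbn'], '1-884777-01-5')
```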
415
416
417 URN Requirements
418 ================
419
420 The path scheme meets all of the requirements for Uniform
421 Resource Names, as described in [2]. For each functional
422 requirement, we discuss how the path scheme is in conformance
423 with it. We also discuss conformance to the encoding requirements.
424
425 Functional Requirements
426 +++++++++++++++++++++++
427
428 o Global scope: The root of the path name space will be known
429 to all clients, and for each node in the hierarchical name
430 space, the corresponding resolution service will know all its
431 subnodes. This guarantees that any particular path URN will
432 have the same meaning for each client.
433
434 o Global uniqueness: Each node in the hierarchical name
435 space corresponds to a naming authority that is responsible
436 for guaranteeing uniqueness within that portion of the name
437 space, or for delegating that responsibility to a
438 sub-authority.
439
440 o Persistence: To help guarantee that path URNs remain
441 useful as long as they are needed, the path scheme allows
442 any subtree of the name space to be served at any net
443 location, and this location may be changed without having to
444 change names. Additionally, a fallback mechanism is
445 provided by the protocol in case a resolver does not wish to
446 forward requests.
447
448 o Scalability: Assignment of path names is scalable for an
449 arbitrarily large number of names since the assignment
450 process is distributed across an arbitrarily large number of
451 naming authorities. The name resolution process is also
452 scalable for any number of names and clients, as discussed
453 below under "Resolution". Each naming authority and
454 resolution service need know about only a small number of
455 neighboring authorities and services.
456
457 o Legacy support: A legacy naming scheme may be supported
458 in the path URN scheme by assigning it a well-known path
459 naming authority, preferably near the root. Hierarchical
460 names may be mapped to appropriate path names, or any
461 names may be embedded in the opaque string of a path
462 name. E.g. path:/isbn/1-884777-01-5 or
463 path:/isbn/1/884777/01/5
464
465 o Extensibility: Same as for legacy support.
466
467 o Independence: Every path naming authority is constrained
468 by the requirements of the path scheme (e.g. components of
469 the path must follow the encoding rules), but control of
470 whether a naming authority issues a conforming name in its
471 name space is up to that authority alone.
472
473 o Resolution: The path scheme facilitates efficient resolution of
474 path URNs. The hierarchical nature of the name space
475 allows clients to use caches of remote resolution server
476 locations, so clients rarely need to query servers near the
477 top of the hierarchy. For additional scalability, a server may
478 delegate resolution of parts of its name space to other
479 servers, and clients may then bypass contacting the original
480 server.
481
482 There is an implied assumption in the URN requirements document
483 that names resolve into locations or metadata as opposed to the
484 resources themselves. This is based on the need for indirection to
485 allow the resource to change location, which we agree with.
486 However, a path name is actually a dynamic location since the
487 resolution process always finds the current location of the resolvers
488 along the path. So there is no need to impose the requirement of an
489 explicit indirection solely for the purpose of finding the current
490 location.
491
492 Encoding Requirements
493 +++++++++++++++++++++
494
495 The encoding syntax for path URNs conforms to the requirements
496 for generic URLs and for URNs.
497
498 Security Considerations
499 =======================
500
501 The decentralized path scheme is arguably less vulnerable to
502 attack than are centralized services.
503
504 The path scheme depends on DNS for most of the resolution
505 process, and insofar as DNS is secure or insecure, so is the path
506 scheme. A more complete reference of relevant weaknesses
507 should be included here.
508
509 The hierarchical path scheme allows security constraints to be
510 imposed on just the subtree of names that require it. By requesting
511 authentication first, the resolution process hides whether a name
512 is actually resolvable.
513
514 References
515 ==========
516
517 1. Berners-Lee, T., Masinter, L., McCahill, M. (editors),
518 "Uniform Resource Locators (URL)", RFC 1738, December
519 1994. ftp://ds.internic.net/rfc/rfc1738.txt
520
521 2. Sollins, K., Masinter, L. "Functional Requirements for Uniform
522 Resource Names", RFC 1737, December 1994.
523 ftp://ds.internic.net/rfc/rfc1737.txt
524
525 3. Mockapetris, P., "Domain Names - Implementation and
526 Specification", RFC 1035, November 1987.
527 ftp://ds.internic.net/rfc/rfc1035.txt
528
529 4. T. Berners-Lee, R. T. Fielding, H. Frystyk Nielsen, HTTP
530 Internet-Draft, "Hypertext Transfer Protocol -- HTTP/1.0".
531 The name of the draft at the time of this writing is
532 "draft-ietf-http-v10-spec-00.txt".
533
534 Author Contact Information
535 ==========================
536
537 Daniel LaLiberte
538 National Center for Supercomputing Applications
539 152 Computing Applications Building
540 605 East Springfield Avenue
541 Champaign, IL 61820
542 Tel: (217) 244-0013
543 liberte@ncsa.uiuc.edu
544
545 Michael Shapiro
546 National Center for Supercomputing Applications
547 152 Computing Applications Building
548 605 East Springfield Avenue
549 Champaign, IL 61820
550 Tel: (217) 244-6642
551 mshapiro@ncsa.uiuc.edu
552
