2    
3     Internet-Draft Ryan Moats
4     draft-ietf-urn-syntax-02.txt AT&T
5     Expires in six months January 1997
6    
7    
8     URN Syntax
9     Filename: draft-ietf-urn-syntax-02.txt
10    
11    
12     Status of This Memo
13    
14     This document is an Internet-Draft. Internet-Drafts are working
15     documents of the Internet Engineering Task Force (IETF), its
16     areas, and its working groups. Note that other groups may also
17     distribute working documents as Internet-Drafts.
18    
19     Internet-Drafts are draft documents valid for a maximum of six
20     months and may be updated, replaced, or obsoleted by other
21     documents at any time. It is inappropriate to use Internet-
22     Drafts as reference material or to cite them other than as ``work
23     in progress.''
24    
25     To learn the current status of any Internet-Draft, please check
26     the ``1id-abstracts.txt'' listing contained in the Internet-
27     Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
28     (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
29     Coast), or ftp.isi.edu (US West Coast).
30    
31    
32     Abstract
33    
34     Uniform Resource Names (URNs) are intended to serve as persistent
35     resource identifiers. This document sets forth the canonical syntax
36     for URNs. It discusses support for both existing legacy and new
37     namespaces and the requirements for URN presentation and transmission.
38     Finally, it discusses URN equivalence and how to
39     determine it.
40    
41     1. Introduction
42    
43     Uniform Resource Names (URNs) are intended to serve as persistent
44     resource identifiers and are designed to make it easy to map other
45     namespaces (which share the properties of URNs) into URN-space.
46     Therefore, the URN syntax provides a means to encode character data
47     in a form that can be sent in existing protocols, transcribed on most
48     keyboards, etc.
49    
50    
51    
52    
53    
54     Expires 7/31/97 [Page 1]
55    
56    
57    
58    
59    
60     INTERNET DRAFT URN Syntax January 1997
61    
62    
63     2. Syntax
64    
65     All URNs have the following syntax (phrases enclosed in quotes are
66     REQUIRED):
67    
68     <URN> ::= "urn:" <NID> ":" <NSS>
69    
70     where <NID> is the Namespace Identifier, and <NSS> is the Namespace
71     Specific String. The leading "urn:" sequence is case-insensitive.
72     The Namespace ID determines the _syntactic_ interpretation of the
73     Namespace Specific String (as discussed in [1]).
74    
75     RFC 1630 [2] and RFC 1737 [3] each present additional considerations
76     for URN encoding, which have the effect of limiting the syntax.
77     On the other hand, the requirement to support existing legacy naming
78     systems has the effect of broadening syntax. Thus, we discuss the
79     acceptable syntax for both the Namespace Identifier and the Namespace
80     Specific String separately.
81    
82     2.1 Namespace Identifier Syntax
83    
84     The following is the syntax for the Namespace Identifier. To (a) be
85     consistent with all potential resolution schemes and (b) not put any
86     undue constraints on any potential resolution scheme, the syntax for
87     the Namespace Identifier is:
88    
89     <NID> ::= <let-num> [ *<let-num-hyp> ]
90    
91     <let-num-hyp> ::= <upper> | <lower> | <number> | "-"
92    
93     <let-num> ::= <upper> | <lower> | <number>
94    
95     <upper> ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
96                 "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
97                 "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
98                 "Y" | "Z"
99    
100     <lower> ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
101                 "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
102                 "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
103                 "y" | "z"
104    
105     <number> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
106                  "8" | "9"
107    
108    
109    
110     This is slightly more restrictive than what is stated in [4] (which
123     allows the characters "." and "+"). Further, the Namespace
124     Identifier is case insensitive, so that "ISBN" and "isbn" refer to
125     the same namespace.
126    
127     To avoid confusion with the "urn:" identifier, the NID "urn" is
128     reserved and MUST NOT be used.
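
The Namespace Identifier rules above can be checked mechanically. The
following is a minimal sketch in Python, not part of the draft; the names
NID_PATTERN, is_valid_nid and same_namespace are hypothetical and chosen
only for illustration.

```python
import re

# <NID> ::= <let-num> [ *<let-num-hyp> ]
# One ASCII letter or digit, then any number of letters, digits or hyphens.
NID_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]*")


def is_valid_nid(nid: str) -> bool:
    """Return True if nid matches the <NID> grammar and is not reserved."""
    if NID_PATTERN.fullmatch(nid) is None:
        return False
    # The NID is case-insensitive, so "URN" is reserved just like "urn".
    return nid.lower() != "urn"


def same_namespace(a: str, b: str) -> bool:
    """Compare two NIDs case-insensitively ("ISBN" and "isbn" match)."""
    return a.lower() == b.lower()
```

Under these assumptions, "isbn" and "x-9" are accepted, while "-x", "a.b"
and "URN" are rejected.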
129    
130     2.2 Namespace Specific String Syntax
131    
132     As required by RFC 1737, there is a single canonical representation
133     of the NSS portion of an URN. The format of this single canonical
134     form follows:
135    
136     <NSS> ::= 1*<URN chars>
137    
138     <URN chars> ::= <trans> | "%" <hex> <hex>
139    
140     <trans> ::= <upper> | <lower> | <number> | <other> | <reserved>
141    
142     <hex> ::= <number> | "A" | "B" | "C" | "D" | "E" | "F" |
143               "a" | "b" | "c" | "d" | "e" | "f"
144    
145     <other> ::= "(" | ")" | "+" | "," | "-" | "." |
146                 ":" | "=" | "?" | "@" | ";" | "$" |
147                 "_" | "!" | "~" | "*" | "'"
148    
149     Depending on the rules governing a namespace, valid identifiers in a
150     namespace might contain characters that are not members of the URN
151     character set above (<URN chars>). Such strings MUST be translated
152     into canonical NSS format before using them as protocol elements or
153     otherwise passing them on to other applications. Translation is done
154     by encoding each character outside the URN character set as a
155     sequence of one to six octets using UTF-8 encoding, and the encoding
156     of each of those octets as "%" followed by two characters from the
157     <hex> character set above. The two characters give the hexadecimal
158     representation of that octet.
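
The translation step described above might be sketched as follows. This
is an illustration only, written in Python. The names SAFE and
to_canonical_nss are hypothetical. The sketch assumes the input is a raw,
unescaped name, so it also escapes the reserved characters "%" and "/",
and it emits upper-case hex digits, although the <hex> set permits either
case.

```python
# Characters that may appear unescaped: <upper>, <lower>, <number>, <other>.
OTHER = "()+,-.:=?@;$_!~*'"
SAFE = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789" + OTHER
)


def to_canonical_nss(name: str) -> str:
    """Translate a raw namespace-specific name into canonical NSS form."""
    out = []
    for ch in name:
        if ch in SAFE:
            out.append(ch)
        else:
            # Encode the character as UTF-8, then write each octet as
            # "%" followed by two hexadecimal digits.
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)
```

For example, under these assumptions a space becomes "%20" and a literal
"%" becomes "%25".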
159    
160     2.3 Reserved characters
161    
162     The one character set referenced above but not yet defined is the
163     reserved character set, which contains characters reserved
164     from normal use. The reserved character set follows, with a
165     discussion of why each character is reserved.
166    
167     The reserved character set is:
168    
169     <reserved> ::= "/" | "%"
170    
183     2.3.1 The "%" character
184    
185     The "%" character is reserved in the URN syntax for introducing the
186     escape sequence for an octet. Literal use of the "%" character in a
187     namespace must be encoded using "%25" in URNs for that namespace.
188     Any "%" character in an URN MUST be followed by two
189     characters from the <hex> character set.
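
A syntactic check of an NSS, including the rule that every "%" introduces
a two-digit hex escape, could look like the sketch below. It is written
in Python for illustration; NSS_PATTERN and is_valid_nss are hypothetical
names. The pattern accepts an unencoded "/" because the grammar lists it
in <reserved>, even though its use is discouraged.

```python
import re

# <NSS> ::= 1*<URN chars>
# Either one character from <upper>, <lower>, <number>, <other> or "/",
# or a "%" followed by exactly two <hex> characters.
NSS_PATTERN = re.compile(
    r"(?:[A-Za-z0-9()+,\-.:=?@;$_!~*'/]|%[0-9A-Fa-f]{2})+"
)


def is_valid_nss(nss: str) -> bool:
    """Return True if nss is a non-empty string of valid <URN chars>."""
    return NSS_PATTERN.fullmatch(nss) is not None
```

With this sketch, "50%25" is accepted, while "100%", "a%2" and "a%zz" are
rejected because the "%" is not followed by two hex digits.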
190    
191     Namespaces MAY designate one or more characters from the URN
192     character set as having special meaning for that namespace. If the
193     namespace also uses that character in a literal sense, the
194     character used in a literal sense MUST be encoded with "%" followed
195     by the hexadecimal representation of that octet. Therefore, the
196     process of registering a namespace identifier shall include
197     publication of a definition of which characters have a special
198     meaning to that namespace.
199    
200     2.3.2 The "/" character
201    
202     The "/" character is RESERVED for future developments. It might be
203     used for denoting hierarchy to allow for relative URN processing, but
204     the WG has not yet reached consensus on this, so such developments
205     will be documented separately. Meanwhile, namespace developers
206     SHOULD NOT use an unencoded "/", but rather use %-encoding for "/"
207     ("%2F").
208    
209     2.4 Excluded characters
210    
211     The following list is included only for the sake of completeness.
212     Any octets/characters on this list are explicitly NOT part of the URN
213     character set, and if used in an URN, MUST be %encoded:
214    
215     <excluded> ::= octets 0-32 (0-20 hex) | "\" | """ | "#" | "&" | "<"
216     | ">" | "[" | "]" | "^" | "`" | "{" | "|" | "}" | octets 127-255 (7F-FF hex)
217    
218     An URN ends when an octet/character from the excluded character set
219     (<excluded>) is encountered. The character from the excluded
220     character set is NOT part of the URN.
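
The termination rule can be illustrated by a small scanner that pulls an
URN out of surrounding text. This is a sketch in Python, not a normative
algorithm; is_excluded and take_urn are hypothetical names. The draft
speaks of octets, so treating every code point of 127 or above as
excluded is an assumption made here for Python strings.

```python
# The printable members of <excluded>.
EXCLUDED = set('\\"#&<>[]^`{|}')


def is_excluded(ch: str) -> bool:
    """Return True if ch belongs to the <excluded> set."""
    code = ord(ch)
    # Octets 0-32 and 127 and above are excluded, plus the listed symbols.
    return code <= 0x20 or code >= 0x7F or ch in EXCLUDED


def take_urn(text: str, start: int = 0) -> str:
    """Return the run of non-excluded characters beginning at start."""
    end = start
    while end < len(text) and not is_excluded(text[end]):
        end += 1
    return text[start:end]
```

The excluded character that stops the scan is not included in the result,
so scanning "urn:foo:a123 rest" yields "urn:foo:a123".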
221    
222     3. Support of existing legacy naming systems and new naming systems
223    
224     Any namespace (existing or newly-devised) that is proposed as an
225     URN-namespace and fulfills the criteria of URN-namespaces MUST be
226     expressed in this syntax. If names in these namespaces contain
227     characters other than those defined for the URN character set, they
228     MUST be translated into canonical form as discussed in section 2.2.
229    
243     4. URN presentation and transport
244    
245     The URN syntax defines the canonical format for URNs and all URN
246     transport and interchanges MUST take place in this format. Further,
247     all URN-aware applications MUST offer the option of displaying URNs
248     in this canonical form to allow for direct transcription (for example
249     by cut and paste techniques). Such applications MAY support display
250     of URNs in a more human-friendly form and may use a character set
251     that includes characters that aren't permitted in URN syntax as
252     defined in this RFC (that is, they may replace %-notation by
253     characters in some extended character set in display to humans).
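
A human-friendly display form of the kind permitted above might be
produced as in the following Python sketch. The function name
display_form is hypothetical, and the sketch assumes a well-formed URN
whose escapes encode UTF-8 octets. The result is for display only; the
canonical form remains the one used for transport and interchange.

```python
from urllib.parse import unquote


def display_form(urn: str) -> str:
    """Replace %-notation in the NSS by the characters it encodes."""
    # Split into "urn", NID and NSS; the NSS may itself contain ":".
    prefix, nid, nss = urn.split(":", 2)
    decoded = unquote(nss, encoding="utf-8", errors="replace")
    return ":".join((prefix, nid, decoded))
```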
254    
255     5. Lexical Equivalence in URNs
256    
257     For various purposes such as caching, it's often desirable to determine
258     if two URNs are the same without resolving them. The general purpose
259     means of doing so is by testing for "lexical equivalence" as defined
260     below.
261    
262     Two URNs are lexically equivalent if they are octet-by-octet equal after
263     the following preprocessing:
264    
265     1. normalize the case of the leading "urn:" token
266     2. normalize the case of the NID
267     3. normalize the case of any %-escaping
268    
269     Note that %-escaping MUST NOT be removed.
270    
271     Some namespaces may define additional lexical equivalences, such as
272     case-insensitivity of the NSS (or parts thereof). Additional lexical
273     equivalences MUST be documented as part of namespace registration, MUST
274     always have the effect of eliminating some of the false negatives
275     obtained by the procedure above, and MUST NEVER say that two URNs are
276     not equivalent if the procedure above says they are equivalent.

277     6. Examples of lexical equivalence
278    
279     The following URN comparisons highlight the lexical equivalence
280     definitions:
281    
282     1- URN:foo:a123/456
283     2- urn:foo:a123/456
284     3- urn:FOO:a123/456
285     4- urn:foo:A123/456
286     5- urn:foo:a123%2F456
287     6- URN:FOO:a123%2f456

288     URNs 1, 2, and 3 are all lexically equivalent. URN 4 is not
289     lexically equivalent to any of the other URNs of the above set. URNs 5
290     and 6 are only lexically equivalent to each other.
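
The comparison procedure can be sketched in a few lines of Python and
checked against the six URNs listed above. The names normalize_urn and
lexically_equivalent are hypothetical, and the sketch assumes both inputs
are well-formed URNs. Lower case is chosen for the "urn:" token and the
NID and upper case for escapes; any consistent choice would serve.

```python
import re


def normalize_urn(urn: str) -> str:
    """Apply the three case-normalization steps; escapes are kept."""
    # Split into "urn", NID and NSS; the NSS may itself contain ":".
    prefix, nid, nss = urn.split(":", 2)
    # Normalize only the hex digits of each %-escape, nothing else.
    nss = re.sub(r"%[0-9A-Fa-f]{2}", lambda m: m.group(0).upper(), nss)
    return ":".join((prefix.lower(), nid.lower(), nss))


def lexically_equivalent(a: str, b: str) -> bool:
    """Return True if the normalized forms are character-by-character equal."""
    return normalize_urn(a) == normalize_urn(b)
```

Because %-escapes are never removed, "urn:foo:a123/456" and
"urn:foo:a123%2F456" remain distinct under this sketch.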
291    
303     7. Functional Equivalence in URNs
304    
305     Functional equivalence is determined by practice within a given
306     namespace and managed by resolvers for that namespace. Namespace
307     registration must include guidance on how to determine functional
308     equivalence for that namespace, i.e. when two URNs are identical
309     within a namespace.
310    
311     8. Security considerations
312    
313     This document specifies the syntax for URNs. While some namespace
314     resolvers may assign special meaning to certain of the characters of
315     the Namespace Specific String, any security considerations resulting
316     from such assignment are outside the scope of this document. It is
317     strongly recommended that the process of registering a namespace
318     identifier include any such considerations.
319    
320     9. Acknowledgments
321    
322     Thanks to various members of the URN working group and <<your name
323     here!!>> for comments on earlier drafts of this document. This
324     document is partially supported by the National Science Foundation,
325     Cooperative Agreement NCR-9218179.
326    
327     10. References
328    
329     Request For Comments (RFC) and Internet Draft documents are available
330     from <URL:ftp://ftp.internic.net> and numerous mirror sites.
331    
332     [1] K. R. Sollins, "Requirements and a Framework for
333         URN Resolution Systems," Internet Draft (work in
334         progress), November 1996.
335    
337     [2] T. Berners-Lee, "Universal Resource Identifiers in WWW,"
338         RFC 1630, June 1994.
340    
342     [3] K. Sollins and L. Masinter, "Functional Requirements
343         for Uniform Resource Names," RFC 1737,
344         December 1994.
345    
347     [4] T. Berners-Lee, R. Fielding, L. Masinter, "Uniform
348         Resource Locators (URL)," Internet Draft (work in
349         progress), December 1996.
350    
363     11. Editor's address
364    
365     Ryan Moats
366     AT&T
367     15621 Drexel Circle
368     Omaha, NE 68135-2358
369     USA
370    
371     Phone: +1 402 894-9456
372     EMail: jayhawk@ds.internic.net
373    
374    
375    
376    
377    
378     Appendix A. Handling of URNs by URL resolvers/browsers.
379    
380     The URN syntax has been defined so that URNs can be used in places
381     where URLs are expected. A resolver that conforms to the current URL
382     syntax specification [4] will extract a scheme value of "urn:"
383     rather than a scheme value of "urn:<nid>".
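
As an illustration of this behaviour, the generic URL splitter in
Python's standard library treats only the leading "urn" as the scheme and
leaves the NID inside the remainder. The helper name split_urn_like_url
is hypothetical. Note that urlsplit reports the scheme without the
trailing ":" and would also split at "?" or "#", so this sketch is only
suitable for URNs without those characters.

```python
from urllib.parse import urlsplit


def split_urn_like_url(urn: str):
    """Split an URN the way a generic URL parser sees it."""
    parts = urlsplit(urn)
    # The parser knows nothing about NIDs: everything after the scheme
    # ends up in the path, so the NID is separated here by hand.
    nid, _, nss = parts.path.partition(":")
    return parts.scheme, nid, nss
```

Under these assumptions, "urn:isbn:0451450523" yields the scheme "urn",
the NID "isbn" and the NSS "0451450523".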
384    
385     An URN MUST be considered an opaque URL by URL resolvers and
386     passed (with the "urn:" tag) to an URN resolver for resolution. The
387     URN resolver can either be an external resolver that the URL resolver
388     knows of, or it can be functionality built-in to the URL resolver.
389    
390     To avoid confusion of users, an URL browser SHOULD display the
391     complete URN (including the "urn:" tag) to ensure that there is no
392     confusion between URN namespace identifiers and URL scheme identifiers.
393    
394    
395     This Internet Draft expires July 31, 1997.