1
2
3 Internet-Draft Ryan Moats
4 draft-ietf-urn-syntax-02.txt AT&T
5 Expires in six months January 1997
6
7
8 URN Syntax
9 Filename: draft-ietf-urn-syntax-02.txt
10
11
12 Status of This Memo
13
14 This document is an Internet-Draft. Internet-Drafts are working
15 documents of the Internet Engineering Task Force (IETF), its
16 areas, and its working groups. Note that other groups may also
17 distribute working documents as Internet-Drafts.
18
19 Internet-Drafts are draft documents valid for a maximum of six
20 months and may be updated, replaced, or obsoleted by other
21 documents at any time. It is inappropriate to use Internet-
22 Drafts as reference material or to cite them other than as ``work
23 in progress.''
24
25 To learn the current status of any Internet-Draft, please check
26 the ``1id-abstracts.txt'' listing contained in the Internet-
27 Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
28 (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
29 Coast), or ftp.isi.edu (US West Coast).
30
31
32 Abstract
33
34 Uniform Resource Names (URNs) are intended to serve as persistent
35 resource identifiers. This document sets forth the canonical syntax
36 for URNs. It discusses support for both existing legacy and new
37 namespaces, along with requirements for URN presentation and
38 transmission. Finally, it discusses URN equivalence and how to
39 determine it.
40
41 1. Introduction
42
43 Uniform Resource Names (URNs) are intended to serve as persistent
44 resource identifiers and are designed to make it easy to map other
45 namespaces (which share the properties of URNs) into URN-space.
46 Therefore, the URN syntax provides a means to encode character data
47 in a form that can be sent in existing protocols, transcribed on most
48 keyboards, etc.
49
50
51
52
53
54 Expires 7/31/97 [Page 1]
55
56
57
58
59
60 INTERNET DRAFT URN Syntax January 1997
61
62
63 2. Syntax
64
65 All URNs have the following syntax (phrases enclosed in quotes are
66 REQUIRED):
67
68 <URN> ::= "urn:" <NID> ":" <NSS>
69
70 where <NID> is the Namespace Identifier, and <NSS> is the Namespace
71 Specific String. The leading "urn:" sequence is case-insensitive.
72 The Namespace ID determines the _syntactic_ interpretation of the
73 Namespace Specific String (as discussed in [1]).
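
As an illustration of the top-level production above, the following is a minimal Python sketch that splits a URN into its NID and NSS. The function name split_urn is hypothetical and not part of this document; the sketch only separates the parts and does not validate them against the grammars in sections 2.1 and 2.2.

```python
def split_urn(urn):
    """Split a URN into (NID, NSS) per <URN> ::= "urn:" <NID> ":" <NSS>."""
    prefix, sep1, rest = urn.partition(":")
    # The leading "urn:" sequence is case-insensitive.
    if not sep1 or prefix.lower() != "urn":
        raise ValueError("missing 'urn:' prefix")
    # The NID grammar has no ":", so the first ":" in the remainder ends
    # the NID; the NSS may itself contain further ":" characters.
    nid, sep2, nss = rest.partition(":")
    if not sep2 or not nid or not nss:
        raise ValueError("expected urn:<NID>:<NSS>")
    return nid, nss
```

For example, split_urn("URN:foo:a123:456") yields the NID "foo" and the NSS "a123:456".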
74
75 RFC 1630 [2] and RFC 1737 [3] each presents additional considerations
76 for URN encoding, which have implications as far as limiting syntax.
77 On the other hand, the requirement to support existing legacy naming
78 systems has the effect of broadening syntax. Thus, we discuss the
79 acceptable syntax for both the Namespace Identifier and the Namespace
80 Specific String separately.
81
82 2.1 Namespace Identifier Syntax
83
84 The following is the syntax for the Namespace Identifier. To (a) be
85 consistent with all potential resolution schemes and (b) not put any
86 undue constraints on any potential resolution scheme, the syntax for
87 the Namespace Identifier is:
88
89 <NID> ::= <let-num> [ *<let-num-hyp> ]
90
91 <let-num-hyp> ::= <upper> | <lower> | <number> | "-"
92
93 <let-num> ::= <upper> | <lower> | <number>
94
95 <upper> ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
96 "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
97 "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
98 "Y" | "Z"
99
100 <lower> ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
101 "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
102 "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
103 "y" | "z"
104
105 <number> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
106 "8" | "9"
107
108
109
110 This is slightly more restrictive than what is stated in [4] (which
123 allows the characters "." and "+"). Further, the Namespace
124 Identifier is case insensitive, so that "ISBN" and "isbn" refer to
125 the same namespace.
126
127 To avoid confusion with the "urn:" identifier, the NID "urn" is
128 reserved and MUST NOT be used.
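
The NID rules in this section can be sketched in Python as follows. The names NID_RE, is_valid_nid, and same_namespace are hypothetical. Rejecting "URN" in any letter case is an assumption that follows from reading the reservation of "urn" together with the case-insensitivity of the NID.

```python
import re

# <NID> ::= <let-num> [ *<let-num-hyp> ]
# One letter or digit, followed by any number of letters, digits, or "-".
NID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]*")

def is_valid_nid(nid):
    # The NID "urn" is reserved; the NID is case-insensitive, so this
    # sketch rejects every case variant (an assumption, see lead-in).
    if nid.lower() == "urn":
        return False
    return NID_RE.fullmatch(nid) is not None

def same_namespace(nid_a, nid_b):
    # "ISBN" and "isbn" refer to the same namespace.
    return nid_a.lower() == nid_b.lower()
```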
129
130 2.2 Namespace Specific String Syntax
131
132 As required by RFC 1737, there is a single canonical representation
133 of the NSS portion of an URN. The format of this single canonical
134 form follows:
135
136 <NSS> ::= 1*<URN chars>
137
138 <URN chars> ::= <trans> | "%" <hex> <hex>
139
140 <trans> ::= <upper> | <lower> | <number> | <other> | <reserved>
141
142 <hex> ::= <number> | "A" | "B" | "C" | "D" | "E" | "F" |
143 "a" | "b" | "c" | "d" | "e" | "f"
144
145 <other> ::= "(" | ")" | "+" | "," | "-" | "." |
146 ":" | "=" | "?" | "@" | ";" | "$" |
147 "_" | "!" | "~" | "*" | "'"
148
149 Depending on the rules governing a namespace, valid identifiers in a
150 namespace might contain characters that are not members of the URN
151 character set above (<URN chars>). Such strings MUST be translated
152 into canonical NSS format before using them as protocol elements or
153 otherwise passing them on to other applications. Translation is done
154 by encoding each character outside the URN character set as a
155 sequence of one to six octets using UTF-8 encoding, and the encoding
156 of each of those octets as "%" followed by two characters from the
157 <hex> character set above. The two characters give the hexadecimal
158 representation of that octet.
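
The translation described above can be sketched in Python as follows. The names SAFE and to_canonical_nss are hypothetical. Two choices here are assumptions rather than requirements of this section: the sketch also encodes the reserved characters "%" and "/" (treating every input character as literal, in line with sections 2.3.1 and 2.3.2), and it emits upper-case hex digits although the <hex> set allows either case.

```python
import string

# <upper> | <lower> | <number> | <other> from the grammar above.
# The reserved characters "%" and "/" are deliberately left out so that
# literal occurrences are encoded (an assumption of this sketch).
SAFE = set(string.ascii_letters + string.digits + "()+,-.:=?@;$_!~*'")

def to_canonical_nss(raw):
    out = []
    for ch in raw:
        if ch in SAFE:
            out.append(ch)
        else:
            # Encode the character as UTF-8, then write each octet as
            # "%" followed by two hex digits.
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)
```

With this sketch, a space becomes "%20" and a non-ASCII character becomes one "%XX" triplet per UTF-8 octet.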
159
160 2.3 Reserved characters
161
162 The one character set in the grammar above not yet discussed is the
163 reserved character set, which contains characters reserved from
164 normal use. The reserved character set follows, with a
165 discussion of why each character is reserved.
166
167 The reserved character set is:
168
169 <reserved> ::= "/" | "%"
170
183 2.3.1 The "%" character
184
185 The "%" character is reserved in the URN syntax for introducing the
186 escape sequence for an octet. Literal use of the "%" character in a
187 namespace must be encoded using "%25" in URNs for that namespace.
188 A "%" character in an URN MUST be followed by two
189 characters from the <hex> character set.
190
191 Namespaces MAY designate one or more characters from the URN
192 character set as having special meaning for that namespace. If the
193 namespace also uses that character in a literal sense, the
194 character used in a literal sense MUST be encoded with "%" followed
195 by the hexadecimal representation of that octet. Therefore, the
196 process of registering a namespace identifier shall include
197 publication of a definition of which characters have a special
198 meaning to that namespace.
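
A check for canonical NSS form, combining the grammar of section 2.2 with the rule that "%" must be followed by two <hex> characters, might look like the following Python sketch. The names NSS_RE and is_canonical_nss are hypothetical. The sketch accepts an unencoded "/" because its use is discouraged rather than forbidden; it does not know about characters that a particular namespace treats as special.

```python
import re

# <NSS> ::= 1*<URN chars>
# <URN chars> ::= <trans> | "%" <hex> <hex>
# A "%" is accepted only as the start of a "%" <hex> <hex> triplet.
NSS_RE = re.compile(r"(?:[A-Za-z0-9()+,\-.:=?@;$_!~*'/]|%[0-9A-Fa-f]{2})+")

def is_canonical_nss(nss):
    return NSS_RE.fullmatch(nss) is not None
```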
199
200 2.3.2 The "/" character
201
202 The "/" character is RESERVED for future developments. It might be
203 used for denoting hierarchy to allow for relative URN processing, but
204 the WG has not yet reached consensus on this, so such developments
205 will be documented separately. Meanwhile, namespace developers
206 SHOULD NOT use an unencoded "/", but rather use %-encoding for "/"
207 ("%2F").
208
209 2.4 Excluded characters
210
211 The following list is included only for the sake of completeness.
212 Any octets/characters on this list are explicitly NOT part of the URN
213 character set, and if used in an URN, MUST be %encoded:
214
215 <excluded> ::= octets 0-32 (0-20 hex) | "\" | """ | "#" | "&" | "<"
216 | ">" | "[" | "]" | "^" | "`" | "{" | "|" | "}" | octets 127-255 (7F-FF hex)
217
218 An URN ends when an octet/character from the excluded character set
219 (<excluded>) is encountered. The character from the excluded
220 character set is NOT part of the URN.
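
The termination rule above can be sketched in Python as a scan that stops at the first excluded character. The names EXCLUDED_PRINTABLE, is_excluded, and take_urn are hypothetical. The sketch works on Python text strings rather than octets, so it treats every code point at or above 7F hex as excluded; that mapping is an assumption.

```python
# Printable members of <excluded>: \ " # & < > [ ] ^ ` { | }
EXCLUDED_PRINTABLE = set('\\"#&<>[]^`{|}')

def is_excluded(ch):
    code = ord(ch)
    # Octets 0-32 (0-20 hex) and 127 and above (7F hex and up).
    return code <= 0x20 or code >= 0x7F or ch in EXCLUDED_PRINTABLE

def take_urn(text, start=0):
    """Return the URN beginning at text[start], ending before the
    first excluded character (which is not part of the URN)."""
    end = start
    while end < len(text) and not is_excluded(text[end]):
        end += 1
    return text[start:end]
```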
221
222 3. Support of existing legacy naming systems and new naming systems
223
224 Any namespace (existing or newly-devised) that is proposed as an
225 URN-namespace and fulfills the criteria of URN-namespaces MUST be
226 expressed in this syntax. If names in these namespaces contain
227 characters other than those defined for the URN character set, they
228 MUST be translated into canonical form as discussed in section 2.2.
229
243 4. URN presentation and transport
244
245 The URN syntax defines the canonical format for URNs and all URN
246 transport and interchanges MUST take place in this format. Further,
247 all URN-aware applications MUST offer the option of displaying URNs
248 in this canonical form to allow for direct transcription (for example
249 by cut and paste techniques). Such applications MAY support display
250 of URNs in a more human-friendly form and may use a character set
251 that includes characters that aren't permitted in URN syntax as
252 defined in this document (that is, they may replace %-notation by
253 characters in some extended character set in display to humans).
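
A human-friendly display form of the kind permitted above could be produced as in the following Python sketch, which uses the standard library function urllib.parse.unquote. The name display_form is hypothetical. The result is for display only: it is not the canonical form and should not be used for transport or comparison.

```python
from urllib.parse import unquote

def display_form(urn):
    # Replace each run of %-escapes by the characters the UTF-8 octets
    # decode to; undecodable octets are shown as replacement characters.
    return unquote(urn, encoding="utf-8", errors="replace")
```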
254
255 5. Lexical Equivalence in URNs
256
257 For various purposes such as caching, it's often desirable to determine
258 if two URNs are the same without resolving them. The general purpose
259 means of doing so is by testing for "lexical equivalence" as defined
260 below.
261
262 Two URNs are lexically equivalent if they are octet-by-octet equal after
263 the following preprocessing:
264
265 1. normalize the case of the leading "urn:" token
266 2. normalize the case of the NID
267 3. normalize the case of any %-escaping
268
269 Note that %-escaping MUST NOT be removed.
270
271 Some namespaces may define additional lexical equivalences, such as
272 case-insensitivity of the NSS (or parts thereof). Additional lexical
273 equivalences MUST be documented as part of namespace registration, MUST
274 always have the effect of eliminating some of the false negatives
275 obtained by the procedure above, and MUST NEVER say that two URNs are
276 not equivalent if the procedure above says they are equivalent.

277 6. Examples of lexical equivalence
278
279 The following URN comparisons highlight the lexical equivalence
280 definitions:
281
282 1- URN:foo:a123/456
283 2- urn:foo:a123/456
284 3- urn:FOO:a123/456
285 4- urn:foo:A123/456
286 5- urn:foo:a123%2F456
287 6- URN:FOO:a123%2f456

288 URNs 1, 2, and 3 are all lexically equivalent. URN 4 is not
289 lexically equivalent to any of the other URNs of the above set. URNs 5
290 and 6 are only lexically equivalent to each other.
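
The preprocessing from section 5 can be sketched in Python and checked against the six URNs above. The names normalize_urn and lexically_equivalent are hypothetical. Normalizing to lower case for "urn:" and the NID and to upper case for %-escapes is an arbitrary choice; any consistent case would do. No %-escape is ever removed.

```python
import re

_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")

def normalize_urn(urn):
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn":
        raise ValueError("expected urn:<NID>:<NSS>")
    _, nid, nss = parts
    # Step 3: normalize the case of %-escapes only; the rest of the
    # NSS is compared as-is, and escapes are never decoded.
    nss = _ESCAPE.sub(lambda m: m.group(0).upper(), nss)
    # Steps 1 and 2: normalize the case of "urn:" and of the NID.
    return "urn:" + nid.lower() + ":" + nss

def lexically_equivalent(urn_a, urn_b):
    return normalize_urn(urn_a) == normalize_urn(urn_b)
```

Under this sketch, URNs 1, 2, and 3 normalize to the same string, URN 4 differs in the case of its NSS, and URNs 5 and 6 match each other but not the unencoded forms.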
291
303 7. Functional Equivalence in URNs
304
305 Functional equivalence is determined by practice within a given
306 namespace and managed by resolvers for that namespace. Namespace
307 registration must include guidance on how to determine functional
308 equivalence for that namespace, i.e. when two URNs are identical
309 within a namespace.
310
311 8. Security considerations
312
313 This document specifies the syntax for URNs. While some namespace
314 resolvers may assign special meaning to certain of the characters of
315 the Namespace Specific String, any security considerations resulting
316 from such assignment are outside the scope of this document. It is
317 strongly recommended that the process of registering a namespace
318 identifier include any such considerations.
319
320 9. Acknowledgments
321
322 Thanks to various members of the URN working group and <<your name
323 here!!>> for comments on earlier drafts of this document. This
324 document is partially supported by the National Science Foundation,
325 Cooperative Agreement NCR-9218179.
326
327 10. References
328
329 Request For Comments (RFC) and Internet Draft documents are available
330 from <URL:ftp://ftp.internic.net> and numerous mirror sites.
331
332 [1] K. R. Sollins, "Requirements and a Framework for
333 URN Resolution Systems," Internet Draft (work in
334 progress), November 1996.
335
337 [2] T. Berners-Lee, "Universal Resource Identifiers in
338 WWW," RFC 1630, June 1994.
340
342 [3] K. Sollins and L. Masinter, "Functional Requirements
343 for Uniform Resource Names," RFC 1737,
344 December 1994.
345
347 [4] T. Berners-Lee, R. Fielding, L. Masinter, "Uniform
348 Resource Locators (URL)," Internet Draft (work in
349 progress), December 1996.
350
363 11. Editor's address
364
365 Ryan Moats
366 AT&T
367 15621 Drexel Circle
368 Omaha, NE 68135-2358
369 USA
370
371 Phone: +1 402 894-9456
372 EMail: jayhawk@ds.internic.net
373
374
375
376
377
378 Appendix A. Handling of URNs by URL resolvers/browsers.
379
380 The URN syntax has been defined so that URNs can be used in places
381 where URLs are expected. A resolver that conforms to the current URL
382 syntax specification [4] will extract a scheme value of "urn:"
383 rather than a scheme value of "urn:<nid>".
384
385 An URN MUST be considered an opaque URL by URL resolvers and
386 passed (with the "urn:" tag) to an URN resolver for resolution. The
387 URN resolver can either be an external resolver that the URL resolver
388 knows of, or it can be functionality built-in to the URL resolver.
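
The hand-off described above can be sketched in Python with the standard library function urllib.parse.urlsplit, which reports the scheme without the trailing colon and in lower case. The function name dispatch and the labels "urn-resolver" and "url-resolver" are hypothetical placeholders for whatever resolvers an implementation actually has.

```python
from urllib.parse import urlsplit

def dispatch(identifier):
    # A generic URL parser sees the scheme "urn", not "urn:<nid>".
    scheme = urlsplit(identifier).scheme
    if scheme == "urn":
        # Pass the complete URN, including the "urn:" tag, unchanged.
        return ("urn-resolver", identifier)
    return ("url-resolver", identifier)
```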
389
390 An URL browser SHOULD display the complete URN (including the
391 "urn:" tag) so that users do not confuse URN namespace
392 identifiers with URL scheme identifiers.
393
394
395 This Internet Draft expires July 31, 1997.