6 Internet-Draft Ryan Moats
7 draft-ietf-urn-syntax-03.txt AT&T
8 Expires in six months March 1997
9
10
11 URN Syntax
12 Filename: draft-ietf-urn-syntax-03.txt
13
14
15 Status of This Memo
16
17 This document is an Internet-Draft. Internet-Drafts are working
18 documents of the Internet Engineering Task Force (IETF), its
19 areas, and its working groups. Note that other groups may also
20 distribute working documents as Internet-Drafts.
21
22 Internet-Drafts are draft documents valid for a maximum of six
23 months and may be updated, replaced, or obsoleted by other
24 documents at any time. It is inappropriate to use Internet-
25 Drafts as reference material or to cite them other than as ``work
26 in progress.''
27
28 To learn the current status of any Internet-Draft, please check
29 the ``1id-abstracts.txt'' listing contained in the Internet-
30 Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
31 (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
32 Coast), or ftp.isi.edu (US West Coast).
33
34
35 Abstract
36
37 Uniform Resource Names (URNs) are intended to serve as persistent,
38 location-independent, resource identifiers. This document sets
39 forward the canonical syntax for URNs. It also discusses existing
40 legacy and new namespaces and the requirements for URN presentation
41 and transmission. Finally, there is a discussion of URN
42 equivalence and how to determine it.
43
44 1. Introduction
45
46 Uniform Resource Names (URNs) are intended to serve as persistent,
47 location-independent, resource identifiers and are designed to make
48 it easy to map other namespaces (which share the properties of URNs)
49 into URN-space. Therefore, the URN syntax provides a means to encode
50 character data in a form that can be sent in existing protocols,
51 transcribed on most keyboards, etc.
52
53
54
55
56
57 Expires 9/30/97 [Page 1]
58
59
60
61
62
63 INTERNET DRAFT URN Syntax March 1997
64
65
66 2. Syntax
67
68 All URNs have the following syntax (phrases enclosed in quotes are
69 REQUIRED):
70
71 <URN> ::= "urn:" <NID> ":" <NSS>
72
73 where <NID> is the Namespace Identifier, and <NSS> is the Namespace
74 Specific String. The leading "urn:" sequence is case-insensitive.
75 The Namespace ID determines the _syntactic_ interpretation of the
76 Namespace Specific String (as discussed in [1]).
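
As an illustration of the top-level syntax, the following is a minimal
Python sketch that splits an URN into its NID and NSS. The function name
split_urn is hypothetical and is not defined by this document; the sketch
checks only the "urn:" <NID> ":" <NSS> shape, not the character sets
defined in the following sections.

```python
def split_urn(urn):
    """Split "urn:<NID>:<NSS>" into the pair (NID, NSS).

    Hypothetical helper for illustration only.
    """
    prefix, sep1, rest = urn.partition(":")
    # The leading "urn:" sequence is case-insensitive.
    if not sep1 or prefix.lower() != "urn":
        raise ValueError('missing leading "urn:"')
    # The NID cannot contain ":", so the first ":" in the rest ends it.
    nid, sep2, nss = rest.partition(":")
    if not sep2 or not nid or not nss:
        raise ValueError("expected urn:<NID>:<NSS>")
    return nid, nss
```

Because ":" is a legal NSS character, only the first two colons act as
delimiters; any later colon stays inside the NSS.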
77
78 RFC 1630 [2] and RFC 1737 [3] each present additional considerations
79 for URN encoding, which tend to restrict the syntax. On the other
80 hand, the requirement to support existing legacy naming systems has
81 the effect of broadening the syntax. Thus, we discuss the
82 acceptable syntax for both the Namespace Identifier and the Namespace
83 Specific String separately.
84
85 2.1 Namespace Identifier Syntax
86
87 The following is the syntax for the Namespace Identifier. To (a) be
88 consistent with all potential resolution schemes and (b) not put any
89 undue constraints on any potential resolution scheme, the syntax for
90 the Namespace Identifier is:
91
92 <NID> ::= <let-num> [ 1,31<let-num-hyp> ]
93
94 <let-num-hyp> ::= <upper> | <lower> | <number> | "-"
95
96 <let-num> ::= <upper> | <lower> | <number>
97
98 <upper> ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
99 "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
100 "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
101 "Y" | "Z"
102
103 <lower> ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
104 "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
105 "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
106 "y" | "z"
107
108 <number> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
109 "8" | "9"
110
111
112
113 This is slightly more restrictive than what is stated in [4] (which
126 allows the characters "." and "+"). Further, the Namespace
127 Identifier is case insensitive, so that "ISBN" and "isbn" refer to
128 the same namespace.
129
130 To avoid confusion with the "urn:" identifier, the NID "urn" is
131 reserved and MUST NOT be used.
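
The NID rules above can be sketched as a small Python check. This is an
illustration under stated assumptions, not a normative definition: the
name is_valid_nid is hypothetical, and the production
<let-num> [ 1,31<let-num-hyp> ] is read here as one letter or digit
followed by zero to 31 further letters, digits, or hyphens.

```python
import re

# Assumed reading of the <NID> production: one <let-num>, then
# at most 31 <let-num-hyp> characters (ASCII only).
_NID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,31}")


def is_valid_nid(nid):
    """Return True if nid fits the assumed NID syntax."""
    if _NID_RE.fullmatch(nid) is None:
        return False
    # The NID "urn" is reserved; NIDs are case-insensitive.
    return nid.lower() != "urn"
```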
132
133 2.2 Namespace Specific String Syntax
134
135 As required by RFC 1737, there is a single canonical representation
136 of the NSS portion of an URN. The format of this single canonical
137 form follows:
138
139 <NSS> ::= 1*<URN chars>
140
141 <URN chars> ::= <trans> | "%" <hex> <hex>
142
143 <trans> ::= <upper> | <lower> | <number> | <other> | <reserved>
144
145 <hex> ::= <number> | "A" | "B" | "C" | "D" | "E" | "F" |
146 "a" | "b" | "c" | "d" | "e" | "f"
147
148 <other> ::= "(" | ")" | "+" | "," | "-" | "." |
149 ":" | "=" | "@" | ";" | "$" |
150 "_" | "!" | "*" | "'"
151
152 Depending on the rules governing a namespace, valid identifiers in a
153 namespace might contain characters that are not members of the URN
154 character set above (<URN chars>). Such strings MUST be translated
155 into canonical NSS format before using them as protocol elements or
156 otherwise passing them on to other applications. Translation is done
157 by encoding each character outside the URN character set as a
158 sequence of one to six octets using UTF-8 encoding, and the encoding
159 of each of those octets as "%" followed by two characters from the
160 <hex> character set above. The two characters give the hexadecimal
161 representation of that octet.
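
The translation step can be sketched in Python as follows. The name
encode_nss is hypothetical. Two choices here are assumptions rather than
requirements of this section: the sketch also %-encodes literal
occurrences of the reserved characters ("%", "/", "?", "#"), in line
with sections 2.3.1 and 2.3.2, and it emits upper-case hexadecimal
digits.

```python
# Characters from <upper>, <lower>, <number>, and <other> that may
# appear unencoded. Reserved characters are deliberately left out
# (an assumption of this sketch), so literal uses get %-encoded.
_PLAIN = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "()+,-.:=@;$_!*'"
)


def encode_nss(raw):
    """Translate a namespace-native string into canonical NSS form."""
    if "\x00" in raw:
        # Octet 0 is never allowed, encoded or not (section 2.4).
        raise ValueError("octet 0 must not appear in an URN")
    out = []
    for ch in raw:
        if ch in _PLAIN:
            out.append(ch)
        else:
            # Encode the character as UTF-8, then each octet as %XX.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)
```

Note that this helper is meant for raw namespace strings; applying it to
a string that is already in canonical form would encode its "%" signs a
second time.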
162
163 2.3 Reserved characters
164
165 The one character set used above that has not yet been discussed is
166 the reserved character set, which contains characters reserved
167 from normal use. The reserved character set follows, with a
168 discussion of why each character is reserved.
169
170 The reserved character set is:
171
172 <reserved> ::= "%" | "/" | "?" | "#"
173
186 2.3.1 The "%" character
187
188 The "%" character is reserved in the URN syntax for introducing the
189 escape sequence for an octet. Literal use of the "%" character in a
190 namespace must be encoded using "%25" in URNs for that namespace.
191 Each "%" character in an URN MUST be followed by two
192 characters from the <hex> character set.
193
194 Namespaces MAY designate one or more characters from the URN
195 character set as having special meaning for that namespace. If the
196 namespace also uses that character in a literal sense, the
197 character used in a literal sense MUST be encoded with "%" followed
198 by the hexadecimal representation of that octet. Further, a
199 character MUST NOT be "%"-encoded if the character is not a reserved
200 character. Therefore, the process of registering a namespace
201 identifier shall include publication of a definition of which
202 characters have a special meaning to that namespace.
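
A syntactic check of an NSS against the rules so far might look like the
Python sketch below. The name is_canonical_nss is hypothetical. Although
the <trans> production formally admits "%" through <reserved>, the
sketch follows the prose of this section and accepts "%" only when it
introduces two <hex> characters; it also rejects "%00", anticipating the
octet 0 rule in section 2.4. It does not attempt to enforce
namespace-specific rules about which characters carry special meaning.

```python
import re

# One or more <URN chars>: an unencoded character from <upper>,
# <lower>, <number>, <other>, or "/", "?", "#", or else "%" plus
# exactly two hex digits. A bare "%" is rejected.
_NSS_RE = re.compile(
    r"(?:[A-Za-z0-9()+,\-.:=@;$_!*'/?#]|%[0-9A-Fa-f]{2})+"
)


def is_canonical_nss(nss):
    """Return True if nss is syntactically a canonical NSS."""
    if _NSS_RE.fullmatch(nss) is None:
        return False
    # Octet 0 must never be used, even in %-encoded form.
    return "%00" not in nss
```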
203
204 2.3.2 The other reserved characters
205
206 RFC 1630 [2] reserves the characters "/", "?", and "#" for particular
207 purposes. The URN-WG has not yet debated the applicability and
208 precise semantics of those purposes as applied to URNs. Therefore,
209 these characters are RESERVED for future developments. Namespace
210 developers SHOULD NOT use these characters in unencoded form, but
211 rather use the appropriate %-encoding for each character.
212
213 2.4 Excluded characters
214
215 The following list is included only for the sake of completeness.
216 Any octets/characters on this list are explicitly NOT part of the URN
217 character set, and if used in an URN, MUST be %encoded:
218
219 <excluded> ::= octets 1-32 (1-20 hex) | "\" | """ | "&" | "<"
220 | ">" | "[" | "]" | "^" | "`" | "{" | "|" | "}" | "~"
221 | octets 127-255 (7F-FF hex)
222
223 In addition, octet 0 (0 hex) should NEVER be used, in either
224 unencoded or %-encoded form.
225
226 An URN ends when an octet/character from the excluded character set
227 (<excluded>) is encountered. The character from the excluded
228 character set is NOT part of the URN.
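
The termination rule can be illustrated with a short Python sketch that
scans text for the end of an URN. The names take_urn and _is_excluded
are hypothetical. As an assumption of the sketch, octet 0 and any
character above octet 255 are also treated as terminators, since neither
can be part of an URN.

```python
# The punctuation members of <excluded>.
_EXCLUDED_PUNCT = set('\\"&<>[]^`{|}~')


def _is_excluded(ch):
    """True for characters that cannot appear in an URN."""
    code = ord(ch)
    # Octets 1-32 and 127-255 are excluded; 0 and anything above
    # 255 are treated the same way in this sketch (an assumption).
    return code <= 32 or code >= 127 or ch in _EXCLUDED_PUNCT


def take_urn(text, start=0):
    """Return the URN that begins at text[start].

    The URN ends just before the first excluded character, or at
    the end of the text if there is none.
    """
    end = start
    while end < len(text) and not _is_excluded(text[end]):
        end += 1
    return text[start:end]
```

For example, scanning the text "see <urn:isbn:0451450523> now" from
offset 5 stops at the ">" and yields only the URN itself.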
229
230 3. Support of existing legacy naming systems and new naming systems
231
232 Any namespace (existing or newly-devised) that is proposed as an
233 URN-namespace and fulfills the criteria of URN-namespaces MUST be
246 expressed in this syntax. If names in these namespaces contain
247 characters other than those defined for the URN character set, they
248 MUST be translated into canonical form as discussed in section 2.2.
249
250 4. URN presentation and transport
251
252 The URN syntax defines the canonical format for URNs and all URN
253 transport and interchanges MUST take place in this format. Further,
254 all URN-aware applications MUST offer the option of displaying URNs
255 in this canonical form to allow for direct transcription (for example
256 by cut and paste techniques). Such applications MAY support display
257 of URNs in a more human-friendly form and may use a character set
258 that includes characters that aren't permitted in URN syntax as
259 defined in this RFC (that is, they may replace %-notation by
260 characters in some extended character set in display to humans).
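
A display-only rendering of this kind can be sketched in Python with the
standard urllib.parse.unquote function. The name display_form is
hypothetical, and the sketch assumes the %-escapes carry UTF-8 octets as
described in section 2.2. The result is for human eyes only; the
canonical form remains the one used for transport and comparison.

```python
from urllib.parse import unquote


def display_form(urn):
    """Return a human-friendly rendering of a canonical URN.

    %-escapes are decoded as UTF-8. Undecodable sequences are
    shown as the Unicode replacement character.
    """
    return unquote(urn, encoding="utf-8", errors="replace")
```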
261
262 5. Lexical Equivalence in URNs
263
264 For various purposes such as caching, it's often desirable to
265 determine if two URNs are the same without resolving them. The
266 general purpose means of doing so is by testing for "lexical
267 equivalence" as defined below.
268
269 Two URNs are lexically equivalent if they are octet-by-octet equal
270 after the following preprocessing:
271
272 1. normalize the case of the leading "urn:" token
273 2. normalize the case of the NID
274 3. normalize the case of any %-escaping
275
276 Note that %-escaping MUST NOT be removed.
277
278 Some namespaces may define additional lexical equivalences, such as
279 case-insensitivity of the NSS (or parts thereof). Additional lexical
280 equivalences MUST be documented as part of namespace registration,
281 MUST always have the effect of eliminating some of the false
282 negatives obtained by the procedure above, and MUST NEVER say that
283 two URNs are not equivalent if the procedure above says they are
284 equivalent.
285
286 6. Examples of lexical equivalence
287
288 The following URN comparisons highlight the lexical equivalence
289 definitions:
290
291 1- URN:foo:a123,456
292 2- urn:foo:a123,456
293 3- urn:FOO:a123,456
294
306 4- urn:foo:A123,456
307 5- urn:foo:a123%2C456
308 6- URN:FOO:a123%2c456
309 URNs 1, 2, and 3 are all lexically equivalent. URN 4 is not
310 lexically equivalent to any of the other URNs of the above set. URNs 5
311 and 6 are only lexically equivalent to each other.
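
The preprocessing and comparison can be sketched in Python as shown
below. The names normalize_urn and lexically_equivalent are
hypothetical. Normalizing to lower case for "urn:" and the NID and to
upper case for the hex digits of %-escapes is an arbitrary choice of
this sketch; any consistent choice gives the same comparison results.
No namespace-specific equivalences are applied.

```python
import re

_PCT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_urn(urn):
    """Apply the three preprocessing steps of section 5."""
    prefix, nid, nss = urn.split(":", 2)
    if prefix.lower() != "urn":
        raise ValueError('missing leading "urn:"')
    # Step 3: normalize the case of %-escapes without removing them.
    nss = _PCT_ESCAPE.sub(lambda m: m.group(0).upper(), nss)
    # Steps 1 and 2: normalize the case of "urn:" and of the NID.
    return "urn:" + nid.lower() + ":" + nss


def lexically_equivalent(a, b):
    """True if the two URNs are equal after preprocessing."""
    return normalize_urn(a) == normalize_urn(b)
```

Applied to the six URNs above, the sketch groups 1, 2, and 3 together,
leaves 4 on its own, and pairs 5 with 6. URNs 2 and 5 compare as
different because the %-escape is never decoded.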
312
313 7. Functional Equivalence in URNs
314
315 Functional equivalence is determined by practice within a given
316 namespace and managed by resolvers for that namespace. Thus, it is
317 beyond the scope of this document. Namespace registration must
318 include guidance on how to determine functional equivalence for that
319 namespace, i.e. when two URNs are identical within a namespace.
320
321 8. Security considerations
322
323 This document specifies the syntax for URNs. While some namespace
324 resolvers may assign special meaning to certain of the characters of
325 the Namespace Specific String, any security considerations resulting
326 from such assignment are outside the scope of this document. It is
327 strongly recommended that the process of registering a namespace
328 identifier include any such considerations.
329
330 9. Acknowledgments
331
332 Thanks to various members of the URN working group and <<your name
333 here!!>> for comments on earlier drafts of this document. This
334 document is partially supported by the National Science Foundation,
335 Cooperative Agreement NCR-9218179.
336
337 10. References
338
339 Request For Comments (RFC) and Internet Draft documents are available
340 from <URL:ftp://ftp.internic.net> and numerous mirror sites.
341
342 [1] K. R. Sollins, "Requirements and a Framework for
343 URN Resolution Systems," Internet Draft (work in
344 progress), November 1996.
345
346
347 [2] T. Berners-Lee, "Universal Resource Identifiers in
348 WWW," RFC 1630, June 1994.
350
351
352 [3] K. Sollins and L. Masinter, "Functional Requirements
353 for Uniform Resource Names," RFC 1737, December 1994.
367
368
369 [4] T. Berners-Lee, R. Fielding, L. Masinter, "Uniform
370 Resource Locators (URL)," Internet Draft (work in
371 progress), December 1996.
372
373 11. Editor's address
374
375 Ryan Moats
376 AT&T
377 15621 Drexel Circle
378 Omaha, NE 68135-2358
379 USA
380
381 Phone: +1 402 894-9456
382 EMail: jayhawk@ds.internic.net
383
384
385
386
387
388 Appendix A. Handling of URNs by URL resolvers/browsers.
389
390 The URN syntax has been defined so that URNs can be used in places
391 where URLs are expected. A resolver that conforms to the current URL
392 syntax specification [4] will extract a scheme value of "urn:"
393 rather than a scheme value of "urn:<nid>".
394
395 An URN MUST be considered an opaque URL by URL resolvers and passed
396 (with the "urn:" tag) to an URN resolver for resolution. The URN
397 resolver can either be an external resolver that the URL resolver
398 knows of, or it can be functionality built-in to the URL resolver.
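
The hand-off described above can be sketched in Python with the standard
urllib.parse.urlsplit function, which reports the scheme without the
trailing colon and in lower case. The name route and the labels
"urn-resolver" and "url-resolver" are hypothetical placeholders for
whatever resolvers an implementation actually has.

```python
from urllib.parse import urlsplit


def route(identifier):
    """Pick a resolver for identifier without interpreting an URN.

    Returns a (resolver label, identifier) pair. An URN is passed
    on whole and unmodified, including its "urn:" tag.
    """
    if urlsplit(identifier).scheme == "urn":
        return ("urn-resolver", identifier)
    return ("url-resolver", identifier)
```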
399
400 To avoid confusing users, an URL browser SHOULD display the
401 complete URN (including the "urn:" tag) so that URN namespace
402 identifiers are not mistaken for URL scheme identifiers.
403
404
405 This Internet Draft expires September 30, 1997.