7 Internet-Draft Ryan Moats
8 draft-ietf-urn-syntax-05.txt AT&T
9 Expires in six months March 1997
10
11
12 URN Syntax
13 Filename: draft-ietf-urn-syntax-05.txt
14
15
16 Status of This Memo
17
18 This document is an Internet-Draft. Internet-Drafts are working
19 documents of the Internet Engineering Task Force (IETF), its
20 areas, and its working groups. Note that other groups may also
21 distribute working documents as Internet-Drafts.
22
23 Internet-Drafts are draft documents valid for a maximum of six
24 months and may be updated, replaced, or obsoleted by other
25 documents at any time. It is inappropriate to use Internet-
26 Drafts as reference material or to cite them other than as ``work
27 in progress.''
28
29 To learn the current status of any Internet-Draft, please check
30 the ``1id-abstracts.txt'' listing contained in the Internet-
31 Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
32 (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
33 Coast), or ftp.isi.edu (US West Coast).
34
35
36 Abstract
37
38 Uniform Resource Names (URNs) are intended to serve as persistent,
39 location-independent, resource identifiers. This document sets
40 forth the canonical syntax for URNs. It discusses support for both
41 existing legacy and new namespaces, and the requirements for URN
42 presentation and transmission. Finally, it discusses URN
43 equivalence and how to determine it.
44
45 1. Introduction
46
47 Uniform Resource Names (URNs) are intended to serve as persistent,
48 location-independent, resource identifiers and are designed to make
49 it easy to map other namespaces (which share the properties of URNs)
50 into URN-space. Therefore, the URN syntax provides a means to encode
51 character data in a form that can be sent in existing protocols,
52 transcribed on most keyboards, etc.
53
54
55
56
57
58 Expires 9/30/97 [Page 1]
59
60
61
62
63
64 INTERNET DRAFT URN Syntax March 1997
65
66
67 2. Syntax
68
69 All URNs have the following syntax (phrases enclosed in quotes are
70 REQUIRED):
71
72 URN ::= "urn:" NID ":" NSS
73
74 where NID is the Namespace Identifier, and NSS is the Namespace
75 Specific String. The leading "urn:" sequence is case-insensitive.
76 The Namespace ID determines the _syntactic_ interpretation of the
77 Namespace Specific String (as discussed in [1]).
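As an illustration only, the top-level production above can be taken apart with two splits on ":". The function name split_urn is hypothetical and not part of this document; the sketch checks only the outer "urn:" NID ":" NSS shape and leaves NID and NSS validation to their own sections.

```python
def split_urn(urn: str) -> tuple[str, str]:
    """Split a URN into (NID, NSS) following URN ::= "urn:" NID ":" NSS."""
    prefix, sep1, rest = urn.partition(":")
    # The leading "urn:" sequence is case-insensitive.
    if not sep1 or prefix.lower() != "urn":
        raise ValueError("missing leading 'urn:'")
    # Only the first ":" after the prefix separates NID from NSS;
    # later colons belong to the NSS.
    nid, sep2, nss = rest.partition(":")
    if not sep2 or not nid or not nss:
        raise ValueError("expected NID ':' NSS after 'urn:'")
    return nid, nss
```

For example, split_urn("URN:foo:a123,456") yields the pair ("foo", "a123,456").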
78
79 RFC 1630 [2] and RFC 1737 [3] each present additional considerations
80 for URN encoding, which have the effect of limiting the syntax.
81 On the other hand, the requirement to support existing legacy naming
82 systems has the effect of broadening syntax. Thus, we discuss the
83 acceptable syntax for both the Namespace Identifier and the Namespace
84 Specific String separately.
85
86 2.1 Namespace Identifier Syntax
87
88 The following is the syntax for the Namespace Identifier. To (a) be
89 consistent with all potential resolution schemes and (b) not put any
90 undue constraints on any potential resolution scheme, the syntax for
91 the Namespace Identifier is:
92
93 NID ::= let-num [ 1*31let-num-hyp ]
94
95 let-num-hyp ::= letter / number / "-"
96
97 let-num ::= letter / number
98
99 letter ::= %x41..5A / %x61..7A
100
101 number ::= %x30..39
102
103 This is slightly more restrictive than what is stated in [4] (which
104 allows the characters "." and "+"). Further, the Namespace
105 Identifier is case insensitive, so that "ISBN" and "isbn" refer to
106 the same namespace.
107
108 To avoid confusion with the "urn:" identifier, the NID "urn" is
109 reserved and MUST NOT be used.
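A minimal sketch of a checker for the NID rules above, reading the grammar as one letter or digit followed by up to 31 letters, digits, or hyphens. The names NID_RE, is_valid_nid, and same_namespace are hypothetical, and treating the reserved NID "urn" case-insensitively is an assumption drawn from the NID itself being case-insensitive.

```python
import re

# let-num followed by at most 31 let-num-hyp characters (1 to 32 in total).
NID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,31}")

def is_valid_nid(nid: str) -> bool:
    """Return True if nid matches the NID grammar and is not the reserved 'urn'."""
    # Assumption: the reservation of "urn" applies regardless of case.
    if nid.lower() == "urn":
        return False
    return NID_RE.fullmatch(nid) is not None

def same_namespace(a: str, b: str) -> bool:
    """NIDs are case-insensitive, so compare them after lowercasing."""
    return a.lower() == b.lower()
```

Under this reading, "ISBN" and "isbn" name the same namespace, while "-abc", "a.b", and any string longer than 32 characters are rejected.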
110
111 2.2 Namespace Specific String Syntax
112
113 As required by RFC 1737, there is a single canonical representation
114 of the NSS portion of an URN. The format of this single canonical
115
116
117
118 Expires 9/30/97 [Page 2]
119
120
121
122
123
124 INTERNET DRAFT URN Syntax March 1997
125
126
127 form follows:
128
129 NSS ::= 1*URN_chars
130
131 URN_chars ::= trans / ("%" hex hex)
132
133 trans ::= letter / number / other / reserved
134
135 hex ::= number / %x41..46 / %x61..66
136
137 other ::= "(" / ")" / "+" / "," / "-" / "." /
138 ":" / "=" / "@" / ";" / "$" / "_" /
139 "!" / "*" / "'"
140
141 Depending on the rules governing a namespace, valid identifiers in a
142 namespace might contain characters that are not members of the URN
143 character set above (URN_chars). Such strings MUST be translated
144 into canonical NSS format before using them as protocol elements or
145 otherwise passing them on to other applications. Translation is done
146 by encoding each character outside the URN character set as a
147 sequence of one to six octets using normalized UTF-8 [5], and the
148 encoding of each of those octets as "%" followed by two characters
149 from the hex character set above. The two characters give the
150 hexadecimal representation of that octet.
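The translation step can be sketched as follows. This is an illustration under stated assumptions, not a normative procedure: the input is taken to be a raw namespace string with no existing escapes, the reserved characters "%", "/", "?", and "#" are encoded as well (see sections 2.3.1 and 2.3.2), uppercase hex digits are an arbitrary choice, and the name to_canonical_nss is hypothetical.

```python
import string

# The "other" production, plus letters and digits: characters passed through as-is.
URN_OTHER = "()+,-.:=@;$_!*'"
PASS_THROUGH = set(string.ascii_letters + string.digits + URN_OTHER)

def to_canonical_nss(raw: str) -> str:
    """Encode every character outside letter/number/other as %XX per UTF-8 octet."""
    if "\x00" in raw:
        # Octet 0 is never to be used, encoded or not.
        raise ValueError("octet 0 is not allowed in an URN")
    out = []
    for ch in raw:
        if ch in PASS_THROUGH:
            out.append(ch)
        else:
            # One "%" plus two hex digits for each octet of the UTF-8 encoding.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)
```

With these assumptions, a space becomes "%20", a literal "%" becomes "%25", and a non-ASCII character such as U+00E9 becomes "%C3%A9".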
151
152 2.3 Reserved characters
153
154 The only character set above that remains to be discussed is the
155 reserved character set, which contains various characters reserved
156 from normal use. The reserved character set follows, with a
157 discussion on the specifics of why each character is reserved.
158
159 The reserved character set is:
160
161 reserved ::= "%" / "/" / "?" / "#"
162
163 2.3.1 The "%" character
164
165 The "%" character is reserved in the URN syntax for introducing the
166 escape sequence for an octet. Literal use of the "%" character in a
167 namespace must be encoded using "%25" in URNs for that namespace.
168 Any "%" character in an URN MUST be followed by two characters
169 from the <hex> character set.
170
171 Namespaces MAY designate one or more characters from the URN
172 character set as having special meaning for that namespace. If the
173 namespace also uses that character in a literal sense, the
174 character used in a literal sense MUST be encoded with "%" followed
187 by the hexadecimal representation of that octet. Further, a
188 character MUST NOT be "%"-encoded if the character is not a reserved
189 character. Therefore, the process of registering a namespace
190 identifier shall include publication of a definition of which
191 characters have a special meaning to that namespace.
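A syntactic check for the rule that every "%" is followed by two hex digits might look like the sketch below. The names NSS_RE and is_canonical_nss are hypothetical. The pattern accepts unencoded "/", "?", and "#" because the grammar permits them, even though section 2.3.2 discourages their use, and rejecting "%00" is an assumption based on the prohibition of octet 0 in section 2.4.

```python
import re

# One or more URN_chars: letter / number / other / "/" / "?" / "#",
# or "%" followed by exactly two hex digits.
NSS_RE = re.compile(r"(?:[A-Za-z0-9()+,\-.:=@;$_!*'/?#]|%[0-9A-Fa-f]{2})+")

def is_canonical_nss(nss: str) -> bool:
    """Return True if nss is non-empty and every '%' starts a valid escape."""
    if "%00" in nss:
        # Octet 0 must not appear even in %-encoded form.
        return False
    return NSS_RE.fullmatch(nss) is not None
```

A string such as "a123%2C456" passes, while "100%", "a%2", and "a%zz" fail because the "%" is not followed by two hex digits.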
192
193 2.3.2 The other reserved characters
194
195 RFC 1630 [2] reserves the characters "/", "?", and "#" for particular
196 purposes. The URN-WG has not yet debated the applicability and
197 precise semantics of those purposes as applied to URNs. Therefore,
198 these characters are RESERVED for future developments. Namespace
199 developers SHOULD NOT use these characters in unencoded form, but
200 rather use the appropriate %-encoding for each character.
201
202 2.4 Excluded characters
203
204 The following list is included only for the sake of completeness.
205 Any octets/characters on this list are explicitly NOT part of the URN
206 character set, and if used in an URN, MUST be %encoded:
207
208 excluded ::= octets 1-32 (1-20 hex) / "\" / """ /
209 "&" / "<" / ">" / "[" / "]" / "^" /
210 "`" / "{" / "|" / "}" / "~" /
211 octets 127-255 (7F-FF hex)
212
213 In addition, octet 0 (0 hex) should NEVER be used, in either
214 unencoded or %-encoded form.
215
216 An URN ends when an octet/character from the excluded character set
217 (excluded) is encountered. The character from the excluded character
218 set is NOT part of the URN.
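The termination rule can be sketched as a scan that stops at the first excluded character. The names is_excluded and take_urn are hypothetical. Two assumptions are made: the scan works on Python characters rather than octets, so every code point of 127 or above is treated as excluded, and octet 0 is treated as a terminator as well.

```python
# Printable ASCII characters named in the excluded set.
EXCLUDED = set('\\"&<>[]^`{|}~')

def is_excluded(ch: str) -> bool:
    """True for octets 0-32, the listed punctuation, and anything from 127 up."""
    code = ord(ch)
    return code <= 32 or code >= 127 or ch in EXCLUDED

def take_urn(text: str, start: int = 0) -> str:
    """Return the run of non-excluded characters beginning at text[start]."""
    end = start
    while end < len(text) and not is_excluded(text[end]):
        end += 1
    # The terminating character itself is not part of the URN.
    return text[start:end]
```

For instance, scanning "urn:foo:a123,456> rest" from position 0 stops at the ">" and returns "urn:foo:a123,456".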
219
220 3. Support of existing legacy naming systems and new naming systems
221
222 Any namespace (existing or newly-devised) that is proposed as an
223 URN-namespace and fulfills the criteria of URN-namespaces MUST be
224 expressed in this syntax. If names in these namespaces contain
225 characters other than those defined for the URN character set, they
226 MUST be translated into canonical form as discussed in section 2.2.
227
228 4. URN presentation and transport
229
230 The URN syntax defines the canonical format for URNs and all URN
231 transport and interchanges MUST take place in this format. Further,
232 all URN-aware applications MUST offer the option of displaying URNs
233 in this canonical form to allow for direct transcription (for example
234 by cut and paste techniques). Such applications MAY support display
247 of URNs in a more human-friendly form and may use a character set
248 that includes characters that aren't permitted in URN syntax as
249 defined in this RFC (that is, they may replace %-notation by
250 characters in some extended character set in display to humans).
251
252 5. Lexical Equivalence in URNs
253
254 For various purposes such as caching, it's often desirable to
255 determine if two URNs are the same without resolving them. The
256 general purpose means of doing so is by testing for "lexical
257 equivalence" as defined below.
258
259 Two URNs are lexically equivalent if they are octet-by-octet equal
260 after the following preprocessing:
261
262 1. normalize the case of the leading "urn:" token
263 2. normalize the case of the NID
264 3. normalize the case of any %-escaping
265
266 Note that %-escaping MUST NOT be removed.
267
268 Some namespaces may define additional lexical equivalences, such as
269 case-insensitivity of the NSS (or parts thereof). Additional lexical
270 equivalences MUST be documented as part of namespace registration,
271 MUST always have the effect of eliminating some of the false
272 negatives obtained by the procedure above, and MUST NEVER say that
273 two URNs are not equivalent if the procedure above says they are
274 equivalent.
275
276 6. Examples of lexical equivalence
277
278 The following URN comparisons highlight the lexical equivalence
279 definitions:
280
281 1- URN:foo:a123,456
282 2- urn:foo:a123,456
283 3- urn:FOO:a123,456
284 4- urn:foo:A123,456
285 5- urn:foo:a123%2C456
286 6- URN:FOO:a123%2c456
287 URNs 1, 2, and 3 are all lexically equivalent. URN 4 is not
288 lexically equivalent to any of the other URNs of the above set. URNs 5
289 and 6 are only lexically equivalent to each other.
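The three preprocessing steps of section 5 can be sketched as below and checked against the six example URNs. The names normalize_urn and lexically_equivalent are hypothetical, lowercase is an arbitrary choice of normal form, and the input is assumed to be syntactically valid already. Escapes are case-normalized but never decoded.

```python
import re

def normalize_urn(urn: str) -> str:
    """Apply the three case normalizations; %-escapes are kept, not removed."""
    scheme, nid, nss = urn.split(":", 2)
    # Step 3: normalize only the hex digits of each %-escape.
    nss = re.sub(r"%[0-9A-Fa-f]{2}", lambda m: m.group(0).lower(), nss)
    # Steps 1 and 2: normalize the "urn" token and the NID.
    return f"{scheme.lower()}:{nid.lower()}:{nss}"

def lexically_equivalent(a: str, b: str) -> bool:
    """Compare two URNs after preprocessing (character-wise on ASCII input)."""
    return normalize_urn(a) == normalize_urn(b)

EXAMPLES = [
    "URN:foo:a123,456",
    "urn:foo:a123,456",
    "urn:FOO:a123,456",
    "urn:foo:A123,456",
    "urn:foo:a123%2C456",
    "URN:FOO:a123%2c456",
]
```

Because the comma in URN 1 is never compared with the "%2C" in URN 5 after decoding, those two remain distinct, in line with the note that %-escaping MUST NOT be removed.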
290
291 7. Functional Equivalence in URNs
292
293 Functional equivalence is determined by practice within a given
294 namespace and managed by resolvers for that namespace. Thus, it is
307 beyond the scope of this document. Namespace registration must
308 include guidance on how to determine functional equivalence for that
309 namespace, i.e. when two URNs are identical within a namespace.
310
311 8. Security considerations
312
313 This document specifies the syntax for URNs. While some namespace
314 resolvers may assign special meaning to certain of the characters of
315 the Namespace Specific String, any security considerations resulting
316 from such assignment are outside the scope of this document. It is
317 strongly recommended that the process of registering a namespace
318 identifier include any such considerations.
319
320 9. Acknowledgments
321
322 Thanks to various members of the URN working group for comments on
323 earlier drafts of this document. This document is partially
324 supported by the National Science Foundation, Cooperative Agreement
325 NCR-9218179.
326
327 10. References
328
329 Request For Comments (RFC) and Internet Draft documents are available
330 from <URL:ftp://ftp.internic.net> and numerous mirror sites.
331
332 [1] K. R. Sollins, "Requirements and a Framework for
333 URN Resolution Systems," Internet Draft (work in
334 progress), November 1996.
335
336
337 [2] T. Berners-Lee, "Universal Resource Identifiers in
338 WWW," RFC 1630, June 1994.
339
340
341 [3] K. Sollins and L. Masinter, "Functional Requirements
342 for Uniform Resource Names," RFC 1737,
343 December 1994.
344
345
346 [4] T. Berners-Lee, R. Fielding, L. Masinter, "Uniform
347 Resource Locators (URL)," Internet Draft (work in
348 progress), December 1996.
349
350
351 [5] Appendix A.2 of The Unicode Consortium, "The
352 Unicode Standard, Version 2.0", Addison-Wesley
353 Developers Press, 1996. ISBN 0-201-48345-9.
354
355
356
357
358 Expires 9/30/97 [Page 6]
359
360
361
362
363
364 INTERNET DRAFT URN Syntax March 1997
365
366
367 11. Editor's address
368
369 Ryan Moats
370 AT&T
371 15621 Drexel Circle
372 Omaha, NE 68135-2358
373 USA
374
375 Phone: +1 402 894-9456
376 EMail: jayhawk@ds.internic.net
377
378
379
380
381
382 Appendix A. Handling of URNs by URL resolvers/browsers.
383
384 The URN syntax has been defined so that URNs can be used in places
385 where URLs are expected. A resolver that conforms to the current URL
386 syntax specification [4] will extract a scheme value of "urn:"
387 rather than a scheme value of "urn:<nid>".
388
389 An URN MUST be considered an opaque URL by URL resolvers and passed
390 (with the "urn:" tag) to an URN resolver for resolution. The URN
391 resolver can either be an external resolver that the URL resolver
392 knows of, or it can be functionality built into the URL resolver.
393
394 To avoid confusing users, an URL browser SHOULD display the
395 complete URN (including the "urn:" tag) so that URN namespace
396 identifiers are not mistaken for URL scheme identifiers.
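The hand-off described in this appendix might be sketched as follows, using urlsplit from Python's standard library as a stand-in for a generic URL parser. The function dispatch and its two resolver parameters are hypothetical. The URN is passed on whole, with its "urn:" tag, and is otherwise treated as opaque.

```python
from urllib.parse import urlsplit

def dispatch(identifier, urn_resolver, url_resolver):
    """Route an identifier to a URN resolver or a URL resolver by its scheme."""
    # urlsplit lowercases the scheme, so "URN:" and "urn:" both yield "urn";
    # the NID is not part of the scheme.
    if urlsplit(identifier).scheme == "urn":
        # Hand over the complete, unmodified URN, including the "urn:" tag.
        return urn_resolver(identifier)
    return url_resolver(identifier)
```

The urn_resolver callable may stand for an external resolver or for functionality built into the URL resolver; the sketch does not distinguish the two.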
397
398
399 This Internet Draft expires September 30, 1997.