3 Internet-Draft Ryan Moats
4 draft-ietf-urn-syntax-01.txt AT&T
5 Expires in six months November 1996
6
7
8 URN Syntax
9 Filename: draft-ietf-urn-syntax-01.txt
10
11
12 Status of This Memo
13
14 This document is an Internet-Draft. Internet-Drafts are working
15 documents of the Internet Engineering Task Force (IETF), its
16 areas, and its working groups. Note that other groups may also
17 distribute working documents as Internet-Drafts.
18
19 Internet-Drafts are draft documents valid for a maximum of six
20 months and may be updated, replaced, or obsoleted by other
21 documents at any time. It is inappropriate to use Internet-
22 Drafts as reference material or to cite them other than as ``work
23 in progress.''
24
25 To learn the current status of any Internet-Draft, please check
26 the ``1id-abstracts.txt'' listing contained in the Internet-
27 Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
28 (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
29 Coast), or ftp.isi.edu (US West Coast).
30
31
32 Abstract
33
34 Uniform Resource Names (URNs) are intended to serve as persistent
35 resource identifiers. This document sets forth the canonical syntax
36 for URNs. Support for both existing legacy and new namespaces is
37 discussed. Requirements for URN presentation and transmission are
38 presented. Finally, there is a discussion of URN equivalence and how
39 to determine it.
40
41 1. Introduction
42
43 Uniform Resource Names (URNs) are intended to serve as persistent
44 resource identifiers and are designed to make it easy to map other
45 namespaces (which share the properties of URNs) into URN-space. The
46 URN syntax therefore provides a means to encode character data in a
47 form that can be sent in existing protocols, transcribed on most
48 keyboards, etc.
49
50
51
52
53
54 Expires 5/19/97 [Page 1]
55
56
57
58
59
60 INTERNET DRAFT URN Syntax November 1996
61
62
63 2. Syntax
64
65 All URNs have the following syntax:
66
67 <URN> ::= ["urn:"] <NID> ":" <NSS>
68
69 <NID> is the Namespace Identifier, and <NSS> is the Namespace
70 Specific String. The leading case-insensitive "urn:" sequence is
71 currently optional, as no consensus has been reached on whether it
72 should be required. The Namespace ID is used to determine the
73 _syntactic_ interpretation of the Namespace Specific String (as
74 discussed in [1]).
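
The production above can be read as a simple splitting rule. The following is a minimal sketch in Python, not part of the draft; the function name split_urn is hypothetical, and it assumes that the first ":" after the optional "urn:" token ends the NID (the NID grammar in section 2.1 does not allow ":").

```python
from typing import Tuple


def split_urn(urn: str) -> Tuple[str, str]:
    """Split a URN into (NID, NSS) per <URN> ::= ["urn:"] <NID> ":" <NSS>."""
    # The leading "urn:" token is optional and case-insensitive.
    if urn[:4].lower() == "urn:":
        urn = urn[4:]
    # The NID cannot contain ":", so the first colon terminates it.
    nid, sep, nss = urn.partition(":")
    if not sep or not nid:
        raise ValueError("expected <NID> ':' <NSS>")
    return nid, nss
```

The sketch returns the NID exactly as written; case normalization is left to the comparison rules in section 5.1.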
75
76 RFC 1737 [2] presents additional requirements on URN encoding, all
77 of which tend to restrict the syntax. On the other hand,
78 the requirement to support existing legacy naming systems has the
79 effect of broadening syntax. Thus, we discuss the acceptable syntax
80 for both the Namespace Identifier and the Namespace Specific String
81 separately.
82
83 2.1 Namespace Identifier Syntax
84
85 The following is the syntax for the Namespace Identifier. To (a) be
86 consistent with all potential resolution schemes and (b) not put any
87 undue constraints on any potential resolution scheme, the syntax for
88 the Namespace Identifier is:
89
90 <NID> ::= <letter> [ <let-hyp> ]
91
92 <let-hyp> ::= <letter> | "-" | <letter> <let-hyp> | "-" <let-hyp>
93
94 <letter> ::= any one of the 52 alphabetic characters A through Z
95 in upper case and a through z in lower case
96
97 This is slightly more restrictive than what is stated in RFC 1738 [4]
98 (which allows the period "."). Further, the Namespace Identifier is
99 case insensitive, so that "ISBN" and "isbn" refer to the same
100 namespace.
101
102 To avoid confusion with the optional "urn:" identifier, the NID "urn"
103 is reserved and may not be used.
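
As an illustration only, the NID rules in this section can be checked with a short Python sketch. It reads <let-hyp> as a run of letters and hyphens of any length, which is an assumption about the intent of the grammar, and the name is_valid_nid is hypothetical.

```python
import re

# A letter followed by any number of letters or hyphens (assumed reading
# of <NID> ::= <letter> [ <let-hyp> ]).
_NID_RE = re.compile(r"[A-Za-z][A-Za-z-]*")


def is_valid_nid(nid: str) -> bool:
    """Return True if nid fits the NID grammar and is not the reserved "urn"."""
    # The NID is case-insensitive, so "URN" is reserved as well as "urn".
    if nid.lower() == "urn":
        return False
    return _NID_RE.fullmatch(nid) is not None
```

Under this reading, "isbn" and "ISBN" are both accepted, while "urn", "-isbn" and "a.b" are rejected.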
104
105 2.2 Namespace Specific String Syntax
106
107 As required by RFC 1737 [2], there is a single canonical representation of
108 the NSS portion of a URN. The format of this single canonical form
109 follows:
110
123 <NSS> ::= <URN chars>*
124
125 <URN chars> ::= <trans> | "%" <hex> <hex>
126
127 <trans> ::= <upper> | <lower> | <number> | <other>
128
129 <hex> ::= <number> | "A" | "B" | "C" | "D" | "E" | "F"
130
131 <upper> ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
132 "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
133 "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
134 "Y" | "Z"
135
136 <lower> ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
137 "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
138 "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
139 "y" | "z"
140
141 <number> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
142 "8" | "9"
143
144 <other> ::= "(" | ")" | "+" | "
145 ":" | "=" | "?" | "@"
146
147
148 Depending on the rules governing a namespace, valid identifiers in a
149 namespace might contain characters that are not members of the URN
150 character set above (<URN chars>). Such strings MUST be translated
151 into canonical NSS format before using them as protocol elements or
152 otherwise passing them on to other applications. Translation is done
153 by encoding each character outside the URN character set as a
154 sequence of one to six octets using UTF-8 encoding, and then encoding
155 each of those octets as "%" followed by two characters from the
156 <hex> character set above. The two characters give the hexadecimal
157 representation of that octet.
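
The translation rule can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative algorithm: the name to_canonical_nss is hypothetical, and the set ASSUMED_OTHER contains only the <other> characters legible in this copy of the draft plus "-", which is assumed to be allowed because the examples in section 5.3 use it unencoded.

```python
import string

# Assumption: the <other> production is incomplete in this copy, so only the
# legible characters are listed, plus "-" as used by the section 5.3 examples.
ASSUMED_OTHER = "()+:=?@-"
URN_TRANS = frozenset(string.ascii_letters + string.digits + ASSUMED_OTHER)


def to_canonical_nss(raw: str, allowed=URN_TRANS) -> str:
    """Percent-encode every character of raw that is outside the allowed set."""
    out = []
    for ch in raw:
        if ch in allowed:
            out.append(ch)
        else:
            # Encode the character as UTF-8, then each octet as "%" plus two
            # upper-case hexadecimal digits, matching the <hex> production.
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)
```

A literal "%" is itself outside <trans>, so the sketch encodes it as "%25". Python's UTF-8 codec emits at most four octets per character, which stays within the one-to-six range given above.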
158
159 Namespaces MAY designate one or more characters from the URN
160 character set as having special meaning for that namespace. If the
161 namespace also uses that character in a literal sense, the
162 character used in a literal sense must be encoded with "%" followed
163 by the hexadecimal representation of that octet. Therefore, the
164 process of registering a namespace identifier shall include
165 publication of a definition of which characters have a special
166 meaning and how to encode these characters if used in a literal
167 sense.
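
For illustration, suppose a hypothetical namespace designates "+" as a separator. A literal "+" inside a component would then have to be written as "%2B". The sketch below is in Python; the function name escape_literals and the choice of special characters are assumptions made for this example, and it assumes the special characters come from the URN character set, so each is a single octet.

```python
def escape_literals(text: str, special: str) -> str:
    """Percent-encode each character of text that the namespace treats as special."""
    return "".join(
        # Special characters are ASCII here, so ord() gives the octet value.
        "%{:02X}".format(ord(ch)) if ch in special else ch
        for ch in text
    )
```

With special set to "+", the literal text "a+b=c" becomes "a%2Bb=c". The namespace would apply this to each component before joining components with its own unencoded separator.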
168
183 3. Support of existing legacy naming systems and new naming systems
184
185 URN-aware applications MAY accept as input other resource identifiers
186 from existing legacy namespaces. If such identifiers contain
187 characters that are not members of the URN character set specified in
188 section 2.2, the identifier MUST be translated to canonical format as
189 discussed in section 2.2.
190
191 Some existing name spaces that have the properties of the URN-space
192 contain some human-significant components, and these exist in a wide
193 variety of languages. However, URNs are NOT intended to convey
194 information that is significant to humans. While the translation
195 rule in section 2.2 is provided for existing namespaces, new
196 namespaces, as part of their registration documentation, MUST define
197 a discipline for assigning new URNs that does not simplify the
198 generation of human-significant names.
199
200 4. URN presentation and transport
201
202 URN-aware applications MAY support "natural" display of URNs which
203 contain characters encoded using "%" notation. However, they MUST
204 provide for display of URNs in canonical form (i.e. in a format
205 suitable for transcription).
206
207 URNs may only be transported in canonical format.
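
A "natural" display can be derived from the canonical form by reversing the "%" notation. The following Python sketch uses the standard library function urllib.parse.unquote; the wrapper name natural_display is hypothetical, and the result is meant for display only, never for transport.

```python
from urllib.parse import unquote


def natural_display(canonical_nss: str) -> str:
    """Decode "%" notation into characters for human-readable display."""
    # Octets are interpreted as UTF-8; undecodable sequences are shown as the
    # Unicode replacement character rather than raising an error.
    return unquote(canonical_nss, encoding="utf-8", errors="replace")
```

Note that unquote also accepts lower-case hexadecimal digits, which the <hex> production in section 2.2 does not list; a stricter application might reject such input before display.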
208
209 5. Equivalence in URNs
210
211 URNs are considered equivalent if they return the same resource. For
212 various purposes, such as caching, a test is necessary to determine
213 equivalence without actually resolving the URNs and fetching/comparing
214 the underlying resources. "Lexical equivalence" is a stricter condition
215 than the equivalence described above (functional equivalence).
216
217 5.1 Lexical Equivalence
218
219 Lexical equivalence may be determined by comparing two URNs without
220 making any network accesses. Two URNs are lexically equivalent if
221 they are octet-by-octet equal after the following preprocessing:
222
223 1. drop any preceding "urn:" token
224 2. normalize the case of the NID
225
226 Some namespaces may define additional lexical equivalences, such as
227 case-insensitivity of the NSS (or parts thereof). Additional lexical
228 equivalences MUST be documented as part of namespace registration,
229 MUST always have the effect of eliminating some of the false
230 negatives obtained by the procedure above, and MUST NEVER say that
243 two URNs are not equivalent if the procedure above says they are
244 equivalent.
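
The two preprocessing steps and the octet comparison can be sketched in Python as follows. The function names are hypothetical, and lower case is chosen arbitrarily as the normalized case of the NID; no namespace-specific equivalences are applied.

```python
def normalize_for_comparison(urn: str) -> str:
    """Apply the two preprocessing steps of section 5.1."""
    # Step 1: drop any preceding "urn:" token (case-insensitive).
    if urn[:4].lower() == "urn:":
        urn = urn[4:]
    # Step 2: normalize the case of the NID only; the NSS is left untouched.
    nid, sep, nss = urn.partition(":")
    return nid.lower() + sep + nss


def lexically_equivalent(a: str, b: str) -> bool:
    """Compare two URNs octet by octet after preprocessing."""
    left = normalize_for_comparison(a).encode("utf-8")
    right = normalize_for_comparison(b).encode("utf-8")
    return left == right
```

Because the NSS is compared as is, "isbn:abc" and "isbn:ABC" are not lexically equivalent under this sketch unless the namespace registers case-insensitivity as an additional equivalence.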
245
246 5.2 Functional Equivalence
247
248 Resolvers determine functional equivalence based on specific rules
249 for the namespace. Therefore, namespace registration must include
250 documentation on how to determine functional equivalence for that
251 namespace.
252
253 5.3 Examples
254
255 The following URN comparisons highlight the difference between these
256 types of equivalence:
257
258 urn:isbn:1-23485-8-29, isbn:1-23485-8-29 are lexically equiv.
259 urn:isbn:1-23485-8-29, ISBN:1-23485-8-29 are lexically equiv.
260 urn:isbn:1-23485-8-29, isbn:123485829 are not lexically equiv.
261 but may be functionally equivalent.
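
The third comparison depends on a rule that only the namespace can supply. Purely as an illustration, the Python sketch below assumes a hypothetical rule under which hyphens in the NSS are not significant; this draft does not define such a rule, and the function names are invented for the example.

```python
def hyphen_insensitive_key(urn: str) -> str:
    """Build a comparison key under the assumed rule that hyphens are ignored."""
    # Drop the optional "urn:" token and normalize the NID case, as in 5.1.
    if urn[:4].lower() == "urn:":
        urn = urn[4:]
    nid, _, nss = urn.partition(":")
    # Hypothetical namespace rule: hyphens in the NSS carry no meaning.
    return nid.lower() + ":" + nss.replace("-", "")


def functionally_equivalent(a: str, b: str) -> bool:
    """Compare two URNs using the assumed namespace-specific key."""
    return hyphen_insensitive_key(a) == hyphen_insensitive_key(b)
```

Under that assumed rule, urn:isbn:1-23485-8-29 and isbn:123485829 compare as equivalent even though they are not lexically equivalent.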
262
263 6. Security considerations
264
265 Because of the number of potential namespaces, it must be restated
266 that certain of the characters in the Namespace Specific String may
267 have special meaning to certain namespace resolvers. The process of
268 registering a namespace identifier shall therefore include
269 publication of a definition of which characters have a special
270 meaning.
271
272 7. Acknowledgments
273
274 Thanks to various members of the URN working group and <<your name
275 here!!>> for comments on earlier drafts of this document. This
276 document is partially supported by the National Science Foundation.
277
278 8. References
279
280 Request For Comments (RFC) and Internet Draft documents are available
281 from <URL:ftp://ftp.internic.net> and numerous mirror sites.
282
283 [1] L. L. Daigle, P. Faltstrom, R. Iannella. "A Framework
284 for the Assignment and Resolution of Uniform
285 Resource Names," Internet Draft (work in progress).
286 June 1996.
287
288
289 [2] K. Sollins, L. Masinter. "Functional Requirements
290 for Uniform Resource Names," RFC 1737. December
303 1994.
304
305
306 [3] T. Berners-Lee. "Universal Resource Identifiers in
307 WWW," RFC 1630. June 1994.
308
309
310 [4] T. Berners-Lee, L. Masinter, M. McCahill. "Uniform
311 Resource Locators (URL)," RFC 1738. December 1994.
312
313 9. Editor's address
314
315 Ryan Moats
316 AT&T
317 15621 Drexel Circle
318 Omaha, NE 68135-2358
319 USA
320
321 Phone: +1 402 894-9456
322 EMail: jayhawk@ds.internic.net
323
324
325 This Internet Draft expires May 19, 1997.