Internet-Draft                                              Ryan Moats
draft-ietf-urn-syntax-01.txt                                      AT&T
Expires in six months                                    November 1996


                              URN Syntax
               Filename: draft-ietf-urn-syntax-01.txt


Status of This Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its
areas, and its working groups. Note that other groups may also
distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as ``work
in progress.''

To learn the current status of any Internet-Draft, please check
the ``1id-abstracts.txt'' listing contained in the Internet-
Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
(Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
Coast), or ftp.isi.edu (US West Coast).


Abstract

Uniform Resource Names (URNs) are intended to serve as persistent
resource identifiers. This document sets forth the canonical syntax
for URNs. Support for both existing legacy and new namespaces is
discussed. Requirements for URN presentation and transmission are
presented. Finally, there is a discussion of URN equivalence and how
to determine it.

1. Introduction

Uniform Resource Names (URNs) are intended to serve as persistent
resource identifiers and are designed to make it easy to map other
namespaces (which share the properties of URNs) into URN-space. The
URN syntax therefore provides a means to encode character data in a
form that can be sent in existing protocols, transcribed on most
keyboards, etc.

Expires 5/19/97                                               [Page 1]

INTERNET DRAFT                URN Syntax                 November 1996

2. Syntax

All URNs have the following syntax:

    <URN> ::= ["urn:"] <NID> ":" <NSS>

<NID> is the Namespace Identifier, and <NSS> is the Namespace
Specific String. The leading case-insensitive "urn:" sequence is
currently optional, because no consensus has been reached on whether
it should be required. The Namespace ID is used to determine the
_syntactic_ interpretation of the Namespace Specific String (as
discussed in [1]).
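
As an illustration only, the top-level production above can be split
into its two parts with a few lines of Python. The function name
split_urn is hypothetical and not part of this specification; the
sketch assumes that the first ":" after any "urn:" prefix ends the
NID, which follows from the NID syntax not permitting ":".

```python
from typing import Tuple


def split_urn(urn: str) -> Tuple[str, str]:
    """Split a URN into (NID, NSS), dropping an optional leading "urn:"."""
    # The "urn:" prefix is optional and case-insensitive.
    if urn[:4].lower() == "urn:":
        urn = urn[4:]
    # The first colon ends the NID; the NSS may itself contain colons.
    nid, sep, nss = urn.partition(":")
    if not sep or not nid:
        raise ValueError(f"not a URN: {urn!r}")
    return nid, nss


if __name__ == "__main__":
    print(split_urn("urn:isbn:1-23485-8-29"))
    print(split_urn("isbn:1-23485-8-29"))
```

The sketch performs no validation of the NID or NSS themselves; it
only separates them.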

RFC 1737 [2] presents additional requirements on URN encoding, which
all have implications as far as limiting syntax. On the other hand,
the requirement to support existing legacy naming systems has the
effect of broadening syntax. Thus, we discuss the acceptable syntax
for both the Namespace Identifier and the Namespace Specific String
separately.

2.1 Namespace Identifier Syntax

To (a) be consistent with all potential resolution schemes and (b) not
put any undue constraints on any potential resolution scheme, the
syntax for the Namespace Identifier is:

    <NID> ::= <letter> [ <let-hyp> ]

    <let-hyp> ::= <letter> [ <let-hyp> ] | "-" [ <let-hyp> ]

    <letter> ::= any one of the 52 alphabetic characters A through Z
                 in upper case and a through z in lower case

This is slightly more restrictive than what is stated in RFC 1738 [4]
(which allows the period "."). Further, the Namespace Identifier is
case insensitive, so that "ISBN" and "isbn" refer to the same
namespace.

To avoid confusion with the optional "urn:" identifier, the NID "urn"
is reserved and may not be used.
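
The NID rules above (a letter followed by any number of letters or
hyphens, case-insensitive comparison, and the reserved value "urn")
can be sketched as follows. The names is_valid_nid and same_namespace
are hypothetical, and treating "URN" in any case as reserved is an
assumption drawn from the case-insensitivity rule.

```python
import re

# One letter followed by zero or more letters or hyphens (section 2.1).
_NID_RE = re.compile(r"[A-Za-z][A-Za-z-]*\Z")


def is_valid_nid(nid: str) -> bool:
    """Return True if nid matches the NID syntax and is not reserved."""
    # ASSUMPTION: because NIDs are case-insensitive, "urn" is treated as
    # reserved in any mix of upper and lower case.
    return _NID_RE.match(nid) is not None and nid.lower() != "urn"


def same_namespace(a: str, b: str) -> bool:
    """Compare two NIDs without regard to case."""
    return a.lower() == b.lower()


if __name__ == "__main__":
    for candidate in ("isbn", "x-y", "urn", "a.b", "-abc"):
        print(candidate, is_valid_nid(candidate))
```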

2.2 Namespace Specific String Syntax

As required by RFC 1737 [2], there is a single canonical
representation of the NSS portion of an URN. The format of this
single canonical form follows:

    <NSS>       ::= <URN chars>*

    <URN chars> ::= <trans> | "%" <hex> <hex>

    <trans>     ::= <upper> | <lower> | <number> | <other>

    <hex>       ::= <number> | "A" | "B" | "C" | "D" | "E" | "F"

    <upper>     ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
                    "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
                    "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
                    "Y" | "Z"

    <lower>     ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                    "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                    "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                    "y" | "z"

    <number>    ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                    "8" | "9"

    <other>     ::= "(" | ")" | "+" | "
                    ":" | "=" | "?" | "@"

Depending on the rules governing a namespace, valid identifiers in a
namespace might contain characters that are not members of the URN
character set above (<URN chars>). Such strings MUST be translated
into canonical NSS format before using them as protocol elements or
otherwise passing them on to other applications. Translation is done
by encoding each character outside the URN character set as a
sequence of one to six octets using UTF-8 encoding, and then encoding
each of those octets as "%" followed by two characters from the
<hex> character set above. The two characters give the hexadecimal
representation of that octet.
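
The translation rule can be sketched in Python as shown below. This
is an illustration under stated assumptions, not a normative
algorithm. The name to_canonical_nss is hypothetical. The contents of
ASSUMED_OTHER are also an assumption: the <other> production is
incomplete in the text above, so the hyphen (used by the ISBN examples
in section 5.3), comma, and period are guesses that an implementer
would need to confirm. Note also that Python's UTF-8 codec emits at
most four octets per character.

```python
import string

# ASSUMPTION: the <other> production is incomplete in the text above.
# "(", ")", "+", ":", "=", "?", "@" appear there; ",", "-", "." are guesses.
ASSUMED_OTHER = "()+,-.:=?@"

# <trans>: upper-case letters, lower-case letters, digits, and <other>.
URN_TRANS = frozenset(string.ascii_letters + string.digits + ASSUMED_OTHER)


def to_canonical_nss(raw: str, trans: frozenset = URN_TRANS) -> str:
    """Encode every character outside <trans> as %HH per UTF-8 octet."""
    out = []
    for ch in raw:
        if ch in trans:
            out.append(ch)
        else:
            # Upper-case hex digits, matching the <hex> production.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    print(to_canonical_nss("caf\u00e9 au lait"))
```

The input is taken to be a raw namespace identifier, so a literal "%"
is itself encoded (as "%25"); passing an already canonical string
through the function a second time would therefore double-encode it.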

Namespaces MAY designate one or more characters from the URN
character set as having special meaning for that namespace. If the
namespace also uses that character in a literal sense, each literal
occurrence must be encoded with "%" followed by the hexadecimal
representation of that octet. Therefore, the process of registering a
namespace identifier shall include publication of a definition of
which characters have a special meaning and how to encode these
characters if used in a literal sense.
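
To make the preceding rule concrete, consider a purely hypothetical
namespace that uses ":" to separate fields in its NSS. Literal colons
inside a field would then be written as "%3A". The function name
escape_literals and the field layout are assumptions made for this
sketch; no registered namespace is being described.

```python
def escape_literals(component: str, special: str) -> str:
    """Percent-encode literal occurrences of a namespace's special characters.

    `special` is expected to hold only characters from the URN character
    set, which are all single-octet ASCII characters.
    """
    return "".join(
        f"%{ord(ch):02X}" if ch in special else ch for ch in component
    )


if __name__ == "__main__":
    # HYPOTHETICAL namespace: ":" separates fields inside the NSS.
    fields = ["a:b", "c"]
    print(":".join(escape_literals(field, ":") for field in fields))
```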

3. Support of existing legacy naming systems and new naming systems

URN-aware applications MAY accept as input other resource identifiers
from existing legacy namespaces. If such identifiers contain
characters that are not members of the URN character set specified in
section 2.2, the identifier MUST be translated to canonical format as
discussed in section 2.2.

Some existing namespaces that have the properties of the URN-space
contain some human-significant components, and these exist in a wide
variety of languages. However, URNs are NOT intended to convey
information that is significant to humans. While the translation
rule in section 2.2 is provided for existing namespaces, new
namespaces, as part of their registration documentation, MUST define
a discipline for assigning new URNs that does not simplify the
generation of human-significant names.

4. URN presentation and transport

URN-aware applications MAY support "natural" display of URNs which
contain characters encoded using "%" notation. However, they MUST
provide for display of URNs in canonical form (i.e., in a format
suitable for transcription).

URNs may only be transported in canonical format.
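
One possible approach to "natural" display is to decode runs of "%"
escapes as UTF-8 for presentation only, while keeping the canonical
string for transcription and transport. The sketch below assumes
upper-case hexadecimal digits, as in the <hex> production, and
substitutes the Unicode replacement character for octets that are not
valid UTF-8. The name natural_display is hypothetical.

```python
import re

# A run of one or more %HH escapes, upper-case hex only (see <hex>).
_ESCAPE_RUN = re.compile(r"(?:%[0-9A-F]{2})+")


def natural_display(nss: str) -> str:
    """Return a human-readable rendering of a canonical NSS."""

    def decode(match):
        octets = bytes.fromhex(match.group(0).replace("%", ""))
        # Invalid UTF-8 is shown as U+FFFD rather than raising an error.
        return octets.decode("utf-8", errors="replace")

    return _ESCAPE_RUN.sub(decode, nss)


if __name__ == "__main__":
    canonical = "caf%C3%A9"
    print(canonical)                   # form used for transport
    print(natural_display(canonical))  # form used for display only
```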

5. Equivalence in URNs

URNs are considered equivalent if they return the same resource. For
various purposes, such as caching, a test is necessary to determine
equivalence without actually resolving the URNs and fetching/comparing
the underlying resources. "Lexical equivalence" is a stricter condition
than the equivalence described above (functional equivalence).

5.1 Lexical Equivalence

Lexical equivalence may be determined by comparing two URNs without
making any network accesses. Two URNs are lexically equivalent if
they are octet-by-octet equal after the following preprocessing:

    1. drop any preceding "urn:" token
    2. normalize the case of the NID

Some namespaces may define additional lexical equivalences, such as
case-insensitivity of the NSS (or parts thereof). Additional lexical
equivalences MUST be documented as part of namespace registration,
MUST always have the effect of eliminating some of the false
negatives obtained by the procedure above, and MUST NEVER say that
two URNs are not equivalent if the procedure above says they are
equivalent.
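
The two preprocessing steps can be sketched as a small comparison
routine. The names lexical_key and lexically_equivalent are
hypothetical. The sketch assumes both inputs are already in canonical
form, so that comparing Python strings is the same as comparing
octets, and it applies none of the optional namespace-specific
equivalences.

```python
def lexical_key(urn: str) -> str:
    """Apply the two preprocessing steps of section 5.1."""
    # Step 1: drop any preceding "urn:" token (case-insensitive).
    if urn[:4].lower() == "urn:":
        urn = urn[4:]
    # Step 2: normalize the case of the NID; the NSS is left untouched.
    nid, sep, nss = urn.partition(":")
    return nid.lower() + sep + nss


def lexically_equivalent(a: str, b: str) -> bool:
    """Compare two canonical URNs without any network access."""
    return lexical_key(a) == lexical_key(b)


if __name__ == "__main__":
    print(lexically_equivalent("urn:isbn:1-23485-8-29", "ISBN:1-23485-8-29"))
    print(lexically_equivalent("urn:isbn:1-23485-8-29", "isbn:123485829"))
```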

5.2 Functional Equivalence

Resolvers determine functional equivalence based on specific rules
for the namespace. Therefore, namespace registration must include
documentation on how to determine functional equivalence for that
namespace.

5.3 Examples

The following URN comparisons highlight the difference between these
types of equivalence:

    urn:isbn:1-23485-8-29, isbn:1-23485-8-29 are lexically equiv.
    urn:isbn:1-23485-8-29, ISBN:1-23485-8-29 are lexically equiv.
    urn:isbn:1-23485-8-29, isbn:123485829 are not lexically equiv.
                           but may be functionally equivalent.
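
The third comparison above can be illustrated with a table of
per-namespace rules. The rule shown for "isbn" (ignore hyphens in the
NSS) is purely hypothetical and is not taken from any namespace
registration; the names NSS_NORMALIZERS, functional_key, and
may_be_functionally_equivalent are likewise assumptions. In practice
the rules would come from the registration documentation and be
applied by a resolver.

```python
from typing import Callable, Dict, Tuple

# HYPOTHETICAL per-namespace rules; real rules come from registration.
NSS_NORMALIZERS: Dict[str, Callable[[str], str]] = {
    "isbn": lambda nss: nss.replace("-", ""),
}


def functional_key(urn: str) -> Tuple[str, str]:
    """Reduce a URN to (lower-case NID, namespace-normalized NSS)."""
    if urn[:4].lower() == "urn:":
        urn = urn[4:]
    nid, _, nss = urn.partition(":")
    nid = nid.lower()
    # Namespaces without a registered rule fall back to the NSS as written.
    normalize = NSS_NORMALIZERS.get(nid, lambda s: s)
    return nid, normalize(nss)


def may_be_functionally_equivalent(a: str, b: str) -> bool:
    """Apply the (assumed) namespace rule in place of actual resolution."""
    return functional_key(a) == functional_key(b)


if __name__ == "__main__":
    print(may_be_functionally_equivalent(
        "urn:isbn:1-23485-8-29", "isbn:123485829"))
```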

6. Security considerations

Because of the number of potential namespaces, it must be restated
that certain of the characters in the Namespace Specific String may
have special meaning to certain namespace resolvers. The process of
registering a namespace identifier shall therefore include
publication of a definition of which characters have a special
meaning.

7. Acknowledgments

Thanks to various members of the URN working group for comments on
earlier drafts of this document. This document is partially supported
by the National Science Foundation.

8. References

Request For Comments (RFC) and Internet Draft documents are available
from <URL:ftp://ftp.internic.net> and numerous mirror sites.

[1]  L. L. Daigle, P. Faltstrom, R. Iannella. "A Framework for the
     Assignment and Resolution of Uniform Resource Names," Internet
     Draft (work in progress). June 1996.

[2]  K. Sollins, L. Masinter. "Functional Requirements for Uniform
     Resource Names," RFC 1737. December 1994.

[3]  T. Berners-Lee. "Universal Resource Identifiers in WWW," RFC
     1630. June 1994.

[4]  T. Berners-Lee, L. Masinter, M. McCahill. "Uniform Resource
     Locators (URL)," RFC 1738. December 1994.

9. Editor's address

Ryan Moats
AT&T
15621 Drexel Circle
Omaha, NE 68135-2358
USA

Phone: +1 402 894-9456
EMail: jayhawk@ds.internic.net


This Internet Draft expires May 19, 1997.