Internet Draft                                    Clifford Lynch
draft-ietf-urn-biblio-00.txt            University of California
22 March 1997                                    Cecilia Preston
Expires in six months                            Preston & Lynch
                                                  Ron Daniel Jr.
                                  Los Alamos National Laboratory


            Using Existing Bibliographic Identifiers
                               as
                     Uniform Resource Names


Status of this Document

This document is an Internet-Draft. Internet-Drafts are
working documents of the Internet Engineering Task Force
(IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced or made obsolete by
other documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other
than as works in progress.

Distribution of this document is unlimited. Please send
comments to clifford.lynch@ucop.edu and cecilia@well.com.

This document does not specify a standard; it is purely
informational.


0. Abstract

A system for Uniform Resource Names (URNs) must be capable
of supporting identifiers from existing widely-used naming
systems. This document discusses how three major
bibliographic identifiers (the ISBN, ISSN and SICI) can be
supported within the URN framework and the currently
proposed syntax for URNs.


INTERNET DRAFT: Bibliographic Identifiers as URNs         3/1997


1. Introduction

The ongoing work of several IETF working groups, most
recently in the Uniform Resource Names working group, has
culminated in the development of a syntax for Uniform
Resource Names (URNs). The functional requirements and
overall framework for Uniform Resource Names are specified
in RFC 1737 [Sollins & Masinter] and the current proposal
for the URN syntax is draft-ietf-urn-syntax-04.txt [Moats].

As part of the validation process for the development of
URNs, the IETF working group has agreed that it is important
to demonstrate that the current URN syntax proposal can
accommodate existing identifiers from well-managed
namespaces. One such well-established infrastructure for
assigning and managing names comes from the bibliographic
community. Bibliographic identifiers function as names for
objects that exist both in print and, increasingly, in
electronic formats. This Internet draft demonstrates the
feasibility of supporting three representative bibliographic
identifiers within the currently proposed URN framework and
syntax.

Note that this document does not purport to define the
"official" standard way of doing so; it merely demonstrates
feasibility. It has not been developed in consultation with
the standards bodies and maintenance agencies that oversee
the existing bibliographic identifiers. Any actual Internet
standard for encoding these bibliographic identifiers as
URNs will need to be developed in consultation with the
responsible standards bodies and maintenance agencies.

In addition, there are several open questions with regard to
the management and registry of Namespace Identifiers (NIDs)
for URNs. For purposes of illustration, we have used the
three NIDs "ISBN", "ISSN" and "SICI" for the three
corresponding bibliographic identifiers discussed in this
document. While we believe this to be the most appropriate
choice, it is not the only one. The NIDs could be based on
the standards body and standard number (e.g.
"US-ANSI-NISO-Z39.56-1997" rather than "SICI").
Alternatively, one could lump all bibliographic identifiers
into a single "BIBLIOGRAPHIC" namespace, and structure the
namespace-specific string to specify which identifier is
being used. We do not believe that these are advantageous
approaches, but must wait for the outcome of namespace
management discussions in the working group.

For the purposes of this document, we have selected three
major bibliographic identifiers (national and international)
to fit within the URN framework. These are the
International Standard Book Number (ISBN) [ISO1], the
International Standard Serials Number (ISSN) [NISO1, ISO2,
ISO3], and the Serial Item and Contribution Identifier
(SICI) [NISO2]. ISBNs are used to identify monographs
(books). ISSNs are used to identify serial publications
(journals, newspapers) as a whole. SICIs augment the ISSN
in order to identify individual issues of serial
publications, or components within those issues (such as an
individual article, or the table of contents of a given
issue). The ISBN and ISSN are defined in the United States
by standards issued by the National Information Standards
Organization (NISO) and also by parallel international
standards issued under the auspices of the International
Organization for Standardization (ISO). NISO is the
ANSI-accredited standards body serving libraries, publishers
and information services. The SICI code is defined by a
NISO document in the United States and does not have a
parallel international standards document at present.

Many other bibliographic identifiers are in common use (for
example, the CODEN, numbers assigned by major bibliographic
utilities such as OCLC and RLG, national library numbers
such as the Library of Congress Control Number) or are under
development. While we do not discuss them in this document,
many of these will also need to be supported within the URN
framework as it moves to large scale implementation. The
issues involved in supporting those additional identifiers
are anticipated to be broadly similar to those involved in
supporting ISBNs, ISSNs, and SICIs.


2. Identification vs. Resolution

It is important to distinguish between the resource
identified by a URN and the resources that can reasonably be
provided when attempting to resolve an identifier. For
example, the ISSN 0040-781X identifies the popular magazine
"Time": all of it, every issue from the start of publication
to the present. Resolving such an identifier should not
result in the equivalent of hundreds of thousands of pages
of text and photos being dumped to the user's machine. It is
more reasonable for ISSNs to resolve to a navigational
system, such as an HTML-based search form, so the user may
select issues or articles of interest. ISBNs and SICIs, on
the other hand, do identify finite, manageably-sized
objects, but they may still be large enough that resolution
to a hierarchical system is appropriate.

In addition, the materials identified by an ISSN, ISBN or
SICI may exist only in printed or other physical form, not
electronically. The best that a resolver may be able to
offer is information about where to get the physical
resource, such as library holdings or a bookstore or
publisher order form. The URN Framework provides resolution
services that may be used to describe any differences
between the resource identified by a URN and the resource
that would be returned as a result of resolving that URN.


3. International Standard Book Numbers

3.1 Overview

An International Standard Book Number (ISBN) identifies an
edition of a monographic work. The ISBN is defined by the
standard NISO/ANSI/ISO 2108:1992 [ISO1].

Basically, an ISBN is a ten-digit number (actually, the last
digit can be the letter "X" as well, as described below)
which is divided into four variable-length parts, usually
separated by hyphens when printed. The parts are as follows
(in this order):

* a group identifier which specifies a group of publishers,
  based on national, geographic or some other criteria,

* the publisher identifier,

* the title identifier,

* and a modulus 11 check digit, using X in lieu of 10.

The group and publisher number assignments are managed in
such a way that the hyphens are not needed to parse the ISBN
unambiguously into its constituent parts. However, the ISBN
is normally transmitted and displayed with hyphens to make
it easy for human beings to recognize these parts without
having to make reference to or have knowledge of the number
assignments for group and publisher identifiers.

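The modulus 11 check digit described above can be verified
mechanically. The following Python sketch is illustrative
only. The function name is hypothetical, and the weighting
scheme (weights 10 down to 1, applied left to right) is
assumed from the ISBN standard; it is not spelled out in
this document.

```python
DIGITS = "0123456789"


def isbn10_is_valid(isbn: str) -> bool:
    """Return True if a ten-character ISBN has a correct check digit."""
    chars = isbn.replace("-", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, ch in enumerate(chars):
        if ch in DIGITS:
            value = int(ch)
        elif ch == "X" and position == 9:
            value = 10  # X stands for 10 and may only be the check digit
        else:
            return False
        # Assumed weighting: 10 for the first character down to 1 for the last.
        total += (10 - position) * value
    # The weighted sum of a valid ISBN is divisible by 11.
    return total % 11 == 0


print(isbn10_is_valid("0-395-36341-1"))
```

Because the hyphens carry no information needed for the
calculation, the sketch discards them before checking.
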

3.2 Encoding Considerations

Embedding ISBNs within the URN framework presents no
particular coding problems, since all of the characters that
can appear in an ISBN are valid in the identifier segment of
the URN. %-encoding is never needed.

Example: URN:ISBN:0-395-36341-1

For the ISBN namespace, some additional equivalence rules
are appropriate. Prior to comparing two ISBN URNs for
equivalence, it is appropriate to remove all hyphens, and to
convert any occurrences of the letter X to upper case.

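A minimal sketch of these equivalence rules in Python
follows. The function names are hypothetical. Treating the
leading "URN:" and the namespace identifier as
case-insensitive is an assumption taken from the URN syntax
proposal, not a rule stated in this section.

```python
def normalize_isbn_urn(urn: str) -> str:
    """Reduce an ISBN URN to a canonical form suitable for comparison."""
    scheme, nid, nss = urn.split(":", 2)
    # Assumption: "urn" and the NID compare case-insensitively.
    if scheme.lower() != "urn" or nid.lower() != "isbn":
        raise ValueError("not an ISBN URN: " + urn)
    # Rules from the text: remove all hyphens, convert x to upper case.
    return "urn:isbn:" + nss.replace("-", "").upper()


def isbn_urns_equivalent(first: str, second: str) -> bool:
    """Compare two ISBN URNs after normalization."""
    return normalize_isbn_urn(first) == normalize_isbn_urn(second)


print(isbn_urns_equivalent("URN:ISBN:0-395-36341-1", "urn:isbn:0395363411"))
```
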

3.3 Additional Considerations

The ISBN standard and related community implementation
guidelines define when different versions of a work should
be assigned the same or differing ISBNs. In actuality,
however, practice varies somewhat depending on publisher as
to whether different ISBNs are assigned for paperbound vs.
hardbound versions of the same work, electronic vs. printed
versions of the same work, or versions of the same work
published, for example, in the US and in Europe. The choice
of whether to assign a new ISBN or to reuse an existing one
when publishing a revised printing of an existing edition of
a work or even a revised edition of a work is somewhat
subjective. Practice varies from publisher to publisher
(indeed, the distinction between a revised printing and a
new edition is itself somewhat subjective). The use of
ISBNs within the URN framework simply reflects these
existing practices. Note that it is likely that an ISBN URN
will often resolve to many instances of the work (many
URLs).


4. International Standard Serials Numbers

4.1 Overview

International Standard Serials Numbers (ISSN) identify a
work that is being published on a continuing basis in
issues; they identify the entire work (often open-ended, in
the case of an actively published serial). ISSNs are
defined by the standards ISO 3297:1986 [ISO2] and ISO/DIS
3297 [ISO3] and within the United States by NISO Z39.9-1992
[NISO1]. The ISSN International Centre is located in Paris
and coordinates a network of regional centers. The National
Serials Data Program within the Library of Congress is the
US Center of this network.

ISSNs have the form NNNN-NNNN, where N is a digit; the last
digit may be an upper case X as the result of the check
character calculation. Unlike the ISBN, the ISSN components
do not have much structure; blocks of numbers are passed out
to the regional centers and publishers.

4.2 Encoding Considerations

Again, there is no problem representing ISSNs in the
namespace-specific string of URNs since all characters valid
in the ISSN are valid in the namespace-specific URN string,
and %-encoding is never required.

Example: URN:ISSN:1046-8188

Supplementary comparison rules are also appropriate for the
ISSN namespace. Just as for ISBNs, hyphens should be
dropped prior to comparison and occurrences of 'x'
normalized to uppercase.

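The comparison rules and the check character can be combined
in a short Python sketch. The function names are
hypothetical, and the check character calculation (weights 8
down to 2 on the first seven digits, modulus 11) is assumed
from the ISSN standard rather than defined in this document.

```python
DIGITS = "0123456789"


def issn_check_character(first_seven: str) -> str:
    """Compute the check character for the first seven ISSN digits."""
    # Assumed weighting: 8 for the first digit down to 2 for the seventh.
    total = sum(w * int(d) for w, d in zip(range(8, 1, -1), first_seven))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def normalize_issn(issn: str) -> str:
    """Drop hyphens, upper-case x, and reject malformed ISSNs."""
    chars = issn.replace("-", "").upper()
    if len(chars) != 8 or any(c not in DIGITS for c in chars[:7]):
        raise ValueError("malformed ISSN: " + issn)
    if chars[7] != issn_check_character(chars[:7]):
        raise ValueError("bad ISSN check character: " + issn)
    return chars


# Two spellings of the same ISSN compare equal after normalization.
print(normalize_issn("0040-781x") == normalize_issn("0040781X"))
```
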

4.3 Additional Considerations

The ISSN standard and related community implementation
guidelines specify when new ISSNs should be assigned vs.
continuing to use an existing one. There are some
publications where practice within the bibliographic
community varies from site to site, such as annuals or
annual conference proceedings. In some cases these are
treated as serials and ISSNs are used, and in some cases
they are treated as monographs and ISBNs are used. For
example, SIGMOD Record volume 24 number 2 June 1995 contains
the Proceedings of the 1995 ACM SIGMOD International
Conference on Management of Data. If you subscribe to the
journal (ISSN 0163-5808) this is simply the June issue. On
the other hand, you may have acquired this volume as the
conference proceedings (a monograph) and as such would use
the ISBN 0-89791-731-6 to identify the work. There are also
varying practices within the publishing community as to when
new ISSNs are assigned due to the change in the name of a
periodical (Atlantic becomes Atlantic Monthly); or when a
periodical is published both in printed and electronic
versions (The New York Times). The use of ISSNs as URNs
will reflect these judgments and practices.


5. Serial Item and Contribution Identifiers

5.1 Overview

The standard for Serial Item and Contribution Identifiers
(SICI) has recently been extensively revised and is defined
by NISO/ANSI Z39.56-1997 [NISO2]. The maintenance agency
for the SICI code is the UnCover Corporation.

SICI codes can be used to identify an issue of a serial, or
a specific contribution (e.g., an article, or the table of
contents) within an issue of a serial. SICI codes are not
assigned; they are constructed based on information about
the issue or issue component in question.

The complete syntax for the SICI code will not be discussed
here; see NISO/ANSI Z39.56-1997 for details. However, an
example and a brief review of the major components are
needed to understand the relationship with the ISSN and how
this identifier differs. An example of a SICI code is:

0015-6914(19960101)157:1<62:KTSW>2.0.TX;2-F

The first nine characters are the ISSN identifying the
serial title. The second component, in parentheses, is the
chronology information giving the date the particular serial
issue was published. In this example that date was January
1, 1996. The third component, 157:1, is enumeration
information (volume, number) on the particular issue of the
serial. These three components comprise the "item segment"
of a SICI code. By augmenting the ISSN with the chronology
and/or enumeration information, specific issues of the
serial can be identified. The next segment, <62:KTSW>,
identifies a particular contribution within the issue. In
this example we provide the starting page number and a title
code constructed from the initial characters of the title.
Identifiers assigned to a contribution can be used in the
contribution segment if page numbers are inappropriate. The
rest of the identifier is the control segment, which
includes a check character. Interested readers are
encouraged to consult the standard for an explanation of the
fields in that segment.

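The segment structure just described can be illustrated with
a small Python sketch that splits a SICI code into its
parts. This is a deliberate simplification: the pattern and
all names below are hypothetical, cover only the shape of
the example shown above, and are not a substitute for the
full syntax in NISO/ANSI Z39.56-1997.

```python
import re
from typing import NamedTuple


class SiciParts(NamedTuple):
    issn: str          # first nine characters
    chronology: str    # date information, given in parentheses
    enumeration: str   # volume and number, e.g. "157:1"
    contribution: str  # text between "<" and ">", may be empty
    control: str       # remainder, including the check character


# Simplified pattern: ISSN (chronology) enumeration <contribution> control
SICI_PATTERN = re.compile(
    r"^(?P<issn>[0-9]{4}-[0-9]{3}[0-9X])"
    r"\((?P<chronology>[^)]*)\)"
    r"(?P<enumeration>[^<]*)"
    r"<(?P<contribution>[^>]*)>"
    r"(?P<control>.+)$"
)


def split_sici(sici: str) -> SiciParts:
    """Split a SICI code into the segments described in the text."""
    match = SICI_PATTERN.match(sici)
    if match is None:
        raise ValueError("unrecognized SICI: " + sici)
    return SiciParts(**match.groupdict())


parts = split_sici("0015-6914(19960101)157:1<62:KTSW>2.0.TX;2-F")
print(parts.issn, parts.chronology, parts.enumeration, parts.contribution)
```
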

5.2 Encoding Considerations

The character set for SICIs is intended to be
email-transport-transparent, so it does not present major
problems. However, all printable excluded and reserved
characters from the URN syntax draft are valid in the SICI
character set and must be %-encoded.

Example of a SICI for an issue of a journal:

URN:SICI:1046-8188(199501)13:1%3C%3E1.0.TX;2-F

For an article contained within that issue:

URN:SICI:1046-8188(199501)13:1%3C69:FTTHBI%3E2.0.TX;2-4

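The %-encoding step can be sketched in Python as follows.
The function name is hypothetical, and the set of
characters left unencoded is an assumption: it follows the
character classes of the URN syntax as later published in
RFC 2141 and should be checked against the syntax draft
actually in force.

```python
import string

# Characters assumed to be allowed unencoded in a namespace-specific
# string (letters, digits, and the "other" punctuation of the URN syntax).
URN_UNENCODED = set(string.ascii_letters + string.digits + "()+,-.:=@;$_!*'")


def sici_to_urn(sici: str) -> str:
    """Wrap a SICI code as a URN, %-encoding characters not allowed as-is."""
    pieces = []
    for ch in sici:
        if ch in URN_UNENCODED:
            pieces.append(ch)
        else:
            # Encode each UTF-8 byte of the character as %XX (upper-case hex).
            pieces.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "URN:SICI:" + "".join(pieces)


print(sici_to_urn("1046-8188(199501)13:1<69:FTTHBI>2.0.TX;2-4"))
```

Applied to the unencoded forms of the two examples above,
the sketch encodes only the "<" and ">" characters, as %3C
and %3E.
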

Special equivalence rules for SICIs are not appropriate for
definition as part of the namespace and incorporation in
areas such as cache management algorithms. These are best
left to resolver systems which try to determine if two SICIs
refer to the same content. Consequently, we do not propose
any specific rules for equivalence testing through lexical
manipulation.

5.3 Additional Considerations

Since the serial is identified by an ISSN, some of the
ambiguity currently found in the assignment of ISSNs carries
over into SICI codes. In cases where an ISSN may refer to a
serial that exists in multiple formats, the SICI contains a
qualifier that specifies the format type (for example,
print, microform, or electronic). SICI codes may be
constructed from a variety of sources (the actual issue of
the serial, a citation or a record from an abstracting
service) and, as such, are based on the principle of using
all available information, so there may be multiple SICI
codes representing the same article [NISO2, Appendix D].
For example, one code might be constructed with access to
both chronology and enumeration (that is, date of issue and
volume, issue and page number); another code might be
constructed based only on enumeration information and
without benefit of chronology. Systems that use SICI codes
employ complex matching algorithms to try to match SICI
codes constructed from incomplete information to SICI codes
constructed with the benefit of all relevant information.

6. Security Considerations

This document proposes means of encoding several existing
bibliographic identifiers within the URN framework. It does
not discuss resolution; thus questions of secure or
authenticated resolution mechanisms are out of scope. It
does not address means of validating the integrity or
authenticating the source or provenance of URNs that contain
bibliographic identifiers. Issues regarding intellectual
property rights associated with objects identified by the
various bibliographic identifiers are also beyond the scope
of this document, as are questions about rights to the
databases that might be used to construct resolvers.


7. References

[ISO1] NISO/ANSI/ISO 2108:1992 Information and documentation
   -- International standard book number (ISBN)
[ISO2] ISO 3297:1986 Documentation -- International standard
   serial numbering (ISSN)
[ISO3] ISO/DIS 3297 Information and documentation --
   International standard serial numbering (ISSN)
   (Revision of ISO 3297:1986)
[Moats] R. Moats, "URN Syntax", draft-ietf-urn-syntax-04.txt,
   March 1997.
[NISO1] NISO/ANSI Z39.9-1992 International standard serial
   numbering (ISSN)
[NISO2] NISO/ANSI Z39.56-1997 Serial Item and Contribution
   Identifier
[Sollins & Masinter] K. Sollins and L. Masinter, "Functional
   Requirements for Uniform Resource Names", RFC 1737,
   December 1994.

8. Authors' Addresses

Clifford Lynch
University of California Office of the President
300 Lakeside Drive, 8th floor
Oakland CA 94612-3550
clifford.lynch@ucop.edu

Cecilia Preston
Preston & Lynch
PO Box 8310
Emeryville, CA 94662
cecilia@well.com

Ron Daniel Jr.
Advanced Computing Lab, MS B287
Los Alamos National Laboratory
Los Alamos, NM, 87545
voice: +1 505 665 0597
fax: +1 505 665 4939
http://www.acl.lanl.gov/~rdaniel