

INTERNET DRAFT                                               K. Sollins
Requirements for Uniform Resource Names                         MIT/LCS
draft-ietf-uri-urn-req-00.txt                               L. Masinter
Replaces draft-sollins-urn-02.txt                     Xerox Corporation
Expires March 10, 1995                               September 10, 1994


              Requirements for Uniform Resource Names

Status of this Memo

   This memo provides information for the Internet community. This memo
   does not specify an Internet standard of any kind. Distribution of
   this memo is unlimited.

1.  Introduction

   This document sets out the requirements for Uniform Resource Names
   (URNs) within a larger Internet information architecture, which in
   turn is composed of, additionally, Uniform Resource Characteristics
   (URCs), and Uniform Resource Locators (URLs).  URNs are used for
   identification, URCs for including meta-information, and URLs for
   locating or finding resources.  It is provided as a basis for
   evaluating standards for URNs.  The discussions of this work have
   occurred on the mailing list uri@bunyip.com and at the URI Working
   Group sessions of the IETF.

   The requirements for uniform resource names (URNs) fit within the
   overall architecture of Uniform Resource Identification.  In order to
   build applications in the most general case, the user must be able to
   discover and identify the information, objects, or what we will call
   in this architecture resources, on which the application is to
   operate.  Beyond this statement, the URI architecture does not
   define "resource."  As the network and interconnectivity grow, the 
   ability to make use of remote, perhaps independently managed,
   resources will become more and more important.  This activity of
   discovering and utilizing resources can be broken down into those
   activities where one of the primary constraints is human utility and
   facility and those in which human involvement is small or nonexistent.
   Human naming must have such characteristics as being both mnemonic and
   short.  Humans, in contrast with computers, are good at heuristic
   disambiguation and tolerate wide variability in structure.  In order
   for computer and network based systems to support global naming and
   access to resources that have perhaps an indeterminate lifetime, the
   flexibility and attendant unreliability of human-friendly names
   should be translated into a naming infrastructure more appropriate
   for the underlying support system.  It is this underlying support
   system that the Internet Information Infrastructure Architecture
   (IIIA) is addressing.

   Within the IIIA, several sorts of information about resources are
   specified and divided among different sorts of structures, along
   functional lines.  In order to access information, one must be able
   to discover or identify the particular information desired, and to
   determine both how and where it might be used or accessed.  The


Sollins & Masinter                                              [Page 1]

INTERNET-DRAFT  Requirements for Uniform Resource Names   Sept. 10, 1994


   partitioning of the functionality in this architecture is into
   uniform resource names (URN), uniform resource characteristics (URC),
   and uniform resource locators (URL).  A URN identifies a resource or
   unit of information.  It may identify, for example, intellectual
   content, a particular presentation of intellectual content, or
   whatever a name assignment authority determines is a distinctly namable
   entity.  A URL identifies the location or a container for an instance
   of a resource identified by a URN.  The resource identified by a URN
   may reside in one or more locations at any given time, may move, or
   may not be available at all.  Of course, not all resources will move
   during their lifetimes, and not all resources, although identifiable
   and identified by a URN, will be instantiated at any given time.  As
   such, a URL identifies a place where a resource may reside, or a
   container, as distinct from the resource itself identified by the
   URN.  A URC is a set of meta-level information about a resource.
   Some examples of such meta-information are: owner, encoding, access
   restrictions (perhaps for particular instances), cost.
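
   As an illustration only, the partitioning described above can be
   sketched as a small data model.  The class and field names are
   assumptions made for this sketch and are not defined by this
   document, and the "urn:example:" string is a placeholder, since no
   URN syntax is specified here.

```python
from dataclasses import dataclass, field


@dataclass
class Characteristics:
    """URC: meta-level information about a resource (fields are examples)."""
    owner: str = ""
    encoding: str = ""
    access_restrictions: str = ""
    cost: str = ""


@dataclass
class Resource:
    """A named resource: one URN, one URC, and zero or more URLs."""
    urn: str                                   # identity; never changes
    urc: Characteristics = field(default_factory=Characteristics)
    urls: list = field(default_factory=list)   # current locations; may be empty

    def is_instantiated(self) -> bool:
        """True if the resource currently resides in at least one location."""
        return bool(self.urls)


# Hypothetical usage: the name exists before any location does.
report = Resource(urn="urn:example:report-1")
```

   A resource with an empty list of URLs is still named; it is simply
   not instantiated anywhere at that moment.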

   With this in mind, we can make the following statement:

   o  The purpose or function of a URN is to provide a globally unique,
      persistent identifier used for recognition, for access to
      characteristics of the resource or for access to the resource
      itself.

   More specifically, there are two kinds of requirements on URNs:
   requirements on the functional capabilities of URNs, and requirements
   on the way URNs are encoded in data streams and written
   communications.

2. Requirements for functional capabilities

   These are the requirements for URNs' functional capabilities:

   o Global scope: A URN is a name with global scope which does not
     imply a location.  It has the same meaning everywhere.

   o Global uniqueness: The same URN will never be assigned to two
     different resources.

   o Persistence: It is intended that the lifetime of a URN be
     permanent.  That is, the URN will be globally unique forever, and
     may well be used as a reference to a resource well beyond the
     lifetime of the resource it identifies or of any naming authority
     involved in the assignment of its name.

   o Scalability: URNs can be assigned to any resource that might
     conceivably be available on the network, for hundreds of years.


   o Legacy support: The scheme must permit the support of existing
     legacy naming systems.  For example, ISBN numbers, ISO public
     identifiers, UPC product codes and the like are naming schemes
     which should be allowed to be embedded within the URN system.

   o Extensibility: Any scheme for URNs must permit future extensions to
     the scheme.

   o Independence: It is solely the responsibility of a name issuing
     authority to determine the conditions under which it will issue a
     name.

   o Resolution: A URN will not impede resolution (translation into a
     URL, q.v.). To be more specific, for URNs that have corresponding
     URLs, there must be some feasible mechanism to translate a URN to a
     URL.

3. Requirements for URN encoding

   In addition to requirements on the functional elements of the URNs,
   there are requirements for how they are encoded in a string:

   o Single encoding: The encoding for presentation for people in clear
     text, electronic mail and the like is the same as the encoding in
     other transmissions.

   o Simple comparison: A comparison algorithm for URNs is simple,
     local, and deterministic. That is, there is a single algorithm for
     comparing two URNs that does not require contacting any external
     server, is well specified and simple.

   o Human transcribability: For URNs to be easily transcribable by
     humans without error, they should be short, use a minimum of
     special characters, and be case insensitive. (There is no strong
     requirement that it be easy for a human to generate or interpret a
     URN; explicit human-accessible semantics of the names is not a
     requirement.)  For this reason, URN comparison is insensitive to
     case, and probably white space and some punctuation marks.

   o Transport friendliness: A URN can be transported unmodified in the
     common Internet protocols, such as TCP, SMTP, FTP, Telnet, etc., as
     well as printed paper.

   o Machine consumption: A URN can be parsed by a computer.

   o Text recognition: The encoding of a URN should enhance the
     ability to find and parse URNs in free text.
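
   The comparison and text recognition requirements above can be
   illustrated with a minimal sketch.  This document does not define a
   URN syntax, so the "urn:" prefix, the angle-bracket delimiters, the
   set of ignored punctuation, and the function names below are all
   assumptions made purely for illustration.

```python
import re

# Assumption: characters ignored when comparing. The text above says only
# "probably white space and some punctuation marks", so this set is a guess.
IGNORED = re.compile(r"[\s\-.]")

# Assumption: URNs appearing in free text are wrapped as <urn:...>.
URN_IN_TEXT = re.compile(r"<(urn:[^<>]+)>", re.IGNORECASE)


def normalize(urn):
    """Drop ignored characters and fold case; local and deterministic."""
    return IGNORED.sub("", urn).casefold()


def same_urn(a, b):
    """Compare two URNs without contacting any external server."""
    return normalize(a) == normalize(b)


def find_urns(text):
    """Return the normalized form of every delimited URN found in free text."""
    return [normalize(match) for match in URN_IN_TEXT.findall(text)]
```

   Under these assumptions, same_urn("URN:Example:A-1",
   "urn:example:a1") holds, which is the kind of tolerance a person
   copying a URN from paper would need.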


4. Implications

   For a URN specification to be acceptable, it must meet the previous
   requirements.  We draw a set of conclusions, listed below, from
   those requirements; a specification that satisfies the requirements
   without meeting these conclusions is deemed acceptable, although
   unlikely to occur.

   o To satisfy the requirements of uniqueness and scalability, name
     assignment is delegated to naming authorities, who may then assign
     names directly or delegate that authority to sub-authorities.
     Global uniqueness is ensured by requiring each naming authority to
     guarantee uniqueness among the names it assigns.  The names of the
     naming authorities themselves are persistent and globally unique,
     and top level authorities will be centrally registered.

   o Naming authorities that support scalable naming are encouraged, but
     not required.  Scalability implies that a scheme for devising names
     may be scalable both at its terminators and within the
     structure; e.g., in a hierarchical naming scheme, a naming
     authority might have an extensible mechanism for adding new
     sub-registries.

   o It is strongly recommended that there be a mapping between the
     names generated by each naming authority and URLs.  At any specific
     time there will be zero or more URLs into which a particular URN
     can be mapped.  The naming authority itself need not provide the
     mapping from URN to URL.

   o For URNs to be transcribable and transported in mail, it is
     necessary to limit the character set usable in URNs, although there
     is not yet consensus on what the limit might be.

   In assigning names, a name assignment authority must abide by the
   preceding constraints, as well as defining its own criteria for
   determining the necessity or indication of a new name assignment.
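
   The delegation model in the first conclusion above can be sketched
   as follows.  This is a minimal illustration under stated
   assumptions, not a proposed mechanism: the ":" separator, the class
   and function names, and the in-memory registry are all hypothetical.

```python
SEPARATOR = ":"  # assumption: this document does not define a separator


def _check(local_name):
    """Reject names that would make the joined full name ambiguous."""
    if not local_name or SEPARATOR in local_name:
        raise ValueError(f"invalid local name: {local_name!r}")


class NamingAuthority:
    """Guarantees uniqueness only among the names it assigns itself."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self._children = {}     # sub-authorities, keyed by local name
        self._assigned = set()  # names issued so far; never released

    def path(self):
        """Authority names from the top-level authority down to this one."""
        return (self.parent.path() if self.parent else []) + [self.name]

    def delegate(self, local_name):
        """Create a sub-authority whose name is unique under this one."""
        _check(local_name)
        if local_name in self._children:
            raise ValueError(f"sub-authority {local_name!r} already exists")
        child = NamingAuthority(local_name, parent=self)
        self._children[local_name] = child
        return child

    def assign(self, local_name):
        """Issue a name exactly once and return the full, global name."""
        _check(local_name)
        if local_name in self._assigned:
            raise ValueError(f"{local_name!r} already assigned")
        self._assigned.add(local_name)
        return SEPARATOR.join(self.path() + [local_name])


def register_top_level(registry, name):
    """Centrally register a top-level authority in the dict `registry`."""
    _check(name)
    if name in registry:
        raise ValueError(f"top-level authority {name!r} already registered")
    registry[name] = NamingAuthority(name)
    return registry[name]
```

   Because each level refuses duplicates locally and authority names
   are themselves unique, the joined name is globally unique without
   any authority consulting another.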

5. Other considerations

   There are three issues about which this document has intentionally not
   taken a position, because it is believed that these are issues to be
   decided by local determination or other services within an information
   infrastructure.  These issues are equality of resources, reflection of
   visible semantics in a URN, and name resolution.

   One of the ways in which naming authorities, the assigners of names,
   may choose to make themselves distinctive is by the algorithms by
   which they distinguish or do not distinguish resources from each
   other.  For example, a publisher may choose to distinguish among
   multiple printings of a book, in which minor spelling and
   typographical mistakes have been made, but a library may prefer not to
   make that distinction.  Furthermore, no one algorithm for testing
   for equality is likely to be applicable to all sorts of information.
   For example, an algorithm based on testing the equality of two
   books is unlikely to be useful when testing the equality of two
   spreadsheets.  Thus, although this document requires that any
   particular naming authority use one algorithm for determining
   whether two resources it is comparing are the same or different,
   each naming authority can use a different such algorithm and a
   naming authority may restrict the set of resources it chooses to
   identify in any way at all.

   A naming authority will also have some algorithm for actually choosing
   a name within its namespace.  It may have an algorithm that actually
   embeds in some way some knowledge about the resource.  In turn, that
   embedding may or may not be made public, and may or may not be visible
   to potential clients.  For example, an unreflective URN scheme simply
   provides monotonically increasing serial numbers for resources.  This
   conveys nothing other than the identity determined by the equality
   testing algorithm and an ordering of name assignment by this server.
   It carries no information about the resource itself.  An MD5 digest
   of the resource at some point may, in and of itself, be reflective of
   its contents, and, in fact, the naming authority may be perfectly
   willing to publish the fact that it is using MD5, but if the resource
   is mutable, it still will be the case that any potential client cannot
   do much with the URN other than check for equality.  If, in contrast,
   a URN scheme has much in common with the assignment of ISBN numbers,
   the algorithm for assigning them is public and by knowing it, given a
   particular ISBN number, one can learn something more about the
   resource in question.  This full range of possibilities is allowed
   according to this requirements document, although it is intended that
   naming authorities be discouraged from making accessible to clients
   semantic information about the resource, on the assumption that that
   may change with time and therefore it is unwise to encourage people in
   any way to depend on that semantics being valid.
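
   The difference between an unreflective scheme and one reflective of
   content can be sketched as follows.  The class names and the
   still_matches helper are hypothetical, and the sketch does not model
   an authority's equality testing algorithm; it only shows what a
   client can and cannot learn from each kind of name.

```python
import hashlib
import itertools


class SerialAuthority:
    """Unreflective: names are serial numbers, unrelated to the content."""

    def __init__(self):
        self._counter = itertools.count(1)

    def assign(self, content):
        # The content is ignored; only the order of assignment is recorded.
        return str(next(self._counter))


class DigestAuthority:
    """Reflective at assignment time: the name is an MD5 digest."""

    def assign(self, content):
        return hashlib.md5(content).hexdigest()


def still_matches(name, content):
    """Check whether content still hashes to a digest-based name."""
    return hashlib.md5(content).hexdigest() == name


serial = SerialAuthority()
digest = DigestAuthority()
name = digest.assign(b"first version")
```

   Once a mutable resource changes, still_matches(name, new_content)
   fails even though the name remains valid, so the digest is of no
   further use to a client except as an opaque string to compare.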

   Last, this document intentionally does not address the problem of name
   resolution, other than to recommend that for each naming authority a
   name translation mechanism exist.  Naming authorities assign names,
   while resolvers or location services of some sort assist or provide
   URN to URL mapping.  There may be one or many such services for the
   resources named by a particular naming authority.  It may also be the
   case that there are generic ones providing service for many resources of
   differing naming authorities.  Some may be authoritative and others
   not.  Some may be highly reliable or highly available or highly
   responsive to updates or highly focussed by other criteria such as
   subject matter.  Of course, it is also possible that some naming
   authorities will also act as resolvers for the resources they have
   named.  This document supports and encourages third party and
   distributed services in this area, and therefore intentionally makes
   no statements about requirements of URNs or naming authorities on
   resolvers.
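
   As a sketch of the separation between naming and resolution
   described above, the following models several independent resolvers,
   each mapping a URN to zero or more URLs.  The Resolver class, the
   resolve_any function, and the policy of consulting authoritative
   resolvers first are assumptions of this sketch; this document places
   no such requirement on resolvers.

```python
class Resolver:
    """A location service, possibly run by a third party."""

    def __init__(self, name, authoritative=False):
        self.name = name
        self.authoritative = authoritative
        self._table = {}  # URN -> list of URLs known to this resolver

    def register(self, urn, url):
        """Record one more location for a URN."""
        self._table.setdefault(urn, []).append(url)

    def resolve(self, urn):
        """Return zero or more URLs; an unknown URN yields an empty list."""
        return list(self._table.get(urn, []))


def resolve_any(urn, resolvers):
    """Merge answers from several resolvers, authoritative ones first."""
    # sorted() is stable, so resolvers keep their given order within
    # the authoritative and non-authoritative groups.
    ordered = sorted(resolvers, key=lambda r: not r.authoritative)
    urls = []
    for resolver in ordered:
        for url in resolver.resolve(urn):
            if url not in urls:  # drop duplicates, keep first occurrence
                urls.append(url)
    return urls
```

   Nothing in the sketch requires a naming authority to run a resolver
   itself; a resolver only needs to know the names, not to have
   assigned them.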


Security Considerations
   
   Applications that require translation from names to locations, and
   the resources themselves, may require the resources to be
   authenticated. It seems generally that the information about the
   authentication of either the name or the resource to which it refers
   should be carried by separate information passed along with the URN
   rather than in the URN itself.

Authors' Addresses

   Larry Masinter                    Karen Sollins
   Xerox Palo Alto Research Center   MIT Laboratory for Computer Science
   3333 Coyote Hill Road             545 Technology Square
   Palo Alto, CA 94304               Cambridge, MA 02139
   masinter@parc.xerox.com           sollins@lcs.mit.edu
   Voice: (415) 812-4365             Voice: (617) 253-6006
   Fax:   (415) 812-4333             Tel:   (617) 253-2673
   
