draft-ietf-uri-urn2urc-00.txt                   URI working group
Expires 16 October 1994                         19 Feb 94

                URN to URC resolution scenario
                ==============================

Abstract
========

This document is intended to address the issue of URN to URC resolution
at a level between the IIIR Vision document [clw/peterd:1] and the
various standards documents such as the URL specification [timbl].
It is also intended to offer some pointers for people who
might want to implement URNs in information systems they are building.

Status of this Memo
===================
     This document is an Internet-Draft.  Internet-Drafts are
     working documents of the Internet Engineering Task Force
     (IETF), its areas, and its working groups.  Note that other
     groups may also distribute working documents as
     Internet-Drafts.
 
     Internet-Drafts are draft documents valid for a maximum of six
     months.  Internet-Drafts may be updated, replaced, or obsoleted
     by other documents at any time.  It is not appropriate to use
     Internet-Drafts as reference material or to cite them other
     than as a ``working draft'' or ``work in progress.''
 
     To learn the current status of any Internet-Draft, please check
     the 1id-abstracts.txt listing contained in the Internet-Drafts
     Shadow Directories on ds.internic.net, nic.nordu.net,
     ftp.isi.edu, or munnari.oz.au.
 

Sample URC
==========
Title:          URN to URC resolution scenario
Version:        0.3
URN:            Unknown - how about <urn:ietf/pandora/mitra:urn2urc>
URL:            ftp://pandora.sf.ca.us/pub/mitra/urn2urc-04.txt
URL:            ftp://ftp.isi.edu/internet-drafts/draft-ietf-uri-urn-00.txt
Format:         Text/plain              
Cost:           0
Date:           19 Feb 94
Comment:        This version has had minor changes since 0.3


Note
====

{Throughout the document, I've tried to identify places where discussion
 is still going on that may affect this proposal, and to mark them with
 '{' and '}'.}

The first version was written prior to IETF Houston. The discussion has
moved along a bit since then, but it isn't worth updating this document
until more choices are clarified.

Acronym Soup
============

I've avoided defining any new acronyms in this document. The following
acronyms will be used.

URI     Uniform Resource Identifier: Any of URN, URL, URC etc.
URN     Uniform Resource Name: A persistent, location-independent
        identifier for an object.
URL     Uniform Resource Locator: The address of an object; it contains
        enough information to identify a protocol and retrieve the object.
URC     Uniform Resource Characteristics: Any combination of one or more
        URNs or URLs with meta-information. The set of information in
        a URC is not defined. In some documents this is referred to as
        URT or URM. [mm]
Id Authority    That part of a URN which identifies the authority
        that issued the URN. In documents written before it became
        a hierarchical entity [clw/peterd:2], it is usually split into
        two parts, "Naming Authority" and "Publisher Id".
IIIR    Integration of Internet Information Retrieval.

General scenario
================

The general scenario described here might proceed as follows. Although
this document is not constrained by the scenario, it is useful to
articulate it so that we can check that what we are proposing makes sense.

1) A client program, running on a user's workstation, receives a
hypertext document, menu, search result etc. containing a number of
URNs.

2) The user selects one of the URNs.

3) The client locates a URN->URC resolution service.

4) The client contacts the URN->URC resolution service, and
retrieves a number of URLs for this document, along with
meta-information about those URLs (e.g. cost and format) and about
the URN itself.

5) The user, or the client, picks the "best" URL.

6) The client either retrieves the URL itself (e.g. via its access
library) or, if the URL is for an access method it doesn't speak, via a
gateway service.

7) The client either displays the object or, if it's in a format it
doesn't handle, launches an appropriate viewer.

Technical detail for each step
==============================

1) The client program receives the URNs.

The URNs are embedded in an object or other data structure, so the
client needs a way to locate and extract those URNs.

The current proposal is that in
text they are always of the form <urn:xxx/yyy:ABC123456> where:

xxx/yyy is a hierarchical identifier (read left to right) for the ID
authority. {Note: other proposals are to use one or more ":" as separators
for the hierarchy, or to reverse it to look like a FQDN.}

ABC123456 is an opaque string assigned by the ID authority.

For more information on URNs see the current version of the URN
document [clw/peterd:2].
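
As a rough illustration only (it is not part of the proposal), a client
might locate URNs of the above form in a piece of text roughly as
follows. The regular expression and the function name extract_urns are
my own assumptions; the real syntax is whatever the URN document ends
up defining.

```python
import re

# Assumed pattern for the <urn:authority:opaque> form described above.
# The ID authority is everything up to the first ":" after "urn:"; the
# opaque string is everything after that, up to the closing ">".
URN_PATTERN = re.compile(r"<urn:([^:<>\s]+):([^<>\s]+)>")


def extract_urns(text):
    """Return a list of (authority_components, opaque_string) pairs."""
    found = []
    for match in URN_PATTERN.finditer(text):
        authority, opaque = match.group(1), match.group(2)
        # The authority is hierarchical, read left to right.
        found.append((authority.split("/"), opaque))
    return found


sample = "See <urn:xxx/yyy:ABC123456> and <urn:ietf/pandora/mitra:urn2urc>."
for components, opaque in extract_urns(sample):
    print(components, opaque)
```

If one of the other punctuation proposals is adopted, only the pattern
and the split character should need to change.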


2) The user selects one of those URNs.

For this to be meaningful there must be enough meta-information with
the URN for the user to make a selection. Depending on the protocol,
this might be outside the standardisation process (e.g. a
textual description in a mail message). Or it might be part of another
protocol (e.g. the title in gopher or html). Or it might be that the
URN is received embedded in a URC.

3) The client locates a URN->URC resolution service. 

The URN needs to contain enough information within it to locate the
resolution service. The client can extract the ID Authority from the
URN; in the example above this would be xxx/yyy. This is reversed and
".urn" appended to form a FQDN of yyy.xxx.urn.

{If other punctuation schemes are adopted, then the process will
change, but the principle remains the same. Ditto if we adopt a
different top-level domain. This does have implications for the
character set allowed in the ID Authority part of a URN: it either has
to be those characters allowed in a FQDN, or an escaping scheme has to
be chosen.}
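
The mapping from ID Authority to FQDN can be sketched as below. This
assumes the "/" separator and the ".urn" top-level domain used in this
document, both of which are still open questions; the function names
are hypothetical, and no escaping of characters is attempted.

```python
def id_authority_to_fqdn(id_authority, top_level="urn"):
    """Reverse the "/"-separated hierarchy and append the top-level domain."""
    components = id_authority.split("/")
    return ".".join(reversed(components)) + "." + top_level


def urn_to_fqdn(urn):
    """Take a URN such as "urn:xxx/yyy:ABC123456" and return the FQDN."""
    # Split into the "urn" prefix, the ID authority and the opaque string.
    prefix, id_authority, opaque = urn.strip("<>").split(":", 2)
    if prefix.lower() != "urn":
        raise ValueError("not a URN: " + urn)
    return id_authority_to_fqdn(id_authority)


print(urn_to_fqdn("<urn:xxx/yyy:ABC123456>"))
```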

This FQDN can be passed to the DNS, which can resolve it to an address.
While this might involve several network accesses to traverse the
hierarchy, the standard caching and UDP parts of DNS make this an
efficient process. Typically this requires just a call to
"gethostbyname", which returns an IP address.
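
A minimal sketch of that lookup follows. The wrapper name, the stub
resolver and the address in its table are assumptions made purely for
illustration (no ".urn" domain exists to look anything up in); only the
gethostbyname call itself is the standard one.

```python
import socket


def resolve_service_host(fqdn, resolver=socket.gethostbyname):
    """Return an IP address string for fqdn, or None if the lookup fails.

    The resolver argument exists only so the lookup can be stubbed out;
    by default it is the standard gethostbyname call.
    """
    try:
        return resolver(fqdn)
    except OSError:  # socket.gaierror and socket.herror are subclasses
        return None


def stub_resolver(name):
    """Stand-in for the DNS, with a made-up table of one host."""
    table = {"yyy.xxx.urn": "192.0.2.10"}  # illustrative address only
    if name not in table:
        raise socket.gaierror("unknown host: " + name)
    return table[name]


print(resolve_service_host("yyy.xxx.urn", resolver=stub_resolver))
print(resolve_service_host("nowhere.urn", resolver=stub_resolver))
```

Returning None rather than raising lets the caller decide how to tell
the user that no resolution service could be found.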

Some concerns have been raised about increasing the load on
the fragile DNS system and software. Note that in this scheme the
DNS needs no records changed or added, and only ID authorities are
registered - not documents.  The total increased load on DNS for any
transaction is going to be of a similar order to the load for any
document retrieval etc.

4) The client contacts the URN->URC service.

If we are to avoid mucking with the DNS, then the URN->URC service is
going to have to be on a registered port, talking a known protocol.

Currently the proposal is to use a subset of the whois++ protocol, but
sitting on a different port. The simplest query is of the form:

Client->Server:         Template=URC;URN="urn:xxx/yyy:ABCD123456"

Server->Client:         URN:    urn:xxx/yyy:ABCD123456
                        Author: Mitra <mitra@path.net>
                        URL:    gopher://path.net/00/papers/mitra/urn2urc
                        Format: Text/plain
                        URL:    ftp://path.net/pub/docs/urn2urc.ps
                        Format: Application/postscript
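
To make the exchange above concrete, here is a sketch of how a client
might build the query and read the reply. It assumes the reply is plain
"Field: value" lines and that fields following a URL line describe that
URL (which is why field order matters); neither assumption is settled,
and the function names are mine.

```python
def build_query(urn):
    """Build the simplest query shown above for a given URN."""
    return 'Template=URC;URN="{}"'.format(urn)


def parse_response(text):
    """Split a reply into general fields and per-URL location records."""
    urc = {"fields": [], "locations": []}
    current = None  # the location record being filled in, if any
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip anything that is not "Field: value"
        name, value = line.split(":", 1)
        name, value = name.strip(), value.strip()
        if name.lower() == "url":
            current = {"URL": value}  # each URL starts a new record
            urc["locations"].append(current)
        elif current is not None:
            current[name] = value  # e.g. the Format of the URL above it
        else:
            urc["fields"].append((name, value))
    return urc


SAMPLE_RESPONSE = "\n".join([
    "URN:    urn:xxx/yyy:ABCD123456",
    "Author: Mitra <mitra@path.net>",
    "URL:    gopher://path.net/00/papers/mitra/urn2urc",
    "Format: Text/plain",
    "URL:    ftp://path.net/pub/docs/urn2urc.ps",
    "Format: Application/postscript",
])

print(build_query("urn:xxx/yyy:ABCD123456"))
print(parse_response(SAMPLE_RESPONSE))
```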

Decisions are needed about what the minimum set of queries for
URN->URC resolution is, and also whether the returned information is in
whois++ format or in URC format, however we define that.

However this is defined, it is going to be a subset of the evolving
whois++, and is going to have to be an extensible protocol, in the
sense that we are going to want to add more functionality to URN
servers as time goes by.  Therefore, URN->URC servers should fail
gracefully if a client requests a function they don't support, and
clients should behave gracefully if the server doesn't support a
feature they request. Crashing because you see an unrecognised field
or request is NOT in conformance with this standard. See the whois++
document [peterd:3].

{Note: Whois++ needs to add a statement that order cannot be arbitrarily
rearranged in templates.}

5) The user, or the client, chooses the "best" URL.

In some cases the URC will contain enough information to pick the
document without further input from the user. For instance, in the
above case, if the client doesn't support PostScript, then it might
automatically select the "Text/plain" version.
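
One way a client might make that automatic choice is sketched below.
The record layout (a list of dictionaries with "URL" and "Format" keys)
and the rule "choose automatically only when exactly one format is
usable" are assumptions for illustration, not something this document
prescribes.

```python
def choose_url(locations, supported_formats):
    """Return the only usable URL, or None if the user has to decide."""
    supported = {fmt.lower() for fmt in supported_formats}
    usable = [loc for loc in locations
              if loc.get("Format", "").lower() in supported]
    if len(usable) == 1:
        return usable[0]["URL"]
    return None  # nothing usable, or several candidates: ask the user


locations = [
    {"URL": "gopher://path.net/00/papers/mitra/urn2urc",
     "Format": "Text/plain"},
    {"URL": "ftp://path.net/pub/docs/urn2urc.ps",
     "Format": "Application/postscript"},
]

# A client that cannot display PostScript ends up with the text version.
print(choose_url(locations, ["text/plain"]))
```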


In most cases, the client is going to have to present some kind of
menu, or dialog box to a user for him or her to make that decision. This
means that the meta-information in a template should ideally be
interpretable by the computer, and should if possible be
human-readable. 


It would be useful to enable experimentation at this time, before
agreement on these fields is reached. So for now, template fields
shall consist of any registered MIME field (e.g. Content-Type) or any
IAFA template field.  If these are insufficient then select a field
from the report of the "non-existent" Data Elements Working Group
[NEDEWG].  If other fields are needed, prefix them with "X-".

It is hoped that the URI group, or some other WG, can gradually standardize
the contents of most of these fields. However, in the shifting world of
information systems this will never be a complete task, so clients
should never choke on an element they don't recognize.
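
A sketch of what "not choking" might look like in a client: fields the
client understands are kept, and everything else (including "X-"
experimental fields) is set aside rather than treated as an error. The
set of known field names here is hypothetical; it is not a list taken
from MIME, IAFA or any working group report.

```python
# Hypothetical set of fields this particular client understands.
KNOWN_FIELDS = {"urn", "url", "title", "author", "format", "cost", "date"}


def split_fields(pairs):
    """Separate (name, value) pairs into understood and ignored lists."""
    understood, ignored = [], []
    for name, value in pairs:
        if name.lower() in KNOWN_FIELDS:
            understood.append((name, value))
        else:
            # Unknown or experimental field: keep it aside, never fail.
            ignored.append((name, value))
    return understood, ignored


pairs = [("Title", "URN to URC resolution scenario"),
         ("X-Mood", "hopeful"),
         ("Cost", "0"),
         ("Frobnication", "high")]
understood, ignored = split_fields(pairs)
print(understood)
print(ignored)
```

Keeping the ignored fields around, rather than discarding them, lets
the client still show them to the user if that turns out to be helpful.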


6) The client retrieves the URL.

The URL theoretically contains enough information for a client to
retrieve it. There are three possibilities for what a well-behaved
client might do:
a) The client is clever enough to understand the protocol, and passes
the URL to its access library.
b) The client knows about the protocol, but cannot handle it itself,
in which case it can pass the URL to a gateway that it knows handles this
protocol.
c) The client has never heard of the protocol, in which case it should
hand the URL to its default gateway, and hope for the best.
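
The three cases above amount to a dispatch on the URL's scheme, which
can be sketched as follows. The scheme lists and gateway host names are
made up for the example; a real client would take them from its own
configuration.

```python
from urllib.parse import urlsplit

# Hypothetical client configuration.
NATIVE_SCHEMES = {"ftp", "gopher"}            # handled by the access library
GATEWAYS = {"wais": "gateway.example.net"}    # known, but not spoken natively
DEFAULT_GATEWAY = "default-gateway.example.net"


def plan_retrieval(url):
    """Return (method, gateway_host) for cases a), b) and c) above."""
    scheme = urlsplit(url).scheme.lower()
    if scheme in NATIVE_SCHEMES:
        return ("access-library", None)            # case a)
    if scheme in GATEWAYS:
        return ("gateway", GATEWAYS[scheme])       # case b)
    return ("default-gateway", DEFAULT_GATEWAY)    # case c)


print(plan_retrieval("gopher://path.net/00/papers/mitra/urn2urc"))
print(plan_retrieval("wais://path.net/papers"))
print(plan_retrieval("prospero://path.net/papers"))
```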

Of course, accessing this URL involves DNS lookup and other network
functionality, but this is a well understood problem.


7) The client displays the object.

The client now has an object - file, menu etc. By virtue of the
earlier steps, it also should have enough information to know what to
do with it. Typically this will involve either displaying the object
itself, or checking in some configuration table for an appropriate
application (e.g. xv) to pass the object to.
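
The configuration table mentioned above could be as simple as the
sketch below. The table entries are examples of my own (only xv is
mentioned in this document), and the function only decides what to do;
it does not launch anything.

```python
# Hypothetical configuration: formats shown by the client itself, and
# formats handed to an external application.
INTERNAL_FORMATS = {"text/plain"}
VIEWERS = {"image/gif": "xv", "application/postscript": "ghostview"}


def viewer_for(content_format):
    """Return ("internal", None), ("external", program) or ("unknown", None)."""
    key = content_format.lower()
    if key in INTERNAL_FORMATS:
        return ("internal", None)
    if key in VIEWERS:
        return ("external", VIEWERS[key])
    return ("unknown", None)  # e.g. offer to save the object to a file


print(viewer_for("Text/plain"))
print(viewer_for("image/gif"))
print(viewer_for("audio/basic"))
```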


Conclusion
==========

Hopefully, this document has outlined a scenario and some ways to
achieve it. I believe the scenario is generic enough to fit many
people's needs; if not, then let's outline alternative scenarios and
determine whether the techniques above are sufficient for handling them.


Other docs
==========
{I'd appreciate URLs etc. where these are missing}

X-Ref:          peterd:3
Description:    Whois++ protocol spec
Author:         ??
Title:          ??
URL:            ??


X-Ref:          timbl
Title:          Uniform Resource Locators
Author:         Tim Berners-Lee <timbl@info.cern.ch>
Date:           March 93
URL:            ftp://cnri.reston.va.us/internet-drafts/draft-ietf-uri-url-03.txt

X-Ref:          NEESNWG 
Description:    Report of the "non-existent" Element Set Names Working Group
Title:          ??
URL:            ??

X-Ref:          clw/peterd:1
Author:         Chris Weider <clw@merit.edu>
Author:         Peter Deutsch <peterd@bunyip.com>
Title:          A Vision of an Integrated Information Service
Date:           Oct 93
URL:            ftp://cnri.reston.va.us/internet-drafts/draft-ietf-iiir-vision-00.txt

X-Ref:          clw/peterd:2
Author:         Chris Weider <clw@merit.edu>
Author:         Peter Deutsch <peterd@bunyip.com>
Title:          Uniform Resource Names
URL:            ftp://cnri.reston.va.us/internet-drafts/draft-ietf-uri-resource-names-01.txt

X-Ref:          mm
Author:         Michael Mealling <oit.gatech.edu>
Title:          Uniform Resource Identifiers: The Grand Menagerie
URL:            http://www.gatech.edu/urm.paper
Date:           July 93


305