IETF IIIR Working Group                                     Chris Weider
INTERNET--DRAFT                                      Merit Network, Inc.
<draft-ietf-iiir-vision-01.txt>                            Peter Deutsch
                                                            Bunyip, Inc.
                                                           October, 1993

         A Vision of an Integrated Internet Information Service

Status of this Memo

In this paper, the authors put forth their vision of an integrated Internet
information service which ties together the information services currently
deployed and allows them to work together.

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and its Working Groups. Note that other groups may also distribute
working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted
by other documents at any time. It is not appropriate to use
Internet Drafts as reference material or to cite them other than
as a "working draft" or "work in progress."

Please check the I-D abstract listing contained in each Internet
Draft directory to learn the current status of this or any
other Internet Draft.

This Internet Draft expires April 25, 1994.

Abstract

This paper lays out a vision of how Internet information services might
be integrated over the next few years, and discusses in some detail what
steps will be needed to achieve this integration.

0: Acknowledgments

Thanks to the whole gang of information service wonks who have
wrangled with us about the future of information services in countless bar
bofs (in no particular order): Cliff Lynch, Cliff Neuman, Alan Emtage,
Jim Fullton, Joan Gargano, Mike Schwartz, John Kunze, Janet Vratny,
Mark McCahill, Tim Berners-Lee, John Curran, Jill Foster, and many others.
Extra special thanks to George Brett of CNIDR and Anders Gillner of
RARE, who have given us the opportunity to start tying together the
networking community and the librarian community.

1: Introduction

The current landscape in information tools is much the same as the landscape
in communications networks in the early 1980s. In the early '80s, there
were a number of proprietary networking protocols that connected large
but autonomous regions of computers, and it was difficult to coalesce these
regions into a unified network. Today, we have a number of large but
autonomous regions of networked information. We have a vast set of FTPable
files, a budding WAIS network, a budding GOPHER network, a budding World Wide
Web network, etc. Although there are a number of gateways between various
protocols, and information service providers are starting to use GOPHER
to provide a glue between various services, we are not yet in that golden
age when all human information is at our fingertips. (And we're even farther
from that platinum age when the computer knows what we're looking for and
retrieves it before we even touch the keyboard.)

IIIR Working Group     Vision of Information Services     Weider, Deutsch

In this paper, we'll propose one possible vision of the information services
landscape of the near future, and lay out a plan to get us there from here.

2: Axioms of information services

There are a number of unspoken assumptions that we've used in our discussions.
It might be useful to lay them out explicitly before we start our exploration.

The first is that there is no _unique_ information protocol that will
provide the flexibility, scale, responsiveness, worldview, and mix of services
that every information consumer wants. A protocol designed to give quick and
meaningful access to a collection of stock prices might look functionally
very different from one which will search digitized music for a particular
musical phrase and deliver it to your workstation. So, rather than design the
information protocol to end all information protocols, we will always need to
integrate new search engines, new clients, and new delivery paradigms into our
grand information service.

The second is that distributed systems are a better solution to large-scale
information systems than centralized systems. If one million people are
publishing electronic papers to the net, should they all have to log on to
a single machine to modify the central archives? What kind of bandwidth
would be required to that central machine to serve a billion papers a day?
If we replicate the central archives, what sort of maintenance problems would
be encountered? These questions and a host of others make it seem more
profitable at the moment to investigate distributed systems.

The third is that users don't want to be bothered with the details of the
underlying protocols used to provide a given service. Just as most people
don't care whether their e-mail message gets split up into 20 packets
and routed through Tokyo to get to its destination, information service
users don't care whether the GOPHER server used telnet to get to
a WAIS database back-ended by an SQL database. They just want the
information. In short, they care very much about how they interact with the
client; they just don't want to know what goes on behind the scenes.

These axioms force us to look at solutions which are distributed, support
multiple access paradigms, and allow information about resources to be
handed off from one system (say Gopher) to another (say WWW).

3: An architecture to provide interoperability and integration.

The basic architecture outlined in this paper is split into four levels
[Fig. 1].

At the lowest level, we have the resources themselves. These are such things
as files, telnet sessions, online library catalogs, etc. Each resource can
have a resource transponder attached [Weider 93a], and should have a
Uniform Resource Name (URN) [Weider 93b] associated with it to uniquely
identify its contents. If a resource transponder is attached, it will help
maintain the information required by the next level up.

At the next level, we have a 'directory service' that takes a URN and
returns Uniform Resource Locators (URLs) [Berners-Lee 93] for that resource.
The URL is a string which contains location information, and can be used by
a client to access the resource pointed to by the URL. It is expected that
a given resource may be replicated many times across the net, and thus the
client would get a number of URLs for a given resource, and could choose
among them based on some other criteria.

 ______________________________________________________________
|           |              |       |               |
|           |              |       |               |
|  Gopher   |     WAIS     |  WWW  |    Archie     | Others ...
|           |              |       |               |
|___________|______________|_______|_______________|___________
      |                                |
      |                       _________|____________
      |                      |                      |
      |                      |  Resource Discovery  |
      |                      |   System (perhaps    |
      |                      |   based on whois++)  |
      |                      |______________________|
      |                                |
      |                                |
 _____|________________________________|____
|                                           |
| Uniform resource name to uniform resource |
| locator mapping system (perhaps based on  |
| whois++ or X.500)                         |
|___________________________________________|
                    |
                    |
    ________________|______________________________________
       |                |              |               |
 ______|______   _______|_____   ______|______   ______|______
|             | |             | |             | |             |
| Transponder | | Transponder | | Transponder | | Transponder |
|_____________| |_____________| |_____________| |_____________|
|             | |             | |             | |             |
|             | |             | |             | |             |
|             | |             | |             | |             |
|  Resource   | |  Resource   | |  Resource   | |  Resource   |
|             | |             | |             | |             |
|             | |             | |             | |             |
|_____________| |_____________| |_____________| |_____________|

  Figure 1: Proposed architecture of an integrated information service

_________________________________________________________________________

The third level of the architecture is a resource discovery system. This
would be a large, distributed system which would accept search criteria
and return URNs and associated information for every resource which matched
the criteria. This would provide a set of URNs which the information service
providers (GOPHER servers, etc.) could then select among for incorporation.

The fourth level of the architecture consists of the various information
delivery tools. These tools are responsible for collating pointers to
resources, informing the user about the resources to which they contain
pointers, and retrieving the resources when the user wishes.

Let's take a look in greater detail at each of these levels.

3.1 Resource layer

The resources at this layer can be any collection of data a publisher
wishes to catalog. It might be an individual text file, a WAIS database,
the starting point for a hypertext web, or anything else. Each resource
is assigned a URN by the publisher, and the URL is derived from the
current location of the resource. The transponder is responsible for
updating levels 2 and 3 with the appropriate information as the resource
is published and moves around.

3.2 URN -> URL mapping

This level takes a URN and returns a number of URLs for the various
instantiations of that resource on the net. It will also maintain the URN
space. Thus the only functionality required of this level is the ability
to maintain a global namespace and to provide mappings from that namespace
to the URLs. Consequently, any of the distributed 'directory service'
protocols would allow us to provide that service. However, there may be some
benefit to collapsing levels 2 and 3 onto the same software, in which case
we may need to select the underlying protocol more carefully. For example,
X.500 provides exactly the functionality required by level 2, but does not
(yet) have the functionality required to provide the level 3 service.
In addition, the service at level 2 does not necessarily have to be
provided by a monolithic system. It can be provided by any collection of
protocols which can jointly satisfy the requirements and also interoperate,
so that level 2 appears to level 3 to be universal in scope.
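
As a rough illustration of the level 2 service, the Python sketch below maps
a URN to the URLs of its replicas and lets a client pick one of them. The
class name UrnResolver, its methods, the urn:example:... string, and the
"prefer a given host" selection rule are all assumptions made for this
example; this paper does not define a concrete interface or a URN syntax.

```python
from collections import defaultdict
from urllib.parse import urlparse


class UrnResolver:
    """Hypothetical in-memory stand-in for the level 2 URN -> URL service."""

    def __init__(self):
        # One URN may map to many URLs, one per replica of the resource.
        self._urls = defaultdict(list)

    def register(self, urn, url):
        """Record that a copy of the resource named by `urn` lives at `url`."""
        if url not in self._urls[urn]:
            self._urls[urn].append(url)

    def unregister(self, urn, url):
        """Forget a location, e.g. when a transponder reports a move."""
        if url in self._urls.get(urn, []):
            self._urls[urn].remove(url)

    def resolve(self, urn):
        """Return every known URL for `urn` (an empty list if unknown)."""
        return list(self._urls.get(urn, []))


def choose_url(urls, preferred_host):
    """Client-side choice among replicas: prefer one host, else the first."""
    for url in urls:
        if urlparse(url).hostname == preferred_host:
            return url
    return urls[0] if urls else None


resolver = UrnResolver()
urn = "urn:example:bogus-journey"  # made-up URN for illustration only
resolver.register(urn, "ftp://archive.example.org/pub/journey.txt")
resolver.register(urn, "gopher://mirror.example.net/0/journey.txt")
print(choose_url(resolver.resolve(urn), "mirror.example.net"))
```

A real deployment would distribute this table across many servers (whois++
or X.500, say); the dictionary here only shows the shape of the mapping.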

3.3 Resource discovery

This is the level which requires the most work, and where the greatest
traps lurk to entangle the unwary. This level needs to serve as a giant
repository of all information about every publication, except for that which
is required for the URN -> URL mapping. Since this part is the least filled
in at the moment, we will propose a mechanism which may or may not be the
one which is eventually used.

When a new resource is created on the network, it is assigned a URN determined
by the publisher of the resource. Section 4.1 discusses in more detail the
role of the publisher on the net, but at the moment we need consider only two
of the publisher's functions. The publisher is responsible for assigning a
URN out of the publisher's namespace, and is responsible for notifying a
publishing agent [Deutsch 92] that a new resource has been created; that
agent will either be a part of the resource location service (RLS) or will
then take the responsibility for notifying an external resource location
service that the resource has been created. Alternatively, the agent can use
the resource location service to find parts of the RLS which should be
notified that this resource has been created.

To give a concrete example, let's say that Peter and Chris publish a
multimedia document titled 'Chris and Peter's Bogus Journey' which talks about
our recent trip to the Antarctic, complete with video clips. Peter and Chris
would then ask their publishing agent to generate a URN for this document.
They then ask their publishing agent to attach a transponder to the document,
and to look around and see if anyone a) has asked that our agent notify them
whenever anything we write comes out; or b) is running any kind of server of
'trips to Antarctica'. Janet has posted a request that she be notified, so the
agent tells her that a new resource has been created. The agent also finds
three servers which archive video clips of Antarctica, so the agent notifies
all three that a new resource on Antarctica has come out, and gives out the
URN and a URL for the local copy.
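
The example above can be sketched in a few lines of Python. Everything here
is hypothetical: the PublishingAgent class, the idea that interested parties
register callbacks by author or by topic, and the urn:<namespace>:<number>
format are assumptions chosen to make the flow concrete, not a proposed
protocol.

```python
from dataclasses import dataclass, field


@dataclass
class PublishingAgent:
    """Hypothetical publishing agent: assigns URNs and announces resources."""

    namespace: str
    _counter: int = 0
    # author name -> callbacks of parties who asked to hear about that author
    author_watchers: dict = field(default_factory=dict)
    # topic keyword -> callbacks of servers that collect that topic
    topic_servers: dict = field(default_factory=dict)

    def watch_author(self, author, callback):
        self.author_watchers.setdefault(author, []).append(callback)

    def watch_topic(self, topic, callback):
        self.topic_servers.setdefault(topic, []).append(callback)

    def publish(self, authors, topics, local_url):
        """Assign a URN from this agent's namespace, then notify watchers."""
        self._counter += 1
        urn = f"urn:{self.namespace}:{self._counter}"
        callbacks = []
        for author in authors:
            callbacks += self.author_watchers.get(author, [])
        for topic in topics:
            callbacks += self.topic_servers.get(topic, [])
        for notify in callbacks:
            notify(urn, local_url)  # hand out the URN and a URL for our copy
        return urn


inbox = []  # records (recipient, urn, url) for every notification sent
agent = PublishingAgent(namespace="example-publisher")
agent.watch_author("Chris", lambda urn, url: inbox.append(("Janet", urn, url)))
for n in range(3):
    agent.watch_topic(
        "antarctica",
        lambda urn, url, n=n: inbox.append((f"server{n}", urn, url)),
    )
urn = agent.publish(["Peter", "Chris"], ["antarctica"],
                    "ftp://host.example.org/journey")
print(urn, [recipient for recipient, _, _ in inbox])
```

Here Janet and the three archive servers each receive the URN together with
a URL for the local copy, mirroring the narrative.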

3.4 Information delivery tools

One of the primary functions of an information delivery tool is to collect
and collate pointers to resources. A given tool may provide mechanisms to
group those pointers based on other information about the resource, e.g.
a full-text index allows one to group pointers to resources based on their
contents; archie can group pointers based on filenames, etc. The URLs which
are being standardized in the IETF are directly based on the way World Wide
Web built pointers to resources, by creating a uniform way to specify
access information and location for a resource on the net. With just the
URLs, however, it is impossible without much more extensive checking
to tell whether two resources with different URLs have the same intellectual
content or not. The URN is designed to solve this problem.

In this architecture, the pointers that a given information delivery tool
would keep to a resource will be a URN and one or more cached URLs. When
a pointer to a resource is first required (e.g. when a new resource is
linked in a Gopher server), level 2 will provide a set of URLs for that
URN, and the creator of the tool can then select which of those will
be used. As it is expected that the URLs will eventually become stale
(the resource moves, the machine goes down, etc.), the URN can be used
to get a set of current URLs for the resource and an appropriate one can
then be selected. Since the cost of using the level 2 service is probably
greater than the cost of simply resolving a URL, both the URN and the URL
are cached to provide speedy access unless something has moved.
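
The caching behaviour just described might look like the following Python
sketch. The Pointer class, the caller-supplied resolve and fetch functions,
and the convention that a stale URL raises OSError are assumptions made for
illustration.

```python
class Pointer:
    """Hypothetical pointer kept by a delivery tool: a URN plus a cached URL."""

    def __init__(self, urn, resolve, fetch):
        self.urn = urn
        self._resolve = resolve  # level 2 lookup: urn -> list of URLs
        self._fetch = fetch      # url -> content; raises OSError if stale
        self.cached_url = None

    def retrieve(self):
        # Fast path: try the cached URL without consulting level 2.
        if self.cached_url is not None:
            try:
                return self._fetch(self.cached_url)
            except OSError:
                self.cached_url = None  # stale: resource moved or host down
        # Slow path: ask level 2 for current URLs, cache the first that works.
        for url in self._resolve(self.urn):
            try:
                content = self._fetch(url)
            except OSError:
                continue
            self.cached_url = url
            return content
        raise LookupError(f"no reachable copy of {self.urn}")


# Fake level 2 service and fake network, for demonstration only.
live = {"ftp://new.example.org/doc": "hello"}
lookups = []


def fake_resolve(urn):
    lookups.append(urn)
    return ["ftp://old.example.org/doc", "ftp://new.example.org/doc"]


def fake_fetch(url):
    if url not in live:
        raise OSError(f"cannot reach {url}")
    return live[url]


p = Pointer("urn:example:doc", fake_resolve, fake_fetch)
p.cached_url = "ftp://old.example.org/doc"  # pretend the cache went stale
print(p.retrieve(), p.cached_url)
print(p.retrieve(), len(lookups))  # second call is served from the cache
```

Level 2 is consulted only when the cached URL fails, which is the point of
caching both the URN and the URL.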

3.5 Using the architecture to provide interoperability between services

In the simplest sense, each interaction that we have with an information
delivery service does one of two things: it either causes a pointer to
be resolved (a file to be retrieved, a telnet session to be initiated, etc.)
or causes some set of the pointers available in the information service to
be selected. At this level, the architecture outlined above provides the
core implementation of interoperability. Once we have a means of mapping
between names and pointers, and we have a standard method of specifying
names and pointers, the interoperation problem becomes one of simply handing
names and pointers around between systems. Obviously with such a simplistic
interoperability scheme much of the flavor and functionality of the
various systems is lost in transition. But, given the pointers, a system
can either a) present them to the user with no additional functionality or
b) resolve the pointers, examine the resources, and then run algorithms
designed to tie these resources together into a structure appropriate for
the current service. Let's look at one example (which just happens to be the
easiest to resolve): interoperation between World Wide Web and Gopher.

Displaying a Gopher screen as a WWW document is trivial with these pointers.
Every Gopher screen is simply a list of menu items with pointers behind them
(we'll ignore the other functionality Gopher provides for a moment), so it is
an extremely simple form of hypertext document. Consequently with this
architecture it is easy to show and resolve a Gopher screen in WWW.
For a WWW to Gopher map, the simplest method would be that when one accesses
a WWW document, all the pointers associated with links off to other
documents are brought in with the document. Gopher could then resolve
the links and read the first line of each document to provide a Gopher
style screen which contains everything in the WWW document. When a link
is selected, all of the WWW links for the new document are brought in and
the process repeats. Obviously we're losing a lot with the WWW -> Gopher
mapping; some might argue that we are losing everything. However, this
does provide a trivial interoperability capacity, and one can argue that
the 'information content' has been preserved across the gateway.
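
Both directions of this mapping can be sketched in Python. The function
names, the representation of a Gopher menu as a list of (label, URL) pairs,
and the use of HTML list markup for the WWW side are assumptions made for
this example; real gateways would have to deal with item types, search
items, and much more.

```python
from html import escape


def gopher_menu_to_html(title, items):
    """Render a Gopher-style menu, given as (label, url) pairs, as hypertext."""
    lines = [f"<h1>{escape(title)}</h1>", "<ul>"]
    for label, url in items:
        lines.append(f'<li><a href="{escape(url)}">{escape(label)}</a></li>')
    lines.append("</ul>")
    return "\n".join(lines)


def links_to_gopher_menu(link_urls, fetch):
    """Build a Gopher-style menu from the links of a hypertext document.

    Each label is the first non-blank line of the linked document, as the
    text suggests; `fetch` is a caller-supplied function mapping a URL to
    the text of the document found there.
    """
    menu = []
    for url in link_urls:
        text = fetch(url) or ""
        label = next((ln.strip() for ln in text.splitlines() if ln.strip()), url)
        menu.append((label, url))
    return menu


# Two made-up documents standing in for the targets of a page's links.
docs = {
    "http://a.example.org/x": "Trip report\nWe left in January...",
    "http://a.example.org/y": "\nVideo index\nclip1, clip2",
}
menu = links_to_gopher_menu(list(docs), docs.get)
print(menu)
print(gopher_menu_to_html("Antarctica", menu))
```

As the text notes, the WWW -> Gopher direction keeps only the link targets
and a one-line label for each; the surrounding prose of the page is lost.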

In addition, the whole purpose of gatewaying is to provide access to resources
that lie outside the reach of your current client. Since all resources are
identifiable and accessible through layers 2 and 3, it will be easy to
copy resources from one protocol to another since all we need to do is to
move the pointers and reexpress the relationships between the pointers in the
new paradigm.

3.6 Other techniques for interoperability

One technique for interoperability which has recently received some serious
attention is the technique of creating one client which speaks the protocols
of all the information delivery tools. This approach has been taken in
particular by the UNITE (User's Network Interface To Everything) group in
Europe. This client would sit on the top level of the architecture in Figure
1. This technique has a lot of appeal; however, there are several practical
difficulties with this approach which may hinder its successful
implementation.

The first difficulty is one that is common to clients in general: the clients
must be constantly updated to reflect changes in the underlying protocols and
to accommodate new protocols. If the increase in the number of information
services is very gradual, or if the underlying protocols do not change very
rapidly, this may not be an insuperable difficulty. In addition, old clients
must have some way of notifying their user that they are no longer current;
otherwise they will no longer be able to access parts of the information mesh.

The second problem is one which may prove more difficult. Each of the
currently deployed information services provides information in a
fundamentally different way. In addition, new information services are likely
to use completely new paradigms for the organization and display of the
information they provide. The various clients of these information services
provide vastly different functionality from each other because the underlying
protocols allow different techniques. It may very well prove impossible to
create a single client which allows access to the full functionality of each
of the underlying protocols while presenting a consistent user interface to
the user.

We will continue to follow this work and may include it in future revisions of
this architecture if it bears fruit.

4: Human interactions with the architecture

In this section we will look at how humans might interact with an information
system based on the above architecture.

4.1 Publishing in this architecture

When we speak of publishing in this section, we are referring only to the
limited process of creating a resource on the net, assigning it a URN, and
spreading the information around that we have created a new resource.

We start with the creation of a resource. Simple enough: a creative thought,
a couple of hours typing, and a few cups of coffee and we have a new resource.
We then wish to assign it a URN. We can talk to whichever publishing agent
we would like, whether it is our own personal publishing agent or some big
organization that provides URN and announcement services to a large number of
authors. Once we have a URN, we can provide the publishing agent with
a URL for our local copy of the resource and then let it do its thing.
Alternatively, we can attach a transponder to the resource, let it determine
a local URL for the resource, and let it contact the publishing agent and
set up the announcement process. One would expect a publishing agent to
prompt us for some information as to where it should announce our new
resource.

For example, we may just wish a local announcement, so that only people
in our company can get a pointer to the resource. Or we may wish some sort of
global announcement (but it will probably cost us a bit of money!).

Once the announcement has been made, the publishing agent will be contacted
by a number of pieces of the resource location system. For example, someone
running a WAIS server may decide to add the resource to their index. So
they can retrieve the resource, index it, and add the indexes to their tables
along with a URN-URL combination. Then when someone uses that WAIS server,
it can go off and retrieve the resource if necessary. Or, the WAIS server
could create a local copy of the resource; if it wished other people to
find its local copy of the resource, it could provide the URN -> URL
mapper with a URL for the local copy. In any case, publication becomes a
simple matter.
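
The indexing step described above can be sketched as follows. The
IndexServer class, the plain dictionary standing in for the URN -> URL
mapper, and the way local-copy URLs are formed are all assumptions made for
illustration; this is not how any particular WAIS server works.

```python
class IndexServer:
    """Hypothetical word index that stores URN/URL pairs for each resource."""

    def __init__(self, fetch, mapper=None, local_base=None):
        self._fetch = fetch            # url -> text of the resource
        self._mapper = mapper          # dict: urn -> list of URLs (level 2)
        self._local_base = local_base  # base URL for local copies, if any
        self.index = {}                # word -> set of URNs
        self.locations = {}            # urn -> URL this server retrieves from
        self.copies = {}               # local URL -> text

    def on_announcement(self, urn, url):
        """Retrieve an announced resource, index it, and record its pointer."""
        text = self._fetch(url)
        for word in set(text.lower().split()):
            self.index.setdefault(word, set()).add(urn)
        if self._local_base is None:
            self.locations[urn] = url  # keep pointing at the publisher's copy
            return
        # Keep a local copy and tell the mapper about the new replica.
        local_url = f"{self._local_base}/{len(self.copies)}"
        self.copies[local_url] = text
        self.locations[urn] = local_url
        if self._mapper is not None:
            self._mapper.setdefault(urn, []).append(local_url)

    def search(self, word):
        """Return (urn, url) pairs for resources containing `word`."""
        hits = self.index.get(word.lower(), ())
        return sorted((urn, self.locations[urn]) for urn in hits)


mapper = {"urn:example:journey": ["ftp://host.example.org/journey"]}
server = IndexServer(lambda url: "Penguins of Antarctica", mapper,
                     "wais://index.example.net/copies")
server.on_announcement("urn:example:journey", "ftp://host.example.org/journey")
print(server.search("penguins"))
print(mapper["urn:example:journey"])
```

After the announcement is processed, the mapper knows two locations for the
same URN: the publisher's original and the index server's local copy.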

So, where does this leave the traditional publisher? Well, there are a number
of other functions which the traditional publisher provides in addition to
distribution. There are editorial services, layout and design, copyright
negotiations, marketing, etc. The only part of the traditional role that this
system changes is that of distributing the resource; this architecture may
make it much cheaper for publishers to distribute their wares to a much wider
audience.

Although copying of resources would be possible just as it is in paper format,
it might be easier to detect republication of the resource in this system, and
if most people use the original URN for the resource, there may be a reduced
monetary risk to the publisher.

4.2 A librarian role in this architecture

We've been in a number of discussions with librarians over the past year,
and one question that we're frequently asked is "Does Peter talk this rapidly
all the time?". The answer to that question is "Yes". But another question
we are frequently asked is "If all these electronic resources are going to
be created, supplanting books and journals, what's left for the librarians?".
The answer to that is slightly more complex, but just as straightforward.
Librarians have excelled at obtaining resources, classifying them so that
users can find them, weeding out resources that don't serve their communities,
and helping users navigate the library itself. None of these roles is
supplanted by this architecture. The only difference is that instead of
scanning publishers' announcements for new resources their users might be
interested in, they will have to scan the announcements on the net.
Once they see something interesting, they can retrieve it (perhaps buying
a copy just as they do now), classify it, set up a navigation system for
their classification scheme, show users how to use it, and provide pointers
to (or actual copies of) the resource to their users. The classification and
selection processes in particular are services which will be badly needed
on a net with a million new 'publications' a day, and many will be willing
to pay for this highly value-added service.

4.3 Serving the users

This architecture allows users to see the vast collection of networked
resources in ways both familiar and unfamiliar. Bookstores, record shops,
and libraries can all be constructed on top of this architecture, with a
number of different access methods. Specialty shops and research libraries
can be easily built, and then tailored to a thousand different users.
One never need worry that a book has been checked out, that a CD is out of
stock, or that a copy of Xenophon in the original Greek isn't available
locally. In fact, a user could even engage a proxy server to translate
resources into forms that her machine can use, for example to get a text
version of a PostScript document although her local machine has no PostScript
viewer, or to obtain a translation of a sociology paper written in Chinese.

In any case, however the system looks in three, five, or fifty years, we
believe that the vision we've laid out above has the flexibility and
functionality to start tying everything together without forcing everyone
to use the same access methods or to look at the net the same way.
It allows new views to evolve, new resources to be created out of old,
and people to move from today to tomorrow with all the comforts of
home but all the excitement of exploring a new world.

5: References

[Berners-Lee 93] Berners-Lee, Tim, Universal Resource Locators, Internet
Draft, July 1993. Available for anonymous FTP as
<ftp://nic.merit.edu/documents/internet-drafts/draft-ietf-uri-url-01.txt>.

[Deutsch 92] Deutsch, Peter, Master's Thesis, June 1992. Available for
anonymous FTP as
<ftp://archives.cc.mcgill.ca/pub/peterd/peterd.thesis>.

[Weider 93a] Weider, Chris, Resource Transponders, Internet Draft, October,
1993. Available for anonymous FTP as
<ftp://nic.merit.edu/documents/internet-drafts/draft-ietf-iiir-transponders-01.txt>.

[Weider 93b] Weider, Chris and Peter Deutsch, Uniform Resource Names, Internet
Draft, October, 1993. Available for anonymous FTP as
<ftp://nic.merit.edu/documents/internet-drafts/draft-ietf-uri-resource-names-01.txt>.