4 Internet Engineering Task Force Ron Daniel Jr.
5 INTERNET-DRAFT Los Alamos National Laboratory
6 draft-ietf-uri-urc-req-01.txt Michael Mealling
7 Georgia Institute of Technology
8 March 24, 1995
9
10
11 URC Scenarios and Requirements
12
13
14
15
16
17 Status of this draft
18
19
20 This document is an Internet-Draft. Internet-Drafts are
21 working documents of the Internet Engineering Task Force
22 (IETF), its areas, and its working groups. Note that
23 other groups may also distribute working documents as
24 Internet-Drafts.
25
26 Internet-Drafts are draft documents valid for a maximum of
27 six months. Internet-Drafts may be updated, replaced, or
28 obsoleted by other documents at any time. It is not
29 appropriate to use Internet-Drafts as reference material or
30 to cite them other than as a ``working draft'' or ``work in
31 progress.''
32
33 To learn the current status of any Internet-Draft, please
34 check the 1id-abstracts.txt listing contained in the Internet-
35 Drafts Shadow Directories on ds.internic.net, nic.nordu.net,
36 ftp.isi.edu, or munnari.oz.au.
37
38 This Internet Draft expires September 22, 1995.
39
40
41 Abstract
42
43
44 This draft describes the place of the Uniform Resource Characteristic
45 (URC) service within the overall context of Uniform Resource
46 Identification on the Internet. It presents several scenarios
47 illustrating how the URC service might be used. From these usage
48 scenarios, we derive a set of requirements that any proposed URC
49 services must meet.
50
51
52 INTERNET-DRAFT URC Scenarios and Requirements March 24, 1995
53
54 Contents
55
56
57 1 Introduction 3
58
59
60 2 User Scenarios 3
61
62 2.1 URN to URL resolution . . . . . . . . . . . . . . . . . . . . 4
63
64 2.2 Meta-data for its own sake . . . . . . . . . . . . . . . . . 5
65
66 2.3 Ensuring the veracity of the resource . . . . . . . . . . . . 6
67
68 2.4 Ensuring the veracity of the URC . . . . . . . . . . . . . . 7
69
70 3 Provider Scenarios 8
71
72 3.1 Publishing a new resource . . . . . . . . . . . . . . . . . . 8
73
74 3.2 Publishing a new version of a resource . . . . . . . . . . . 9
75
76 3.3 Removing a location for a resource . . . . . . . . . . . . . 10
77
78 3.4 Establishing a new publishing authority . . . . . . . . . . . 10
79
80 3.5 Dealing with the demise of a publisher . . . . . . . . . . . 11
81
82
83 4 Third-Party Scenarios 11
84
85 4.1 Bibliographic Search . . . . . . . . . . . . . . . . . . . . 12
86
87 4.2 Annotations, Seals of Approval, etc. . . . . . . . . . . . . 13
88
89 4.3 Providing an additional location for a resource . . . . . . . 15
90
91 4.3.1 Mirroring of free information . . . . . . . . . . . . . . 15
92
93 4.3.2 Mirroring of information that is for sale . . . . . . . . 16
94
95 4.3.3 Mirroring on a regular basis . . . . . . . . . . . . . . . 16
96
97 4.4 Here be Librarians . . . . . . . . . . . . . . . . . . . . . 16
98
99 4.5 Republication . . . . . . . . . . . . . . . . . . . . . . . . 17
100
101 5 Requirements 18
102
103 5.1 Requirements on the URC . . . . . . . . . . . . . . . . . . . 18
104
105
106
107 Daniel and Mealling [Page 2]
108
109
112 5.2 Requirements on the URC Service . . . . . . . . . . . . . . . 20
113
114
115 6 Characteristics 21
116
117
118 1 Introduction
119
120
121 As Joe Jackson says, ``You can't get what you want 'till you know
122 what you want''[1]. This applies to software development just as
123 much as it applies to affairs of the heart. In order for the URI
124 working group to design an architecture that does what we want, we
125 need to know what we want it to do. This paper presents a wide
126 range of scenarios for how we would like the URC service to operate.
127 From those scenarios, we derive requirements for the functionality and
128 encoding of Uniform Resource Characteristics (URCs) within the overall
129 architecture of Uniform Resource Identification.
130
131 The URI architecture is concerned with resources and how they will
132 be used in applications. Resources are the objects, services, and
133 information that applications will make use of. For resources to be
134 used, applications must have a means of discovering, identifying, and
135 retrieving them. Resources are named by a URN (Uniform Resource
136 Name), and are retrieved by means of a URL (Uniform Resource Locator).
137 The role of the URC (Uniform Resource Characteristic) is to make the
138 binding between the URN of a resource and its URLs. In addition,
139 the URC can contain metadata about the resource for purposes of
140 discovery, conveying usage restrictions, etc. The URI architecture
141 is described in other working drafts of the URI working group of the
142 Internet Engineering Task Force, particularly in [2]. With the URI
143 architecture in mind, we can say that:
144
145
146
147 The purpose or function of a URC is to provide a vehicle or
148 structure for the representation of URIs and their associated
149 meta-information.
150
151
152 The next few sections present concrete examples of how we foresee
153 URCs being used. We also describe the operation of the service in
154 which URCs reside - the URC service. From those concrete scenarios,
155 we derive a set of requirements on the functionality and encoding of
156 URCs. Any proposals for the URC service will be expected to show how
157 they meet the requirements set forth in this paper, or to point out
158 the error of our ways.
159
160
161
162
163
164
170 2 User Scenarios
171
172
173 This section of the paper presents several scenarios of how users
174 might interact with the URC service. In these scenarios we have
175 attempted to show how the system will be used, without specifying how
176 the system will accomplish its tasks.
177
178
179
180 2.1 URN to URL resolution
181
182
183 The fundamental purpose of the URC service is to map URNs to URLs so
184 that a resource can be retrieved if its name is known. We believe
185 that the most frequent operation will be to take a URN and return a
186 (possibly empty) list of URLs where the resource named by the URN can
187 be found. This is the primary use of the service and its speed and
188 fault-tolerance are paramount.
189
190
191 o User provides a URN to the browser by clicking on an anchor or by
192 entering text into a dialog box.
193
194 o Browser connects to the URC service, possibly through a caching
195 intermediary, and gives it the URN.
196
197 o Service returns a (possibly empty) list of locations to the
198 browser. The means by which the URC service determines this list
199 is outside the scope of this usage scenario. Each location must
200 contain a URL. It may also contain information on Content-Type,
201 Price, Signatures, Version, etc. The list of locations is
202 unordered. Note that if a location contains information in
203 addition to the URL, ordering may be used to associate the
204 additional information with a particular URL, but no importance
205 should be placed on one URL appearing before another in the list
206 of locations sent back to the browser.
207
208 o The browser uses user-configurable preferences to order the list.
209 For example, a user might prefer HTML to PostScript to text. One
210 user might prefer locations that carried signature information,
211 another might not care. Most would prefer the cheapest version
212 of a resource, and the most recent version. Estimated network
213 distance is another means for ordering the selections. If
214 multiple locations tie, the browser randomizes them in the list to
215 prevent overload of any one server.
216
217 o Once the list of locations has been sorted, the browser attempts
218 to retrieve the resource from the first location. If that fails,
219 the next location is tried. This continues until one of the
220 following is true:
221
222
228 -- The browser successfully retrieves the resource
229
230 -- The list is exhausted
231
232 -- The user tells the browser to cancel the retrieval
233
234
235
236 o The browser displays the resource to the user, perhaps with the
237 aid of an external viewer.
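The ordering and retry steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Location` fields, the ranking key (preferred content type first, then price), and the `fetch` and `cancelled` callables are hypothetical names that do not come from the draft. Ties are randomized by shuffling before a stable sort.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Location:
    """One entry of the list returned by the URC service (hypothetical fields)."""
    url: str
    content_type: str = "text/plain"
    price: float = 0.0


def order_locations(locations: List[Location],
                    preferred_types: List[str],
                    rng: Optional[random.Random] = None) -> List[Location]:
    """Order locations by user preference, randomizing among ties."""
    rng = rng or random.Random()

    def rank(loc: Location):
        # Unknown content types sort after every preferred type.
        if loc.content_type in preferred_types:
            type_rank = preferred_types.index(loc.content_type)
        else:
            type_rank = len(preferred_types)
        return (type_rank, loc.price)

    shuffled = list(locations)
    rng.shuffle(shuffled)              # random order among equal ranks...
    return sorted(shuffled, key=rank)  # ...preserved because sorted() is stable


def retrieve(locations: List[Location],
             fetch: Callable[[str], Optional[bytes]],
             cancelled: Callable[[], bool] = lambda: False) -> Optional[bytes]:
    """Try each location in turn until success, exhaustion, or cancellation."""
    for loc in locations:
        if cancelled():
            return None
        data = fetch(loc.url)  # assumed to return None on failure
        if data is not None:
            return data
    return None
```

A browser would call `order_locations` once per resolution and pass the result to `retrieve`. Signature presence, version, and network distance could be added to the ranking key in the same way.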
238
239
240 Note that the list of termination conditions given above is not
241 complete. The URC service will undoubtedly make extensive use of
242 caching for speed. If the list was obtained from a cache, the
243 possibility exists that the resolution failed because of the cache
244 being stale. A reasonable fix is to allow the user to configure
245 the browser to retry the query, this time getting authoritative
246 information to fill the list of locations.
247
248 This scenario provides us with several requirements for the URC and
249 the URC service. First, we must be able to locate a URC from a
250 URN. We must be able to transport a URC without errors using normal
251 Internet protocols. The URC must be parseable by a computer. It
252 may have a hierarchical structure, and we should be able to rearrange
253 elements of the URC within the same hierarchical level. The URC must
254 be able to contain a wide variety of information. Furthermore, we
255 must be able to distinguish queries answered over cached information
256 from those answered over authoritative information.
257
258 The browser assumed in the scenario above was a complex program with
259 considerable capability to make decisions on price, format, etc. for
260 the user. What should people with less capable browsers see? A
261 plausible scenario would be for the URC service to return an HTML
262 document to the browser so that the user can read the meta-information
263 and pick a location, then click on a URL to get the resource.
264
265 Dealing with less capable browsers will require that the URC server be
266 able to emit the URC in a variety of formats.
267
268
269 2.2 Meta-data for its own sake
270
271
272 The scenario above assumed that the user really and truly wanted to
273 retrieve the resource on the other end of the anchor. However,
274 sometimes the user will not be sure if they want to get the resource.
275 This is already the case in WWW browsing where users will sometimes
276 decide, by inferring size and speed from the URL, not to access
277 particular resources. As the WWW starts to encompass resources that
278 will require payment for their access, users will want to know just
279 what they are about to get themselves into.
280
286 o User is browsing and comes across a moderately interesting link
287 expressed as a URN.
288
289 o User presses the right mouse button over the link, bringing up a pop-up
290 menu.
291
292 o User selects ``More info...'' from the pop-up and releases the
293 mouse button.
294
295 o Browser fetches the URC for the resource and displays it in a
296 nicely formatted dialog box.
297
298 o The user decides that they don't mind paying $1 for the resource,
299 and selects ``OK'' in the dialog.
300
301 o The browser fetches the resource and displays it to the user.
302
303
304
305 To support this scenario, the service must be able to provide an
306 entire URC, not just the list of locations that it returned in the
307 previous scenario. Second, URCs must have a printable representation
308 that can be understood and transcribed by humans. This does not
309 mean that all elements must be easily understood, or even that we
310 have to transmit URCs in a readable form. It merely means that, in
311 addition to any other representation, fields must have a printed
312 representation that does not intentionally obfuscate the URC, barring
313 the presence of encryption.
314
315
316 2.3 Ensuring the veracity of the resource
317
318
319 An important concern voiced over the URI mailing list and in
320 discussions with different communities of users has been how to ensure
321 the veracity of a resource. This concern has been raised on both
322 the user and provider side. Users want to make sure that they
323 are getting the real resource, especially if they are paying for
324 it. Providers want to make sure that they are not haunted by bogus
325 versions of a resource. To ensure the veracity of a resource, the
326 location information provided by the URC service could carry a digital
327 signature of the resource.
328
329
330 o The user starts to retrieve a resource according to the first
331 scenario.
332
333 o As the browser is going through its list of locations, it notes if
334 the current location has signature information. The rest of this
335 scenario assumes that we successfully retrieve a resource which
336 has signature info.
337
338
344 o When the browser retrieves the resource, it displays it to the
345 user.
346
347 o In the background, the browser verifies the signature on the
348 information. To do this it retrieves the public key of the
349 publisher or server through a secure, ubiquitous public key
350 service. The public key is used to decrypt the signature from
351 the location object. The decrypted value is then compared with a
352 secure hash of the resource.
353
354 o If the signature does not check out, the browser alerts the user.
355
356 o If the user goes on to another resource before the signature
357 computation is complete, it is discarded.
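The comparison step in the scenario above might look like the following sketch. It makes two assumptions that are not in the draft: SHA-256 is used as the secure hash, and the public-key operation is passed in as a callable named `recover_digest`. That callable is hypothetical and stands in for whatever a public key service and signature scheme would actually provide.

```python
import hashlib
import hmac
from typing import Callable


def resource_matches_signature(resource: bytes,
                               signature: bytes,
                               recover_digest: Callable[[bytes], bytes]) -> bool:
    """Return True if the signature from the location object matches the resource.

    recover_digest is a placeholder for the public-key operation that turns
    the publisher's signature back into the digest that was signed.
    """
    expected = recover_digest(signature)
    actual = hashlib.sha256(resource).digest()  # assumed hash algorithm
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, actual)
```

Because the check only needs the resource bytes and the signature, it can run in the background after the resource is displayed and be abandoned if the user moves on.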
358
359
360
361 This assumes that signatures are computed over the contents of a
362 complete file. Some resources, such as search services, cannot be
363 treated in such a fashion. One possibility would be for the URC to
364 contain the signature of a constant header the service provides with
365 its results. The header would contain a public key used to verify a
366 signature of the search results appended to the search results.
367
368 This scenario imposes the requirement that it be possible to establish
369 an unbroken chain of authentication from a URN through the URC to the
370 resource. Multiple signature schemes should be supported to allow
371 different cost/security tradeoffs to be made.
372
373
374 2.4 Ensuring the veracity of the URC
375
376
377 Resources are not the only information that can be tampered with. The
378 URC service itself will provide a tempting target for attack. It
379 needs to be secured against determined attacks and the information it
380 provides needs to be verifiable. However, security does not come for
381 free, and we should not impose that cost on all accesses. Therefore
382 it is not appropriate to make the URC server compute a digital
383 signature for every query response it generates.
384
385 One approach would be for the server to keep two pre-computed
386 signatures for each of its URCs. The first is a signature over the
387 entire URC, the second is only computed over the location information
388 it would return in response to a standard URN resolution query.
389
390
391 o User configures the browser to verify URC information.
392
393 o The user clicks on a link.
394
395 o The browser sends a URN resolution request to the URC service.
402 The request has a flag set so that the URC server will provide
403 digital signature information.
404
405 o The browser receives the list of locations as in the first
406 scenario. In addition it receives a digital signature of that
407 information which has been encrypted with the private key of the
408 URC server.
409
410 o The browser retrieves the public key of the server, and uses it to
411 verify the URC information.
412
413 o If there is no problem, the browser continues as before to
414 retrieve the resource. If there is a problem the browser alerts
415 the user, who should alert the administrator of the URC server.
416
417
418
419 If a general query is issued, the URCs for all matching resources are
420 returned in their entirety. The browser then has to verify each of
421 the URCs in turn. Validating general queries will be an expensive
422 process, but it is the user's machine paying most of the cost.
423
424 This scenario requires that the URC have a consistent external
425 representation that is suitable for the computation of digital
426 signatures. That representation must be network friendly to the
427 extent that it will be transmitted without any changes over standard
428 Internet protocols. Furthermore, it must be possible to separate the
429 portion of a URC being signed from the portion carrying the signature.
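One way to picture these requirements is the sketch below. It assumes, purely for illustration, that a URC is held as a dictionary, that sorted-key JSON serves as the consistent external representation, and that a SHA-256 digest stands in for the value a server would sign with its private key. The field names `locations` and `signatures` are hypothetical. The two digests correspond to the two pre-computed signatures described earlier: one over the entire URC and one over the location information only.

```python
import hashlib
import json


def canonical_bytes(value) -> bytes:
    """Serialize a value the same way every time (sorted keys, no extra spaces)."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("ascii")


def precompute_digests(urc: dict) -> dict:
    """Compute the digests a server might sign ahead of time.

    The 'signatures' field is excluded so that the signed portion stays
    separate from the portion carrying the signature.
    """
    body = {key: val for key, val in urc.items() if key != "signatures"}
    # The list of locations is unordered, so sort it before hashing.
    body["locations"] = sorted(body.get("locations", []), key=canonical_bytes)
    return {
        "full": hashlib.sha256(canonical_bytes(body)).hexdigest(),
        "locations": hashlib.sha256(canonical_bytes(body["locations"])).hexdigest(),
    }
```

A change to any field alters the `full` digest, while the `locations` digest only changes when the location information itself changes.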
430
431
432 3 Provider Scenarios
433
434
435 3.1 Publishing a new resource
436
437
438 This is one of the fundamental operations for resource providers.
439 Consequently it needs to be as simple and as bulletproof as possible.
440
441 Consider the processes of preparing and testing a new resource when
442 the links are expressed using URNs instead of URLs. The resource
443 might be a hyperlinked collection of smaller resources. It will
444 undergo considerable additions, deletions, and modification while it
445 is being developed. There are several means for managing such a
446 project. One way is to do all the development using URLs, then
447 change to URNs at the last minute. Another is to generate trivial
448 URNs all through the development process, and not worry about holes in
449 the published name space. It is also possible to merge the methods
450 and use trivial URNs, then provide more meaningful names late in the
451 development process.
452
453
454
460 Different approaches are appropriate depending on the complexity of
461 the resource and the formality of the publisher. However, under
462 essentially all of these approaches, the URC for a resource will start
463 very small and simple. URN, URL, and restrictive access controls
464 will be the norm. As the resource gets closer to publication on the
465 net, more information will go into the URC, and access controls will
466 be loosened to allow for larger and larger test audiences. Again,
467 the details will vary from resource to resource and publisher to
468 publisher. Some publishers, burned by previous experience, will want
469 to track the versions of the URC as well as the versions of the
470 resource as it is developed.
471
472 This first publishing scenario imposes several requirements.
473 Publishers must be able to incrementally modify the URC as they
474 shepherd the corresponding resource through the development process.
475 Minimal URCs should be easy to generate by hand. It must be possible
476 to maintain versioning information for the URC, not just the resource
477 it describes. However, that is not foreseen to be part of a minimal,
478 hand-generated, URC. While in development, harvester queries should
479 not get these URCs. Finally, note that we leave open the possibility
480 for multiple URC servers to provide default information for the works
481 of a particular publisher. Any proposed specifications will need to
482 show how they meet requirements for consistency among such cooperating
483 servers. At a minimum, a URC server for a publisher should know all
484 the other URC servers for that publisher and be able to update their
485 records.
486
487
488
489 3.2 Publishing a new version of a resource
490
491
492 Revision of a resource is a normal occurrence and the URC service must
493 not inhibit it. When it is time to revise a resource the revisors
494 might request a new location in the URC of the original resource.
495 That location would carry the new version ID, and have restricted
496 access so that ordinary users only see the older version while the
497 authors and URC server administrator can see the new version. Once
498 the new version of the resource is ready, the modifications in the
499 URC are made publicly accessible. Locations for the new version are
500 established, and the locations for the old version should gradually go
501 away.
502
503 This scenario requires that the developers be able to augment their
504 resolution queries with version information so that they will be able
505 to access the new version while the rest of the world continues
506 to receive the old version. It also requires that access control
507 mechanisms have a fine enough granularity in the URC to allow such a
508 discrimination to be made.
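A minimal sketch of such a version-aware query, assuming each location carries a version ID and its own set of permitted readers, could look like this. The `VersionedLocation` type, the `readers` field, and the `"*"` wildcard for public access are all hypothetical, and the requester is assumed to have been authenticated already.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

PUBLIC = "*"  # hypothetical marker meaning "anyone may see this location"


@dataclass(frozen=True)
class VersionedLocation:
    url: str
    version: str
    readers: FrozenSet[str] = frozenset({PUBLIC})


def resolve(locations: List[VersionedLocation],
            requester: str,
            version: Optional[str] = None) -> List[str]:
    """Return the URLs the requester may see, optionally for one version only."""
    visible = [loc for loc in locations
               if PUBLIC in loc.readers or requester in loc.readers]
    if version is not None:
        visible = [loc for loc in visible if loc.version == version]
    return [loc.url for loc in visible]
```

Publishing the new version then amounts to widening the `readers` set of its locations, with no change to the query itself.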
509
510
511
512
518 3.3 Removing a location for a resource
519
520
521 Another strong motivation for the URC service is to allow
522 administrators of collections of information to rearrange their
523 collections without breaking pages across the globe. Moving resources
524 can be accomplished in two steps - establishing the new location then
525 deleting the original location. Deletion is also necessary when we
526 wish to remove a resource for any reason.
527
528
529
530 o Administrator of a resource location sends a delete_location
531 message to the URC server. This will typically require
532 authentication that is provided through means outside the scope of
533 this document.
534
535 o The URC service authenticates the request. If the issuer of the
536 request has permission, the URC is searched for the specified
537 location. If found, it is removed. If the URC has a
538 digital signature for integrity checking, a new signature would be
539 computed after the deletion.
540
541 o If there are other URC servers providing default information for
542 the particular publisher, they are notified as well so that they
543 may also modify their databases.
544
545
546 This scenario requires that we be able to select portions of a URC
547 and delete them. Access control mechanisms should operate on a fine
548 enough grain that the administrator of one location could delete their
549 location from the URC, but could not delete other locations.
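The deletion steps and the per-location access control can be sketched as below. The dictionary layout, the `admin` field naming the administrator of each location, and the `digest` field are hypothetical. A SHA-256 digest stands in for the integrity signature, the requester is assumed to have been authenticated beforehand, and notification of other URC servers is left out.

```python
import hashlib
from typing import List


def location_digest(locations: List[dict]) -> str:
    """Stand-in for the integrity signature computed over the locations."""
    joined = "\n".join(sorted(loc["url"] for loc in locations))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def delete_location(urc: dict, requester: str, url: str) -> bool:
    """Remove one location from a URC if the requester administers it.

    Returns True if the location was found and removed, False if absent.
    Raises PermissionError if the requester administers a different location.
    """
    for index, loc in enumerate(urc["locations"]):
        if loc["url"] != url:
            continue
        if loc["admin"] != requester:
            raise PermissionError(f"{requester} may not delete {url}")
        del urc["locations"][index]
        if "digest" in urc:
            # Recompute the integrity value after the deletion.
            urc["digest"] = location_digest(urc["locations"])
        return True
    return False
```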
550
551
552 3.4 Establishing a new publishing authority
553
554
555 Publishers are arranged in a hierarchy where new publishers can be
556 added as children of existing ones.
557
558
559 o Billy Bob Riker, Harley biker, decides to publish his doggerel to
560 the world. He contacts his friendly neighborhood web publisher
561 who, in exchange for a modest amount of cash, establishes ``HogDog
562 Press'' as a publisher by issuing the following request to the URC
563 service, signed with the private key of the publisher:
564
565 o Register_new_publisher(parent_publisher, name_of_new_publisher,
566 signature_of_request)
567
568
569
570
576 o Since Billy Bob's nomadic life style is a little hard on disk
577 drives, he contracts with a third party to provide storage, HTTP
578 service, and URC service. These are private business dealings
579 between the two parties and do not especially concern us.
580
581 o Billy Bob's prose is published to the world using the operations
582 described earlier.
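The registration step can be pictured with the following sketch of a publisher hierarchy. The `PublisherRegistry` class and its method names are hypothetical. Verification of the request signature is assumed to have happened before the call, and publisher names are treated as unique across the whole hierarchy, which is a simplification.

```python
from typing import Dict, List, Optional


class PublisherRegistry:
    """A tree of publishers in which new ones are added under existing ones."""

    def __init__(self, root: str) -> None:
        self._parent: Dict[str, Optional[str]] = {root: None}
        self._children: Dict[str, List[str]] = {root: []}

    def register_new_publisher(self, parent: str, name: str) -> None:
        """Add a publisher as a child of an existing publisher."""
        if parent not in self._parent:
            raise KeyError(f"unknown parent publisher: {parent}")
        if name in self._parent:
            raise ValueError(f"publisher name already registered: {name}")
        self._parent[name] = parent
        self._children[name] = []
        self._children[parent].append(name)

    def parent_of(self, name: str) -> Optional[str]:
        """Return the parent publisher, or None for the root."""
        return self._parent[name]

    def sub_publishers(self, name: str) -> List[str]:
        """Return the publishers registered directly under this one."""
        return list(self._children[name])
```

For example, the web publisher in the scenario would call `register_new_publisher` with its own name as the parent and "HogDog Press" as the new publisher.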
583
584
585
586 3.5 Dealing with the demise of a publisher
587
588
589 Poor Billy Bob. The market for Harley doggerel was not enough to
590 cover his yearly storage fees and his service provider is about to
591 evict his bits. While the parent publisher will never again register
592 an entity as ``HogDog Press'', no one is paying the service provider
593 for the machine resources, so it is time to remove the URLs from the
594 URCs for Billy Bob's resources.
595
596 Accomplishing this in the presence of digital signatures could be
597 a tricky question. Billy will have signed the URC elements with
598 HogDog's private key, and he is not about to go along willingly with
599 the eviction proceedings. Of course, he is not the administrator of
600 the HTTP and URC servers. The administrator of the servers simply
601 clobbers the old URC and replaces it with one that contains a ``no
602 longer available'' element. It can't be signed with Billy Bob's key,
603 but so what? Well, it leads to a form of denial of service attack.
604
605 Once the new URC is in place, the resources are deleted from the HTTP
606 server.
607
608 This scenario requires us to pay considerable attention to who will
609 sign URCs and URC components, how URCs might be nested in other URCs,
610 etc.
611
612
613 4 Third-Party Scenarios
614
615
616 The previous two sections have presented scenarios where users and
617 publishers have a very simple relationship. Of course, there are
618 other parties that can become involved in the relations between
619 providers and consumers. Search services, reviewers, editors, and
620 colleagues can all play a role in our access and provision of
621 information. Some of the roles that can be played by third parties
622 are described in this section.
623
624
625
626
627
628
634 4.1 Bibliographic Search
635
636
637 In addition to locations, the URC provides a convenient place to
638 store bibliographic information such as author, title, subject, date
639 of publication, etc. Also, since publishers are assumed to be
640 arranged in a hierarchy, it should be possible to find every publisher
641 affiliated with the URC service. Combining these two properties opens
642 up the possibility of bibliographic searches across the whole of the
643 web.
644
645 Exactly how this should work is not so obvious. It is easy to imagine
646 constructing a robot that would traverse the hierarchy, asking each
647 server if it has material pertaining to subject "X". However, that
648 is unrealistic. If every bibliographic search of every user consults
649 every URC server, the service as a whole will soon grind to a halt.
650 The obvious alternative is for some sites to come forward and carry
651 the burden of these searches, similar to the current situation with
652 Archie. Some sites will do this out of the goodness of their heart,
653 while others will charge a fee for their services. In this scenario:
654
655
656
657 o User connects to a URC search site
658
659 o Browser puts up the form from that site
660
661 o User fills it in and hits ``submit''
662
663 o The URC search site handles the query over its database and
664 returns the result to the browser.
665
666 o The browser displays the results to the user.
667
668
669 The database for handling the search is updated regularly by
670 harvesting the network.
671
672
673 o Search server starts a depth-first search of the tree of
674 publishers.
675
676 o Search server queries the current URC server for all URCs that are
677 new or have been modified since the last time the search server
678 visited. Those URCs are put into the database of the search
679 service.
680
681 o Search server asks the URC server for all changes in publication
682 hierarchy since last visit.
683
684 o Search server continues depth-first search using the new topology
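The harvesting loop above can be sketched as an incremental depth-first traversal. The `UrcServer` class, the integer timestamps, and the `urn:x-example:` names in the usage note are hypothetical. Each server is assumed to be able to report the URCs modified since a given time and to list its current sub-publishers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UrcServer:
    publisher: str
    urcs: Dict[str, int] = field(default_factory=dict)   # URN -> last-modified time
    children: List["UrcServer"] = field(default_factory=list)

    def modified_since(self, timestamp: int) -> List[str]:
        """URNs whose URCs are new or changed after the given time."""
        return [urn for urn, modified in self.urcs.items() if modified > timestamp]


def harvest(root: UrcServer, last_visit: Dict[str, int], now: int) -> List[str]:
    """Depth-first walk of the publisher tree, collecting only changed URCs.

    last_visit maps publisher name to the time of the previous visit and is
    updated in place so that the next harvest is incremental.
    """
    harvested: List[str] = []
    stack = [root]
    while stack:
        server = stack.pop()
        harvested.extend(server.modified_since(last_visit.get(server.publisher, 0)))
        last_visit[server.publisher] = now
        # Children are read at visit time, so topology changes are picked up.
        stack.extend(reversed(server.children))
    return harvested
```

A search service would store the returned URNs (and their URCs) in its own database and keep `last_visit` between runs. Reading `server.children` at visit time is how this sketch follows the new topology.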
685
686
692 The URC service will grow beyond the capabilities of all but the
693 most dedicated sites (OCLC, Library of Congress, etc.) to keep a
694 comprehensive index. The natural course is for search servers to
695 only keep a portion of the URCs that exist. Exactly how they choose
696 the subset to retain will be a decision that varies from one search
697 service to another.
698
699 Several requirements for the URC service spring from considering
700 searching. First, we must be able to connect to a variety of
701 URC servers instead of having only one URC server be our gateway
702 to the world. Second, if URN resolution is handled through a
703 query forwarding mechanism, servers will want to distinguish between
704 a simple resolution request (which should be forwarded) and general
705 bibliographic queries which should not be forwarded to most sites. If
706 URN resolution does not require query forwarding then this is not a
707 problem. Third, there will need to be a means for determining how
708 to contact the server administrator so that the administrator will
709 add the central search services to the list of entities that can
710 launch certain queries. Fourth, a publisher's server must keep a
711 complete record of all the sub-publishers authorized by that publisher
712 and be able to provide that list in response to a query. Fifth,
713 we must be able to determine the parent publisher of a publisher,
714 either from the URN or by a special request to the URC server. Sixth,
715 the administrator of a URC server should be able to make incremental
716 modifications to the URCs on that server. Seventh, URCs should
717 carry information on their creation and modification dates so that
718 incremental harvesting is possible.
719
720
721
722 4.2 Annotations, Seals of Approval, etc.
723
724
725 Many projects already use the web as a means for describing,
726 recording, and coordinating the work of the project members. One
727 technology that has disappeared from the web is global annotations.
728 The difficulties of scaling the system are well known. Recently,
729 the Digital Library effort at Stanford has suggested a new annotation
730 architecture that may have global scalability [3]. This architecture
731 would enable a variety of interesting scenarios. At this time we will
732 only consider two - using a form of annotation known as a "Seal of
733 Approval" (SOAP) to discover high-quality information on the web and
734 to filter the information our browser will display. SOAPs are an
735 interesting concept that came out of the Interpedia effort [4].
736
737 A SOAP is a quick review of a resource by a third party. They can
738 point to a full review, and a variety of security tradeoffs could be
739 made in their implementation. Critics, professional organizations,
740 etc. could use SOAPs to carry their reviews. For example, the APS
741 (American Physical Society) might receive a request to ``publish'' a
742 resource in one of their electronic journals. The editorial board of
743 the journal lines up the requisite number of reviewers and sends them
750 the URL of the resource. Each of the reviewers sends their review
751 back to the editors, who either turn the author down flat, recommend
752 changes, or accept the resource as it is. If the editorial board
753 accepts it, they form a digital signature of the resource, the quick
754 rating, an optional URN of a full review, etc. all encrypted with the
755 private key of the particular journal.
756
757 Users could use SOAPs to augment bibliographic searches. For example,
758 a new physics grad student might ask to see all the abstracts of all
759 the resources dealing with (string theory AND quantum chromodynamics)
760 which had been reviewed by the APS and received a rating of 9 or
761 above.
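A query of this kind could be expressed as the filter sketched below. The dictionary layout and the field names `subjects`, `soaps`, `reviewer`, `rating`, and `abstract` are hypothetical, chosen only to mirror the example of resources on two subjects that a given reviewer rated at or above a threshold.

```python
from typing import Iterable, List


def search_with_soap(urcs: Iterable[dict],
                     required_subjects: Iterable[str],
                     reviewer: str,
                     min_rating: int) -> List[str]:
    """Return abstracts of resources covering all subjects and carrying a
    SOAP from the given reviewer with at least the given rating."""
    wanted = set(required_subjects)
    results = []
    for urc in urcs:
        if not wanted <= set(urc.get("subjects", [])):
            continue  # does not cover every requested subject
        for soap in urc.get("soaps", []):
            if soap["reviewer"] == reviewer and soap["rating"] >= min_rating:
                results.append(urc["abstract"])
                break  # one qualifying SOAP is enough
    return results
```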
762
763 Such queries do not necessarily need to proceed in the same fashion
764 as the general bibliographic search described in the earlier section.
765 Instead, SOAPs will probably be the valuable intellectual property of
766 professional organizations. It may be that if you wish to do searches
767 on things with the SOAP of the APS, you have to connect to their
768 server to do it, presumably paying them for the privilege. Given
769 this money-making potential, it is doubtful that many professional
770 organizations will allow authors to include a SOAP in the default URC
771 for their resource unless the author pays for the privilege.
772
773
774
775 o User connects to the server of a reviewing organization, or to a
776 server that has licensed the right to use the SOAPs of particular
777 reviewing organizations.
778
779 o Browser displays the search form of that organization. This will
780 be a typical bibliographic search form augmented with special
781 features for the SOAPs issued by the organization.
782
783 o User fills out the form and submits it.
784
785 o The server does the search and returns the results to the browser.
786
787 o The browser displays the results of the search to the user.
788
789
790 Another subscription model is that the APS would mail you notification
791 of new resources that they have reviewed, along with their rating and
792 instructions on how to access the resource if you were so inclined.
793
794 The scenario above imposes two requirements on the URC. First, it
795 must be possible to extend the URC by adding arbitrary elements.
796 In the case above, the SOAP is the new element. Second, it must
797 be possible to ignore elements that you do not understand. This
798 in turn requires being able to determine where any
799 particular element ends, even if you know nothing about the structure
800 of information inside the element. Note that there is an interaction
801 between ignorability and having a consistent representation for the
808 purposes of digital signatures. Digital signatures are computed over
809 the external representation, which can include experimental elements.
810 Ignorability is a feature of the conversion from the external to the
811 internal representation, where if we do not understand an element we
812 are free to discard it while we are parsing the URC.
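
The interaction between ignorability and signatures can be sketched as
follows. The "tag:length:payload;" encoding, the KNOWN_TAGS set, and
the function names are assumptions made for this illustration only;
this draft does not define an encoding. A SHA-256 digest stands in for
a real digital signature.

```python
import hashlib

# Hypothetical set of tags that this particular reader understands.
KNOWN_TAGS = {"title", "location"}


def element(tag, payload):
    """Encode one element in the assumed 'tag:length:payload;' form."""
    body = payload.encode("utf-8")
    return f"{tag}:{len(body)}:".encode("ascii") + body + b";"


def parse_elements(data):
    """Yield (tag, payload) pairs. The length prefix lets the parser find
    the end of an element without understanding what is inside it."""
    pos = 0
    while pos < len(data):
        tag_end = data.index(b":", pos)
        len_end = data.index(b":", tag_end + 1)
        tag = data[pos:tag_end].decode("ascii")
        length = int(data[tag_end + 1:len_end])
        start = len_end + 1
        payload = data[start:start + length]
        if len(payload) != length or data[start + length:start + length + 1] != b";":
            raise ValueError("malformed element")
        yield tag, payload
        pos = start + length + 1


def read_urc(data):
    """Return (internal representation, digest of external representation)."""
    # The digest covers every byte, including elements we do not understand.
    digest = hashlib.sha256(data).hexdigest()
    # Unknown elements are discarded only while building the internal form.
    internal = {tag: payload.decode("utf-8")
                for tag, payload in parse_elements(data)
                if tag in KNOWN_TAGS}
    return internal, digest
```

A reader that has never heard of a "soap" element can still verify the
digest, because verification uses the external bytes rather than the
parsed result.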
813
814
815
816 4.3 Providing an additional location for a resource
817
818
819 One of the main benefits we are looking for from the URC service is
820 the ability to have multiple locations for a resource. How are these
821 additional locations to be established? There are several ways
822 this might happen; the appropriate model will depend on financial
823 considerations more than technical ones. We will consider three
824 cases out of many possible ones. The first is simple mirroring of
825 free information. The second is a mirror of a small publisher's
826 information that is sold. The third is a contractual arrangement
827 between sites.
828
829
830 4.3.1 Mirroring of free information
831
832
833 A researcher in Australia comes across a collection of interesting
834 technical reports on a server in Sweden. He wishes to mirror those
835 reports as a service to the research community in Australia. He
836 contacts the administrator of the archive in Sweden, who also happens
837 to be the author of the reports of interest. She gives her permission
838 for a mirror to be established. He pulls over the reports and sets
839 them up on his HTTP server. Now that they have URLs, he sends
840 a register_new_url message to the URC service. Since the Swedish
841 researcher has provided a digital signature for the URCs of all the
842 reports, a new location can not just be blindly entered. The URC
843 service forwards the request to the Swedish researcher. She checks
844 out the new URLs to make sure that they are faithful versions of her
845 reports, then signs the register_new_url message with her private key
846 before sending it back to the URC service. The service verifies
847 the authentication information, sees that it is good, adds the new
848 location to the URC of each report and recalculates the signature
849 information. Now when users attempt to resolve the URN, they can
850 fetch it from either Australia or Sweden. As a matter of courtesy,
851 the Australian researcher periodically informs the Swedish researcher
852 about how many times her reports have been accessed.
853
854 This scenario does not impose any requirements on the URC service
855 beyond those already described for modifying the URC information.
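
The approval step in this scenario can be sketched as follows. The
class and function names, the message format, and the example URNs are
hypothetical. The scenario calls for a private-key signature by the
publisher; because the Python standard library has no public-key
signature primitive, an HMAC with a key shared between publisher and
service is used here purely as a stand-in.

```python
import hashlib
import hmac


def registration_message(urn, new_url):
    """Assumed wire form of a register_new_url request."""
    return f"register_new_url {urn} {new_url}".encode("utf-8")


def sign(key, message):
    # Stand-in for the publisher's private-key signature (assumption).
    return hmac.new(key, message, hashlib.sha256).hexdigest()


class URCService:
    def __init__(self):
        self.locations = {}        # URN -> list of URLs
        self.publisher_keys = {}   # URN -> key used to check approvals

    def publish(self, urn, url, publisher_key):
        self.locations[urn] = [url]
        self.publisher_keys[urn] = publisher_key

    def register_new_url(self, urn, new_url, approval):
        """Add a location only if the publisher's approval verifies."""
        message = registration_message(urn, new_url)
        expected = sign(self.publisher_keys[urn], message)
        if not hmac.compare_digest(expected, approval):
            raise PermissionError("approval does not verify")
        self.locations[urn].append(new_url)
```

In the scenario, the service would forward the unsigned message to the
publisher, who inspects the new URL and returns sign(key, message) as
the approval.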
856
857
866 4.3.2 Mirroring of information that is for sale
867
868
869 An experimental film maker in Germany has been selling avant-garde
870 videos over the WWW. A film distributor in Canada contacts her to see
871 if they can serve these up to the North American market in exchange
872 for a cut of the action. The film maker says ``sure''. The Canadian
873 distributor puts copies of the videos onto their video server and
874 attempts to register the new location with the URC service. The
875 service forwards the request to the film maker, who authorizes it by
876 signing the request with her private key. Periodically the Canadians
877 send the film maker a check to cover the royalties the film maker
878 collects from every download from the Canadian server. As part of the
879 contract between the two sites, the film maker can access the logs on
880 the distributor's server to make sure that she is being paid for all
881 the copies it provides.
882
883 This scenario does not impose many requirements on the URC service.
884 Note that the access logs are an obvious point of attack for an
885 unscrupulous mirror site administrator. It would be nice if there were
886 a means for ensuring their veracity; however, that is an HTTP (or
887 equivalent) server issue, not a URC server issue.
888
889
890 4.3.3 Mirroring on a regular basis
891
892
893 Some large sites may set up cooperative mirroring agreements. For
894 example, Los Alamos National Laboratory might make arrangements with
895 CERN to provide mirrors of each other's work. When either of these
896 sites publishes a new resource, it sends a message to the other. The
897 second site fetches the resource and puts it on their server. It then
898 issues a register_new_url message to the URC service. It is forwarded
899 to the publisher of the resource, where it is automatically approved
900 without human intervention.
901
902 This scenario requires that the URC server's access controls be
903 capable of registering multiple users - not a big problem. It also
904 adds the concept of a list of sites to notify when new material is
905 published. However, that requirement could be eliminated in favor of
906 a polling model as already discussed in the information harvesting
907 scenario.
908
909
910
911 4.4 Here be Librarians
912
913
914 Much to our collective chagrin, the WWW is filled with trivial, poorly
915 maintained material. However, there are the occasional sites that
916 give us hope for the future by providing high quality information.
917 As the library community takes their skills in acquisition into
924 the Internet, they will want to provide their traditional levels
925 of quality in cataloging and archiving. ``Bob's Cool Links'' just
926 won't cut it for reliable long-term retrieval of information. The
927 URC service already offers a means for libraries to assist in the
928 archiving problem. The previous scenarios about mirroring information
929 are highly applicable to this. But what about cataloging? A
930 professional cataloger will take an hour or two to describe a work to
931 the normal standards of the library science community. It would be a
932 shame if that effort could not be folded back into the original URC
933 for the resource.
934
935
936
937 o Fiona D., a budding artist, has a selection of her works placed
938 on-line by a fan with access to a web server. The fan provides
939 simplistic URCs because he thinks that is the right way to do
940 things.
941
942 o 10 years pass, during which time Fiona's fame spreads through the
943 underground and the fan's collection grows. The site of the
944 collection has changed several times as the fan moves around from
945 one job to another. The fan has also revised the URC entries to
946 become more sophisticated as his knowledge of the area grows.
947
948 o An art criticism movement discovers this collection and wishes to
949 preserve it as an important marker of the transition stage in
950 electronic art. They obtain permission to maintain an archive of
951 the images and provide professional cataloging through the 2010
952 version of UNI-MARC. As a matter of courtesy to the fan, they sell
953 him a copy of catalog records.
954
955 o The fan converts the UNI-MARC records into the internal
956 representation used on the system in his home office, and ups the
957 version history on his URC records one more time.
958
959 o Harvesters notice the meta-meta information in the fan's URCs
960 indicating that the catalog follows certain standards and
961 authority lists and they process the information specially so that
962 their users can more reliably locate Fiona's illustrations.
963
964
965 This scenario requires that we be able to provide more than one URC
966 for a resource. We should also be able to identify the standard to
967 which a URC has been prepared. This could be as simple as identifying
968 the attribute set used.
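
One way to picture these two requirements is a store that keeps
several URCs per URN, each labeled with the attribute set it follows.
The URCStore class, its method names, and the attribute set labels in
the usage note are hypothetical.

```python
from collections import defaultdict


class URCStore:
    """Keeps any number of URCs per resource, each tagged with the
    attribute set (cataloging standard) it was prepared to."""

    def __init__(self):
        self._by_urn = defaultdict(list)

    def add(self, urn, attribute_set, elements):
        self._by_urn[urn].append(
            {"attribute_set": attribute_set, "elements": elements})

    def urcs_for(self, urn, attribute_set=None):
        """Return all URCs for a URN, or only those prepared to the
        named attribute set."""
        records = self._by_urn.get(urn, [])
        if attribute_set is None:
            return list(records)
        return [r for r in records if r["attribute_set"] == attribute_set]
```

A harvester that recognizes a label such as "UNI-MARC" could call
urcs_for(urn, "UNI-MARC") and treat the result as professionally
cataloged, while still being able to fall back to the fan's simpler
URC.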
969
970
971 4.5 Republication
972
973
974 For better or worse, the digital world makes it easier to republish
975 items. For example, a research group might be developing a corpus.
982 In that work they would include a range of other works. The corpus
983 would have its own URN. It might assign new URNs to the component
984 works, and describe them using its own URC scheme. That URC could
985 then point to the original URLs, or to URLs managed by the corpus
986 developers according to the mirroring scenarios described earlier.
987
988 This brief scenario shows that multiple people will be able to relabel
989 and redescribe the work of others. This is not a bad thing. The
990 collection and selection of the elements, along with their description
991 in some new scheme is a legitimate intellectual exercise. The
992 potential problems arise when this is performed for the purposes of
993 defrauding the publisher of the original information. The publisher's
994 recourse against such fraudulent uses will have to happen through
995 legal channels. Technical schemes for preventing such abuses are
996 limited. The "superdistribution" work is one means, although it
997 will be subject to the same copy-protection cracking we all know
998 of. Publishers might use resource discovery procedures to locate
999 unauthorized mirrors of their resources. These too will have limited
1000 effectiveness. There will be bad guys out there on the electronic
1001 frontier, folks, and we can't hope to prevent all abuses.
1002
1003
1004
1005 5 Requirements
1006
1007
1008 In each of the scenarios above we listed any new requirements that
1009 would be placed on the URC and the URC service. This section collects
1010 those requirements into 2 categories: requirements on the URC, and
1011 requirements on the URC service. Any proposed specifications for the
1012 URC and URC service will need to demonstrate how they meet all the
1013 requirements, or demonstrate how the requirements are unnecessary or
1014 in error.
1015
1016
1017 5.1 Requirements on the URC
1018
1019
1020 Machine Consumption A URC must be parsable by a computer.
1021
1022 Consistent External Representation In order for digital signatures of
1023 the URC information to work, and to simplify the requirement
1024 for parsability, the URC must have a consistent external
1025 representation.
1026
1027 Transport Friendliness Related to the consistency of the external
1028 representation, it must be possible to transport a URC unmodified
1029 in the common Internet protocols, such as TCP, SMTP, FTP, Telnet,
1030 etc.
1031
1040 Human Readability A URC must have a printed representation that is
1041 suitable for printing on paper, as well as suitable for entry
1042 by means of being typed by a user on a keyboard. Digital
1043 signature information need not apply to such items given the high
1044 probability of trivial differences.
1045
1046 Some meta-information items are meant for humans only while others
1047 are only meant to be machine consumable. One requirement should
1048 not preclude the other from being encoded.
1049
1050 Simplicity It must be simple for humans to generate correct minimal
1051 URCs that do not carry any signature information.
1052
1053 Rearrangeability It must be possible to reorder elements in a URC
1054 without changing their semantics. Note that elements in this case
1055 may mean compound entities. For example, a Location element might
1056 contain elements for a URL, a Content-Type, a size, etc. The
1057 compound entities should be rearrangeable, while the information
1058 inside the entity may need to have its order preserved.
1059
1060 Generality In the most basic sense, a URC must be able to contain ANY
1061 conceivable type of meta-information or URI. Therefore it must be
1062 possible to add new types of elements to the URC without breaking
1063 previous applications. Any restrictions on the representational
1064 capability of a URC will be the target of intense scrutiny.
1065
1066 Structure To accommodate the encapsulation of objects that are
1067 currently unforeseen, the URC must have a self-describing
1068 structure. By this we mean that elements in a URC must be tagged
1069 with a descriptive label and it must be possible to determine
1070 their extent, even in the presence of nesting. Further, it must
1071 be possible to determine to which attribute set a URC has been
1072 prepared.
1073
1074 Ignorability Related to the previous 2 requirements, an application
1075 encountering an unknown tag must be able to ignore it without
1076 error. This implies that it must be easy to tell where an
1077 element stops, even if you don't know anything about its internal
1078 structure. Also note that nesting unknown elements must still be
1079 handled correctly.
1080
1081 Searchable It must be possible to select a URC based on a search of
1082 its components. It must be possible to select which components
1083 will be searched and which will not be searched.
1084
1085 Subsettable It must be possible to form a new URC from some of the
1086 components of another URC.
1087
1088 Separable It must be possible to separate a signature in a URC from
1089 the information it signs.
1090
1098 Incrementally Modifiable It must be possible to add (or delete)
1099 elements to (or from) a URC in an incremental fashion. It must be
1100 possible to provide elements in a URC for tracking the changes in
1101 a URC.
1102
1103 Versioning It must be possible for a URC to track version changes in
1104 the resource(s) it describes. This is related to, but distinct
1105 from, the requirement that a URC be able to track version changes
1106 in a URC.
1107
1108 Caching Caching should be possible for any URC, even if some
1109 of its specific elements are not cacheable. Further,
1110 it must be possible to determine if a query has been answered from
1111 cached or authoritative URC information.
1112
1113 Grandfathering Current meta-information schemes should be allowed to
1114 work within the URC structure, where this will not conflict with
1115 the other requirements.
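
Several of the requirements above (Rearrangeability, Subsettable,
Separable, and Consistent External Representation) interact, and a
small sketch may help show how. The JSON-based canonical form and the
function names are assumptions chosen for this illustration, not a
proposed encoding, and a SHA-256 digest stands in for a signature.

```python
import hashlib
import json


def canonical_bytes(elements):
    """Serialize top-level elements in sorted order, so that reordering
    them does not change the bytes. The contents of a compound element
    are serialized as given, so their internal order is preserved."""
    parts = sorted(json.dumps(e, separators=(",", ":")) for e in elements)
    return ("[" + ",".join(parts) + "]").encode("utf-8")


def digest(elements):
    # Stand-in for a digital signature over the canonical form.
    return hashlib.sha256(canonical_bytes(elements)).hexdigest()


def subset(elements, wanted_tags):
    """Form a new URC from some of the components of another URC."""
    return [e for e in elements if e[0] in wanted_tags]


def detach(elements):
    """Keep the signature separate from the information it signs."""
    return {"elements": elements, "signature": digest(elements)}
```

With this form, swapping two top-level elements leaves the digest
unchanged, while swapping the URL and Content-Type inside a Location
element changes it.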
1116
1117
1118
1119 5.2 Requirements on the URC Service
1120
1121
1122 The previous section discussed requirements on the encoding and
1123 functionality of single URCs. This section presents the requirements
1124 we have derived for collections of URCs sitting on a URC server, and
1125 that server's communication with applications.
1126
1127
1128 Resolution It must be possible for a URN to be resolved into a URC.
1129
1130 A URC is meant to be the format that URNs and URLs are transported
1131 in; therefore a given URN or URL may be resolved into a URC.
1132 Nothing within a URC should cause it to not be the solution to a
1133 URN or URL resolution.
1134
1135 Multiple Syntaxes The URC server must be capable of providing the URC
1136 in a variety of syntaxes to accommodate browsers of differing
1137 capabilities.
1138
1139 Query Language It must be possible for simple resolution queries
1140 to be augmented with information on the version of a resource
1141 desired, and an indication of whether signature information should
1142 be supplied. It must be possible to select sets of URCs based on
1143 criteria such as date of last modification, etc.
1144
1145 Security Since the URC service will be providing information
1146 of significant value, it will be a tempting target for
1147 attack. It must be possible to secure the service without
1148 imposing performance penalties on unauthenticated access to free,
1149 unencrypted, information.
1150
1156 Authentication Chain One aspect of the general requirement for
1157 security is that it must be possible to establish a chain of
1158 authentication from the URN, through the URC, to the resource
1159 retrieved through a URL.
1160
1161 Access Control Another aspect of security is that it must be possible
1162 to control (at least) read and write access to portions of a URC.
1163
1164 Maintenance The URI architecture is intended to be long-lived. The
1165 service must not prevent the maintenance of the resources and
1166 their meta-information.
1167
1168 Synchronization If multiple servers are equally authoritative for a
1169 publisher, they must work with each other to keep their URCs in
1170 sync within a reasonable time delay. This delay is certainly less
1171 than a week, but can be more than an hour.
1172
1173 Development URC servers must support the development of new resources
1174 by issuing and handling "development" URNs and URCs.
1175
1176 Choice A user must be able to connect to a range of URC servers,
1177 and not have to do all interaction through one server that is a
1178 gateway to the world.
1179
1180 Scalability In order for the system to scale to large numbers of
1181 users, queries on one URC server should not automatically be
1182 forwarded in such a fashion that they will hit a large number of
1183 other servers around the globe.
1184
1185 Administrative Contact There must be a standard means for contacting
1186 the administrator(s) of a server.
1187
1188 Hierarchical Operations A publisher will have the rights to register
1189 sub-publishers.
1190
1191 The publisher must keep a list of the sub-publishers it has
1192 created.
1193
1194 That list should be available as the result of an appropriate
1195 query to a server that speaks for the publisher.
1196
1197 It must be possible to determine the parent publisher of a
1198 sub-publisher.
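
The Hierarchical Operations requirement amounts to a small tree of
publishers. A minimal sketch, assuming a hypothetical
PublisherRegistry class and method names that do not come from this
draft:

```python
class PublisherRegistry:
    """Tracks which publisher registered which sub-publisher."""

    def __init__(self):
        self._parent = {}     # publisher -> parent publisher (or None)
        self._children = {}   # publisher -> list of sub-publishers

    def add_publisher(self, name, parent=None):
        """Register a top-level publisher, or a sub-publisher of `parent`."""
        if name in self._parent:
            raise ValueError(f"{name} is already registered")
        if parent is not None and parent not in self._parent:
            raise KeyError(f"unknown parent publisher {parent}")
        self._parent[name] = parent
        self._children[name] = []
        if parent is not None:
            self._children[parent].append(name)

    def sub_publishers(self, name):
        """The list a server speaking for `name` would return to a query."""
        return list(self._children[name])

    def parent_of(self, name):
        """The parent publisher, or None for a top-level publisher."""
        return self._parent[name]
```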
1199
1200
1201
1202 6 Characteristics
1203
1204
1205 Several important characteristics of URCs come about as a result of
1206 fulfilling the above requirements. Some of these characteristics are
1214 a result of requirements on URNs and URLs that make up some of the
1215 elements of a URC:
1216
1217
1218
1219 Time To Live Since a URC may contain transient information such as
1220 timestamps, access privileges, etc. it can not be guaranteed to
1221 have a Time To Live greater than 0. Any subset of a URC is free
1222 to specify its own TTL but this still does not affect the whole
1223 URC. While this does not preclude the user from attempting to
1224 trust a URC for a longer amount of time it should not be something
1225 to depend on.
1226
1227 Character Sets Since the encapsulation and scalability requirements
1228 force the inclusion of alternate character sets, some common
1229 scheme must be found that accommodates all character sets in a way
1230 that fulfills the transport friendly encoding requirement. This
1231 precludes any restrictions on allowable character sets.
1232
1233 Data Naming Fulfilling the grandfathering requirement will make it
1234 nearly impossible to specify the numerous ways extremely similar
1235 pieces of information can be represented. Thus one consideration
1236 should be a central authority that makes suggestions as to the
1237 consolidation of the names used to identify specific pieces of
1238 meta-information.
1239
1240 Member Element Control By allowing any piece of meta-information to
1241 be included within a URC the number of globally understood
1242 elements will be small if not non-existent. Therefore some entity
1243 must have some control over some set of very concretely specified
1244 member elements. The specification of that entity should be done
1245 in an encoding specification and is outside the scope of this list
1246 of functional requirements.
1247
1248 Multiple Signatures To accommodate the requirements of insertion and
1249 deletion in the presence of digital signatures, we anticipate that
1250 URCs will be signed using the private key of a server. The
1251 server's public key would be signed by the private key of the
1252 publisher. Entities with the authority to modify URC elements
1253 will have to have their keys signed by the server.
1254
1255
1256 References
1257
1258
1259 [1] Jackson, J, You can't get what you want (till you know what you
1260 want), "Body & Soul", A&M 7502 3286, 1984.
1261
1262 [2] Sollins, K. and Masinter, L., Functional Requirements for
1263 Uniform Resource Names, RFC 1737
1264 <URL:http://ds.internic.net/rfc/rfc1737.txt>
1265
1272 [3] Roscheisen, M., Morgensen, C. and Winograd, T. "Beyond
1273 Browsing: Shared Comments, SOAPs, Trails, and On-line
1274 Communities", Technical Report CSDTR/DLTR, Computer Science
1275 Dept., Stanford University, Stanford CA 94305.
1276 <URL:http://www-diglib.stanford.edu/diglib/pub/reports/BRIOTR/>
1277
1278 [4] Rhine, J., Interpedia Homepage,
1279 <URL:http://www.hmc.edu/interpedia/index.html>
1280
1281
1282
1283 Glossary
1284
1285
1286 Default URC The URC that is provided by the publisher of a resource.
1287
1288 Default URC server The URC server(s) that can provide the default
1289 URCs for a publisher.
1290
1291 Local URC server The URC server that a user's browser is configured
1292 to connect to as a first resort.
1293
1294 Development URN A URN used while developing a resource. It
1295 starts with very tight access controls so that only the
1296 resource developers and the server administrator can see the URC
1297 information and resolve the URN to a URL. The access controls can
1298 be eased later.
1299
1300 value-added URC server A server that provides more than just the
1301 default information on a resource. Servers run by professional
1302 organizations that provide SOAPs are one example; servers that
1303 keep full-text indices or n-grams of text in order to offer
1304 greater search capabilities are another.
1305
1306 SOAP Seal Of APproval - A capsule review of a resource which uses
1307 cryptographic techniques to provide guarantees on the source of
1308 the review and its authenticity.
1309
1310
1311
1312 Contact Information
1313
1314
1315 Ron Daniel Jr.
1316 MS B287
1317 Los Alamos National Laboratory
1318 Los Alamos, NM, USA 87545
1319 voice: (505) 665-0139
1320 fax: (505) 665-4939
1321 rdaniel@lanl.gov
1322 http://www.acl.lanl.gov/~rdaniel/
1323
1330 Michael Mealling
1331 Office of Information Technology, Network Services
1332 Georgia Institute of Technology
1333 Atlanta, GA, USA 30332-0730
1334 voice: (404) 894-1712
1335 fax: (404) 894 9548
1336 michael.mealling@oit.gatech.edu
1337 http://www.gatech.edu/michael.html
1338
1339
1340
1341 This Internet Draft expires September 22, 1995.