MHTML Working Group                                  Alexander Hopmann
INTERNET-DRAFT                                  ResNova Software, Inc.
<draft-hopmann-html-email-packaging-00.txt>
Expires SIX MONTHS FROM---> February 20th, 1995


             Packaging Aggregate HTML Objects Inside MIME

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in
progress."

To learn the current status of any Internet-Draft, please check
the "1id-abstracts.txt" listing contained in the Internet-Drafts
Shadow Directories on ds.internic.net (US East Coast),
nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or
munnari.oz.au (Pacific Rim).

Distribution of this document is unlimited. Please send comments
to the MHTML working group at <mhtml@segate.sunet.se>. To subscribe
to this list, send a message to <listserv@segate.sunet.se> which
contains the text "sub mhtml <your full name, not email address>".
Discussions of the working group are archived at
<URL:ftp://segate.sunet.se>.

Abstract

Although HTML was designed within the context of MIME, more than the
specification of HTML as defined in RFC 1866 is needed for two
electronic mail user agents to be able to interoperate using HTML as
a document format. The open issues include the naming of objects that
are normally referred to by URIs, and the means of aggregating
objects that go together. This draft describes a set of guidelines
that will allow conforming mail user agents to send, deliver, and
display these HTML objects. In addition, it is hoped that these
techniques will also apply to the wider category of URI-enabled
objects.

Table of Contents

1. Introduction
   1.1 Purpose
   1.2 Overall Operation
   1.3 URI References
       1.3.1 Use of Multipart/Related
       1.3.2 Content-ID URL References
       1.3.3 New MIME Content Headers
   1.4 Other MIME Issues
2. Examples
3. Security Considerations
4. Acknowledgments
5. References
6. Author's Address

1. Introduction

1.1 Purpose

Although HTML [1] is a valid MIME [2] type, RFC 1866 alone does not
specify enough for two electronic mail user agents to interoperate
using HTML as a document format. This draft describes a set of
guidelines that will allow conforming mail user agents to send,
deliver, and display HTML objects. In addition, it is hoped that
these techniques will also apply to the wider category of
URI-enabled [3] objects.

An HTML aggregate object is a MIME-encoded message that contains an
HTML document as well as other data that is required in order to
represent that object (inline pictures, style sheets, etc.). HTML
aggregate objects can also include additional HTML documents that
are linked to the first object, as well as other arbitrary MIME
content.

In designing HTML capabilities for electronic mail user agents
(UAs), it is important to keep in mind the differing needs of
several audiences. Mail sending agents will send aggregate HTML
objects as an encoding of normal day-to-day electronic mail. Mail
sending agents will also send aggregate HTML objects when a user
wishes to mail a particular document from the World Wide Web to
someone else. Finally, mail sending agents acting as automatic
responders will send aggregate HTML documents, providing access to
WWW resources for non-IP connected clients.

Mail receiving agents also have several differing needs. Some mail
receiving agents will be able to receive an aggregate HTML document
and display it just as any other text content type would be
displayed. Others will have to pass this aggregate HTML document to
an HTML browsing program, and provisions need to be made to make
this possible.

Finally, several other constraints arise. It is important that it be
possible for an HTML document to be signed, and for it to be
transmitted to a client and displayed with a minimum chance of
breaking the message integrity check that is part of the signature.

1.2 Overall Operation

A mail user agent that wishes to send a content-type of HTML can
simply do so, as long as the normal data encoding issues are taken
care of as specified in [2]. However, at a basic level there are
some differences between HTML being transferred by HTTP and HTML
being transferred through Internet email. When transferred through
HTTP, HTML by default uses the document character set ISO-8859-1.
Within electronic mail, the default character set is US-ASCII. If a
document uses any characters that are not in US-ASCII, the document
must explicitly label the character set and perform appropriate MIME
content encodings. Instead of applying normal MIME content
encodings, it is possible to translate non-US-ASCII characters to
HTML defined entity references. However, it is inappropriate to use
entity references to non-US-ASCII characters without labeling the
document character set appropriately.
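
As a rough illustration of the labeling rule above, the following
Python sketch picks the narrowest character set that can represent
an HTML body and lets the standard "email" package apply a matching
content transfer encoding. The function name build_html_part and the
fallback order (US-ASCII, then ISO-8859-1, then UTF-8) are
assumptions of this example, not requirements of this draft.

```python
from email.mime.text import MIMEText


def build_html_part(html: str) -> MIMEText:
    """Return a text/html part whose charset label matches its content.

    Hypothetical helper: tries US-ASCII first (the mail default), then
    ISO-8859-1 (the HTTP default for HTML), then UTF-8 as a catch-all.
    """
    for charset in ("us-ascii", "iso-8859-1", "utf-8"):
        try:
            html.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset cannot represent the text; try the next
        # MIMEText labels the charset and chooses the transfer encoding.
        return MIMEText(html, "html", charset)
    raise ValueError("text cannot be encoded in any supported charset")


plain = build_html_part("<html><body><p>Hello</p></body></html>")
accented = build_html_part("<html><body><p>Caf\u00e9</p></body></html>")
print(plain["Content-Type"])
print(accented["Content-Type"], accented["Content-Transfer-Encoding"])
```

A body that stays within US-ASCII is sent without any extra
encoding, while a body containing ISO-8859-1 characters is both
labeled and encoded.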

1.3 URI References

The use of URI references creates some additional issues for
aggregate HTML objects. Normal URI references can of course be used;
however, many user agents are likely to be unable to retrieve the
objects referred to. This document provides a means for these
additional objects to be transmitted with the HTML and for the links
between these objects to be properly resolved.

1.3.1 Use of Multipart/Related

Multiple objects should be aggregated using the multipart/related
content type as defined in RFC 1872 [4]. RFC 1872 says that
multipart/related should have the parameters "start" and "type". The
"start" parameter refers to the Content-ID of the sub-part which
contains the main document, in this case usually the primary HTML
document. The main document should be the one first displayed by the
receiving UA. The "type" parameter serves as a label for the type of
the aggregate object, in this case "text/html".
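
The following Python sketch shows one way a sending agent might
assemble such an aggregate with the standard "email" package. It is
only an illustration: the function name build_aggregate, the
Content-ID values, and the placeholder image bytes are assumptions
of this example.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

ROOT_CID = "<12345@entropy.net>"   # hypothetical Content-ID of the main part
IMAGE_CID = "<45678@entropy.net>"  # hypothetical Content-ID of the picture


def build_aggregate(html: str, image_bytes: bytes) -> MIMEMultipart:
    """Wrap an HTML document and one image in multipart/related."""
    # "start" names the main document; "type" labels the aggregate.
    msg = MIMEMultipart("related", start=ROOT_CID, type="text/html")

    root = MIMEText(html, "html", "us-ascii")
    root["Content-ID"] = ROOT_CID
    msg.attach(root)

    # The subtype is given explicitly so no image sniffing is needed.
    image = MIMEImage(image_bytes, "jpeg")
    image["Content-ID"] = IMAGE_CID
    msg.attach(image)
    return msg


msg = build_aggregate(
    "<html><body><h1>Hi there!</h1></body></html>",
    b"placeholder bytes, not a real JPEG",
)
print(msg.get_param("start"), msg.get_param("type"))
```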

1.3.2 Content-ID URL References

An HTML body part can use Content-ID URLs as described by
draft-levinson-cid-01.txt [5] to refer to other body parts of the
same MIME message. Content-ID URLs can also refer to body parts in
other MIME messages, but it is unlikely that many clients will be
able to resolve the reference.
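
A receiving agent could resolve such a reference within a single
message roughly as sketched below in Python. The helper name
find_part_by_cid is hypothetical, and the sketch assumes the
"cid:<...>" spelling used in the examples of this document; it
strips the angle brackets before comparing so that either spelling
matches.

```python
from email import message_from_string
from email.message import Message
from typing import Optional


def find_part_by_cid(msg: Message, cid_url: str) -> Optional[Message]:
    """Return the body part of msg whose Content-ID matches cid_url."""
    if not cid_url.lower().startswith("cid:"):
        return None
    wanted = cid_url[4:].strip("<>")
    for part in msg.walk():
        content_id = part.get("Content-ID", "").strip().strip("<>")
        if content_id and content_id == wanted:
            return part
    return None  # not in this message; the reference cannot be resolved


RAW = (
    "Mime-Version: 1.0\n"
    "Content-Type: multipart/related; boundary=abcdefghij;\n"
    ' start="<12345@entropy.net>"; type="text/html"\n'
    "\n"
    "--abcdefghij\n"
    "Content-Type: text/html\n"
    "Content-ID: <12345@entropy.net>\n"
    "\n"
    '<img src="cid:<45678@entropy.net>">\n'
    "--abcdefghij\n"
    "Content-Type: image/jpeg\n"
    "Content-ID: <45678@entropy.net>\n"
    "\n"
    "placeholder\n"
    "--abcdefghij--\n"
)

message = message_from_string(RAW)
target = find_part_by_cid(message, "cid:<45678@entropy.net>")
print(target.get_content_type() if target else "unresolved")
```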

1.3.3 New MIME Content Headers

In order to resolve URI references to other body parts, two new MIME
content headers are required. Both of these place URIs in MIME header
fields. Since MIME header fields have a limited length and URIs can
get quite long, these lines may have to be folded. When the lines are
folded, no additional non-white space characters may be introduced,
and since white space is not allowed in URIs, any white space
introduced by folding is simply ignored when the field is read.
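
Because folding may only add white space, a receiver can recover
the URI by discarding every white space character in the field
body. A minimal Python sketch follows; the function name unfold_uri
and the sample URL are assumptions of this example.

```python
def unfold_uri(field_body: str) -> str:
    """Remove all white space (including folding) from a URI field body."""
    # str.split() with no argument splits on any run of white space,
    # so joining the pieces drops spaces, tabs, CR and LF alike.
    return "".join(field_body.split())


folded = "http://www.entropy.net/a/rather/long/\r\n path/to/mydoc.html"
print(unfold_uri(folded))
```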

1.3.3.1 Content-Base

The Content-Base header field may be included in the content headers
of any MIME body part. It specifies the base URL for the body part
and should be a full URL. Any relative URL references within the body
part are made relative to the body part's "base URL". An HTML body
part can also include a <BASE> tag. If it does contain a <BASE> tag,
that <BASE> tag takes precedence over the Content-Base header field.
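
The precedence rule can be sketched in Python as follows. This is
an illustration only: the names resolve_reference and _BaseFinder
are hypothetical, and relative references are resolved with the
standard library's urljoin rather than with any algorithm defined
by this draft.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class _BaseFinder(HTMLParser):
    """Record the HREF of the first <BASE> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "base" and self.href is None:
            self.href = dict(attrs).get("href")


def resolve_reference(html: str, content_base: Optional[str], ref: str) -> str:
    """Resolve ref against <BASE> if present, else against Content-Base."""
    finder = _BaseFinder()
    finder.feed(html)
    base = finder.href or content_base
    return urljoin(base, ref) if base else ref


doc = '<html><head><BASE HREF="http://www.resnova.com/pub/"></head></html>'
print(resolve_reference(doc, "http://www.entropy.net/docs/mydoc.html", "myimg.jpg"))
```

When neither a <BASE> tag nor a Content-Base field is present, the
sketch returns the reference unchanged.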

1.3.3.2 Content-Location

The Content-Location header field may be included in the headers of
any MIME subpart. It specifies the URI that corresponds to the object
present in that subpart. If a URL is specified using the
Content-Location header, it should be a fully qualified URL.
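
One plausible way for a receiving agent to use this field is to
look up a referenced URL among the subparts before attempting any
network retrieval. The Python sketch below assumes exact string
comparison after removing folding white space; the helper name
find_part_by_location is hypothetical, and this draft does not
define a URL comparison rule.

```python
from email import message_from_string
from email.message import Message
from typing import Optional


def find_part_by_location(msg: Message, url: str) -> Optional[Message]:
    """Return the subpart whose Content-Location equals url, if any."""
    wanted = "".join(url.split())
    for part in msg.walk():
        # Folded field bodies may contain white space; drop it first.
        location = "".join(part.get("Content-Location", "").split())
        if location and location == wanted:
            return part
    return None


RAW_MESSAGE = (
    "Mime-Version: 1.0\n"
    "Content-Type: multipart/related; boundary=abcdefghij;\n"
    ' start="<12345@entropy.net>"; type="text/html"\n'
    "\n"
    "--abcdefghij\n"
    "Content-Type: text/html\n"
    "Content-ID: <12345@entropy.net>\n"
    "Content-Location: http://www.entropy.net/mydoc.html\n"
    "\n"
    '<img src="http://www.entropy.net/myimg.jpg">\n'
    "--abcdefghij\n"
    "Content-Type: image/jpeg\n"
    "Content-Location: http://www.entropy.net/\n"
    " myimg.jpg\n"
    "\n"
    "placeholder\n"
    "--abcdefghij--\n"
)

parsed = message_from_string(RAW_MESSAGE)
found = find_part_by_location(parsed, "http://www.entropy.net/myimg.jpg")
print(found.get_content_type() if found else "not in this message")
```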

1.4 Other MIME Issues

Several other tricky issues may exist regarding the deployment of
HTML email; however, they are outside the scope of this document.
These issues include the use of multipart/alternative for content
negotiation, the use of mail/WWW gateways, and the use of URLs
referring to objects outside of the encapsulation (WWW-based
objects, objects in other MIME documents, and objects in other parts
of the same MIME message but not in the same multipart/related).

The use of multipart/alternative is in no way an HTML-specific issue,
and no clear solution exists at this time for the problem of content
negotiation through electronic mail. The use of mail/WWW gateways
should be facilitated by the provisions of this document. However,
this document makes no attempt to specify the format of a request to
such a gateway. The use of references to outside of the
"encapsulating MIME object" is not something that can be prohibited;
the sending UA simply needs to realize that it creates a danger of
the receiving UA not being able to resolve the reference.

Finally, a sending user agent should not make any assumptions about
the method that the receiving user agent will use to display the
HTML files. For example, it should not use Content-Disposition
and/or "file:" URLs under the assumption that the receiving UA is
going to save the pieces of the HTML aggregate object as files on a
disk to be displayed by a separate browser. Content-Disposition
should only be used as described in RFC 1806 in order to distinguish
between file attachments and inline message components.

2. Examples

The first example is the simplest form of an HTML email message.
This is not an aggregate HTML object, but simply one by itself. This
message contains a hot-link but does not provide the ability to
resolve the hot-link. To resolve the hot-link the receiving client
would need either IP access to the Internet, or an electronic mail
web gateway.

    From: some.user@resnova.com
    To: someone.else@entropy.net
    Subject: Hello there
    Mime-Version: 1.0
    Content-Type: text/html

    <html>
    <head></head>
    <body>
    <h1>Hi there!</h1>
    This is a rather pointless example of an HTML message.<p>
    Try clicking <a href="http://www.resnova.com/">here.</a><p>
    </body></html>

The second example shows a simple HTML document with a picture in it:

    From: some.user@resnova.com
    To: someone.else@entropy.net
    Subject: Hello there
    Mime-Version: 1.0
    Content-Type: multipart/related; boundary=abcdefghij;
     start="<12345@entropy.net>"; type="text/html"

    --abcdefghij
    Content-Type: text/html
    Content-ID: <12345@entropy.net>

    <html>
    <head></head>
    <body>
    <h1>Hi there!</h1>
    This is a rather pointless example of an HTML message.<p>
    <img src="cid:<45678@entropy.net>"><p>
    Try clicking <a href="http://www.resnova.com/">here.</a><p>
    </body></html>

    --abcdefghij
    Content-Type: image/jpeg
    Content-ID: <45678@entropy.net>
    Content-Transfer-Encoding: Base64

    jdisfdsufhsdjvbfdhvbfhbvfhbvfdvifdjrivjifdjvfivfd
    etc...
    --abcdefghij--

The third example shows the use of Content-Location. This example
could be a web page that was mailed to someone. Note that the
starting body part still needs a Content-ID.

    From: some.user@resnova.com
    To: someone.else@entropy.net
    Subject: Hello there
    Mime-Version: 1.0
    Content-Type: multipart/related; boundary=abcdefghij;
     start="<12345@entropy.net>"; type="text/html"

    --abcdefghij
    Content-Type: text/html
    Content-ID: <12345@entropy.net>
    Content-Location: http://www.entropy.net/mydoc.html

    <html>
    <head></head>
    <body>
    <h1>Hi there!</h1>
    This is a rather pointless example of an HTML message.<p>
    <img src="http://www.entropy.net/myimg.jpg"><p>
    Try clicking <a href="http://www.resnova.com/">here.</a><p>
    </body></html>

    --abcdefghij
    Content-Type: image/jpeg
    Content-Location: http://www.entropy.net/myimg.jpg
    Content-Transfer-Encoding: Base64

    jdisfdsufhsdjvbfdhvbfhbvfhbvfdvifdjrivjifdjvfivfd
    etc...
    --abcdefghij--

3. Security Considerations

One security consideration is the potential to mail someone an
object and claim that it is represented by a particular URI (by
giving it a Content-Location header field). There can be no
assurance that a WWW request for that same URI would normally result
in that same object. Because of this problem, receiving User Agents
should not cache this data in the same way that data retrieved
through an HTTP or FTP request might be cached.

In addition, by allowing people to mail aggregate HTML objects, we
are opening the door to other potential security problems that until
now were only problems for WWW users. For example, some HTML
documents now either themselves contain executable content
(JavaScript) or contain links to executable content (the "INSERT"
specification, Java). It would be exceedingly dangerous for a
receiving User Agent to execute content received through a mail
message without careful attention to restrictions on the capabilities
of that executable content.

4. Acknowledgments

Thanks to Dave Crocker, Roy Fielding, Ed Levinson, and Paul Hoffman
who, with me, worked out most of the details of this draft in an
informal discussion at the Dallas December 1995 IETF. Additional
thanks to Greg Herlihy and Ed Levinson for reviewing this draft.
Thanks also to Jacob Palme who encouraged the existence of the Mail
HTML effort and addressed many of the issues in his drafts.

5. References

[1] T. Berners-Lee and D. Connolly.
    "HyperText Markup Language Specification 2.0"
    RFC 1866, Proposed Standard
    <URL:http://ds.internic.net/rfc/rfc1866.txt>, November 1995.

[2] N. Borenstein and N. Freed.
    "MIME (Multipurpose Internet Mail Extensions) Part One"
    RFC 1521, Proposed Standard
    <URL:http://ds.internic.net/rfc/rfc1521.txt>, September 1993.

[3] T. Berners-Lee, L. Masinter, and M. McCahill.
    "Uniform Resource Locators (URL)"
    RFC 1738, Proposed Standard
    <URL:http://ds.internic.net/rfc/rfc1738.txt>, December 1994.

[4] E. Levinson.
    "The MIME Multipart/Related Content-type"
    RFC 1872, Experimental
    <URL:http://ds.internic.net/rfc/rfc1872.txt>, December 1995.

6. Author's Address

Alex Hopmann
alex.hopmann@resnova.com
President
ResNova Software, Inc.
5011 Argosy Dr. #13
Huntington Beach, CA 92649
|