MHTML Working Group                                  Alexander Hopmann
INTERNET-DRAFT                                  ResNova Software, Inc.
<draft-hopmann-html-email-packaging-00.txt>
Expires SIX MONTHS FROM---> February 20th, 1995



             Packaging Aggregate HTML Objects Inside MIME

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in
progress."

To learn the current status of any Internet-Draft, please check
the "1id-abstracts.txt" listing contained in the Internet-Drafts
Shadow Directories on ds.internic.net (US East Coast),
nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or
munnari.oz.au (Pacific Rim).

Distribution of this document is unlimited. Please send comments
to the MHTML working group at <mhtml@segate.sunet.se>. To subscribe
to this list, send a message to <listserv@segate.sunet.se> which
contains the text "sub mhtml <your full name, not email address>".
Discussions of the working group are archived at
<URL:ftp://segate.sunet.se>.

Abstract

Although HTML was designed within the context of MIME, the
specification of HTML in RFC 1866 alone is not sufficient for two
electronic mail user agents to interoperate using HTML as a
document format. The open issues include the naming of objects that
are normally referred to by URIs and the means of aggregating
objects that belong together. This draft describes a set of
guidelines that will allow conforming mail user agents to send,
deliver, and display these HTML objects. In addition, it is hoped
that these techniques will also apply to the wider category of
URI-enabled objects.

Table of Contents

1. Introduction
    1.1 Purpose
    1.2 Overall Operation
    1.3 URI References
        1.3.1 Use of Multipart/Related
        1.3.2 Content-ID URL References
        1.3.3 New MIME Content Headers
    1.4 Other MIME Issues
2. Examples
3. Security Considerations
4. Acknowledgments
5. References
6. Author's Address

1. Introduction

1.1 Purpose

Although HTML [1] is a valid MIME [2] type, RFC 1866 does not
specify enough for two electronic mail user agents to interoperate
using HTML as a document format. This draft describes a set of
guidelines that will allow conforming mail user agents to send,
deliver, and display HTML objects. In addition, it is hoped that
these techniques will also apply to the wider category of
URI-enabled [3] objects.

An HTML aggregate object is a MIME-encoded message that contains
an HTML document as well as the other data required to represent
that document (inline pictures, style sheets, etc.). HTML aggregate
objects can also include additional HTML documents that are linked
to the first document, as well as other arbitrary MIME content.

In designing HTML capabilities for electronic mail user agents
(UAs), it is important to keep in mind the differing needs of
several audiences. Mail sending agents will send aggregate HTML
objects as an encoding of normal day-to-day electronic mail. Mail
sending agents will also send aggregate HTML objects when a user
wishes to mail a particular document from the World Wide Web to
someone else. Finally, mail sending agents will send aggregate HTML
objects as automatic responders, providing access to WWW resources
for clients without IP connectivity.

Mail receiving agents also have differing needs. Some mail
receiving agents will be able to receive an aggregate HTML object
and display it just as any other text content type would be
displayed. Others will have to pass the aggregate HTML object to an
HTML browsing program, and provisions need to be made to make this
possible.

Finally, several other constraints apply. It is important that an
HTML document can be signed, transmitted to a client, and displayed
with a minimum chance of breaking the message integrity check that
is part of the signature.

1.2 Overall Operation

A mail user agent that wishes to send a message with a content
type of text/html can simply do so, as long as the normal data
encoding issues are taken care of as specified in [2]. However, at
a basic level there are some differences between HTML transferred
by HTTP and HTML transferred through Internet email. When
transferred through HTTP, HTML by default uses the document
character set ISO-8859-1. Within electronic mail, the default
character set is US-ASCII. If a document uses any characters that
are not in US-ASCII, the sender must explicitly label the character
set and apply an appropriate MIME content transfer encoding.
Instead of applying a MIME content transfer encoding, it is
possible to translate non-US-ASCII characters to HTML-defined
entity references. However, it is inappropriate to use entity
references to non-US-ASCII characters without labeling the document
character set appropriately.
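
The labeling rule above can be sketched with Python's standard
email package. This is only an illustration and not part of the
draft: the names pick_charset and build_html_message, the choice of
quoted-printable as the transfer encoding, and the UTF-8 fallback
for text outside ISO-8859-1 are assumptions made for the example.

```python
from email.message import EmailMessage


def pick_charset(html: str) -> str:
    """Return the narrowest charset label that can represent the text."""
    for charset in ("us-ascii", "iso-8859-1"):
        try:
            html.encode(charset)
            return charset
        except UnicodeEncodeError:
            continue
    # Assumption for this sketch: fall back to UTF-8 for anything else.
    return "utf-8"


def build_html_message(html: str) -> EmailMessage:
    """Build a text/html message whose charset is labeled explicitly."""
    charset = pick_charset(html)
    # Pure US-ASCII can travel as 7bit; anything else gets a MIME
    # content transfer encoding (quoted-printable is chosen here).
    cte = "7bit" if charset == "us-ascii" else "quoted-printable"
    msg = EmailMessage()
    msg.set_content(html, subtype="html", charset=charset, cte=cte)
    return msg
```

A document containing only US-ASCII characters is labeled
"us-ascii" and sent as 7bit, while a document containing, say, an
e-acute is labeled "iso-8859-1" and encoded before transfer.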

1.3 URI References

The use of URI references creates some additional issues for
aggregate HTML objects. Normal URI references can of course be
used; however, many user agents are likely to be unable to retrieve
the objects referred to. This document provides a means for these
additional objects to be transmitted with the HTML and for the
links between these objects to be properly resolved.

1.3.1 Use of Multipart/Related

Multiple objects should be aggregated using the multipart/related
content type as defined in RFC 1872 [4]. RFC 1872 says that
multipart/related should have the parameters "start" and "type".
The "start" parameter refers to the Content-ID of the sub-part that
contains the main document, in this case usually the primary HTML
document. The main document should be the one first displayed by
the receiving UA. The "type" parameter serves as a label for the
type of the aggregate object, in this case "text/html".
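
As a rough sketch of how a sending and a receiving UA might handle
these two parameters, the following uses Python's standard email
package. The function names build_aggregate and find_main_part are
hypothetical, and falling back to the first sub-part when "start"
cannot be matched is an assumption of this example, not something
this draft specifies.

```python
from email.message import EmailMessage


def build_aggregate(html, image, html_cid, image_cid):
    """Package an HTML document and one image as multipart/related."""
    msg = EmailMessage()
    msg.set_content(html, subtype="html", cid=html_cid)
    # add_related() converts the message to multipart/related and moves
    # the HTML (with its Content-ID) into the first sub-part.
    msg.add_related(image, maintype="image", subtype="jpeg", cid=image_cid)
    msg.set_param("start", html_cid)    # Content-ID of the main document
    msg.set_param("type", "text/html")  # type of the aggregate object
    return msg


def find_main_part(msg):
    """Return the sub-part named by the "start" parameter."""
    start = msg.get_param("start")
    parts = list(msg.iter_parts())
    for part in parts:
        if start is not None and part["Content-ID"] == start:
            return part
    # Assumption: with no usable "start", treat the first part as main.
    return parts[0] if parts else None
```

A receiving UA would display the part returned by find_main_part
first and consult the remaining sub-parts only when the main
document refers to them.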

1.3.2 Content-ID URL References

An HTML body part can use Content-ID URLs as described by
draft-levinson-cid-01.txt [5] to refer to other body parts of the
same MIME message. Content-ID URLs can also refer to body parts in
other MIME messages, but it is unlikely that many clients will be
able to resolve such references.
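
One possible way for a receiving UA to resolve a Content-ID URL
within the same message is sketched below in Python. The helper
names normalize_cid and resolve_cid and the SAMPLE message are
made up for this illustration. Accepting the identifier both with
and without angle brackets is an assumption of the sketch; the
exact URL syntax is defined by [5], not here.

```python
from email import message_from_string


def normalize_cid(value: str) -> str:
    """Strip white space and surrounding angle brackets from an ID."""
    value = value.strip()
    if value.startswith("<") and value.endswith(">"):
        value = value[1:-1]
    return value


def resolve_cid(msg, url: str):
    """Return the body part a cid: URL points to, or None."""
    scheme, sep, rest = url.partition(":")
    if not sep or scheme.lower() != "cid":
        return None
    wanted = normalize_cid(rest)
    for part in msg.walk():
        cid = part.get("Content-ID")
        if cid is not None and normalize_cid(str(cid)) == wanted:
            return part
    return None


SAMPLE = """\
Mime-Version: 1.0
Content-Type: multipart/related; boundary=abcdefghij;
    start="<12345@entropy.net>"; type="text/html"

--abcdefghij
Content-Type: text/html
Content-ID: <12345@entropy.net>

<html><body><img src="cid:<45678@entropy.net>"></body></html>
--abcdefghij
Content-Type: image/jpeg
Content-ID: <45678@entropy.net>
Content-Transfer-Encoding: Base64

/9j/4AAQ
--abcdefghij--
"""

image_part = resolve_cid(message_from_string(SAMPLE),
                         "cid:<45678@entropy.net>")
```

A reference that names a body part in some other message simply
yields None here, which matches the expectation that most clients
will not be able to resolve it.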

1.3.3 New MIME Content Headers

In order to resolve URI references to other body parts, two new
MIME content headers are required. Both of these place URIs in MIME
header fields. Since MIME header fields have a limited line length
and URIs can get quite long, these lines may have to be folded.
When the lines are folded, no additional non-white-space characters
may be introduced. Since white space is not allowed in URIs, any
white space introduced by folding is simply ignored when the URI is
read.
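
The folding rule can be illustrated with a small Python sketch.
The function names fold_uri_header and unfold_uri and the 76
character line width are assumptions chosen for the example. The
sketch folds at arbitrary positions inside the URI, which relies
on the rule above that white space inside a URI is ignored.

```python
import re


def fold_uri_header(name: str, uri: str, width: int = 76) -> str:
    """Fold "name: uri" so that no line is longer than width."""
    first = max(1, width - len(name) - 2)  # room left on the first line
    rest = max(1, width - 1)               # continuation lines begin with a space
    chunks = [uri[:first]]
    chunks += [uri[i:i + rest] for i in range(first, len(uri), rest)]
    return name + ": " + "\r\n ".join(chunks)


def unfold_uri(value: str) -> str:
    """Recover the URI from a (possibly folded) header field value."""
    # White space is not allowed in URIs, so all of it can be dropped.
    return re.sub(r"\s+", "", value)
```

Folding followed by unfolding returns the original URI unchanged,
because only white space is added and only white space is removed.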

1.3.3.1 Content-Base

The Content-Base header field may be included in any MIME content
header. It specifies the base URL for the body part and should be a
full URL. Any relative URL references within the body part are
resolved relative to the body part's base URL. An HTML body part
can also include a <BASE> tag. If it does, that <BASE> tag takes
precedence over the Content-Base header field.
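
The precedence rule can be sketched in Python with the standard
urljoin function. The function name resolve_reference is
hypothetical, and the sketch assumes the caller has already
extracted the URL from any <BASE> tag and from the Content-Base
header field.

```python
from typing import Optional
from urllib.parse import urljoin


def resolve_reference(ref: str,
                      content_base: Optional[str] = None,
                      html_base: Optional[str] = None) -> str:
    """Resolve a URL reference found in an HTML body part."""
    # A <BASE> tag in the HTML takes precedence over Content-Base.
    base = html_base or content_base
    if base is None:
        return ref  # nothing to resolve against; leave unchanged
    return urljoin(base, ref)
```

For example, with a Content-Base of "http://www.entropy.net/docs/"
the relative reference "myimg.jpg" resolves to
"http://www.entropy.net/docs/myimg.jpg", unless the document also
carries a <BASE> tag, in which case that URL is used instead.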

1.3.3.2 Content-Location

The Content-Location header field may be included in any MIME
subpart header. It specifies the URI that corresponds to the object
present in that subpart. If a URL is specified using the
Content-Location header field, it should be a fully qualified URL.
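
A receiving UA could use Content-Location to satisfy URL references
from the message itself instead of the network. The Python sketch
below is one way to do that; the function name index_by_location
and the SAMPLE message are made up for the illustration.

```python
from email import message_from_string


def index_by_location(msg):
    """Map each Content-Location URI to the body part that carries it."""
    index = {}
    for part in msg.walk():
        location = part.get("Content-Location")
        if location is not None:
            # Drop any white space introduced by header folding.
            index["".join(str(location).split())] = part
    return index


SAMPLE = """\
Mime-Version: 1.0
Content-Type: multipart/related; boundary=abcdefghij;
    start="<12345@entropy.net>"; type="text/html"

--abcdefghij
Content-Type: text/html
Content-ID: <12345@entropy.net>
Content-Location: http://www.entropy.net/mydoc.html

<html><body><img src="http://www.entropy.net/myimg.jpg"></body></html>
--abcdefghij
Content-Type: image/jpeg
Content-Location: http://www.entropy.net/myimg.jpg
Content-Transfer-Encoding: Base64

/9j/4AAQ
--abcdefghij--
"""

parts = index_by_location(message_from_string(SAMPLE))
image_part = parts.get("http://www.entropy.net/myimg.jpg")
```

A URL that is not present in the index, such as the hot-link to
another site, still has to be retrieved by other means.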

1.4 Other MIME Issues

Several other tricky issues may exist regarding the deployment of
HTML email; however, they are outside the scope of this document.
These issues include the use of multipart/alternative for content
negotiation, the use of mail/WWW gateways, and the use of URLs
referring to objects outside of the encapsulation (WWW-based
objects, objects in other MIME documents, and objects in other
parts of the same MIME message but not in the same
multipart/related).

The use of multipart/alternative is in no way an HTML-specific
issue, and no clear solution exists at this time for the problem of
content negotiation through electronic mail. The use of mail/WWW
gateways should be facilitated by the provisions of this document.
However, this document makes no attempt to specify the format of a
request to such a gateway. References to objects outside of the
"encapsulating MIME object" cannot be prohibited, but the sending
UA needs to realize that they create a danger of the receiving UA
not being able to resolve the reference.

Finally, a sending user agent should not make any assumptions about
the method that the receiving user agent will use to display the
HTML files. For example, it should not use Content-Disposition
and/or "file:" URLs under the assumption that the receiving UA is
going to save the pieces of the HTML aggregate object as files on a
disk to be displayed by a separate browser. Content-Disposition
should only be used as described in RFC 1806 in order to
distinguish between file attachments and inline message components.

2. Examples

The first example is the simplest form of an HTML email message.
This is not an aggregate HTML object, but simply a single HTML
document. The message contains a hot-link but does not provide the
means to resolve it. To resolve the hot-link, the receiving client
would need either IP access to the Internet or an electronic mail
web gateway.

    From: some.user@resnova.com
    To: someone.else@entropy.net
    Subject: Hello there
    Mime-Version: 1.0
    Content-Type: text/html

    <html>
    <head></head>
    <body>
    <h1>Hi there!</h1>
    This is a rather pointless example of an HTML message.<p>
    Try clicking <a href="http://www.resnova.com/">here.</a><p>
    </body></html>

The second example shows a simple HTML document with a picture in
it:

    From: some.user@resnova.com
    To: someone.else@entropy.net
    Subject: Hello there
    Mime-Version: 1.0
    Content-Type: multipart/related; boundary=abcdefghij;
        start="<12345@entropy.net>"; type="text/html"

    --abcdefghij
    Content-Type: text/html
    Content-ID: <12345@entropy.net>

    <html>
    <head></head>
    <body>
    <h1>Hi there!</h1>
    This is a rather pointless example of an HTML message.<p>
    <img src="cid:<45678@entropy.net>"><p>
    Try clicking <a href="http://www.resnova.com/">here.</a><p>
    </body></html>

    --abcdefghij
    Content-Type: image/jpeg
    Content-ID: <45678@entropy.net>
    Content-Transfer-Encoding: Base64

    jdisfdsufhsdjvbfdhvbfhbvfhbvfdvifdjrivjifdjvfivfd
    etc...
    --abcdefghij--

The third example shows the use of Content-Location. This example
could be a web page that was mailed to someone. Note that the
starting body part still needs a Content-ID, because the "start"
parameter refers to it.

    From: some.user@resnova.com
    To: someone.else@entropy.net
    Subject: Hello there
    Mime-Version: 1.0
    Content-Type: multipart/related; boundary=abcdefghij;
        start="<12345@entropy.net>"; type="text/html"

    --abcdefghij
    Content-Type: text/html
    Content-ID: <12345@entropy.net>
    Content-Location: http://www.entropy.net/mydoc.html

    <html>
    <head></head>
    <body>
    <h1>Hi there!</h1>
    This is a rather pointless example of an HTML message.<p>
    <img src="http://www.entropy.net/myimg.jpg"><p>
    Try clicking <a href="http://www.resnova.com/">here.</a><p>
    </body></html>

    --abcdefghij
    Content-Type: image/jpeg
    Content-Location: http://www.entropy.net/myimg.jpg
    Content-Transfer-Encoding: Base64

    jdisfdsufhsdjvbfdhvbfhbvfhbvfdvifdjrivjifdjvfivfd
    etc...
    --abcdefghij--

3. Security Considerations

One security consideration is that a sender can mail someone an
object and claim that it is represented by a particular URI (by
giving it a Content-Location header field). There can be no
assurance that a WWW request for that same URI would result in that
same object. Because of this, receiving user agents should not
cache this data in the same way that data retrieved through an HTTP
or FTP request might be cached.

In addition, by allowing people to mail aggregate HTML objects, we
are opening the door to other potential security problems that
until now affected only WWW users. For example, some HTML documents
now either contain executable content themselves (JavaScript) or
contain links to executable content (the "INSERT" specification,
Java). It would be exceedingly dangerous for a receiving user agent
to execute content received through a mail message without careful
attention to restrictions on the capabilities of that executable
content.

4. Acknowledgments

Thanks to Dave Crocker, Roy Fielding, Ed Levinson, and Paul Hoffman
who, with me, worked out most of the details of this draft in an
informal discussion at the December 1995 IETF meeting in Dallas.
Additional thanks to Greg Herlihy and Ed Levinson for reviewing
this draft. Thanks also to Jacob Palme, who encouraged the
existence of the Mail HTML effort and addressed many of the issues
in his drafts.

5. References

[1] T. Berners-Lee and D. Connolly.
    "HyperText Markup Language Specification 2.0"
    RFC 1866, Proposed Standard
    <URL: http://ds.internic.net/rfc/rfc1866.txt>, November 1995.

[2] N. Borenstein and N. Freed.
    "MIME (Multipurpose Internet Mail Extensions) Part One"
    RFC 1521, Proposed Standard
    <URL: http://ds.internic.net/rfc/rfc1521.txt>, September 1993.

[3] T. Berners-Lee, L. Masinter, and M. McCahill.
    "Uniform Resource Locators (URL)"
    RFC 1738, Proposed Standard
    <URL: http://ds.internic.net/rfc/rfc1738.txt>, December 1994.

[4] E. Levinson.
    "The MIME Multipart/Related Content-type"
    RFC 1872, Experimental
    <URL: http://ds.internic.net/rfc/rfc1872.txt>, December 1995.

6. Author's Address

Alex Hopmann
alex.hopmann@resnova.com
President
ResNova Software, Inc.
5011 Argosy Dr. #13
Huntington Beach, CA 92649