INTERNET-DRAFT                                               E. Nebel
File Transmission from WWW Browsers to Servers            L. Masinter
draft-ietf-html-fileupload-00.txt                   Xerox Corporation
Expires May 18, 1995                                November 18, 1994


        File Transfer from World-Wide Web Browsers to Servers

i. Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the
``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).


1. Abstract

Currently, a World-Wide Web server can get information from users
with HTML forms. These forms have proven useful in a wide variety
of applications in which input from the user is necessary. But this
capability is still greatly limited because HTML forms don't provide
a way for the user to submit files to the server. Service providers
who need to get files from the user have had to implement custom
browsers. (Examples of these custom browsers have appeared on the
www-talk mailing list.) To avoid the necessity for custom browsers
and to make WWW servers complete in their ability to get information
from the user, the WWW needs to provide a way for users to send files
to servers. Since user information is sent back to the server using
HTML forms, it is most logical to extend HTML forms to support file
submission.

This document proposes an extension to HTML to allow forms to request
that users supply files as data to be returned when the form has been
completely filled out and submitted. It also includes a description of
a backward compatibility strategy that allows new servers to interact
with old WWW browsers.

2. HTML forms with file submission

The current draft HTML specification <URL:http://www.hal.com/
%7Fconnolly/html-spec/spyglass-19941014/html-19941014.txt.Z> defines
eight possible values for the attribute TYPE of an INPUT element:
CHECKBOX, HIDDEN, IMAGE, PASSWORD, RADIO, RESET, SUBMIT, TEXT.

In addition, it defines the ENCTYPE attribute of a FORM element
using the POST METHOD to default to
"application/x-www-form-urlencoded".

This proposal makes two changes:
   1) add a FILE option for the TYPE attribute of INPUT
   2) allow the ENCTYPE of a FORM to be "multipart/www-form-data".

These changes might be considered independently, but are both
necessary for reasonable file upload.

The author of an HTML form who wants to request one or more files
from a user would write (for example):

    <FORM ENCTYPE="multipart/www-form-data" ACTION="_URL_" METHOD=POST>

    File to process: <INPUT NAME="userfile1" TYPE="file">

    <INPUT TYPE="submit" VALUE="Send File">

    </FORM>

The change to the HTML DTD is trivial--just one item added to the
entity "InputType", as follows:


    ... (other elements) ...

    <!ENTITY % InputType "(TEXT | PASSWORD | CHECKBOX |
                           RADIO | SUBMIT | RESET |
                           IMAGE | HIDDEN | FILE )">
    <!ELEMENT INPUT - O EMPTY>
    <!ATTLIST INPUT
            TYPE %InputType TEXT
            NAME CDATA #IMPLIED  -- required for all but submit and reset --
            VALUE CDATA #IMPLIED
            SRC %URI #IMPLIED  -- for image inputs --
            CHECKED (CHECKED) #IMPLIED
            SIZE CDATA #IMPLIED  -- like NUMBERS,
                                    but delimited with comma, not space --
            MAXLENGTH NUMBER #IMPLIED
            ALIGN (top|middle|bottom) #IMPLIED
            >

    ... (other elements) ...

This is the minimal change requested. Other, larger changes to the
InputType entity might also be contemplated but are not part of this
proposal. For example, an INPUT element might usefully have an
attribute which identifies a set of acceptable media-types, e.g.,
<INPUT TYPE=file ACCEPT="image/gif, image/tiff" NAME="image1">.


3. Proposed implementation

The proposed implementation in WWW browsers is, when an INPUT tag of
type FILE is encountered, to show a display of (previously selected)
file names and a "Browse" button or selection method. Selecting the
"Browse" button would cause the browser to enter a file selection
mode appropriate for the platform. Window-based browsers might pop up
a file selection window, for example. In such a file selection dialog,
the user would have the option of replacing a current selection,
adding a new file selection, etc. Browser implementors might choose
to let the list of file names be manually edited.

When the user completes the form and selects the SUBMIT element, the
browser should send the form data and the content of the selected
files. The encoding type "application/x-www-form-urlencoded" is
inefficient for sending large quantities of binary data. Thus, a
(new) media type, "multipart/www-form-data", is proposed as a way of
efficiently sending the values associated with a filled-out form from
client to server.

The media-type (MIME-type) "multipart/www-form-data" follows the
rules of all multipart MIME data streams as outlined in RFC 1521--a
boundary is selected that does not occur (with more than
infinitesimal probability) in any of the data. Each field of the
form is sent, in the order in which it occurs in the form, as a part
of the multipart stream. Each part identifies the INPUT name within
the original HTML form using a "Name: " header. Each part has an
optional Content-Type (which defaults to text/plain). File inputs
should be identified as either application/binary or the appropriate
media type, if known. If multiple files were selected, they should be
transferred together using the multipart/mixed format. The
"content-transfer-encoding" for each part should be "binary".

File inputs may optionally identify the file name using the
"Content-Description" header. Browsers may optionally include a
Content-Length header both in the overall reply and in individual
components; the content-length is not intended as a replacement for
the multipart boundary as a way of detecting the end of an
individual component; rather, it is just a way of forewarning the
server of the amount of data coming.

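The encoding rules above can be sketched in Python. This is only an
illustration under stated assumptions, not a reference implementation:
the function names (choose_boundary, join_parts, file_part,
encode_form), the shape of the `fields` argument, and the use of a
random hexadecimal boundary are all hypothetical. The boundary
parameter is separated with ";" as RFC 1521 specifies, and field text
is assumed to be ASCII.

```python
import uuid

CRLF = b"\r\n"


def choose_boundary(chunks):
    """Return a boundary string that occurs in none of the byte chunks."""
    while True:
        candidate = uuid.uuid4().hex
        if all(candidate.encode("ascii") not in chunk for chunk in chunks):
            return candidate


def join_parts(parts):
    """Join (header_lines, body) pairs into one multipart body.

    Only the part bodies are checked against the boundary; header lines
    are assumed not to contain it. Returns (boundary, encoded_bytes).
    """
    boundary = choose_boundary([body for _, body in parts])
    delimiter = b"--" + boundary.encode("ascii")
    out = []
    for headers, body in parts:
        out.append(delimiter + CRLF)
        for line in headers:
            out.append(line.encode("ascii") + CRLF)
        out.append(CRLF + body + CRLF)
    out.append(delimiter + b"--")
    return boundary, b"".join(out)


def file_part(filename, media_type, data):
    """Headers and body for one file; unknown types become application/binary."""
    headers = [
        "Content-Type: " + (media_type or "application/binary"),
        "Content-Description: " + filename,
        "Content-Transfer-Encoding: binary",
    ]
    return headers, data


def encode_form(fields):
    """Encode form fields, in form order, as multipart/www-form-data.

    `fields` is a list of (name, value). A value is a str for ordinary
    inputs, or a list of (filename, media_type, bytes) for FILE inputs.
    Returns (content_type_header_value, body_bytes).
    """
    parts = []
    for name, value in fields:
        headers = ["Name: " + name]
        if isinstance(value, str):
            # No Content-Type header: the part defaults to text/plain.
            parts.append((headers, value.encode("ascii")))
        elif len(value) == 1:
            file_headers, data = file_part(*value[0])
            parts.append((headers + file_headers, data))
        else:
            # Several files travel together as a nested multipart/mixed.
            boundary, mixed = join_parts([file_part(*f) for f in value])
            headers.append("Content-Type: multipart/mixed; boundary=" + boundary)
            parts.append((headers, mixed))
    boundary, body = join_parts(parts)
    return "multipart/www-form-data; boundary=" + boundary, body
```

A browser following this sketch could also compute a Content-Length
from len(body) before sending, since the whole body is built in memory.
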
On the server end, the ACTION might point to an HTTP URL that
implements the form's action via CGI. In such a case, the CGI program
would note that the content-type is multipart/www-form-data and parse
the various fields (checking for validity, writing the file data to
local files for subsequent processing, etc.).

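As a rough sketch of the server side, the following Python function
parses such a submission with the standard library's MIME parser. The
function name parse_form and the shape of its result are hypothetical;
the sketch assumes the request's Content-Type value and raw body are
already in hand, that the boundary parameter is separated with ";" as
in RFC 1521, and that each file carries a Content-Description header.

```python
from email import message_from_bytes


def parse_form(content_type, body):
    """Parse a multipart/www-form-data body.

    Returns a dict mapping each field name to its bytes, or, for a
    field sent as multipart/mixed, to a dict of file description -> bytes.
    """
    # Prepend the Content-Type header so the parser sees a full MIME entity.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    if (message.get_content_type() != "multipart/www-form-data"
            or not message.is_multipart()):
        raise ValueError("not a multipart/www-form-data submission")

    fields = {}
    for part in message.get_payload():
        name = part["Name"]  # the INPUT name, as proposed in this draft
        if part.is_multipart():
            # Several files were selected for this field.
            fields[name] = {
                sub["Content-Description"]: sub.get_payload(decode=True)
                for sub in part.get_payload()
            }
        else:
            fields[name] = part.get_payload(decode=True)
    return fields
```

A CGI program would typically follow this step with validation of each
field and would write the file data to local storage.
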
4. Backward compatibility issues

While not necessary for successful adoption of an enhancement to the
current WWW form mechanism, it is useful to also plan for a migration
strategy: users with older browsers can still participate in file
upload dialogs, using a helper application. Most current web browsers,
when given <INPUT TYPE=FILE>, will treat it as <INPUT TYPE=TEXT> and
give the user a text box. The user can type a file name into this
text box. In addition, current browsers seem to ignore the ENCTYPE
parameter in the <FORM> element, and always transmit the data as
application/x-www-form-urlencoded.

Thus, the server CGI might be written in a way that would note that
the form data returned had content-type
application/x-www-form-urlencoded instead of multipart/www-form-data,
and know that the user was using a browser that didn't implement file
upload.

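The check described above amounts to a small dispatch on the request's
content type. A minimal Python sketch follows; the function name
client_supports_upload is hypothetical, and a CGI program would
normally obtain the value from the CONTENT_TYPE environment variable.

```python
def client_supports_upload(content_type):
    """Tell from the request Content-Type whether the browser did file upload.

    True  -> the form came back as multipart/www-form-data (new browser)
    False -> it came back as application/x-www-form-urlencoded (old browser)
    """
    # Drop any parameters such as "; boundary=..." and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "multipart/www-form-data":
        return True
    if media_type == "application/x-www-form-urlencoded":
        return False
    raise ValueError("unexpected form content type: " + media_type)
```
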
In this case, rather than replying with a "text/html" response, the
CGI on the server could instead send back a data stream that a helper
application might process instead; this would be a data stream that
contained:
   * The (fully qualified) URL to which the actual form data should
     be posted (terminated with CRLF)
   * The list of field names that were supposed to be file contents
     (space separated, terminated with CRLF)
   * The entire original application/x-www-form-urlencoded form data.

This data stream would be marked as application/x-please-send-files.

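The three-part stream listed above is simple enough to build and to
split apart again in a few lines. The Python sketch below is
illustrative only: the function names are hypothetical, and it assumes
that neither the URL nor the field names contain CRLF and that field
names contain no spaces.

```python
def build_please_send_files(action_url, file_fields, form_data):
    """Assemble the body of an application/x-please-send-files reply.

    action_url  -- fully qualified URL the form data should be posted to
    file_fields -- names of the fields that were supposed to be files
    form_data   -- the original application/x-www-form-urlencoded bytes
    """
    return (
        action_url.encode("ascii") + b"\r\n"
        + " ".join(file_fields).encode("ascii") + b"\r\n"
        + form_data
    )


def parse_please_send_files(stream):
    """Split a reply back into (action_url, file_fields, form_data)."""
    # Only the first two CRLFs are separators; the rest is form data.
    url_line, names_line, form_data = stream.split(b"\r\n", 2)
    return url_line.decode("ascii"), names_line.decode("ascii").split(), form_data
```
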
In this case, the browser needs to be configured to process
application/x-please-send-files to launch a helper application.

The helper would read the form data, note which fields contained
'local file names' that needed to be replaced with their data
content, might itself prompt the user for changing or adding to the
list of files available, and then repackage the data & file contents
in multipart/www-form-data for retransmission back to the server.

The helper would generate the kind of data that a 'new' browser should
actually have sent in the first place, with the intention that the URL
to which it is sent corresponds to the original ACTION URL. The point
of this is that the server can use the *same* CGI to implement the
mechanism for dealing with both old and new browsers.

The helper need not display the form data, but *should* ensure that
the user is actually prompted about the suitability of sending the
files requested. (This is to avoid a security problem with malicious
servers that ask for files that weren't actually promised by the
user.) It would be useful if the status of the transfer of the files
involved could be displayed.

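The helper's central step, replacing local file names with file
contents only after the user agrees, might look like the Python sketch
below. Everything here is an assumption made for illustration: the
function name collect_fields, the `confirm` and `read_file` callbacks,
and the output format (a str for ordinary fields, a list of
(filename, media_type, bytes) tuples for file fields, with None
standing for an unknown media type). Repackaging the result as
multipart/www-form-data and posting it are left out.

```python
from urllib.parse import parse_qsl


def collect_fields(form_data, file_fields, confirm, read_file):
    """Replace local file names with file contents, asking the user first.

    form_data   -- the original application/x-www-form-urlencoded bytes
    file_fields -- names of the fields that stand for files
    confirm     -- callable(field, filename) -> bool, the user prompt
    read_file   -- callable(filename) -> bytes
    """
    fields = []
    pairs = parse_qsl(form_data.decode("ascii"), keep_blank_values=True)
    for name, value in pairs:
        if name in file_fields:
            # Never send a file the user has not explicitly approved.
            if not confirm(name, value):
                raise PermissionError("user declined to send " + value)
            fields.append((name, [(value, None, read_file(value))]))
        else:
            fields.append((name, value))
    return fields
```

Passing the prompt and the file reader in as callbacks keeps the
sketch independent of any particular user interface or file system.
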
5. Other considerations

Compression:

This scheme doesn't address the possible compression of files.
After some consideration, it seemed that the optimization issues of
file compression were too complex to have browsers decide
automatically that files should be compressed. Many link-layer
transport mechanisms (e.g., high-speed modems) perform data
compression over the link, and optimizing for compression at this
layer might not be appropriate. It might be possible for browsers to
optionally produce a content-transfer-encoding of x-compress for
file data, and for servers to decompress the data before processing,
if desired; this was left out of the proposal, however.

Deferred file transmission:

In some situations, it might be advisable to have the server validate
various elements of the client's data (user name, account, etc.)
before actually preparing to receive the data. However, after some
consideration, it seemed best to require that servers that wish to do
this should implement this as a series of forms, where some of the
data elements that were previously validated might be sent back to
the client as 'hidden' fields. This puts the onus of maintaining the
state of a transaction only on those servers that wish to build a
complex application, while allowing those cases that have simple
input needs to be built simply.

Other choices for return transmission of binary data:

Various people have suggested using a new MIME top-level type
"aggregate", e.g., aggregate/mixed, or a content-transfer-encoding of
"packet" to express indeterminate-length binary data, rather than
relying on the multipart-style boundaries. While we are not opposed
to doing so, this would require additional design and standardization
work to get acceptance of "aggregate". On the other hand, the
'multipart' mechanisms are well established, trivial to implement on
both the sending client and receiving server, and as efficient as
other methods of dealing with multiple combinations of binary data.

Not overloading <INPUT>:

Various people have wondered about the advisability of overloading
'INPUT' for this function, rather than merely providing a different
type of FORM element. Among other considerations, the migration
strategy which is allowed when using <INPUT> is important. In
addition, the <INPUT> field *is* already overloaded to contain most
kinds of data input; rather than creating multiple kinds of <INPUT>
tags, it seems most reasonable to enhance <INPUT>. The 'type' of
INPUT is not the content-type of what is returned, but rather the
'widget-type'; i.e., it identifies the interaction style with the
user. The description here is carefully written to allow <INPUT
TYPE=FILE> to work for text browsers or audio-markup.

Default content-type of field data:

Many input fields in HTML are to be typed in. There has been some
ambiguity as to how form data should be transmitted back to servers.
Making the content-type of <INPUT> fields be text/plain makes it
clear that the client should properly encode the data, using CRLF
line breaks, before sending it back to the server.

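For typed-in text, "properly encode" comes down to rewriting whatever
line breaks the platform uses as CRLF before the field is sent as
text/plain. A minimal Python sketch, with a hypothetical function name:

```python
import re


def to_crlf(text):
    """Rewrite every line break (CRLF, lone CR, or lone LF) as CRLF."""
    # "\r\n" is listed first so an existing CRLF is matched as one break.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)
```
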
6. Examples

Suppose the server supplies the following HTML:

    <FORM ACTION="http://server.dom/cgi/handle"
          ENCTYPE="multipart/www-form-data" METHOD=POST>
    What is your name? <INPUT TYPE=TEXT NAME=submitter>
    What files are you sending? <INPUT TYPE=FILE NAME=pics>
    </FORM>

and the user types "Joe Blow" in the name field, and selects
a text file "file1.txt" and also an image file "file2.gif" for
the answer to 'What files are you sending?'.

The client would send back the following data:

    Content-Type: multipart/www-form-data; boundary=AaB03x

    --AaB03x
    Name: submitter

    Joe Blow
    --AaB03x
    Name: pics
    Content-Type: multipart/mixed; boundary=BbC04y

    --BbC04y
    Content-Description: file1.txt
    Content-Type: text/plain

    ... contents of file1.txt ...
    --BbC04y
    Content-Description: file2.gif
    Content-Type: image/gif

    ... contents of file2.gif ...
    --BbC04y--
    --AaB03x--

7. Conclusion

The suggested implementation gives the client a lot of flexibility in
the number and types of files it can send to the server, gives the
server control of the decision to accept the files, and gives
servers a chance to interact with browsers which do not support INPUT
TYPE "file".

The change to the HTML DTD is very simple, but very powerful. It
enables a much greater variety of services to be implemented via the
World-Wide Web than is currently possible due to the lack of a file
submission facility. This would be an extremely valuable addition to
the capabilities of the World-Wide Web.


A. Authors' Addresses

Larry Masinter                              masinter@parc.xerox.com
Xerox Palo Alto Research Center             Voice: (415) 812-4365
3333 Coyote Hill Road                       Fax:   (415) 812-4333
Palo Alto, CA 94304

Ernesto Nebel                               nebel@xsoft.sd.xerox.com
XSoft, Xerox Corporation                    Voice: (619) 676-7817
10875 Rancho Bernardo Road, Suite 200       Fax:   (619) 676-7865
San Diego, CA 92127-2116
