Contents of /webroot/www/2004/id/draft-ietf-iiir-html-01.txt

Revision 1.1
Tue Jun 15 08:04:06 2004 UTC (19 years, 11 months ago) by wakaba
Branch: MAIN
CVS Tags: HEAD
File MIME type: text/plain
New

1 Hypertext Markup Language (HTML) Tim Berners-Lee, CERN
2 Internet Draft Daniel Connolly, Atrium
3 IIIR Working Group June 1993
4
5
6 Hypertext Markup Language (HTML)
7
8 A Representation of Textual Information and MetaInformation
9 for Retrieval and Interchange
10
11
12 Status of this Document
13
14 This document is an Internet Draft. Internet Drafts are working
15 documents of the Internet Engineering Task Force (IETF), its Areas,
16 and its Working Groups. Note that other groups may also distribute
17 working documents as Internet Drafts.
18
19 Internet Drafts are working documents valid for a maximum of six
20 months. Internet Drafts may be updated, replaced, or obsoleted by
21 other documents at any time. It is not appropriate to use Internet
22 Drafts as reference material or to cite them other than as a
23 "working draft" or "work in progress".
24
25 Distribution of this document is unlimited. The document is a
26 draft form of a standard for interchange of information on the
27 network which is proposed to be registered as a MIME (RFC1341)
28 content type. Please send comments to timbl@info.cern.ch or the
29 discussion list www-talk@info.cern.ch.
30
31 This is version 1.2 of this draft. This document is available in
32 hypertext on the World-Wide Web as
33 http://info.cern.ch/hypertext/WWW/MarkUp/HTML.html
34
35 Abstract
36
37 HyperText Markup Language (HTML) can be used to represent
38
39 Hypertext news, mail, online documentation, and collaborative
40 hypermedia;
41
42 Menus of options;
43
44 Database query results;
45
46 Simple structured documents with inlined graphics.
47
48 Hypertext views of existing bodies of information
49
50 The World Wide Web (W3) initiative links related information
51 throughout the globe. HTML provides one simple format for
52 providing linked information, and all W3 compatible programs are
53 required to be capable of handling HTML. W3 uses an Internet
54
55
56
57 Berners-Lee and Connolly 1
58
59 protocol (Hypertext Transfer Protocol, HTTP), which allows transfer
60 representations to be negotiated between client and server, the
61 result being returned in an extended MIME message. HTML is
62 therefore just one, but an important one, of the representations
63 used with W3.
64
65 HTML is proposed as a MIME content type.
66
67 HTML refers to the URL specification of RFCxxxx.
68
69 Implementations of HTML parsers and generators can be found in the
70 various W3 servers and browsers, in the public domain W3 code, and
71 may also be built using various public domain SGML parsers such as
72 [SGMLS] . HTML is an SGML document type with fairly generic
73 semantics appropriate for representing information from a wide
74 range of applications. It is more generic than many specific SGML
75 applications, but is still completely device-independent.
76
77 IN THIS DOCUMENT
78
79 This document contains the following parts:
80
81 Vocabulary used in this document, degrees of imperative.
82
83 HTML and MIME with discussion of character sets.
84
85 HTML and SGML and the relationship between them, and
86 Structured text : an introduction for
87 beginners to SGML.
88
89 HTML Elements A list with description, example, and
90 typical rendering.
91
92 HTML Entities Entities used to describe characters.
93
94 The HTML DTD The text of the SGML DTD for HTML
95
96 Link relationship values .
97 A provisional list. Not part of the
98 standard.
99
100 Registration Authority
101 The authority for extending lists of valid
102 values.
103
104 References to related documents
105
106 Authors addresses Contact information.
107
110 Vocabulary
111
117 This specification uses the words below with the precise meaning
118 given.
119
120 Representation The encoding of information for interchange.
121 For example, HTML is a representation of
122 hypertext.
123
124 Rendering The form of presentation of information to
125 the human reader.
126
127 IMPERATIVES
128
129 may The implementation is not obliged to follow
130 this in any way.
131
132 must If this is not followed, the implementation
133 does not conform to this specification.
134
135 shall as "must"
136
137 should If this is not followed, though the
138 implementation officially conforms to the
139 standard, undesirable results may occur in
140 practice.
141
142 typical Typical rendering is described for many
143 elements. This is not a mandatory part of the
144 standard but is given as guidance for
145 designers and to help explain the uses for
146 which the elements were intended.
147
148 NOTES
149
150 Sections marked "Note:" are not mandatory parts of the
151 specification but for guidance only.
152
153 STATUS OF FEATURES
154
155 Mainstream All parsers must recognize these features.
156 Features are mainstream unless otherwise
157 mentioned.
158
159 Extra Standard HTML features which may safely be
160 ignored by parsers. It is legal to ignore
161 these, treat the contents as though the tags
162 were not there. (e.g. EM, and any undefined
163 elements)
164
165 Obsolete Not standard HTML. Parsers should implement
166 these features as far as possible in order to
167 preserve back-compatibility with previous
168 versions of this specification.
169
175 HTML AND MIME
176
177 The definition of the HTML content subtype is
178
179 MIME type name: text
180
181 MIME subtype name: html
182
183 Required parameters: none
184
185 Optional parameters: charset
186
187 Character sets
188
189 The base character set (the SGML BASESET) for HTML is ISO Latin-1.
190 This is the set referred to by any numeric character references .
191 The actual character set used in the representation of an HTML
192 document may be ISO Latin 1, or its 7-bit subset which is ASCII.
193 There is no obligation for an HTML document to contain any
194 characters above decimal 127. It is possible that a transport
195 medium such as electronic mail imposes constraints on the number of
196 bits in a representation of a document, though the HTTP access
197 protocol used by W3 always allows 8 bit transfer.
198
199 When an HTML document is encoded using 7-bit characters, then the
200 mechanisms of character references and entity references may be
201 used to encode characters in the upper half of the ISO Latin-1 set.
202 In this way, documents may be prepared which are suitable for
203 mailing through 7-bit limited systems.
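
The 7-bit encoding described above can be sketched as follows. This is a minimal illustration, assuming the document text is already held as a string of ISO Latin-1 characters; the function name `to_seven_bit` is hypothetical and not part of this specification, and only numeric character references are produced.

```python
def to_seven_bit(text: str) -> str:
    """Replace each character above decimal 127 with a numeric character reference."""
    out = []
    for ch in text:
        code = ord(ch)
        if code > 255:
            # Assumption: anything outside ISO Latin-1 is rejected rather than guessed at.
            raise ValueError(f"{ch!r} is outside ISO Latin-1")
        out.append(ch if code < 128 else f"&#{code};")
    return "".join(out)


print(to_seven_bit("Kurt G\u00f6del"))
```

The result contains only ASCII characters, so it should survive a 7-bit transport; a parser recovers the original character from the reference.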
204
205 INTRODUCTION
206
207 The HyperText Markup Language is defined in terms of the ISO
208 Standard Generalized Markup Language []. SGML is a system for
209 defining structured document types and markup languages to
210 represent instances of those document types.
211
212 Every SGML document has three parts:
213
214 An SGML declaration, which binds SGML processing quantities and
215 syntax token names to specific values. For example, the SGML
216 declaration in the HTML DTD specifies that the string that opens
217 an end tag is </ and the maximum length of a name is 40 characters.
218
219 A prologue including one or more document type declarations,
220 which specify the element types, element relationships and
221 attributes, and references that can be represented by markup.
222 The HTML DTD specifies, for example, that the HEAD element
223 contains at most one TITLE element.
224
225 An instance, which contains the data and markup of the document.
226
227 We use the term HTML to mean both the document type and the markup
233 language for representing instances of that document type.
234
235 All HTML documents share the same SGML declaration and prologue.
236 Hence implementations of the World Wide Web generally only transmit
237 and store the instance part of an HTML document. To construct an
238 SGML document entity for processing by an SGML parser, it is
239 necessary to prefix the text from ``HTML DTD'' on page 10 to the
240 HTML instance.
241
242 Conversely, to implement an HTML parser, one need only implement
243 those parts of an SGML parser that are needed to parse an instance
244 after parsing the HTML DTD.
245
246
247
248 Structured Text
249
250 An HTML instance is like a text file, except that some of the
251 characters are interpreted as markup. The markup gives structure to
252 the document.
253
254 The instance represents a hierarchy of elements. Each element has a
255 name , some attributes , and some content. Most elements are
256 represented in the document as a start tag, which gives the name
257 and attributes, followed by the content, followed by the end tag.
258 For example:
259
260 <HTML>
261 <TITLE>
262 A sample HTML instance
263 </TITLE>
264 <H1>
265 An Example of Structure
266 </H1>
267 Here's a typical paragraph.
268 <P>
269 <UL>
270 <LI>
271 Item one has an
272 <A NAME="anchor">
273 anchor
274 </A>
275 <LI>
276 Here's item two.
277 </UL>
278 </HTML>
279
280 Some elements (e.g. P, LI) are empty. They have no content. They
281 show up as just a start tag.
282
283 For the rest of the elements, the content is a sequence of data
284 characters and nested elements. Note that the HTML DTD severely
285 limits the amount of nesting which is allowed: most things cannot
291 be nested, and no elements may be recursively nested.
292 Anchors and character highlighting may be put inside other
293 constructs.
294
295 TAGS
296
297 Every element starts with a tag, and every non-empty element ends
298 with a tag. Start tags are delimited by < and >, and end tags are
299 delimited by </ and >.
300
301 Names
302
303 The element name immediately follows the tag open delimiter. Names
304 consist of a letter followed by up to 33 letters, digits, periods,
305 or hyphens. Names are not case sensitive.
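
As a rough illustration of the name rule above, the following sketch checks a candidate name with a regular expression and compares names without regard to case. The helper names `is_valid_name` and `same_name` are hypothetical; the limit of 33 trailing characters is taken from the text of this section.

```python
import re

# One letter, then at most 33 further letters, digits, periods, or hyphens.
NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9.-]{0,33}")


def is_valid_name(name: str) -> bool:
    """True if the whole string matches the name syntax."""
    return NAME_RE.fullmatch(name) is not None


def same_name(a: str, b: str) -> bool:
    """Names are not case sensitive, so compare them folded to one case."""
    return a.upper() == b.upper()
```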
306
307 Attributes
308
309 In a start tag, whitespace and attributes are allowed between the
310 element name and the closing delimiter. An attribute consists of a
311 name, an equal sign, and a value. Whitespace is allowed around the
312 equal sign.
313
314 The value is specified in a string surrounded by single quotes or a
315 string surrounded by double quotes. (See: other tolerated forms @@)
316
317 The string is parsed like RCDATA (see below ) to determine the
318 attribute value. This allows, for example, quote characters in
319 attribute values to be represented by character references.
320
321 The length of an attribute value (after parsing) is limited to 1024
322 characters.
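
The attribute rules above might be implemented along these lines. This is a sketch under stated assumptions: the input is the text between the element name and the closing delimiter, only quoted values are accepted, and only numeric character references are resolved (named entities are left untouched). The names `parse_attributes` and `MAX_VALUE_LENGTH` are illustrative.

```python
import re

# name, optional whitespace, "=", optional whitespace, then a quoted value
ATTR_RE = re.compile(
    r"""([A-Za-z][A-Za-z0-9.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')"""
)
CHAR_REF_RE = re.compile(r"&#(\d+);")
MAX_VALUE_LENGTH = 1024  # limit on the parsed value, from the text above


def parse_attributes(text: str) -> dict:
    """Return a dict of upper-cased attribute names to parsed values."""
    attributes = {}
    for match in ATTR_RE.finditer(text):
        name = match.group(1).upper()  # names are not case sensitive
        raw = match.group(2) if match.group(2) is not None else match.group(3)
        # Resolve numeric character references, as for RCDATA.
        value = CHAR_REF_RE.sub(lambda m: chr(int(m.group(1))), raw)
        if len(value) > MAX_VALUE_LENGTH:
            raise ValueError(f"value of {name} exceeds {MAX_VALUE_LENGTH} characters")
        attributes[name] = value
    return attributes
```

Because the length check runs after references are resolved, a value written with many character references is measured by its parsed length, not its source length.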
323
324 ELEMENT TYPES
325
326 The name of a tag refers to an element type declaration in the HTML
327 DTD. An element type declaration associates an element name with
328
329 A list of attributes and their types and statuses
330
331 A content type (one of EMPTY, CDATA, RCDATA, ELEMENT, or MIXED)
332 which determines the syntax of the element's content
333
334 A content model, which specifies the pattern of nested elements
335 and data
336
337 Empty Elements
338
339 Empty elements have the keyword EMPTY in their declaration. For
340 example:
341
342 <!ELEMENT NEXTID - O EMPTY>
343 <!ATTLIST NEXTID N NUMBER #REQUIRED>
344
349 This means that the following:
350
351 <nextid n="27">
352
353 is legal, but these others are not:
354
355 <nextid>
356 <nextid n="abc">
357
358 Character Data
359
360 The keyword CDATA indicates that the content of an element is
361 character data. Character data is all the text up to the next end
362 tag open delimiter-in-context. For example:
363
364 <!ELEMENT XMP - - CDATA>
365
366 specifies that the following text is a legal XMP element:
367
368 <xmp>Here's an example. It looks like it has
369 <tags> and <!--comments-->
370 in it, but it does not. Even this
371 </ is data.</xmp>
372
373 The string </ is only recognized as the opening delimiter of an end
374 tag when it is ``in context,'' that is, when it is followed by a
375 letter. However, as soon as the end tag open delimiter is
376 recognized, it terminates the CDATA content. The following is an
377 error:
378
379 <xmp>There is no way to represent </end> tags
380 in CDATA </xmp>
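
The "in context" rule can be sketched as a search for </ followed by a letter. The function below is illustrative only (the name `cdata_content` is an assumption); it takes the text that follows the start tag and returns the character data up to the first end tag open delimiter.

```python
import re

# </ counts as the end tag open delimiter only when a letter follows it.
END_TAG_OPEN_RE = re.compile(r"</(?=[A-Za-z])")


def cdata_content(text: str) -> str:
    """Return the character data preceding the first end tag open delimiter."""
    match = END_TAG_OPEN_RE.search(text)
    return text if match is None else text[: match.start()]
```

Applied to the erroneous example above, the content stops just before `</end>`, which is why that example cannot be parsed as intended.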
381
382 Replaceable Character Data
383
384 Elements with RCDATA content behave much like those with CDATA,
385 except for character references and entity references. Elements
386 declared like:
387
388 <!ELEMENT TITLE - - RCDATA>
389
390 can have any sequence of characters in their content.
391
392 Character References
393
394 To represent a character that would otherwise be recognized as
395 markup, use a character reference. The string &# signals a
396 character reference when it is followed by a letter or a digit. The
397 delimiter is followed by the decimal character number and a
398 semicolon. For example:
399
400 <title>You can even represent &#60;/end> tags in RCDATA </title>
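
A parser might resolve such references roughly as follows. This sketch handles only the decimal form shown above, and the function name `resolve_character_references` is hypothetical.

```python
import re

# &# followed by decimal digits and a semicolon
CHAR_REF_RE = re.compile(r"&#(\d+);")


def resolve_character_references(text: str) -> str:
    """Replace each decimal character reference with the character it names."""
    return CHAR_REF_RE.sub(lambda m: chr(int(m.group(1))), text)
```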
401
407 Entity References
408
409 The HTML DTD declares entities for the less than, greater than, and
410 ampersand characters and each of the ISO Latin 1 characters so that
411 you can reference them by name rather than by number.
412
413 The string & signals an entity reference when it is followed by a
414 letter or a digit. The delimiter is followed by the entity name and
415 a semicolon. For example:
416
417 Kurt G&ouml;del was a famous logician and mathematician.
418
419 Note: To be sure that a string of characters has
420 no markup, HTML writers should represent all
421 occurrences of <, >, and & by character or
422 entity references.
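
The advice in the note can be followed mechanically. The sketch below assumes the entity names `amp`, `lt`, and `gt` for the three characters; the function name `escape_markup` is illustrative.

```python
def escape_markup(text: str) -> str:
    """Represent &, <, and > by entity references so the text contains no markup."""
    # The ampersand must be replaced first, or the other references would be re-escaped.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
```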
423
424 Element Content
425
426 Some elements have, instead of a keyword that states the type of
427 content, a content model, which tells what patterns of data and
428 nested elements are allowed. If the content model of an element
429 does not include the symbol #PCDATA , the content is element
430 content.
431
432 Whitespace in element content is considered markup and ignored. Any
433 characters that are not markup, that is, data characters, are
434 illegal.
435
436 For example:
437
438 <!ELEMENT HEAD - - (TITLE? & ISINDEX? & NEXTID? & LINK*)>
439
440 declares an element that may be used as follows:
441
442 <head>
443 <isindex>
444 <title>Head Example</title>
445 </head>
446
447 But the following are illegal:
448
449 <head> no data allowed! </head>
450 <head><isindex><title>Two isindex tags</title><isindex></head>
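
The HEAD content model above can be checked with a simple count of child element names. This is a sketch, not a general content-model engine: it assumes the parser has already produced the list of child names, and the names `head_content_is_valid`, `AT_MOST_ONCE`, and `ANY_NUMBER` are hypothetical. Data characters are not modelled here.

```python
from collections import Counter

AT_MOST_ONCE = {"TITLE", "ISINDEX", "NEXTID"}  # declared with "?"
ANY_NUMBER = {"LINK"}                           # declared with "*"


def head_content_is_valid(child_names) -> bool:
    """Check child element names against (TITLE? & ISINDEX? & NEXTID? & LINK*)."""
    counts = Counter(name.upper() for name in child_names)
    for name, count in counts.items():
        if name in AT_MOST_ONCE:
            if count > 1:
                return False
        elif name not in ANY_NUMBER:
            return False
    return True
```

Order is not checked because the & connector allows the members in any order.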
451
452 Mixed Content
453
454 If the content model includes the symbol #PCDATA, the content of
455 the element is parsed as mixed content. For example:
456
457 <!ELEMENT PRE - - (#PCDATA | A | B | I | U | P)+>
458 <!ATTLIST PRE
459 WIDTH NUMBER #implied
465 >
466
467 This says that the PRE element contains one or more A, B, I, U, or
468 P elements or data characters. Here's an example of a PRE element:
469
470 <pre>
471 <b>NAME</b>
472 cat -- concatenate<a href="terms.html#file">files</a>
473 <b>EXAMPLE</b>
474 cat <xyz
475 </pre>
476
477 The content of the above PRE element is:
478
479 A B element
480
481 The string `` cat -- concatenate''
482
483 An A element
484
485 The string ``\n''
486
487 Another B element
488
489 The string ``\n cat <xyz''
490
491 COMMENTS AND OTHER MARKUP
492
493 To include comments in an HTML document that will be ignored by the
494 parser, surround them with <!-- and -->. After the comment
495 delimiter, all text up to the next occurrence of -- is ignored.
496 Hence comments cannot be nested. Whitespace is allowed between the
497 closing -- and >. (But not between the opening <! and --.)
498
499 For example:
500
501 <HEAD>
502 <TITLE>HTML Guide: Recommended Usage</TITLE>
503 <!-- $Id: HTML.txt,v 1.2 1994/04/12 23:13:42 connolly Exp $ -->
504 </HEAD>
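
Comment removal as described above might look like this. The sketch is lenient: it accepts whitespace between the closing -- and >, but it does not report an error when other text follows a -- inside the comment. The name `strip_comments` is illustrative.

```python
import re

# <!-- ... -- optional whitespace >, matched non-greedily across lines
COMMENT_RE = re.compile(r"<!--.*?--\s*>", re.DOTALL)


def strip_comments(text: str) -> str:
    """Remove every comment from the text."""
    return COMMENT_RE.sub("", text)
```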
505
506 There are a few other SGML markup constructs that are deprecated or
507 illegal.
508
509 Delimiter Signals...
510
511 <? Processing instruction. Terminated by >.
512
513 <![ Marked section. Marked sections are
514 deprecated. See the SGML standard for
515 complete information.
516
517 <! Markup declaration. HTML defines no short
523 reference maps, so these are errors.
524 Terminated by >.
525
526 LINE BREAKS
527
528 A line break character is considered markup (and ignored) if it is
529 the first or last piece of content in an element. This allows you
530 to write either
531
532 <PRE>some example text</pre>
533
534 or
535
536 <pre>
537 some example text
538 </pre>
539
540 and these will be processed identically.
541
542 Also, a line that's not empty but contains no content will be
543 ignored altogether. For example, the element
544
545 <pre>
546 <!-- this line is ignored, including the linebreak character -->
547 first line
548
549 third line<!-- the following linebreak is content: -->
550 fourth line<!-- this one's ignored because it's the last piece of content: -->
552 </pre>
553
554 contains only the strings
555
556
557 first line
558
559 third line
560 fourth line.
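
The first rule of this section (a line break at the very start or end of an element's content is ignored) can be sketched as below. The helper `trim_edge_line_breaks` is hypothetical and does not cover the second rule about lines that hold only markup.

```python
def trim_edge_line_breaks(content: str) -> str:
    """Drop one line break at the start and one at the end of element content."""
    if content.startswith("\n"):
        content = content[1:]
    if content.endswith("\n"):
        content = content[:-1]
    return content
```

With this rule the two ways of writing the PRE element shown earlier yield the same content.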
561
562 SPACES AND TABS
563
564 Space characters must be rendered as horizontal white space. In
565 HTML, multiple spaces should be rendered as proportionally larger
566 spaces.
567
568 The rendering of a horizontal tab (HT) character is not defined,
569 and HT should therefore not be used, except within a PRE (or
570 obsolete XMP, LISTING or PLAINTEXT) element.
571
572 Neither spaces nor tabs should be used to make SGML source layout
573 more attractive or easier to read.
574
575 SUMMARY OF MARKUP SIGNALS
576
581 The following delimiters may signal markup, depending on context.
582
583 Delimiter Signals
584
585 <!-- Comment
586
587 &# Character reference
588
589 & Entity reference
590
591 </ End tag
592
593 <! Markup declaration
594
595 ]]> Marked section close (an error)
596
597 < Start tag
598
599 HTML ELEMENTS
600
601 This is a list of elements used in the HTML language. Documents
602 should (but need not absolutely) contain an initial HEAD element
603 followed by a BODY element.
604
605 Old style documents may contain just the contents of the normal
606 HEAD and BODY elements, in any order. This is deprecated but must
607 be supported by parsers.
608
609 See also: Status of elements
610
611 Properties of the whole document
612
613 Properties of the whole document are defined by the following
614 elements. They should appear within the HEAD element. Their order
615 is not significant.
616
617 TITLE The title of the document
618
619 ISINDEX Sent by a server in a searchable document
620
621 NEXTID A parameter used by editors to generate
622 unique identifiers
623
624 LINK Relationship between this document and
625 another. See also the Anchor element ,
626 Relationships . A document may have many
627 LINK elements.
628
629 BASE A record of the URL of the document when
630 saved
631
632 Text formatting
633
639 These are elements which occur within the BODY element of a
640 document. Their order is the logical order in which the elements
641 should be rendered on the output device.
642
643 Headings Several levels of heading are supported.
644
645 Anchors Sections of text which form the beginning
646 and/or end of hypertext links are called
647 "anchors" and defined by the A tag.
648
649 Paragraph marks The P element marks the break between two
650 paragraphs.
651
652 Address style An ADDRESS element is displayed in a
653 particular style.
654
655 Blockquote style A block of text quoted from another source.
656
657 Lists Bulleted lists, glossaries, etc.
658
659 Preformatted text Sections in fixed-width font for
660 preformatted text.
661
662 Character highlighting
663 Formatting elements which do not cause
664 paragraph breaks.
665
666 Graphics
667
668 IMG The IMG tag allows inline graphics.
669
670 Obsolete elements
671
672 The other elements are obsolete but should be recognised by parsers
673 for back-compatibility.
674
675 HEAD
676
677 The HEAD element contains all information about the document in
678 general. It does not contain any text which is part of the
679 document: this is in the BODY. Within the head element, only
680 certain elements are allowed.
681
682 BODY
683
684 The BODY element contains all the information which is part of the
685 document, as opposed to information about the document which is in the
686 HEAD .
687
688 The elements within the BODY element are in the order in which they
689 should be presented to the reader.
690
691 See the list of things which are allowed within a BODY element .
692
697 Anchors
698
699 An anchor is a piece of text which marks the beginning and/or the
700 end of a hypertext link.
701
702 The text between the opening tag and the closing tag is either the
703 start or destination (or both) of a link. Attributes of the anchor
704 tag are as follows.
705
706 HREF OPTIONAL. If the HREF attribute is present,
707 the anchor is sensitive text: the start of a
708 link. If the reader selects this text, (s)he
709 should be presented with another document
710 whose network address is defined by the value
711 of the HREF attribute . The format of the
712 network address is specified elsewhere . This
713 allows for the form HREF="#identifier" to
714 refer to another anchor in the same document.
715 If the anchor is in another document, the
716 attribute is a relative name , relative to
717 the document's address (or specified base
718 address if any).
719
720 NAME OPTIONAL. If present, the attribute NAME
721 allows the anchor to be the destination of a
722 link. The value of the attribute is an
723 identifier for the anchor. Identifiers are
724 arbitrary strings but must be unique within
725 the HTML document. Another document can
726 then make a reference explicitly to this
727 anchor by putting the identifier after the
728 address, separated by a hash sign .
729
730 REL OPTIONAL. An attribute REL may give the
731 relationship (s) described by the hypertext
732 link. The value is a comma-separated list of
733 relationship values. Values and their
734 semantics will be registered by the HTML
735 registration authority . The default
736 relationship if none other is given is void.
737 REL should not be present unless HREF is
738 present. See Relationship values , REV .
739
740 REV OPTIONAL. The same as REL , but the
741 semantics of the link type are in the reverse
742 direction. A link from A to B with REL="X"
743 expresses the same relationship as a link
744 from B to A with REV="X". An anchor may
745 have both REL and REV attributes.
746
747 URN OPTIONAL. If present, this specifies a
748 uniform resource number for the document. See
749 note .
750
755 TITLE OPTIONAL. This is informational only. If
756 present the value of this field should equal
757 the value of the TITLE of the document whose
758 address is given by the HREF attribute. See
759 note .
760
761 METHODS OPTIONAL. The value of this field is a
762 string which if present must be a comma
763 separated list of HTTP METHODS supported by
764 the object for public use. See note .
765
766 All attributes are optional, although one of NAME and HREF is
767 necessary for the anchor to be useful. See also: LINK .
768
769 EXAMPLE OF USE:
770
771 See <A HREF="http://info.cern.ch/">CERN</A>'s information for
772 more details.
773
774 A <A NAME=serious>serious</A> crime is one which is associated
775 with imprisonment.
776 ...
777 The Organization may refuse employment to anyone convicted
778 of a <a href="#serious">serious</A> crime.
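
How a client might turn an HREF value into a document address plus an anchor identifier is sketched below. It relies on Python's `urllib.parse`, whose resolution rules follow later URL specifications rather than this draft, so treat it as an approximation; the function name `resolve_href` is an assumption.

```python
from urllib.parse import urldefrag, urljoin


def resolve_href(document_address: str, href: str):
    """Return (address, identifier) for an HREF found in the given document."""
    # Relative names are resolved against the address of the containing document.
    absolute = urljoin(document_address, href)
    # The identifier after the hash sign names an anchor within the target.
    address, identifier = urldefrag(absolute)
    return address, identifier
```

For HREF="#serious" the address is that of the current document, so the client can jump to the named anchor without fetching anything.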
779
780
781 NOTE : UNIVERSAL RESOURCE NUMBERS
782
783 URNs are provided to allow a document to be recognized if duplicate
784 copies are found. This should save a client implementation from
785 picking up a copy of something it already has.
786
787 The format of URNs is under discussion (1993) by various working
788 groups of the Internet Engineering Task Force.
789
790 NOTE: TITLE ATTRIBUTE OF LINKS
791
792 The link may carry a TITLE attribute which should if present give
793 the title of the document whose address is given by the HREF
794 attribute.
795
796 This is useful for at least two reasons
797
798 The browser software may chose to display the title of the
799 document as a preliminary to retrieving it, for example as a
800 margin note or on a small box while the mouse is over the
801 anchor, or during document fetch.
802
803 Some documents -- mainly those which are not marked up text,
804 such as graphics, plain text and also Gopher menus, do not come
805 with a title themselves, and so putting a title in the link is
806 the only way to give them a title. This is how Gopher works.
807 Obviously it leads to duplication of data, and so it is
813 dangerous to assume that the title attribute of the link is a
814 valid and unique title for the destination document.
815
816 NOTE: METHODS ATTRIBUTE OF LINKS
817
818 The METHODS attributes of anchors and links are used to provide
819 information about the functions which the user may perform on an
820 object. These are more accurately given by the HTTP protocol when
821 it is used, but it may, for similar reasons as for the TITLE
822 attribute, be useful to include the information in advance in the
823 link.
824
825 For example, the browser may choose a different rendering as a
826 function of the methods allowed (for example something which is
827 searchable may get a different icon).
828
829 Address
830
831 This element is for address information, signatures, authorship,
832 etc, often at the top or bottom of a document.
833
834 TYPICAL RENDERING
835
836 Typically, an address element is italic and/or right justified or
837 indented. The address element implies a paragraph break. Paragraph
838 marks within the address element do not cause extra white space to
839 be inserted.
840
841 EXAMPLES OF USE:
842
843 <ADDRESS><A HREF="Author.html">A.N.Other</A></ADDRESS>
844
845
846 <ADDRESS>
847 Newsletter editor<p>
848 J.R. Brown<p>
849 JimquickPost News, Jumquick, CT 01234<p>
850 Tel (123) 456 7890
851 </ADDRESS>
852
853
854 BASE
855
856 This element allows the URL of the document itself to be recorded
857 in situations in which the document may be read out of context.
858 URLs within the document may be in a "partial" form relative to
859 this base address.
860
861 Where the base address is not specified, the reader will use the
862 URL it used to access the document to resolve any relative URLs.
863
864 The one attribute is:
865
871 HREF the URL
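
The choice of base address described above can be sketched as follows. The function name `resolve_url` and the `example.org` address are hypothetical, and `urllib.parse.urljoin` applies later URL resolution rules than this draft refers to.

```python
from typing import Optional
from urllib.parse import urljoin


def resolve_url(partial: str, accessed_url: str, base_href: Optional[str] = None) -> str:
    """Resolve a partial URL against BASE if given, else the URL used for access."""
    base = base_href if base_href is not None else accessed_url
    return urljoin(base, partial)
```

A document saved locally but carrying a BASE element therefore still resolves its partial URLs against its original location.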
872
873 BLOCKQUOTE
874
875 The BLOCKQUOTE element allows text quoted from another source to be
876 rendered specially.
877
878 TYPICAL RENDERING
879
880 A typical rendering might be a slight extra left and right indent,
881 and/or italic font. BLOCKQUOTE causes a paragraph break, and
882 typically a line or so of white space will be allowed between it
883 and any text before or after it.
884
885 Single-font rendition may for example put a vertical line of ">"
886 characters down the left margin to indicate quotation in the
887 Internet mail style.
888
889 EXAMPLE
890
891 I think it ends
892 <BLOCKQUOTE>Soft you now, the fair Ophelia. Nymph, in thy orisons,
893 be all my sins remembered.
894 </BLOCKQUOTE>
895 but I am not sure.
896
897 Headings
898
899 Six levels of heading are supported. (Note that a hypertext node
900 within a hypertext work tends to need fewer levels of heading than
901 a work whose only structure is given by the nesting of headings.)
902
903 A heading element implies all the font changes, paragraph breaks
904 before and after, and white space (for example) necessary to render
905 the heading. Further character emphasis or paragraph marks are not
906 required in HTML.
907
908 H1 is the highest level of heading, and is recommended for the
909 start of a hypertext node. It is suggested that the text of
910 the first heading be suitable for a reader who is already browsing
911 in related information, in contrast to the title tag which should
912 identify the node in a wider context.
913
914 The heading elements are
915
916 <H1>, <H2>, <H3>, <H4>, <H5>, <H6>
917
918 It is not normal practice to jump from one header to a header level
919 more than one below, for example to follow an H1 with an H3.
920 Although this is legal, it is discouraged, as it may produce
921 strange results for example when generating other representations
922 from the HTML.
923
929 EXAMPLE:
930
931 <H1>This is a heading</H1>
932 Here is some text
933 <H2>Second level heading</H2>
934 Here is some more text.
935
936 PARSER NOTE:
937
938 Parsers should not require any specific order to heading elements,
939 even if the heading level increases by more than one between
940 successive headings.
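
An authoring tool, as opposed to a parser, could warn about the discouraged jumps with a check like this one. It is a sketch only; `skipped_heading_levels` is a hypothetical name, and the input is assumed to be the list of heading levels in document order.

```python
def skipped_heading_levels(levels):
    """Return (previous, current) pairs where the level drops by more than one."""
    skips = []
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            skips.append((previous, current))
    return skips
```

Returning to a higher-level heading (for example H3 followed by H1) is not reported, since only jumps to a level more than one below are discouraged.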
941
942 TYPICAL RENDERING
943
944 H1 Bold very large font, centered. One or two
945 lines clear space between this and anything
946 following. If printed on paper, start new
947 page.
948
949 H2 Bold, large font, flush left against left
950 margin, no indent. One or two clear lines
951 above and below.
952
953 H3 Italic, large font, slightly indented from
954 the left margin. One or two clear lines above
955 and below.
956
957 H4 Bold, normal font, indented more than H3.
958 One clear line above and below.
959
960 H5 Italic, normal font, indented as H4. One
961 clear line above.
962
963 H6 Bold, indented same as normal text, more
964 than H5. One clear line above.
965
966 These typical values are just an indication, and it is up to the
967 designer of the presentation software to define the styles. The
968 reader may have options to customize these. When writing
969 documents, you should assume that whatever rendering is used is
970 designed to have the same sort of effect as the styles above.
971
972 The rendering software is responsible for generating suitable
973 vertical white space between elements, so it is NOT normal or
974 required to follow a heading element with a paragraph mark.
975
976 IMG: Embedded Images
977
978 Status: Extra
979
980 The IMG element allows another document to be inserted inline. The
981 document is normally an icon or small graphic, etc. This element is
987 NOT intended for embedding other HTML text.
988
989 Browsers which are not able to display inline images ignore IMG
990 elements. Authors should note that some browsers will be able to
991 display (or print) linked graphics but not inline graphics. If the
992 graphic is essential, it may be wiser to make a link to it rather
993 than to put it inline. If the graphic is essentially decorative,
994 then IMG is appropriate.
995
996 The IMG element is empty: it has no closing tag. It has three
997 attributes:
998
999 SRC The value of this attribute is the URL of
1000 the document to be embedded. Its syntax is
1001 the same as that of the HREF attribute of the
1002 A tag. SRC is mandatory.
1003
1004 ALIGN Takes the value TOP, MIDDLE, or BOTTOM,
1005 defining whether the tops, middles, or
1006 bottoms of the graphics and text should be
1007 aligned vertically.
1008
1009 ALT Optional alternative text as an alternative
1010 to the graphics for display in text-only
1011 environments.
1012
1013 Note that IMG elements are allowed within anchors.
1014
1015 EXAMPLE
1016
1017 Warning: <IMG SRC="triangle.gif" ALT="Warning:"> This must be done by a
1019 qualified technician.
1020
1021 <A HREF="Go"><IMG SRC="Button"> Press to start</A>
1022
1023
1024
1025
1026
1027 ISINDEX
1028
1029 This element informs the reader that the document is an index
1030 document. As well as reading it, the reader may use a keyword
1031 search.
1032
1033 The node may be queried with a keyword search by suffixing the node
1034 address with a question mark, followed by a list of keywords
1035 separated by plus signs. See the network address format.
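
As an illustration only, the query address described above could be
built as in the following Python sketch. The function name
isindex_query_url and the example address are assumptions made for
this illustration. No escaping of special characters inside keywords
is attempted, as that is not specified here.

```python
def isindex_query_url(node_address, keywords):
    """Suffix a node address with '?' and a plus-separated keyword list."""
    # With no keywords there is nothing to search for, so the address
    # is returned unchanged (an assumption of this sketch).
    if not keywords:
        return node_address
    return node_address + "?" + "+".join(keywords)


# Hypothetical index document address, used only for demonstration.
print(isindex_query_url("http://example.org/index", ["hypertext", "markup"]))
```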
1036
1037 Note that this tag is normally generated automatically by a server.
1038 If it is added by hand to an HTML document, then the client will
1039 assume that the server can handle a search on the document.
1040
1045 Obviously the server must have this capability for it to work:
1046 simply adding <ISINDEX> in the document is not enough to make
1047 searches happen if the server does not have a search engine!
1048
1049 Status: standard.
1050
1051 EXAMPLE OF USE:
1052
1053 <ISINDEX>
1054
1055 LINK
1056
1057 The LINK element occurs within the HEAD element of an HTML
1058 document. It is used to indicate a relationship between the
1059 document and some other object. A document may have any number of
1060 LINK elements.
1061
1062 The LINK element is empty, but takes the same attributes as the
1063 anchor element.
1064
1065 Typical uses are to indicate authorship, related indexes and
1066 glossaries, older or more recent versions, etc. Links can indicate
1067 a static tree structure in which the document was authored by
1068 pointing to a "parent" and "next" and "previous" document, for
1069 example.
1070
1071 Servers may also allow links to be added by those who do not have
1072 the right to alter the body of a document.
1073
1074 Forms of list in HTML
1075
1076 GLOSSARIES
1077
1078 A glossary (or definition list) is a list of paragraphs each of
1079 which has a short title alongside it. Apart from glossaries, this
1080 element is useful for presenting a set of named elements to the
1081 reader. The elements within a glossary are
1082
1083 DT The "term", typically placed in a wide left
1084 indent
1085
1086 DD The "definition", which may wrap onto many
1087 lines
1088
1089 These elements must appear in pairs. Single occurrences of DT
1090 without a following DD are illegal. The one attribute which DL can
1091 take is
1092
1093 COMPACT suggests that a compact rendering be used,
1094 because the enclosed elements are
1095 individually small, or the whole glossary is
1096 rather large, or both.
1097
1103 Typical rendering
1104
1105 The definition list DT, DD pairs are arranged vertically. For
1106 each pair, the DT element is on the left, in a column of about a
1107 third of the display area, and the DD element is in the right hand
1108 two thirds of the display area. The DT term is normally small
1109 enough to fit on one line within the left-hand column. If it is
1110 longer, it will either extend across the page, in which case the DD
1111 section is moved down to separate them, or it is wrapped onto
1112 successive lines of the left hand column.
1113
1114 White space is typically left between successive DT,DD pairs unless
1115 the COMPACT attribute is given. The COMPACT attribute is
1116 appropriate for lists which are long and/or have DT,DD pairs which
1117 each take only a line or two. It is of course possible for the
1118 rendering software to discover these cases itself and make its own
1119 decisions, and this is to be encouraged.
1120
1121 The COMPACT attribute may also reduce the width of the left-hand
1122 (DT) column.
1123
1124 Examples of use
1125
1126 <DL>
1127 <DT>Term the first<DD>definition paragraph is reasonably
1128 long but is still displayed clearly
1129 <DT>Term2 follows<DD>Definition of term2
1130 </DL>
1131
1132 <DL COMPACT>
1133 <DT>Term<DD>definition paragraph
1134 <DT>Term2<DD>Definition of term2
1135 </DL>
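
The pairing rule for DT and DD could be checked mechanically. The
following Python sketch uses the standard html.parser module. The
names _TermCollector and dt_dd_paired are assumptions made for this
illustration, and nested glossaries are not treated separately.

```python
from html.parser import HTMLParser


class _TermCollector(HTMLParser):
    """Records the order in which DT and DD tags occur."""

    def __init__(self):
        super().__init__()
        self.sequence = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag in ("dt", "dd"):
            self.sequence.append(tag)


def dt_dd_paired(markup):
    """Return True if every DT is immediately followed by one DD."""
    collector = _TermCollector()
    collector.feed(markup)
    collector.close()
    seq = collector.sequence
    return len(seq) % 2 == 0 and all(
        tag == ("dt" if i % 2 == 0 else "dd") for i, tag in enumerate(seq)
    )


print(dt_dd_paired("<DL COMPACT><DT>Term<DD>definition paragraph</DL>"))
```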
1136
1137
1138
1139
1140 LISTS
1141
1142 A list is a sequence of paragraphs, each of which may be preceded
1143 by a special mark or sequence number. The syntax is:
1144
1145
1146 <UL>
1147 <LI> list element
1148 <LI> another list element ...
1149 </UL>
1150
1151 The opening list tag may be any of UL, OL, MENU or DIR. It must
1152 be immediately followed by the first list element.
1153
1154 Typical rendering
1155
1161 The representation of the list is not defined here, but a bulleted
1162 list for unordered lists, and a sequence of numbered paragraphs
1163 for an ordered list would be quite appropriate. Other possibilities
1164 for interactive display include embedded scrollable browse panels.
1165
1166 List elements with typical rendering are:
1167
1168 UL A list of multi-line paragraphs, typically
1169 separated by some white space and/or marked
1170 by bullets, etc.
1171
1172 OL As UL, but the paragraphs are typically
1173 numbered in some way to indicate the order as
1174 significant.
1175
1176 MENU A list of smaller paragraphs. Typically one
1177 line per item, with a style more compact than
1178 UL.
1179
1180 DIR A list of short elements, typically less
1181 than 20 characters. These may be arranged in
1182 columns across the page, typically 24
1183 characters in width. If the rendering software
1184 is able to optimize the column width as a
1185 function of the widths of individual
1186 elements, so much the better.
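
One possible column arrangement for DIR is sketched below in Python.
The function name dir_rows and the page width of 72 characters are
assumptions made for this illustration; only the column width of 24
comes from the description above. Items longer than the column width
are not handled specially.

```python
def dir_rows(items, column_width=24, page_width=72):
    """Lay short items out in fixed-width columns across the page."""
    # Number of whole columns that fit on one line (at least one).
    per_row = max(1, page_width // column_width)
    rows = []
    for start in range(0, len(items), per_row):
        row = items[start:start + per_row]
        # Pad each item to the column width, then trim the line end.
        rows.append("".join(item.ljust(column_width) for item in row).rstrip())
    return rows


for line in dir_rows(["A-H", "I-M", "M-R", "S-Z"]):
    print(line)
```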
1187
1188 Example of use
1189
1190 <OL>
1191 <LI> When you get to the station, leave
1192 by the southern exit, on platform one.
1193 <LI>Turn left to face toward the mountain
1194 <LI>Walk for a mile or so until you reach the
1195 "Asquith Arms" then
1196 <LI>Wait and see...
1197 </OL>
1198
1199 <MENU>
1200 <LI>The oranges should be pressed fresh
1201 <LI>The nuts may come from a packet
1202 <LI>The gin must be good quality
1203 </MENU>
1204
1205 <DIR>
1206 <LI>A-H<LI>I-M
1207 <LI>M-R<LI>S-Z
1208 </DIR>
1209
1210
1211
1212 Next ID
1213
1219 This tag takes a single attribute, N, which gives the next
1220 document-wide numeric identifier to be allocated, of the form z123.
1221
1222 When modifying a document, old anchor ids should not be reused, as
1223 there may be references stored elsewhere which point to them. The
1224 tag is read and generated by hypertext editors. Human writers of HTML
1225 usually use mnemonic alphabetical identifiers. Browser software may
1226 ignore this tag.
1227
1228 EXAMPLE OF USE:
1229
1230 <NEXTID N=27>
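
A hypertext editor might advance the identifier as in the following
Python sketch. The function name next_identifier is an assumption
made for this illustration, as is the treatment of the value as an
optional run of letters followed by digits. Leading zeros in the
numeric part are not preserved.

```python
import re


def next_identifier(current):
    """Increment the numeric part of an identifier such as 'z123'."""
    match = re.fullmatch(r"([A-Za-z]*)([0-9]+)", current)
    if match is None:
        raise ValueError("identifier has no trailing numeric part: %r" % current)
    prefix, digits = match.groups()
    # Only the number changes; the alphabetic prefix is kept as is.
    return prefix + str(int(digits) + 1)


print(next_identifier("z123"))
```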
1231
1232
1233 P: Paragraph mark
1234
1235 The empty P element indicates a paragraph break. The exact
1236 rendering of this (indentation, leading, etc) is not defined here,
1237 and may be a function of other tags, style sheets etc.
1238
1239 <P> is used between two pieces of text which otherwise would be
1240 flowed together.
1241
1242 You do NOT need to use <P> to put white space around heading,
1243 list, address or blockquote elements which imply a paragraph break.
1244 It is the responsibility of the rendering software to generate that
1245 white space. A paragraph mark which is preceded or followed by
1246 an element which implies a paragraph break has undefined effect
1247 and should be avoided.
1248
1249 TYPICAL RENDERING
1250
1251 Typically, <P> will generate a small vertical space (of a line or
1252 half a line) between the paragraphs. This is not the case
1253 (typically) within ADDRESS or (ever) within PRE elements. With
1254 some implementations, in normal text, <P> may generate a small
1255 extra left indent on the first line.
1256
1257 EXAMPLES OF USE
1258
1259 <h1>What to do</h1>
1260 This is one paragraph.<p>This is a second.
1261 <P>
1262 This is a third.
1263
1264 BAD EXAMPLE
1265
1266 <h1><P>What not to do</h1>
1267 <p>I found that on my XYZ browser it looked prettier to
1268 me if I put some paragraph marks
1269 <p>
1270 <ul><p><li>Around lists, and
1271 <li>After headings.
1277 </ul>
1278 <p>
1279 None of the paragraph marks in this example should
1280 be there.
1281
1282
1283
1284
1285 PRE: Preformatted text
1286
1287 Preformatted elements in HTML are displayed with text in a fixed
1288 width font, and so are suitable for text which has been formatted
1289 for a teletype by some existing formatting system.
1290
1291
1292
1293 The optional attribute is:
1294
1295 WIDTH This attribute gives the maximum number of
1296 characters which will occur on a line. It
1297 allows the presentation system to select a
1298 suitable font and indentation. Where the
1299 WIDTH attribute is not recognized, it is
1300 recommended that a width of 80 be assumed.
1301 Where WIDTH is supported, it is recommended
1302 that at least widths of 40, 80 and 132
1303 characters be presented optimally, with other
1304 widths being rounded up.
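
The recommendation above might be applied as in the following Python
sketch. The function name presentation_width is an assumption made
for this illustration. Passing None stands for a WIDTH attribute
which is absent or not recognized. The handling of widths above 132,
which are returned unchanged, is also an assumption, as that case is
not covered above.

```python
SUPPORTED_WIDTHS = (40, 80, 132)


def presentation_width(width=None):
    """Choose the line width to present for a PRE element."""
    if width is None:
        # WIDTH absent or not recognized: assume 80.
        return 80
    for candidate in SUPPORTED_WIDTHS:
        # Round up to the nearest optimally presented width.
        if width <= candidate:
            return candidate
    # Assumption of this sketch: wider text keeps its requested width.
    return width


print(presentation_width(60))
```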
1305
1306 Within a PRE element,
1307
1308 Line boundaries within the text are rendered as a move to the
1309 beginning of the next line, except for one immediately following
1310 or immediately preceding a tag.
1311
1312 The <p> tag should not be used. If found, it should be rendered
1313 as a move to the beginning of the next line.
1314
1315 Anchor elements and character highlighting elements may be used.
1316
1317 Elements which define paragraph formatting (Headings, Address,
1318 etc) must not be used.
1319
1320 The ASCII Horizontal Tab (HT) character must be interpreted as
1321 the smallest positive nonzero number of spaces which will leave
1322 the number of characters so far on the line as a multiple of 8.
1323 Its use is not recommended however.
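
The tab rule above can be sketched in Python as follows. The
function name expand_tabs is an assumption made for this
illustration. For a single line the result should agree with the
built-in str.expandtabs(8).

```python
def expand_tabs(line, tab_stop=8):
    """Replace each HT with spaces up to the next multiple of tab_stop."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            # Always at least one space, at most tab_stop spaces.
            spaces = tab_stop - (column % tab_stop)
            out.append(" " * spaces)
            column += spaces
        else:
            out.append(ch)
            column += 1
    return "".join(out)


print(repr(expand_tabs("ab\tc")))
```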
1324
1325 Example of use
1326
1327 <PRE WIDTH="80">
1328 This is an example line
1329 </PRE>
1330
1335 Note: Highlighting
1336
1337 Within a preformatted element, the constraint that the rendering
1338 must be on a fixed horizontal character pitch may limit or prevent
1339 the ability of the renderer to render highlighting elements
1340 specially.
1341
1342 Note: Margins
1343
1344 The above references to the "beginning of a new line" must not be
1345 taken as implying that the renderer is forbidden from using a
1346 (constant) left indent for rendering preformatted text. The left
1347 indent may of course be constrained by the width required.
1348
1349 TITLE
1350
1351 The title of a document is specified by the TITLE element. The
1352 TITLE element should occur in the HEAD of the document.
1353
1354 There may only be one title in any document. It should identify the
1355 content of the document in a fairly wide context.
1356
1357 The title is not part of the text of the document, but is a
1358 property of the whole document. It may not contain anchors,
1359 paragraph marks, or highlighting. The title may be used to identify
1360 the node in a history list, to label the window displaying the
1361 node, etc. It is not normally displayed in the text of a document
1362 itself. Contrast titles with headings. The title should ideally
1363 be less than 64 characters in length, because many applications
1364 will display document titles in window titles, menus, etc., where
1365 there is only limited room. Whilst there is no limit on the length
1366 of a title (as it may be automatically generated from other data),
1367 information providers are warned that it may be truncated if long.
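
An authoring tool could warn about long titles with a check such as
the following Python sketch. The function name title_is_short_enough
is an assumption made for this illustration, and the limit of 64 is
the guideline given above rather than a hard constraint.

```python
def title_is_short_enough(title, limit=64):
    """Return True if the title is under the suggested length."""
    return len(title) < limit


print(title_is_short_enough("Introduction -- AFS user's Guide"))
```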
1368
1369 Examples of use
1370
1371 Appropriate titles might be
1372
1373 <TITLE>Rivest and Neuman. 1989(b)</TITLE>
1374
1375 or
1376
1377 <TITLE>A Recipe for Maple Syrup Flap-Jack</TITLE>
1378
1379 or
1380
1381 <TITLE>Introduction -- AFS user's Guide</TITLE>
1382
1383 Examples of inappropriate titles are those which are only
1384 meaningful within context,
1385
1386 <TITLE>Introduction</TITLE>
1387
1393 or too long,
1394
1395 <TITLE>Remarks on the Quantum-Gravity effects of "Bean
1396 Pole" diversification in Mononucleosis patients in Developing
1397 Countries under Economic Conditions Prevalent during
1398 the Second half of the Twentieth Century, and Related Papers:
1399 a Summary</TITLE>
1400
1401
1402
1403
1404 Character highlighting
1405
1406 Status: Extra
1407
1408 These elements allow sections of text to be formatted in a
1409 particular way, to provide emphasis, etc. The tags do NOT cause a
1410 paragraph break, and may be used on sections of text within
1411 paragraphs.
1412
1413 Where not supported by implementations, like all tags, these tags
1414 should be ignored but the content rendered.
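
The behaviour of an implementation which supports none of these tags
can be sketched in Python with the standard html.parser module. The
names _TextOnly and render_ignoring_tags are assumptions made for
this illustration. Note that the sketch ignores every tag, not only
the highlighting tags.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects character data and ignores all tags."""

    def __init__(self):
        super().__init__()
        self.pieces = []

    def handle_data(self, data):
        self.pieces.append(data)


def render_ignoring_tags(markup):
    """Return the content of the markup with the tags dropped."""
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.pieces)


print(render_ignoring_tags("This is <EM>emphasized</EM> text."))
```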
1415
1416 All these tags have related closing tags, as in
1417
1418 This is <EM>emphasized</EM> text.
1419
1420 Some of these styles are more explicit than others about how they
1421 should be physically represented. The logical styles should be
1422 used wherever possible, unless for example it is necessary to refer
1423 to the formatting in the text. (Eg, "The italic parts are
1424 mandatory".)
1425
1426 Note:
1427
1428 Browsers unable to display a specified style may render it in some
1429 alternative, or the default, style, with some loss of quality for
1430 the reader. Some implementations may ignore these tags altogether,
1431 so information providers should attempt not to rely on them as
1432 essential to the information content.
1433
1434 These element names are derived from TeXInfo macro names.
1435
1436 PHYSICAL STYLES
1437
1438 TT Fixed-width typewriter font.
1439
1440 B Boldface, where available, otherwise
1441 alternative mapping allowed.
1442
1443 I Italic font (or slanted if italic
1444 unavailable).
1445
1451 U Underline.
1452
1453 LOGICAL STYLES
1454
1455 EM Emphasis, typically italic.
1456
1457 STRONG Stronger emphasis, typically bold.
1458
1459 CODE Example of code, typically in a monospaced font.
1460 (Do not confuse with PRE.)
1461
1462 SAMP A sequence of literal characters.
1463
1464 KBD In an instruction manual, text typed by a
1465 user.
1466
1467 VAR A variable name.
1468
1469 DFN The defining instance of a term. Typically
1470 bold or bold italic.
1471
1472 CITE A citation. Typically italic.
1473
1474 EXAMPLES OF USE
1475
1476 This text contains an <em>emphasized</em> word.
1477 <strong>Don't assume</strong> that it will be italic!
1478 It was made using the <CODE>EM</CODE> element. A citation is
1479 typically italic and has no formal necessary structure:
1480 <cite>Moby Dick</cite> is a book title.
1481
1482
1483
1484
1485 Obsolete elements
1486
1487 The following elements of HTML are obsolete. It is recommended
1488 that client implementors implement the obsolete forms for
1489 compatibility with old servers.
1490
1491 Plaintext
1492
1493 Status: Obsolete.
1494
1495 The empty PLAINTEXT tag terminates the HTML entity. What follows is
1496 not SGML. Instead, there is an old HTTP convention that what
1497 follows is an ASCII (MIME "text/plain") body.
1498
1499 An example of its use is:
1500
1501 <PLAINTEXT>
1502 0001 This is line one of a long listing
1503 0002 file from <any@host.inc.com> which is sent
1510
1511 This tag allows the rest of a file to be read efficiently without
1512 parsing. Its presence is an optimization. There is no closing tag.
1513 The rest of the data is not in SGML.
1514
1515 XMP and LISTING: Example sections
1516
1517 Status: Obsolete. These are in use and should be recognized by
1518 browsers. New servers should use <PRE> instead.
1519
1520 These styles allow text of fixed-width characters to be embedded
1521 absolutely as is into the document. The syntax is:
1522
1523 <LISTING>
1524 ...
1525 </LISTING>
1526
1527 or
1528
1529 <XMP>
1530 ...
1531 </XMP>
1532
1533 The text between these tags is to be portrayed in a fixed width
1534 font, so that any formatting done by character spacing on
1535 successive lines will be maintained. Between the opening and
1536 closing tags:
1537
1538 The text may contain any ISO Latin printable characters, but not
1539 the end tag opener. (See Historical note.)
1540
1541 Line boundaries are significant, except any occurring
1542 immediately after the opening tag or before the closing tag, and
1543 are to be rendered as a move to the start of a new line.
1544
1545 The ASCII Horizontal Tab (HT) character must be interpreted as
1546 the smallest positive nonzero number of spaces which will leave
1547 the number of characters so far on the line as a multiple of 8.
1548 Its use is not recommended however.
1549
1550 The LISTING element is portrayed so that at least 132 characters
1551 will fit on a line. The XMP element is portrayed in a font so that
1552 at least 80 characters will fit on a line but is otherwise
1553 identical to LISTING.
1554
1555 Highlighted Phrase HP1 etc
1556
1557 Status: Obsolete. These tags, like all others, should be ignored if
1558 not implemented. Replaced with more meaningful elements -- see
1559 character highlighting.
1560
1561 Examples of use:
1562
1567 <HP1>...</HP1> <HP2>... </HP2> etc.
1568
1569 Comment element
1570
1571 Status: Obsolete
1572
1573 A comment element used for bracketing off unneeded text and comment
1574 has been introduced in some browsers but will be replaced by the
1575 SGML comment feature in new implementations.
1576
1577 HISTORICAL NOTE: XMP AND LISTING
1578
1579 The XMP and LISTING elements historically had non-SGML-conforming
1580 specifications, in that the text could contain any ISO
1581 Latin printable characters, including the tag opener, so long as it
1582 did not contain the closing tag in full.
1583
1584 This form is not supported by SGML and so is not the specified HTML
1585 interpretation. Providers should be warned that implementations
1586 may vary on how they interpret end tags apparently within these
1587 elements.
1588
1589 ENTITIES
1590
1591 The following entity names are used in HTML, always prefixed by
1592 ampersand (&) and followed by a semicolon as shown. They represent
1593 particular graphic characters which have special meanings in places
1594 in the markup, or may not be part of the character set available to
1595 the writer.
1596
1597 &lt; The less than sign <
1598
1599 &gt; The "greater than" sign >
1600
1601 &amp; The ampersand sign & itself.
1602
1603 &quot; The double quote sign "
1604
1605 Also allowed are references to any of the ISO Latin-1 alphabet,
1606 using the entity names in the following table.
1607
1608 ISO Latin 1 character entities
1609
1610 This list is derived from "ISO 8879:1986//ENTITIES Added Latin
1611 1//EN".
1612
1613 &AElig; capital AE diphthong (ligature)
1614
1615 &Aacute; capital A, acute accent
1616
1617 &Acirc; capital A, circumflex accent
1618
1619 &Agrave; capital A, grave accent
1620
1625 &Aring; capital A, ring
1626
1627 &Atilde; capital A, tilde
1628
1629 &Auml; capital A, dieresis or umlaut mark
1630
1631 &Ccedil; capital C, cedilla
1632
1633 &ETH; capital Eth, Icelandic
1634
1635 &Eacute; capital E, acute accent
1636
1637 &Ecirc; capital E, circumflex accent
1638
1639 &Egrave; capital E, grave accent
1640
1641 &Euml; capital E, dieresis or umlaut mark
1642
1643 &Iacute; capital I, acute accent
1644
1645 &Icirc; capital I, circumflex accent
1646
1647 &Igrave; capital I, grave accent
1648
1649 &Iuml; capital I, dieresis or umlaut mark
1650
1651 &Ntilde; capital N, tilde
1652
1653 &Oacute; capital O, acute accent
1654
1655 &Ocirc; capital O, circumflex accent
1656
1657 &Ograve; capital O, grave accent
1658
1659 &Oslash; capital O, slash
1660
1661 &Otilde; capital O, tilde
1662
1663 &Ouml; capital O, dieresis or umlaut mark
1664
1665 &THORN; capital THORN, Icelandic
1666
1667 &Uacute; capital U, acute accent
1668
1669 &Ucirc; capital U, circumflex accent
1670
1671 &Ugrave; capital U, grave accent
1672
1673 &Uuml; capital U, dieresis or umlaut mark
1674
1675 &Yacute; capital Y, acute accent
1676
1677 &aacute; small a, acute accent
1678
1683 &acirc; small a, circumflex accent
1684
1685 &aelig; small ae diphthong (ligature)
1686
1687 &agrave; small a, grave accent
1688
1689 &aring; small a, ring
1690
1691 &atilde; small a, tilde
1692
1693 &auml; small a, dieresis or umlaut mark
1694
1695 &ccedil; small c, cedilla
1696
1697 &eacute; small e, acute accent
1698
1699 &ecirc; small e, circumflex accent
1700
1701 &egrave; small e, grave accent
1702
1703 &eth; small eth, Icelandic
1704
1705 &euml; small e, dieresis or umlaut mark
1706
1707 &iacute; small i, acute accent
1708
1709 &icirc; small i, circumflex accent
1710
1711 &igrave; small i, grave accent
1712
1713 &iuml; small i, dieresis or umlaut mark
1714
1715 &ntilde; small n, tilde
1716
1717 &oacute; small o, acute accent
1718
1719 &ocirc; small o, circumflex accent
1720
1721 &ograve; small o, grave accent
1722
1723 &oslash; small o, slash
1724
1725 &otilde; small o, tilde
1726
1727 &ouml; small o, dieresis or umlaut mark
1728
1729 &szlig; small sharp s, German (sz ligature)
1730
1731 &thorn; small thorn, Icelandic
1732
1733 &uacute; small u, acute accent
1734
1735 &ucirc; small u, circumflex accent
1736
1741 &ugrave; small u, grave accent
1742
1743 &uuml; small u, dieresis or umlaut mark
1744
1745 &yacute; small y, acute accent
1746
1747 &yuml; small y, dieresis or umlaut mark
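
A reader of HTML text might replace these entity references as in the
following Python sketch. It relies on the name2codepoint table from
the Python standard library module html.entities, which is an
assumption of this illustration: that table is a superset of the
names listed here. The function name decode_entities is likewise
hypothetical. Unknown names are left untouched.

```python
import re
from html.entities import name2codepoint


def decode_entities(text):
    """Replace &name; references with the characters they stand for."""

    def replace(match):
        name = match.group(1)
        if name in name2codepoint:
            return chr(name2codepoint[name])
        # Unknown entity name: keep the original text.
        return match.group(0)

    # A single pass, so "&amp;lt;" yields "&lt;" and not "<".
    return re.sub(r"&([A-Za-z]+);", replace, text)


print(decode_entities("caf&eacute; &lt;P&gt; &amp; &quot;x&quot;"))
```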
1748
1749 THE HTML DTD
1750
1751 The HTML DTD follows. Its relationship to the content of an SGML
1752 document is explained in the section "HTML and SGML".
1753
1754
1755 <!SGML "ISO 8879:1986"
1756 --
1757 Document Type Definition for the HyperText Markup Language
1758 as used by the World Wide Web application (HTML DTD).
1759
1760 NOTE: This is a definition of HTML with respect to
1761 SGML, and assumes an understanding of SGML terms.
1762 --
1763
1764 CHARSET
1765 BASESET "ISO 646:1983//CHARSET
1766 International Reference Version (IRV)//ESC 2/5 4/0"
1767 DESCSET 0 9 UNUSED
1768 9 2 9
1769 11 2 UNUSED
1770 13 1 13
1771 14 18 UNUSED
1772 32 95 32
1773 127 1 UNUSED
1774 BASESET "ISO Registration Number 100//CHARSET
1775 ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
1777 DESCSET 128 32 UNUSED
1778 160 95 32
1779 255 1 UNUSED
1780
1781
1782 CAPACITY SGMLREF
1783 TOTALCAP 150000
1784 GRPCAP 150000
1785
1786 SCOPE DOCUMENT
1787 SYNTAX
1788 SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1790 19 20 21 22 23 24 25 26 27 28 29 30 31 127 255
1792 BASESET "ISO 646:1983//CHARSET
1793 International Reference Version (IRV)//ESC 2/5 4/0"
1799 DESCSET 0 128 0
1800 FUNCTION RE 13
1801 RS 10
1802 SPACE 32
1803 TAB SEPCHAR 9
1804 NAMING LCNMSTRT ""
1805 UCNMSTRT ""
1806 LCNMCHAR ".-"
1807 UCNMCHAR ".-"
1808 NAMECASE GENERAL YES
1809 ENTITY NO
1810 DELIM GENERAL SGMLREF
1811 SHORTREF SGMLREF
1812 NAMES SGMLREF
1813 QUANTITY SGMLREF
1814 NAMELEN 34
1815 TAGLVL 100
1816 LITLEN 1024
1817 GRPGTCNT 150
1818 GRPCNT 64
1819
1820 FEATURES
1821 MINIMIZE
1822 DATATAG NO
1823 OMITTAG NO
1824 RANK NO
1825 SHORTTAG NO
1826 LINK
1827 SIMPLE NO
1828 IMPLICIT NO
1829 EXPLICIT NO
1830 OTHER
1831 CONCUR NO
1832 SUBDOC NO
1833 FORMAL YES
1834 APPINFO NONE
1835 >
1836
1837 <!DOCTYPE HTML [
1838 <!-- Jul 1 93 -->
1839 <!-- Regarding clause 6.1, SGML Document:
1840
1841 [1] SGML document = SGML document entity,
1842 (SGML subdocument entity |
1843 SGML text entity | non-SGML data entity)*
1844
1845 The role of SGML document entity is filled by this DTD,
1846 followed by the conventional HTML data stream.
1847 -->
1848
1849 <!-- DTD definitions -->
1850
1851 <!ENTITY % heading "H1|H2|H3|H4|H5|H6" >
1857 <!ENTITY % list "UL|OL|DIR|MENU">
1858 <!ENTITY % literal "XMP|LISTING">
1859
1860 <!ENTITY % headelement
1861 "TITLE|NEXTID|ISINDEX" >
1862
1863 <!ENTITY % bodyelement
1864 "P | %heading |
1865 %list | DL | HEADERS | ADDRESS | PRE | BLOCKQUOTE
1866 | %literal">
1867
1868 <!ENTITY % oldstyle "%headelement | %bodyelement | #PCDATA">
1869
1870 <!ENTITY % URL "CDATA"
1871 -- The term URL means a CDATA attribute
1872 whose value is a Uniform Resource Locator,
1873 as defined. (A URN may also be usable here when defined.)
1874 -->
1875
1876 <!ENTITY % linkattributes
1877 "NAME NMTOKEN #IMPLIED
1878 HREF %URL; #IMPLIED
1879 REL CDATA #IMPLIED -- forward relationship type --
1880 REV CDATA #IMPLIED -- reversed relationship type
1881 to referent data:
1882
1883 PARENT CHILD, SIBLING, NEXT, TOP,
1884 DEFINITION, UPDATE, ORIGINAL etc. --
1885
1886 URN CDATA #IMPLIED -- universal resource number --
1887
1888 TITLE CDATA #IMPLIED -- advisory only --
1889
1890 METHODS NAMES #IMPLIED -- supported public methods of the object:
1892 TEXTSEARCH, GET, HEAD, ... --
1893
1894 ">
1895
1896
1897 <!-- Document Element -->
1898
1899 <!ELEMENT HTML O O (( HEAD | BODY | %oldstyle)*, PLAINTEXT?)>
1900
1901 <!ELEMENT HEAD - - (TITLE? & ISINDEX? & NEXTID? & LINK*
1902 & BASE?)>
1903
1904 <!ELEMENT TITLE - - RCDATA
1905 -- The TITLE element is not considered part of the flow of text.
1907 It should be displayed, for example as the page header or
1908 window title.
1909 -->
1910
1915 <!ELEMENT ISINDEX - O EMPTY
1916 -- WWW clients should offer the option to perform a search on
1918 documents containing ISINDEX.
1919 -->
1920
1921 <!ELEMENT NEXTID - O EMPTY>
1922 <!ATTLIST NEXTID N NAME #REQUIRED
1923 -- The number should be a name suitable for use
1924 for the ID of a new element. When used, the value
1925 has its numeric part incremented. EG Z67 becomes Z68
1926 -->
1927 <!ELEMENT LINK - O EMPTY>
1928 <!ATTLIST LINK
1929 %linkattributes>
1930
1931 <!ELEMENT BASE - O EMPTY -- Reference context for URLS -->
1932 <!ATTLIST BASE
1933
1934 HREF %URL; #IMPLIED
1935
1936 >
1937 <!ENTITY % inline "EM | TT | STRONG | B | I | U |
1938 CODE | SAMP | KBD | KEY | VAR | DFN | CITE "
1939 >
1940
1941 <!ELEMENT (%inline;) - - (#PCDATA)>
1942
1943 <!ENTITY % text "#PCDATA | IMG | %inline;">
1944
1945 <!ENTITY % htext "A | %text">
1946
1947 <!ELEMENT BODY - - (%bodyelement|%htext;)*>
1948
1949
1950 <!ELEMENT A - - (%text)>
1951 <!ATTLIST A
1952 %linkattributes;
1953 >
1954
1955 <!ELEMENT IMG - O EMPTY -- Embedded image -->
1956 <!ATTLIST IMG
1957 SRC %URL; #IMPLIED -- URL of document to embed --
1958 >
1959
1960
1961 <!ELEMENT P - O EMPTY -- separates paragraphs -->
1962
1963 <!ELEMENT ( %heading ) - - (%htext;)+>
1964
1965 <!ELEMENT DL - - (DT | DD | P | %htext;)*>
1966 <!-- Content should match ((DT,(%htext;)+)+,(DD,(%htext;)+))
1967 But mixed content is messy.
1973 -->
1974
1975 <!ELEMENT DT - O EMPTY>
1976 <!ELEMENT DD - O EMPTY>
1977
1978 <!ELEMENT (UL|OL) - - (%htext;|LI|P)+>
1979 <!ELEMENT (DIR|MENU) - - (%htext;|LI)+>
1980 <!-- Content should match ((LI,(%htext;)+)+)
1981 But mixed content is messy.
1982 -->
1983 <!ATTLIST (%list)
1984 COMPACT NAME #IMPLIED -- COMPACT, etc.--
1985 >
1986
1987 <!ELEMENT LI - O EMPTY>
1988
1989 <!ELEMENT BLOCKQUOTE - - (%htext;|P)+
1990 -- for quoting some other source -->
1991
1992 <!ELEMENT ADDRESS - - (%htext;|P)+>
1993
1994 <!ELEMENT PRE - - (#PCDATA|%inline|A|P)+>
1995 <!ATTLIST PRE
1996 WIDTH NUMBER #implied
1997 >
1998
1999 <!-- Mnemonic character entities. -->
2000 <!ENTITY AElig "&#198;" -- capital AE diphthong (ligature) -->
2001 <!ENTITY Aacute "&#193;" -- capital A, acute accent -->
2002 <!ENTITY Acirc "&#194;" -- capital A, circumflex accent -->
2003 <!ENTITY Agrave "&#192;" -- capital A, grave accent -->
2004 <!ENTITY Aring "&#197;" -- capital A, ring -->
2005 <!ENTITY Atilde "&#195;" -- capital A, tilde -->
2006 <!ENTITY Auml "&#196;" -- capital A, dieresis or umlaut mark -->
2007 <!ENTITY Ccedil "&#199;" -- capital C, cedilla -->
2008 <!ENTITY ETH "&#208;" -- capital Eth, Icelandic -->
2009 <!ENTITY Eacute "&#201;" -- capital E, acute accent -->
2010 <!ENTITY Ecirc "&#202;" -- capital E, circumflex accent -->
2011 <!ENTITY Egrave "&#200;" -- capital E, grave accent -->
2012 <!ENTITY Euml "&#203;" -- capital E, dieresis or umlaut mark -->
2013 <!ENTITY Iacute "&#205;" -- capital I, acute accent -->
2014 <!ENTITY Icirc "&#206;" -- capital I, circumflex accent -->
2015 <!ENTITY Igrave "&#204;" -- capital I, grave accent -->
2016 <!ENTITY Iuml "&#207;" -- capital I, dieresis or umlaut mark -->
2017 <!ENTITY Ntilde "&#209;" -- capital N, tilde -->
2018 <!ENTITY Oacute "&#211;" -- capital O, acute accent -->
2019 <!ENTITY Ocirc "&#212;" -- capital O, circumflex accent -->
2020 <!ENTITY Ograve "&#210;" -- capital O, grave accent -->
2021 <!ENTITY Oslash "&#216;" -- capital O, slash -->
2022 <!ENTITY Otilde "&#213;" -- capital O, tilde -->
2023 <!ENTITY Ouml "&#214;" -- capital O, dieresis or umlaut mark -->
2024 <!ENTITY THORN "&#222;" -- capital THORN, Icelandic -->
2025 <!ENTITY Uacute "&#218;" -- capital U, acute accent -->
2031 <!ENTITY Ucirc "&#219;" -- capital U, circumflex accent -->
2032 <!ENTITY Ugrave "&#217;" -- capital U, grave accent -->
2033 <!ENTITY Uuml "&#220;" -- capital U, dieresis or umlaut mark -->
2034 <!ENTITY Yacute "&#221;" -- capital Y, acute accent -->
2035 <!ENTITY aacute "&#225;" -- small a, acute accent -->
2036 <!ENTITY acirc "&#226;" -- small a, circumflex accent -->
2037 <!ENTITY aelig "&#230;" -- small ae diphthong (ligature) -->
2038 <!ENTITY agrave "&#224;" -- small a, grave accent -->
2039 <!ENTITY amp "&#38;" -- ampersand -->
2040 <!ENTITY aring "&#229;" -- small a, ring -->
2041 <!ENTITY atilde "&#227;" -- small a, tilde -->
2042 <!ENTITY auml "&#228;" -- small a, dieresis or umlaut mark -->
2043 <!ENTITY ccedil "&#231;" -- small c, cedilla -->
2044 <!ENTITY eacute "&#233;" -- small e, acute accent -->
2045 <!ENTITY ecirc "&#234;" -- small e, circumflex accent -->
2046 <!ENTITY egrave "&#232;" -- small e, grave accent -->
2047 <!ENTITY eth "&#240;" -- small eth, Icelandic -->
2048 <!ENTITY euml "&#235;" -- small e, dieresis or umlaut mark -->
2049 <!ENTITY gt "&#62;" -- greater than -->
2050 <!ENTITY iacute "&#237;" -- small i, acute accent -->
2051 <!ENTITY icirc "&#238;" -- small i, circumflex accent -->
2052 <!ENTITY igrave "&#236;" -- small i, grave accent -->
2053 <!ENTITY iuml "&#239;" -- small i, dieresis or umlaut mark -->
2054 <!ENTITY lt "&#60;" -- less than -->
2055 <!ENTITY ntilde "&#241;" -- small n, tilde -->
2056 <!ENTITY oacute "&#243;" -- small o, acute accent -->
2057 <!ENTITY ocirc "&#244;" -- small o, circumflex accent -->
2058 <!ENTITY ograve "&#242;" -- small o, grave accent -->
2059 <!ENTITY oslash "&#248;" -- small o, slash -->
2060 <!ENTITY otilde "&#245;" -- small o, tilde -->
2061 <!ENTITY ouml "&#246;" -- small o, dieresis or umlaut mark -->
2062 <!ENTITY szlig "&#223;" -- small sharp s, German (sz ligature) -->
2063 <!ENTITY thorn "&#254;" -- small thorn, Icelandic -->
2064 <!ENTITY uacute "&#250;" -- small u, acute accent -->
2065 <!ENTITY ucirc "&#251;" -- small u, circumflex accent -->
2066 <!ENTITY ugrave "&#249;" -- small u, grave accent -->
2067 <!ENTITY uuml "&#252;" -- small u, dieresis or umlaut mark -->
2068 <!ENTITY yacute "&#253;" -- small y, acute accent -->
2069 <!ENTITY yuml "&#255;" -- small y, dieresis or umlaut mark -->
2070
2071 <!-- deprecated elements -->
2072
2073 <!ELEMENT (%literal) - - CDATA>
2074
2075 <!ELEMENT PLAINTEXT - O EMPTY>
2076
2077 <!-- Local Variables: -->
2078 <!-- mode: sgml -->
2079 <!-- compile-command: "sgmls -s -p " -->
2080 <!-- end: -->
2081 ]>
2082
2089 LINK RELATIONSHIP VALUES
2090
2091 Status: This list is not part of the standard. It is intended to
2092 illustrate the use of link relationships and to provide a framework
2093 for further development.
2094
2095 Additions to this list will be controlled by the HTML registration
2096 authority. Experimental values may be used on the condition that
2097 they begin with "X-".
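
A minimal sketch of an experimental value on an anchor. The value
"X-Errata" and the file name are hypothetical, chosen only to show
the required "X-" prefix:

    <A HREF="errata.html" REL="X-Errata">Errata for this report</A>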
2098
2099 These values of the REL attribute of hypertext links have a
2100 significance defined here, and may be treated in special ways by
2101 HTML applications.
2102
2103 These relationships relate whole documents (objects), rather than
2104 particular anchors within them. If the relationship value is used
2105 with a link between anchors rather than whole documents, the
2106 semantics are considered to apply to the documents.
2107
2108 In the explanations which follow, A is the source document of the
2109 link and B is the destination document specified by the HREF
2110 attribute.
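
As a sketch, a link of this kind might be written in document A as
shown here, using the Annotation value defined later in this list.
The file name is hypothetical; B is whatever document the HREF
attribute names:

    <LINK REL="Annotation" HREF="notes.html">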
2111
2112 A relationship marked "Acyclic" has the property that no sequence
2113 of links with that relationship may be followed from any document
2114 back to itself. These types of links may therefore be used to
2115 define trees.
2116
2117 Relationships between documents
2118
2119 These relationships are between the documents themselves rather
2120 than the subjects of the documents.
2121
2122 USEINDEX
2123
2124 B is a related index, to be searched when a user reading this
2125 document asks for an index search function.
2126
2127 A document may have any number of index links, causing several
2128 indexes to be searched in a client-defined manner.
2129
2130 B must support SEARCH operations under its access protocol.
2131
2132 USEGLOSSARY
2133
2134 B is an index which should be used to resolve glossary queries in
2135 the document (typically, a double-click on a word which is not
2136 within an anchor).
2137
2138 A document may have any number of glossary links.
2139
2140 ANNOTATION
2141
2147 The information in B is additional to and subsidiary to that in A.
2148
2149 Annotation is used by one person to write the equivalent of "margin
2150 notes" or other criticism on another's document, for example.
2151
2152 Example: The relationship between a newsgroup and its articles.
2153
2154 Acyclic.
2155
2156 REPLY
2157
2158 Similar to Annotation, but there is no suggestion that B is
2159 subsidiary to A: A and B are on equal footings.
2160
2161 Example: The relationship between a mail message and its reply, or
2162 between a news article and its reply.
2163
2164 Acyclic.
2165
2166 EMBED
2167
2168 If this link is followed, the node at the end of it is embedded
2169 into the display of the source document.
2170
2171 Acyclic.
2172
2173 PRECEDES
2174
2175 In an ordered structure defined by the author, A precedes B; B
2176 follows A.
2177
2178 Acyclic.
2179
2180 A document may have at most one link of this relationship, and at
2181 most one link of the reverse relationship.
2182
2183 Note: May be used to control navigational aids, generate printed
2184 material, etc. In conjunction with "Subdocument", may be used to
2185 define a tree such as a printed book made of hypertext documents.
2186 The document can only have one such tree.
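
As an illustration, a small book might be linked as sketched here.
The file names are hypothetical, and carrying the links in LINK
elements is an assumption made for the example:

    In book.html:

        <LINK REL="Subdocument" HREF="chapter1.html">
        <LINK REL="Subdocument" HREF="chapter2.html">

    In chapter1.html:

        <LINK REL="Precedes" HREF="chapter2.html">

Here book.html is A and each chapter is B for the Subdocument
links, while chapter1.html is A and chapter2.html is B for the
Precedes link.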
2187
2188 SUBDOCUMENT
2189
2190 B is lower than A in the author's hierarchy. Acyclic. See also
2191 Precedes.
2192
2193 PRESENT
2194
2195 Whenever A is presented, B must also be presented. This implies
2196 that whenever A is retrieved, B must also be retrieved.
2197
2198 SEARCH
2199
2205 When the link is followed, the node B should be searched rather
2206 than presented. That is, where the client software allows it, the
2207 user should immediately be presented with a search panel and
2208 prompted for text. The search is then performed without an
2209 intermediate retrieval or presentation of the node B.
2210
2211 SUPERSEDES
2212
2213 B is a previous version of A.
2214
2215 Acyclic.
2216
2217 HISTORY
2218
2219 B is a list of versions of A.
2220
2221 A reverse link must exist from B to A and to all other known
2222 versions of A.
2223
2224 Relationships about subjects of documents
2225
2226 These relationships convey semantics about objects described by
2227 documents, rather than the documents themselves.
2228
2229 INCLUDES
2230
2231 A includes B, B is part of A. For example, a person described by
2232 document B is a part of the group described by document A.
2233
2234 Acyclic.
2235
2236 MADE
2237
2238 Person (etc) described by node A is author of, or is responsible
2239 for, B.
2240
2241 This information can be used for protection, for informing authors
2242 of interest, for sending mail to authors, etc.
2243
2244 INTERESTED
2245
2246 Person (etc) described by A is interested in node B.
2247
2248 This information can be used for notification of changes.
2249
2250 Typically, this is a request that, when object B changes in some
2251 way, a new link is made to object A.
2252
2253 The phrase "object B changes" may be interpreted narrowly (as "B
2254 itself changes") or widely (as "B or anything linked to it or
2255 related to it closely changes"). The amount of change considered
2256 worth notifying people about is also subject to interpretation,
2257 varying from bit changes in the source to a "new edition" statement
2263 by the publisher.
2264
2265 REGISTRATION AUTHORITY
2266
2267 The HTML Registration Authority is responsible for maintaining
2268 lists of:
2269
2270 Relationship names for link and anchor elements
2271
2272 It is proposed that the Internet Assigned Numbers Authority or
2273 their successors take this role.
2274
2275 Unregistered values may be used for experimental purposes if they
2276 start with "X-".
2277
2278 REFERENCES
2279
2280 SGML ISO 8879:1986, Information Processing - Text
2281 and Office Systems - Standard Generalized
2282 Markup Language (SGML).
2283
2284 sgmls an SGML parser by James Clark
2285 <jjc@jclark.com> derived from the ARCSGML
2286 parser materials which were written by
2287 Charles F. Goldfarb. The source is available
2288 on the ifi.uio.no FTP server in the directory
2289 /pub/SGML/SGMLS.
2290
2291 WWW The World-Wide Web, a global information
2292 initiative. For bootstrap information, telnet
2293 info.cern.ch or find documents by
2294 ftp://info.cern.ch/pub/www/doc
2295
2296 URL Universal Resource Locators. RFCxxx.
2297 Currently available by anonymous FTP from
2298 info.cern.ch in /pub/ietf.
2299
2300 AUTHORS' ADDRESSES
2301
2302 This document was prepared with the help and advice of many people
2303 across the net. Dan Connolly prepared the DTD and the section on
2304 HTML and SGML whilst with Convex Computer Corporation of 3000
2305 Waterview Parkway, Richardson, TX 75083. He is now with Atrium
2306 Technologies, Inc., and is not a current editor of the document.
2307
2308 Tim Berners-Lee
2309 Address: CERN
2310 1211 Geneva 23
2311 Switzerland
2312 Telephone: +41(22)767 3755
2313 Fax: +41(22)767 7155
2314 email: timbl@info.cern.ch
2315
2321 Daniel Connolly
2322 Address: Atrium Technologies, Inc.
2323 5000 Plaza on the Lake, Suite 275
2324 Austin, TX 78746
2325 USA
2326 email: connolly@atrium.com