<body>
<h1>Description of Errors</h1>

<section id="html5-character-encoding">
<h2>HTML5 Character Encoding Errors</h2>

<d:item name="character encoding" class="format-charset must" level="m">
<d:message xml:lang="en">Character encoding <code><var>$0</var></code>
is not allowed for <abbr>HTML</abbr> documents.</d:message>
<d:desc xml:lang="en">
<p>The character encoding used for the document is not allowed
for <abbr>HTML</abbr> documents. The document is non‐conforming.</p>
</d:desc>
</d:item>

<d:item name="character encoding" class="format-charset should"
level="s">
<d:message xml:lang="en">Character encoding <code><var>$0</var></code>
should not be used for <abbr>HTML</abbr> documents.</d:message>
<d:desc xml:lang="en">
<p>The character encoding used for the document is not recommended
for <abbr>HTML</abbr> documents. The document is non‐conforming
unless there is a good reason to use that encoding.</p>
</d:desc>
</d:item>

<d:item name="character encoding" class="format-charset warning"
level="w">
<d:message xml:lang="en">Use of UTF-8 is encouraged.</d:message>
<d:desc xml:lang="en">
<p>Use of UTF-8 as the character encoding of the document is encouraged,
though the use of another character encoding is conforming.</p>
</d:desc>
</d:item>

<d:item name="no character encoding declaration" class="format-charset error"
level="m">
<d:message xml:lang="en">There is no character encoding
declaration.</d:message>
<d:desc xml:lang="en">
<p>The document does not contain a character encoding
declaration. Unless the character encoding is explicitly
specified in an upper‐level protocol, e.g. in <abbr>HTTP</abbr>,
or is implied by a <abbr>BOM</abbr>, there must be a character
encoding declaration. The document is non‐conforming.</p>

<p>The long character encoding declaration syntax
<code class="html bad example">&lt;meta http-equiv="Content-Type" content="text/html; charset=<var>charset-name</var>"&gt;</code>
is obsolete. The new syntax is:</p>
<pre class="html example"><code>&lt;meta charset="<var>charset-name</var>"&gt;</code></pre>

<p>Note that the <code>encoding</code> declaration in an <abbr>XML</abbr>
declaration has no effect for <abbr>HTML</abbr> documents.</p>
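
<p>As a minimal sketch, assuming the document is encoded in
<code>UTF-8</code> (the title and body text are placeholders chosen for
illustration), a document carrying the new declaration might look
like:</p>
<pre class="html example"><code>&lt;!DOCTYPE html&gt;
&lt;html&gt;
&lt;head&gt;
&lt;!-- Character encoding declaration; UTF-8 is an assumption for this example --&gt;
&lt;meta charset="UTF-8"&gt;
&lt;title&gt;Example&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;p&gt;Hello.&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;</code></pre>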
</d:desc>
</d:item>

<d:item name="non ascii superset" class="format-charset error"
level="m">
<d:message xml:lang="en">No character encoding metadata is found
in the upper‐level protocol, nor is there a <abbr>BOM</abbr>, while
character encoding <code><var>$0</var></code>
is not a superset of <abbr>ASCII</abbr>.</d:message>
<d:desc xml:lang="en">
<p>The document is not labeled with a character encoding name
in an upper‐level protocol, e.g. in <abbr>HTTP</abbr>, and
the document does not begin with a <abbr>BOM</abbr>. In addition,
the character encoding of the document is not a superset of
<abbr>ASCII</abbr>. The document is non‐conforming.</p>

<p>Unless there is a <abbr>BOM</abbr>, the character encoding
for the document must be specified at e.g. the <abbr>HTTP</abbr> level,
as:</p>
<pre class="http example"><code>Content-Type: text/html; charset=<var>charset-name</var></code></pre>

<p>The existence of an <abbr>HTML</abbr> character encoding declaration, i.e.
<code class="html example">&lt;meta charset="<var>charset-name</var>"&gt;</code>,
does not allow the <code>charset</code> parameter to be omitted
for an <abbr>HTML</abbr> document encoded in an encoding that is
not <abbr>ASCII</abbr>‐compatible.</p>

<p>Character encodings <code>Shift_JIS</code>, <code>Windows-31J</code>,
and <code>ISO-2022-JP</code> are <em>not</em> supersets of
<abbr>ASCII</abbr> for the purpose of <abbr>HTML</abbr> conformance.</p>
</d:desc>
</d:item>

<d:item name="charset label detected" class="format-charset warning"
level="w">
<d:message xml:lang="en">While parsing the document as
<code><var>$0</var></code>, a character encoding declaration specifying
the character encoding as <code><var>$1</var></code> was found. The document
is reparsed.</d:message>
<d:desc xml:lang="en">
<p>While parsing a document in one character encoding,
a character encoding declaration that declares a different character
encoding for the document was found.
The occurrence of this warning does not in itself make the document
non‐conforming. However, the failure of the first attempt
to detect the character encoding might be the result of non‐conformance
of the document.</p>

<p>The document will be reparsed from the beginning. Some errors
or warnings might be reported again.</p>

<p>These are suggestions to avoid this warning:</p>
<ul>
<li>Specify the <code>charset</code> parameter in the <code>Content-Type</code>
field in the <abbr>HTTP</abbr> header, as:
<pre class="HTTP example"><code>Content-Type: text/html; charset="<var>charset-name</var>"</code></pre></li>
<li>Put the character encoding declaration
(<code class="html example">&lt;meta charset="<var>charset-name</var>"&gt;</code>)
just after the <code class="html example">&lt;head&gt;</code> start tag.</li>
<li>Use <code>UTF-8</code>.</li>
</ul>
</d:desc>
</d:item>
</section>

<section id="html5-tokenize-error">
<h2>HTML5 Parse Errors in Tokenization Stage</h2>

<d:item name="enumerated:invalid:http-equiv:content-type"
class="attribute-value-error">
<d:message xml:lang="en">Character encoding declaration syntax
<code class="html bad example">&lt;meta http-equiv="Content-Type" content="text/html; charset=<var>charset-name</var>"&gt;</code>
is obsolete.</d:message>
<d:desc xml:lang="en">
<p>The old long character encoding declaration syntax
<code class="html bad example">&lt;meta http-equiv="Content-Type" content="text/html; charset=<var>charset-name</var>"&gt;</code>
is in use. The document is non‐conforming.</p>

<p>The new character encoding declaration syntax is:</p>
<pre class="html example"><code>&lt;meta charset="<var>charset-name</var>"&gt;</code></pre>
</d:desc>