--- test/html-webhacc/error-description-source.xml 2007/11/07 11:29:46 1.14
+++ test/html-webhacc/error-description-source.xml 2007/11/23 06:36:19 1.17
@@ -11,6 +11,134 @@

Description of Errors

+
+

HTML5 Character Encoding Errors

+ + + Character encoding $0 + is not allowed for HTML document. + +

The character encoding used for the document is not allowed + for HTML document. The document is non‐conforming.

+
+
+ + + Character encoding $0 + should not be used for HTML document. + +

The character encoding used for the document is not recommended + for HTML document. The document is non‐conforming + unless there is any good reason to use that encoding.

+
+
+ + + Use of UTF-8 is encouraged. + +

Use of UTF-8 as the character encoding of the document is encouraged, + though the use of another character encoding is conforming.

+
+
+ + + Conformance for character encoding requirements + cannot be checked. + +

The conformance checker cannot detect whether the input document + meets the requirements on character encoding, since the document + was not input as a serialized byte sequence. The document is + not conforming if it is not encoded in an appropriate character + encoding with appropriate labeling.

+
+
+ + + There is no character encoding + declaration. + +

The document does not contain a character encoding + declaration. Unless the character encoding is explicitly + specified in a lower‐level protocol, e.g. in HTTP, + or is implied by a BOM, there must be a character + encoding declaration. The document is non‐conforming.

+ +

The long character encoding declaration syntax + <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> + is obsolete. The new syntax is:

+
<meta charset="charset-name">
+ +

Note that the encoding declaration in an XML + declaration has no effect for an HTML document.
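The two declaration syntaxes described above can be told apart mechanically. The following is a minimal sketch in Python, not the checker's own code: it uses the standard library's `html.parser`, which is not an HTML5 parser, and the names `MetaCharsetFinder` and `find_declarations` are hypothetical.

```python
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Collect character encoding declarations found in <meta> start tags."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (syntax, value) tuples, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)  # attribute names are already lowercased
        if attributes.get("charset") is not None:
            # New syntax: <meta charset="charset-name">
            self.found.append(("new", attributes["charset"]))
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # Obsolete long syntax: <meta http-equiv="Content-Type" content="...">
            self.found.append(("obsolete", attributes.get("content") or ""))


def find_declarations(markup):
    """Return every character encoding declaration in the markup string."""
    finder = MetaCharsetFinder()
    finder.feed(markup)
    finder.close()
    return finder.found
```

An empty result corresponds to the "no character encoding declaration" case, and an `"obsolete"` entry to the long syntax that should be replaced.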

+
+
+ + + No character encoding metadata is found + in lower‐level protocol nor is there BOM, while + character encoding $0 + is not a superset of ASCII. + +

The document is not labeled with a character encoding name + in a lower‐level protocol, e.g. in HTTP, and + the document does not begin with a BOM. In addition, + the character encoding of the document is not a superset of + ASCII. The document is non‐conforming.

+ +

Unless there is a BOM, the character encoding + for the document must be specified in e.g. HTTP‐level, + as:

+
Content-Type: text/html; charset=charset-name
+ +

The existence of an HTML character encoding declaration, i.e. + <meta charset="charset-name">, + does not make it permissible to omit the charset parameter + for an HTML document encoded in a non‐ASCII + compatible encoding.

+ +

Character encodings Shift_JIS, Windows-31J, + and ISO-2022-JP are not supersets of + ASCII for the purpose of HTML conformance.
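The condition described in this entry combines three checks: no charset parameter in the lower-level protocol, no BOM, and an encoding that is not a superset of ASCII. A minimal Python sketch follows. It rests on assumptions: the function names are hypothetical, the BOM list is limited to UTF-8 and UTF-16, and the encoding set contains only the three names mentioned above, while the checker's real list may be longer.

```python
import codecs
from email.message import Message

# Only the encodings named in the text above (assumption: not exhaustive).
NOT_ASCII_SUPERSET = {"shift_jis", "windows-31j", "iso-2022-jp"}

# Byte order marks considered here (assumption: UTF-8 and UTF-16 only).
BOMS = (codecs.BOM_UTF8, codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)


def has_bom(body):
    """True if the byte sequence begins with one of the BOMs above."""
    return body.startswith(BOMS)


def http_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    if content_type is None:
        return None
    message = Message()
    message["Content-Type"] = content_type
    return message.get_content_charset()


def lacks_required_label(content_type, body, encoding):
    """True if the document needs a protocol-level label or BOM but has neither."""
    if http_charset(content_type) is not None or has_bom(body):
        return False
    return encoding.lower() in NOT_ASCII_SUPERSET
```

For example, `lacks_required_label("text/html", b"<p>", "Shift_JIS")` reports the error, while adding `; charset=Shift_JIS` to the header value clears it.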

+
+
+ + + While parsing the document as + $0, a character encoding declaration specifying + character encoding as $1 is found. The document + is reparsed. + +

While parsing a document in a character encoding, + a character encoding declaration which declares the character + encoding of the document as another character encoding is found. + The occurrence of this warning itself does not make the document + non‐conforming. However, the failure of the first attempt + to detect the character encoding might be the result of non‐conformance + of the document.

+ +

The document will be reparsed from the beginning. Some errors + or warnings might be reported again.

+ +

These are suggestions to avoid this warning:

+
    +
  • Specify charset parameter in the Content-Type + field in the HTTP header, as: +
    Content-Type: text/html; charset="charset-name"
  • +
  • Put the character encoding declaration + (<meta charset="charset-name">) + just after <head> start tag.
  • +
  • Use UTF-8.
  • +
+
+
+
+

HTML5 Parse Errors in Tokenization Stage

@@ -48,17 +176,16 @@ The & character must introduce a reference. -

An & (U+0026 - AMPERSAND) character which +

An & character which is not part of any reference appears in the input stream. - The document is non-conforming.

+ The document is non‐conforming.

-

Any & character in URI (or IRI) - must be escaped as &amp;.

+

Any & character in URI (or IRI) + must be escaped as &amp;.
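As an illustration of the escaping rule above, here is a minimal Python sketch that writes a URI into an attribute value using the standard library's `html.escape`. The helper name `href_attribute` is hypothetical.

```python
from html import escape


def href_attribute(url):
    """Serialize a URI as an href attribute, escaping & as &amp;."""
    # quote=True also escapes quotation marks, so the value cannot end early.
    return 'href="' + escape(url, quote=True) + '"'
```

The input is expected to be a raw URI; a value that already contains `&amp;` would be escaped a second time.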

The & character must be the first character of a reference: -

+
Named entity reference
&entity-name;
where entity-name is the name of the @@ -134,7 +261,7 @@

The string &# must be the first two characters of a reference: -

+
Numeric character reference
&#d;
where d is the decimal representation of @@ -189,20 +316,22 @@
Comments
-
In HTML documents, comments must be introduced by - <!-- (<! immediately followed +
In HTML document, comments must be introduced by + <!-- (<! + immediately followed by two -s) and must be terminated by - -->. Strings <! not followed + -->. + Strings <! not followed by -- and <!- not followed by - are not valid open delimiters for comments.
Marked sections, including CDATA sections
-
Marked sections are not allowed in HTML documents.
+
Marked sections are not allowed in HTML document.
Markup declarations
-
Markup declarations, except DOCTYPE - and comment declarations, are not allowed in HTML documents.
+
Markup declarations, except for DOCTYPE + and comment declarations, are not allowed in HTML document.
String <!
String <! must be escaped as - &lt;!.
+ &lt;!.
@@ -277,7 +406,8 @@
<script/>

The polytheistic slash cannot be used for script element. Even for an empty script element, - there must be an explicit end tag </script>.

+ there must be an explicit end tag + </script>.

NOTE: Though some user agents interpret polytheistic slash for script element as the @@ -294,7 +424,10 @@ to allow polytheistic slash for these elements.

<a/>, <p/>
These elements are not always empty and therefore - polytheistic slash is not allowed.
+ polytheistic slash is not allowed. Use an explicit end tag + to represent an empty element, as: +
<p></p>
+

Note that, unlike in XML, the polytheistic slash has @@ -319,7 +452,13 @@

An XBL binding cannot be associated by PI in HTML document. Use binding property in CSS - style sheet.
+ style sheet as: +
<style>
+p {
+  binding: url(binding.xbl);
+}
+</style>
+
<?xml?> (XML declaration)
XML declaration is unnecessary for HTML documents.
<?xml-stylesheet?> (XML style sheet @@ -327,7 +466,9 @@
Use HTML link element with rel attribute set to stylesheet (or, alternate stylesheet for an alternate style - sheet).
+ sheet). +
<link rel=stylesheet href="path/to/stylesheet.css">
+
<?php?> or <? ... PHP code ... ?> (PHP code)
@@ -506,7 +647,7 @@
Though the element is void in earlier versions of Safari, the canvas element is no longer defined as empty. There must be an end tag - </canvas>.
+ </canvas>.

Note that misnesting tags, such as @@ -540,7 +681,8 @@

The document contains a DOCTYPE declaration that is different from HTML5 DOCTYPE (i.e. - <!DOCTYPE HTML>). The document is non-conforming.

+ <!DOCTYPE HTML>). + The document is non‐conforming.

The document might or might not be conformant to some version of HTML. However, conformance to any HTML @@ -672,11 +814,11 @@ block-level content, any inline-level content must be put in e.g. paragraph element such as p.

For example, an HTML document fragment - <div><p>Hello!</p> World!</div> + <div><p>Hello!</p> World!</div> is non-conforming, since a word World! does not belong to any paragraph. (If not part of any paragraph, what is it!?) A conforming example would be: -

<div><p>Hello!</p> <p>World!</p></div>
+
<div><p>Hello!</p> <p>World!</p></div>

If the parent element does not allow block-level elements as content
@@ -720,18 +862,19 @@
html element in an XHTML document
-

In an XHTML document, the root html - element must have an xmlns attribute - whose value is set to - http://www.w3.org/1999/xhtml.

+

In XHTML document, the root html + element must have an xmlns attribute as: +

<html xmlns="http://www.w3.org/1999/xhtml">

rss element

The document is written in some version of RSS.

The conformance checker does not support any version of RSS. Use Atom 1.0 for feed documents.

feed element

The Atom feed element must be - in the http://www.w3.org/2004/Atom - namespace.

+ in the http://www.w3.org/2005/Atom + namespace as: +
<feed xmlns="http://www.w3.org/2005/Atom">
+

The conformance checker does not support Atom 0.3. Use Atom 1.0 for feed documents.

@@ -874,6 +1017,87 @@

Attribute Value Errors

+ + Character encoding name $0 + is not registered. + +

The specified character encoding name is not registered with + IANA. Using a registered character encoding name + is a good practice to facilitate interoperability.

+ +
+
EUC-TW
+
EUC-TW is not registered. Unfortunately, there + is no registered name for that character encoding. Use + Big5 encoding with character encoding name Big5 + if it is enough to represent the document.
+
ISO-2022-JP-1
+
ISO-2022-JP-1 is not registered, although + this character encoding name is documented in + RFC 2237. Use + ISO-2022-JP-2 instead, since that character encoding + is a superset of ISO-2022-JP-1.
+
ISO-2022-JP-3, ISO-2022-JP-3-plane1
+
These names are not registered and have been obsoleted in favor of + ISO-2022-JP-2004 and + ISO-2022-JP-2004-plane1.
+
ISO-2022-JP-2003, + ISO-2022-JP-2003-plane1
+
These names are not registered and have been corrected to + ISO-2022-JP-2004 and + ISO-2022-JP-2004-plane1.
+
ISO-2022-JP-2004, + ISO-2022-JP-2004-plane1
+
These names are not registered. Unfortunately, there is + no registered name for these character encodings.
+
UTF-8N
+
UTF-8N is not registered. Character encoding + name UTF-8 represents UTF-8 encoding with or + without BOM.
+
+ +

WARNING: This error might be raised for + a registered character encoding name, since the character encoding + name database of the conformance checker is not complete yet.

+
+
+ + + $0 is a private + character encoding name. + +

The specified character encoding name is a private name and + is not registered with IANA. Using a registered character + encoding name is a good practice to facilitate interoperability.

+ +
+
x-euc-jp
+
Use EUC-JP for the Japanese EUC + character encoding.
+
x-sjis
+
Use Shift_JIS for the standard Shift encoding scheme of + the JIS coded character set, or Windows-31J + for the Microsoft standard character set as implemented by + Microsoft Windows.
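The replacement advice given for unregistered and private names can be collected into a lookup table. This is a minimal Python sketch built only from the names listed in this document. The table and function names are hypothetical, and several of the suggested names are themselves unregistered, as noted above.

```python
# Suggested replacements, keyed by lowercased character encoding name.
SUGGESTIONS = {
    "euc-tw": ["Big5"],  # only if Big5 is enough to represent the document
    "iso-2022-jp-1": ["ISO-2022-JP-2"],
    "iso-2022-jp-3": ["ISO-2022-JP-2004"],  # replacement is also unregistered
    "iso-2022-jp-3-plane1": ["ISO-2022-JP-2004-plane1"],
    "iso-2022-jp-2003": ["ISO-2022-JP-2004"],
    "iso-2022-jp-2003-plane1": ["ISO-2022-JP-2004-plane1"],
    "utf-8n": ["UTF-8"],
    "x-euc-jp": ["EUC-JP"],
    "x-sjis": ["Shift_JIS", "Windows-31J"],
}


def suggest(name):
    """Return the suggested replacement names, or an empty list if none is known."""
    return SUGGESTIONS.get(name.lower(), [])
```

A name such as `ISO-2022-JP-2004` has no entry, since the text above states that no registered name exists for it.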
+
+
+
+ + + The specified value is syntactically not a + character encoding name. + +

The attribute value must be a character encoding name. However, + the specified value is not a character encoding name syntactically. + The document is non‐conforming.

+

A character encoding name is a string of up to 40 printable + ASCII characters.
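The syntactic rule above can be expressed as a short check. This is a minimal Python sketch under two assumptions not stated in the text: the empty string is rejected, and "printable" is taken to mean U+0021 to U+007E, excluding the space character. The function name is hypothetical.

```python
def is_charset_name_syntax(value):
    """True if value is 1 to 40 characters, all printable ASCII (no space)."""
    if not 1 <= len(value) <= 40:
        return False
    return all(0x21 <= ord(character) <= 0x7E for character in value)
```

This checks syntax only; a name that passes may still be unregistered or private.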

+
+
+ This attribute only allow a limited set of @@ -886,8 +1110,8 @@
HTML meta element, http-equiv attribute
-

Only Default-Style and Refresh - is allowed.

+

Only values Default-Style and Refresh + are allowed.

Value Content-Type is obsolete; for charset declaration, the charset attribute can be used as:

<meta charset="charset-name">
@@ -907,16 +1131,16 @@ - Charset declaration syntax - <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> + Character encoding declaration syntax + <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> is obsolete. -

Old long charset declaration syntax - <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> +

Old long character encoding declaration syntax + <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> is in use. The document is non‐conforming.

-

The new charset declaration syntax is: -

<meta charset="charset-name">
+

The new character encoding declaration syntax is: +

<meta charset="charset-name">

@@ -970,6 +1194,18 @@ + + Character encoding name $1 + is different from document character encoding + $0. + +

The specified character encoding name is different from + the character encoding of the document. The document + is non‐conforming.

+
+
+ Browsing context name @@ -1043,7 +1279,7 @@ The document is non-conforming.

For example, the table below is non-conforming: -

<table>
+      
<table>
 <tbody>
 <tr><td rowspan=2></td></tr>
 </tbody>
@@ -1210,7 +1446,8 @@
     text/cache-manifest must contain a cache manifest.

A cache manifest must start with a line whose content is - CACHE MANIFEST (exactly one space character between + CACHE MANIFEST + (exactly one space character between CACHE and MANIFEST).
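The first-line requirement can be sketched as below. This Python example follows only the sentence above and does not implement the full cache manifest syntax, which may be more lenient about a BOM or trailing whitespace. The function name is hypothetical.

```python
def starts_with_cache_manifest(text):
    """True if the first line is exactly CACHE MANIFEST with one space."""
    first_line = text.splitlines()[0] if text else ""
    return first_line == "CACHE MANIFEST"
```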

@@ -1359,6 +1596,6 @@ and/or modify it under the same terms as Perl itself.

- + \ No newline at end of file