--- test/html-webhacc/error-description-source.xml 2007/11/07 12:20:44 1.15 +++ test/html-webhacc/error-description-source.xml 2007/11/18 11:05:12 1.16 @@ -11,6 +11,121 @@

Description of Errors

+
+

HTML5 Character Encoding Errors

+ + + Character encoding $0 + is not allowed for HTML document. + +

The character encoding used for the document is not allowed + for HTML documents. The document is non‐conforming.

+
+
+ + + Character encoding $0 + should not be used for HTML document. + +

The character encoding used for the document is not recommended + for HTML documents. The document is non‐conforming + unless there is a good reason to use that encoding.

+
+
+ + + Use of UTF-8 is encouraged. + +

Use of UTF-8 as the character encoding of the document is encouraged, + though the use of another character encoding is conforming.

+
+
+ + + There is no character encoding + declaration. + +

The document does not contain a character encoding + declaration. Unless the character encoding is explicitly + specified in an upper‐level protocol, e.g. in HTTP, + or is implied by a BOM, there must be a character + encoding declaration. The document is non‐conforming.

+ +

The long character encoding declaration syntax + <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> + is obsolete. The new syntax is:

+
<meta charset="charset-name">
+ +

Note that the encoding declaration in an XML + declaration has no effect on an HTML document.
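
The distinction between the obsolete long syntax and the new short syntax can be illustrated with a minimal Python sketch. It is not the checker's implementation: the function name `classify_declaration` and both regular expressions are assumptions made for illustration, and a real checker would inspect parsed attributes rather than match raw markup.

```python
import re

# Hypothetical patterns for illustration only; they match raw markup,
# not a parsed attribute list.
NEW_SYNTAX = re.compile(r'<meta\s+charset\s*=\s*["\']?[^"\'\s>]+', re.I)
OLD_SYNTAX = re.compile(
    r'<meta\s+http-equiv\s*=\s*["\']?content-type["\']?'
    r'\s+content\s*=\s*["\']?[^"\'>]*charset\s*=',
    re.I,
)


def classify_declaration(html):
    """Return "new", "obsolete", or "missing" for a markup string."""
    if NEW_SYNTAX.search(html):
        return "new"
    if OLD_SYNTAX.search(html):
        return "obsolete"
    return "missing"
```

For example, `classify_declaration('<meta charset="utf-8">')` is classified as the new syntax, while the `http-equiv` form is classified as obsolete.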

+
+
+ + + No character encoding metadata is found + in the upper‐level protocol, nor is there a BOM, while + character encoding $0 + is not a superset of ASCII. + +

The document is not labeled with a character encoding name + in an upper‐level protocol, e.g. in HTTP, and + the document does not begin with a BOM. In addition, + the character encoding of the document is not a superset of + ASCII. The document is non‐conforming.

+ +

Unless there is a BOM, the character encoding + for the document must be specified at a higher level, e.g. in HTTP, + as:

+
Content-Type: text/html; charset=charset-name
+ +

The existence of an HTML character encoding declaration, i.e. + <meta charset="charset-name">, + does not permit omitting the charset parameter + for an HTML document encoded in a non‐ASCII + compatible encoding.

+ +

The character encodings Shift_JIS, Windows-31J, + and ISO-2022-JP are not supersets of + ASCII for the purpose of HTML conformance.
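
The condition described above (no charset in the upper-level protocol, no BOM, and an encoding that is not an ASCII superset) can be sketched in Python as follows. This is an illustrative sketch, not the checker's code: the names `http_charset` and `is_unlabeled_non_ascii` are hypothetical, the BOM list is an assumption limited to UTF-8 and UTF-16, and the encoding set contains only the three names the text lists.

```python
from email.message import Message

# Encodings the text lists as not being ASCII supersets for HTML conformance.
NOT_ASCII_SUPERSET = {"shift_jis", "windows-31j", "iso-2022-jp"}

# Byte order marks for UTF-8, UTF-16LE and UTF-16BE (assumed list).
BOMS = (b"\xef\xbb\xbf", b"\xff\xfe", b"\xfe\xff")


def http_charset(content_type):
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def is_unlabeled_non_ascii(body, content_type, encoding):
    """True when the document is unlabeled, has no BOM, and uses a listed encoding."""
    labeled = http_charset(content_type) is not None
    has_bom = body.startswith(BOMS)
    return not labeled and not has_bom and encoding.lower() in NOT_ASCII_SUPERSET
```

Note that the sketch never looks at a `<meta charset>` declaration inside `body`; that mirrors the rule that such a declaration does not replace the `charset` parameter for these encodings.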

+
+
+ + + While parsing the document as + $0, a character encoding declaration specifying + character encoding as $1 is found. The document + is reparsed. + +

While parsing a document in a character encoding, + a character encoding declaration which declares the character + encoding of the document as another character encoding is found. + The occurrence of this warning itself does not make the document + non‐conforming. However, the failure of the first attempt + to detect the character encoding might be the result of non‐conformance + of the document.

+ +

The document will be reparsed from the beginning. Some errors + or warnings might be reported again.

+ +

These are suggestions to avoid this warning:

+
    +
  • Specify charset parameter in the Content-Type + field in the HTTP header, as: +
    Content-Type: text/html; charset="charset-name"
  • +
  • Put the character encoding declaration + (<meta charset="charset-name">) + just after <head> start tag.
  • +
  • Use UTF-8.
  • +
+
+
+
+
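
The reparse behaviour described for the "declaration found while parsing as $0" warning can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the parser's algorithm: `decode_with_reparse` and `META_CHARSET` are hypothetical names, the byte-level scan assumes an ASCII-compatible tentative encoding, and only the short `<meta charset>` syntax is recognised.

```python
import codecs
import re

# Hypothetical byte-level pattern for the short declaration syntax.
META_CHARSET = re.compile(rb'<meta\s+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.I)


def decode_with_reparse(body, tentative):
    """Decode body, switching encodings if a declaration disagrees with tentative."""
    warnings = []
    encoding = tentative
    match = META_CHARSET.search(body)
    if match:
        declared = match.group(1).decode("ascii")
        try:
            # Compare normalized codec names so "UTF8" and "utf-8" count as equal.
            same = codecs.lookup(declared).name == codecs.lookup(tentative).name
        except LookupError:
            same = True  # unknown label: keep the tentative encoding
        if not same:
            warnings.append(
                f"parsed as {tentative}, declaration says {declared}; reparsing"
            )
            encoding = declared
    return body.decode(encoding, errors="replace"), encoding, warnings
```

When the tentative encoding already agrees with the declaration, for instance because the HTTP `charset` parameter is set as suggested above, the warning list stays empty and no second pass is needed.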

HTML5 Parse Errors in Tokenization Stage

@@ -922,15 +1037,15 @@ - Charset declaration syntax + Character encoding declaration syntax <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> is obsolete. -

Old long charset declaration syntax +

Old long character encoding declaration syntax <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> is in use. The document is non‐conforming.

-

The new charset declaration syntax is: +

The new character encoding declaration syntax is:

<meta charset="charset-name">

@@ -1375,6 +1490,6 @@ and/or modify it under the same terms as Perl itself.

- + \ No newline at end of file