HTML5 Character Encoding Errors

+ + + Character encoding $0 + is not allowed for HTML document. + +

The character encoding used for the document is not allowed + for HTML documents. The document is non‐conforming.

+ + + + + Character encoding $0 + should not be used for HTML document. + +

The character encoding used for the document is not recommended + for HTML documents. The document is non‐conforming + unless there is a good reason to use that encoding.

+ + + + + Use of UTF-8 is encouraged. + +

Use of UTF-8 as the character encoding of the document is encouraged, + though the use of another character encoding is conforming.

+ + + + + Conformance for character encoding requirements + cannot be checked. + +

The conformance checker cannot detect whether the input document + meets the requirements on character encoding, since the document + is not input as a serialized byte sequence. The document is + not conforming if it is not encoded in an appropriate character + encoding with appropriate labeling.

+ + + + + There is no character encoding + declaration. + +

The document does not contain a character encoding + declaration. Unless the character encoding is explicitly + specified in a lower‐level protocol, e.g. in HTTP, + or is implied by a BOM, there must be a character + encoding declaration. The document is non‐conforming.

+ +

The long character encoding declaration syntax + <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> + is obsolete. The new syntax is:

<meta charset="charset-name">

+ +

Note that the encoding declaration in an XML + declaration has no effect in an HTML document.

+ + + + + No character encoding metadata is found + in lower‐level protocol nor is there BOM, while + character encoding $0 + is not a superset of ASCII. + +

The document is not labeled with a character encoding name + in a lower‐level protocol, e.g. in HTTP, and + the document does not begin with a BOM. In addition, + the character encoding of the document is not a superset of + ASCII. The document is non‐conforming.
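The BOM check mentioned above can be sketched as follows. This is only a minimal illustration, not the conformance checker's actual logic: the helper name `sniff_bom` is hypothetical, and limiting the check to the UTF-8 and UTF-16 BOMs is an assumption made for brevity.

```python
import codecs
from typing import Optional

# Assumption: only these three BOMs are considered in this sketch.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A document for which `sniff_bom` returns `None` and which has no charset label in a lower‐level protocol would fall under the situation described here.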

+ +

Unless there is a BOM, the character encoding + for the document must be specified at e.g. the HTTP level, + as:

Content-Type: text/html; charset=charset-name

+ +

The existence of an HTML character encoding declaration, i.e. + <meta charset="charset-name">, + does not make it permissible to omit the charset parameter + for an HTML document encoded in an encoding + that is not ASCII‐compatible.

+ +

Character encodings Shift_JIS, Windows-31J, + and ISO-2022-JP are not supersets of + ASCII for the purpose of HTML conformance.

+ + + + + While parsing the document as + $0, a character encoding declaration specifying + character encoding as $1 is found. The document + is reparsed. + +

While parsing a document in a character encoding, + a character encoding declaration which declares the character + encoding of the document as another character encoding is found. + The occurrence of this warning itself does not make the document + non‐conforming. However, the failure of the first attempt + to detect the character encoding might be the result of non‐conformance + of the document.

+ +

The document will be reparsed from the beginning. Some errors + or warnings might be reported again.

+ +

These are suggestions to avoid this warning:

Specify the charset parameter in the Content-Type + field in the HTTP header, as: +
```
Content-Type: text/html; charset="charset-name"
```
Put the character encoding declaration + (<meta charset="charset-name">) + just after the <head> start tag.
Use UTF-8.
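To check what charset a Content-Type field value actually carries, a small sketch using Python's standard `email.message` module may help. The helper name `charset_from_content_type` is hypothetical, and this is not how the conformance checker itself reads the header.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, if any."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() unquotes the parameter and lowercases it;
    # it returns None when no charset parameter is present.
    return msg.get_content_charset()
```

For example, `charset_from_content_type('text/html; charset="UTF-8"')` yields `"utf-8"`, while a bare `text/html` yields `None`, which is the case where the parser has to fall back to the declaration in the document.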

+ + +

HTML5 Parse Errors in Tokenization Stage

@@ -48,17 +176,16 @@ The & character must introduce a reference. -

An & (U+0026 - AMPERSAND) character which +

An & character which is not part of any reference appears in the input stream. - The document is non-conforming.

+ The document is non‐conforming.

Any & character in URI (or IRI) - must be escaped as &amp;.

Any & character in URI (or IRI) + must be escaped as &amp;.
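As an illustration of the escaping rule, the following sketch uses Python's standard `html.escape` to prepare a URI for use in an attribute value. The helper name `attribute_value` is hypothetical.

```python
from html import escape


def attribute_value(url: str) -> str:
    """Escape a URI for use inside a quoted HTML attribute value."""
    # escape() turns "&" into "&amp;"; quote=True also escapes quote characters.
    return escape(url, quote=True)
```

With this, `attribute_value("http://example.com/?a=1&b=2")` gives `http://example.com/?a=1&amp;b=2`, which no longer contains a bare & character.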

The & character must be the first character of a reference: -

An XBL binding cannot be associated by PI in HTML document. Use binding property in CSS - style sheet.
<?xml?> (XML declaration): XML declaration is unnecessary for HTML documents.
<?xml-stylesheet?> (XML style sheet @@ -327,7 +466,9 @@: Use HTML link element with rel attribute set to stylesheet (or, alternate stylesheet for an alternate style - sheet).
<?php?> or <? ... PHP code ... ?> (PHP code): Though the element is void in earlier versions of Safari, the canvas element is no longer defined as empty. There must be an end tag - </canvas>.

Note that misnesting tags, such as @@ -540,7 +681,8 @@

The document contains a DOCTYPE declaration that is different from HTML5 DOCTYPE (i.e. - <!DOCTYPE HTML>). The document is non-conforming.

+ <!DOCTYPE HTML>). + The document is non‐conforming.

The document might or might not be conformant to some version of HTML. However, conformance to any HTML @@ -672,11 +814,11 @@ block-level content, any inline-level content must be put in e.g. paragraph element such as p.

For example, an HTML document fragment - <div><p>Hello!</p> World!</div> + <div><p>Hello!</p> World!</div> is non-conforming, since a word World! does not belong to any paragraph. (If not part of any paragraph, what is it!?) A conforming example would be: -

<div><p>Hello!</p> <p>World!</p></div>

<div><p>Hello!</p> <p>World!</p></div>

If the parent element does not allow block-level elements as content

@@ -720,18 +862,19 @@

html element in an XHTML document

In an XHTML document, the root html - element must have an xmlns attribute - whose value is set to - http://www.w3.org/1999/xhtml.

In an XHTML document, the root html + element must have an xmlns attribute as: +

<html xmlns="http://www.w3.org/1999/xhtml">

rss element

The document is written in some version of RSS.

The conformance checker does not support any version of RSS. Use Atom 1.0 for feed documents.

feed element

The Atom feed element must be - in the http://www.w3.org/2004/Atom - namespace.

+ in the http://www.w3.org/2005/Atom + namespace as: +

<feed xmlns="http://www.w3.org/2005/Atom">

The conformance checker does not support Atom 0.3. Use Atom 1.0 for feed documents.

@@ -874,6 +1017,87 @@

Attribute Value Errors

+ + Character encoding name $0 + is not registered. + +

The specified character encoding name is not registered with + IANA. Use of a registered character encoding name + is a good practice to facilitate interoperability.

+ +

EUC-TW: EUC-TW is not registered. Unfortunately, there + is no registered name for that character encoding. Use + Big5 encoding with character encoding name Big5 + if it is enough to represent the document.
ISO-2022-JP-1: ISO-2022-JP-1 is not registered, nevertheless + this character encoding name is documented in + RFC 2237. Use + ISO-2022-JP-2 instead, since that character encoding + is a superset of ISO-2022-JP-1.
ISO-2022-JP-3, ISO-2022-JP-3-plane1: These names are not registered and are obsoleted in favor of + ISO-2022-JP-2004 and + ISO-2022-JP-2004-plane1.
ISO-2022-JP-2003, + ISO-2022-JP-2003-plane1: These names are not registered and have been corrected to + ISO-2022-JP-2004 and + ISO-2022-JP-2004-plane1.
ISO-2022-JP-2004, + ISO-2022-JP-2004-plane1: These names are not registered. Unfortunately, there is + no registered name for these character encodings.
UTF-8N: UTF-8N is not registered. Character encoding + name UTF-8 represents UTF-8 encoding with or + without BOM.

+ +

WARNING: This error might be raised for + a registered character encoding name, since the character encoding + name database of the conformance checker is not complete yet.

+ + + + + $0 is a private + character encoding name. + +

The specified character encoding name is a private name and is + not registered with IANA. Use of a registered character + encoding name is a good practice to facilitate interoperability.

+ +

x-euc-jp: Use EUC-JP for the Japanese EUC + character encoding.
x-sjis: Use Shift_JIS for standard Shift encoding scheme of + JIS coded character set, or Windows-31J + for Microsoft standard character set as implemented by + Microsoft Windows.

+ + + + + The specified value is syntactically not a + character encoding name. + +

The attribute value must be a character encoding name. However, + the specified value is not a character encoding name syntactically. + The document is non‐conforming.

A character encoding name is a string of printable ASCII + characters, up to 40 characters long.
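The syntactic rule above can be sketched as a simple check. The function name `is_charset_name_syntax` is hypothetical, and two details are assumptions not stated in the text: the empty string is rejected, and the space character (U+0020) is not counted as printable.

```python
MAX_LENGTH = 40  # upper bound stated in the description above


def is_charset_name_syntax(value: str) -> bool:
    """Check whether value looks syntactically like a character encoding name."""
    # Assumption: at least one character is required.
    if not 1 <= len(value) <= MAX_LENGTH:
        return False
    # Assumption: "printable" means U+0021 to U+007E, i.e. space is excluded.
    return all(0x21 <= ord(ch) <= 0x7E for ch in value)
```

Note that this only tests the syntax; a name that passes may still be unregistered or private, which are the separate cases described earlier in this section.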

+ + + This attribute only allows a limited set of @@ -886,8 +1110,8 @@

HTML meta element, http-equiv attribute

Only Default-Style and Refresh - is allowed.

Only values Default-Style and Refresh + are allowed.

Value Content-Type is obsolete; for charset declaration, the charset attribute can be used as:

<meta charset="charset-name">

@@ -907,16 +1131,16 @@ - Charset declaration syntax - <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> + Character encoding declaration syntax + <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> is obsolete. -

Old long charset declaration syntax - <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> +

Old long character encoding declaration syntax + <meta http-equiv="Content-Type" content="text/html; charset=charset-name"> is in use. The document is non‐conforming.

The new charset declaration syntax is: -

<meta charset="charset-name">

The new character encoding declaration syntax is: +

<meta charset="charset-name">

@@ -970,6 +1194,18 @@ + + Character encoding name $1 + is different from document character encoding + $0. + +

The specified character encoding name is different from + the character encoding of the document. The document + is non‐conforming.
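A comparison of the declared name against the document's encoding has to tolerate case differences and aliases. The sketch below leans on Python's codec registry for that normalization, which is an assumption of this example: Python's alias table is neither the IANA registry nor the conformance checker's own database, and the helper name `same_encoding` is hypothetical.

```python
import codecs


def same_encoding(declared: str, actual: str) -> bool:
    """Compare two encoding names after normalizing them via Python's codec registry."""
    try:
        return codecs.lookup(declared).name == codecs.lookup(actual).name
    except LookupError:
        # Names unknown to Python are treated as a mismatch in this sketch.
        return False
```

Under these assumptions, `same_encoding("UTF-8", "utf8")` is true, while `same_encoding("UTF-8", "Shift_JIS")` is false and corresponds to the error described here.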

+ + + Browsing context name @@ -1043,7 +1279,7 @@ The document is non-conforming.

For example, the table below is non-conforming: -

<table>
+      <table>
 <tbody>
 <tr><td rowspan=2></td></tr>
 </tbody>
@@ -1210,7 +1446,8 @@
     text/cache-manifest must contain a cache manifest.
 
     A cache manifest must start with a line whose content is 
-    CACHE MANIFEST (exactly one space character between
+    CACHE MANIFEST
+    (exactly one space character between
     CACHE and MANIFEST).
   
 
@@ -1338,7 +1575,7 @@
 It does not affect the conformance of the document, and
 may sometimes be inappropriate.
 
-
+
 Not supported
 Unknown.
 Some feature that is not supported by the conformance checker
@@ -1359,6 +1596,6 @@
 and/or modify it under the same terms as Perl itself.


 
-
+
 
 
\ No newline at end of file

Description of Errors
