This is a simplified implementation of the HTML5 parsing algorithm. It implements only the script-related part of the algorithm. In particular, this parser:
DOCTYPE and comment tokens.
", ', and & in script src attribute value.
title and textarea).
pre, listing, and textarea elements.
<!--..--> parsing rule in CDATA/RCDATA elements.
script type text/javascript. type and language attributes are ignored.
document.write ("string", ["string", ...]);.
var s = document.createElement ("script");
s.src = "string";
document.documentElement.appendChild (s);
w (document.documentElement.innerHTML); (This statement
can be used to dump the document, even when the document has no
document element. The output format is the tree dump format used
in html5lib test data, not HTML.)
's instead of "s.
javascript: URI scheme in the src attribute of the script element.
In addition, the URI must conform to the regular expression
^javascript:\s*(?:"[^"]*"|'[^']*')\s*$.
\uHHHH escapes in JavaScript string literals.
document.open () call. In other words, delayed (deferred or
asynchronous) script executions and event firings might be handled
incorrectly if a document.open () invocation is performed implicitly
by document.write () in a delayed script.
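As an illustration of two items in the list above, the following sketch tests a src value against the quoted regular expression and decodes \uHHHH escapes in the body of a string literal. It is not taken from the parser itself: the helper names isSupportedScriptSrc and decodeUnicodeEscapes are hypothetical, and the decoder makes the simplifying assumption that a backslash before "u" is never itself escaped.

```javascript
// Regular expression quoted from the list above.
var JAVASCRIPT_URI = /^javascript:\s*(?:"[^"]*"|'[^']*')\s*$/;

// Hypothetical helper (not part of the parser): reports whether a
// script src value matches the quoted regular expression.
function isSupportedScriptSrc (src) {
  return JAVASCRIPT_URI.test (src);
}

// Hypothetical helper: replaces each \uHHHH escape (exactly four
// hexadecimal digits) with the corresponding UTF-16 code unit and leaves
// all other text untouched. Simplification: an escaped backslash
// followed by "uHHHH" is not treated specially.
function decodeUnicodeEscapes (literalBody) {
  return literalBody.replace (/\\u([0-9A-Fa-f]{4})/g, function (match, hex) {
    return String.fromCharCode (parseInt (hex, 16));
  });
}
```

For example, isSupportedScriptSrc ('javascript:"abc"') returns true, while a value such as 'javascript:alert(1)' does not match, because the pattern accepts only a single quoted string after the scheme.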
For some reason, this parser does not work in browsers that do not support JavaScript 1.5.