2009-09-06  Wakaba

	* Serializer.pm (get_inner_html): Added |keygen| to the list of
	void elements (HTML5 revision 2960).

2009-09-05  Wakaba

	* Tokenizer.pm.src: Changed to keep non-normal character
	references as is (HTML5 revision 3374).

2009-09-05  Wakaba

	* Tokenizer.pm.src: Discard unclosed tags (HTML5 revision 2990).

2009-09-05  Wakaba

	* Tokenizer.pm.src (_get_next_token): Implemented the "comment
	end space state" (HTML5 revision 3195).

2009-09-05  Wakaba

	* Tokenizer.pm.src (_get_next_token): Implemented the "comment
	end bang state" (HTML5 revision 3191).

2009-08-16  Wakaba

	* Tokenizer.pm.src: Any "<" character in an attribute name is now
	a parse error (HTML5 revision 3354).

2009-08-16  Wakaba

	* Tokenizer.pm.src: Lowercase-fold doctype names (HTML5 revision
	2501, cf. HTML5 revision 3571).

2009-07-05  Wakaba

	* Tokenizer.pm.src: Reduced the number of parse errors on broken
	DOCTYPE (HTML5 revision 3121).

2009-07-03  Wakaba

	* Tokenizer.pm.src: Reduced a parse error (HTML5 revision 3194).

2009-07-03  Wakaba

	* Tokenizer.pm.src: "<" in unquoted attribute values is now
	treated as a parse error (HTML5 revision 3206).

2008-11-07  Wakaba

	* Dumper.pm (dumptree): Support for namespace abbreviation for
	SWML namespaces.

2008-10-19  Wakaba

	* Tokenizer.pm.src: Normalize white space characters in attribute
	value literals in XML documents.  Don't apply character reference
	mapping table for non-NULL non-surrogate code points.

2008-10-19  Wakaba

	* Tokenizer.pm.src: Set the "stop_processing" flag true when a
	parameter entity occurs in a standalone="no" document.

2008-10-19  Wakaba

	* Tokenizer.pm.src: Column number counting fixed.

2008-10-19  Wakaba

	* Tokenizer.pm.src: Raise a parse error for '&' that does not
	introduce a reference in XML.  Support for non-ASCII entity
	reference names.

2008-10-19  Wakaba

	* Tokenizer.pm.src: Make uppercase "&#X" in XML a parse error.
	Remove the limitation of entity name length.  Enable replacement
	of text-only general entities.
	Raise a parse error for an unparsed entity reference.  Raise a
	parse error for a general entity reference to an undefined entity.

2008-10-19  Wakaba

	* Tokenizer.pm.src: Support for .
	(AFTER_NOTATION_NAME_STATE): Renamed as |AFTER_MD_DEF_STATE|
	(i.e. after markup declaration definition state).

2008-10-19  Wakaba

	* Tokenizer.pm.src: Support for EntityValue.

2008-10-19  Wakaba

	* Dumper.pm: Dump text content of Entity nodes.

	* Tokenizer.pm.src: Support for .

2008-10-19  Wakaba

	* Tokenizer.pm.src (_get_next_token): Make keywords 'ENTITY',
	'ELEMENT', 'ATTLIST', and 'NOTATION' ASCII case-insensitive.

2008-10-18  Wakaba

	* Tokenizer.pm.src: Modified PUBLIC/SYSTEM identifier tokenizer
	states such that and can be tokenized by those states as well.
	(BOGUS_MD_STATE): A new state; used for bogus markup
	declarations, in favor of BOGUS_COMMENT_STATE.

2008-10-18  Wakaba

	* Tokenizer.pm.src: in the internal subset of an XML document, is
	now fully implemented.

	* Dumper.pm (dumptree): Output allowed tokens and default value
	always.

2008-10-17  Wakaba

	* Tokenizer.pm.src: New token types ATTLIST_TOKEN, ELEMENT_TOKEN,
	GENERAL_ENTITY_TOKEN, PARAMETER_ENTITY_TOKEN, and NOTATION_TOKEN
	are added.  New insertion modes for markup declarations are
	added.

2008-10-16  Wakaba

	* Tokenizer.pm.src: New token type END_OF_DOCTYPE_TOKEN added.
	New states DOCTYPE_TAG_STATE and
	BOGUS_DOCTYPE_INTERNAL_SUBSET_AFTER_STATE are added.  (A bogus
	string after the internal subset, which was handled by the state
	BOGUS_DOCTYPE_STATE, is now handled by the new state.)  Support
	for comments, bogus comments, and processing instructions in the
	internal subset.  If there is an internal subset, then emit the
	doctype token before the internal subset (with its
	$token->{has_internal_subset} flag set) and an
	END_OF_DOCTYPE_TOKEN after the internal subset.

2008-10-15  Wakaba

	* Tokenizer.pm.src: $self->{s_kwd} for non-DATA_STATE states is
	renamed to $self->{kwd} to avoid conflicts.
	Don't raise a case-sensitivity error for the keyword "DOCTYPE" in
	HTML mode.  Support for internal subsets (internal subset itself
	only; no declaration in them is supported yet).  Raise a parse
	error for non-uppercase keywords "PUBLIC" and "SYSTEM" in XML
	mode.  Raise a parse error if no system identifier is specified
	for a DOCTYPE declaration with a public identifier.  Don't close
	the DOCTYPE declaration by a ">" character in the system
	declaration in XML mode.

2008-10-15  Wakaba

	* Tokenizer.pm.src: Set index attribute to each attribute token,
	for ignoring namespaced duplicate attributes at the XML namespace
	parser layer.  Raise a parse error if the attribute value is
	omitted, in XML mode.  Raise a parse error if the attribute value
	is not quoted, in XML mode.  Raise a parse error if a "<"
	character is found in a quoted attribute value, in XML mode.

2008-10-15  Wakaba

	* Tokenizer.pm.src: XML tag name start character support for end
	tags.  Support for the short end tag syntax of XML5.  Raise a
	parse error for a lowercase in XML.

2008-10-15  Wakaba

	* Tokenizer.pm.src: XML tag name start character support for
	start tags.

2008-10-15  Wakaba

	* Tokenizer.pm.src: Support for XML processing instructions.

2008-10-15  Wakaba

	* Tokenizer.pm.src: Mark CHARACTER_TOKEN with character reference
	as such, for the support of XML parse error.

2008-10-14  Wakaba

	* Tokenizer.pm.src: Parse error if CDATA section is not closed or
	is placed outside of the root element.

2008-10-14  Wakaba

	* Tokenizer.pm.src: Raise a parse error for XML "]]>" other than
	CDATA section end.

2008-10-14  Wakaba

	* Tokenizer.pm.src: Support for case-insensitive XML attribute
	names.

2008-10-14  Wakaba

	* Dumper.pm: Typo fixed.

2008-10-14  Wakaba

	* Dumper.pm: New module.

2008-10-14  Wakaba

	* Tokenizer.pm.src: Introduced "in_xml" flag for CDATA section
	support in XML.

2008-10-14  Wakaba

	* Tokenizer.pm.src: Make *_TOKEN (token type constants)
	exportable.
	New token types, PI_TOKEN for XML and ABORT_TOKEN for
	document.write() or incremental parsing, are added for future
	extensions.

2008-10-14  Wakaba

	* Tokenizer.pm.src: New file.

2008-05-24  Wakaba

	* Serializer.pm (get_inner_html): Don't escape |"| in content
	(HTML5 revision 1592).

2008-05-24  Wakaba

	* Serializer.pm (get_inner_html): Append "\n" after the start tag
	of a |listing| element (HTML5 revision 1675).

2008-03-02  Wakaba

	* Serializer.pm (get_inner_html): Typo fixed.

2008-03-01  Wakaba

	* Serializer.pm (get_inner_html): Escape NBSP (HTML5 revision
	1277).

2007-11-11  Wakaba

	* Serializer.pod: New file.

	* Makefile: New file.

2007-11-11  Wakaba

	* Serializer.pm: New module (split from ../HTML.pm.src).

2007-11-11  Wakaba

	* ChangeLog: New file.