2008-12-12  Wakaba

	* ContentChecker.pm: Introduced the |check_attrs2| method to ease defining code for checking required attributes and the like.

2008-12-12  Wakaba

	* IMTChecker.pm: Added more definitions for subtypes.

2008-12-11  Wakaba

	* URIChecker.pm: Some of the |pos_end| values were wrong.

2008-12-06  Wakaba

	* ContentChecker.pm (check_element): Added support for "no referenced datalist" error.

2008-12-06  Wakaba

	* URIChecker.pm: Bug fix: It did not work unless Message::DOM::DOMImplementation had been |require|d.

2008-12-06  Wakaba

	* NanoDOM.pm (document_uri): New attribute.
	* ContentChecker.pm: Don't use methods not implemented by NanoDOM.

2008-11-07  Wakaba

	* NanoDOM.pm (text_content): Don't create a Text node if the new value is empty.

2008-11-06  Wakaba

	* SWML/: New directory.

2008-10-20  Wakaba

	* NanoDOM.pm (specified, all_declarations_processed, manakai_attribute_type): New attributes.

2008-10-19  Wakaba

	* NanoDOM.pm (Entity->new): Initialize ->child_nodes as an empty array.

2008-10-19  Wakaba

	* NanoDOM.pm (notation_name): New attribute.

2008-10-18  Wakaba

	* NanoDOM.pm (public_id, system_id): New attributes.

2008-10-18  Wakaba

	* NanoDOM.pm (text_content): Moved to Node from Element. Setter implemented.
	(allowed_tokens, default_type, declared_type): Implemented.

2008-10-17  Wakaba

	* NanoDOM.pm (node_name): New attribute.
	(ELEMENT_TYPE_DEFINITION_NODE, ATTRIBUTE_DEFINITION_NODE): New constants.
	(create_element_type_definition_node, create_attribute_definition, create_notation, create_general_entity, get_element_type_definition_node, set_element_type_definition_node, get_general_entity_node, set_general_entity_node, get_notation_node, set_notation_node, get_attribute_definition_node, set_attribute_definition_node): New methods.
	(element_types, entities, notations, attribute_definitions): New attributes.
	(DocumentType): Support for child nodes, entities, notations, and element types.
	(Entity, Notation, ElementTypeDefinition, AttributeDefinition): New classes.
	* Dumper.pm: Support for general entities, notations, element type definitions, and attribute definitions.

2008-10-15  Wakaba

	* NanoDOM.pm (create_processing_instruction): New method.
	(xml_version, xml_encoding, xml_standalone): New attributes.
	(ProcessingInstruction): New class.

2008-10-14  Wakaba

	* HTML.pm.src: Handling of end tags in the foreign content insertion mode was partially wrong, because of wrong bit operations.

2008-10-14  Wakaba

	* NanoDOM.pm (dom_config): New attribute (does nothing), for Whatpm::XML::Parser support.

2008-10-14  Wakaba

	* Makefile: New rule to make HTML/Tokenizer.pm is added.
	* HTML.pm.src: Tokenizer part moved to another file.

2008-10-13  Wakaba

	* HTML.pm.src: Merge |DT_EL| and |DD_EL| as |DTDD_EL|.

2008-10-13  Wakaba

	* HTML.pm.src: Element category constants redefined.

2008-10-13  Wakaba

	* HTML.pm.src: Steps for CDATA/RCDATA elements in tree construction stage synced with the spec (HTML5 revisions 2139 and 2302).

2008-10-07  Wakaba

	* ContentChecker.pm: New error level "html5_fact" added, which should be tentatively used until all of the requirements are properly specced as RFC 2119 "MUST" in HTML5.

2008-10-05  Wakaba

	* ContentChecker.pod: Note on internal flags is added.

2008-10-05  Wakaba

	* HTML.pm.src: An AAA bug fixed.

2008-10-04  Wakaba

	* HTML.pm.src: If another node is inserted by the parser, don't reuse existing Text node to append a character (HTML5 revision 2124).

2008-10-04  Wakaba

	* HTML.pm.src: Support for in body (HTML5 revisions 1731 and 2128).

2008-10-04  Wakaba

	* HTML.pm.src: Make scoping (HTML5 revision 1837). Support for end tags of camelCase SVG elements was broken. A wrong error type text fixed.

2008-10-04  Wakaba

	* HTML.pm.src: Drop redundant code (HTML5 revision 1731).

2008-10-04  Wakaba

	* HTML.pm.src: Support for new definition of |param| and |source| start tag parsing (HTML5 revision 1731).

2008-10-04  Wakaba

	* HTML.pm.src:
	[...] steps reimplemented (HTML5 revision 1731).

2008-10-04  Wakaba

	* HTML.pm.src: [...], [...], and [...] steps reimplemented (HTML5 revisions 1731 and 1831).

2008-10-04  Wakaba

	* HTML.pm.src: Support for new flow (but not phrasing) elements (HTML5 revisions 1731 and 1778). Support for the end tag (HTML5 revision 1731).

2008-10-04  Wakaba

	* HTML.pm.src: Support for |command| and |eventsource| elements (HTML5 revision 1731). End tags of |option| and |optgroup| elements are now optional (HTML5 revision 1731).

2008-10-04  Wakaba

	* HTML.pm.src: New "special" elements added to the list (HTML5 revision 1778). "strile" -> "strike".

2008-10-02  Wakaba

	* ContentType.pm (get_sniffed_type): Support for the "better" content sniffing (HTML5 revision 1927). In one case the official type was not returned when the method was invoked in list context.

2008-09-22  Wakaba

	* HTML.pm.src: Character references for non-space C0 characters, including U+000B VT, DEL character, noncharacter code points, are now converted to the U+FFFD character (cf. HTML5 revision 2138).

2008-09-21  Wakaba

	* ContentChecker.pm: |form=""| check support added.

2008-09-21  Wakaba

	* ContentChecker.pm: |contextmenu| validity is now checked using |id| and |id_type| properties, and the |menu| property is removed.

2008-09-21  Wakaba

	* ContentChecker.pm: Prepare for |form| |name| attribute's duplication checking.

2008-09-21  Wakaba

	* HTML.pm.src (parse_byte_stream): Support (or non-support) for unsupported charset="" parameter value (HTML5 revision 2131).

2008-09-20  Wakaba

	* HTML.pm.src: Remaining places where U+000B was allowed as a space character are fixed (cf. HTML5 revision 1738).
	* ContentChecker.pm, HTMLTable.pm: U+000B is no longer part of space characters (HTML5 revision 1738).

2008-09-20  Wakaba

	* HTML.pm.src: The "anything else" case for the "after after body" insertion mode was not updated to switch to the "in body" insertion mode. U+000B is no longer a space character for the purpose of tree construction phase (HTML5 revision 1738).

2008-09-20  Wakaba

	* HTML.pm.src: U+000B is no longer a space character (HTML5 revision 1738).

2008-09-20  Wakaba

	* ContentType.pm: 0x0B is no longer a space character (HTML5 revision 1738).
	* HTML.pm.src: U+000B is no longer a space character for the algorithm for extracting an encoding from a Content-Type (HTML5 revision 1738).

2008-09-20  Wakaba

	* ContentChecker.pm ($IsInHTMLInteractiveContent): New.

2008-09-18  Wakaba

	* LangTag.pm: Add checks for remaining requirements from RFC 4646.
	* mklangreg.pl: Sort 'Prefix' values by their length, to ease matching.

2008-09-18  Wakaba

	* LangTag.pm: Warn for private use language subtags. Error level typos fixed. Support for Suppress-Script field.
	* mklangreg.pl: Support for dumping of nested structure.

2008-09-18  Wakaba

	* LangTag.pm (check_rfc4646_langtag): Check if a tag is in the recommended case as per RFC 4646.

2008-09-18  Wakaba

	* LangTag.pm (check_rfc4646_langtag): New method.

2008-09-18  Wakaba

	* mklangreg.pl: New script.
	* Makefile: Updated for creation of the module for language subtag registry.

2008-09-16  Wakaba

	* Makefile: WebIDL.html added.
	* WebIDL.pod: New documentation.

2008-09-16  Wakaba

	* WebIDL.pm: Checker's error types are redefined.

2008-09-16  Wakaba

	* WebIDL.pm: Parser's error types are redefined. Some forward compatible parsing bugs are fixed. Some unreachable code is commented out.

2008-09-16  Wakaba

	* WebIDL.pm: Support for the remaining extended attributes is added. A forward interface declaration that has an extended attribute does not satisfy the definition. It seems that, unless explicitly allowed, multiple extended attributes with the same name are not allowed, though this is not explicitly mentioned in the spec.

2008-09-16  Wakaba

	* WebIDL.pm: Unescape extended attribute names and extended attribute identifiers. Preserve whether an extended attribute has an argument list or not. Support for extended attributes: Constructor, ExceptionConsts, IndexGetter, IndexSetter, NameGetter, NameSetter, and Null.
	(has_argument_list): New attribute.
	(idl_text): Stringify the argument list, if any, even if it is empty.

2008-09-15  Wakaba

	* HTML.pm.src: New state |PCDATA_STATE|. Use an empty string for |{s_kwd}| in DATA_STATE as default.

2008-09-15  Wakaba

	* HTML.pm.src, mkhtmlparser.pl: Replace |{prev_char}| by |{s_kwd}| in DATA_STATE.

2008-09-15  Wakaba

	* HTML.pm.src: Shorten keys.

2008-09-15  Wakaba

	* HTML.pm.src: Remove checking for control character, surrogate pair, or noncharacter code points and non-Unicode code points (they should be handled by Whatpm::Charset::UnicodeChecker).
	(parse_char_stream): Support for the |$get_wrapper| argument and character stream error handlers.

2008-09-15  Wakaba

	* ContentChecker.pm: Don't call |load_ns_module| for null-namespace elements/attributes.
	* HTML.pm.src: Factor out $disallowed_control_chars as a hash.

2008-09-14  Wakaba

	* HTML.pm.src: Regexp typo fixed. |{prev_char}| and |{next_char}| initializations are moved to the initialization method. |{read_until}| now supports buffering. Sync |set_inner_html| with |parse_char_stream|.

2008-09-14  Wakaba

	* HTML.pm.src (parse_char_stream): Make |set_next_char| invoke |manakai_read_until|, not only |read|, where possible, to decrease the number of |read| method calls.
	* mkhtmlparser.pl: Related changes to the aforementioned modification.

2008-09-14  Wakaba

	* HTML.pm.src: Use |read| instead of |getc|. |set_inner_html| now reports character errors.

2008-09-14  Wakaba

	* HTML.pm.src: Character tokens with leading white space followed by non-white-space characters were not correctly handled in the "before head" insertion mode.
	(set_inner_html): Reimplemented using CharString decodehandle class. Support for $get_wrapper argument. Support for |{read_until}| feature.

2008-09-14  Wakaba

	* HTML.pm.src: Make a "bare ero" error for unknown entities point to the "&" character.

2008-09-14  Wakaba

	* HTML.pm.src: It turns out that U+FFFD doesn't have to be added to the list of excluded characters.

2008-09-14  Wakaba

	* HTML.pm.src ($char_onerror): Give the character decoder's |line| and |column| higher priority than those set by the tokenizer's input handler.
	($self->{read_until}): Exclude U+FFFD (but this might not be necessary, since now we do line/column fixup in the character decode handle).

2008-09-14  Wakaba

	* HTML.pm.src: Use |{read_until}| where possible.

2008-09-14  Wakaba

	* HTML.pm.src: Change |{getc_until}| to |{read_until}| and |manakai_getc_until| to |manakai_read_until| to reduce the number of string copies.

2008-09-14  Wakaba

	* HTML.pm.src (parse_char_string): Use newly created |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's standard feature to |open| a string as a filehandle, since Perl's string filehandle does not seem to support the |ungetc| method correctly.
	(parse_char_stream): Define |{getc_until}| method.
	(DATA_STATE): Experimental support for |getc_until| feature.

2008-09-13  Wakaba

	* HTML.pm.src: Checkpoints added to newly added branches.

2008-09-13  Wakaba

	* HTML.pm.src: Remove |{char}|, which is no longer used. Remove |{entity_in_attr}| and |{last_attribute_value_state}| and replace them by |{prev_state}|.
	* mkhtmlparser.pl: Remove |{char}| feature. Remove |!!!back-next-input-character;| macro.

2008-09-13  Wakaba

	* HTML.pm.src: Finally we get rid of all the inner loops. Remove entity related tokenizer states in favor of new states implementing the consume character reference algorithm.

2008-09-13  Wakaba

	* HTML.pm.src: "Consume a character reference" algorithm is now implemented as a tokenizer's state, rather than a method, with minimum changes (more changes will be made, in due course). "Bogus comment state"'s inner loop gets removed.

2008-09-13  Wakaba

	* HTML.pm.src: Move |PUBLIC| and |SYSTEM| keyword tokenizing into their own tokenizer states.

2008-09-13  Wakaba

	* HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|) is split into three states.

2008-09-13  Wakaba

	* HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into itself and new |CDATA_PCDATA_CLOSE_TAG_STATE| so that the tokenizer no longer has to push back next input characters in those states.

2008-09-13  Wakaba

	* HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| broken into four states so that the tokenizer no longer has to push back next input characters in that state.

2008-09-11  Wakaba

	* HTML.pm.src: Methods now accept an additional parameter, $get_wrapper, which can be used to insert some wrapper between the character stream handle and the tokenizer. (It is currently not supported for |set_inner_html| for |Element|s.)

2008-09-10  Wakaba

	* HTML.pm.src: Ignore punctuation in charset names.

2008-09-10  Wakaba

	* ContentChecker.pm: Support for charset-layer error levels.
	* HTML.pm.src: Don't specify |text| argument for the |chardecode:fallback| error, since it is not the encoding being used alternatively.

2008-09-06  Wakaba

	* HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).

2008-08-31  Wakaba

	* CacheManifest.pm: Support for extensibility (HTML5 revision 2051).

2008-08-31  Wakaba

	* HTML.pm.src: Bug fix and sync with the spec with regard to after after frameset insertion mode processing (HTML5 revision 1909). Note that the implementation was wrong per the old spec before the r1909 changes.

2008-08-30  Wakaba

	* HTMLTable.pm: scope=auto algorithm fix synced with the spec (HTML5 revision 2093).
	($process_row): Algorithm step numbers synced with the spec (HTML5 revision 2092).

2008-08-30  Wakaba

	* HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5 revision 2094).

2008-08-30  Wakaba

	* ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).

2008-08-30  Wakaba

	* HTML.pm.src: '"' and "'" at the end of attribute name (after another attribute) now raise parse error (HTML5 revision 2123). Empty unquoted attribute values are no longer allowed (HTML5 revision 2122).

2008-08-30  Wakaba

	* mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5 revision 2130).

2008-08-30  Wakaba

	* ContentChecker.pm: |xml:lang| attribute value must be the same as |lang| attribute value for HTML elements (HTML5 revision 2062 and so on).

2008-08-30  Wakaba

	* ContentChecker.pm: Error level definition for |xml_id_error| was missing.
	* URIChecker.pm: The end of the URL should be marked as the error location for an empty path error. The position between the userinfo and the port components should be marked as the error location for an empty host error.

2008-08-30  Wakaba

	* URIChecker.pm: Set parameters representing where in the value the error occurs for errors. Report unknown address format error in warning level, since address formats are rarely added. Path segments starting with "/.." were misinterpreted as a dot-segment.

2008-08-30  Wakaba

	* URIChecker.pm (check_iri_reference): Requires |Message::DOM::DOMImplementation|.

2008-08-29  Wakaba

	* IMTChecker.pm: Updated for the new error reporting architecture.
	* ContentChecker.pm: Error levels for IMTs are added.

2008-08-17  Wakaba

	* H2H.pm (_shift_token): Support for unquoted HTML attribute values.

2008-08-16  Wakaba

	* CacheManifest.pm: Support for new style of error reports.
	* HTML.pm.src: Set line=1, column=1 to the document node.

2008-08-16  Wakaba

	* ContentChecker.pm, RDFXML.pm: Pass {level} object to language tag and URL checkers. Support for more error levels for bogus language tag and URL "standards".
	* LangTag.pm, URIChecker.pm: Support for new style error level reporting.

2008-08-15  Wakaba

	* ContentChecker.pm: Support for RDF/XML error levels.
	* HTMLTable.pm, RDFXML.pm: Support for new style of error level specifying. Error types are revised.

2008-08-15  Wakaba

	* ContentChecker.pm: All error reporting method calls are renewed.

2008-08-15  Wakaba

	* HTML.pm.src: All error type names and "text" parameters are revised. Use new style for "level" specification.
	* mkhtmlparser.pl: Use new style for "level" specification.

2008-08-03  Wakaba

	* WebIDL.pm (parse_char_string): Simplified error reporting process for broken ignored valuetype definition.
	(Valuetype idl_text): Support for special "DOMString" name.

2008-08-03  Wakaba

	* WebIDL.pm ($get_scoped_name): Append "::::" if the last terminal of the ScopedName is "DOMString", so that it can be determined later whether the last part of the scoped name is "DOMString" or "_DOMString". This is necessary to determine whether a |typedef| definition should be ignored or not.
	(parse_char_string): Unescape the identifier of exception members.
	($resolve): Return undef for builtin types and sequence types (we might not have to do this, however...).
	(check): Support checking for Exceptions, Valuetypes, and Typedefs.
	($serialize_type): Support for "DOMString::::" syntax.
	(Typedef idl_text): Output Type as "DOMString" if it is really "DOMString" (i.e. its internal representation is "::DOMString::").

2008-08-03  Wakaba

	* WebIDL.pm ($resolve): New code, based on resolve code for constant types in the |check| method.
	(check): Support for checking of attributes, operations, and arguments.
	(Attribute/Operation idl_text): Exception names in getraises, setraises, and raises clauses are serialized by |$serialize_type| code.

2008-08-02  Wakaba

	* WebIDL.pm ($integer): Order of alternatives is changed to match hexadecimal numbers (the original pattern, taken from the spec, did not work for hexadecimal numbers, because after the "0" prefix the [0-7]* part matches (as an empty string), and therefore the pattern does not match the remaining "x..." part of a "0x..." integer literal).
	($get_type): It now returns a string, not an array reference, for regular types and |sequence| types (i.e. it in any case returns a string).
	($get_next_token): The second item in the array that represents an integer or float token is now a Perl number value, not the original string representation of the number.
	(check): Support for const value consistency checking.
	No extended attribute is defined for constants.
	(Node subclasses): Use simple strings rather than array references for default data type values.
	($serialize_type): Type values are now simple strings.
	(value): If the new attribute value is a false value, then a FALSE value is set to the attribute.

2008-08-02  Wakaba

	* WebIDL.pm ($get_scoped_name): Scoped names are now stored in their stringified format ("scoped name" as defined in the spec). Note that a future version of this module should not use array references for type values and the |type_text| attribute should be made obsolete.
	(parse_char_string): Unescape attribute names.
	(check): Support for checking of whether inherited interfaces are actually defined or not. Support for checking of whether interface member identifiers are duplicated or not.
	($serialize_type): Scoped names are returned as is. A future version of this code should escape identifiers other than "DOMString", otherwise the idl_text would be non-conforming.

2008-08-02  Wakaba

	* WebIDL.pm (parse_char_string): Set line/column numbers to generated nodes. Unescape identifiers. Extended attributes for Definitions were ignored.
	(append_child): Set |parent_node| attribute.
	(parent_node): New attribute.
	(check): Support interface/exception members. Support extended attributes. Support definition identifier uniqueness constraint.
	(qualified_name): New attribute.
	(Interface/Exception idl_text): Extended attributes were not prepended to the returned text.

2008-08-02  Wakaba

	* WebIDL.pm (parse_char_string): Set line/column numbers to interface object experimentally. s/shift/pop/g, shift would make things wrong. Support for interface forward declarations was missing. Broken interface declarations with no block were not ignored entirely.
	(Whatpm::WebIDL::Node): New abstract class. This class makes things easier.
	(child_nodes): New attribute. Unlike DOM's attribute with the same name, this attribute returns a dead list of nodes for simplicity.
	(get_user_data, set_user_data): New methods.
	(Module idl_text): A SPACE character should be inserted before the |{| character.
	(Interface idl_text): Support for interface forward declarations.
	(is_forward_declaration): New attribute.

2008-07-19  Wakaba

	* WebIDL.pm (type_text): Better serializer.

2008-07-19  Wakaba

	* WebIDL.pm: Revise forward-compatible parsing so that it now can handle broken extended attributes and the like.

2008-07-19  Wakaba

	* WebIDL.pm: Real support for extended attributes. Support for extended attributes with arguments.

2008-07-19  Wakaba

	* WebIDL.pm: Support for |exception| syntax.
	(Interface->idl_text): Tentative support for inheritances.

2008-07-19  Wakaba

	* WebIDL.pm: Hierarchical scoped name support was broken. Support for raises, setraises, and getraises syntaxes.

2008-07-18  Wakaba

	* WebIDL.pm: Support for |idl_text| attribute, version 1 (no proper support for types, extended attributes, and exceptions yet). WebIDL parser, version 1 (no support for exceptions yet, no proper support for extended attributes yet).

2008-07-09  Wakaba

	* WebIDL.pm (parse_char_string): Support for basic attribute syntax.

2008-06-29  Wakaba

	* WebIDL.pm: Support for valuetype and const.

2008-06-29  Wakaba

	* WebIDL.pm: New module.

2008-06-15  Wakaba

	* Makefile (Entities.html): URI changed.

2008-06-08  Wakaba

	* HTML.pm.src: Support for ruby parsing (HTML5 revision 1704).

2008-06-01  Wakaba

	* HTML.pm.src (_get_next_token): A parse error was missing.

2008-06-01  Wakaba

	* mklinktypelist.pl: rel=contact is no longer part of the HTML5 spec (commented out) (HTML5 revision 1711).

2008-05-25  Wakaba

	* ContentType.pm: Drop support for UTF-32 (HTML5 revision 1701).
	* HTML.pm.src: UTF-16BE and UTF-16LE should be considered as UTF-16 (HTML5 revision 1701).

2008-05-25  Wakaba

	* HTML.pm.src: Support for in <head> (HTML5 revision 1692).

2008-05-25  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: The secondary insertion mode used when switching to foreign content is the "in body" insertion mode (HTML5 revision 1696).

2008-05-25  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Don't raise parse error for <isindex/> (HTML5 revision 1697).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Support for end-of-file token in foreign content insertion mode (HTML5 revision 1693). Update SVG camelCase attribute list (HTML5 revision 1700). <textarea> closes </select> (HTML5 revision 1699). More start tags close in foreign content insertion mode (HTML5 revision 1698).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: ";" is not part of charset name (HTML5 revision 1665).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: More robust charset parameter detection (HTML5 revision 1674).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

	* ContentType.pm: Support for image/vnd.microsoft.icon (HTML5 revision 1676).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Ignore language part of public identifiers for quirks mode detection (HTML5 revision 1679).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Reduce the number of errors in truncated doctypes (HTML5 revision 1685).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Support for EOF in new states for tags (HTML5 revision 1684).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (_reset_insertion_mode): Make <td>.innerHTML work (HTML5 revision 1690).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (_tree_construction_main): Change handling of end tags in head insertion modes (HTML5 revision 1686).
	(parse_char_string): Bug fix for non-utf8 character string handling.
	(parse_char_stream): |ungetc| does not work well for this context.

2008-05-18  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (parse_byte_string): Redefined to invoke |parse_byte_stream|.
	(parse_byte_stream): New method.

2008-05-18  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (parse_byte_string): Fix the column number reported by encoding layer error reporter.

2008-05-17  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (parse_byte_string): Use streaming decoder rather than converting the whole byte string and then parsing. Propagate errors in character encoding layer.
	(get_next_token): Precise error reporting for |bare stago| error.

2008-05-17  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (parse_char_stream): New method.
	(parse_char_string): This method is now defined as an invocation of the |parse_char_stream| method.

2008-05-17  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (parse_byte_string): Report various statuses of the sniffing as info-level errors. Support for new decoding framework in parser resetting.
	(new): Various default error levels were not set.

2008-05-17  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (parse_byte_string): HTML5 encoding sniffing algorithm, except for the actual sniffing, is implemented with new framework with Message::Charset::Info.

2008-05-16  Wakaba  <wakaba@suika.fam.cx>

	* CacheManifest.pm (_parse): Drop fragment identifiers from URIs in fallback section (HTML5 revision 1596).

2008-05-10  Wakaba  <wakaba@suika.fam.cx>

	* Makefile (Entities.html): URI has changed.

2008-05-10  Wakaba  <wakaba@suika.fam.cx>

	* CacheManifest.pm: Don't replace U+0000 NULL (HTML5 revision 1553).

2008-05-06  Wakaba  <wakaba@suika.fam.cx>

	* ContentChecker.pm: Noted that those returned in |table| are no longer table elements, but table objects returned by Whatpm::HTMLTable.
	* HTMLTable.pm (form_table): Return table element node as |$table->{element}|.
	(assign_header): Support for the |headers=""| attribute.

2008-05-06  Wakaba  <wakaba@suika.fam.cx>

	* HTMLTable.pm (assign_header): New function; first version with no support for headers="".
	(form_table): Include table width and height in the returned table object for convenience. Indexing in column assignment was wrong.
	Set whether a data cell is empty or not for convenience.

2008-05-05  Wakaba  <wakaba@suika.fam.cx>

	* HTMLTable.pm: More robust caption support (HTML5 revision 1393).

2008-05-05  Wakaba  <wakaba@suika.fam.cx>

	* HTMLTable.pm: How table model errors are detected is changed (HTML5 revision 1387).

2008-05-05  Wakaba  <wakaba@suika.fam.cx>

	* HTMLTable.pm: The algorithm now moves |tfoot| elements to the end of the table (HTML5 revision 1380).

2008-05-05  Wakaba  <wakaba@suika.fam.cx>

	* HTMLTable.pm: The algorithm now uses 0-based indexing instead of 1-based (HTML5 revision 1376).

2008-05-05  Wakaba  <wakaba@suika.fam.cx>

	* ContentType.pm: "Content-Type: text/plain; charset=UTF-8" and "Content-Encoding" no longer prevent sniffing (HTML5 revision 1288).

2008-05-05  Wakaba  <wakaba@suika.fam.cx>

	* ContentType.pm: Skip BOMs in feed or HTML algorithm (HTML5 revision 1282).

2008-05-03  Wakaba  <wakaba@suika.fam.cx>

	* ContentChecker.pm: Support for global attributes. Statuses of XML specs are added.

2008-05-03  Wakaba  <wakaba@suika.fam.cx>

	* ContentChecker.pm (check_element): Support for |template=""| and |ref=""| attributes (referent element type checking).

2008-04-29  Wakaba  <wakaba@suika.fam.cx>

	* CacheManifest.pm (_parse): New same origin definition (HTML5 revision 1500) is implemented (except for IDNA part and URI-scheme-specific knowledge). Line number counting was wrong for LF-only documents.

2008-04-13  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Raise a parse error for any disallowed character (HTML5 revision 1263).

2008-04-13  Wakaba  <wakaba@suika.fam.cx>

	* mkentitylist.pl: Support for new HTML5 entity table format (the definition for |AElig;| was missing).

2008-04-13  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src, mkhtmlparser.pl: Support for element/attribute name/namespace fixup (HTML5 revisions 1413, 1415, 1416, and 1417).

2008-04-13  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: List of element names that close foreign content insertion mode is added (HTML5 revisions 1412 and 1418).

2008-04-13  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Support for |mglyph| and |malignmark| elements (HTML5 revision 1410).

2008-04-13  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Support for new long MathML entities (HTML5 revision 1406).

2008-04-13  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: CDATA section support for MathML and SVG elements (HTML5 revisions 1404 and 1420).

2008-04-12  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src, mkhtmlparser.pl: Support for MathML and SVG elements (HTML5 revision 1404). Unused !!!macro definitions are removed.

2008-04-12  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src, mkhtmlparser.pl: The way permitted slash errors are raised is changed (HTML5 revision 1404).

2008-04-06  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Category sets that are no longer used are removed.

2008-04-06  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: The ->[1] property of stack entries is now replaced by constants representing element category.

2008-04-06  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Don't use local name stored in stack (i.e. ->[1]) for error reporting. (This is a preparation for using constant value for ->[1].)

2008-03-22  Wakaba  <wakaba@suika.fam.cx>

	* RDFXML.pm: Typo fixed.

2008-03-22  Wakaba  <wakaba@suika.fam.cx>

	* ContentChecker.pm: |fact_level| is now treated the same as |must_level|, i.e. level = |m|.
	(check_element): Make list of URIs in the DOM.

2008-03-21  Wakaba  <wakaba@suika.fam.cx>

	* RDFXML.pm: Language accessor implemented. Local (null-namespace) attribute support.

2008-03-21  Wakaba  <wakaba@suika.fam.cx>

	* RDFXML.pm: Factored out ID checking code.

2008-03-21  Wakaba  <wakaba@suika.fam.cx>

	* RDFXML.pm: TODO items noted. Validation of ID and URI attributes is implemented. Warn if unknown value is used in rdf:parseType="" attribute.
	* URIChecker.pm (check_rdf_uri_reference): New function.

2008-03-21  Wakaba  <wakaba@suika.fam.cx>

	* RDFXML.pm: bnodeid implemented. Relative references are now resolved.

2008-03-21  Wakaba  <wakaba@suika.fam.cx>

	* ContentChecker.pm: RDF reification implemented.
	* RDFXML.pm: undef vs false bug fixed. Reification implemented.

2008-03-21  Wakaba  <wakaba@suika.fam.cx>

	* RDFXML.pm: s/id/ID/ for attribute name. The |node| arguments are added for |ontriple| calls. Too many "attribute not allowed" errors were raised.
	* ContentChecker.pm: Initial experimental support for rdf:RDF element.

2008-03-21  Wakaba  <wakaba@suika.fam.cx>

	* RDFXML.pm: New module.

2008-03-20  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (set_inner_html): Line/column number code was still the old one.

2008-03-20  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Better line/column reporting for "duplicate attribute" errors. Line/column markings for DOCTYPE, comment, and character tokens are reintroduced; otherwise, error location for "not HTML5" error and errors for implied elements are not attached.

2008-03-20  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Set line/column numbers to attributes.
	* NanoDOM.pm (create_attribute_ns, set_attribute_node_ns): Added.
	(value): Setter implemented.
	* mkhtmlparser.pl: Set line/column numbers to Attr nodes.

2008-03-20  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Unused line/column markings are removed.

2008-03-20  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (_get_next_token): Remove |first_start_tag| flag, which is no longer used.

2008-03-17  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Set line/column information to element nodes.
	* mkhtmlparser.pl (!!!create-element, !!!insert-element, and !!!insert-element-t): Set line/column information to element nodes.

2008-03-17  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (_get_next_token): The first "<" character in "<?", "<>", or "</>" should be the error point.

2008-03-16  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Some more fixes on error position reporting.

2008-03-16  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Token-level precise error reporting.

2008-03-16  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Preparation for more precise error point reporting.

2008-03-11  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Error type revised.

2008-03-11  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Similar pieces of code are merged together, again.

2008-03-11  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Similar pieces of code are merged together.

2008-03-10  Wakaba  <wakaba@suika.fam.cx>

	* mkhtmlparser.pl: Set "level" parameter to parse errors.
	* HTML.pm.src: Code refined.

2008-03-09  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: |</body>| treatment has been changed (HTML5 revision 1348). Note that I really don't know whether this makes any difference in the black-box behavior of the parser.

2008-03-09  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: New end-of-file token implementation (HTML5 revision 1348).

2008-03-09  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: |applet| support (HTML5 revision 1347).

2008-03-09  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Foster parenting in AAA (HTML5 revision 1343).

2008-03-09  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Support for |<input>| in the "in select" insertion mode and support for the "in select in table" insertion mode (HTML5 revision 1342).

2008-03-09  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: No foster parenting for <script> and <script> in non-tainted <table>s (HTML5 revision 1336).

2008-03-09  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Ignore white space characters between <html> and <head> (HTML5 revision 1332).

2008-03-09  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Treat <input type=hidden> as if it were white space (HTML5 revision 1331).

2008-03-08  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Ignore U+000A at the beginning of a |listing| element (HTML5 revision 1330).

2008-03-08  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: <title> is always appended to the current element (HTML5 revision 1328).

2008-03-08 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: White space in tainted tables is moved into foster parents (HTML5 revision 1326). 2008-03-08 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Reduce errors from foster parenting cases (HTML5 revision 1321). 2008-03-08 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |</h/n/>| case code rearranged to align with the spec (HTML5 revision 1320). Note that this finally completes all of the HTML5 revision 1320 changes. 2008-03-08 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |</form>| now works similar to |</div>| for unclosed tags (HTML5 revision 1320). 2008-03-08 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |</p>| case rearranged with no actual change in fact. 2008-03-08 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: A "generate implied end tags" code (t409.1) could not be reached so that it is now removed (HTML5 revision 1320). 2008-03-08 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Code for the case of |</div>| and so on is revised to align with new spec text (HTML5 revision 1320). 2008-03-08 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Remove strange |if| condition; however, it should have done no harm in theory. 2008-03-08 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (_tree_construction_main): '</p>' in body case is split from other end tags for the preparation of implementing HTML5 revision 1320. 2008-03-07 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Simplified "generate implied end tag" (HTML5 revision 1320). 2008-03-07 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (_tree_construction_main): Merge rules for "h1" and "div" (HTML5 revision 1318). Add comments to where |form| pointer association codes should be inserted (HTML5 revision 1319). 2008-03-06 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: <html> treatment refined (HTML5 revision 1314). 2008-03-05 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Since the case t268 should never be reached (there are no other token types), it is replaced by a |die| statement. 
2008-03-05 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Typo fixed. 2008-03-04 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (_tree_construction_initial): Some limited quirks doctypes were not uppercased for comparison. 2008-03-03 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (tree construction and set_inner_html): Checkpoints are added. 2008-03-03 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (_tokenize_attempt_to_consume_an_entity): Checkpoints are set. Cases that are unlikely to be reached are noted as such. 2008-03-03 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Checkpoints for debugging are added. * mkhtmlparser.pl: Support for |!!!cp| syntax. 2008-03-03 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src, mkhtmlparser.pl: s/_input_character/_char/g for simplicity. 2008-03-03 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Flag name changed: s/correct/force-quirks/g (HTML5 revision 1307). 2008-03-03 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (_get_next_token): The places where the /incorrect/ flag is set are changed (HTML5 revision 1305). 2008-03-02 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Raise a parse error for |<span ===>| (HTML5 revision 1292). Entities are not parsed in comment-like part in RCDATA elements (HTML5 revision 1294). Allow bare & at the end of attribute value literals (HTML5 revision 1296). More quirks mode doctypes (HTML5 revision 1302). Require spaces between attributes and ban attribute names or unquoted attribute values containing single or double quotes (HTML5 revision 1303). 2008-03-02 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Typo fixed. Don't raise "character encoding" and related errors unless it is an HTML document (though the spec is unclear on whether it is applied to XHTML document). * HTML.pm (%HTMLAttrStatus): WF2 repetition model attributes are added. 2008-03-02 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: s/local_name/manakai_local_name/g. 2008-03-01 Wakaba <wakaba@suika.fam.cx> * _NamedEntityList.pm: Updated (HTML5 revision 1286). 
* HTML.pm.src: |charset| in |content| attribute is case-insensitive (HTML5 revision 1270). 2008-02-26 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: New status constants are added. ($ElementDefault): |status| added. (check_element): Err for non-standard or deprecated elements. (_attr_status_info): For non-standard or deprecated attributes. 2008-02-24 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (_attr_status_info): New internal method. 2008-02-24 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_element): Element standardized status information is now dispatched. 2008-02-24 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_element): Fix |del|-and-significant problem by adding some more arguments. 2008-02-24 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_element): Use context of container-for-the-purpose-of-content-model element (not transparent element) for |check_child_element| calling and significant text flag marking. This reintroduces |<del>|-and-significant problem again. 2008-02-24 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_element): Make semi-transparent elements ignored for the purpose of phase changes in content model checking. 2008-02-23 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_element): In-element state was not properly managed for transparent cases. 2008-02-23 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_element): Support for |video| and |audio| as semi-transparent elements. 2008-02-23 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($HTMLSemiTransparentElements): New. (check_element): s/minuses/minus_elements/, s/pluses/plus_elements/. Support for |html:object| as a semi-transparent element. 2008-02-23 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_element): The way to traverse the tree is entirely revised to make it easier to track the state of ancestors/descendants. 
As a result of this revision (which rewrites almost all of Whatpm::ContentChecker::HTML), support for content model checking for HTML elements |figure|, |object|, |video|, and |audio| and checking for XML elements (and some XMLNS checkings) are dropped for now. They will be reimplemented in due course. 2008-02-17 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |>| in public or system literal closes the DOCTYPE token (HTML5 revision 1225). 2008-02-17 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ({unsupported_level}): New value. * HTML.pm.src: Save whether |meta| |content| attribute contains character references or not. 2008-02-17 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (_get_children): (Incomplete) attempt to implement significant content checking for contents with |del| elements. 2008-02-17 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($HTMLTransparentElements): More elements are added. (_get_children): HTML |object| elements are now semi-transparent. * NanoDOM.pm (manakai_html, manakai_head): New methods. 2008-02-16 Wakaba <wakaba@suika.fam.cx> * CacheManifest.pm: HTML5 revision 1211 implemented. * CacheManifest.pod: Updated. 2008-02-10 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_document, check_element): Support for second argument ($onsubdoc). (_get_css_parser): Removed (now it is part of WDCC). 2008-02-09 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (_get_css_parser): New. 2007-11-25 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($AnyChecker): Old way to add child elements for checking had been used. 2007-11-25 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_element): New todo item type |descendant|. 2007-11-23 Wakaba <wakaba@suika.fam.cx> * IMTChecker.pm: Revised to raise errors and warnings as (poorly) specced in RFC 2046 and RFC 4288. (application/atom+xml): Definition added. 2007-11-23 Wakaba <wakaba@suika.fam.cx> * URIChecker.pm: Make RFC 3986 should-level errors warnings (rather than SHOULD-level errors). 
2007-11-23 Wakaba <wakaba@suika.fam.cx> * NanoDOM.pm (get_user_data, set_user_data): New methods. * HTML.pm.src: A flag for character references in attribute values is added. Set |manakai_has_reference| user data to |charset| attribute. 2007-11-23 Wakaba <wakaba@suika.fam.cx> * NanoDOM.pm (input_encoding, manakai_charset, manakai_has_bom): New attributes. * ContentChecker.pm (check_document): Warn if charset requirements cannot be tested. 2007-11-19 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (parse_byte_string): Detect charset by universalchardet if charset parameter is not specified. * Makefile (Charset-all, Charset-clean): New rules. 2007-11-18 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_document): Check the existence of character encoding declaration and selection of encoding for HTML document. 2007-11-18 Wakaba <wakaba@suika.fam.cx> * ContentType.pm (get_sniffed_type): Return also the official type in list context. 2007-11-18 Wakaba <wakaba@suika.fam.cx> * ContentType.pm: Sniffing with leading white space ignoring (HTML5 revisions 1013 and 1016). 2007-11-18 Wakaba <wakaba@suika.fam.cx> * ContentType.pm: HTML5 revision 1013 changes, except for leading white spaces, are implemented. 2007-11-11 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (parse_char_string): Set |inner_encoding| attribute if possible. 2007-11-11 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (parse_byte_string): New method. (parse_char_string): New alias for |parse_string|. (main phase): Invoking "change the encoding" algorithm if desired. * HTML.pod: Updated. 2007-11-11 Wakaba <wakaba@suika.fam.cx> * HTML.pod (get_inner_html): Removed. * Makefile (HTML-all, HTML-clean): New. 2007-11-11 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (get_inner_html): Removed (moved to HTML/Serializer.pm). 2007-11-08 Wakaba <wakaba@suika.fam.cx> * mklinktypelist.pl: s/noreferer/noreferrer/ (HTML5 revision 1132). 2007-11-04 Wakaba <wakaba@suika.fam.cx> * Makefile: |CacheManifest.html| is added. 
* CacheManifest.pod: New file. 2007-11-04 Wakaba <wakaba@suika.fam.cx> * CacheManifest.pm: New module. 2007-11-04 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Support for application cache selection algorithm callback. 2007-11-04 Wakaba <wakaba@suika.fam.cx> * mklinktypelist.pl: Support for rel=noreferer (HTML5 revision 1118). 2007-10-17 Wakaba <wakaba@suika.fam.cx> * Makefile (clean): New rule. * NanoDOM.pm (public_id, system_id): New attributes. 2007-10-17 Wakaba <wakaba@suika.fam.cx> * Makefile (CSS-all, CSS-clean, clean): New rules. 2007-10-14 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_document): Support for new |is_xml_root| flag. (check_element): Support for new |pluses| state. (_add_pluses): New method. (_remove_minuses): Support for new |minus| item. 2007-09-24 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Raise specific error for invalid root element. 2007-09-24 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Set level values for later uses. 2007-09-09 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Support for language tag validation. 2007-09-09 Wakaba <wakaba@suika.fam.cx> * LangTag.pm (check_rfc3066_language_tag): New method. 2007-09-09 Wakaba <wakaba@suika.fam.cx> * LangTag.pm: New module. 2007-09-04 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Some error types were wrong. 2007-08-17 Wakaba <wakaba@suika.fam.cx> * CSS/: New directory. 2007-08-17 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (_check_get_children): Support for |noscript| in |head|. 2007-08-12 Wakaba <wakaba@suika.fam.cx> * URI/: New directory. 2007-08-11 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Tokenizer's states are now represented in number. 2007-08-11 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |or|s for insertion modes are replaced by |&|s. 2007-08-11 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Token types are now represented in number. 2007-08-11 Wakaba <wakaba@suika.fam.cx> * ContentType.pm (SEE ALSO): Updated. 
* HTML.pm.src: Insertion modes are now represented in number. 2007-08-11 Wakaba <wakaba@suika.fam.cx> * ContentType.pm: Sniffing for bitmap images (HTML5 revision 999) is implemented. 2007-08-08 Wakaba <wakaba@suika.fam.cx> * ContentType.pm: Sniffing for |<script| (HTML5 revision 983) is implemented. 2007-08-06 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pod: New documentation. * Makefile: A rule for |ContentChecker.html| is added. * ContentChecker.pm: A pod "LICENSE" section is added. * NanoDOM.pm ($VERSION): New variable. 2007-08-05 Wakaba <wakaba@suika.fam.cx> * H2H.pm: |b|, |i|, and |sub| are added to the list of allowed HTML elements. 2007-08-05 Wakaba <wakaba@suika.fam.cx> * H2H.pm: |samp| is added to the list of allowed HTML elements. * URIChecker.pm (check_iri): New. (check_iri_reference): Error type for IRI reference syntax error is changed. 2007-08-04 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Reference to the |Whatpm::ContentChecker::Atom| is added. (check_document): Load appropriate module before validation. 2007-08-04 Wakaba <wakaba@suika.fam.cx> * ContentChecker/: New directory. 2007-08-04 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: HTML |time| element is implemented. * HTMLTable.pm: Comments are updated as HTML5 is revised. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (check_document): Return value even if no document element is found. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |$in_body| is no longer a function. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: The |$in_body| code has been moved down. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: The "trailing end" insertion mode is split into "after html body" and "after html frameset" insertion modes. Their codes are merged with "after body" and "after frameset" codes. |$previous_insertion_mode| has been removed. "after frameset" code is merged with "in frameset" code. 
2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: The "before head" insertion mode is merged with the "in head" insertion mode. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Cases in "in head" insertion mode are reorganized. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Some cases in "in table" insertion mode are merged. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: The "in row" insertion mode is merged with "in table" insertion mode. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: The "in table" and "in table body" insertion modes are merged. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: There is no "in table head" or "in table foot" insertion mode! 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |<noframes>| "in frameset" and "in noframes" now directly invoke the handler. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Code for the "in cell" insertion mode is merged into the "in body" insertion mode code. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Codes for "in body" and "in caption" insertion modes are merged. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Two |!!!next-token|s were missing. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Use numeric constant for |{content_mode}| instead of string constant for |{content_model_flag}|. 2007-07-21 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Add the name of the attribute to the "duplicate attribute" error. 2007-07-17 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Return the |class| node list. 2007-07-17 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Return the |id| node list. * HTML.pm.src: A typo is fixed. 2007-07-16 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Drop wrong |level => 'error'| specification from "in HTML:xml:lang" error. Character position is now the last part of the error type in the URI error description. Report "unsupported" status for language tags, media queries, script codes, and style sheets. 
2007-07-16 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Report error if |xml:lang| in HTML, |lang| in XML, |xmlns| in XML, and |meta| |charset| in XML. * NanoDOM.pm (Attr.owner_document): New attribute. 2007-07-16 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: The character immediately following a bare |hcro| was discarded. Fix handling of entity references in attribute values. 2007-07-16 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (main and trailing end phases): Token types |DOCTYPE|, |comment|, |end-of-file|, and |<html>| are factored out. Error types |in html:#DOCTYPE| and |after html:#DOCTYPE| are merged into |DOCTYPE in the middle|. |</frameset>| in fragment parsing mode changed the insertion mode. 2007-07-16 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |$phase| has been removed; The |trailing end| phase is now an insertion mode. Treatments for white space character tokens were incorrect for some insertion modes. An old |meta| case was not removed. 2007-07-16 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |meta| charset declaration extraction implemented (but changing the encoding is not yet implemented :-). 2007-07-15 Wakaba <wakaba@suika.fam.cx> * Charset/: New directory. 2007-07-15 Wakaba <wakaba@suika.fam.cx> * H2H.pm: New Perl module (created from manakai's H2H.dis). 2007-07-15 Wakaba <wakaba@suika.fam.cx> * XMLSerializer.pm: New Perl module (created from manakai's SimpleLS.dis). 2007-07-07 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |<!---x-->| was not processed correctly. 2007-07-01 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Report correct error message for |<body></div></body>|. 2007-07-01 Wakaba <wakaba@suika.fam.cx> * HTMLTable.pm: An error description was incorrect. 2007-06-30 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Return |{term}| list. 2007-06-30 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revisions 961-966 (</p>, </br>, nested <nobr>, implied </tbody>, </tfoot>, and </thead>, and <title> outside of head). 
2007-06-30 Wakaba <wakaba@suika.fam.cx> * IMTChecker.pm: Report warning for unregistered and private types/subtypes. * ContentChecker.pm, HTML.pm.src, IMTChecker.pm, URIChecker.pm, HTMLTable.pm: Error messages are now consistent; they are all listed in <http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types>. 2007-06-25 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: |<img ismap>| not in |<a></a>| is now erred. |<datalist>| is implemented. Attribute checker for |<command>| and |<menu>| are added. Support for |contextmenu| global attribute is added. 2007-06-25 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (_reset_insertion_mode): Interpretation of Step 3 has been changed. 2007-06-25 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Late |<html>| parse error is implemented. 2007-06-24 Wakaba <wakaba@suika.fam.cx> * URIChecker.pm (check_iri_reference): A |decode| method name was incorrect. * ContentChecker.pm: Support for the |footer| element. Check URI syntax for space-separated URI attributes. Support for the |tabindex| attribute. Support for |datetime| attribute. 2007-06-24 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revision 1.144 (&#x0D;) and 1.145 (invalid character references). HTML5 revision 1.146 (white space characters before root start tag). HTML5 revision 1.148 (named character references in attribute values). HTML5 revision 1.152 (<plaintext>.innerHTML get). 2007-06-24 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revisions 1.142 and 1.143 (<noscript> in <head>). 2007-06-24 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revision 935 (<base>, <link>, <meta> in body). * ContentChecker.pm: HTML5 revision 938 (scoped=""). 2007-06-24 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revision 923 (matching end tag in CDATA or RCDATA in fragment parsing mode). HTML5 revision 924 (<!--> and <!--->). HTML5 revision 926 (hn in hn). 
2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (get_inner_html): HTML5 revision 922 (inner_html for <pre> and <textarea>). 2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revision 920 (<isindex>). 2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revision 918 (</head>, </body>, </html>). 2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revision 916 (</body>). HTML5 revision 917 (conforming bare &). 2007-06-23 Wakaba <wakaba@suika.fam.cx> * NanoDOM.pm (manakai_is_html): Setting to false did not work. * HTML.pm.src: HTML5 revision 914 (</ in CDATA, RCDATA). HTML5 revision 915 (<nobr>). 2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revisions 908, 909, 912, and 913 (quirks mode). * NanoDOM.pm (manakai_is_html, manakai_compat_mode, compat_mode): New attributes. 2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revisions 900, 901, 902, and 911 (< in tags). 2007-06-23 Wakaba <wakaba@suika.fam.cx> * .cvsignore: |Entities.html| is added. * HTML.pm.src: |$entity_char| is removed and requires |Whatpm::_NamedEntityList| instead. HTML5 revision 898 (refc), except that lack of refc is parse error. * mkentitylist.pl: New script. * Makefile (all): |_NamedEntityList.pm| is added. (_NamedEntityList.pm, Entities.html): New rules. 2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Parse errors immediately after U+000D were ignored and U+000D immediately following another U+000D was not converted to U+000A. 2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (set_inner_html): HTML5 revision 892 (adopt nodes before appended). Parser was not ready for NULL parse error and escape flag. * NanoDOM.pm (adopt_node): New. 2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revision 886 (insane comment in CDATA and RCDATA). Note that current implementation is simply repeating what the spec says and it is maybe not a best way to do it. 
2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revision 884 (</form> doesn't close the form element if a descendant element without implied end tag has still been open). 2007-06-23 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: HTML5 revision 881 (Make |id| attribute with space characters non-conforming). 2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: An error message was incorrect. HTML5 revision 869 (C1 character references). 2007-06-23 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: HTML5 revision 867 (a LF at the beginning of a |textarea| is removed). 2007-06-05 Wakaba <wakaba@suika.fam.cx> * NanoDOM.pm (get_attribute_node_ns): New method. * ContentChecker.pm: |script| |async| and |defer| no longer require |src|. |async| MUST NOT be specified if |defer|. (HTML5 revision 858). 2007-05-30 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: |<form><form>| went into an infinite loop. 2007-05-27 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (html): Set |is_root| (allowed as a document element) flag on. (new): Removed. (check_document): New method. 2007-05-27 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (thead, tfoot): Checker specifications were incorrect. 2007-05-27 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($HTMLURIAttrChecker): Include error position in the |type| option of the error. * HTMLTable.pm (form_table): The |$onerror| parameter is now optional. Some bugs are fixed. 2007-05-27 Wakaba <wakaba@suika.fam.cx> * HTMLTable.pm: New module. * ContentChecker.pm (table): Invoke table model error checker. * NanoDOM.pm (first_child, get_attribute_ns): New. 2007-05-26 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($HTMLLinkTypesAttrChecker): New checker. (link/@rel, a/@rel, area/@rel): Use new checker. * Makefile (_LinkTypeList.pm, RelExtensions.html): New rules. * _LinkTypeList.pm: New file. * mklinktypelist.pl: New file. * .cvsignore: |RelExtensions.html| added. 
* NanoDOM.pm (child_nodes): Returns an empty array for non-child-containing node types. (text_content): New attribute. 2007-05-26 Wakaba <wakaba@suika.fam.cx> * IMTChecker.pm: New module. * ContentChecker.pm ($HTMLIMTAttrChecker): Call IMTChecker to test parameter value validity. * HTML.pm.src ($style_start_tag): Attributes were discarded. 2007-05-25 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($HTMLURIAttrChecker): Implemented. 2007-05-25 Wakaba <wakaba@suika.fam.cx> * URIChecker.pm: All recommendations from RFC 3986 and RFC 3987 are listed (not all testable items are checked yet). 2007-05-25 Wakaba <wakaba@suika.fam.cx> * URIChecker.pm: New module. 2007-05-20 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Now most attributes are implemented or associated to some placeholder. ($ElementDefault): Warn unknown attributes for unknown elements as "attribute not supported". ($HTMLLanguageTagAttrChecker, $HTMLMQAttrChecker): New placeholders. ($HTMLUsemapAttrChecker, $HTMLTargetAttrChecker): New checkers. (|a| attribute checker): Reimplemented. 2007-05-20 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($HTMLEventHandlerAttrChecker): New placeholder. ($HTMLAttrChecker): Event handler content attributes are added. (link, embed): Required attribute is now checked. (embed): Unknown local attributes are no longer warned. 2007-05-20 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($HTMLSpaceURIsAttrChecker): New placeholder. ($HTMLIMTAttrChecker): New checker. (link@rel, link@href, link@type, style@type, a@href, a@ping, a@ping, a@type, embed@src, embed@type, object@data, object@type, source@src, source@type, area@alt, area@shape, area@coords, area@href, area@ping, area@rel, area@type, script@src, script@defer, script@async, script@type): Checkers added. 2007-05-20 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Descendant checking was incorrect. 2007-05-19 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Support |xml:*| and |xmlns:*| attributes. 
Report an error if |Element.prefix| is |xmlns|. * NanoDOM.pm (prefix): New attribute. 2007-05-19 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: In |main| phase, |in body| insertion mode, action for |<iframe>| was missing. 2007-05-19 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Support for many of HTML5 elements. ($GetHTMLNonNegativeIntegerAttrChecker): New. 2007-05-19 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Support for most elements up to |progress|. ($HTMLURIAttrChecker): Placeholder. ($HTMLIntegerAttrChecker, $GetHTMLFloatingPointNumberAttrChecker): New. 2007-05-19 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Attribute checkers for global attributes, |html|, |base|, |style|, and |meta|. * NanoDOM.pm (insert_before): Weaken reference to the parent node. (Attr::new): Set |owner_element| attribute. (namespace_uri, manakai_local_name): New attribute implementations. (owner_element): New attribute. 2007-05-19 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($AttrChecker, $HTMLAttrChecker, $AnyChecker->{attr_checker}, $HTMLAttrsChecker, $Element->{$HTML_NS}->{''}): New. (check_element): Invoke attrs_checker for each element. 2007-05-13 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Don't use |manakai_element_type_match|. 2007-05-13 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Use hashes rather than lists for element type tests. 2007-05-13 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: Don't generate duplicate error when an element type is put in the "minus" list and the element type is not allowed explicitly in the particular element content model. (html:a checker): New checker. (html:details, html:datagrid): New checkers. (html:legend): New checker. 2007-05-13 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm (html:li checker): Implemented. 2007-05-13 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($HTMLInlineOrStriclyInlineChecker): New checker. (html:dd checker): New checker. 
(html:q, html:em, html:strong, html:small, html:m, html:dfn, html:code, html:samp, html:span): New checkers. 2007-05-13 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm ($AnyChecker): Renamed from |$ElementDefault->{checker}|. ($ElementDefault->{checker}): Throw an error that the element type is not supported by the checker. ($HTMLMetadataElement): |html:base| was missing. ($HTMLEmptyChecker): Don't throw an error for inter-element whitespace nodes. (html:html checker): Errors were not thrown even if |html:head| and/or |html:body| children were missing. (html:head checker): An error was not thrown if <meta charset> appeared after other elements. 2007-05-05 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: |footer|, |video|, |audio|, |script|, and |noscript| elements are implemented. (new): New method. 2007-05-04 Wakaba <wakaba@suika.fam.cx> * ContentChecker.pm: New module. 2007-05-04 Wakaba <wakaba@suika.fam.cx> * NanoDOM.pm (manakai_parent_element, document_element, manakai_local_name, manakai_element_type_match): New methods. 2007-05-03 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Replace decimal and hexadecimal numeric entities in C1 range using Windows-1252 mapping. Bare LF did not count as new line for error reporting. 2007-05-02 Wakaba <wakaba@suika.fam.cx> * NanoDOM.pm (DOMImplementation): New class. (append_child): Weaken the |parent_node| reference. (create_element_ns, Element new): Set the |owner_document| reference. (implementation): New attribute. (owner_document, local_name, namespace_uri): New attributes. * HTML.pm.src (parse_string): Line and column numbers are now provided to error handler. (!!!parse-error): Short descriptions are added. (_construct_tree): Split into three methods; support for innerHTML mode. (set_inner_html): New method. 2007-05-01 Wakaba <wakaba@suika.fam.cx> * NanoDOM.html: Documentation is added. * HTML.pod, ContentType.html: Documentation is revised. * .cvsignore: Pod2html temporary files are added. 
* Makefile: Make |NanoDOM.html|. 2007-05-01 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src (parse_string): New method. (get_inner_html): Renamed from |inner_html|. * Makefile: A rule for |HTML.html| is added. * HTML.pod: New documentation. 2007-05-01 Wakaba <wakaba@suika.fam.cx> * NanoDOM.pm (last_child, previous_sibling): New attributes. (clone_node): Attribute nodes were not completely copied. * HTML.pm.src: Many bugs are fixed. 2007-04-30 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Some typos are fixed. 2007-04-30 Wakaba <wakaba@suika.fam.cx> * mkhtmlparser.pl, Makefile: References to the |HTML-consume-entity.src| are removed. * HTML.pm.src: Tokenizer's handling on named entities are rewritten. * HTML-consume-entity.src: Removed. 2007-04-30 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Tokenizer's handling on hexadecimal numeric entities are rewritten. 2007-04-30 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: Some tokenizer bugs are fixed. 2007-04-30 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src: The tree construction stage is implemented. * mkhtmlparser.pl: New macros are added. 2007-04-28 Wakaba <wakaba@suika.fam.cx> * ContentType.pm: A note on bug in the specification is removed since it's been now fixed. * .cvsignore: New file. 2007-04-28 Wakaba <wakaba@suika.fam.cx> * HTML.pm.src, HTML-consume-entity.src: New files. * Makefile (HTML.pm): New rule. * mkhtmlparser.pl: New script. 2007-04-25 Wakaba <wakaba@suika.fam.cx> * Makefile: New file. 2007-04-24 Wakaba <wakaba@suika.fam.cx> * ContentType.pm: An error in pod is fixed. 2007-04-24 Wakaba <wakaba@suika.fam.cx> * ContentType.pm: Documentation is added. 2007-04-24 Wakaba <wakaba@suika.fam.cx> * ContentType.pm: New Perl module. * ChangeLog: New file.