2008-09-14 Wakaba
* HTML.pm.src: It turns out that U+FFFD doesn't have to
be added to the list of excluded characters.
2008-09-14 Wakaba
* HTML.pm.src ($char_onerror): Give the character decoder's |line|
and |column| a higher priority than those set by the
tokenizer's input handler.
($self->{read_until}): Exclude U+FFFD (but this might
not be necessary, since now we do line/column fixup in
the character decode handle).
2008-09-14 Wakaba
* HTML.pm.src: Use |{read_until}| where possible.
2008-09-14 Wakaba
* HTML.pm.src: Change |{getc_until}| to |{read_until}|
and |manakai_getc_until| to |manakai_read_until| to
reduce the number of string copies.
2008-09-14 Wakaba
* HTML.pm.src (parse_char_string): Use newly created
|Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
standard feature to |open| a string as a filehandle,
since Perl's string filehandles do not seem to support the |ungetc|
method correctly.
(parse_char_stream): Define |{getc_until}| method.
(DATA_STATE): Experimental support for |getc_until| feature.
2008-09-13 Wakaba
* HTML.pm.src: Check points added to newly added branches.
2008-09-13 Wakaba
* HTML.pm.src: Remove |{char}|, which is no longer used.
Replace |{entity_in_attr}| and |{last_attribute_value_state}|
with |{prev_state}|.
* mkhtmlparser.pl: Remove |{char}| feature.
Remove |!!!back-next-input-character;| macro.
2008-09-13 Wakaba
* HTML.pm.src: Finally we get rid of all the inner loops. Remove
entity related tokenizer states in favor of new states
implementing the consume character reference algorithm.
2008-09-13 Wakaba
* HTML.pm.src: "Consume a character reference" algorithm is
now implemented as a tokenizer's state, rather than
a method, with minimum changes (more changes will
be made, in due course). "Bogus comment state"'s inner
loop gets removed.
2008-09-13 Wakaba
* HTML.pm.src: Move |PUBLIC| and |SYSTEM| keyword tokenizing
into their own tokenizer states.
2008-09-13 Wakaba
* HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
is split into three states.
2008-09-13 Wakaba
* HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into
itself and new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
the tokenizer no longer has to push back next input
characters in those states.
2008-09-13 Wakaba
* HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| broken
into four states so that the tokenizer no longer has to push
back next input characters in that state.
2008-09-11 Wakaba
* HTML.pm.src: Methods now accept an additional parameter, $get_wrapper,
which can be used to insert a wrapper between the character
stream handle and the tokenizer. (It is currently not supported
for |set_inner_html| for |Element|s).
2008-09-10 Wakaba
* HTML.pm.src: Ignore punctuation in charset names.
2008-09-10 Wakaba
* ContentChecker.pm: Support for charset-layer error levels.
* HTML.pm.src: Don't specify |text| argument for the
|chardecode:fallback| error, since it is not the encoding
being used alternatively.
2008-09-06 Wakaba
* HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).
2008-08-31 Wakaba
* CacheManifest.pm: Support for extensibility (HTML5 revision 2051).
2008-08-31 Wakaba
* HTML.pm.src: Bug fix and sync with the spec with regard
to after after frameset insertion mode processing (HTML5
revision 1909). Note that the implementation was wrong
per the old spec before the r1909 changes.
2008-08-30 Wakaba
* HTMLTable.pm: scope=auto algorithm fix synced with the
spec (HTML5 revision 2093).
($process_row): Algorithm step numbers synced with the
spec (HTML5 revision 2092).
2008-08-30 Wakaba
* HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5
revision 2094).
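The difference between the Zs general category and the White_Space property mentioned in the entry above can be illustrated with a minimal sketch. It is written in Python for illustration, not in the module's Perl. The helper names `is_zs` and `is_white_space` are hypothetical, the White_Space set is hard-coded from the Unicode property list, and how HTMLTable.pm actually applies the property is not stated in the entry.

```python
import unicodedata

# Code points with the Unicode White_Space property (hard-coded here,
# since Python's standard library does not expose the property directly).
WHITE_SPACE = frozenset(
    [0x0009, 0x000A, 0x000B, 0x000C, 0x000D, 0x0020, 0x0085, 0x00A0, 0x1680]
    + list(range(0x2000, 0x200B))
    + [0x2028, 0x2029, 0x202F, 0x205F, 0x3000]
)


def is_zs(ch):
    """True if ch is in general category Zs (space separator)."""
    return unicodedata.category(ch) == "Zs"


def is_white_space(ch):
    """True if ch has the White_Space property."""
    return ord(ch) in WHITE_SPACE


if __name__ == "__main__":
    for ch in ["\t", "\n", " ", "\u3000", "a"]:
        print(repr(ch), is_zs(ch), is_white_space(ch))
```

TAB and LINE FEED are control characters (category Cc), so a Zs test misses them even though they are white space.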
2008-08-30 Wakaba
* ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).
2008-08-30 Wakaba
* HTML.pm.src: '"' and "'" at the end of attribute
name (after another attribute) now raise parse error (HTML5
revision 2123). Empty unquoted attribute values are no
longer allowed (HTML5 revision 2122).
2008-08-30 Wakaba
* mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5
revision 2130).
2008-08-30 Wakaba
* ContentChecker.pm: |xml:lang| attribute value must be the same
as |lang| attribute value for HTML elements (HTML5 revision 2062
and so on).
2008-08-30 Wakaba
* ContentChecker.pm: Error level definition for |xml_id_error|
was missing.
* URIChecker.pm: The end of the URL should be marked as the
error location for an empty path error. The position
between the userinfo and the port components should be
marked as the error location for an empty host error.
2008-08-30 Wakaba
* URIChecker.pm: For errors, set parameters representing where
in the value the error occurs. Report unknown
address format error at warning level, since address
formats are rarely added. Path segments starting with "/.."
were misinterpreted as dot-segments.
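The dot-segment fix in the entry above can be sketched as follows. This is a Python illustration, not the module's Perl code. The entry does not describe the faulty check, so the prefix test in `buggy_has_dot_segment` is only an assumed reconstruction, and both function names are hypothetical. Per RFC 3986, a dot-segment is a complete path segment equal to "." or "..".

```python
def buggy_has_dot_segment(path):
    """Assumed form of the bug: a prefix test that also flags '/..foo'."""
    return "/.." in path or "/." in path


def has_dot_segment(path):
    """Exact test: a dot-segment is a whole segment equal to '.' or '..'."""
    return any(segment in (".", "..") for segment in path.split("/"))


if __name__ == "__main__":
    for path in ["/a/../b", "/a/./b", "/a/..b", "/a/.hidden", "/a/b"]:
        print(path, buggy_has_dot_segment(path), has_dot_segment(path))
```

With the exact comparison, a segment such as "..b" is treated as an ordinary path segment.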
2008-08-30 Wakaba
* URIChecker.pm (check_iri_reference): Requires
|Message::DOM::DOMImplementation|.
2008-08-29 Wakaba
* IMTChecker.pm: Updated for the new error reporting architecture.
* ContentChecker.pm: Error levels for IMTs are added.
2008-08-17 Wakaba
* H2H.pm (_shift_token): Support for unquoted HTML attribute
values.
2008-08-16 Wakaba
* CacheManifest.pm: Support for new style of error
reports.
* HTML.pm.src: Set line=1, column=1 to the document node.
2008-08-16 Wakaba
* ContentChecker.pm, RDFXML.pm: Pass {level} object to language tag
and URL checkers. Support for more error levels for bogus
language tag and URL "standards".
* LangTag.pm, URIChecker.pm: Support for new style error
level reporting.
2008-08-15 Wakaba
* ContentChecker.pm: Support for RDF/XML error levels.
* HTMLTable.pm, RDFXML.pm: Support for new style of error level
specifying. Error types are revised.
2008-08-15 Wakaba
* ContentChecker.pm: All error reporting method calls are
renewed.
2008-08-15 Wakaba
* HTML.pm.src: All error type names and "text" parameters
are revised. Use new style for "level" specification.
* mkhtmlparser.pl: Use new style for "level" specification.
2008-08-03 Wakaba
* WebIDL.pm (parse_char_string): Simplified error
reporting process for broken ignored valuetype definition.
(Valuetype idl_text): Support for special "DOMString" name.
2008-08-03 Wakaba
* WebIDL.pm ($get_scoped_name): Append "::::" if the last
terminal of the ScopedName is "DOMString", so that whether
the last part of the scoped name is "DOMString" or "_DOMString"
can be determined later. This is necessary to determine whether a
|typedef| definition should be ignored or not.
(parse_char_string): Unescape the identifier of
exception members.
($resolve): Return undef for builtin types and sequence
types (we might not have to do this, however...).
(check): Support checking for Exceptions, Valuetypes,
and Typedefs.
($serialize_type): Support for "DOMString::::" syntax.
(Typedef idl_text): Output Type as "DOMString" if it
is really "DOMString" (i.e. its internal representation
is "::DOMString::").
2008-08-03 Wakaba
* WebIDL.pm ($resolve): New code, based on resolve code
for constant types in the |check| method.
(check): Support for checking of attributes, operations, and
arguments.
(Attribute/Operation idl_text): Exception names in getraises,
setraises, and raises clauses are serialized by |$serialize_type|
code.
2008-08-02 Wakaba
* WebIDL.pm ($integer): Order of alternatives is changed to match
hexadecimal numbers (the original pattern, taken from the spec,
did not work for hexadecimal numbers, because after the "0" prefix
the [0-7]* part matches (as an empty string) and therefore
the pattern does not match the remaining "x..." part of a "0x..."
integer literal).
($get_type): It now returns a string, not an array reference,
for regular types and |sequence| types (i.e. it in any case
returns a string).
($get_next_token): The second item in the array that represents
an integer or float token is now a Perl number value, not the
original string representation of the number.
(check): Support for const value consistency checking.
No extended attribute is defined for constants.
(Node subclasses): Use simple strings rather than array references
for default data type values.
($serialize_type): Type values are now simple strings.
(value): If the new attribute value is a false value, then
a FALSE value is set to the attribute.
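The |$integer| alternation-order problem described in the entry above can be reproduced with a short sketch. It is written in Python, whose regular expressions try alternatives left to right as Perl's do. Both patterns are assumptions: `SPEC_ORDER` is modelled on the integer grammar of the Web IDL draft, and `HEX_FIRST` shows one possible reordering. Neither is copied from WebIDL.pm, and `leading_integer` is a hypothetical helper.

```python
import re

# Assumed spec-order pattern: the octal alternative is tried first.
SPEC_ORDER = re.compile(r"-?(?:0(?:[0-7]*|[Xx][0-9A-Fa-f]+)|[1-9][0-9]*)")

# Reordered pattern: the hexadecimal alternative is tried first.
HEX_FIRST = re.compile(r"-?(?:0(?:[Xx][0-9A-Fa-f]+|[0-7]*)|[1-9][0-9]*)")


def leading_integer(pattern, text):
    """Return the integer literal matched at the start of text, or None."""
    match = pattern.match(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    for literal in ["0x1F", "017", "42", "-0xff"]:
        print(literal,
              leading_integer(SPEC_ORDER, literal),
              leading_integer(HEX_FIRST, literal))
```

With the spec-order pattern, "0x1F" yields only "0", because [0-7]* succeeds with an empty match and the hexadecimal alternative is never tried. Octal and decimal literals match the same way under both orders.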
2008-08-02 Wakaba
* WebIDL.pm ($get_scoped_name): Now scoped names are stored
in their stringified format ("scoped name" as defined in the
spec). Note that a future version of this module should not use
array references for type values and the |type_text| attribute
should be made obsolete.
(parse_char_string): Unescape attribute names.
(check): Support for checking of whether inherited interfaces
are actually defined or not. Support for checking of whether
interface member identifiers are duplicated or not.
($serialize_type): Scoped names are returned as is. A future
version of this code should escape identifiers other than "DOMString",
otherwise the idl_text would be non-conforming.
2008-08-02 Wakaba
* WebIDL.pm (parse_char_string): Set line/column numbers
to generated nodes. Unescape identifiers. Extended attributes
for Definition's were ignored.
(append_child): Set |parent_node| attribute.
(parent_node): New attribute.
(check): Support interface/exception members. Support
extended attributes. Support definition identifier uniqueness
constraint.
(qualified_name): New attribute.
(Interface/Exception idl_text): Extended attributes were
not prepended to the returned text.
2008-08-02 Wakaba
* WebIDL.pm (parse_char_string): Set line/column numbers
to interface object experimentally. s/shift/pop/g, shift
would make things wrong. Support for interface forward
declarations was missing. Broken interface declarations
with no block were not ignored entirely.
(Whatpm::WebIDL::Node): New abstract class. This class
makes things easier.
(child_nodes): New attribute. Unlike DOM's attribute with
the same name, this attribute returns a dead list of nodes for
simplicity.
(get_user_data, set_user_data): New methods.
(Module idl_text): A SPACE character should be inserted
before the |{| character.
(Interface idl_text): Support for interface forward declarations.
(is_forward_declaration): New attribute.
2008-07-19 Wakaba
* WebIDL.pm (type_text): Better serializer.
2008-07-19 Wakaba
* WebIDL.pm: Revise forward-compatible parsing so that
it now can handle broken extended attributes and the like.
2008-07-19 Wakaba
* WebIDL.pm: Real support for extended attributes.
Support for extended attributes with arguments.
2008-07-19 Wakaba
* WebIDL.pm: Support for |exception| syntax.
(Interface->idl_text): Tentative support for inheritance.
2008-07-19 Wakaba
* WebIDL.pm: Hierarchical scoped name support was broken.
Support for raises, setraises, and getraises syntaxes.
2008-07-18 Wakaba
* WebIDL.pm: Support for |idl_text| attribute, version 1 (no
proper support for types, extended attributes, and exceptions yet).
WebIDL parser, version 1 (no support for exceptions yet,
no proper support for extended attributes yet).
2008-07-09 Wakaba
* WebIDL.pm (parse_char_string): Support for basic attribute syntax.
2008-06-29 Wakaba
* WebIDL.pm: Support for valuetype and const.
2008-06-29 Wakaba
* WebIDL.pm: New module.
2008-06-15 Wakaba
* Makefile (Entities.html): URI changed.
2008-06-08 Wakaba
* HTML.pm.src: Support for ruby parsing (HTML5 revision 1704).
2008-06-01 Wakaba
* HTML.pm.src (_get_next_token): A parse error was missing.
2008-06-01 Wakaba
* mklinktypelist.pl: rel=contact is no longer part of the HTML5
spec (commented out). (HTML5 revision 1711).
2008-05-25 Wakaba
* ContentType.pm: Drop support for UTF-32 (HTML5 revision 1701).
* HTML.pm.src: UTF-16BE and UTF-16LE should be considered
as UTF-16 (HTML5 revision 1701).
2008-05-25 Wakaba
* HTML.pm.src: Support for in (HTML5 revision
1692).
2008-05-25 Wakaba
* HTML.pm.src: The secondary insertion mode used when switching
to foreign content is the "in body" insertion mode (HTML5 revision
1696).
2008-05-25 Wakaba
* HTML.pm.src: Don't raise parse error for (HTML5
revision 1697).
2008-05-24 Wakaba
* HTML.pm.src: Support for end-of-file token in foreign content
insertion mode (HTML5 revision 1693). Update SVG camelCase
attribute list (HTML5 revision 1700).