Diff of /markup/html/whatpm/Whatpm/ChangeLog

-revision 1.295 by wakaba,
Sat Sep 13 08:21:35 2008 UTC
+revision 1.312 by wakaba,
Mon Sep 15 08:09:39 2008 UTC
 Line 1
+-09-15  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Shorten keys.
+-09-15  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Remove checks for control characters, surrogate
+         pairs, noncharacter code points, and non-Unicode code
+         points (they should be handled by Whatpm::Charset::UnicodeChecker).
+         (parse_char_stream): Support for the |$get_wrapper| argument and
+         character stream error handlers.
+-09-15  Wakaba  <wakaba@suika.fam.cx>
+         * ContentChecker.pm: Don't call |load_ns_module|
+         for null-namespace elements/attributes.
+         * HTML.pm.src: Factor out $disallowed_control_chars
+         as a hash.
+-09-14  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Regexp typo fixed.  |{prev_char}|
+         and |{next_char}| initializations are moved to the initialization
+         method.  |{read_until}| now supports buffering.  Sync |set_inner_html|
+         with |parse_char_stream|.
+-09-14  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src (parse_char_stream): Make |set_next_char|
+         invoke |manakai_read_until|, not only |read|, where
+         possible, to decrease the number of |read| method calls.
+         * mkhtmlparser.pl: Related changes to the aforementioned
+         modification.
+-09-14  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
+         now reports character errors.
+-09-14  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Character tokens with leading white space followed
+         by non-white-space characters in the "before head" insertion mode
+         were not correctly handled.
+         (set_inner_html): Reimplemented using CharString decodehandle
+         class.  Support for $get_wrapper argument.  Support
+         for |{read_until}| feature.
+-09-14  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Make a "bare ero" error for unknown
+         entities point to the "&" character.
+-09-14  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: It turns out that U+FFFD doesn't have to
+         be added to the list of excluded characters.
+-09-14  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src ($char_onerror): Give the character decoder's |line|
+         and |column| higher priority than those set by the
+         tokenizer's input handler.
+         ($self->{read_until}): Exclude U+FFFD (but this might
+         not be necessary, since now we do line/column fixup in
+         the character decode handle).
+-09-14  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Use |{read_until}| where possible.
+-09-14  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Change |{getc_until}| to |{read_until}|
+         and |manakai_getc_until| to |manakai_read_until| to
+         reduce the number of string copies.
+-09-14  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src (parse_char_string): Use newly created
+         |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
+         standard feature to |open| a string as a filehandle,
+         since Perl's string filehandle does not seem to support the
+         |ungetc| method correctly.
+         (parse_char_stream): Define |{getc_until}| method.
+         (DATA_STATE): Experimental support for |getc_until| feature.
+-09-13  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Check points added to newly added branches.
+-09-13  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Remove |{char}|, which is no longer used.
+         Replace |{entity_in_attr}| and |{last_attribute_value_state}|
+         with |{prev_state}|.
+         * mkhtmlparser.pl: Remove |{char}| feature.
+         Remove |!!!back-next-input-character;| macro.
+-09-13  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
+         entity related tokenizer states in favor of new states
+         implementing the consume character reference algorithm.
+-09-13  Wakaba  <wakaba@suika.fam.cx>
+         * HTML.pm.src: "Consume a character reference" algorithm is
+         now implemented as a tokenizer state, rather than
+         a method, with minimal changes (more changes will
+         be made in due course).  The inner loop of the "bogus comment
+         state" is removed.
 -09-13  Wakaba  <wakaba@suika.fam.cx>
-         * HTML.pm: Make |PUBLIC| and |SYSTEM| keyword tokenizing
+         * HTML.pm.src: Make |PUBLIC| and |SYSTEM| keyword tokenizing
          into their own tokenizer states.
 -09-13  Wakaba  <wakaba@suika.fam.cx>

 Legend:
-Removed from v.1.295
+Added in v.1.312
