
Diff of /markup/html/whatpm/Whatpm/ChangeLog


revision 1.295 by wakaba, Sat Sep 13 08:21:35 2008 UTC
revision 1.312 by wakaba, Mon Sep 15 08:09:39 2008 UTC
# Line 1  Line 1 
1    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
2    
3            * HTML.pm.src: Shorten keys.
4    
5    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
6    
7            * HTML.pm.src: Remove checks for control character, surrogate
8            pair, and noncharacter code points and for non-Unicode code
9            points (they should be handled by Whatpm::Charset::UnicodeChecker).
10            (parse_char_stream): Support for the |$get_wrapper| argument and
11            character stream error handlers.
12    
13    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
14    
15            * ContentChecker.pm: Don't call |load_ns_module|
16            for null-namespace elements/attributes.
17    
18            * HTML.pm.src: Factor out $disallowed_control_chars
19            as a hash.
20    
21    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
22    
23            * HTML.pm.src: Regexp typo fixed.  |{prev_char}|
24            and |{next_char}| initializations are moved to the initialization
25            method.  |{read_until}| now supports buffering.  Sync |set_inner_html|
26            with |parse_char_stream|.
27    
28    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
29    
30            * HTML.pm.src (parse_char_stream): Make |set_next_char|
31            invoke |manakai_read_until|, not only |read|, where
32            possible, to decrease the number of |read| method calls.
33    
34            * mkhtmlparser.pl: Related changes to the aforementioned
35            modification.
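
A minimal sketch of why a read-until call can reduce the number of |read| calls, written in Python rather than the parser's Perl. The names `CountingStream`, `read_one_by_one`, and `BufferedReader` are hypothetical and are not Whatpm's API; the sketch only assumes that the tokenizer wants a run of characters up to some stop character.

```python
import io


class CountingStream:
    """Wraps a text stream and counts how often read() is called."""

    def __init__(self, text):
        self._io = io.StringIO(text)
        self.read_calls = 0

    def read(self, n):
        self.read_calls += 1
        return self._io.read(n)


def read_one_by_one(stream, stop_chars):
    """Baseline: one read() call per character until a stop character or EOF."""
    out = []
    while True:
        ch = stream.read(1)
        if ch == "" or ch in stop_chars:
            return "".join(out), ch
        out.append(ch)


class BufferedReader:
    """Hypothetical read-until reader: fetch a chunk, then scan it in memory."""

    def __init__(self, stream, chunk_size=64):
        self._stream = stream
        self._chunk_size = chunk_size
        self._buf = ""

    def read_until(self, stop_chars):
        """Return (run, stop): the run before the first stop character, and
        that stop character ("" at EOF)."""
        out = []
        while True:
            if not self._buf:
                self._buf = self._stream.read(self._chunk_size)
                if self._buf == "":
                    return "".join(out), ""
            # Index of the first stop character in the buffer, or -1 if none.
            idx = next((i for i, c in enumerate(self._buf) if c in stop_chars), -1)
            if idx >= 0:
                out.append(self._buf[:idx])
                stop = self._buf[idx]
                self._buf = self._buf[idx + 1:]
                return "".join(out), stop
            out.append(self._buf)
            self._buf = ""
```

With the input `hello world<b>` and stop characters `<&`, the baseline issues one `read` per character, while the buffered reader issues a single `read` for the whole chunk and keeps the remainder for the next call.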
36    
37    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
38    
39            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
40            now reports character errors.
41    
42    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
43    
44            * HTML.pm.src: Character tokens consisting of leading white space
45            followed by non-white-space characters were not correctly
46            handled in the "before head" insertion mode.
47            (set_inner_html): Reimplemented using the CharString decode handle
48            class.  Support for the $get_wrapper argument.  Support
49            for the |{read_until}| feature.
50    
51    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
52    
53            * HTML.pm.src: Make a "bare ero" error for unknown
54            entities point to the "&" character.
55    
56    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
57    
58            * HTML.pm.src: It turns out that U+FFFD doesn't have to
59            be added to the list of excluded characters.
60    
61    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
62    
63            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
64            and |column| higher priority than those set by the
65            tokenizer's input handler.
66            ($self->{read_until}): Exclude U+FFFD (but this might
67            not be necessary, since now we do line/column fixup in
68            the character decode handle).
69    
70    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
71    
72            * HTML.pm.src: Use |{read_until}| where possible.
73    
74    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
75    
76            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
77            and |manakai_getc_until| to |manakai_read_until| to
78            reduce the number of string copies.
79    
80    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
81    
82            * HTML.pm.src (parse_char_string): Use the newly created
83            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
84            standard feature to |open| a string as a filehandle,
85            since Perl's string filehandle does not seem to support the
86            |ungetc| method correctly.
87            (parse_char_stream): Define |{getc_until}| method.
88            (DATA_STATE): Experimental support for |getc_until| feature.
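
A minimal sketch of the kind of pushback behaviour the entry above relies on, in Python rather than Perl. `CharStringHandle` is a hypothetical name for illustration; this is not the interface of |Whatpm::Charset::DecodeHandle::CharString|. It assumes only that the handle reads characters from an in-memory string and must return a pushed-back character on the next read.

```python
class CharStringHandle:
    """Hypothetical character handle over an in-memory string with pushback."""

    def __init__(self, text):
        self._text = text
        self._pos = 0
        self._pushed = []  # stack of characters returned via ungetc()

    def getc(self):
        """Return the next character, or "" at end of input."""
        if self._pushed:
            return self._pushed.pop()
        if self._pos >= len(self._text):
            return ""
        ch = self._text[self._pos]
        self._pos += 1
        return ch

    def ungetc(self, ch):
        """Push a character back so that the next getc() returns it."""
        self._pushed.append(ch)
```

A tokenizer that peeks one character too far can call `ungetc` and then re-read the same character in another state.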
89    
90    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
91    
92            * HTML.pm.src: Checkpoints added to newly added branches.
93    
94    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
95    
96            * HTML.pm.src: Remove |{char}|, which is no longer used.
97            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
98            with |{prev_state}|.
99    
100            * mkhtmlparser.pl: Remove |{char}| feature.
101            Remove |!!!back-next-input-character;| macro.
102    
103    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
104    
105            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
106            entity related tokenizer states in favor of new states
107            implementing the consume character reference algorithm.
108    
109    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
110    
111            * HTML.pm.src: "Consume a character reference" algorithm is
112            now implemented as a tokenizer's state, rather than
113            a method, with minimum changes (more changes will
114            be made, in due course).  "Bogus comment state"'s inner
115            loop gets removed.
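
A minimal sketch of handling character references as tokenizer states instead of a method with its own inner loop, in Python rather than Perl. The state names, `resolve_decimal_refs`, and `return_state` are hypothetical; the sketch handles only decimal references such as `&#65;` and is far smaller than the real algorithm. Each iteration either consumes one character or switches state to reconsume it.

```python
DATA, CHAR_REF, CHAR_REF_NUM = "data", "char ref", "char ref num"


def resolve_decimal_refs(text):
    """Return text with decimal character references resolved."""
    state = DATA
    return_state = DATA  # state to resume once the reference is finished
    out, pending, digits = [], "", ""
    i = 0
    while i <= len(text):
        ch = text[i] if i < len(text) else None  # None stands for EOF
        if state == DATA:
            if ch == "&":
                return_state, state, pending = DATA, CHAR_REF, "&"
            elif ch is not None:
                out.append(ch)
            i += 1
        elif state == CHAR_REF:
            if ch == "#":
                pending, digits, state = "&#", "", CHAR_REF_NUM
                i += 1
            else:
                out.append(pending)   # bare "&": keep it literally
                state = return_state  # reconsume ch in the previous state
        else:  # CHAR_REF_NUM
            if ch is not None and ch in "0123456789":
                digits += ch
                i += 1
            else:
                if digits:
                    code = int(digits)
                    # Out-of-range values become U+FFFD in this sketch.
                    out.append(chr(code) if code <= 0x10FFFF else "\ufffd")
                    if ch == ";":
                        i += 1        # the terminating ";" is consumed
                else:
                    out.append(pending)  # "&#" with no digits stays literal
                state = return_state
    return "".join(out)
```

Because the reference logic lives in ordinary states, the main loop stays the only loop, and a single saved state is enough to return to wherever the "&" was seen.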
116    
117    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
118    
119            * HTML.pm.src: Make |PUBLIC| and |SYSTEM| keyword tokenizing
120            into their own tokenizer states.  [v.1.295 read "HTML.pm" here.]
121    
122    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
