Diff of /markup/html/whatpm/Whatpm/ChangeLog

revision 1.290 by wakaba, Wed Sep 10 10:46:50 2008 UTC revision 1.308 by wakaba, Sun Sep 14 13:09:00 2008 UTC
# Line 1  Line 1 
1    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
2    
3            * HTML.pm.src (parse_char_stream): Make |set_next_char|
4            invoke |manakai_read_until|, not only |read|, where
5            possible, to decrease the number of |read| method calls.
6    
7            * mkhtmlparser.pl: Related changes to the aforementioned
8            modification.
9    
10    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
11    
12            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
13            now reports character errors.
14    
15    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
16    
17            * HTML.pm.src: White-space-led non-white-space character
18            tokens in the "before head" insertion mode were not
19            correctly handled.
20            (set_inner_html): Reimplemented using CharString decodehandle
21            class.  Support for $get_wrapper argument.  Support
22            for |{read_until}| feature.
23    
24    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
25    
26            * HTML.pm.src: Make a "bare ero" error for unknown
27            entities point to the "&" character.
28    
29    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
30    
31            * HTML.pm.src: It turns out that U+FFFD does not have to
32            be added to the list of excluded characters.
33    
34    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
35    
36            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
37            and |column| higher priority than those set by the
38            tokenizer's input handler.
39            ($self->{read_until}): Exclude U+FFFD (but this might
40            not be necessary, since now we do line/column fixup in
41            the character decode handle).
42    
43    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
44    
45            * HTML.pm.src: Use |{read_until}| where possible.
46    
47    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
48    
49            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
50            and |manakai_getc_until| to |manakai_read_until| to
51            reduce the number of string copies.
52    
53    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
54    
55            * HTML.pm.src (parse_char_string): Use newly created
56            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
57            standard feature to |open| a string as a filehandle,
58            since Perl's string filehandle does not seem to support the
59            |ungetc| method correctly.
60            (parse_char_stream): Define |{getc_until}| method.
61            (DATA_STATE): Experimental support for |getc_until| feature.
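
To illustrate why a dedicated string handle helps, here is a small Python sketch of a handle that supports pushing a character back.  It is not the |Whatpm::Charset::DecodeHandle::CharString| implementation; the class and method names are assumptions modelled on the |getc| / |ungetc| names used in this entry, and this |ungetc| takes a character rather than a code point.

```python
class PushbackCharHandle:
    """Hypothetical string-backed handle with getc/ungetc."""

    def __init__(self, text):
        self._text = text
        self._pos = 0
        self._pushed = []   # characters pushed back, most recent last

    def getc(self):
        # Pushed-back characters are returned before fresh input.
        if self._pushed:
            return self._pushed.pop()
        if self._pos >= len(self._text):
            return ""       # end of input
        ch = self._text[self._pos]
        self._pos += 1
        return ch

    def ungetc(self, ch):
        # Make ch the next character returned by getc.
        self._pushed.append(ch)


handle = PushbackCharHandle("ab")
first = handle.getc()    # "a"
handle.ungetc(first)     # put it back
again = handle.getc()    # "a" once more, then "b" follows
```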
62    
63    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
64    
65            * HTML.pm.src: Check points added to newly added branches.
66    
67    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
68    
69            * HTML.pm.src: Remove |{char}|, which is no longer used.
70            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
71            with |{prev_state}|.
72    
73            * mkhtmlparser.pl: Remove |{char}| feature.
74            Remove |!!!back-next-input-character;| macro.
75    
76    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
77    
78            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
79            entity related tokenizer states in favor of new states
80            implementing the consume character reference algorithm.
81    
82    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
83    
84            * HTML.pm.src: "Consume a character reference" algorithm is
85            now implemented as a tokenizer's state, rather than
86            a method, with minimum changes (more changes will
87            be made, in due course).  "Bogus comment state"'s inner
88            loop gets removed.
89    
90    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
91    
92            * HTML.pm.src: Move |PUBLIC| and |SYSTEM| keyword tokenizing
93            into their own tokenizer states.
94    
95    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
96    
97            * HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
98            is split into three states.
99    
100    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
101    
102            * HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into
103            itself and new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
104            the tokenizer no longer has to push back next input
105            characters in those states.
106    
107    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
108    
109            * HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| is broken
110            into four states so that the tokenizer no longer has to push
111            back next input characters in that state.
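
The technique of splitting a state so that nothing has to be pushed back can be sketched as follows, in Python rather than Perl.  This is a simplified illustration, not the actual state set: the state names are hypothetical, only three states are used, and only "--" and "DOCTYPE" are recognized after "<!".  Characters already consumed are carried along in a variable, so a failed match hands them to the bogus comment instead of returning them to the input.

```python
def classify_markup_declaration(text):
    """Classify the characters that follow "<!", one character at a time.

    Returns (kind, data): kind is "comment", "doctype" or "bogus comment";
    data holds the consumed characters that belong to a bogus comment.
    """
    keyword = "DOCTYPE"
    state = "md_open"
    consumed = ""   # characters seen so far, kept instead of pushed back
    for ch in text:
        if state == "md_open":
            if ch == "-":
                state, consumed = "md_hyphen", ch
            elif ch.upper() == "D":
                state, consumed = "md_doctype", ch
            else:
                return ("bogus comment", ch)
        elif state == "md_hyphen":
            if ch == "-":
                return ("comment", "")
            return ("bogus comment", consumed + ch)
        elif state == "md_doctype":
            if ch.upper() != keyword[len(consumed)]:
                return ("bogus comment", consumed + ch)
            consumed += ch
            if len(consumed) == len(keyword):
                return ("doctype", "")
    # Input ended before a decision could be made.
    return ("bogus comment", consumed)
```

Each call to the loop body handles exactly one input character, which is what makes the pushback unnecessary.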
112    
113    2008-09-11  Wakaba  <wakaba@suika.fam.cx>
114    
115            * HTML.pm.src: Methods now accept an additional parameter, $get_wrapper,
116            which can be used to insert some wrapper between the character
117            stream handle and the tokenizer.  (It is currently not supported
118            for |set_inner_html| for |Element|s.)
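
A rough Python sketch of how a wrapper hook of this kind might be used.  The function and class names below are hypothetical, and the real $get_wrapper calling convention in the Perl code may differ; the sketch only assumes that the hook receives the character handle and returns a handle the tokenizer reads from instead.

```python
import io


class CountingWrapper:
    """Hypothetical wrapper that counts the characters passed through."""

    def __init__(self, handle):
        self.handle = handle
        self.chars_read = 0

    def read(self, n):
        chunk = self.handle.read(n)
        self.chars_read += len(chunk)
        return chunk


def read_all_chars(handle, get_wrapper=None):
    # Stand-in for a parser entry point: if a wrapper factory is given,
    # the tokenizer reads from the wrapped handle instead of the raw one.
    if get_wrapper is not None:
        handle = get_wrapper(handle)
    chars = []
    while True:
        ch = handle.read(1)
        if ch == "":
            break
        chars.append(ch)
    return chars


wrappers = []

def make_wrapper(handle):
    wrapper = CountingWrapper(handle)
    wrappers.append(wrapper)
    return wrapper

chars = read_all_chars(io.StringIO("<p>"), make_wrapper)
```

After the call, wrappers[0].chars_read reflects what the tokenizer consumed through the wrapper.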
119    
120  2008-09-10  Wakaba  <wakaba@suika.fam.cx>
121    
122          * HTML.pm.src: Ignore punctuation in charset names.
