Diff of /markup/html/whatpm/Whatpm/ChangeLog

revision 1.276 by wakaba, Sun Aug 17 05:09:12 2008 UTC
revision 1.310 by wakaba, Mon Sep 15 02:54:12 2008 UTC
# Line 1  Line 1 
1    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
2    
3            * ContentChecker.pm: Don't call |load_ns_module|
4            for null-namespace elements/attributes.
5    
6            * HTML.pm.src: Factor out $disallowed_control_chars
7            as a hash.
8    
9    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
10    
11            * HTML.pm.src: Regexp typo fixed.  |{prev_char}|
12            and |{next_char}| initializations are moved to the initialization
13            method.  |{read_until}| now supports buffering.  Sync |set_inner_html|
14            with |parse_char_stream|.
15    
16    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
17    
18            * HTML.pm.src (parse_char_stream): Make |set_next_char|
19            invoke |manakai_read_until|, not only |read|, where
20            possible, to decrease the number of |read| method calls.
21    
22            * mkhtmlparser.pl: Related changes to the aforementioned
23            modification.
24    
25    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
26    
27            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
28            now reports character errors.
29    
30    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
31    
32            * HTML.pm.src: Character tokens consisting of leading white space
33            followed by non-white-space characters were not correctly
34            handled in the "before head" insertion mode.
35            (set_inner_html): Reimplemented using CharString decodehandle
36            class.  Support for $get_wrapper argument.  Support
37            for |{read_until}| feature.
38    
39    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
40    
41            * HTML.pm.src: Make a "bare ero" error for unknown
42            entities point to the "&" character.
43    
44    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
45    
46            * HTML.pm.src: It turns out that U+FFFD doesn't have to
47            be added to the list of excluded characters.
48    
49    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
50    
51            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
52            and |column| a higher priority than those set by the
53            tokenizer's input handler.
54            ($self->{read_until}): Exclude U+FFFD (but this might
55            not be necessary, since now we do line/column fixup in
56            the character decode handle).
57    
58    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
59    
60            * HTML.pm.src: Use |{read_until}| where possible.
61    
62    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
63    
64            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
65            and |manakai_getc_until| to |manakai_read_until| to
66            reduce the number of string copies.
67    
68    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
69    
70            * HTML.pm.src (parse_char_string): Use newly created
71            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
72            standard feature to |open| a string as a filehandle,
73            since Perl's string filehandles do not seem to support the |ungetc|
74            method correctly.
75            (parse_char_stream): Define |{getc_until}| method.
76            (DATA_STATE): Experimental support for |getc_until| feature.
77    
78    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
79    
80            * HTML.pm.src: Check points added to newly added branches.
81    
82    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
83    
84            * HTML.pm.src: Remove |{char}|, which is no longer used.
85            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
86            with |{prev_state}|.
87    
88            * mkhtmlparser.pl: Remove |{char}| feature.
89            Remove |!!!back-next-input-character;| macro.
90    
91    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
92    
93            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
94            entity related tokenizer states in favor of new states
95            implementing the consume character reference algorithm.
96    
97    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
98    
99            * HTML.pm.src: "Consume a character reference" algorithm is
100            now implemented as a tokenizer's state, rather than
101            a method, with minimum changes (more changes will
102            be made, in due course).  "Bogus comment state"'s inner
103            loop gets removed.
104    
105    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
106    
107            * HTML.pm.src: Move |PUBLIC| and |SYSTEM| keyword tokenizing
108            into their own tokenizer states.
109    
110    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
111    
112            * HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
113            is split into three states.
114    
115    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
116    
117            * HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into
118            itself and new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
119            the tokenizer no longer has to push back next input
120            characters in those states.
121    
122    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
123    
124            * HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| broken
125            into four states so that the tokenizer no longer has to push
126            back next input characters in that state.
127    
128    2008-09-11  Wakaba  <wakaba@suika.fam.cx>
129    
130            * HTML.pm.src: Methods now accept an additional parameter, $get_wrapper,
131            which can be used to insert some wrapper between the character
132            stream handle and the tokenizer.  (It is currently not supported
133            for |set_inner_html| for |Element|s).
134    
135    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
136    
137            * HTML.pm.src: Ignore punctuation in charset names.
138    
139    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
140    
141            * ContentChecker.pm: Support for charset-layer error levels.
142    
143            * HTML.pm.src: Don't specify |text| argument for the
144            |chardecode:fallback| error, since it is not the encoding
145            being used alternatively.
146    
147    2008-09-06  Wakaba  <wakaba@suika.fam.cx>
148    
149            * HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).
150    
151    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
152    
153            * CacheManifest.pm: Support for extensibility (HTML5 revision 2051).
154    
155    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
156    
157            * HTML.pm.src: Bug fix and sync with the spec with regard
158            to after after frameset insertion mode processing (HTML5
159            revision 1909).  Note that the implementation was wrong
160            per the old spec before the r1909 changes.
161    
162    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
163    
164            * HTMLTable.pm: scope=auto algorithm fix synced with the
165            spec (HTML5 revision 2093).
166            ($process_row): Algorithm step numbers synced with the
167            spec (HTML5 revision 2092).
168    
169    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
170    
171            * HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5
172            revision 2094).
173    
174    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
175    
176            * ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).
177    
178    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
179    
180            * HTML.pm.src: '"' and "'" at the end of attribute
181            name (after another attribute) now raise parse error (HTML5
182            revision 2123).  Empty unquoted attribute values are no
183            longer allowed (HTML5 revision 2122).
184    
185    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
186    
187            * mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5
188            revision 2130).
189    
190    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
191    
192            * ContentChecker.pm: |xml:lang| attribute value must be the same
193            as |lang| attribute value for HTML elements (HTML5 revision 2062
194            and so on).
195    
196    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
197    
198            * ContentChecker.pm: Error level definition for |xml_id_error|
199            was missing.
200    
201            * URIChecker.pm: The end of the URL should be marked as the
202            error location for an empty path error.  The position
203            between the userinfo and the port components should be
204            marked as the error location for an empty host error.
205    
206    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
207    
208            * URIChecker.pm: For each error, set parameters indicating where
209            in the value the error occurs.  Report unknown address
210            format errors at the warning level, since address
211            formats are rarely added.  Path segments starting with "/.."
212            were misinterpreted as dot-segments.
213    
214    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
215    
216            * URIChecker.pm (check_iri_reference): Requires
217            |Message::DOM::DOMImplementation|.
218    
219    2008-08-29  Wakaba  <wakaba@suika.fam.cx>
220    
221            * IMTChecker.pm: Updated for the new error reporting architecture.
222    
223            * ContentChecker.pm: Error levels for IMTs are added.
224    
225  2008-08-17  Wakaba  <wakaba@suika.fam.cx>
226    
227          * H2H.pm (_shift_token): Support for unquoted HTML attribute
