Diff of /markup/html/whatpm/Whatpm/ChangeLog

revision 1.276 by wakaba, Sun Aug 17 05:09:12 2008 UTC revision 1.311 by wakaba, Mon Sep 15 07:19:03 2008 UTC
# Line 1  Line 1 
1    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
2    
3            * HTML.pm.src: Remove checking for control characters, surrogate
4            pairs, noncharacter code points, and non-Unicode code
5            points (they should be handled by Whatpm::Charset::UnicodeChecker).
6            (parse_char_stream): Support for the |$get_wrapper| argument and
7            character stream error handlers.
8    
9    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
10    
11            * ContentChecker.pm: Don't call |load_ns_module|
12            for null-namespace elements/attributes.
13    
14            * HTML.pm.src: Factor out $disallowed_control_chars
15            as a hash.
16    
17    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
18    
19            * HTML.pm.src: Regexp typo fixed.  |{prev_char}|
20            and |{next_char}| initializations are moved to initialization
21            method.  |{read_until}| now supports buffering.  Sync |set_inner_html|
22            with |parse_char_stream|.
23    
24    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
25    
26            * HTML.pm.src (parse_char_stream): Make |set_next_char|
27            invoke |manakai_read_until|, not only |read|, where
28            possible, to decrease the number of |read| method calls.
29    
30            * mkhtmlparser.pl: Related changes to the aforementioned
31            modification.
32    
33    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
34    
35            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
36            now reports character errors.
37    
38    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
39    
40            * HTML.pm.src: Non-white-space character tokens with leading
41            white space in the "before head" insertion mode were not
42            correctly handled.
43            (set_inner_html): Reimplemented using CharString decodehandle
44            class.  Support for $get_wrapper argument.  Support
45            for |{read_until}| feature.
46    
47    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
48    
49            * HTML.pm.src: Make a "bare ero" error for unknown
50            entities point to the "&" character.
51    
52    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
53    
54            * HTML.pm.src: It turns out that U+FFFD doesn't have to
55            be added to the list of excluded characters.
56    
57    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
58    
59            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
60            and |column| higher priority than those set by the
61            tokenizer's input handler.
62            ($self->{read_until}): Exclude U+FFFD (but this might
63            not be necessary, since now we do line/column fixup in
64            the character decode handle).
65    
66    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
67    
68            * HTML.pm.src: Use |{read_until}| where possible.
69    
70    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
71    
72            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
73            and |manakai_getc_until| to |manakai_read_until| to
74            reduce the number of string copies.
75    
76    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
77    
78            * HTML.pm.src (parse_char_string): Use newly created
79            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
80            standard feature to |open| a string as a filehandle,
81            since Perl's string filehandle does not seem to support the
82            |ungetc| method correctly.
83            (parse_char_stream): Define |{getc_until}| method.
84            (DATA_STATE): Experimental support for |getc_until| feature.
85    
86    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
87    
88            * HTML.pm.src: Check points added to newly added branches.
89    
90    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
91    
92            * HTML.pm.src: Remove |{char}|, which is no longer used.
93            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
94            with |{prev_state}|.
95    
96            * mkhtmlparser.pl: Remove |{char}| feature.
97            Remove |!!!back-next-input-character;| macro.
98    
99    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
100    
101            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
102            entity related tokenizer states in favor of new states
103            implementing the consume character reference algorithm.
104    
105    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
106    
107            * HTML.pm.src: "Consume a character reference" algorithm is
108            now implemented as a tokenizer's state, rather than
109            a method, with minimum changes (more changes will
110            be made, in due course).  "Bogus comment state"'s inner
111            loop gets removed.
112    
113    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
114    
115            * HTML.pm.src: Move |PUBLIC| and |SYSTEM| keyword tokenizing
116            into their own tokenizer states.
117    
118    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
119    
120            * HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
121            is split into three states.
122    
123    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
124    
125            * HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into
126            itself and new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
127            the tokenizer no longer has to push back next input
128            characters in those states.
129    
130    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
131    
132            * HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| broken
133            into four states so that the tokenizer no longer has to push
134            back next input characters in that state.
135    
136    2008-09-11  Wakaba  <wakaba@suika.fam.cx>
137    
138            * HTML.pm.src: Methods now accept additional parameter, $get_wrapper,
139            which can be used to insert some wrapper between the character
140            stream handle and the tokenizer.  (It is currently not supported
141            for |set_inner_html| for |Element|s).
142    
143    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
144    
145            * HTML.pm.src: Ignore punctuation in charset names.
146    
147    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
148    
149            * ContentChecker.pm: Support for charset-layer error levels.
150    
151            * HTML.pm.src: Don't specify |text| argument for the
152            |chardecode:fallback| error, since it is not the encoding
153            being used alternatively.
154    
155    2008-09-06  Wakaba  <wakaba@suika.fam.cx>
156    
157            * HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).
158    
159    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
160    
161            * CacheManifest.pm: Support for extensibility (HTML5 revision 2051).
162    
163    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
164    
165            * HTML.pm.src: Bug fix and sync with the spec with regard
166            to after after frameset insertion mode processing (HTML5
167            revision 1909).  Note that the implementation was wrong
168            per the old spec before the r1909 changes.
169    
170    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
171    
172            * HTMLTable.pm: scope=auto algorithm fix synced with the
173            spec (HTML5 revision 2093).
174            ($process_row): Algorithm step numbers synced with the
175            spec (HTML5 revision 2092).
176    
177    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
178    
179            * HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5
180            revision 2094).
181    
182    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
183    
184            * ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).
185    
186    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
187    
188            * HTML.pm.src: '"' and "'" at the end of attribute
189            name (after another attribute) now raise a parse error (HTML5
190            revision 2123).  Empty unquoted attribute values are no
191            longer allowed (HTML5 revision 2122).
192    
193    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
194    
195            * mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5
196            revision 2130).
197    
198    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
199    
200            * ContentChecker.pm: |xml:lang| attribute value must be the same
201            as |lang| attribute value for HTML elements (HTML5 revision 2062
202            and so on).
203    
204    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
205    
206            * ContentChecker.pm: Error level definition for |xml_id_error|
207            was missing.
208    
209            * URIChecker.pm: The end of the URL should be marked as the
210            error location for an empty path error.  The position
211            between the userinfo and the port components should be
212            marked as the error location for an empty host error.
213    
214    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
215    
216            * URIChecker.pm: For errors, set parameters indicating where
217            in the value the error occurs.  Report unknown
218            address format error at warning level, since address
219            formats are rarely added.  Path segments starting with "/.."
220            were misinterpreted as dot-segments.
221    
222    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
223    
224            * URIChecker.pm (check_iri_reference): Requires
225            |Message::DOM::DOMImplementation|.
226    
227    2008-08-29  Wakaba  <wakaba@suika.fam.cx>
228    
229            * IMTChecker.pm: Updated for the new error reporting architecture.
230    
231            * ContentChecker.pm: Error levels for IMTs are added.
232    
233  2008-08-17  Wakaba  <wakaba@suika.fam.cx>
234    
235          * H2H.pm (_shift_token): Support for unquoted HTML attribute
