Diff of /markup/html/whatpm/Whatpm/ChangeLog

revision 1.262 by wakaba, Sat Jul 19 13:11:30 2008 UTC revision 1.313 by wakaba, Mon Sep 15 09:02:27 2008 UTC
# Line 1  Line 1 
1    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
2    
3            * HTML.pm.src, mkhtmlparser.pl: Replace |{prev_char}|
4            by |{s_kwd}| in DATA_STATE.
5    
6    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
7    
8            * HTML.pm.src: Shorten keys.
9    
10    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
11    
12            * HTML.pm.src: Remove checking for control character, surrogate
13            pair, or noncharacter code points and non-Unicode code
14            points (they should be handled by Whatpm::Charset::UnicodeChecker).
15            (parse_char_stream): Support for the |$get_wrapper| argument and
16            character stream error handlers.
17    
18    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
19    
20            * ContentChecker.pm: Don't call |load_ns_module|
21            for null-namespace elements/attributes.
22    
23            * HTML.pm.src: Factor out $disallowed_control_chars
24            as a hash.
25    
26    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
27    
28            * HTML.pm.src: Regexp typo fixed.  |{prev_char}|
29            and |{next_char}| initializations are moved to the initialization
30            method.  |{read_until}| now supports buffering.  Sync |set_inner_html|
31            with |parse_char_stream|.
32    
33    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
34    
35            * HTML.pm.src (parse_char_stream): Make |set_next_char|
36            invoke |manakai_read_until|, not only |read|, where
37            possible, to decrease the number of |read| method calls.
38    
39            * mkhtmlparser.pl: Related changes to the aforementioned
40            modification.
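
To illustrate why a bulk |read_until|-style call reduces the number of |read| calls, here is a minimal Python sketch (the parser itself is Perl). The class `CountingStream`, its method signatures, and the `PLAIN_RUN` character set are hypothetical and are not the Whatpm API; the sketch only counts how many calls are needed to consume a run of plain text.

```python
import re


class CountingStream:
    """Hypothetical character stream that counts read calls."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.calls = 0

    def read(self):
        """Return the next character, or "" at end of input."""
        self.calls += 1
        char = self.text[self.pos:self.pos + 1]
        self.pos += len(char)
        return char

    def read_until(self, run_pattern):
        """Return the longest run matching run_pattern at the current position."""
        self.calls += 1
        match = run_pattern.match(self.text, self.pos)
        self.pos = match.end()
        return match.group(0)


# Characters that need no special handling in a data-like state (assumed set).
PLAIN_RUN = re.compile(r"[^<&]*")


def consume_text_per_char(stream):
    """Consume plain text one character at a time, one read call each."""
    out = []
    while stream.pos < len(stream.text) and stream.text[stream.pos] not in "<&":
        out.append(stream.read())
    return "".join(out)
```

With the input "hello<b>", the per-character loop needs five calls to consume "hello", while a single `read_until(PLAIN_RUN)` call returns the same run.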
41    
42    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
43    
44            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
45            now reports character errors.
46    
47    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
48    
49            * HTML.pm.src: Character tokens with leading white space followed
50            by non-white-space characters were not correctly handled in the
51            "before head" insertion mode.
52            (set_inner_html): Reimplemented using CharString decodehandle
53            class.  Support for $get_wrapper argument.  Support
54            for |{read_until}| feature.
55    
56    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
57    
58            * HTML.pm.src: Make a "bare ero" error for unknown
59            entities point to the "&" character.
60    
61    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
62    
63            * HTML.pm.src: It turns out that U+FFFD doesn't have to
64            be added to the list of excluded characters.
65    
66    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
67    
68            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
69            and |column| higher priority than those set by the
70            tokenizer's input handler.
71            ($self->{read_until}): Exclude U+FFFD (but this might
72            not be necessary, since now we do line/column fixup in
73            the character decode handle).
74    
75    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
76    
77            * HTML.pm.src: Use |{read_until}| where possible.
78    
79    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
80    
81            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
82            and |manakai_getc_until| to |manakai_read_until| to
83            reduce the number of string copies.
84    
85    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
86    
87            * HTML.pm.src (parse_char_string): Use newly created
88            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
89            standard feature to |open| a string as a filehandle,
90            since Perl's string filehandles do not seem to support the |ungetc|
91            method correctly.
92            (parse_char_stream): Define |{getc_until}| method.
93            (DATA_STATE): Experimental support for |getc_until| feature.
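
The need for a handle with working pushback can be pictured with a small Python sketch. The class name `CharStringHandle` and the `getc`/`ungetc` signatures below are hypothetical, loosely modeled on the method names in this entry; this is not the actual |Whatpm::Charset::DecodeHandle::CharString| implementation.

```python
class CharStringHandle:
    """Hypothetical in-memory character handle with pushback support."""

    def __init__(self, text):
        self._text = text
        self._pos = 0
        self._pushed = []  # characters returned via ungetc, last in first out

    def getc(self):
        """Return the next character, or None at end of input."""
        if self._pushed:
            return self._pushed.pop()
        if self._pos >= len(self._text):
            return None
        char = self._text[self._pos]
        self._pos += 1
        return char

    def ungetc(self, char):
        """Push one character back so that the next getc returns it."""
        self._pushed.append(char)
```

A tokenizer that reads one character too far can call `ungetc` and see the same character again on the next `getc`.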
94    
95    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
96    
97            * HTML.pm.src: Check points added to newly added branches.
98    
99    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
100    
101            * HTML.pm.src: Remove |{char}|, which is no longer used.
102            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
103            with |{prev_state}|.
104    
105            * mkhtmlparser.pl: Remove |{char}| feature.
106            Remove |!!!back-next-input-character;| macro.
107    
108    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
109    
110            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
111            entity related tokenizer states in favor of new states
112            implementing the consume character reference algorithm.
113    
114    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
115    
116            * HTML.pm.src: "Consume a character reference" algorithm is
117            now implemented as a tokenizer's state, rather than
118            a method, with minimum changes (more changes will
119            be made, in due course).  "Bogus comment state"'s inner
120            loop gets removed.
121    
122    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
123    
124            * HTML.pm.src: Move |PUBLIC| and |SYSTEM| keyword tokenizing
125            into their own tokenizer states.
126    
127    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
128    
129            * HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
130            is split into three states.
131    
132    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
133    
134            * HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into
135            itself and new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
136            the tokenizer no longer has to push back next input
137            characters in those states.
138    
139    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
140    
141            * HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| broken
142            into four states so that the tokenizer no longer has to push
143            back next input characters in that state.
144    
145    2008-09-11  Wakaba  <wakaba@suika.fam.cx>
146    
147            * HTML.pm.src: Methods now accept an additional parameter, $get_wrapper,
148            which can be used to insert some wrapper between the character
149            stream handle and the tokenizer.  (It is currently not supported
150            for |set_inner_html| for |Element|s).
151    
152    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
153    
154            * HTML.pm.src: Ignore punctuation in charset names.
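
A minimal Python sketch of what ignoring punctuation in charset names could look like. It assumes that ASCII punctuation and white space are dropped and that the comparison is case-insensitive; the exact set of characters ignored by HTML.pm.src is not stated in this entry, and the function names are hypothetical.

```python
import string

# Characters dropped before comparison (assumption: ASCII punctuation and white space).
_IGNORED = set(string.punctuation) | set(string.whitespace)


def normalize_charset_name(name):
    """Drop ignored characters and lowercase the rest."""
    return "".join(ch for ch in name if ch not in _IGNORED).lower()


def same_charset_name(a, b):
    """Compare two charset labels while ignoring punctuation and case."""
    return normalize_charset_name(a) == normalize_charset_name(b)
```

Under these assumptions, "UTF-8" and "utf8" compare as the same label, while "utf-8" and "utf-16" do not.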
155    
156    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
157    
158            * ContentChecker.pm: Support for charset-layer error levels.
159    
160            * HTML.pm.src: Don't specify |text| argument for the
161            |chardecode:fallback| error, since it is not the encoding
162            being used alternatively.
163    
164    2008-09-06  Wakaba  <wakaba@suika.fam.cx>
165    
166            * HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).
167    
168    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
169    
170            * CacheManifest.pm: Support for extensibility (HTML5 revision 2051).
171    
172    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
173    
174            * HTML.pm.src: Bug fix and sync with the spec with regard
175            to after after frameset insertion mode processing (HTML5
176            revision 1909).  Note that the implementation was wrong
177            per the old spec before the r1909 changes.
178    
179    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
180    
181            * HTMLTable.pm: scope=auto algorithm fix synced with the
182            spec (HTML5 revision 2093).
183            ($process_row): Algorithm step numbers synced with the
184            spec (HTML5 revision 2092).
185    
186    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
187    
188            * HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5
189            revision 2094).
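
The difference between the two properties can be checked with Python's standard `unicodedata` module. The `WHITE_SPACE` set below is typed in by hand from the Unicode White_Space property and should be treated as an assumption (membership can differ between Unicode versions); it is not taken from HTMLTable.pm.

```python
import unicodedata

# Code points with the White_Space property (hand-copied; an assumption).
WHITE_SPACE = (
    set(range(0x0009, 0x000E))
    | {0x0020, 0x0085, 0x00A0, 0x1680}
    | set(range(0x2000, 0x200B))
    | {0x2028, 0x2029, 0x202F, 0x205F, 0x3000}
)


def is_zs(char):
    """True if the character's general category is Zs (space separator)."""
    return unicodedata.category(char) == "Zs"


def is_white_space(char):
    """True if the character is in the hand-copied White_Space set."""
    return ord(char) in WHITE_SPACE
```

TAB and LINE FEED are White_Space but not Zs, so a check based on Zs alone misses common white space characters.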
190    
191    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
192    
193            * ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).
194    
195    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
196    
197            * HTML.pm.src: '"' and "'" at the end of attribute
198            name (after another attribute) now raise parse error (HTML5
199            revision 2123).  Empty unquoted attribute values are no
200            longer allowed (HTML5 revision 2122).
201    
202    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
203    
204            * mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5
205            revision 2130).
206    
207    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
208    
209            * ContentChecker.pm: |xml:lang| attribute value must be the same
210            as |lang| attribute value for HTML elements (HTML5 revision 2062
211            and so on).
212    
213    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
214    
215            * ContentChecker.pm: Error level definition for |xml_id_error|
216            was missing.
217    
218            * URIChecker.pm: The end of the URL should be marked as the
219            error location for an empty path error.  The position
220            between the userinfo and the port components should be
221            marked as the error location for an empty host error.
222    
223    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
224    
225            * URIChecker.pm: Set parameters representing where in the
226            value the error occurs for errors.  Report unknown
227            address format error at warning level, since address
228            formats are rarely added.  Path segments starting with "/.."
229            were misinterpreted as dot-segments.
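
The dot-segment fix can be pictured as follows. In RFC 3986 terms, a dot-segment is a path segment that is exactly "." or ".."; a segment that merely starts with ".." is an ordinary segment. The Python functions below are a hypothetical illustration, not the URIChecker.pm code.

```python
def is_dot_segment(segment):
    """True only if the whole path segment is "." or ".."."""
    return segment in (".", "..")


def dot_segments(path):
    """Return the dot-segments found in a slash-separated path."""
    return [seg for seg in path.split("/") if is_dot_segment(seg)]
```

For example, "/a/../b" contains one dot-segment, while "/a/..b/c" contains none.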
230    
231    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
232    
233            * URIChecker.pm (check_iri_reference): Requires
234            |Message::DOM::DOMImplementation|.
235    
236    2008-08-29  Wakaba  <wakaba@suika.fam.cx>
237    
238            * IMTChecker.pm: Updated for the new error reporting architecture.
239    
240            * ContentChecker.pm: Error levels for IMTs are added.
241    
242    2008-08-17  Wakaba  <wakaba@suika.fam.cx>
243    
244            * H2H.pm (_shift_token): Support for unquoted HTML attribute
245            values.
246    
247    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
248    
249            * CacheManifest.pm: Support for new style of error
250            reports.
251    
252            * HTML.pm.src: Set line=1, column=1 to the document node.
253    
254    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
255    
256            * ContentChecker.pm, RDFXML.pm: Pass {level} object to language tag
257            and URL checkers.  Support for more error levels for bogus
258            language tag and URL "standards".
259    
260            * LangTag.pm, URIChecker.pm: Support for new style error
261            level reporting.
262    
263    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
264    
265            * ContentChecker.pm: Support for RDF/XML error levels.
266    
267            * HTMLTable.pm, RDFXML.pm: Support for new style of error level
268            specifying.  Error types are revised.
269    
270    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
271    
272            * ContentChecker.pm: All error reporting method calls are
273            renewed.
274    
275    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
276    
277            * HTML.pm.src: All error type names and "text" parameters
278            are revised.  Use new style for "level" specification.
279    
280            * mkhtmlparser.pl: Use new style for "level" specification.
281    
282    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
283    
284            * WebIDL.pm (parse_char_string): Simplified error
285            reporting process for broken ignored valuetype definition.
286            (Valuetype idl_text): Support for special "DOMString" name.
287    
288    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
289    
290            * WebIDL.pm ($get_scoped_name): Append "::::" if the last
291            terminal of the ScopedName is "DOMString", so that it can be
292            determined later whether the last part of the scoped name is
293            "DOMString" or "_DOMString".  This is necessary to determine whether
294            a |typedef| definition should be ignored or not.
295            (parse_char_string): Unescape the identifier of
296            exception members.
297            ($resolve): Return undef for builtin types and sequence<T>
298            types (we might not have to do this, however...).
299            (check): Support checking for Exceptions, Valuetypes,
300            and Typedefs.
301            ($serialize_type): Support for "DOMString::::" syntax.
302            (Typedef idl_text): Output Type as "DOMString" if it
303            is really "DOMString" (i.e. its internal representation
304            is "::DOMString::").
305    
306    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
307    
308            * WebIDL.pm ($resolve): New code, based on resolve code
309            for constant types in the |check| method.
310            (check): Support for checking of attributes, operations, and
311            arguments.
312            (Attribute/Operation idl_text): Exception names in getraises,
313            setraises, and raises clauses are serialized by |$serialize_type|
314            code.
315    
316    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
317    
318            * WebIDL.pm ($integer): Order of alternatives is changed to match
319            hexadecimal numbers (the original pattern, taken from the spec,
320            did not work for hexadecimal numbers, because the "0" prefix
321            matches the [0-7]* part (as an empty string) and therefore
322            the pattern does not match the remaining "x..." part of a "0x..."
323            integer literal).
324            ($get_type): It now returns a string, not an array reference,
325            for regular types and |sequence| types (i.e. it in any case
326            returns a string).
327            ($get_next_token): The second item in the array that represents
328            an integer or float token is now a Perl number value, not the
329            original string representation of the number.
330            (check): Support for const value consistency checking.
331            No extended attribute is defined for constants.
332            (Node subclasses): Use simple strings rather than array references
333            for default data type values.
334            ($serialize_type): Type values are now simple strings.
335            (value): If the new attribute value is a false value, then
336            a FALSE value is set to the attribute.
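
The |$integer| ordering problem can be reproduced in any backtracking regex engine. The Python sketch below uses patterns of the shape described in this entry; the exact pattern in the spec and in WebIDL.pm is an assumption here, and the names `SPEC_ORDER`, `FIXED_ORDER`, and `first_token` are hypothetical.

```python
import re

# Octal alternative first: "0" plus an empty [0-7]* run already succeeds.
SPEC_ORDER = re.compile(r"-?0(?:[0-7]*|[Xx][0-9A-Fa-f]+)|-?[1-9][0-9]*")

# Hexadecimal alternative first: "0x..." is tried before the octal form.
FIXED_ORDER = re.compile(r"-?0(?:[Xx][0-9A-Fa-f]+|[0-7]*)|-?[1-9][0-9]*")


def first_token(pattern, text):
    """Return the integer literal matched at the start of text, or None."""
    match = pattern.match(text)
    return match.group(0) if match else None
```

With the first ordering, a tokenizer sees only "0" at the start of "0x1F" and is left with "x1F"; with the second ordering the whole literal is matched, and octal and decimal literals still match as before.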
337    
338    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
339    
340            * WebIDL.pm ($get_scoped_name): Now scoped names are stored
341            in their stringified format ("scoped name" as defined in the
342            spec).  Note that a future version of this module should not use
343            array references for type values and the |type_text| attribute
344            should be made obsolete.
345            (parse_char_string): Unescape attribute names.
346            (check): Support for checking of whether inherited interfaces
347            are actually defined or not.  Support for checking of whether
348            interface member identifiers are duplicated or not.
349            ($serialize_type): Scoped names are returned as is.  A future
350            version of this code should escape identifiers other than "DOMString",
351            otherwise the idl_text would be non-conforming.
352    
353    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
354    
355            * WebIDL.pm (parse_char_string): Set line/column numbers
356            to generated nodes.  Unescape identifiers.  Extended attributes
357            for Definition's were ignored.
358            (append_child): Set |parent_node| attribute.
359            (parent_node): New attribute.
360            (check): Support interface/exception members.  Support
361            extended attributes.  Support definition identifier uniqueness
362            constraint.
363            (qualified_name): New attribute.
364            (Interface/Exception idl_text): Extended attributes were
365            not prepended to the returned text.
366    
367    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
368    
369            * WebIDL.pm (parse_char_string): Set line/column numbers
370            to interface object experimentally.  s/shift/pop/g, shift
371            would make things wrong.  Support for interface forward
372            declarations was missing.  Broken interface declarations
373            with no block were not ignored entirely.
374            (Whatpm::WebIDL::Node): New abstract class.  This class
375            makes things easier.
376            (child_nodes): New attribute.  Unlike DOM's attribute with
377            the same name, this attribute returns a dead list of nodes for
378            simplicity.
379            (get_user_data, set_user_data): New methods.
380            (Module idl_text): A SPACE character should be inserted
381            before the |{| character.
382            (Interface idl_text): Support for interface forward declarations.
383            (is_forward_declaration): New attribute.
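
The "dead list" behavior of |child_nodes| can be sketched in Python as a snapshot copy. The `Node` class below is a hypothetical stand-in for |Whatpm::WebIDL::Node|, written only to show the difference from a live list; it is not the module's implementation.

```python
class Node:
    """Hypothetical node that exposes its children as a snapshot."""

    def __init__(self):
        self._children = []

    def append_child(self, child):
        """Add a child node and return it."""
        self._children.append(child)
        return child

    @property
    def child_nodes(self):
        """Return a copy of the child list; later changes do not affect it."""
        return list(self._children)
```

A list obtained before a later `append_child` call keeps its old length, whereas a live DOM NodeList would reflect the new child.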
384    
385    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
386    
387            * WebIDL.pm (type_text): Better serializer.
388    
389    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
390    
391            * WebIDL.pm: Revise forward-compatible parsing so that
