
Diff of /markup/html/whatpm/Whatpm/ChangeLog


revision 1.237 by wakaba, Sun May 18 03:46:26 2008 UTC revision 1.314 by wakaba, Mon Sep 15 09:27:53 2008 UTC
1    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
2    
3            * HTML.pm.src: New state |PCDATA_STATE|.  Use an empty string for
4            |{s_kwd}| in DATA_STATE as default.
5    
6    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
7    
8            * HTML.pm.src, mkhtmlparser.pl: Replace |{prev_char}|
9            by |{s_kwd}| in DATA_STATE.
10    
11    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
12    
13            * HTML.pm.src: Shorten keys.
14    
15    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
16    
17            * HTML.pm.src: Remove checking for control character, surrogate
18            pair, or noncharacter code points and non-Unicode code
19            points (they should be handled by Whatpm::Charset::UnicodeChecker).
20            (parse_char_stream): Support for the |$get_wrapper| argument and
21            character stream error handlers.
22    
23    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
24    
25            * ContentChecker.pm: Don't call |load_ns_module|
26            for null-namespace elements/attributes.
27    
28            * HTML.pm.src: Factor out $disallowed_control_chars
29            as a hash.
30    
31    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
32    
33            * HTML.pm.src: Regexp typo fixed.  |{prev_char}|
34            and |{next_char}| initializations are moved to initialization
35            method.  |{read_until}| now supports buffering.  Sync |set_inner_html|
36            with |parse_char_stream|.
37    
38    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
39    
40            * HTML.pm.src (parse_char_stream): Make |set_next_char|
41            invoke |manakai_read_until|, not only |read|, where
42            possible, to decrease the number of |read| method calls.
43    
44            * mkhtmlparser.pl: Related changes to the aforementioned
45            modification.
46    
47    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
48    
49            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
50            now reports character errors.
51    
52    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
53    
54            * HTML.pm.src: Character tokens with leading white space
55            followed by non-white-space characters in the "before head"
56            insertion mode were not correctly handled.
57            (set_inner_html): Reimplemented using CharString decodehandle
58            class.  Support for $get_wrapper argument.  Support
59            for |{read_until}| feature.
60    
61    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
62    
63            * HTML.pm.src: Make a "bare ero" error for unknown
64            entities point to the "&" character.
65    
66    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
67    
68            * HTML.pm.src: It turns out that U+FFFD doesn't have to
69            be added to the list of excluded characters.
70    
71    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
72    
73            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
74            and |column| a higher priority than those set by the
75            tokenizer's input handler.
76            ($self->{read_until}): Exclude U+FFFD (but this might
77            not be necessary, since now we do line/column fixup in
78            the character decode handle).
79    
80    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
81    
82            * HTML.pm.src: Use |{read_until}| where possible.
83    
84    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
85    
86            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
87            and |manakai_getc_until| to |manakai_read_until| to
88            reduce the number of string copies.
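
The following is a minimal sketch, in Python rather than the module's
Perl, of why reading a whole run of characters in one call needs fewer
calls than reading one character at a time.  The class, the function,
and the pattern are hypothetical illustrations and are not taken from
HTML.pm.src; only the idea of a "read until" operation comes from the
entry above.

```python
import re


class CountingReader:
    """Hypothetical in-memory character source that counts read calls."""

    def __init__(self, text):
        self._text = text
        self._pos = 0
        self.calls = 0

    def read(self, n=1):
        """Return up to n characters (an empty string at end of input)."""
        self.calls += 1
        chunk = self._text[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk

    def read_until(self, pattern):
        """Return, in one call, the run matched by pattern at the current position."""
        self.calls += 1
        m = pattern.match(self._text, self._pos)
        run = m.group(0) if m else ""
        self._pos += len(run)
        return run


# Assumed notion of "ordinary" data characters: anything except "<" and "&".
PLAIN_TEXT = re.compile(r"[^<&]*")


def text_run_char_by_char(reader):
    """Collect the same run one character at a time.

    Note that this variant also consumes the delimiter that ends the run.
    """
    out = []
    while True:
        c = reader.read(1)
        if c == "" or c in "<&":
            break
        out.append(c)
    return "".join(out)
```

For the input "hello<b>", the character-by-character loop calls |read|
once per character plus once for the delimiter, while the run-based
variant makes a single call and builds a single string.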
89    
90    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
91    
92            * HTML.pm.src (parse_char_string): Use newly created
93            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
94            standard feature to |open| a string as a filehandle,
95            since Perl's string filehandle does not seem to support the |ungetc|
96            method correctly.
97            (parse_char_stream): Define |{getc_until}| method.
98            (DATA_STATE): Experimental support for |getc_until| feature.
99    
100    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
101    
102            * HTML.pm.src: Check points added to newly added branches.
103    
104    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
105    
106            * HTML.pm.src: Remove |{char}|, which is no longer used.
107            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
108            by |{prev_state}|.
109    
110            * mkhtmlparser.pl: Remove |{char}| feature.
111            Remove |!!!back-next-input-character;| macro.
112    
113    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
114    
115            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
116            entity related tokenizer states in favor of new states
117            implementing the consume character reference algorithm.
118    
119    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
120    
121            * HTML.pm.src: "Consume a character reference" algorithm is
122            now implemented as a tokenizer's state, rather than
123            a method, with minimum changes (more changes will
124            be made, in due course).  "Bogus comment state"'s inner
125            loop gets removed.
126    
127    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
128    
129            * HTML.pm.src: Make |PUBLIC| and |SYSTEM| keyword tokenizing
130            into their own tokenizer states.
131    
132    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
133    
134            * HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
135            is split into three states.
136    
137    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
138    
139            * HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into
140            itself and new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
141            the tokenizer no longer has to push back next input
142            characters in those states.
143    
144    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
145    
146            * HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| broken
147            into four states so that the tokenizer no longer has to push
148            back next input characters in that state.
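
As an illustration only, the sketch below (Python, not the module's
Perl) shows how splitting one lookahead-heavy state into several small
states lets a tokenizer decide what follows "<!" one character at a
time, without pushing characters back.  The state names and the
function are hypothetical and do not correspond to the actual states in
HTML.pm.src; CDATA sections are deliberately left out to keep the
sketch short.

```python
def classify_markup_declaration(chars):
    """Classify the text that follows "<!", consuming one character at a time.

    Returns "comment", "doctype", or "bogus comment".  No character is
    ever pushed back; the partially matched keyword is kept in `kwd`.
    """
    state = "MD_OPEN"   # hypothetical state names
    kwd = ""            # keyword prefix matched so far
    for c in chars:
        if state == "MD_OPEN":
            if c == "-":
                state = "MD_HYPHEN"
            elif c in "dD":
                state = "MD_DOCTYPE"
                kwd = c
            else:
                return "bogus comment"
        elif state == "MD_HYPHEN":
            return "comment" if c == "-" else "bogus comment"
        elif state == "MD_DOCTYPE":
            kwd += c
            if kwd.upper() == "DOCTYPE":
                return "doctype"
            if not "DOCTYPE".startswith(kwd.upper()):
                return "bogus comment"
    # Input ended before a decision could be made.
    return "bogus comment"
```

Each state inspects only the current character, so the decision between
a comment, a doctype, and a bogus comment is reached by state
transitions alone.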
149    
150    2008-09-11  Wakaba  <wakaba@suika.fam.cx>
151    
152            * HTML.pm.src: Methods now accept additional parameter, $get_wrapper,
153            which can be used to insert some wrapper between the character
154            stream handle and the tokenizer.  (It is currently not supported
155            for |set_inner_html| for |Element|s).
156    
157    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
158    
159            * HTML.pm.src: Ignore punctuation in charset names.
160    
161    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
162    
163            * ContentChecker.pm: Support for charset-layer error levels.
164    
165            * HTML.pm.src: Don't specify |text| argument for the
166            |chardecode:fallback| error, since it is not the encoding
167            being used alternatively.
168    
169    2008-09-06  Wakaba  <wakaba@suika.fam.cx>
170    
171            * HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).
172    
173    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
174    
175            * CacheManifest.pm: Support for extensibility (HTML5 revision 2051).
176    
177    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
178    
179            * HTML.pm.src: Bug fix and sync with the spec with regard
180            to after after frameset insertion mode processing (HTML5
181            revision 1909).  Note that the implementation was wrong
182            per the old spec before the r1909 changes.
183    
184    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
185    
186            * HTMLTable.pm: scope=auto algorithm fix synced with the
187            spec (HTML5 revision 2093).
188            ($process_row): Algorithm step numbers synced with the
189            spec (HTML5 revision 2092).
190    
191    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
192    
193            * HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5
194            revision 2094).
195    
196    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
197    
198            * ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).
199    
200    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
201    
202            * HTML.pm.src: '"' and "'" at the end of attribute
203            name (after another attribute) now raise parse error (HTML5
204            revision 2123).  Empty unquoted attribute values are no
205            longer allowed (HTML5 revision 2122).
206    
207    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
208    
209            * mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5
210            revision 2130).
211    
212    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
213    
214            * ContentChecker.pm: |xml:lang| attribute value must be the same
215            as |lang| attribute value for HTML elements (HTML5 revision 2062
216            and so on).
217    
218    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
219    
220            * ContentChecker.pm: Error level definition for |xml_id_error|
221            was missing.
222    
223            * URIChecker.pm: The end of the URL should be marked as the
224            error location for an empty path error.  The position
225            between the userinfo and the port components should be
226            marked as the error location for an empty host error.
227    
228    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
229    
230            * URIChecker.pm: For each error, set parameters representing
231            where in the value the error occurs.  Report unknown
232            address format errors at warning level, since address
233            formats are rarely added.  Path segments starting with "/.."
234            were misinterpreted as dot-segments.
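
To make the dot-segment point concrete, here is a small Python sketch
(not the module's Perl).  Both functions are hypothetical; in
particular, the substring test is only an assumed example of the kind
of check that produces this false positive, not the code that was
actually in URIChecker.pm.

```python
def dot_segments(path):
    """Return the path segments that are exactly "." or ".."."""
    return [seg for seg in path.split("/") if seg in (".", "..")]


def naive_has_dot_segment(path):
    """Assumed buggy check: a substring test also flags "/..foo"."""
    return "/.." in path
```

A path such as "/a/..foo/b" contains the substring "/.." but has no
dot-segment, since the segment is "..foo"; comparing whole segments
avoids the false positive.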
235    
236    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
237    
238            * URIChecker.pm (check_iri_reference): Requires
239            |Message::DOM::DOMImplementation|.
240    
241    2008-08-29  Wakaba  <wakaba@suika.fam.cx>
242    
243            * IMTChecker.pm: Updated for the new error reporting architecture.
244    
245            * ContentChecker.pm: Error levels for IMTs are added.
246    
247    2008-08-17  Wakaba  <wakaba@suika.fam.cx>
248    
249            * H2H.pm (_shift_token): Support for unquoted HTML attribute
250            values.
251    
252    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
253    
254            * CacheManifest.pm: Support for new style of error
255            reports.
256    
257            * HTML.pm.src: Set line=1, column=1 to the document node.
258    
259    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
260    
261            * ContentChecker.pm, RDFXML.pm: Pass {level} object to language tag
262            and URL checkers.  Support for more error levels for bogus
263            language tag and URL "standards".
264    
265            * LangTag.pm, URIChecker.pm: Support for new style error
266            level reporting.
267    
268    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
269    
270            * ContentChecker.pm: Support for RDF/XML error levels.
271    
272            * HTMLTable.pm, RDFXML.pm: Support for new style of error level
273            specifying.  Error types are revised.
274    
275    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
276    
277            * ContentChecker.pm: All error reporting method calls are
278            renewed.
279    
280    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
281    
282            * HTML.pm.src: All error type names and "text" parameters
283            are revised.  Use new style for "level" specification.
284    
285            * mkhtmlparser.pl: Use new style for "level" specification.
286    
287    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
288    
289            * WebIDL.pm (parse_char_string): Simplified error
290            reporting process for broken ignored valuetype definition.
291            (Valuetype idl_text): Support for special "DOMString" name.
292    
293    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
294    
295            * WebIDL.pm ($get_scoped_name): Append "::::" if the last
296            terminal of the ScopedName is "DOMString", so that whether
297            the last part of the scoped name is "DOMString" or "_DOMString"
298            can be determined later.  This is necessary to determine whether
299            a |typedef| definition should be ignored or not.
300            (parse_char_string): Unescape the identifier of
301            exception members.
302            ($resolve): Return undef for builtin types and sequence<T>
303            types (we might not have to do this, however...).
304            (check): Support checking for Exceptions, Valuetypes,
305            and Typedefs.
306            ($serialize_type): Support for "DOMString::::" syntax.
307            (Typedef idl_text): Output Type as "DOMString" if it
308            is really "DOMString" (i.e. its internal representation
309            is "::DOMString::").
310    
311    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
312    
313            * WebIDL.pm ($resolve): New code, based on resolve code
314            for constant types in the |check| method.
315            (check): Support for checking of attributes, operations, and
316            arguments.
317            (Attribute/Operation idl_text): Exception names in getraises,
318            setraises, and raises clauses are serialized by the |$serialize_type|
319            code.
320    
321    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
322    
323            * WebIDL.pm ($integer): The order of alternatives is changed to match
324            hexadecimal numbers (the original pattern, taken from the spec,
325            did not work for hexadecimal numbers, because after the "0" prefix
326            the [0-7]* part matches (as an empty string) and therefore
327            the pattern does not match the remaining "x..." part of a "0x..."
328            integer literal).
329            ($get_type): It now returns a string, not an array reference,
330            for regular types and |sequence| types (i.e. it in any case
331            returns a string).
332            ($get_next_token): The second item in the array that represents
333            an integer or float token is now a Perl number value, not the
334            original string representation of the number.
335            (check): Support for const value consistency checking.
336            No extended attribute is defined for constants.
337            (Node subclasses): Use simple strings rather than array references
338            for default data type values.
339            ($serialize_type): Type values are now simple strings.
340            (value): If the new attribute value is a false value, then
341            a FALSE value is set to the attribute.
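
The |$integer| change above depends on how ordered alternation behaves
in backtracking regular expression engines.  The sketch below shows the
effect in Python (not the module's Perl), whose alternation is likewise
tried left to right.  Both patterns are assumptions written for
illustration; the changelog does not show the exact regular expressions
used in WebIDL.pm.

```python
import re

# Assumed spec-like order: the octal alternative comes first, so after "0"
# the empty match of [0-7]* succeeds and the hexadecimal branch is never tried.
SPEC_ORDER = re.compile(r"-?0(?:[0-7]*|[Xx][0-9A-Fa-f]+)|-?[1-9][0-9]*")

# Reordered: the hexadecimal alternative is tried before the octal one.
HEX_FIRST = re.compile(r"-?0(?:[Xx][0-9A-Fa-f]+|[0-7]*)|-?[1-9][0-9]*")


def leading_integer(pattern, text):
    """Return the integer literal matched at the start of text, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None
```

With the first pattern, a tokenizer that matches at the current
position takes only "0" from "0x1F" and leaves "x1F" behind; with the
reordered pattern the whole literal "0x1F" is matched, while octal and
decimal literals are matched as before.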
342    
343    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
344    
345            * WebIDL.pm ($get_scoped_name): Scoped names are now stored
346            in their stringified format ("scoped name" as defined in the
347            spec).  Note that a future version of this module should not use
348            array references for type values and the |type_text| attribute
349            should be made obsolete.
350            (parse_char_string): Unescape attribute names.
351            (check): Support for checking of whether inherited interfaces
352            are actually defined or not.  Support for checking of whether
353            interface member identifiers are duplicated or not.
354            ($serialize_type): Scoped names are returned as is.  A future
355            version of this code should escape identifiers other than "DOMString",
356            otherwise the idl_text would be non-conforming.
357    
358    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
359    
360            * WebIDL.pm (parse_char_string): Set line/column numbers
361            to generated nodes.  Unescape identifiers.  Extended attributes
362            for Definition's were ignored.
363            (append_child): Set |parent_node| attribute.
364            (parent_node): New attribute.
365            (check): Support interface/exception members.  Support
366            extended attributes.  Support definition identifier uniqueness
367            constraint.
368            (qualified_name): New attribute.
369            (Interface/Exception idl_text): Extended attributes were
370            not prepended to the returned text.
371    
372    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
373    
374            * WebIDL.pm (parse_char_string): Set line/column numbers
375            to interface object experimentally.  s/shift/pop/g, shift
376            would make things wrong.  Support for interface forward
377            declarations was missing.  Broken interface declarations
378            with no block were not ignored entirely.
379            (Whatpm::WebIDL::Node): New abstract class.  This class
380            makes things easier.
381            (child_nodes): New attribute.  Unlike DOM's attribute with the
382            same name, this attribute returns a dead list of nodes for
383            simplicity.
384            (get_user_data, set_user_data): New methods.
385            (Module idl_text): A SPACE character should be inserted
386            before the |{| character.
387            (Interface idl_text): Support for interface forward declarations.
388            (is_forward_declaration): New attribute.
389    
390    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
391    
392            * WebIDL.pm (type_text): Better serializer.
393    
394    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
395    
396            * WebIDL.pm: Revise forward-compatible parsing so that
397            it now can handle broken extended attributes and the like.
398    
399    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
400    
401            * WebIDL.pm: Real support for extended attributes.
402            Support for extended attributes with arguments.
403    
404    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
405    
406            * WebIDL.pm: Support for |exception| syntax.
407            (Interface->idl_text): Tentative support for inheritances.
408    
409    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
410    
411            * WebIDL.pm: Hierarchical scoped name support was broken.
412            Support for raises, setraises, and getraises syntaxes.
413    
414    2008-07-18  Wakaba  <wakaba@suika.fam.cx>
415    
416            * WebIDL.pm: Support for |idl_text| attribute, version 1 (no
417            proper support for types, extended attributes, and exceptions yet).
418            WebIDL parser, version 1 (no support for exceptions yet,
419            no proper support for extended attributes yet).
420    
421    2008-07-09  Wakaba  <wakaba@suika.fam.cx>
422    
423            * WebIDL.pm (parse_char_string): Support for basic attribute syntax.
424    
425    2008-06-29  Wakaba  <wakaba@suika.fam.cx>
426    
427            * WebIDL.pm: Support for valuetype and const.
428    
429    2008-06-29  Wakaba  <wakaba@suika.fam.cx>
430            
431            * WebIDL.pm: New module.
432    
433    2008-06-15  Wakaba  <wakaba@suika.fam.cx>
434    
435            * Makefile (Entities.html): URI changed.
436    
437    2008-06-08  Wakaba  <wakaba@suika.fam.cx>
438    
439            * HTML.pm.src: Support for ruby parsing (HTML5 revision 1704).
440    
441    2008-06-01  Wakaba  <wakaba@suika.fam.cx>
442    
443            * HTML.pm.src (_get_next_token): A parse error was missing.
444    
445    2008-06-01  Wakaba  <wakaba@suika.fam.cx>
446    
447            * mklinktypelist.pl: rel=contact is no longer part of the HTML5
448            spec (commented out). (HTML5 revision 1711).
449    
450    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
451    
452            * ContentType.pm: Drop support for UTF-32 (HTML5 revision 1701).
453    
454            * HTML.pm.src: UTF-16BE and UTF-16LE should be considered
455            as UTF-16 (HTML5 revision 1701).
456    
457    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
458    
459            * HTML.pm.src: Support for <noframes> in <head> (HTML5 revision
460            1692).
461    
462    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
463    
464            * HTML.pm.src: The secondary insertion mode used when switching
465            to foreign content is the "in body" insertion mode (HTML5 revision
466            1696).
467    
468    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
469    
470            * HTML.pm.src: Don't raise parse error for <isindex/> (HTML5
471            revision 1697).
472    
473    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
474    
475            * HTML.pm.src: Support for end-of-file token in foreign content
476            insertion mode (HTML5 revision 1693).  Update SVG camelCase
477            attribute list (HTML5 revision 1700).  <textarea> closes
478            </select> (HTML5 revision 1699).  More start tags close in
479            foreign content insertion mode (HTML5 revision 1698).
480    
481    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
482    
483            * HTML.pm.src: ";" is not part of charset name (HTML5 revision 1665).
484    
485    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
486    
487            * HTML.pm.src: More robust charset parameter detection (HTML5
488            revision 1674).
489    
490    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
491    
492            * ContentType.pm: Support for image/vnd.microsoft.icon (HTML5
493            revision 1676).
494    
495    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
496    
497            * HTML.pm.src: Ignore language part of public identifiers for
498            quirks mode detection (HTML5 revision 1679).
499    
500    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
501    
502            * HTML.pm.src: Reduce the number of errors in truncated doctypes (HTML5
503            revision 1685).
504    
505    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
506    
507            * HTML.pm.src: Support for EOF in new states for tags (HTML5
508            revision 1684).
509    
510    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
511    
512            * HTML.pm.src (_reset_insertion_mode): Make <td>.innerHTML
513            work (HTML5 revision 1690).
514    
515    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
516    
517            * HTML.pm.src (_tree_construction_main): Change handling of
518            end tags in head insertion modes (HTML5 revision 1686).
519            (parse_char_string): Bug fix for non-utf8 character string handling.
520            (parse_char_stream): |ungetc| does not work well in this context.
521    
522    2008-05-18  Wakaba  <wakaba@suika.fam.cx>
523    
524            * HTML.pm.src (parse_byte_string): Redefined to invoke
525            |parse_byte_stream|.
526            (parse_byte_stream): New method.
527    
528    2008-05-18  Wakaba  <wakaba@suika.fam.cx>
529    
530            * HTML.pm.src (parse_byte_string): Fix the column number reported
