Diff of /markup/html/whatpm/Whatpm/ChangeLog

revision 1.237 by wakaba, Sun May 18 03:46:26 2008 UTC revision 1.313 by wakaba, Mon Sep 15 09:02:27 2008 UTC
# Line 1  Line 1 
1    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
2    
3            * HTML.pm.src, mkhtmlparser.pl: Replace |{prev_char}|
4            by |{s_kwd}| in DATA_STATE.
5    
6    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
7    
8            * HTML.pm.src: Shorten keys.
9    
10    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
11    
12            * HTML.pm.src: Remove checking for control character, surrogate
13            pair, or noncharacter code points and non-Unicode code
14            points (they should be handled by Whatpm::Charset::UnicodeChecker).
15            (parse_char_stream): Support for the |$get_wrapper| argument and
16            character stream error handlers.
17    
18    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
19    
20            * ContentChecker.pm: Don't call |load_ns_module|
21            for null-namespace elements/attributes.
22    
23            * HTML.pm.src: Factor out $disallowed_control_chars
24            as a hash.
25    
26    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
27    
28            * HTML.pm.src: Regexp typo fixed.  |{prev_char}|
29            and |{next_char}| initializations are moved to initialization
30            method.  |{read_until}| now supports buffering.  Sync |set_inner_html|
31            with |parse_char_stream|.
32    
33    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
34    
35            * HTML.pm.src (parse_char_stream): Make |set_next_char|
36            invoke |manakai_read_until|, not only |read|, where
37            possible, to decrease the number of |read| method calls.
38    
39            * mkhtmlparser.pl: Related changes to the aforementioned
40            modification.
41    
42    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
43    
44            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
45            now reports character errors.
46    
47    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
48    
49            * HTML.pm.src: Character tokens with leading white space
50            followed by non-white-space characters were not correctly
51            handled in the "before head" insertion mode.
52            (set_inner_html): Reimplemented using CharString decodehandle
53            class.  Support for $get_wrapper argument.  Support
54            for |{read_until}| feature.
55    
56    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
57    
58            * HTML.pm.src: Make a "bare ero" error for unknown
59            entities point to the "&" character.
60    
61    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
62    
63            * HTML.pm.src: It turns out that U+FFFD doesn't have to
64            be added to the list of excluded characters.
65    
66    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
67    
68            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
69            and |column| higher priority than those set by the
70            tokenizer's input handler.
71            ($self->{read_until}): Exclude U+FFFD (but this might
72            not be necessary, since now we do line/column fixup in
73            the character decode handle).
74    
75    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
76    
77            * HTML.pm.src: Use |{read_until}| where possible.
78    
79    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
80    
81            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
82            and |manakai_getc_until| to |manakai_read_until| to
83            reduce the number of string copies.
84    
85    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
86    
87            * HTML.pm.src (parse_char_string): Use newly created
88            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
89            standard feature to |open| a string as a filehandle,
90            since Perl's string filehandle does not seem to support the
91            |ungetc| method correctly.
92            (parse_char_stream): Define |{getc_until}| method.
93            (DATA_STATE): Experimental support for |getc_until| feature.
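
A minimal sketch of the idea behind the entry above, written in Python for illustration rather than in the module's Perl. The class name `CharStringHandle` and its method signatures are hypothetical; the real interface of |Whatpm::Charset::DecodeHandle::CharString| and the exact semantics of |getc_until| are not given in this log. The sketch only shows a string-backed handle whose push-back is reliable, plus a helper that returns a whole run of characters in one call.

```python
class CharStringHandle:
    """Hypothetical string-backed character handle (not the real Whatpm class)."""

    def __init__(self, text):
        self._text = text
        self._pos = 0
        self._pushed = []  # characters pushed back by ungetc (last in, first out)

    def getc(self):
        """Return the next character, or None at end of input."""
        if self._pushed:
            return self._pushed.pop()
        if self._pos >= len(self._text):
            return None
        ch = self._text[self._pos]
        self._pos += 1
        return ch

    def ungetc(self, ch):
        """Push one character back so that the next getc returns it."""
        self._pushed.append(ch)

    def getc_until(self, stop_chars):
        """Return the run of characters before the next stop character.

        The stop character itself is left in the input (assumed behaviour).
        """
        out = []
        while True:
            ch = self.getc()
            if ch is None:
                break
            if ch in stop_chars:
                self.ungetc(ch)
                break
            out.append(ch)
        return "".join(out)
```

With such a handle, a tokenizer in its data state could consume ordinary text in one call and then look at the delimiter, for example `h = CharStringHandle("abc<d")` followed by `h.getc_until("<&")` and `h.getc()`.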
94    
95    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
96    
97            * HTML.pm.src: Check points added to newly added branches.
98    
99    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
100    
101            * HTML.pm.src: Remove |{char}|, which is no longer used.
102            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
103            by |{prev_state}|.
104    
105            * mkhtmlparser.pl: Remove |{char}| feature.
106            Remove |!!!back-next-input-character;| macro.
107    
108    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
109    
110            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
111            entity related tokenizer states in favor of new states
112            implementing the consume character reference algorithm.
113    
114    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
115    
116            * HTML.pm.src: "Consume a character reference" algorithm is
117            now implemented as a tokenizer's state, rather than
118            a method, with minimum changes (more changes will
119            be made, in due course).  "Bogus comment state"'s inner
120            loop gets removed.
121    
122    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
123    
124            * HTML.pm.src: Move |PUBLIC| and |SYSTEM| keyword tokenizing
125            into their own tokenizer states.
126    
127    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
128    
129            * HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
130            is split into three states.
131    
132    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
133    
134            * HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into
135            itself and new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
136            the tokenizer no longer has to push back next input
137            characters in those states.
138    
139    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
140    
141            * HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| broken
142            into four states so that the tokenizer no longer has to push
143            back next input characters in that state.
144    
145    2008-09-11  Wakaba  <wakaba@suika.fam.cx>
146    
147            * HTML.pm.src: Methods now accept additional parameter, $get_wrapper,
148            which can be used to insert some wrapper between the character
149            stream handle and the tokenizer.  (It is currently not supported
150            for |set_inner_html| for |Element|s).
151    
152    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
153    
154            * HTML.pm.src: Ignore punctuation in charset names.
155    
156    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
157    
158            * ContentChecker.pm: Support for charset-layer error levels.
159    
160            * HTML.pm.src: Don't specify |text| argument for the
161            |chardecode:fallback| error, since it is not the encoding
162            being used alternatively.
163    
164    2008-09-06  Wakaba  <wakaba@suika.fam.cx>
165    
166            * HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).
167    
168    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
169    
170            * CacheManifest.pm: Support for extensibility (HTML5 revision 2051).
171    
172    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
173    
174            * HTML.pm.src: Bug fix and sync with the spec with regard
175            to after after frameset insertion mode processing (HTML5
176            revision 1909).  Note that the implementation was wrong
177            per the old spec before the r1909 changes.
178    
179    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
180    
181            * HTMLTable.pm: scope=auto algorithm fix synced with the
182            spec (HTML5 revision 2093).
183            ($process_row): Algorithm step numbers synced with the
184            spec (HTML5 revision 2092).
185    
186    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
187    
188            * HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5
189            revision 2094).
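
The distinction in the entry above can be illustrated with a short Python sketch (Python is used here only for illustration; the module itself is Perl). The general category Zs covers space separators only, while the White_Space property also includes controls such as TAB and LF. The `WHITE_SPACE` set below is written out by hand from recent versions of Unicode's PropList.txt and is an assumption of this sketch, as are the helper names.

```python
import unicodedata

# Code points with the Unicode White_Space property, as listed in recent
# versions of PropList.txt (hand-written here; an assumption of this sketch).
WHITE_SPACE = (
    set(range(0x0009, 0x000E))
    | {0x0020, 0x0085, 0x00A0, 0x1680}
    | set(range(0x2000, 0x200B))
    | {0x2028, 0x2029, 0x202F, 0x205F, 0x3000}
)


def is_zs(ch):
    """True if the character's general category is Zs (space separator)."""
    return unicodedata.category(ch) == "Zs"


def is_white_space(ch):
    """True if the character has the White_Space property (per the set above)."""
    return ord(ch) in WHITE_SPACE
```

A check based on Zs accepts U+0020 and U+00A0 but rejects TAB and LF, which a check based on White_Space accepts.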
190    
191    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
192    
193            * ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).
194    
195    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
196    
197            * HTML.pm.src: '"' and "'" at the end of attribute
198            name (after another attribute) now raise parse error (HTML5
199            revision 2123).  Empty unquoted attribute values are no
200            longer allowed (HTML5 revision 2122).
201    
202    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
203    
204            * mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5
205            revision 2130).
206    
207    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
208    
209            * ContentChecker.pm: |xml:lang| attribute value must be the same
210            as |lang| attribute value for HTML elements (HTML5 revision 2062
211            and so on).
212    
213    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
214    
215            * ContentChecker.pm: Error level definition for |xml_id_error|
216            was missing.
217    
218            * URIChecker.pm: The end of the URL should be marked as the
219            error location for an empty path error.  The position
220            between the userinfo and the port components should be
221            marked as the error location for an empty host error.
222    
223    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
224    
225            * URIChecker.pm: For each error, set parameters indicating
226            where in the value the error occurs.  Report unknown
227            address format errors at the warning level, since address
228            formats are rarely added.  Path segments starting with "/.."
229            were misinterpreted as dot-segments.
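
The dot-segment fix in the entry above can be sketched as follows, in Python for illustration only. Both function names are hypothetical, and the prefix-based check is just one way such a bug could arise; the original URIChecker.pm code is not shown in this log. The point is that a segment counts as a dot-segment only when it is exactly "." or "..", not when it merely begins with "..".

```python
def naive_has_dot_segment(path):
    """Hypothetical buggy check: flags any segment that begins with '..'."""
    return "/.." in path


def dot_segments(path):
    """Return (index, segment) pairs for segments that are exactly '.' or '..'."""
    return [
        (i, seg)
        for i, seg in enumerate(path.split("/"))
        if seg in (".", "..")
    ]
```

For a path such as "/a/..b/c" the prefix-based check reports a dot-segment, while the exact comparison correctly reports none.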
230    
231    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
232    
233            * URIChecker.pm (check_iri_reference): Requires
234            |Message::DOM::DOMImplementation|.
235    
236    2008-08-29  Wakaba  <wakaba@suika.fam.cx>
237    
238            * IMTChecker.pm: Updated for the new error reporting architecture.
239    
240            * ContentChecker.pm: Error levels for IMTs are added.
241    
242    2008-08-17  Wakaba  <wakaba@suika.fam.cx>
243    
244            * H2H.pm (_shift_token): Support for unquoted HTML attribute
245            values.
246    
247    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
248    
249            * CacheManifest.pm: Support for new style of error
250            reports.
251    
252            * HTML.pm.src: Set line=1, column=1 to the document node.
253    
254    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
255    
256            * ContentChecker.pm, RDFXML.pm: Pass {level} object to language tag
257            and URL checkers.  Support for more error levels for bogus
258            language tag and URL "standards".
259    
260            * LangTag.pm, URIChecker.pm: Support for new style error
261            level reporting.
262    
263    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
264    
265            * ContentChecker.pm: Support for RDF/XML error levels.
266    
267            * HTMLTable.pm, RDFXML.pm: Support for new style of error level
268            specifying.  Error types are revised.
269    
270    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
271    
272            * ContentChecker.pm: All error reporting method calls are
273            renewed.
274    
275    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
276    
277            * HTML.pm.src: All error type names and "text" parameters
278            are revised.  Use new style for "level" specification.
279    
280            * mkhtmlparser.pl: Use new style for "level" specification.
281    
282    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
283    
284            * WebIDL.pm (parse_char_string): Simplified error
285            reporting process for broken ignored valuetype definition.
286            (Valuetype idl_text): Support for special "DOMString" name.
287    
288    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
289    
290            * WebIDL.pm ($get_scoped_name): Append "::::" if the last
291            terminal of the ScopedName is "DOMString", so that whether
292            the last part of the scoped name is "DOMString" or "_DOMString"
293            can be determined later.  This is necessary to determine whether a |typedef|
294            definition should be ignored or not.
295            (parse_char_string): Unescape the identifier of
296            exception members.
297            ($resolve): Return undef for builtin types and sequence<T>
298            types (we might not have to do this, however...).
299            (check): Support checking for Exceptions, Valuetypes,
300            and Typedefs.
301            ($serialize_type): Support for "DOMString::::" syntax.
302            (Typedef idl_text): Output Type as "DOMString" if it
303            is really "DOMString" (i.e. its internal representation
304            is "::DOMString::").
305    
306    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
307    
308            * WebIDL.pm ($resolve): New code, based on resolve code
309            for constant types in the |check| method.
310            (check): Support for checking of attributes, operations, and
311            arguments.
312            (Attribute/Operation idl_text): Exception names in getraises,
313            setraises, and raises clauses are serialized by |$serialize_type|
314            code.
315    
316    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
317    
318            * WebIDL.pm ($integer): Order of alternatives changed to match
319            hexadecimal numbers (the original pattern, taken from the spec,
320            did not work for hexadecimal numbers, because after the "0" prefix
321            the [0-7]* part matches (as an empty string) and therefore
322            the pattern does not match the remaining "x..." part of a "0x..."
323            integer literal).
324            ($get_type): It now returns a string, not an array reference,
325            for regular types and |sequence| types (i.e. it in any case
326            returns a string).
327            ($get_next_token): The second item in the array that represents
328            an integer or float token is now a Perl number value, not the
329            original string representation of the number.
330            (check): Support for const value consistency checking.
331            No extended attribute is defined for constants.
332            (Node subclasses): Use simple strings rather than array references
333            for default data type values.
334            ($serialize_type): Type values are now simple strings.
335            (value): If the new attribute value is a false value, then
336            a FALSE value is set to the attribute.
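
The |$integer| ordering problem described in the entry above can be reproduced with a small sketch. It is written in Python for illustration; Python's `re` tries alternatives left to right just as Perl does for a pattern like this. The two patterns are assumptions about the general shape of the spec pattern and of the fix, not copies of the module's actual regular expressions, and `leading_integer` is a hypothetical helper.

```python
import re

# Assumed shape of the original pattern: after "0", octal digits are tried
# first, and [0-7]* happily matches the empty string.
SPEC_ORDER = re.compile(r"-?0(?:[0-7]*|[Xx][0-9A-Fa-f]+)|-?[1-9][0-9]*")

# Reordered: the hexadecimal alternative is tried before the octal one.
HEX_FIRST = re.compile(r"-?0(?:[Xx][0-9A-Fa-f]+|[0-7]*)|-?[1-9][0-9]*")


def leading_integer(pattern, text):
    """Return the integer literal matched at the start of text, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None
```

With the assumed original order, `leading_integer(SPEC_ORDER, "0x1F")` yields only "0", leaving "x1F" behind; with the hexadecimal alternative first, the whole "0x1F" literal is matched, and octal and decimal literals are unaffected.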
337    
338    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
339    
340            * WebIDL.pm ($get_scoped_name): Now scoped names are stored
341            in their stringified format ("scoped name" as defined in the
342            spec).  Note that a future version of this module should not use
343            array references for type values and the |type_text| attribute
344            should be made obsolete.
345            (parse_char_string): Unescape attribute names.
346            (check): Support for checking of whether inherited interfaces
347            are actually defined or not.  Support for checking of whether
348            interface member identifiers are duplicated or not.
349            ($serialize_type): Scoped names are returned as is.  A future
350            version of this code should escape identifiers other than "DOMString",
351            otherwise the idl_text would be non-conforming.
352    
353    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
354    
355            * WebIDL.pm (parse_char_string): Set line/column numbers
356            to generated nodes.  Unescape identifiers.  Extended attributes
357            for Definition's were ignored.
358            (append_child): Set |parent_node| attribute.
359            (parent_node): New attribute.
360            (check): Support interface/exception members.  Support
361            extended attributes.  Support definition identifier uniqueness
362            constraint.
363            (qualified_name): New attribute.
364            (Interface/Exception idl_text): Extended attributes were
365            not prepended to the returned text.
366    
367    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
368    
369            * WebIDL.pm (parse_char_string): Set line/column numbers
370            to interface object experimentally.  s/shift/pop/g, shift
371            would make things wrong.  Support for interface forward
372            declarations was missing.  Broken interface declarations
373            with no block were not ignored entirely.
374            (Whatpm::WebIDL::Node): New abstract class.  This class
375            makes things easier.
376            (child_nodes): New attribute.  Unlike the DOM attribute of the
377            same name, this attribute returns a dead list of nodes for
378            simplicity.
379            (get_user_data, set_user_data): New methods.
380            (Module idl_text): A SPACE character should be inserted
381            before the |{| character.
382            (Interface idl_text): Support for interface forward declarations.
383            (is_forward_declaration): New attribute.
384    
385    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
386    
387            * WebIDL.pm (type_text): Better serializer.
388    
389    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
390    
391            * WebIDL.pm: Revise forward-compatible parsing so that
392            it can now handle broken extended attributes and the like.
393    
394    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
395    
396            * WebIDL.pm: Real support for extended attributes.
397            Support for extended attributes with arguments.
398    
399    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
400    
401            * WebIDL.pm: Support for |exception| syntax.
402            (Interface->idl_text): Tentative support for inheritances.
403    
404    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
405    
406            * WebIDL.pm: Hierarchical scoped name support was broken.
407            Support for raises, setraises, and getraises syntaxes.
408    
409    2008-07-18  Wakaba  <wakaba@suika.fam.cx>
410    
411            * WebIDL.pm: Support for |idl_text| attribute, version 1 (no
412            proper support for types, extended attributes, and exceptions yet).
413            WebIDL parser, version 1 (no support for exceptions yet,
414            no proper support for extended attributes yet).
415    
416    2008-07-09  Wakaba  <wakaba@suika.fam.cx>
417    
418            * WebIDL.pm (parse_char_string): Support for basic attribute syntax.
419    
420    2008-06-29  Wakaba  <wakaba@suika.fam.cx>
421    
422            * WebIDL.pm: Support for valuetype and const.
423    
424    2008-06-29  Wakaba  <wakaba@suika.fam.cx>
425            
426            * WebIDL.pm: New module.
427    
428    2008-06-15  Wakaba  <wakaba@suika.fam.cx>
429    
430            * Makefile (Entities.html): URI changed.
431    
432    2008-06-08  Wakaba  <wakaba@suika.fam.cx>
433    
434            * HTML.pm.src: Support for ruby parsing (HTML5 revision 1704).
435    
436    2008-06-01  Wakaba  <wakaba@suika.fam.cx>
437    
438            * HTML.pm.src (_get_next_token): A parse error was missing.
439    
440    2008-06-01  Wakaba  <wakaba@suika.fam.cx>
441    
442            * mklinktypelist.pl: rel=contact is no longer part of the HTML5
443            spec (commented out). (HTML5 revision 1711).
444    
445    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
446    
447            * ContentType.pm: Drop support for UTF-32 (HTML5 revision 1701).
448    
449            * HTML.pm.src: UTF-16BE and UTF-16LE should be considered
450            as UTF-16 (HTML5 revision 1701).
451    
452    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
453    
454            * HTML.pm.src: Support for <noframes> in <head> (HTML5 revision
455            1692).
456    
457    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
458    
459            * HTML.pm.src: The secondary insertion mode used when switching
460            to foreign content is the "in body" insertion mode (HTML5 revision
461            1696).
462    
463    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
464    
465            * HTML.pm.src: Don't raise parse error for <isindex/> (HTML5
466            revision 1697).
467    
468    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
469    
470            * HTML.pm.src: Support for end-of-file token in foreign content
471            insertion mode (HTML5 revision 1693).  Update SVG camelCase
472            attribute list (HTML5 revision 1700).  <textarea> closes
473            </select> (HTML5 revision 1699).  More start tags close in
474            foreign content insertion mode (HTML5 revision 1698).
475    
476    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
477    
478            * HTML.pm.src: ";" is not part of charset name (HTML5 revision 1665).
479    
480    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
481    
482            * HTML.pm.src: More robust charset parameter detection (HTML5
483            revision 1674).
484    
485    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
486    
487            * ContentType.pm: Support for image/vnd.microsoft.icon (HTML5
488            revision 1676).
489    
490    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
491    
492            * HTML.pm.src: Ignore language part of public identifiers for
493            quirks mode detection (HTML5 revision 1679).
494    
495    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
496    
497            * HTML.pm.src: Reduce the number of errors in truncated doctypes (HTML5
498            revision 1685).
499    
500    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
501    
502            * HTML.pm.src: Support for EOF in new states for tags (HTML5
503            revision 1684).
504    
505    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
506    
507            * HTML.pm.src (_reset_insertion_mode): Make <td>.innerHTML
508            work (HTML5 revision 1690).
509    
510    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
511    
512            * HTML.pm.src (_tree_construction_main): Change handling of
513            end tags in head insertion modes (HTML5 revision 1686).
514            (parse_char_string): Bug fix for non-utf8 character string handling.
515            (parse_char_stream): |ungetc| does not work well for this context.
516    
517    2008-05-18  Wakaba  <wakaba@suika.fam.cx>
518    
519            * HTML.pm.src (parse_byte_string): Redefined to invoke
520            |parse_byte_stream|.
521            (parse_byte_stream): New method.
522    
523    2008-05-18  Wakaba  <wakaba@suika.fam.cx>
524    
525            * HTML.pm.src (parse_byte_string): Fix the column number reported

