Diff of /markup/html/whatpm/Whatpm/ChangeLog

revision 1.231 by wakaba, Sat May 10 12:13:43 2008 UTC revision 1.312 by wakaba, Mon Sep 15 08:09:39 2008 UTC
# Line 1  Line 1 
1    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
2    
3            * HTML.pm.src: Shorten keys.
4    
5    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
6    
7            * HTML.pm.src: Remove checking for control character, surrogate
8            pair, or noncharacter code points and non-Unicode code
9            points (they should be handled by Whatpm::Charset::UnicodeChecker).
10            (parse_char_stream): Support for the |$get_wrapper| argument and
11            character stream error handlers.
12    
13    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
14    
15            * ContentChecker.pm: Don't call |load_ns_module|
16            for null-namespace elements/attributes.
17    
18            * HTML.pm.src: Factor out $disallowed_control_chars
19            as a hash.
20    
21    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
22    
23            * HTML.pm.src: Regexp typo fixed.  |{prev_char}|
24            and |{next_char}| initializations are moved to initialization
25            method.  |{read_until}| now supports buffering.  Sync |set_inner_html|
26            with |parse_char_stream|.
27    
28    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
29    
30            * HTML.pm.src (parse_char_stream): Make |set_next_char|
31            invoke |manakai_read_until|, not only |read|, where
32            possible, to decrease the number of |read| method calls.
33    
34            * mkhtmlparser.pl: Related changes to the aforementioned
35            modification.
36    
37    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
38    
39            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
40            now reports character errors.
41    
42    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
43    
44            * HTML.pm.src: Non-white-space character tokens with leading
45            white space in the "before head" insertion mode were not
46            correctly handled.
47            (set_inner_html): Reimplemented using CharString decodehandle
48            class.  Support for $get_wrapper argument.  Support
49            for |{read_until}| feature.
50    
51    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
52    
53            * HTML.pm.src: Make a "bare ero" error for unknown
54            entities point to the "&" character.
55    
56    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
57    
58            * HTML.pm.src: It turns out that U+FFFD doesn't have to
59            be added to the list of excluded characters.
60    
61    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
62    
63            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
64            and |column| a higher priority than those set by the
65            tokenizer's input handler.
66            ($self->{read_until}): Exclude U+FFFD (but this might
67            not be necessary, since now we do line/column fixup in
68            the character decode handle).
69    
70    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
71    
72            * HTML.pm.src: Use |{read_until}| where possible.
73    
74    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
75    
76            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
77            and |manakai_getc_until| to |manakai_read_until| to
78            reduce the number of string copies.
79    
80    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
81    
82            * HTML.pm.src (parse_char_string): Use newly created
83            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
84            standard feature to |open| a string as a filehandle,
85            since Perl's string filehandle does not seem to support the
86            |ungetc| method correctly.
87            (parse_char_stream): Define |{getc_until}| method.
88            (DATA_STATE): Experimental support for |getc_until| feature.
89    
90    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
91    
92            * HTML.pm.src: Check points added to newly added branches.
93    
94    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
95    
96            * HTML.pm.src: Remove |{char}|, which is no longer used.
97            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
98            with |{prev_state}|.
99    
100            * mkhtmlparser.pl: Remove |{char}| feature.
101            Remove |!!!back-next-input-character;| macro.
102    
103    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
104    
105            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
106            entity related tokenizer states in favor of new states
107            implementing the consume character reference algorithm.
108    
109    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
110    
111            * HTML.pm.src: "Consume a character reference" algorithm is
112            now implemented as a tokenizer's state, rather than
113            a method, with minimum changes (more changes will
114            be made, in due course).  "Bogus comment state"'s inner
115            loop gets removed.
116    
117    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
118    
119            * HTML.pm.src: Move |PUBLIC| and |SYSTEM| keyword tokenizing
120            into their own tokenizer states.
121    
122    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
123    
124            * HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
125            is split into three states.
126    
127    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
128    
129            * HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into
130            itself and new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
131            the tokenizer no longer has to push back next input
132            characters in those states.
133    
134    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
135    
136            * HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| broken
137            into four states so that the tokenizer no longer has to push
138            back next input characters in that state.
139    
140    2008-09-11  Wakaba  <wakaba@suika.fam.cx>
141    
142            * HTML.pm.src: Methods now accept an additional parameter, $get_wrapper,
143            which can be used to insert some wrapper between the character
144            stream handle and the tokenizer.  (It is currently not supported
145            for |set_inner_html| for |Element|s).
146    
147    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
148    
149            * HTML.pm.src: Ignore punctuation in charset names.
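
As a rough illustration of comparing charset names while ignoring punctuation, a Python sketch follows.  The entry does not say which characters are ignored; the sketch assumes ASCII punctuation and white space, plus case-insensitive comparison, and the function name |normalize_charset_name| is hypothetical.

```python
import string

# Assumption: "punctuation" means ASCII punctuation, and ASCII white
# space is dropped as well.  The entry itself does not give the list.
_IGNORED = set(string.punctuation) | set(string.whitespace)


def normalize_charset_name(name):
    """Return a comparison key with ignored characters removed."""
    return ''.join(ch for ch in name if ch not in _IGNORED).lower()


def same_charset_name(a, b):
    """True if two charset labels differ only in ignored characters or case."""
    return normalize_charset_name(a) == normalize_charset_name(b)
```

With this key, labels such as "UTF-8", "utf_8", and " Utf8 " compare as equal, while "utf-16" stays distinct.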
150    
151    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
152    
153            * ContentChecker.pm: Support for charset-layer error levels.
154    
155            * HTML.pm.src: Don't specify |text| argument for the
156            |chardecode:fallback| error, since it is not the encoding
157            being used alternatively.
158    
159    2008-09-06  Wakaba  <wakaba@suika.fam.cx>
160    
161            * HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).
162    
163    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
164    
165            * CacheManifest.pm: Support for extensibility (HTML5 revision 2051).
166    
167    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
168    
169            * HTML.pm.src: Bug fix and sync with the spec with regard
170            to after after frameset insertion mode processing (HTML5
171            revision 1909).  Note that the implementation was wrong
172            per the old spec before the r1909 changes.
173    
174    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
175    
176            * HTMLTable.pm: scope=auto algorithm fix synced with the
177            spec (HTML5 revision 2093).
178            ($process_row): Algorithm step numbers synced with the
179            spec (HTML5 revision 2092).
180    
181    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
182    
183            * HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5
184            revision 2094).
185    
186    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
187    
188            * ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).
189    
190    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
191    
192            * HTML.pm.src: '"' and "'" at the end of attribute
193            name (after another attribute) now raise parse error (HTML5
194            revision 2123).  Empty unquoted attribute values are no
195            longer allowed (HTML5 revision 2122).
196    
197    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
198    
199            * mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5
200            revision 2130).
201    
202    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
203    
204            * ContentChecker.pm: |xml:lang| attribute value must be the same
205            as |lang| attribute value for HTML elements (HTML5 revision 2062
206            and so on).
207    
208    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
209    
210            * ContentChecker.pm: Error level definition for |xml_id_error|
211            was missing.
212    
213            * URIChecker.pm: The end of the URL should be marked as the
214            error location for an empty path error.  The position
215            between the userinfo and the port components should be
216            marked as the error location for an empty host error.
217    
218    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
219    
220            * URIChecker.pm: Set parameters representing where in the
221            value the error occurs for errors.  Report unknown
222            address format error in warning level, since address
223            formats are rarely added.  Path segments starting with "/.."
224            were misinterpreted as dot-segments.
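
The dot-segment fix above comes down to the difference between a prefix test and a whole-segment test.  The following Python sketch shows the distinction; it is not the URIChecker.pm code, and the function name |dot_segments| is hypothetical.

```python
def dot_segments(path):
    """Return the path segments that are exactly "." or "..".

    A segment such as "..b" merely starts with dots and is an ordinary
    segment, so it must not be reported.
    """
    return [seg for seg in path.split('/') if seg in ('.', '..')]


ordinary = dot_segments('/a/..b/c')      # no dot-segments here
real = dot_segments('/a/../b/./c')       # one ".." and one "."
```

Testing only whether a segment begins with ".." would flag "/..b" as well, which is the kind of misinterpretation the entry describes.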
225    
226    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
227    
228            * URIChecker.pm (check_iri_reference): Requires
229            |Message::DOM::DOMImplementation|.
230    
231    2008-08-29  Wakaba  <wakaba@suika.fam.cx>
232    
233            * IMTChecker.pm: Updated for the new error reporting architecture.
234    
235            * ContentChecker.pm: Error levels for IMTs are added.
236    
237    2008-08-17  Wakaba  <wakaba@suika.fam.cx>
238    
239            * H2H.pm (_shift_token): Support for unquoted HTML attribute
240            values.
241    
242    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
243    
244            * CacheManifest.pm: Support for new style of error
245            reports.
246    
247            * HTML.pm.src: Set line=1, column=1 to the document node.
248    
249    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
250    
251            * ContentChecker.pm, RDFXML.pm: Pass {level} object to language tag
252            and URL checkers.  Support for more error levels for bogus
253            language tag and URL "standards".
254    
255            * LangTag.pm, URIChecker.pm: Support for new style error
256            level reporting.
257    
258    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
259    
260            * ContentChecker.pm: Support for RDF/XML error levels.
261    
262            * HTMLTable.pm, RDFXML.pm: Support for new style of error level
263            specifying.  Error types are revised.
264    
265    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
266    
267            * ContentChecker.pm: All error reporting method calls are
268            renewed.
269    
270    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
271    
272            * HTML.pm.src: All error type names and "text" parameters
273            are revised.  Use new style for "level" specification.
274    
275            * mkhtmlparser.pl: Use new style for "level" specification.
276    
277    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
278    
279            * WebIDL.pm (parse_char_string): Simplified error
280            reporting process for broken ignored valuetype definition.
281            (Valuetype idl_text): Support for special "DOMString" name.
282    
283    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
284    
285            * WebIDL.pm ($get_scoped_name): Append "::::" if the last
286            terminal of the ScopedName is "DOMString", so that whether
287            the last part of the scoped name is "DOMString" or "_DOMString"
288            can be determined later.  This is necessary to determine whether
289            a |typedef| definition should be ignored or not.
290            (parse_char_string): Unescape the identifier of
291            exception members.
292            ($resolve): Return undef for builtin types and sequence<T>
293            types (we might not have to do this, however...).
294            (check): Support checking for Exceptions, Valuetypes,
295            and Typedefs.
296            ($serialize_type): Support for "DOMString::::" syntax.
297            (Typedef idl_text): Output Type as "DOMString" if it
298            is really "DOMString" (i.e. its internal representation
299            is "::DOMString::").
300    
301    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
302    
303            * WebIDL.pm ($resolve): New code, based on resolve code
304            for constant types in the |check| method.
305            (check): Support for checking of attributes, operations, and
306            arguments.
307            (Attribute/Operation idl_text): Exception names in getraises,
308            setraises, and raises clauses are serialized by |$serialize_type|
309            code.
310    
311    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
312    
313            * WebIDL.pm ($integer): Order of alternatives is changed to match
314            hexadecimal numbers (the original pattern, taken from the spec,
315            did not work for hexadecimal numbers, because after the "0" prefix
316            the [0-7]* part matches (as an empty string) and therefore
317            the remaining "x..." part of a "0x..." integer literal
318            is not matched).
319            ($get_type): It now returns a string, not an array reference,
320            for regular types and |sequence| types (i.e. it in any case
321            returns a string).
322            ($get_next_token): The second item in the array that represents
323            an integer or float token is now a Perl number value, not the
324            original string representation of the number.
325            (check): Support for const value consistency checking.
326            No extended attribute is defined for constants.
327            (Node subclasses): Use simple strings rather than array references
328            for default data type values.
329            ($serialize_type): Type values are now simple strings.
330            (value): If the new attribute value is a false value, then
331            a FALSE value is set to the attribute.
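
The |$integer| ordering problem in the entry above can be reproduced with any regex engine that tries alternatives left to right.  The following Python sketch uses patterns reconstructed from the description in the entry; they are assumptions, not copies of the pattern in WebIDL.pm or in the spec, and the names |SPEC_ORDER|, |FIXED_ORDER|, and |first_token| are hypothetical.

```python
import re

# Assumed shape of the original pattern: after "0", the octal branch
# comes first and can match the empty string.
SPEC_ORDER = re.compile(r'-?(?:0(?:[0-7]*|[Xx][0-9A-Fa-f]+)|[1-9][0-9]*)')

# Reordered pattern: the hexadecimal branch is tried before the octal one.
FIXED_ORDER = re.compile(r'-?(?:0(?:[Xx][0-9A-Fa-f]+|[0-7]*)|[1-9][0-9]*)')


def first_token(pattern, text):
    """Return the integer literal matched at the start of text, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None


broken = first_token(SPEC_ORDER, "0x1F")   # stops after "0"
fixed = first_token(FIXED_ORDER, "0x1F")   # consumes the whole literal
```

Because the octal branch succeeds on an empty string, the engine never backtracks into the hexadecimal branch when the pattern is used as a token prefix matcher; putting the more specific branch first avoids that.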
332    
333    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
334    
335            * WebIDL.pm ($get_scoped_name): Now scoped names are stored
336            in their stringified format ("scoped name" as defined in the
337            spec).  Note that a future version of this module should not use
338            array references for type values and the |type_text| attribute
339            should be made obsolete.
340            (parse_char_string): Unescape attribute names.
341            (check): Support for checking of whether inherited interfaces
342            are actually defined or not.  Support for checking of whether
343            interface member identifiers are duplicated or not.
344            ($serialize_type): Scoped names are returned as is.  A future
345            version of this code should escape identifiers other than "DOMString",
346            otherwise the idl_text would be non-conforming.
347    
348    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
349    
350            * WebIDL.pm (parse_char_string): Set line/column numbers
351            to generated nodes.  Unescape identifiers.  Extended attributes
352            for Definition's were ignored.
353            (append_child): Set |parent_node| attribute.
354            (parent_node): New attribute.
355            (check): Support interface/exception members.  Support
356            extended attributes.  Support definition identifier uniqueness
357            constraint.
358            (qualified_name): New attribute.
359            (Interface/Exception idl_text): Extended attributes were
360            not prepended to the returned text.
361    
362    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
363    
364            * WebIDL.pm (parse_char_string): Set line/column numbers
365            to interface object experimentally.  s/shift/pop/g, shift
366            would make things wrong.  Support for interface forward
367            declarations was missing.  Broken interface declarations
368            with no block were not ignored entirely.
369            (Whatpm::WebIDL::Node): New abstract class.  This class
370            makes things easier.
371            (child_nodes): New attribute.  Unlike DOM's attribute with the
372            same name, this attribute returns a dead list of nodes for
373            simplicity.
374            (get_user_data, set_user_data): New methods.
375            (Module idl_text): A SPACE character should be inserted
376            before the |{| character.
377            (Interface idl_text): Support for interface forward declarations.
378            (is_forward_declaration): New attribute.
379    
380    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
381    
382            * WebIDL.pm (type_text): Better serializer.
383    
384    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
385    
386            * WebIDL.pm: Revise forward-compatible parsing so that
387            it can now handle broken extended attributes and the like.
388    
389    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
390    
391            * WebIDL.pm: Real support for extended attributes.
392            Support for extended attributes with arguments.
393    
394    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
395    
396            * WebIDL.pm: Support for |exception| syntax.
397            (Interface->idl_text): Tentative support for inheritances.
398    
399    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
400    
401            * WebIDL.pm: Hierarchical scoped name support was broken.
402            Support for raises, setraises, and getraises syntaxes.
403    
404    2008-07-18  Wakaba  <wakaba@suika.fam.cx>
405    
406            * WebIDL.pm: Support for |idl_text| attribute, version 1 (no
407            proper support for types, extended attributes, and exceptions yet).
408            WebIDL parser, version 1 (no support for exceptions yet,
409            no proper support for extended attributes yet).
410    
411    2008-07-09  Wakaba  <wakaba@suika.fam.cx>
412    
413            * WebIDL.pm (parse_char_string): Support for basic attribute syntax.
414    
415    2008-06-29  Wakaba  <wakaba@suika.fam.cx>
416    
417            * WebIDL.pm: Support for valuetype and const.
418    
419    2008-06-29  Wakaba  <wakaba@suika.fam.cx>
420            
421            * WebIDL.pm: New module.
422    
423    2008-06-15  Wakaba  <wakaba@suika.fam.cx>
424    
425            * Makefile (Entities.html): URI changed.
426    
427    2008-06-08  Wakaba  <wakaba@suika.fam.cx>
428    
429            * HTML.pm.src: Support for ruby parsing (HTML5 revision 1704).
430    
431    2008-06-01  Wakaba  <wakaba@suika.fam.cx>
432    
433            * HTML.pm.src (_get_next_token): A parse error was missing.
434    
435    2008-06-01  Wakaba  <wakaba@suika.fam.cx>
436    
437            * mklinktypelist.pl: rel=contact is no longer part of the HTML5
438            spec (commented out). (HTML5 revision 1711).
439    
440    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
441    
442            * ContentType.pm: Drop support for UTF-32 (HTML5 revision 1701).
443    
444            * HTML.pm.src: UTF-16BE and UTF-16LE should be considered
445            as UTF-16 (HTML5 revision 1701).
446    
447    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
448    
449            * HTML.pm.src: Support for <noframes> in <head> (HTML5 revision
450            1692).
451    
452    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
453    
454            * HTML.pm.src: The secondary insertion mode used when switching
455            to foreign content is the "in body" insertion mode (HTML5 revision
456            1696).
457    
458    2008-05-25  Wakaba  <wakaba@suika.fam.cx>
459    
460            * HTML.pm.src: Don't raise parse error for <isindex/> (HTML5
461            revision 1697).
462    
463    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
464    
465            * HTML.pm.src: Support for end-of-file token in foreign content
466            insertion mode (HTML5 revision 1693).  Update SVG camelCase
467            attribute list (HTML5 revision 1700).  <textarea> closes
468            </select> (HTML5 revision 1699).  More start tags close in
469            foreign content insertion mode (HTML5 revision 1698).
470    
471    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
472    
473            * HTML.pm.src: ";" is not part of charset name (HTML5 revision 1665).
474    
475    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
476    
477            * HTML.pm.src: More robust charset parameter detection (HTML5
478            revision 1674).
479    
480    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
481    
482            * ContentType.pm: Support for image/vnd.microsoft.icon (HTML5
483            revision 1676).
484    
485    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
486    
487            * HTML.pm.src: Ignore language part of public identifiers for
488            quirks mode detection (HTML5 revision 1679).
489    
490    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
491    
492            * HTML.pm.src: Reduce the number of errors in truncated doctypes (HTML5
493            revision 1685).
494    
495    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
496    
497            * HTML.pm.src: Support for EOF in new states for tags (HTML5
498            revision 1684).
499    
500    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
501    
502            * HTML.pm.src (_reset_insertion_mode): Make <td>.innerHTML
503            work (HTML5 revision 1690).
504    
505    2008-05-24  Wakaba  <wakaba@suika.fam.cx>
506    
507            * HTML.pm.src (_tree_construction_main): Change handling of
508            end tags in head insertion modes (HTML5 revision 1686).
509            (parse_char_string): Bug fix for non-utf8 character string handling.
510            (parse_char_stream): |ungetc| does not work well in this context.
511    
512    2008-05-18  Wakaba  <wakaba@suika.fam.cx>
513    
514            * HTML.pm.src (parse_byte_string): Redefined to invoke
515            |parse_byte_stream|.
516            (parse_byte_stream): New method.
517    
518    2008-05-18  Wakaba  <wakaba@suika.fam.cx>
519    
520            * HTML.pm.src (parse_byte_string): Fix the column number reported
521            by encoding layer error reporter.
522    
523    2008-05-17  Wakaba  <wakaba@suika.fam.cx>
524    
525            * HTML.pm.src (parse_byte_string): Use streaming decoder
526            rather than converting the whole byte string and then parsing.
527            Propagate errors in character encoding layer.
528            (get_next_token): Precise error reporting for |bare stago| error.
529    
530    2008-05-17  Wakaba  <wakaba@suika.fam.cx>
531    
532            * HTML.pm.src (parse_char_stream): New method.
533            (parse_char_string): This method is now defined as an invocation
534            of the |parse_char_stream| method.
535    
536    2008-05-17  Wakaba  <wakaba@suika.fam.cx>
537    
538            * HTML.pm.src (parse_byte_string): Report various status
539            of the sniffing as info-level errors.  Support for new
540            decoding framework in parser resetting.
541            (new): Various default error levels were not set.
542    
543    2008-05-17  Wakaba  <wakaba@suika.fam.cx>
544    
545            * HTML.pm.src (parse_byte_string): HTML5 encoding sniffing
546            algorithm, except for the actual sniffing, is implemented
547            with new framework with Message::Charset::Info.
548    
549    2008-05-16  Wakaba  <wakaba@suika.fam.cx>
550    
551            * CacheManifest.pm (_parse): Drop fragment identifiers from
552            URIs in fallback section (HTML5 revision 1596).
553    
554    2008-05-10  Wakaba  <wakaba@suika.fam.cx>
555    
556            * Makefile (Entities.html): URI has changed.

