Diff of /markup/html/whatpm/Whatpm/ChangeLog

revision 1.262 by wakaba, Sat Jul 19 13:11:30 2008 UTC
revision 1.342 by wakaba, Sat Oct 4 09:17:54 2008 UTC
1    2008-10-04  Wakaba  <wakaba@suika.fam.cx>
2    
3            * HTML.pm.src: Drop redundant code (HTML5 revision 1731).
4    
5    2008-10-04  Wakaba  <wakaba@suika.fam.cx>
6    
7            * HTML.pm.src: Support for new definition of |param| and |source|
8            start tag parsing (HTML5 revision 1731).
9    
10    2008-10-04  Wakaba  <wakaba@suika.fam.cx>
11    
12            * HTML.pm.src: <p> steps reimplemented (HTML5 revision 1731).
13    
14    2008-10-04  Wakaba  <wakaba@suika.fam.cx>
15    
16            * HTML.pm.src: <li>, <dt>, and <dd> steps reimplemented (HTML5
17            revisions 1731 and 1831).
18    
19    2008-10-04  Wakaba  <wakaba@suika.fam.cx>
20    
21            * HTML.pm.src: Support for new flow (but not phrasing) elements (HTML5
22            revisions 1731 and 1778).  Support for the </sarcasm> end tag (HTML5
23            revision 1731).
24    
25    2008-10-04  Wakaba  <wakaba@suika.fam.cx>
26    
27            * HTML.pm.src: Support for |command| and |eventsource| elements (HTML5
28            revision 1731).  End tags of |option| and |optgroup| elements are
29            now optional (HTML5 revision 1731).
30    
31    2008-10-04  Wakaba  <wakaba@suika.fam.cx>
32    
33            * HTML.pm.src: New "special" elements added to the list (HTML5
34            revision 1778).  "strile" -> "strike".
35    
36    2008-10-02  Wakaba  <wakaba@suika.fam.cx>
37    
38            * ContentType.pm (get_sniffed_type): Support for the "better"
39            content sniffing (HTML5 revision 1927).  Fix a case where the
40            official type was not returned when the method was invoked in
41            list context.
42    
43    2008-09-22  Wakaba  <wakaba@suika.fam.cx>
44    
45            * HTML.pm.src: Character references to non-space C0 characters
46            (including U+000B VT), the DEL character, and noncharacter code
47            points are now converted to U+FFFD (cf. HTML5 revision 2138).
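
The replacement rule in the entry above can be sketched as follows. This is a minimal Python illustration based only on the wording of the entry, not the Perl code in HTML.pm.src; the function names and the exact set of allowed C0 characters are assumptions, and the spec revision itself may cover more cases (for example surrogates or out-of-range numbers).

```python
REPLACEMENT = "\uFFFD"

# Assumed set of C0 characters still treated as space characters:
# TAB, LF, FF, CR.  U+000B VT is deliberately absent.
ALLOWED_C0 = {0x09, 0x0A, 0x0C, 0x0D}


def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and for code points ending in FFFE or FFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def char_for_reference(cp):
    """Return the character a numeric reference resolves to (hypothetical helper).

    Assumes 0 <= cp <= 0x10FFFF; other inputs are outside this sketch.
    """
    if cp < 0x20 and cp not in ALLOWED_C0:
        return REPLACEMENT  # non-space C0 control, including U+000B
    if cp == 0x7F:
        return REPLACEMENT  # DEL
    if is_noncharacter(cp):
        return REPLACEMENT
    return chr(cp)
```

With these assumptions, a reference such as `&#x0B;` or `&#xFFFE;` resolves to U+FFFD, while `&#x0A;` and `&#x41;` resolve to themselves.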
48    
49    2008-09-21  Wakaba  <wakaba@suika.fam.cx>
50    
51            * ContentChecker.pm: |form=""| check support added.
52    
53    2008-09-21  Wakaba  <wakaba@suika.fam.cx>
54    
55            * ContentChecker.pm: |contextmenu| validity is now checked using
56            |id| and |id_type| properties, and the |menu| property is removed.
57    
58    2008-09-21  Wakaba  <wakaba@suika.fam.cx>
59    
60            * ContentChecker.pm: Prepare for |form| |name| attribute's
61            duplication checking.
62    
63    2008-09-21  Wakaba  <wakaba@suika.fam.cx>
64    
65            * HTML.pm.src (parse_byte_stream): Support (or non-support) for
66            unsupported charset="" parameter value (HTML5 revision 2131).
67    
68    2008-09-20  Wakaba  <wakaba@suika.fam.cx>
69    
70            * HTML.pm.src: Remaining places where U+000B was allowed as a space
71            character are fixed (cf. HTML5 revision 1738).
72    
73            * ContentChecker.pm, HTMLTable.pm: U+000B is no longer part of
74            space characters (HTML5 revision 1738).
75    
76    2008-09-20  Wakaba  <wakaba@suika.fam.cx>
77    
78            * HTML.pm.src: The "anything else" case for the "after after body"
79            insertion mode was not updated to switch to the "in body"
80            insertion mode.  U+000B is no longer a space character for the
81            purpose of tree construction phase (HTML5 revision 1738).
82    
83    2008-09-20  Wakaba  <wakaba@suika.fam.cx>
84    
85            * HTML.pm.src: U+000B is no longer a space character (HTML5
86            revision 1738).
87    
88    2008-09-20  Wakaba  <wakaba@suika.fam.cx>
89    
90            * ContentType.pm: 0x0B is no longer a space character (HTML5
91            revision 1738).
92    
93            * HTML.pm.src: U+000B is no longer a space character for the
94            algorithm for extracting an encoding from a Content-Type (HTML5
95            revision 1738).
96    
97    2008-09-20  Wakaba  <wakaba@suika.fam.cx>
98    
99            * ContentChecker.pm ($IsInHTMLInteractiveContent): New.
100    
101    2008-09-18  Wakaba  <wakaba@suika.fam.cx>
102    
103            * LangTag.pm: Add checks for remaining requirements from RFC 4646.
104    
105            * mklangreg.pl: Sort 'Prefix' values by their length, to ease
106            matching.
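
Why sorting by length eases matching can be sketched as below. This is a hypothetical Python illustration, not the logic of mklangreg.pl or LangTag.pm; it assumes that "longest first" is the intended order and that a prefix matches when it equals the tag or is followed by a hyphen. The function names are invented for the example.

```python
def sort_prefixes(prefixes):
    """Order prefixes longest first (ties broken alphabetically)."""
    return sorted(prefixes, key=lambda p: (-len(p), p))


def longest_matching_prefix(tag, prefixes):
    """Return the most specific prefix that the tag starts with, or None."""
    tag = tag.lower()
    for prefix in sort_prefixes(prefixes):
        p = prefix.lower()
        # A prefix matches the whole tag or a leading run of subtags.
        if tag == p or tag.startswith(p + "-"):
            return prefix
    return None
```

Because the list is pre-sorted, the first hit is already the most specific one, so the caller can stop at the first match instead of comparing all candidates.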
107    
108    2008-09-18  Wakaba  <wakaba@suika.fam.cx>
109    
110            * LangTag.pm: Warn for private use language subtags.  Error level
111            typos fixed.  Support for Suppress-Script field.
112    
113            * mklangreg.pl: Support for dumping of nested structure.
114    
115    2008-09-18  Wakaba  <wakaba@suika.fam.cx>
116    
117            * LangTag.pm (check_rfc4646_langtag): Check if a tag is in the
118            recommended case as per RFC 4646.
119    
120    2008-09-18  Wakaba  <wakaba@suika.fam.cx>
121    
122            * LangTag.pm (check_rfc4646_langtag): New method.
123    
124    2008-09-18  Wakaba  <wakaba@suika.fam.cx>
125    
126            * mklangreg.pl: New script.
127    
128            * Makefile: Updated for creation of the module for language subtag
129            registry.
130            
131    2008-09-16  Wakaba  <wakaba@suika.fam.cx>
132    
133            * Makefile: WebIDL.html added.
134    
135            * WebIDL.pod: New documentation.
136    
137    2008-09-16  Wakaba  <wakaba@suika.fam.cx>
138    
139            * WebIDL.pm: Checker's error types are redefined.
140    
141    2008-09-16  Wakaba  <wakaba@suika.fam.cx>
142    
143            * WebIDL.pm: Parser's error types are redefined.  Some forward
144            compatible parsing bugs are fixed.  Some unreachable code is
145            commented out.
146    
147    2008-09-16  Wakaba  <wakaba@suika.fam.cx>
148    
149            * WebIDL.pm: Support for the remaining extended attributes is
150            added.  It does not satisfy the definition that a forward
151            interface declaration has an extended attribute.  It seems that,
152            unless explicitly allowed, multiple extended attributes with the
153            same name are not allowed, though it is not explicitly mentioned in
154            the spec.
155    
156    2008-09-16  Wakaba  <wakaba@suika.fam.cx>
157    
158            * WebIDL.pm: Unescape extended attribute names and extended
159            attribute identifiers.  Preserve whether an extended attribute has
160            an argument list or not.  Support for extended attributes:
161            Constructor, ExceptionConsts, IndexGetter, IndexSetter,
162            NameGetter, NameSetter, and Null.
163            (has_argument_list): New attribute.
164            (idl_text): Stringify the argument list, if any, even if it is
165            empty.
166    
167    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
168    
169            * HTML.pm.src: New state |PCDATA_STATE|.  Use an empty string for
170            |{s_kwd}| in DATA_STATE as default.
171    
172    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
173    
174            * HTML.pm.src, mkhtmlparser.pl: Replace |{prev_char}|
175            by |{s_kwd}| in DATA_STATE.
176    
177    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
178    
179            * HTML.pm.src: Shorten keys.
180    
181    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
182    
183            * HTML.pm.src: Remove checking for control character, surrogate
184            pair, or noncharacter code points and non-Unicode code
185            points (they should be handled by Whatpm::Charset::UnicodeChecker).
186            (parse_char_stream): Support for the |$get_wrapper| argument and
187            character stream error handlers.
188    
189    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
190    
191            * ContentChecker.pm: Don't call |load_ns_module|
192            for null-namespace elements/attributes.
193    
194            * HTML.pm.src: Factor out $disallowed_control_chars
195            as a hash.
196    
197    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
198    
199            * HTML.pm.src: Regexp typo fixed.  |{prev_char}|
200            and |{next_char}| initializations are moved to initialization
201            method.  |{read_until}| now supports buffering.  Sync |set_inner_html|
202            with |parse_char_stream|.
203    
204    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
205    
206            * HTML.pm.src (parse_char_stream): Make |set_next_char|
207            invoke |manakai_read_until|, not only |read|, where
208            possible, to decrease the number of |read| method calls.
209    
210            * mkhtmlparser.pl: Related changes to the aforementioned
211            modification.
212    
213    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
214    
215            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
216            now reports character errors.
217    
218    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
219    
220            * HTML.pm.src: Non-white-space character tokens with leading
221            white space in the "before head" insertion mode were not
222            correctly handled.
223            (set_inner_html): Reimplemented using CharString decodehandle
224            class.  Support for $get_wrapper argument.  Support
225            for |{read_until}| feature.
226    
227    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
228    
229            * HTML.pm.src: Make a "bare ero" error for unknown
230            entities point to the "&" character.
231    
232    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
233    
234            * HTML.pm.src: It turns out that U+FFFD doesn't have to
235            be added to the list of excluded characters.
236    
237    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
238    
239            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
240            and |column| a higher priority than those set by the
241            tokenizer's input handler.
242            ($self->{read_until}): Exclude U+FFFD (but this might
243            not be necessary, since now we do line/column fixup in
244            the character decode handle).
245    
246    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
247    
248            * HTML.pm.src: Use |{read_until}| where possible.
249    
250    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
251    
252            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
253            and |manakai_getc_until| to |manakai_read_until| to
254            reduce the number of string copies.
255    
256    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
257    
258            * HTML.pm.src (parse_char_string): Use newly created
259            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
260            standard feature to |open| a string as a filehandle,
261            since Perl's string filehandle does not seem to support the |ungetc|
262            method correctly.
263            (parse_char_stream): Define |{getc_until}| method.
264            (DATA_STATE): Experimental support for |getc_until| feature.
265    
266    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
267    
268            * HTML.pm.src: Check points added to newly added branches.
269    
270    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
271    
272            * HTML.pm.src: Remove |{char}|, which is no longer used.
273            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
274            with |{prev_state}|.
275    
276            * mkhtmlparser.pl: Remove |{char}| feature.
277            Remove |!!!back-next-input-character;| macro.
278    
279    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
280    
281            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
282            entity related tokenizer states in favor of new states
283            implementing the consume character reference algorithm.
284    
285    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
286    
287            * HTML.pm.src: "Consume a character reference" algorithm is
288            now implemented as a tokenizer's state, rather than
289            a method, with minimum changes (more changes will
290            be made, in due course).  "Bogus comment state"'s inner
291            loop gets removed.
292    
293    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
294    
295            * HTML.pm.src: Make |PUBLIC| and |SYSTEM| keyword tokenizing
296            into their own tokenizer states.
297    
298    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
299    
300            * HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
301            is split into three states.
302    
303    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
304    
305            * HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into
306            itself and a new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
307            the tokenizer no longer has to push back next input
308            characters in those states.
309    
310    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
311    
312            * HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| is broken
313            into four states so that the tokenizer no longer has to push
314            back next input characters in that state.
315    
316    2008-09-11  Wakaba  <wakaba@suika.fam.cx>
317    
318            * HTML.pm.src: Methods now accept an additional parameter, $get_wrapper,
319            which can be used to insert some wrapper between the character
320            stream handle and the tokenizer.  (It is currently not supported
321            for |set_inner_html| for |Element|s).
322    
323    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
324    
325            * HTML.pm.src: Ignore punctuation in charset names.
326    
327    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
328    
329            * ContentChecker.pm: Support for charset-layer error levels.
330    
331            * HTML.pm.src: Don't specify |text| argument for the
332            |chardecode:fallback| error, since it is not the encoding
333            being used alternatively.
334    
335    2008-09-06  Wakaba  <wakaba@suika.fam.cx>
336    
337            * HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).
338    
339    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
340    
341            * CacheManifest.pm: Support for extensibility (HTML5 revision 2051).
342    
343    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
344    
345            * HTML.pm.src: Bug fix and sync with the spec with regard
346            to after after frameset insertion mode processing (HTML5
347            revision 1909).  Note that the implementation was wrong
348            per the old spec before the r1909 changes.
349    
350    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
351    
352            * HTMLTable.pm: scope=auto algorithm fix synced with the
353            spec (HTML5 revision 2093).
354            ($process_row): Algorithm step numbers synced with the
355            spec (HTML5 revision 2092).
356    
357    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
358    
359            * HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5
360            revision 2094).
361    
362    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
363    
364            * ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).
365    
366    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
367    
368            * HTML.pm.src: '"' and "'" at the end of attribute
369            name (after another attribute) now raise a parse error (HTML5
370            revision 2123).  Empty unquoted attribute values are no
371            longer allowed (HTML5 revision 2122).
372    
373    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
374    
375            * mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5
376            revision 2130).
377    
378    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
379    
380            * ContentChecker.pm: |xml:lang| attribute value must be the same
381            as |lang| attribute value for HTML elements (HTML5 revision 2062
382            and so on).
383    
384    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
385    
386            * ContentChecker.pm: Error level definition for |xml_id_error|
387            was missing.
388    
389            * URIChecker.pm: The end of the URL should be marked as the
390            error location for an empty path error.  The position
391            between the userinfo and the port components should be
392            marked as the error location for an empty host error.
393    
394    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
395    
396            * URIChecker.pm: For errors, set parameters indicating where
397            in the value the error occurs.  Report unknown
398            address format errors at the warning level, since address
399            formats are rarely added.  Path segments starting with "/.."
400            were misinterpreted as dot-segments.
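
The dot-segment distinction can be illustrated as below. This is a minimal Python sketch under the RFC 3986 definition that only the complete segments "." and ".." are dot-segments; it is not the check used in URIChecker.pm, and `has_dot_segment` is a name invented for the example.

```python
def has_dot_segment(path):
    """True only if some complete path segment is exactly "." or ".."."""
    # Comparing whole segments avoids treating "/..foo" as a dot-segment,
    # which a plain substring test for "/.." would do.
    return any(segment in (".", "..") for segment in path.split("/"))
```

Under this test, "/a/../b" contains a dot-segment while "/a/..b/c" and "/a/.hidden" do not.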
401    
402    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
403    
404            * URIChecker.pm (check_iri_reference): Requires
405            |Message::DOM::DOMImplementation|.
406    
407    2008-08-29  Wakaba  <wakaba@suika.fam.cx>
408    
409            * IMTChecker.pm: Updated for the new error reporting architecture.
410    
411            * ContentChecker.pm: Error levels for IMTs are added.
412    
413    2008-08-17  Wakaba  <wakaba@suika.fam.cx>
414    
415            * H2H.pm (_shift_token): Support for unquoted HTML attribute
416            values.
417    
418    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
419    
420            * CacheManifest.pm: Support for new style of error
421            reports.
422    
423            * HTML.pm.src: Set line=1, column=1 to the document node.
424    
425    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
426    
427            * ContentChecker.pm, RDFXML.pm: Pass {level} object to language tag
428            and URL checkers.  Support for more error levels for bogus
429            language tag and URL "standards".
430    
431            * LangTag.pm, URIChecker.pm: Support for new style error
432            level reporting.
433    
434    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
435    
436            * ContentChecker.pm: Support for RDF/XML error levels.
437    
438            * HTMLTable.pm, RDFXML.pm: Support for new style of error level
439            specifying.  Error types are revised.
440    
441    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
442    
443            * ContentChecker.pm: All error reporting method calls are
444            renewed.
445    
446    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
447    
448            * HTML.pm.src: All error type names and "text" parameters
449            are revised.  Use new style for "level" specification.
450    
451            * mkhtmlparser.pl: Use new style for "level" specification.
452    
453    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
454    
455            * WebIDL.pm (parse_char_string): Simplified error
456            reporting process for broken ignored valuetype definition.
457            (Valuetype idl_text): Support for special "DOMString" name.
458    
459    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
460    
461            * WebIDL.pm ($get_scoped_name): Append "::::" if the last
462            terminal of the ScopedName is "DOMString", so that whether
463            the last part of the scoped name is "DOMString" or "_DOMString"
464            can be determined later.  It is necessary to determine whether a |typedef|
465            definition should be ignored or not.
466            (parse_char_string): Unescape the identifier of
467            exception members.
468            ($resolve): Return undef for builtin types and sequence<T>
469            types (we might not have to do this, however...).
470            (check): Support checking for Exceptions, Valuetypes,
471            and Typedefs.
472            ($serialize_type): Support for "DOMString::::" syntax.
473            (Typedef idl_text): Output Type as "DOMString" if it
474            is really "DOMString" (i.e. its internal representation
475            is "::DOMString::").
476    
477    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
478    
479            * WebIDL.pm ($resolve): New code, based on resolve code
480            for constant types in the |check| method.
481            (check): Support for checking of attributes, operations, and
482            arguments.
483            (Attribute/Operation idl_text): Exception names in getraises,
484            setraises, and raises clauses are serialized by |$serialize_type|
485            code.
486    
487    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
488    
489            * WebIDL.pm ($integer): The order of alternatives is changed to match
490            hexadecimal numbers (the original pattern, taken from the spec,
491            did not work for hexadecimal numbers, because after the "0" prefix
492            the [0-7]* part matches an empty string, and therefore the pattern
493            does not match the remaining "x..." part of a "0x..." integer
494            literal).
495            ($get_type): It now returns a string, not an array reference,
496            for regular types and |sequence| types (i.e. it in any case
497            returns a string).
498            ($get_next_token): The second item in the array that represents
499            an integer or float token is now a Perl number value, not the
500            original string representation of the number.
501            (check): Support for const value consistency checking.
502            No extended attribute is defined for constants.
503            (Node subclasses): Use simple strings rather than array references
504            for default data type values.
505            ($serialize_type): Type values are now simple strings.
506            (value): If the new attribute value is a false value, then
507            a FALSE value is set to the attribute.
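
The ($integer) ordering problem noted in this entry can be reproduced with a short sketch. The two patterns below are assumed reconstructions written for illustration in Python, not text quoted from the spec or from WebIDL.pm; they rely on regex alternation trying branches from left to right, as Perl's does.

```python
import re

# Assumed shape of the original pattern: the octal branch comes first,
# so after "0" the [0-7]* part matches an empty string and matching stops.
SPEC_ORDER = re.compile(r"-?0([0-7]*|[Xx][0-9A-Fa-f]+)|-?[1-9][0-9]*")

# Reordered pattern: the hexadecimal branch is tried before the octal one.
FIXED_ORDER = re.compile(r"-?0([Xx][0-9A-Fa-f]+|[0-7]*)|-?[1-9][0-9]*")


def leading_integer(pattern, text):
    """Return the integer literal matched at the start of text, or None."""
    match = pattern.match(text)
    return match.group(0) if match else None
```

For the input "0x1F", the first pattern stops after "0" and leaves "x1F" unmatched, while the reordered pattern consumes the whole literal; octal and decimal literals behave the same under both.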
508    
509    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
510    
511            * WebIDL.pm ($get_scoped_name): Now scoped names are stored
512            in their stringified format ("scoped name" as defined in the
513            spec).  Note that a future version of this module should not use
514            array references for type values and the |type_text| attribute
515            should be made obsolete.
516            (parse_char_string): Unescape attribute names.
517            (check): Support for checking of whether inherited interfaces
518            are actually defined or not.  Support for checking of whether
519            interface member identifiers are duplicated or not.
520            ($serialize_type): Scoped names are returned as is.  A future
521            version of this code should escape identifiers other than "DOMString",
522            otherwise the idl_text would be non-conforming.
523    
524    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
525    
526            * WebIDL.pm (parse_char_string): Set line/column numbers
527            to generated nodes.  Unescape identifiers.  Extended attributes
528            for Definition's were ignored.
529            (append_child): Set |parent_node| attribute.
530            (parent_node): New attribute.
531            (check): Support interface/exception members.  Support
532            extended attributes.  Support definition identifier uniqueness
533            constraint.
534            (qualified_name): New attribute.
535            (Interface/Exception idl_text): Extended attributes were
536            not prepended to the returned text.
537    
538    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
539    
540            * WebIDL.pm (parse_char_string): Set line/column numbers
541            to interface object experimentally.  s/shift/pop/g, shift
542            would make things wrong.  Support for interface forward
543            declarations was missing.  Broken interface declarations
544            with no block were not ignored entirely.
545            (Whatpm::WebIDL::Node): New abstract class.  This class
546            makes things easier.
547            (child_nodes): New attribute.  Unlike the DOM attribute with the
548            same name, this attribute returns a dead list of nodes for
549            simplicity.
550            (get_user_data, set_user_data): New methods.
551            (Module idl_text): A SPACE character should be inserted
552            before the |{| character.
553            (Interface idl_text): Support for interface forward declarations.
554            (is_forward_declaration): New attribute.
555    
556    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
557    
558            * WebIDL.pm (type_text): Better serializer.
559    
560    2008-07-19  Wakaba  <wakaba@suika.fam.cx>
561    
562            * WebIDL.pm: Revise forward-compatible parsing so that
