Diff of /markup/html/whatpm/Whatpm/ChangeLog

revision 1.265 by wakaba, Sat Aug 2 11:49:57 2008 UTC revision 1.312 by wakaba, Mon Sep 15 08:09:39 2008 UTC
# Line 1  Line 1 
1    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
2    
3            * HTML.pm.src: Shorten keys.
4    
5    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
6    
7            * HTML.pm.src: Remove checks for control character, surrogate
8            pair, noncharacter, and non-Unicode code
9            points (they should be handled by Whatpm::Charset::UnicodeChecker).
10            (parse_char_stream): Support for the |$get_wrapper| argument and
11            character stream error handlers.
12    
13    2008-09-15  Wakaba  <wakaba@suika.fam.cx>
14    
15            * ContentChecker.pm: Don't call |load_ns_module|
16            for null-namespace elements/attributes.
17    
18            * HTML.pm.src: Factor out $disallowed_control_chars
19            as a hash.
20    
21    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
22    
23            * HTML.pm.src: Regexp typo fixed.  |{prev_char}|
24            and |{next_char}| initializations are moved to the initialization
25            method.  |{read_until}| now supports buffering.  Sync |set_inner_html|
26            with |parse_char_stream|.
27    
28    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
29    
30            * HTML.pm.src (parse_char_stream): Make |set_next_char|
31            invoke |manakai_read_until|, not only |read|, where
32            possible, to decrease the number of |read| method calls.
33    
34            * mkhtmlparser.pl: Related changes to the aforementioned
35            modification.
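
The effect of reading runs of characters instead of single characters can be sketched as follows. This is an illustration in Python, not the code in HTML.pm.src; `CountingReader`, `read_one_by_one`, `read_runs`, the chunk size, and the choice of "<" and "&" as delimiters are all assumptions made for the example.

```python
import io
import re

STOP_CHARS = re.compile(r"[<&]")  # assumed delimiters, for illustration only


class CountingReader:
    """Wraps a string as a stream and counts read() calls."""

    def __init__(self, text):
        self._stream = io.StringIO(text)
        self.calls = 0

    def read(self, size):
        self.calls += 1
        return self._stream.read(size)


def read_one_by_one(reader):
    """Baseline: one read() call per character."""
    chars = []
    while True:
        ch = reader.read(1)
        if ch == "":
            return chars
        chars.append(ch)


def read_runs(reader, chunk_size=16):
    """Fetch chunks, then emit runs of ordinary characters in one piece."""
    tokens = []
    buffer = ""
    while True:
        if buffer == "":
            buffer = reader.read(chunk_size)
            if buffer == "":
                return tokens
        match = STOP_CHARS.search(buffer)
        if match is None:
            cut = len(buffer)      # whole buffer is ordinary text
        elif match.start() == 0:
            cut = 1                # a delimiter is emitted on its own
        else:
            cut = match.start()    # ordinary text up to the delimiter
        tokens.append(buffer[:cut])
        buffer = buffer[cut:]
```

Both functions return the same characters; the buffered variant simply issues far fewer `read` calls on the underlying stream.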
36    
37    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
38    
39            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
40            now reports character errors.
41    
42    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
43    
44            * HTML.pm.src: Character tokens consisting of leading white space
45            followed by non-white-space characters were not correctly
46            handled in the "before head" insertion mode.
47            (set_inner_html): Reimplemented using CharString decodehandle
48            class.  Support for $get_wrapper argument.  Support
49            for |{read_until}| feature.
50    
51    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
52    
53            * HTML.pm.src: Make a "bare ero" error for unknown
54            entities point to the "&" character.
55    
56    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
57    
58            * HTML.pm.src: It turns out that U+FFFD doesn't have to
59            be added to the list of excluded characters.
60    
61    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
62    
63            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
64            and |column| a higher priority than those set by the
65            tokenizer's input handler.
66            ($self->{read_until}): Exclude U+FFFD (but this might
67            not be necessary, since now we do line/column fixup in
68            the character decode handle).
69    
70    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
71    
72            * HTML.pm.src: Use |{read_until}| where possible.
73    
74    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
75    
76            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
77            and |manakai_getc_until| to |manakai_read_until| to
78            reduce the number of string copies.
79    
80    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
81    
82            * HTML.pm.src (parse_char_string): Use the newly created
83            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
84            standard feature to |open| a string as a filehandle,
85            since Perl's string filehandle does not seem to support the |ungetc|
86            method correctly.
87            (parse_char_stream): Define |{getc_until}| method.
88            (DATA_STATE): Experimental support for |getc_until| feature.
89    
90    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
91    
92            * HTML.pm.src: Check points added to newly added branches.
93    
94    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
95    
96            * HTML.pm.src: Remove |{char}|, which is no longer used.
97            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
98            with |{prev_state}|.
99    
100            * mkhtmlparser.pl: Remove |{char}| feature.
101            Remove |!!!back-next-input-character;| macro.
102    
103    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
104    
105            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
106            entity related tokenizer states in favor of new states
107            implementing the consume character reference algorithm.
108    
109    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
110    
111            * HTML.pm.src: "Consume a character reference" algorithm is
112            now implemented as a tokenizer's state, rather than
113            a method, with minimum changes (more changes will
114            be made, in due course).  "Bogus comment state"'s inner
115            loop gets removed.
116    
117    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
118    
119            * HTML.pm.src: Move |PUBLIC| and |SYSTEM| keyword tokenizing
120            into their own tokenizer states.
121    
122    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
123    
124            * HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
125            is split into three states.
126    
127    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
128    
129            * HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is split into
130            itself and the new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
131            the tokenizer no longer has to push back next input
132            characters in those states.
133    
134    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
135    
136            * HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| is split
137            into four states so that the tokenizer no longer has to push
138            back next input characters in that state.
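
Why extra states remove the need for push-back can be sketched as follows. This is an illustration in Python under stated assumptions, not the actual state set in HTML.pm.src; the state name `MD_HYPHEN` and the function `classify_markup_declaration` are hypothetical, and only the "--" comment opener is covered.

```python
def classify_markup_declaration(chars):
    """Consume the characters that follow "<!" one at a time, with no lookahead.

    Returns a pair (kind, consumed), where `consumed` is the text that was
    read and now belongs to the bogus comment, if any.
    """
    state = "MARKUP_DECLARATION_OPEN"
    pending = ""  # characters matched tentatively so far
    for ch in chars:
        if state == "MARKUP_DECLARATION_OPEN":
            if ch == "-":
                # Remember the hyphen in the state instead of peeking ahead.
                state = "MD_HYPHEN"
                pending = "-"
            else:
                return ("bogus comment", ch)
        elif state == "MD_HYPHEN":
            if ch == "-":
                return ("comment start", "")
            # The earlier hyphen was not a comment opener after all; hand it
            # over as data rather than pushing characters back onto the input.
            return ("bogus comment", pending + ch)
    return ("bogus comment", pending)
```

A single state would have to read two characters and return them to the input when they turn out not to be "--". With an intermediate state, what has been seen so far is carried in the state itself.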
139    
140    2008-09-11  Wakaba  <wakaba@suika.fam.cx>
141    
142            * HTML.pm.src: Methods now accept an additional parameter, $get_wrapper,
143            which can be used to insert a wrapper between the character
144            stream handle and the tokenizer.  (It is currently not supported
145            for |set_inner_html| for |Element|s.)
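
The wrapper idea can be sketched as follows. This is an illustration in Python, not the interface of the Perl methods; `tokenize`, `CountingHandle`, and the `get_wrapper` keyword are names assumed for the example.

```python
import io


class CountingHandle:
    """Hypothetical wrapper: passes reads through and counts characters."""

    def __init__(self, inner):
        self.inner = inner
        self.chars_read = 0

    def read(self, size=-1):
        data = self.inner.read(size)
        self.chars_read += len(data)
        return data


def tokenize(stream, get_wrapper=None):
    """Toy tokenizer: every character is one token.

    If get_wrapper is given, it is called with the stream and its return
    value is read from instead, so the caller can interpose any object
    that offers a compatible read() method.
    """
    handle = get_wrapper(stream) if get_wrapper is not None else stream
    tokens = []
    while True:
        ch = handle.read(1)
        if ch == "":
            return tokens, handle
        tokens.append(ch)
```

For example, `tokenize(io.StringIO("<p>"), get_wrapper=CountingHandle)` reads through the wrapper, while omitting the argument reads from the stream directly.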
146    
147    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
148    
149            * HTML.pm.src: Ignore punctuation in charset names.
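
One way to compare charset names while ignoring punctuation is sketched below in Python. Which characters count as punctuation is an assumption here (the ASCII set in `string.punctuation`), as are the function name `charset_key` and the case folding; the actual rule in HTML.pm.src may differ.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCTUATION = str.maketrans("", "", string.punctuation)


def charset_key(name):
    """Return a comparison key: punctuation removed, ASCII case folded."""
    return name.translate(_STRIP_PUNCTUATION).lower()
```

With this key, labels such as "UTF-8", "utf_8" and "utf8" compare equal.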
150    
151    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
152    
153            * ContentChecker.pm: Support for charset-layer error levels.
154    
155            * HTML.pm.src: Don't specify |text| argument for the
156            |chardecode:fallback| error, since it is not the encoding
157            being used alternatively.
158    
159    2008-09-06  Wakaba  <wakaba@suika.fam.cx>
160    
161            * HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).
162    
163    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
164    
165            * CacheManifest.pm: Support for extensibility (HTML5 revision 2051).
166    
167    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
168    
169            * HTML.pm.src: Bug fix and sync with the spec with regard
170            to after after frameset insertion mode processing (HTML5
171            revision 1909).  Note that the implementation was wrong
172            per the old spec before the r1909 changes.
173    
174    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
175    
176            * HTMLTable.pm: scope=auto algorithm fix synced with the
177            spec (HTML5 revision 2093).
178            ($process_row): Algorithm step numbers synced with the
179            spec (HTML5 revision 2092).
180    
181    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
182    
183            * HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5
184            revision 2094).
185    
186    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
187    
188            * ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).
189    
190    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
191    
192            * HTML.pm.src: '"' and "'" at the end of attribute
193            name (after another attribute) now raise parse error (HTML5
194            revision 2123).  Empty unquoted attribute values are no
195            longer allowed (HTML5 revision 2122).
196    
197    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
198    
199            * mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5
200            revision 2130).
201    
202    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
203    
204            * ContentChecker.pm: |xml:lang| attribute value must be the same
205            as |lang| attribute value for HTML elements (HTML5 revision 2062
206            and so on).
207    
208    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
209    
210            * ContentChecker.pm: Error level definition for |xml_id_error|
211            was missing.
212    
213            * URIChecker.pm: The end of the URL should be marked as the
214            error location for an empty path error.  The position
215            between the userinfo and the port components should be
216            marked as the error location for an empty host error.
217    
218    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
219    
220            * URIChecker.pm: For each error, set parameters representing
221            where in the value the error occurs.  Report the unknown
222            address format error at the warning level, since address
223            formats are rarely added.  Path segments starting with "/.."
224            were misinterpreted as dot-segments.
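
The dot-segment distinction can be sketched as follows. This is an illustration in Python, not the check in URIChecker.pm; the function name `dot_segments` is hypothetical. A segment counts as a dot-segment only when it is exactly "." or "..", so a segment such as "..b" is an ordinary segment.

```python
def dot_segments(path):
    """Return the segments of `path` that are exactly "." or ".."."""
    return [segment for segment in path.split("/") if segment in (".", "..")]
```

A prefix test such as "starts with /.." would flag "/a/..b/c", whereas comparing whole segments does not.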
225    
226    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
227    
228            * URIChecker.pm (check_iri_reference): Requires
229            |Message::DOM::DOMImplementation|.
230    
231    2008-08-29  Wakaba  <wakaba@suika.fam.cx>
232    
233            * IMTChecker.pm: Updated for the new error reporting architecture.
234    
235            * ContentChecker.pm: Error levels for IMTs are added.
236    
237    2008-08-17  Wakaba  <wakaba@suika.fam.cx>
238    
239            * H2H.pm (_shift_token): Support for unquoted HTML attribute
240            values.
241    
242    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
243    
244            * CacheManifest.pm: Support for new style of error
245            reports.
246    
247            * HTML.pm.src: Set line=1, column=1 to the document node.
248    
249    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
250    
251            * ContentChecker.pm, RDFXML.pm: Pass {level} object to language tag
252            and URL checkers.  Support for more error levels for bogus
253            language tag and URL "standards".
254    
255            * LangTag.pm, URIChecker.pm: Support for new style error
256            level reporting.
257    
258    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
259    
260            * ContentChecker.pm: Support for RDF/XML error levels.
261    
262            * HTMLTable.pm, RDFXML.pm: Support for new style of error level
263            specification.  Error types are revised.
264    
265    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
266    
267            * ContentChecker.pm: All error reporting method calls are
268            renewed.
269    
270    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
271    
272            * HTML.pm.src: All error type names and "text" parameters
273            are revised.  Use new style for "level" specification.
274    
275            * mkhtmlparser.pl: Use new style for "level" specification.
276    
277    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
278    
279            * WebIDL.pm (parse_char_string): Simplified error
280            reporting process for broken ignored valuetype definition.
281            (Valuetype idl_text): Support for special "DOMString" name.
282    
283    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
284    
285            * WebIDL.pm ($get_scoped_name): Append "::::" if the last
286            terminal of the ScopedName is "DOMString", so that whether
287            the last part of the scoped name is "DOMString" or "_DOMString"
288            can be determined later.  This is necessary to determine whether a |typedef|
289            definition should be ignored or not.
290            (parse_char_string): Unescape the identifier of
291            exception members.
292            ($resolve): Return undef for builtin types and sequence<T>
293            types (we might not have to do this, however...).
294            (check): Support checking for Exceptions, Valuetypes,
295            and Typedefs.
296            ($serialize_type): Support for "DOMString::::" syntax.
297            (Typedef idl_text): Output Type as "DOMString" if it
298            is really "DOMString" (i.e. its internal representation
299            is "::DOMString::").
300    
301    2008-08-03  Wakaba  <wakaba@suika.fam.cx>
302    
303            * WebIDL.pm ($resolve): New code, based on resolve code
304            for constant types in the |check| method.
305            (check): Support for checking of attributes, operations, and
306            arguments.
307            (Attribute/Operation idl_text): Exception names in getraises,
308            setraises, and raises clauses are serialized by the |$serialize_type|
309            code.
310    
311    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
312    
313            * WebIDL.pm ($integer): The order of alternatives is changed to match
314            hexadecimal numbers (the original pattern, taken from the spec,
315            did not work for hexadecimal numbers, because after the "0" prefix
316            the [0-7]* part matches as an empty string, and therefore
317            the remaining "x..." part of a "0x..." integer literal
318            is left unmatched).
319            ($get_type): It now returns a string, not an array reference,
320            for regular types and |sequence| types (i.e. it in any case
321            returns a string).
322            ($get_next_token): The second item in the array that represents
323            an integer or float token is now a Perl number value, not the
324            original string representation of the number.
325            (check): Support for const value consistency checking.
326            No extended attribute is defined for constants.
327            (Node subclasses): Use simple strings rather than array references
328            for default data type values.
329            ($serialize_type): Type values are now simple strings.
330            (value): If the new attribute value is a false value, then
331            a FALSE value is set to the attribute.
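
The alternation-order problem described for |$integer| can be reproduced with a small sketch. This is Python, not the Perl code in WebIDL.pm, and both patterns are assumed reconstructions from the description in this entry; `leading_integer` is a hypothetical helper. Python's `re`, like Perl, tries alternatives left to right and keeps the first one that lets the match succeed.

```python
import re

# Octal alternative first, as the entry says the spec-derived pattern had it
# (assumed shape).
SPEC_ORDER = re.compile(r"-?(?:0(?:[0-7]*|[Xx][0-9A-Fa-f]+)|[1-9][0-9]*)")

# Hexadecimal alternative tried first.
HEX_FIRST = re.compile(r"-?(?:0(?:[Xx][0-9A-Fa-f]+|[0-7]*)|[1-9][0-9]*)")


def leading_integer(pattern, text):
    """Return the integer literal matched at the start of `text`, or None."""
    match = pattern.match(text)
    return match.group() if match else None
```

With the first pattern, a tokenizer that matches at the current position takes only "0" from "0x1F" and leaves "x1F" behind. With the reordered pattern the whole literal is consumed, and octal and decimal literals still match as before.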
332    
333    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
334    
335            * WebIDL.pm ($get_scoped_name): Now scoped names are stored
336            in their stringified format ("scoped name" as defined in the
337            spec).  Note that a future version of this module should not use
338            array references for type values and the |type_text| attribute
339            should be made obsolete.
340            (parse_char_string): Unescape attribute names.
341            (check): Support for checking of whether inherited interfaces
342            are actually defined or not.  Support for checking of whether
343            interface member identifiers are duplicated or not.
344            ($serialize_type): Scoped names are returned as is.  A future
345            version of this code should escape identifiers other than "DOMString",
346            otherwise the idl_text would be non-conforming.
347    
348    2008-08-02  Wakaba  <wakaba@suika.fam.cx>
349    
350            * WebIDL.pm (parse_char_string): Set line/column numbers
