Diff of /markup/html/whatpm/Whatpm/ChangeLog

revision 1.273 by wakaba, Fri Aug 15 14:13:42 2008 UTC revision 1.307 by wakaba, Sun Sep 14 11:57:41 2008 UTC
# Line 1  Line 1 
1    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
2    
3            * HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
4            now reports character errors.
5    
6    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
7    
8            * HTML.pm.src: Non-white-space character tokens with leading
9            white space were not correctly handled in the "before head"
10            insertion mode.
11            (set_inner_html): Reimplemented using CharString decodehandle
12            class.  Support for $get_wrapper argument.  Support
13            for |{read_until}| feature.
14    
15    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
16    
17            * HTML.pm.src: Make a "bare ero" error for unknown
18            entities point to the "&" character.
19    
20    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
21    
22            * HTML.pm.src: It turns out that U+FFFD doesn't have to
23            be added to the list of excluded characters.
24    
25    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
26    
27            * HTML.pm.src ($char_onerror): Give the character decoder's |line|
28            and |column| higher priority than those set by the
29            tokenizer's input handler.
30            ($self->{read_until}): Exclude U+FFFD (but this might
31            not be necessary, since now we do line/column fixup in
32            the character decode handle).
33    
34    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
35    
36            * HTML.pm.src: Use |{read_until}| where possible.
37    
38    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
39    
40            * HTML.pm.src: Change |{getc_until}| to |{read_until}|
41            and |manakai_getc_until| to |manakai_read_until| to
42            reduce the number of string copies.
43    
44    2008-09-14  Wakaba  <wakaba@suika.fam.cx>
45    
46            * HTML.pm.src (parse_char_string): Use newly created
47            |Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
48            standard feature to |open| a string as a filehandle,
49            since Perl's string filehandle does not seem to support the |ungetc|
50            method correctly.
51            (parse_char_stream): Define |{getc_until}| method.
52            (DATA_STATE): Experimental support for |getc_until| feature.
53    
54    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
55    
56            * HTML.pm.src: Check points added to newly added branches.
57    
58    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
59    
60            * HTML.pm.src: Remove |{char}|, which is no longer used.
61            Replace |{entity_in_attr}| and |{last_attribute_value_state}|
62            with |{prev_state}|.
63    
64            * mkhtmlparser.pl: Remove |{char}| feature.
65            Remove |!!!back-next-input-character;| macro.
66    
67    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
68    
69            * HTML.pm.src: Finally we get rid of all the inner loops.  Remove
70            entity related tokenizer states in favor of new states
71            implementing the consume character reference algorithm.
72    
73    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
74    
75            * HTML.pm.src: "Consume a character reference" algorithm is
76            now implemented as a tokenizer's state, rather than
77            a method, with minimum changes (more changes will
78            be made, in due course).  "Bogus comment state"'s inner
79            loop gets removed.
80    
81    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
82    
83            * HTML.pm.src: Move |PUBLIC| and |SYSTEM| keyword tokenizing
84            into their own tokenizer states.
85    
86    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
87    
88            * HTML.pm.src: |CDATA_SECTION_STATE| (formerly |CDATA_BLOCK_STATE|)
89            is split into three states.
90    
91    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
92    
93            * HTML.pm.src: |CLOSE_TAG_OPEN_STATE| is broken into
94            itself and a new |CDATA_PCDATA_CLOSE_TAG_STATE| so that
95            the tokenizer no longer has to push back next input
96            characters in those states.
97    
98    2008-09-13  Wakaba  <wakaba@suika.fam.cx>
99    
100            * HTML.pm.src: |MARKUP_DECLARATION_OPEN_STATE| is broken
101            into four states so that the tokenizer no longer has to push
102            back next input characters in that state.
103    
104    2008-09-11  Wakaba  <wakaba@suika.fam.cx>
105    
106            * HTML.pm.src: Methods now accept an additional parameter, $get_wrapper,
107            which can be used to insert some wrapper between the character
108            stream handle and the tokenizer.  (It is currently not supported
109            for |set_inner_html| for |Element|s).
110    
111    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
112    
113            * HTML.pm.src: Ignore punctuation in charset names.
114    
115    2008-09-10  Wakaba  <wakaba@suika.fam.cx>
116    
117            * ContentChecker.pm: Support for charset-layer error levels.
118    
119            * HTML.pm.src: Don't specify |text| argument for the
120            |chardecode:fallback| error, since it is not the encoding
121            being used alternatively.
122    
123    2008-09-06  Wakaba  <wakaba@suika.fam.cx>
124    
125            * HTML.pm.src: Support for |XSLT-compat| (HTML5 revision 2141).
126    
127    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
128    
129            * CacheManifest.pm: Support for extensibility (HTML5 revision 2051).
130    
131    2008-08-31  Wakaba  <wakaba@suika.fam.cx>
132    
133            * HTML.pm.src: Bug fix and sync with the spec with regard
134            to after after frameset insertion mode processing (HTML5
135            revision 1909).  Note that the implementation was wrong
136            per the old spec before the r1909 changes.
137    
138    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
139    
140            * HTMLTable.pm: scope=auto algorithm fix synced with the
141            spec (HTML5 revision 2093).
142            ($process_row): Algorithm step numbers synced with the
143            spec (HTML5 revision 2092).
144    
145    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
146    
147            * HTMLTable.pm: Zs is not what we want; we want White_Space! (HTML5
148            revision 2094).
149    
150    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
151    
152            * ContentType.pm: Support for image/svg+xml (HTML5 revision 2096).
153    
154    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
155    
156            * HTML.pm.src: '"' and "'" at the end of attribute
157            name (after another attribute) now raise parse error (HTML5
158            revision 2123).  Empty unquoted attribute values are no
159            longer allowed (HTML5 revision 2122).
160    
161    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
162    
163            * mkhtmlparser.pl: Support for MathML |definitionURL| attribute (HTML5
164            revision 2130).
165    
166    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
167    
168            * ContentChecker.pm: |xml:lang| attribute value must be the same
169            as |lang| attribute value for HTML elements (HTML5 revision 2062
170            and so on).
171    
172    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
173    
174            * ContentChecker.pm: Error level definition for |xml_id_error|
175            was missing.
176    
177            * URIChecker.pm: The end of the URL should be marked as the
178            error location for an empty path error.  The position
179            between the userinfo and the port components should be
180            marked as the error location for an empty host error.
181    
182    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
183    
184            * URIChecker.pm: For each error, set parameters indicating
185            where in the value the error occurs.  Report unknown
186            address format errors at the warning level, since address
187            formats are rarely added.  Path segments starting with "/.."
188            were misinterpreted as dot-segments.
189    
190    2008-08-30  Wakaba  <wakaba@suika.fam.cx>
191    
192            * URIChecker.pm (check_iri_reference): Requires
193            |Message::DOM::DOMImplementation|.
194    
195    2008-08-29  Wakaba  <wakaba@suika.fam.cx>
196    
197            * IMTChecker.pm: Updated for the new error reporting architecture.
198    
199            * ContentChecker.pm: Error levels for IMTs are added.
200    
201    2008-08-17  Wakaba  <wakaba@suika.fam.cx>
202    
203            * H2H.pm (_shift_token): Support for unquoted HTML attribute
204            values.
205    
206    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
207    
208            * CacheManifest.pm: Support for new style of error
209            reports.
210    
211            * HTML.pm.src: Set line=1, column=1 to the document node.
212    
213    2008-08-16  Wakaba  <wakaba@suika.fam.cx>
214    
215            * ContentChecker.pm, RDFXML.pm: Pass {level} object to language tag
216            and URL checkers.  Support for more error levels for bogus
217            language tag and URL "standards".
218    
219            * LangTag.pm, URIChecker.pm: Support for new style error
220            level reporting.
221    
222    2008-08-15  Wakaba  <wakaba@suika.fam.cx>
223    
224            * ContentChecker.pm: Support for RDF/XML error levels.
