Contents of /markup/html/whatpm/Whatpm/HTML/ChangeLog

2009-09-05  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Changed to keep non-normal character
        references as is (HTML5 revision 3374).

2009-09-05  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Discard unclosed tags (HTML5 revision 2990).

2009-09-05  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src (_get_next_token): Implemented the "comment end
        space state" (HTML5 revision 3195).

2009-09-05  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src (_get_next_token): Implemented the "comment end
        bang state" (HTML5 revision 3191).

2009-08-16  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Any "<" character in attribute names becomes
        a parse error (HTML5 revision 3354).

2009-08-16  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Lowercase-fold doctype names (HTML5 revision
        2501, cf. HTML5 revision 3571).

2009-07-05  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Reduced the number of parse errors on broken
        DOCTYPE (HTML5 revision 3121).

2009-07-03  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Reduced a parse error (HTML5 revision 3194).

2009-07-03  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: "<" in unquoted attribute values is now
        treated as a parse error (HTML5 revision 3206).

2008-11-07  Wakaba  <wakaba@suika.fam.cx>

        * Dumper.pm (dumptree): Support for namespace abbreviation for
        SWML namespaces.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Normalize white space characters in attribute
        value literals in XML documents.  Don't apply character reference
        mapping table for non-NULL non-surrogate code points.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Set the "stop_processing" flag true when a
        parameter entity occurs in a standalone="no" document.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Column number counting fixed.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Raise a parse error for '&' that does not
        introduce a reference in XML.  Support for non-ASCII entity
        reference names.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Make uppercase "&#X" in XML a parse error.
        Remove the limitation of entity name length.  Enable replacement
        of text-only general entities.  Raise a parse error for an
        unparsed entity reference.  Raise a parse error for a general
        entity reference to an undefined entity.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Support for <!ELEMENT>.
        (AFTER_NOTATION_NAME_STATE): Renamed as |AFTER_MD_DEF_STATE| (i.e.
        after markup declaration definition state).

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Support for EntityValue.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Dumper.pm: Dump text content of Entity nodes.

        * Tokenizer.pm.src: Support for <!ENTITY ... NDATA>.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src (_get_next_token): Make keywords 'ENTITY',
        'ELEMENT', 'ATTLIST', and 'NOTATION' ASCII case-insensitive.

2008-10-18  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Modified PUBLIC/SYSTEM identifier tokenizer
        states such that <!ENTITY> and <!NOTATION> can be tokenized by
        those states as well.
        (BOGUS_MD_STATE): A new state; used for bogus markup declarations,
        in favor of BOGUS_COMMENT_STATE.

2008-10-18  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: <!ATTLIST> in the internal subset of an XML
        document is now fully implemented.

        * Dumper.pm (dumptree): Output allowed tokens and default value
        always.

2008-10-17  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: New token types ATTLIST_TOKEN, ELEMENT_TOKEN,
        GENERAL_ENTITY_TOKEN, PARAMETER_ENTITY_TOKEN, and NOTATION_TOKEN
        are added.  New insertion modes for markup declarations are added.

2008-10-16  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: New token type END_OF_DOCTYPE_TOKEN added.
        New states DOCTYPE_TAG_STATE and
        BOGUS_DOCTYPE_INTERNAL_SUBSET_AFTER_STATE are added.  (Bogus
        string after the internal subset, which was handled by the state
        BOGUS_DOCTYPE_STATE, is now handled by the new state.)  Support
        for comments, bogus comments, and processing instructions in the
        internal subset.  If there is the internal subset, then emit the
        doctype token before the internal subset (with its
        $token->{has_internal_subset} flag set) and an
        END_OF_DOCTYPE_TOKEN after the internal subset.

2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: $self->{s_kwd} for non-DATA_STATE states is
        renamed as $self->{kwd} to avoid conflicts.  Don't raise
        case-sensitivity error for the keyword "DOCTYPE" in HTML mode.
        Support for internal subsets (internal subset itself only; no
        declaration in them is supported yet).  Raise a parse error for
        non-uppercase keywords "PUBLIC" and "SYSTEM" in XML mode.  Raise a
        parse error if no system identifier is specified for a DOCTYPE
        declaration with a public identifier.  Don't close the DOCTYPE
        declaration by a ">" character in the system declaration in XML
        mode.
        
2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Set index attribute to each attribute token,
        for ignoring namespaced duplicate attribute at the XML namespace
        parser layer.  Raise a parse error if the attribute value is
        omitted, in XML mode.  Raise a parse error if the attribute value
        is not quoted, in XML mode.  Raise a parse error if "<" character
        is found in a quoted attribute value, in XML mode.

2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: XML tag name start character support for end
        tags.  Support for the short end tag syntax of XML5.  Raise a
        parse error for a lowercase <!doctype> in XML.

2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: XML tag name start character support for start
        tags.

2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Support for XML processing instructions.

2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Mark CHARACTER_TOKEN with character reference
        as such, for the support of XML parse error.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Parse error if CDATA section is not closed or
        is placed outside of the root element.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Raise a parse error for XML "]]>" other than
        CDATA section end.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Support for case-insensitive XML attribute
        names.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Dumper.pm: Typo fixed.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Dumper.pm: New module.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Introduced "in_xml" flag for CDATA section
        support in XML.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Make *_TOKEN (token type constants)
        exportable.  New token types, PI_TOKEN for XML and ABORT_TOKEN for
        document.write() or incremental parsing, are added for future
        extensions.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: New file.

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pm (get_inner_html): Don't escape |"| in
        content (HTML5 revision 1592).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pm (get_inner_html): Append "\n" after the start
        tag of a |listing| element (HTML5 revision 1675).

2008-03-02  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pm (get_inner_html): Typo fixed.

2008-03-01  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pm (get_inner_html): Escape NBSP (HTML5 revision
        1277).

2007-11-11  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pod: New file.

        * Makefile: New file.

2007-11-11  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pm: New module (split from ../HTML.pm.src).

2007-11-11  Wakaba  <wakaba@suika.fam.cx>

        * ChangeLog: New file.
        
