Contents of /markup/html/whatpm/Whatpm/HTML/ChangeLog

2009-09-05  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src (_get_next_token): Implemented the "comment end
        bang state" (HTML5 revision 3191).

2009-08-16  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Any "<" character in an attribute name is now
        a parse error (HTML5 revision 3354).

2009-08-16  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Lowercase-fold doctype names (HTML5 revision
        2501, cf. HTML5 revision 3571).

2009-07-05  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Reduced the number of parse errors on broken
        DOCTYPE (HTML5 revision 3121).

2009-07-03  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Reduced a parse error (HTML5 revision 3194).

2009-07-03  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: "<" in unquoted attribute values is now
        treated as a parse error (HTML5 revision 3206).

2008-11-07  Wakaba  <wakaba@suika.fam.cx>

        * Dumper.pm (dumptree): Support for namespace abbreviation for
        SWML namespaces.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Normalize white space characters in attribute
        value literals in XML documents.  Don't apply character reference
        mapping table for non-NULL non-surrogate code points.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Set the "stop_processing" flag true when a
        parameter entity occurs in a standalone="no" document.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Column number counting fixed.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Raise a parse error for '&' that does not
        introduce a reference in XML.  Support for non-ASCII entity
        reference names.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Make uppercase "&#X" in XML a parse error.
        Remove the limitation of entity name length.  Enable replacement
        of text-only general entities.  Raise a parse error for an
        unparsed entity reference.  Raise a parse error for a general
        entity reference to an undefined entity.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Support for <!ELEMENT>.
        (AFTER_NOTATION_NAME_STATE): Renamed as |AFTER_MD_DEF_STATE| (i.e.
        after markup declaration definition state).

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Support for EntityValue.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Dumper.pm: Dump text content of Entity nodes.

        * Tokenizer.pm.src: Support for <!ENTITY ... NDATA>.

2008-10-19  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src (_get_next_token): Make keywords 'ENTITY',
        'ELEMENT', 'ATTLIST', and 'NOTATION' ASCII case-insensitive.

2008-10-18  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Modified PUBLIC/SYSTEM identifier tokenizer
        states such that <!ENTITY> and <!NOTATION> can be tokenized by
        those states as well.
        (BOGUS_MD_STATE): A new state; used for bogus markup declarations,
        in favor of BOGUS_COMMENT_STATE.

2008-10-18  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: <!ATTLIST> in the internal subset of an XML
        document is now fully implemented.

        * Dumper.pm (dumptree): Output allowed tokens and default value
        always.

2008-10-17  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: New token types ATTLIST_TOKEN, ELEMENT_TOKEN,
        GENERAL_ENTITY_TOKEN, PARAMETER_ENTITY_TOKEN, and NOTATION_TOKEN
        are added.  New insertion modes for markup declarations are added.

2008-10-16  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: New token type END_OF_DOCTYPE_TOKEN added.
        New states DOCTYPE_TAG_STATE and
        BOGUS_DOCTYPE_INTERNAL_SUBSET_AFTER_STATE are added.  (A bogus
        string after the internal subset, which was handled by the state
        BOGUS_DOCTYPE_STATE, is now handled by the new state.)  Support
        for comments, bogus comments, and processing instructions in the
        internal subset.  If there is an internal subset, then emit the
        doctype token before the internal subset (with its
        $token->{has_internal_subset} flag set) and an
        END_OF_DOCTYPE_TOKEN after the internal subset.

2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: $self->{s_kwd} for non-DATA_STATE states is
        renamed as $self->{kwd} to avoid conflicts.  Don't raise
        case-sensitivity error for the keyword "DOCTYPE" in HTML mode.
        Support for internal subsets (internal subset itself only; no
        declaration in them is supported yet).  Raise a parse error for
        non-uppercase keywords "PUBLIC" and "SYSTEM" in XML mode.  Raise a
        parse error if no system identifier is specified for a DOCTYPE
        declaration with a public identifier.  Don't close the DOCTYPE
        declaration by a ">" character in the system declaration in XML
        mode.
        
2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Set an index attribute on each attribute
        token, so that namespaced duplicate attributes can be ignored at
        the XML namespace parser layer.  Raise a parse error if the
        attribute value is omitted, in XML mode.  Raise a parse error if
        the attribute value is not quoted, in XML mode.  Raise a parse
        error if a "<" character is found in a quoted attribute value, in
        XML mode.

2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: XML tag name start character support for end
        tags.  Support for the short end tag syntax of XML5.  Raise a
        parse error for a lowercase <!doctype> in XML.

2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: XML tag name start character support for start
        tags.

2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Support for XML processing instructions.

2008-10-15  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Mark CHARACTER_TOKEN with character reference
        as such, for the support of XML parse error.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Parse error if CDATA section is not closed or
        is placed outside of the root element.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Raise a parse error for XML "]]>" other than
        CDATA section end.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Support for case-insensitive XML attribute
        names.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Dumper.pm: Typo fixed.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Dumper.pm: New module.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Introduced "in_xml" flag for CDATA section
        support in XML.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: Make *_TOKEN (token type constants)
        exportable.  New token types, PI_TOKEN for XML and ABORT_TOKEN for
        document.write() or incremental parsing, are added for future
        extensions.

2008-10-14  Wakaba  <wakaba@suika.fam.cx>

        * Tokenizer.pm.src: New file.

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pm (get_inner_html): Don't escape |"| in
        content (HTML5 revision 1592).

2008-05-24  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pm (get_inner_html): Append "\n" after the start
        tag of a |listing| element (HTML5 revision 1675).

2008-03-02  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pm (get_inner_html): Typo fixed.

2008-03-01  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pm (get_inner_html): Escape NBSP (HTML5 revision
        1277).

2007-11-11  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pod: New file.

        * Makefile: New file.

2007-11-11  Wakaba  <wakaba@suika.fam.cx>

        * Serializer.pm: New module (split from ../HTML.pm.src).

2007-11-11  Wakaba  <wakaba@suika.fam.cx>

        * ChangeLog: New file.
        
