Log of /markup/html/whatpm/Whatpm/Charset/UnicodeChecker.pm


Revision 1.9
Mon Sep 22 06:04:29 2008 UTC (16 years, 9 months ago) by wakaba
Branch: MAIN
CVS Tags: HEAD
Changes since 1.8: +1 -1 lines
++ whatpm/t/ChangeLog	22 Sep 2008 05:59:48 -0000
	* tokenizer-test-1.test: Test data for invalid character references
	added (cf. HTML5 revision 2138).

	* tokenizer-test-2.dat: Test data for U+000B updated (HTML5
	revision 2138).

2008-09-22  Wakaba  <wakaba@suika.fam.cx>

++ whatpm/Whatpm/ChangeLog	22 Sep 2008 06:02:01 -0000
2008-09-22  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Character references for non-space C0 characters
	(including U+000B VT), the DEL character, and noncharacter code
	points are now converted to the U+FFFD character (cf. HTML5
	revision 2138).
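
As an illustration only, the replacement rule described in the HTML.pm.src entry above can be sketched as follows. The sketch is in Python rather than the module's Perl, it is not taken from HTML.pm.src, and the names `is_replaced` and `resolve_char_ref` are hypothetical. It covers only the three classes named in the entry; the rule in HTML5 revision 2138 may include further cases.

```python
REPLACEMENT = "\uFFFD"

# C0 controls that HTML5 treats as space characters: TAB, LF, FF, CR.
C0_SPACE = {0x09, 0x0A, 0x0C, 0x0D}


def is_replaced(cp: int) -> bool:
    """Return True if a character reference to cp falls in one of the
    classes named in the ChangeLog entry."""
    # Non-space C0 control characters (this includes U+000B VT).
    if cp < 0x20 and cp not in C0_SPACE:
        return True
    # The DEL character.
    if cp == 0x7F:
        return True
    # Noncharacters U+FDD0..U+FDEF.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # Noncharacters U+xxFFFE and U+xxFFFF, the last two code points of each plane.
    return cp <= 0x10FFFF and (cp & 0xFFFE) == 0xFFFE


def resolve_char_ref(cp: int) -> str:
    """Map a numeric character reference to its resulting character.
    Values above U+10FFFF are not handled in this sketch."""
    return REPLACEMENT if is_replaced(cp) else chr(cp)
```

With these definitions, `resolve_char_ref(0x0B)` yields U+FFFD, while `resolve_char_ref(0x0A)` yields a line feed.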


Revision 1.8
Mon Sep 15 07:19:03 2008 UTC (16 years, 10 months ago) by wakaba
Branch: MAIN
Changes since 1.7: +58 -16 lines
++ whatpm/Whatpm/ChangeLog	15 Sep 2008 07:17:34 -0000
	* HTML.pm.src: Remove checks for control characters, surrogate
	code points, noncharacter code points, and non-Unicode code
	points (they should be handled by Whatpm::Charset::UnicodeChecker).
	(parse_char_stream): Support for the |$get_wrapper| argument and
	character stream error handlers.

2008-09-15  Wakaba  <wakaba@suika.fam.cx>

++ whatpm/Whatpm/Charset/ChangeLog	15 Sep 2008 07:18:45 -0000
	* DecodeHandle.pm (onerror): Return |undef| if no explicit value
	is set.

	* UnicodeChecker.pm: Support for HTML5 parse errors.
	(onerror): Return |undef| if no explicit value is set.

2008-09-15  Wakaba  <wakaba@suika.fam.cx>


Revision 1.7
Mon Sep 15 00:49:09 2008 UTC (16 years, 10 months ago) by wakaba
Branch: MAIN
Changes since 1.6: +124 -121 lines
++ whatpm/Whatpm/Charset/ChangeLog	15 Sep 2008 00:48:59 -0000
2008-09-15  Wakaba  <wakaba@suika.fam.cx>

	* UnicodeChecker.pm: Use hash for better performance.


Revision 1.6
Sun Sep 14 11:57:42 2008 UTC (16 years, 10 months ago) by wakaba
Branch: MAIN
Changes since 1.5: +70 -39 lines
++ whatpm/Whatpm/ChangeLog	14 Sep 2008 11:56:24 -0000
	* HTML.pm.src: Use |read| instead of |getc|.  |set_inner_html|
	now reports character errors.

2008-09-14  Wakaba  <wakaba@suika.fam.cx>

++ whatpm/Whatpm/Charset/ChangeLog	14 Sep 2008 11:57:38 -0000
	* DecodeHandle.pm (CharString onerror): New method.

	* UnicodeString.pm (read): New.
	(getc): Removed.
	(manakai_read_until): Checking operation implemented.

2008-09-14  Wakaba  <wakaba@suika.fam.cx>


Revision 1.5
Sun Sep 14 03:07:58 2008 UTC (16 years, 10 months ago) by wakaba
Branch: MAIN
Changes since 1.4: +10 -8 lines
++ whatpm/Whatpm/ChangeLog	14 Sep 2008 03:06:56 -0000
	* HTML.pm.src: Change |{getc_until}| to |{read_until}|
	and |manakai_getc_until| to |manakai_read_until| to
	reduce the number of string copies.

2008-09-14  Wakaba  <wakaba@suika.fam.cx>

++ whatpm/Whatpm/Charset/ChangeLog	14 Sep 2008 03:07:37 -0000
	* DecodeHandle.pm, UnicodeChecker.pm: Change |manakai_getc_until|
	to |manakai_read_until| to reduce the number of string copies.

2008-09-14  Wakaba  <wakaba@suika.fam.cx>


Revision 1.4
Sun Sep 14 01:51:08 2008 UTC (16 years, 10 months ago) by wakaba
Branch: MAIN
Changes since 1.3: +17 -1 lines
++ whatpm/Whatpm/ChangeLog	14 Sep 2008 01:47:27 -0000
2008-09-14  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (parse_char_string): Use the newly created
	|Whatpm::Charset::DecodeHandle::CharString| instead of Perl's
	standard feature to |open| a string as a filehandle,
	since Perl's string filehandles do not seem to support the
	|ungetc| method correctly.
	(parse_char_stream): Define |{getc_until}| method.
	(DATA_STATE): Experimental support for |getc_until| feature.

++ whatpm/Whatpm/Charset/ChangeLog	14 Sep 2008 01:50:52 -0000
2008-09-14  Wakaba  <wakaba@suika.fam.cx>

	* DecodeHandle.pm (CharString): New class.
	(Encode read): Don't remove the read string from |{char_buffer}|,
	to decrease the number of string operations and to enable
	|manakai_getc_until| ungetc'ing without any string operation.
	(manakai_getc_until): New method.

	* UnicodeChecker.pm (getc): Don't |read| more than one
	character, so that no characters are left buffered and
	mixing |getc| and |manakai_getc_until| calls does not
	break the result.


Revision 1.3
Thu Sep 11 12:09:38 2008 UTC (16 years, 10 months ago) by wakaba
Branch: MAIN
Changes since 1.2: +1 -1 lines
++ whatpm/Whatpm/Charset/ChangeLog	11 Sep 2008 12:09:15 -0000
	* UnicodeChecker.pm, DecodeHandle.pm: Try to reduce the
	number of string copies and method calls, first round.

2008-09-11  Wakaba  <wakaba@suika.fam.cx>


Revision 1.2
Thu Sep 11 09:55:56 2008 UTC (16 years, 10 months ago) by wakaba
Branch: MAIN
Changes since 1.1: +10 -1 lines
++ whatpm/Whatpm/Charset/ChangeLog	11 Sep 2008 09:55:54 -0000
	* UnicodeChecker.pm, DecodeHandle.pm: Tentative support
	for |read| method.

2008-09-11  Wakaba  <wakaba@suika.fam.cx>


Revision 1.1
Thu Sep 11 09:12:27 2008 UTC (16 years, 10 months ago) by wakaba
Branch: MAIN
++ whatpm/Whatpm/ChangeLog	11 Sep 2008 09:12:24 -0000
2008-09-11  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Methods now accept an additional parameter,
	|$get_wrapper|, which can be used to insert a wrapper between
	the character stream handle and the tokenizer.  (It is currently
	not supported for |set_inner_html| on |Element|s.)
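
The wrapper idea in the entry above can be sketched roughly as follows. The sketch is in Python rather than the module's Perl and does not reproduce the actual Whatpm interface: `CheckingWrapper`, its `onerror` callback signature, and the stand-in `parse_char_stream` are all hypothetical names chosen for illustration. It assumes the wrapper exposes the same `read` method as the handle it wraps.

```python
import io


class CheckingWrapper:
    """Hypothetical wrapper: passes characters through unchanged and
    reports non-space control characters to an error callback."""

    def __init__(self, handle, onerror=None):
        self._handle = handle
        self._onerror = onerror

    def read(self, n=-1):
        chunk = self._handle.read(n)
        if self._onerror is not None:
            for ch in chunk:
                cp = ord(ch)
                # Report DEL and C0 controls other than TAB, LF, FF, CR.
                if cp == 0x7F or (cp < 0x20 and ch not in "\t\n\f\r"):
                    self._onerror(cp)
        return chunk


def parse_char_stream(handle, get_wrapper=None):
    """Stand-in for a parser entry point: if get_wrapper is given, the
    handle is wrapped before the (here trivial) tokenizer reads from it."""
    if get_wrapper is not None:
        handle = get_wrapper(handle)
    return handle.read()  # a real tokenizer would consume this incrementally


errors = []
text = parse_char_stream(
    io.StringIO("a\x0bb"),
    get_wrapper=lambda h: CheckingWrapper(h, onerror=errors.append),
)
```

Because the wrapper is supplied by the caller, the parser itself does not need to know which checks are performed on the stream.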

++ whatpm/Whatpm/Charset/ChangeLog	11 Sep 2008 09:10:24 -0000
2008-09-11  Wakaba  <wakaba@suika.fam.cx>

	* UnicodeChecker.pm: New module.

