Module: @QName: MCharset|Encode @FullName: @@lang: en @@@: Manakai Charset Encode Module @Namespace: http://suika.fam.cx/~wakaba/archive/2005/manakai/Charset/Encode/ @enDesc: The is ... {TODO:: This module is subject to change. } @DISCore:author: DISCore|Wakaba @License: license|Perl+MPL @Date: $Date: 2007/07/29 08:04:40 $ @Require: @@Module: @@@QName: MDOM|DOMFeature @@Module: @@@QName: DISlib|Charset @@Module: @@@QName: MDOM|DOMCore @@Module: @@@QName: MDOM|DOMString Namespace: @c: http://suika.fam.cx/~wakaba/archive/2004/8/18/dom-core# @cs: http://suika.fam.cx/~wakaba/archive/2004/dis/Charset/ @dis: http://suika.fam.cx/~wakaba/archive/2004/8/18/lang#dis-- @DISlib: http://suika.fam.cx/~wakaba/archive/2004/dis/ @domperl: http://suika.fam.cx/~wakaba/archive/2006/dom/perl/ @dx: http://suika.fam.cx/~wakaba/archive/2005/manakai/Util/Error/DOMException# @f: http://suika.fam.cx/~wakaba/archive/2004/dom/feature# @fe: http://suika.fam.cx/www/2006/feature/ @icharset: urn:x-suika-fam-cx:charset: @idl: http://suika.fam.cx/~wakaba/archive/2004/dis/IDL# @kwd: http://suika.fam.cx/~wakaba/archive/2005/rfc2119/ @lang: http://suika.fam.cx/~wakaba/archive/2004/8/18/lang# @license: http://suika.fam.cx/~wakaba/archive/2004/8/18/license# @ManakaiDOM: http://suika.fam.cx/~wakaba/archive/2004/8/18/manakai-dom# @MCharset: http://suika.fam.cx/~wakaba/archive/2005/manakai/Charset/ @mce: http://suika.fam.cx/~wakaba/archive/2005/manakai/Charset/Encode/ @MDOM: http://suika.fam.cx/~wakaba/archive/2004/8/18/manakai-dom#ManakaiDOM. @mn: http://suika.fam.cx/~wakaba/archive/2005/manakai/Util/ManakaiNode# @rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# @rdfs: http://www.w3.org/2000/01/rdf-schema# @str: http://suika.fam.cx/~wakaba/archive/2005/manakai/DOM/DOMString/ @test: http://suika.fam.cx/~wakaba/archive/2004/dis/Test# @xml-auto-charset: http://suika.fam.cx/www/2006/03/xml-entity/ ResourceDef: @QName: MCharset| @rdf:type: dis|ModuleGroup @FullName: @@lang:en @@@: The manakai Charset modules @DISPerl:packageName: Message::Charset:: @DISPerl:interfacePackageName: Message::Charset::IF:: ## -- Features FeatureDef: @QName: MCEncodeFeature @featureQName: fe|MCEncode @FeatureVerDef: @@QName: MCEncodeFeature10 @@f:instanceOf: MCEncodeFeature @@f:version: 1.0 @@enDesc: The manakai Charset Encode Module, version 1.0 ElementTypeBinding: @Name: FeatureDef @ElementType: dis:ResourceDef @ShadowContent: @@rdf:type: f|Feature ElementTypeBinding: @Name: FeatureVerDef @ElementType: dis:ResourceDef @ShadowContent: @@rdf:type: f|Feature ElementTypeBinding: @Name: featureQName @ElementType: f:name @ShadowContent: @@ContentType: DISCore|QName ElementTypeBinding: @Name: IFQName @ElementType: dis:QName @ShadowContent: @@ForCheck: ManakaiDOM|ForIF ElementTypeBinding: @Name: ClsQName @ElementType: dis:QName @ShadowContent: @@ForCheck: ManakaiDOM|ForClass ElementTypeBinding: @Name: IFISA @ElementType: dis:ISA @ShadowContent: @@ForCheck: ManakaiDOM|ForIF ElementTypeBinding: @Name: ClsISA @ElementType: dis:ISA @ShadowContent: @@ForCheck: ManakaiDOM|ForClass ElementTypeBinding: @Name: IFClsDef @ElementType: dis:ResourceDef @ShadowContent: @@rdf:type: @@@@: dis|MultipleResource @@@ForCheck: !ManakaiDOM|ForIF !ManakaiDOM|ForClass @@resourceFor: ManakaiDOM|ForIF @@resourceFor: @@@@: ManakaiDOM|ForClass @@rdf:type: @@@@: DISLang|Interface @@@ForCheck: ManakaiDOM|ForIF @@rdf:type: @@@@: DISLang|Class @@@ForCheck: ManakaiDOM|ForClass @@Implement: @@@@: ||+||ManakaiDOM|ForIF @@@ContentType: DISCore|TFPQNames @@@ForCheck: ManakaiDOM|ForClass @@f:implements: @@@@: MCEncodeFeature10 ElementTypeBinding: @Name: ClsDef @ElementType: dis:ResourceDef @ShadowContent: @@rdf:type: @@@@: dis|MultipleResource @@@ForCheck: !ManakaiDOM|ForIF !ManakaiDOM|ForClass @@resourceFor: @@@@: ManakaiDOM|ForClass @@rdf:type: @@@@: DISLang|Class @@@ForCheck: ManakaiDOM|ForClass @@f:implements: @@@@: MCEncodeFeature10 ElementTypeBinding: @Name: Method @ElementType: dis:ResourceDef @ShadowContent: @@rdf:type: DISLang|Method ElementTypeBinding: @Name: IntMethod @ElementType: dis:ResourceDef @ShadowContent: @@rdf:type: DISLang|Method @@ForCheck: ManakaiDOM|ForClass @@ManakaiDOM:isForInternal: 1 ElementTypeBinding: @Name: Param @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: DISLang|MethodParameter ElementTypeBinding: @Name: NamedParam @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: DISLang|MethodParameter @@DISPerl:isNamedParameter: 1 ElementTypeBinding: @Name: Return @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: DISLang|MethodReturn ElementTypeBinding: @Name: Attr @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: DISLang|Attribute ElementTypeBinding: @Name: Get @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: DISLang|AttributeGet ElementTypeBinding: @Name: Set @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: DISLang|AttributeSet ElementTypeBinding: @Name: InCase @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: ManakaiDOM|InCase ElementTypeBinding: @Name: PerlDef @ElementType: dis:Def @ShadowContent: @@ContentType: lang:Perl @@ForCheck: ManakaiDOM|ForClass ElementTypeBinding: @Name: Test @ElementType: dis:ResourceDef @ShadowContent: @@rdf:type: test|StandaloneTest @@ForCheck: ManakaiDOM|ForClass ElementTypeBinding: @Name: enDesc @ElementType: dis:Description @ShadowContent: @@lang:en ElementTypeBinding: @Name: enImplNote @ElementType: dis:ImplNote @ShadowContent: @@lang:en ElementTypeBinding: @Name: enFN @ElementType: dis:FullName @ShadowContent: @@lang:en ## -- Implementation IFClsDef: @IFQName: MCEncodeImplementation @ClsQName: ManakaiMCEncodeImplementation @domperl:implementedByObjectsOf: c|DOMImplementation @domperl:classImplementedByObjectsOf: c|ManakaiDOMImplementation @enDesc: The interface provides factory methods to create objects. @Test: @@enDesc: The implementation registry should know this class when the module is loaded. @@PerlDef: I: { for my $impl (@{$Message::DOM::ImplementationRegistry ->get_dom_implementation_list ({ => '1.0'})}) { if ($impl->isa ()) { last I; } } $test->assert_never; } # I @Method: @@Name: createMCDecodeHandle @@enDesc: Creates an object. @@Param: @@@Name: charset @@@Type: String @@@enDesc: The URI that identifies the charset in which strings are written to the . @@Param: @@@Name: byteStream @@@Type: DISPerl|Filehandle||ManakaiDOM|all @@@enDesc: A reference to the filehandle that contains the byte stream read by the object. @@Param: @@@Name: onerror @@@Type: DISPerl|CODE||ManakaiDOM|all @@@enDesc: A subroutine that is called back when an error is encountered. The attribute of the created object is set to this parameter value. @@@nullCase: @@@@enDesc: The attribute is set to a do-nothing subroutine. @@Return: @@@Type: MCDecodeHandle @@@enDesc: The newly created filehandle object. @@@nullCase: @@@@enDesc: If the implementation does not support . @@@PerlDef: my $csdef = $Message::Charset::Encode::CharsetDef->{$charset}; my $obj = { => $byteStream, => $charset, => [], => '', => $onerror || sub {}}; if ($csdef->{uri}->{} or $charset eq ) { __DEEP{ my $b = ''; $csdef = $Message::Charset::Encode::CharsetDef ->{}; # UTF-8 with no BOM $obj->{} = 'utf-8'; if (read $obj->{}, $b, 256) { no warnings "substr"; no warnings "uninitialized"; if (substr ($b, 0, 1) eq "<") { if (substr ($b, 1, 1) eq "?") { # ASCII8 __CODE{XMLEntity.guess:: $ascii => $b, $errorCondition => { not $csdef->{} or $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-8', $restoreBOM => {}, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } elsif (substr ($b, 1, 1) eq "\x00") { if (substr ($b, 2, 2) eq "?\x00") { # ASCII16LE my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-8', $restoreBOM => {}, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } elsif (substr ($b, 2, 2) eq "\x00\x00") { # ASCII32Endian4321 my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-8', $restoreBOM => {}, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } } } elsif (substr ($b, 0, 3) eq "\xEF\xBB\xBF") { # UTF8 $obj->{} = true; substr ($b, 0, 3) = ''; my $c = $b; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or not $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-8', $restoreBOM => {}, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } elsif (substr ($b, 0, 2) eq "\x00<") { if (substr ($b, 2, 2) eq "\x00?") { # ASCII16BE my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-8', $restoreBOM => {}, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } elsif (substr ($b, 2, 2) eq "\x00\x00") { # ASCII32Endian3412 my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-8', $restoreBOM => {}, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } } elsif (substr ($b, 0, 2) eq "\xFE\xFF") { if (substr ($b, 2, 2) eq "\x00<") { # ASCII16BE $obj->{} = true; substr ($b, 0, 2) = ''; my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or not $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-16', $restoreBOM => {}, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } elsif (substr ($b, 2, 2) eq "\x00\x00") { # ASCII32Endian3412 $obj->{} = true; substr ($b, 0, 4) = ''; my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} or not $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-16', $restoreBOM => { $obj->{} .= "\x00\x00"; }, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } else { $csdef = $Message::Charset::Encode::CharsetDef ->{}; $obj->{} = 'utf-16'; substr ($b, 0, 2) = ''; $obj->{} = true; } } elsif (substr ($b, 0, 2) eq "\xFF\xFE") { if (substr ($b, 2, 2) eq "<\x00") { # ASCII16LE $obj->{} = true; substr ($b, 0, 2) = ''; my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or not $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-16', $restoreBOM => {}, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } elsif (substr ($b, 2, 2) eq "\x00\x00") { # ASCII32Endian4321 $obj->{} = true; substr ($b, 0, 4) = ''; my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} or not $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-16', $restoreBOM => { $obj->{} .= "\x00\x00"; }, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } else { $csdef = $Message::Charset::Encode::CharsetDef ->{}; $obj->{} = 'utf-16'; substr ($b, 0, 2) = ''; $obj->{} = true; } } elsif (substr ($b, 0, 2) eq "\x00\x00") { if (substr ($b, 2, 2) eq "\x00<") { # ASCII32Endian1234 my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-8', $restoreBOM => {}, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } elsif (substr ($b, 2, 2) eq "<\x00") { # ASCII32Endian2143 my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-8', $restoreBOM => {}, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } elsif (substr ($b, 2, 2) eq "\xFE\xFF") { # ASCII32Endian1234 $obj->{} = true; substr ($b, 0, 4) = ''; my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-8', $restoreBOM => { $obj->{} = false; $obj->{} .= "\x00\x00\xFE\xFF"; }, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } elsif (substr ($b, 2, 2) eq "\xFF\xFE") { # ASCII32Endian2143 $obj->{} = true; substr ($b, 0, 4) = ''; my $c = $b; $c =~ tr/\x00//d; __CODE{XMLEntity.guess:: $ascii => $c, $errorCondition => { not $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} or $csdef->{} }, $defaultURI => {}, $defaultName => 'utf-8', $restoreBOM => { $obj->{} = false; $obj->{} .= "\x00\x00\xFF\xFE"; }, }__; if (defined $csdef->{}) { $csdef = $Message::Charset::Encode::CharsetDef ->{$csdef->{}}; } } # \x4C\x6F\xA7\x94 EBCDIC } # buffer $obj->{} .= $b; } # read }__; } elsif ($csdef->{uri}->{}) { ## BOM is optional. __DEEP{ my $b = ''; if (read $obj->{}, $b, 3) { if ($b eq "\xEF\xBB\xBF") { $obj->{} = true; } else { $obj->{} .= $b; } } $csdef = $Message::Charset::Encode::CharsetDef ->{}; # UTF-8 with no BOM }__; } elsif ($csdef->{uri}->{}) { ## BOM is mandated. __DEEP{ my $b = ''; if (read $obj->{}, $b, 2) { if ($b eq "\xFE\xFF") { $obj->{} = true; $csdef = $Message::Charset::Encode::CharsetDef ->{}; # UTF-16BE with no BOM } elsif ($b eq "\xFF\xFE") { $obj->{} = true; $csdef = $Message::Charset::Encode::CharsetDef ->{}; # UTF-16LE with no BOM } else { $obj->{} ->($onerror, null, , charset_uri => $charset); $obj->{} = false; $obj->{} .= $b; $csdef = $Message::Charset::Encode::CharsetDef ->{}; # UTF-16BE with no BOM } } else { $obj->{} ->($onerror, null, , charset_uri => $charset); $obj->{} = false; $csdef = $Message::Charset::Encode::CharsetDef ->{}; # UTF-16BE with no BOM } }__; } if ($csdef->{uri}->{}) { $obj->{} = 'gl-jis-1997-swapped'; $obj->{} = 'gl-jis-1997'; $obj->{} = ; $r = bless $obj, ; require Encode::GLJIS1997Swapped; require Encode::GLJIS1997; undef $r unless Encode::find_encoding ($obj->{}); undef $r unless Encode::find_encoding ($obj->{}); } elsif ($csdef->{uri}->{}) { $obj->{} = 'gl-jis-1978'; $obj->{} = 'gl-jis-1983'; $obj->{} = ; $r = bless $obj, ; require Encode::GLJIS1978; require Encode::GLJIS1983; undef $r unless Encode::find_encoding ($obj->{}); undef $r unless Encode::find_encoding ($obj->{}); } elsif (defined $csdef->{}->[0]) { if ($csdef->{uri}->{} or $csdef->{uri}->{}) { $obj->{} = $csdef->{}->[0]; $r = bless $obj, ; require Encode::EUCJP1997; undef $r unless Encode::find_encoding ($obj->{}); } elsif ($csdef->{uri}->{} or $csdef->{uri}->{}) { $obj->{} = $csdef->{}->[0]; $r = bless $obj, ; require Encode::ShiftJIS1997; undef $r unless Encode::find_encoding ($obj->{}); } elsif ($csdef->{}) { $obj->{} = $csdef->{}->[0]; $r = bless $obj, ; require Encode; undef $r unless Encode::find_encoding ($obj->{}); } } unless (defined $r) { __DEEP{ $obj->{} ->($onerror, null, , charset_uri => $charset); }__; } @@CODE: @@@QName: XMLEntity.guess @@@PerlDef: if ($ascii =~ /^<\?xml\s+(?:version\s*=\s*["'][^"']*["']\s*)? encoding\s*=\s*["']([^"']*)/x) { $obj->{} = lc $1; my $__uri = $self-> (, $obj->{}); $csdef = $Message::Charset::Encode::CharsetDef->{$__uri}; if ($errorCondition) { $obj->{} ->($obj->{}, null, , charset_uri => $__uri, charset_name => $obj->{}); } } else { $csdef = $Message::Charset::Encode::CharsetDef->{$defaultURI}; $obj->{} = $defaultName; $restoreBOM; } @@Test: @@@QName: MCEncodeImpl.createMCDecodeHandle.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = 'a'; open my $fh, '>', \$byte; my $efh = $impl-> (, $fh); $test->id ('interface'); $test->assert_isa ($efh, ); $test->id ('onerr'); $test->assert_isa ($efh->, 'CODE'); @@Test: @@@QName: MCEncodeImpl.createMCXMLDecodeHandle.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = 'a'; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); $test->id ('interface'); $test->assert_isa ($efh, ); $test->id ('onerr'); $test->assert_isa ($efh->, 'CODE'); @@Test: @@@QName: MCEncodeImpl.createMCDecodeHandle.3.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $errors = 0; my $byte = 'a'; open my $fh, '>', \$byte; my $efh = $impl-> (q, $fh, sub { $errors++; $test->id ('errortype'); $test->assert_equals ($_[2], ); }); $test->id ('errors'); $test->assert_num_equals (actual_value => $errors, expected_value => 1); $test->id ('return'); $test->assert_null ($efh); @@Test: @@@QName: MCEncodeImpl.createMCDecodeHandle.4.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = 'a'; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); $test->id ('interface'); $test->assert_isa ($efh, ); $test->id ('class'); $test->assert_isa ($efh, ); $test->id ('onerr'); $test->assert_isa ($efh->, 'CODE'); @@Test: @@@QName: MCEncodeImpl.createMCDecodeHandle.5.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = 'a'; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); $test->id ('interface'); $test->assert_isa ($efh, ); $test->id ('class'); $test->assert_isa ($efh, ); $test->id ('onerr'); $test->assert_isa ($efh->, 'CODE'); @@Test: @@@QName: MCEncodeImpl.createMCDecodeHandle.6.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = 'a'; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); $test->id ('interface'); $test->assert_isa ($efh, ); $test->id ('class'); $test->assert_isa ($efh, ); $test->id ('onerr'); $test->assert_isa ($efh->, 'CODE'); @@Test: @@@QName: MCEncodeImpl.createMCDecodeHandle.7.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = 'a'; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); $test->id ('interface'); $test->assert_isa ($efh, ); $test->id ('class'); $test->assert_isa ($efh, ); $test->id ('onerr'); $test->assert_isa ($efh->, 'CODE'); @Method: @@Name: getURIFromCharsetName @@enDesc: Returns a DOM URI that identifies a charset. @@Param: @@@Name: domain @@@Type: String @@@enDesc: A DOM URI that identifies the context in which the charset name is used. @@Param: @@@Name: name @@@Type: String @@@enDesc: The charset name to convert. @@Return: @@@Type: String @@@enDesc: A DOM URI that identifies . @@@nullCase: @@@@enDesc: The implementation was unable to resolve to a URI. @@@PerlDef: if ({ => true, => true, }->{$domain}) { $name = lc $name; if ($domain eq ) { $r = . $name; } elsif ($domain eq ) { $r = . $name; } unless ($Message::Charset::Encode::CharsetDef->{$r}) { U: for my $uri (keys %$Message::Charset::Encode::CharsetDef) { for (@{$Message::Charset::Encode::CharsetDef->{$uri}->{+{ => , => , }->{$domain}} or []}) { if ($_ eq $name) { $r = $uri; last U; } } } # U } } else { $r = null; } @@Test: @@@QName: MCEncodeImpl.name2uri.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; for ( [, 'utf-8'], [, 'x-no-such-charset'], [, 'UTF-8'], [, 'uTf-8'], [, 'utf-16be'], ) { $test->id ('ietfname2uri.'.$_->[1]); my $iname = $impl-> (, $_->[1]); $test->assert_equals ($iname, $_->[0]); } for ( [, 'utf-8'], [, 'x-no-such-charset'], [, 'UTF-8'], [, 'uTf-8'], [, 'utf-16be'], ) { $test->id ('xmlname2uri.'.$_->[1]); my $iname = $impl-> (, $_->[1]); $test->assert_equals ($iname, $_->[0]); } @Method: @@Name: getCharsetNameFromURI @@enDesc: Returns a name for the charset identified by a DOM URI. @@Param: @@@Name: domain @@@Type: String @@@enDesc: A DOM URI that identifies the context in which the charset name is used. @@Param: @@@Name: uri @@@Type: String @@@enDesc: A DOM URI of the charset. @@Return: @@@Type: String @@@enDesc: A charset name that identifies . @@@nullCase: @@@@enDesc: The implementation was unable to find the charset name for the that can be used in context. @@@PerlDef: if ({ => true, => true, }->{$domain}) { $r = $Message::Charset::Encode::CharsetDef->{$uri}->{+{ => , => , }->{$domain}}->[0]; unless (defined $r) { if ($domain eq and substr ($uri, 0, length ) eq ) { $r = substr ($uri, length ); } elsif ($domain eq and substr ($uri, 0, length ) eq ) { $r = substr ($uri, length ); } } } else { $r = null; } @@Test: @@@QName: MCEncodeImpl.uri2name.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; for ( [, 'utf-8'], [, 'x-no-such-charset'], [q, null], ) { $test->id ('uri2ietfname.'.$_->[0]); my $iname = $impl-> (, $_->[0]); $test->assert_equals ($iname, $_->[1]); } for ( [, 'utf-8'], [, 'x-no-such-charset'], [q, null], ) { $test->id ('uri2xmlname.'.$_->[0]); my $iname = $impl-> (, $_->[0]); $test->assert_equals ($iname, $_->[1]); } @CODE: @@QName: createImplForTest @@PerlDef: $impl = ->_new; ##MCEncodeImplementation ElementTypeBinding: @Name: CODE @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: DISPerl|BlockCode @@ForCheck: ManakaiDOM|ForClass IFClsDef: @IFQName: MCDecodeHandle @ClsQName: ManakaiMCDecodeHandle @enDesc: An provides the read access to a character stream. @enDesc: @@ddid: cestype @@ForCheck: ManakaiDOM|ForClass @@@: The class can be used to encapsulate a byte filehandle with call into a character filehandle-like object. The encoding be stateless and signatureless. In addition, its implementation support the flag. @Attr: @@Name: charset @@enDesc: A URI that identifies the charset of the handle. @@Type: String @@Get: @@@PerlDef: $r = $self->{}; @Attr: @@Name: onerror @@enDesc: A callback function that is invoked when an error is encountered. {P:: The function will be invoked with arguments: - ::: The function itself. - ::: The object. If the error is thrown during the construction of the object, it might be instead. - ::: The DOM URI that identifies the category of the error. - ::: Named parameters depending to the . It throw an exception. } @@Type: DISPerl|CODE||ManakaiDOM|all @@Get: @@@PerlDef: $r = $self->{}; @@Set: @@@PerlDef: $self->{} = $given; @@Test: @@@QName: MCDecodeHandle.onoctetstreamerror.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = "a\xE3\x81\x82\x81a"; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); $test->id ('default'); $test->assert_isa ($efh->, 'CODE'); $test->id ('get.set'); my $sub1 = sub { return "2" }; $efh-> ($sub1); my $sub2 = $efh->; $test->assert_equals ($sub2, $sub1); $test->assert_equals ($sub2->(), "2"); @Method: @@Name: getc @@enDesc: Returns the next character from the input. @@Return: @@@Type: String @@@enDesc: The next character. @@@nullCase: @@@@enDesc: If at the end of the file, or if there was an error, in which case is set. @@@PerlDef: if (@{$self->{}}) { $r = shift @{$self->{}}; } else { __DEEP{ my $error; if ($self->{}) { if (read $self->{}, $self->{}, 256, length $self->{}) { # } else { $error = true; } $self->{} = false; } elsif (512 > length $self->{}) { read $self->{}, $self->{}, 256, length $self->{}; } unless ($error) { my $string = Encode::decode ($self->{}, $self->{}, Encode::FB_QUIET ()); if (length $string) { push @{$self->{}}, split //, $string; $r = shift @{$self->{}}; if (length $self->{}) { $self->{} = true; } } else { if (length $self->{}) { $error = true; } else { $r = null; } } } if ($error) { $r = substr $self->{}, 0, 1, ''; $self->{} ->($self->{}, $self, , octets => \$r); } }__; } @@Test: @@@QName: MCDecodeHandle.getc.1.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = "a\xE3\x81\x82\x81a"; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); my $error = null; $efh-> (sub { my ($self, $efh, $type, %opt) = @_; $error = ${$opt{octets}}; }); $test->id (1); $test->assert_equals ($efh->getc, "a"); $test->id ('1.err'); $test->assert_null ($error); $test->id (2); $test->assert_equals ($efh->getc, "\x{3042}"); $test->id ('1.err'); $test->assert_null ($error); $test->id (3); $test->assert_equals ($efh->getc, "\x81"); $test->id ('1.err'); $test->assert_equals ($error, "\x81"); $error = null; $test->id (4); $test->assert_equals ($efh->getc, "a"); $test->id ('4.err'); $test->assert_null ($error); $test->id ('eof'); $test->assert_null ($efh->getc); $test->id ('eof.err'); $test->assert_null ($error); @@Test: @@@QName: MCDecodeHandle.getc.2.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = "a" x 256; $byte .= "b" x 256; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); my $error = null; $efh-> (sub { my ($self, $efh, $type, %opt) = @_; $error = ${$opt{octets}}; }); for my $i (0..255) { $test->id ("a.$i"); $test->assert_equals ($efh->getc, "a"); $test->id ("a.$i.err"); $test->assert_null ($error); } for my $i (0..255) { $test->id ("b.$i"); $test->assert_equals ($efh->getc, "b"); $test->id ("b.$i.err"); $test->assert_null ($error); } $test->id ('eof'); $test->assert_null ($efh->getc); $test->id ('eof.err'); $test->assert_null ($error); @@Test: @@@QName: MCDecodeHandle.getc.3.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = "a" x 255; $byte .= "\xE3\x81\x82"; $byte .= "b" x 256; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); my $error = null; $efh-> (sub { my ($self, $efh, $type, %opt) = @_; $error = ${$opt{octets}}; }); for my $i (0..254) { $test->id ("a.$i"); $test->assert_equals ($efh->getc, "a"); $test->id ("a.$i.err"); $test->assert_null ($error); } $test->id ("A"); $test->assert_equals ($efh->getc, "\x{3042}"); $test->id ("A.err"); $test->assert_null ($error); for my $i (0..255) { $test->id ("b.$i"); $test->assert_equals ($efh->getc, "b"); $test->id ("b.$i.err"); $test->assert_null ($error); } $test->id ('eof'); $test->assert_null ($efh->getc); $test->id ('eof.err'); $test->assert_null ($error); @@Test: @@@QName: MCDecodeHandle.getc.4.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = "a" x 255; $byte .= "\xE3"; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); my $error = null; $efh-> (sub { my ($self, $efh, $type, %opt) = @_; $error = ${$opt{octets}}; }); for my $i (0..254) { $test->id ("a.$i"); $test->assert_equals ($efh->getc, "a"); $test->id ("a.$i.err"); $test->assert_null ($error); } $test->id ("E3"); $test->assert_equals ($efh->getc, "\xE3"); $test->id ("E3.err"); $test->assert_equals ($error, "\xE3"); $error = null; $test->id ('eof'); $test->assert_null ($efh->getc); $test->id ('eof.err'); $test->assert_null ($error); @Method: @@Name: ungetc @@enDesc: Pushes a character with the given ordinal value back onto the handle's input stream. In only one character of pushback per handle is guaranteed. @@Param: @@@Name: ord @@@Type: idl|unsignedLong||ManakaiDOM|all @@@enDesc: The ordinal value of the character to push back. @@Return: @@@PerlDef: unshift @{$self->{}}, chr $ord; @@Test: @@@QName: MCDecodeHandle.ungetc.test @@@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = "a\x{4E00}b\x{4E11}"; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh); $test->id ('1.getc'); $test->assert_equals ($efh->getc, "a"); $test->id ('1.ungetc'); $efh->ungetc (ord "a"); $test->assert_equals ($efh->getc, "a"); $test->id ('2.getc'); $test->assert_equals ($efh->getc, "\x{4E00}"); $test->id ('2.ungetc'); $efh->ungetc (ord "\x{4E00}"); $test->assert_equals ($efh->getc, "\x{4E00}"); $test->id ('3.getc'); $test->assert_equals ($efh->getc, "b"); $test->id ('4.getc'); $test->assert_equals ($efh->getc, "\x{4E11}"); $test->id ('4.ungetc'); $efh->ungetc (ord "\x{4E11}"); $test->assert_equals ($efh->getc, "\x{4E11}"); @Attr: @@Name: inputEncoding @@enDesc: The name of the input charset. @@Type: String @@Get: @@@enDesc: If there is a string looks like encoding declaration, then the value of it, in lowercase. Otherwise and there is the UTF-16 , then . Otherwise, . @@@nullCase: @@@@enDesc: If the charset is different from . @@@PerlDef: $r = $self->{}; @Attr: @@Name: hasBOM @@enDesc: Whether the decoder detected the or not. @@Type: idl|boolean||ManakaiDOM|all @@Get: @@@TrueCase: @@@@enDesc: If there is the . @@@FalseCase: @@@@enDesc: Either if there is no , the decoder not reached to the end of the , or the decoder implementation does not provide whether there is the or not. @@@PerlDef: $r = $self->{}; @Test: @@QName: MCDecodeHandle.utf-8-optional-bom.1.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xEF\xBB\xBFabc>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id (1); $test->assert_equals ($efh->getc, "a"); $test->id (2); $test->assert_equals ($efh->getc, "b"); $test->id (3); $test->assert_equals ($efh->getc, "c"); $test->id (4); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-8-optional-bom.2.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id (1); $test->assert_equals ($efh->getc, "a"); $test->id (2); $test->assert_equals ($efh->getc, "b"); $test->id (3); $test->assert_equals ($efh->getc, "c"); $test->id (4); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_false ($efh->); @Test: @@QName: MCDecodeHandle.utf-8-optional-bom.3.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xEF\xBB\xBF\xEF\xBB\xBFabc>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id ('zwnbsp'); $test->assert_equals ($efh->getc, "\x{FEFF}"); $test->id (1); $test->assert_equals ($efh->getc, "a"); $test->id (2); $test->assert_equals ($efh->getc, "b"); $test->id (3); $test->assert_equals ($efh->getc, "c"); $test->id (4); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-8-optional-bom.4.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xEF\xBB\xBF>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id (1); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-8-optional-bom.5.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id (1); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_false ($efh->); @Test: @@QName: MCDecodeHandle.utf-8-optional-bom.6.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id (1); $test->assert_equals ($efh->getc, "a"); $test->id (2); $test->assert_equals ($efh->getc, "b"); $test->id (3); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_false ($efh->); @Test: @@QName: MCDecodeHandle.utf-8-optional-bom.7.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id (1); $test->assert_equals ($efh->getc, "a"); $test->id (2); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_false ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.1.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xFE\xFF\x4E\x00\x00a>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id (1); $test->assert_equals ($efh->getc, "\x{4E00}"); $test->id (2); $test->assert_equals ($efh->getc, "a"); $test->id (3); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.2.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xFF\xFE\x00\x4Ea\x00>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id (1); $test->assert_equals ($efh->getc, "\x{4E00}"); $test->id (2); $test->assert_equals ($efh->getc, "a"); $test->id (3); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.3.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xFE\xFF\x00a>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id (1); $test->assert_equals ($efh->getc, "a"); $test->id (2); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.4.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xFF\xFEa\x00>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = true }); $test->id (1); $test->assert_equals ($efh->getc, "a"); $test->id (2); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_false ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.5.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xFE\xFFa>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = $_[2] }); $test->id ('0.error'); $test->assert_null ($error); $test->id (1); $test->assert_equals ($efh->getc, "a"); $test->id ('1.error'); $test->assert_equals ($error, ); $error = null; $test->id (2); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_null ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.6.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xFF\xFEa>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = $_[2] }); $test->id ('0.error'); $test->assert_null ($error); $test->id (1); $test->assert_equals ($efh->getc, "a"); $test->id ('1.error'); $test->assert_equals ($error, ); $error = null; $test->id (2); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_null ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.7.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xFE\xFF>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = $_[2] }); $test->id (1); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_null ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.8.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xFF\xFE>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = $_[2] }); $test->id (1); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_null ($error); $test->id ('bom'); $test->assert_true ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.9.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xFD\xFF>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = $_[2] }); $test->id ('no.bom'); $test->assert_equals ($error, ); $error = null; $test->id (1); $test->assert_equals ($efh->getc, "\x{FDFF}"); $test->id (2); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_null ($error); $test->id ('bom'); $test->assert_false ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.10.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<\xFD>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = $_[2] }); $test->id ('no.bom'); $test->assert_equals ($error, ); $error = null; $test->id (1); $test->assert_equals ($efh->getc, "\xFD"); $test->id ('1.error'); $test->assert_equals ($error, ); $error = null; $test->id (2); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_null ($error); $test->id ('bom'); $test->assert_false ($efh->); @Test: @@QName: MCDecodeHandle.utf-16-with-bom.11.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my $byte = qq<>; my $error; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { $error = $_[2] }); $test->id ('no.bom'); $test->assert_equals ($error, ); $error = null; $test->id (1); $test->assert_null ($efh->getc); $test->id ('err'); $test->assert_null ($error); $test->id ('bom'); $test->assert_false ($efh->); @Test: @@QName: MCDecodeHandle.xml.1.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my @testdata = ( { id => q, in => q<>, out => [null], name => 'utf-8', bom => false, }, { id => q, in => "a", out => [null, "a", null], name => 'utf-8', bom => false, }, { id => q, in => "\xEF\xBB\xBF", out => [null], name => 'utf-8', bom => true, }, { id => q, in => "\xEF\xBB\xBFa", out => [null, "a", null], name => 'utf-8', bom => true, }, { id => q, in => "\xEF\xBB\xBF\xEF\xBB\xBF", out => [null, "\x{FEFF}", null], name => 'utf-8', bom => true, }, { id => q, in => "\xFE\xFF", out => [null], name => 'utf-16', bom => true, }, { id => q, in => "\xFF\xFE", out => [null], name => 'utf-16', bom => true, }, { id => q, in => "\xFE\xFFa", out => [null, "a", []], name => 'utf-16', bom => true, }, { id => q, in => "\xFF\xFEa", out => [null, "a", []], name => 'utf-16', bom => true, }, { id => q, in => "\xFE\xFF\x4E\x00", out => [null, "\x{4E00}", null], name => 'utf-16', bom => true, }, { id => q, in => "\xFF\xFE\x00\x4E", out => [null, "\x{4E00}", null], name => 'utf-16', bom => true, }, { id => q, in => "\xFE\xFF\x00<", out => [null, "<", null], name => 'utf-16', bom => true, }, { id => q, in => "\xFF\xFE<\x00", out => [null, "<", null], name => 'utf-16', bom => true, }, { id => q, in => "\xFE\xFF\xFE\xFF", out => [null, "\x{FEFF}", null], name => 'utf-16', bom => true, }, { id => q, in => "\xFF\xFE\xFF\xFE", out => [null, "\x{FEFF}", null], name => 'utf-16', bom => true, }, { id => q, in => "\xFE\xFF\x00\x00", out => [null, "\x00", null], name => 'utf-16', bom => true, }, { id => q, in => "\xFF\xFE\x00\x00", out => [null, "\x00", null], name => 'utf-16', bom => true, }, { id => q, in => "\xFE\xFF\x00<\x00?", out => [null, "<", null, "?", null], name => 'utf-16', bom => true, }, { id => q, in => "\xFF\xFE<\x00?\x00", out => [null, "<", null, "?", null], name => 'utf-16', bom => true, }, { id => q, in => qq[\xFE\xFF\x00<\x00?\x00x\x00m\x00l\x00 \x00v\x00e\x00r]. qq[\x00s\x00i\x00o\x00n\x00=\x00"\x001\x00.\x000\x00"]. qq[\x00 \x00e\x00n\x00c\x00o\x00d\x00i\x00n\x00g\x00=]. qq[\x00"\x00u\x00t\x00f\x00-\x001\x006\x00"\x00?\x00>], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "v", null, "e", null, "r", null, "s", null, "i", null, "o", null, "n", null, "=", null, '"', null, "1", null, ".", null, "0", null, '"', null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "t", null, "f", null, "-", null, "1", null, "6", null, '"', null, "?", null, ">", null], name => 'utf-16', bom => true, }, { id => q, in => qq[\xFF\xFE<\x00?\x00x\x00m\x00l\x00 \x00v\x00e\x00r]. qq[\x00s\x00i\x00o\x00n\x00=\x00"\x001\x00.\x000\x00"]. qq[\x00 \x00e\x00n\x00c\x00o\x00d\x00i\x00n\x00g\x00=]. qq[\x00"\x00u\x00t\x00f\x00-\x001\x006\x00"\x00?\x00>\x00], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "v", null, "e", null, "r", null, "s", null, "i", null, "o", null, "n", null, "=", null, '"', null, "1", null, ".", null, "0", null, '"', null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "t", null, "f", null, "-", null, "1", null, "6", null, '"', null, "?", null, ">", null], name => 'utf-16', bom => true, }, { id => q, in => qq[\x00<\x00?\x00x\x00m\x00l\x00 \x00v\x00e\x00r]. qq[\x00s\x00i\x00o\x00n\x00=\x00"\x001\x00.\x000\x00"]. qq[\x00 \x00e\x00n\x00c\x00o\x00d\x00i\x00n\x00g\x00=]. qq[\x00"\x00u\x00t\x00f\x00-\x001\x006\x00b\x00e\x00"\x00?\x00>], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "v", null, "e", null, "r", null, "s", null, "i", null, "o", null, "n", null, "=", null, '"', null, "1", null, ".", null, "0", null, '"', null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "t", null, "f", null, "-", null, "1", null, "6", null, "b", null, "e", null, '"', null, "?", null, ">", null], name => 'utf-16be', bom => false, }, { id => q, in => qq[<\x00?\x00x\x00m\x00l\x00 \x00v\x00e\x00r]. qq[\x00s\x00i\x00o\x00n\x00=\x00"\x001\x00.\x000\x00"]. qq[\x00 \x00e\x00n\x00c\x00o\x00d\x00i\x00n\x00g\x00=]. qq[\x00"\x00u\x00t\x00f\x00-\x001\x006\x00l\x00e\x00"]. qq[\x00?\x00>\x00], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "v", null, "e", null, "r", null, "s", null, "i", null, "o", null, "n", null, "=", null, '"', null, "1", null, ".", null, "0", null, '"', null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "t", null, "f", null, "-", null, "1", null, "6", null, "l", null, "e", null, '"', null, "?", null, ">", null], name => 'utf-16le', bom => false, }, { id => q<16be.decl.1>, in => qq[\x00<\x00?\x00x\x00m\x00l\x00 \x00v\x00e\x00r]. qq[\x00s\x00i\x00o\x00n\x00=\x00"\x001\x00.\x000\x00"]. qq[\x00 \x00e\x00n\x00c\x00o\x00d\x00i\x00n\x00g\x00=]. qq[\x00"\x00u\x00t\x00f\x00-\x001\x006\x00"\x00?\x00>], out => [[], "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "v", null, "e", null, "r", null, "s", null, "i", null, "o", null, "n", null, "=", null, '"', null, "1", null, ".", null, "0", null, '"', null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "t", null, "f", null, "-", null, "1", null, "6", null, '"', null, "?", null, ">", null], name => 'utf-16', bom => false, }, { id => q<16le.decl.1>, in => qq[<\x00?\x00x\x00m\x00l\x00 \x00v\x00e\x00r]. qq[\x00s\x00i\x00o\x00n\x00=\x00"\x001\x00.\x000\x00"]. qq[\x00 \x00e\x00n\x00c\x00o\x00d\x00i\x00n\x00g\x00=]. qq[\x00"\x00u\x00t\x00f\x00-\x001\x006\x00"\x00?\x00>\x00], out => [[], "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "v", null, "e", null, "r", null, "s", null, "i", null, "o", null, "n", null, "=", null, '"', null, "1", null, ".", null, "0", null, '"', null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "t", null, "f", null, "-", null, "1", null, "6", null, '"', null, "?", null, ">", null], name => 'utf-16', bom => false, }, { id => q<8.decl.1>, in => qq[], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "v", null, "e", null, "r", null, "s", null, "i", null, "o", null, "n", null, "=", null, '"', null, "1", null, ".", null, "0", null, '"', null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "t", null, "f", null, "-", null, "8", null, '"', null, "?", null, ">", null], name => 'utf-8', bom => false, }, { id => q<8.decl.2>, in => qq[], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "t", null, "f", null, "-", null, "8", null, '"', null, "?", null, ">", null], name => 'utf-8', bom => false, }, { id => q<8.decl.3>, in => qq[], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "v", null, "e", null, "r", null, "s", null, "i", null, "o", null, "n", null, "=", null, '"', null, "1", null, ".", null, "1", null, '"', null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "t", null, "f", null, "-", null, "8", null, '"', null, "?", null, ">", null], name => 'utf-8', bom => false, }, { id => q<8.decl.4>, in => qq[], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "v", null, "e", null, "r", null, "s", null, "i", null, "o", null, "n", null, "=", null, '"', null, "1", null, ".", null, "0", null, '"', null, "?", null, ">", null], name => 'utf-8', bom => false, }, { id => q, in => qq[\xEF\xBB\xBF], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "t", null, "f", null, "-", null, "8", null, '"', null, "?", null, ">", null], name => 'utf-8', bom => true, }, { id => q, in => qq[], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "u", null, "s", null, "-", null, "a", null, "s", null, "c", null, "i", null, "i", null, '"', null, "?", null, ">", null], name => 'us-ascii', bom => false, }, { id => q, in => qq[], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, '"', null, "U", null, "S", null, "-", null, "a", null, "s", null, "c", null, "i", null, "i", null, '"', null, "?", null, ">", null], name => 'us-ascii', bom => false, }, { id => q, in => qq[], out => [null, "<", null, "?", null, "x", null, "m", null, "l", null, " ", null, "e", null, "n", null, "c", null, "o", null, "d", null, "i", null, "n", null, "g", null, "=", null, "'", null, "u", null, "s", null, "-", null, "a", null, "s", null, "c", null, "i", null, "i", null, "'", null, "?", null, ">", null], name => 'us-ascii', bom => false, }, ); for my $testdata (@testdata) { my $byte = $testdata->{in}; my $error; my $i = 0; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { my (null, null, $etype, %opt) = @_; $error = [$etype, \%opt]; }); $test->id ("$testdata->{id}.bom"); my $tf = $testdata->{bom} ? 'assert_true' : 'assert_false'; $test->$tf ($efh->); $test->id ("$testdata->{id}.name"); $test->assert_equals ($efh->, $testdata->{name}); while (@{$testdata->{out}}) { if ($i != 0) { my $c = shift @{$testdata->{out}}; $test->id ("$testdata->{id}.$i"); $test->assert_equals ($efh->getc, $c); } my $v = shift @{$testdata->{out}}; $test->id ("$testdata->{id}.$i.error"); if (defined $v) { $test->assert_not_null ($error); $test->assert_equals ($error->[0], $v->[0]); } else { $test->assert_null ($error->[0]); } $error = null; $i++; } $test->id ("$testdata->{id}.eof"); $test->assert_null ($efh->getc); $test->assert_null ($error); } # testdata @Method: @@Name: close @@Return: @@@PerlDef: close $self->{}; ##MCDecodeHandle ClsDef: @ClsQName: ManakaiMCEUCJPDecodeHandle @ClsISA: ManakaiMCDecodeHandle @Method: @@Name: getc @@enDesc: Returns the next character from the input. @@Return: @@@Type: String @@@enDesc: The next character. @@@nullCase: @@@@enDesc: If at the end of the file, or if there was an error, in which case is set. @@@PerlDef: if (@{$self->{}}) { $r = shift @{$self->{}}; } else { __DEEP{ my $error; if ($self->{}) { if (read $self->{}, $self->{}, 256, length $self->{}) { # } else { $error = true; } $self->{} = false; } elsif (512 > length $self->{}) { read $self->{}, $self->{}, 256, length $self->{}; } unless ($error) { my $string = Encode::decode ($self->{}, $self->{}, Encode::FB_QUIET ()); if (length $string) { push @{$self->{}}, split //, $string; $r = shift @{$self->{}}; if (length $self->{}) { $self->{} = true; } } else { if (length $self->{}) { $error = true; } else { $r = null; } } } if ($error) { $r = substr $self->{}, 0, 1, ''; my $etype = ; if ($r =~ /^[\xA1-\xFE]/) { if ($self->{} =~ s/^([\xA1-\xFE])//) { $r .= $1; $etype = ; } } elsif ($r eq "\x8F") { if ($self->{} =~ s/^([\xA1-\xFE][\xA1-\xFE]?)//) { $r .= $1; $etype = if length $1 == 2; } } elsif ($r eq "\x8E") { if ($self->{} =~ s/^([\xA1-\xFE])//) { $r .= $1; $etype = ; } } elsif ($r eq "\xA0" or $r eq "\xFF") { $etype = ; } $self->{} ->($self->{}, $self, $etype, octets => \$r); } }__; } @Test: @@QName: MCEUCJPDecodeHandle.1.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my @testdata = ( { id => q, in => q<>, out => [null], }, { id => q, in => qq<\x00>, out => [null, "\x00", null], }, { id => q, in => qq<\x0D>, out => [null, "\x0D", null], }, { id => q, in => qq<\x0E>, out => [null, "\x0E", null], }, # Error?? { id => q, in => qq<\x0F>, out => [null, "\x0F", null], }, # Error?? { id => q, in => qq<\x1B>, out => [null, "\x1B", null], }, # Error?? { id => q, in => q, out => [null, "a", null], }, { id => q, in => qq<\x20>, out => [null, "\x20", null], }, { id => q<5C>, in => qq<\x5C>, out => [null, "\x5C", null], }, { id => q, in => qq<\x7E>, out => [null, "\x7E", null], }, { id => q, in => qq<\x7F>, out => [null, "\x7F", null], }, { id => q, in => qq<\x80>, out => [null, "\x80", null], }, { id => q, in => qq<\x8C>, out => [null, "\x8C", null], }, { id => q, in => qq<\x8E>, out => [null, "\x8E", []], }, { id => q, in => qq<\x8F>, out => [null, "\x8F", []], }, { id => q, in => qq<\xA0>, out => [null, "\xA0", []], }, { id => q, in => qq<\xA1>, out => [null, "\xA1", []], }, { id => q, in => qq<\xA2>, out => [null, "\xA2", []], }, { id => q, in => qq<\xFD>, out => [null, "\xFD", []], }, { id => q, in => qq<\xFE>, out => [null, "\xFE", []], }, { id => q, in => qq<\xFF>, out => [null, "\xFF", []], }, { id => q, in => qq<\x00\x00>, out => [null, "\x00", null, "\x00", null], }, { id => q, in => qq<\x0D\x0A>, out => [null, "\x0D", null, "\x0A", null], }, { id => q, in => qq<\x1B\x28>, out => [null, "\x1B", null, "\x28", null], },# Error?? { id => q, in => qq<\x20\x20>, out => [null, "\x20", null, "\x20", null], }, { id => q, in => qq, out => [null, "a", null, "b", null], }, { id => q, in => qq<\xA0\xA1>, out => [null, "\xA0", [], "\xA1", []], }, { id => q, in => qq<\xA1\xA1>, out => [null, "\x{3000}", null], }, { id => q, in => qq<\xA1\xA2>, out => [null, "\x{3001}", null], }, { id => q, in => qq<\xA1\xA4>, out => [null, "\x{FF0C}", null], # FULLWIDTH COMMA }, { id => q, in => qq<\xA1\xA6>, out => [null, "\x{30FB}", null], # KATAKABA MIDDLE DOT }, { id => q, in => qq<\xA1\xA7>, out => [null, "\x{FF1A}", null], # FULLWIDTH COLON }, { id => q, in => qq<\xA1\xB1>, out => [null, "\x{203E}", null], # OVERLINE }, { id => q, in => qq<\xA1\xBD>, out => [null, "\x{2014}", null], # EM DASH }, { id => q, in => qq<\xA1\xC0>, out => [null, "\x{FF3C}", null], # FULLWIDTH REVERSE SOLIDUS }, { id => q, in => qq<\xA1\xC1>, out => [null, "\x{301C}", null], # WAVE DASH }, { id => q, in => qq<\xA1\xC2>, out => [null, "\x{2016}", null], # DOUBLE VERTICAL LINE }, { id => q, in => qq<\xA1\xC4>, out => [null, "\x{2026}", null], # HORIZONTAL ELLIPSIS }, { id => q, in => qq<\xA1\xDD>, out => [null, "\x{2212}", null], # MINUS SIGN }, { id => q, in => qq<\xA1\xEF>, out => [null, "\x{00A5}", null], # YEN SIGN }, { id => q, in => qq<\xA1\xF1>, out => [null, "\x{00A2}", null], # CENT SIGN }, { id => q, in => qq<\xA1\xF2>, out => [null, "\x{00A3}", null], # POUND SIGN }, { id => q, in => qq<\xA1\xFF>, out => [null, "\xA1", [], "\xFF", []], }, { id => q, in => qq<\xA2\xAE>, out => [null, "\x{3013}", null], # GETA MARK }, { id => q, in => qq<\xA2\xAF>, out => [null, "\xA2\xAF", []], }, { id => q, in => qq<\xA2\xBA>, out => [null, "\x{2208}", null], # ELEMENT OF }, { id => q, in => qq<\xA2\xFE>, out => [null, "\x{25EF}", null], # LARGE CIRCLE }, { id => q, in => qq<\xAD\xCE>, out => [null, "\xAD\xCE", []], }, { id => q, in => qq<\xB0\xA6>, out => [null, "\x{611B}", null], # han }, { id => q, in => qq<\xF4\xA6>, out => [null, "\x{7199}", null], # han }, { id => q<8ea1>, in => qq<\x8E\xA1>, out => [null, "\x{FF61}", null], }, { id => q<8efe>, in => qq<\x8E\xFE>, out => [null, "\x8E\xFE", []], }, { id => q<8ffe>, in => qq<\x8F\xFE>, out => [null, "\x8F\xFE", []], }, { id => q, in => qq<\xA1\xA2\xA3>, out => [null, "\x{3001}", null, "\xA3", []], }, { id => q<8ea1a1>, in => qq<\x8E\xA1\xA1>, out => [null, "\x{FF61}", null, "\xA1", []], }, { id => q<8fa1a1>, in => qq<\x8F\xA1\xA1>, out => [null, "\x8F\xA1\xA1", []], }, { id => q<8fa2af>, in => qq<\x8F\xA2\xAF>, out => [null, "\x{02D8}", null], }, { id => q<8fa2b7>, in => qq<\x8F\xA2\xB7>, out => [null, "\x{FF5E}", null], # FULLWIDTH TILDE }, { id => q, in => qq<\xA1\xA2\xA1\xA3>, out => [null, "\x{3001}", null, "\x{3002}", null], }, { id => q<8fa2af>, in => qq<\x8F\xA2\xAF\xAF>, out => [null, "\x{02D8}", null, "\xAF", []], }, { id => q<8fa2afafa1>, in => qq<\x8F\xA2\xAF\xAF\xA1>, out => [null, "\x{02D8}", null, "\xAF\xA1", []], }, ); for my $testdata (@testdata) { my $byte = $testdata->{in}; my $error; my $i = 0; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { my (null, null, $etype, %opt) = @_; $error = [$etype, \%opt]; }); while (@{$testdata->{out}}) { if ($i != 0) { my $c = shift @{$testdata->{out}}; $test->id ("$testdata->{id}.$i"); $test->assert_equals ($efh->getc, $c); } my $v = shift @{$testdata->{out}}; $test->id ("$testdata->{id}.$i.error"); if (defined $v) { $test->assert_not_null ($error); $test->assert_equals ($error->[0], $v->[0]); } else { $test->assert_null ($error->[0]); } $error = null; $i++; } $test->id ("$testdata->{id}.eof"); $test->assert_null ($efh->getc); $test->assert_null ($error); } # testdata ##MCEUCJPDecodeHandle ClsDef: @ClsQName: ManakaiMCShiftJISDecodeHandle @ClsISA: ManakaiMCDecodeHandle @Method: @@Name: getc @@enDesc: Returns the next character from the input. @@Return: @@@Type: String @@@enDesc: The next character. @@@nullCase: @@@@enDesc: If at the end of the file, or if there was an error, in which case is set. @@@PerlDef: if (@{$self->{}}) { $r = shift @{$self->{}}; } else { __DEEP{ my $error; if ($self->{}) { if (read $self->{}, $self->{}, 256, length $self->{}) { # } else { $error = true; } $self->{} = false; } elsif (512 > length $self->{}) { read $self->{}, $self->{}, 256, length $self->{}; } unless ($error) { my $string = Encode::decode ($self->{}, $self->{}, Encode::FB_QUIET ()); if (length $string) { push @{$self->{}}, split //, $string; $r = shift @{$self->{}}; if (length $self->{}) { $self->{} = true; } } else { if (length $self->{}) { $error = true; } else { $r = null; } } } if ($error) { $r = substr $self->{}, 0, 1, ''; my $etype = ; if ($r =~ /^[\x81-\x9F\xE0-\xEF]/) { if ($self->{} =~ s/(.)//s) { $r .= $1; # not limited to \x40-\xFC - \x7F $etype = ; } } elsif ($r =~ /^[\x80\xA0\xF0-\xFF]/) { $etype = ; } $self->{} ->($self->{}, $self, $etype, octets => \$r); } }__; } @Test: @@QName: MCShiftJISDecodeHandle.1.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my @testdata = ( { id => q, in => q<>, out => [null], }, { id => q, in => qq<\x00>, out => [null, "\x00", null], }, { id => q, in => qq<\x0D>, out => [null, "\x0D", null], }, { id => q, in => qq<\x0E>, out => [null, "\x0E", null], }, # Error?? { id => q, in => qq<\x0F>, out => [null, "\x0F", null], }, # Error?? { id => q, in => qq<\x1B>, out => [null, "\x1B", null], }, # Error?? { id => q, in => q, out => [null, "a", null], }, { id => q, in => qq<\x20>, out => [null, "\x20", null], }, { id => q, in => qq<\x5C>, out => [null, "\xA5", null], # YEN SIGN }, { id => q, in => qq<\x7E>, out => [null, "\x{203E}", null], # OVERLINE }, { id => q, in => qq<\x7F>, out => [null, "\x7F", null], }, { id => q, in => qq<\x80>, out => [null, "\x80", []], }, { id => q, in => qq<\x8C>, out => [null, "\x8C", []], }, { id => q, in => qq<\x8E>, out => [null, "\x8E", []], }, { id => q, in => qq<\x8F>, out => [null, "\x8F", []], }, { id => q, in => qq<\xA0>, out => [null, "\xA0", []], }, { id => q, in => qq<\xA1>, out => [null, "\x{FF61}", null], }, { id => q, in => qq<\xA2>, out => [null, "\x{FF62}", null], }, { id => q, in => qq<\xdf>, out => [null, "\x{FF9F}", null], }, { id => q, in => qq<\xe0>, out => [null, "\xE0", []], }, { id => q, in => qq<\xEF>, out => [null, "\xEF", []], }, { id => q, in => qq<\xF0>, out => [null, "\xF0", []], }, { id => q, in => qq<\xFC>, out => [null, "\xFC", []], }, { id => q, in => qq<\xFD>, out => [null, "\xFD", []], }, { id => q, in => qq<\xFE>, out => [null, "\xFE", []], }, { id => q, in => qq<\xFF>, out => [null, "\xFF", []], }, { id => q, in => qq<\x00\x00>, out => [null, "\x00", null, "\x00", null], }, { id => q, in => qq<\x0D\x0A>, out => [null, "\x0D", null, "\x0A", null], }, { id => q, in => qq<\x1B\x28>, out => [null, "\x1B", null, "\x28", null], },# Error?? { id => q, in => qq<\x20\x20>, out => [null, "\x20", null, "\x20", null], }, { id => q, in => qq, out => [null, "a", null, "b", null], }, { id => q<8040>, in => qq<\x80\x40>, out => [null, "\x80", [], "\x40", null], }, { id => q<8100>, in => qq<\x81\x00>, out => [null, "\x81\x00", []], }, { id => q<8101>, in => qq<\x81\x01>, out => [null, "\x81\x01", []], }, { id => q<813F>, in => qq<\x81\x3F>, out => [null, "\x81\x3F", []], }, { id => q<8140>, in => qq<\x81\x40>, out => [null, "\x{3000}", null], }, { id => q<8141>, in => qq<\x81\x41>, out => [null, "\x{3001}", null], }, { id => q<8143>, in => qq<\x81\x43>, out => [null, "\x{FF0C}", null], # FULLWIDTH COMMA }, { id => q<8150>, in => qq<\x81\x50>, out => [null, "\x{FFE3}", null], # FULLWIDTH MACRON }, { id => q<815C>, in => qq<\x81\x5C>, out => [null, "\x{2014}", null], # EM DASH }, { id => q<815F>, in => qq<\x81\x5F>, out => [null, "\x{005C}", null], # REVERSE SOLIDUS }, { id => q<8160>, in => qq<\x81\x60>, out => [null, "\x{301C}", null], # WAVE DASH }, { id => q<8161>, in => qq<\x81\x61>, out => [null, "\x{2016}", null], # DOUBLE VERTICAL LINE }, { id => q<8163>, in => qq<\x81\x63>, out => [null, "\x{2026}", null], # HORIZONTAL ELLIPSIS }, { id => q<817C>, in => qq<\x81\x7C>, out => [null, "\x{2212}", null], # MINUS SIGN }, { id => q<817F>, in => qq<\x81\x7F>, out => [null, "\x81\x7F", []], }, { id => q<818F>, in => qq<\x81\x8F>, out => [null, "\x{FFE5}", null], # FULLWIDTH YEN SIGN }, { id => q<8191>, in => qq<\x81\x91>, out => [null, "\x{00A2}", null], # CENT SIGN }, { id => q<8192>, in => qq<\x81\x92>, out => [null, "\x{00A3}", null], # POUND SIGN }, { id => q<81AC>, in => qq<\x81\xAC>, out => [null, "\x{3013}", null], # GETA MARK }, { id => q<81AD>, in => qq<\x81\xAD>, out => [null, "\x81\xAD", []], }, { id => q<81B8>, in => qq<\x81\xB8>, out => [null, "\x{2208}", null], # ELEMENT OF }, { id => q<81CA>, in => qq<\x81\xCA>, out => [null, "\x{00AC}", null], # NOT SIGN }, { id => q<81FC>, in => qq<\x81\xFC>, out => [null, "\x{25EF}", null], # LARGE CIRCLE }, { id => q<81FD>, in => qq<\x81\xFD>, out => [null, "\x81\xFD", []], }, { id => q<81FE>, in => qq<\x81\xFE>, out => [null, "\x81\xFE", []], }, { id => q<81FF>, in => qq<\x81\xFF>, out => [null, "\x81\xFF", []], }, { id => q, in => qq<\xDD\xDE>, out => [null, "\x{FF9D}", null, "\x{FF9E}", null], }, { id => q, in => qq<\xE0\x40>, out => [null, "\x{6F3E}", null], }, { id => q, in => qq<\xEA\xA4>, out => [null, "\x{7199}", null], }, { id => q, in => qq<\xEA\xA5>, out => [null, "\xEA\xA5", []], }, { id => q, in => qq<\xEB\x40>, out => [null, "\xEB\x40", []], }, { id => q, in => qq<\xED\x40>, out => [null, "\xED\x40", []], }, { id => q, in => qq<\xEF\xFC>, out => [null, "\xEF\xFC", []], }, { id => q, in => qq<\xF0\x40>, out => [null, "\xF0", [], "\x40", null], }, { id => q, in => qq<\xF1\x40>, out => [null, "\xF1", [], "\x40", null], }, { id => q, in => qq<\xFB\x40>, out => [null, "\xFB", [], "\x40", null], }, { id => q, in => qq<\xFc\x40>, out => [null, "\xFC", [], "\x40", null], }, { id => q, in => qq<\xFD\x40>, out => [null, "\xFD", [], "\x40", null], }, { id => q, in => qq<\xFE\x40>, out => [null, "\xFE", [], "\x40", null], }, { id => q, in => qq<\xFF\x40>, out => [null, "\xFF", [], "\x40", null], }, { id => q<81408142>, in => qq<\x81\x40\x81\x42>, out => [null, "\x{3000}", null, "\x{3002}", null], }, ); for my $testdata (@testdata) { my $byte = $testdata->{in}; my $error; my $i = 0; open my $fh, '<', \$byte; my $efh = $impl-> (, $fh, sub { my (null, null, $etype, %opt) = @_; $error = [$etype, \%opt]; }); while (@{$testdata->{out}}) { if ($i != 0) { my $c = shift @{$testdata->{out}}; $test->id ("$testdata->{id}.$i"); $test->assert_equals ($efh->getc, $c); } my $v = shift @{$testdata->{out}}; $test->id ("$testdata->{id}.$i.error"); if (defined $v) { $test->assert_not_null ($error); $test->assert_equals ($error->[0], $v->[0]); } else { $test->assert_null ($error->[0]); } $error = null; $i++; } $test->id ("$testdata->{id}.eof"); $test->assert_null ($efh->getc); $test->assert_null ($error); } # testdata ##MCShiftJISDecodeHandle ClsDef: @ClsQName: ManakaiMCISO2022JPDecodeHandle @ClsISA: ManakaiMCDecodeHandle @Method: @@Name: getc @@enDesc: Returns the next character from the input. @@Return: @@@Type: String @@@enDesc: The next character. @@@nullCase: @@@@enDesc: If at the end of the file, or if there was an error, in which case is set. @@@PerlDef: if (@{$self->{}}) { $r = shift @{$self->{}}; } else { __DEEP{ A: { my $error; if ($self->{}) { if (read $self->{}, $self->{}, 256, length $self->{}) { # } else { $error = true; } $self->{} = false; } elsif (512 > length $self->{}) { read $self->{}, $self->{}, 256, length $self->{}; } unless ($error) { if ($self->{} =~ s/^\x1B(\x24[\x40\x42]|\x28[\x42\x4A])//) { $self->{} = { "\x24\x40" => , "\x24\x42" => , "\x28\x42" => , "\x28\x4A" => , }->{$1}; redo A; } elsif ($self->{} eq ) { # IRV if ($self->{} =~ s/^([\x00-\x0D\x10-\x1A\x1C-\x7F]+)//) { push @{$self->{}}, split //, $1; $r = shift @{$self->{}}; } else { if (length $self->{}) { $error = true; } else { $r = null; } } } elsif ($self->{} eq ) { # 0201 if ($self->{} =~ s/^([\x00-\x0D\x10-\x1A\x1C-\x7F]+)//) { my $v = $1; $v =~ tr/\x5C\x7E/\xA5\x{203E}/; push @{$self->{}}, split //, $v; $r = shift @{$self->{}}; } else { if (length $self->{}) { $error = true; } else { $r = null; $self->{} ->($self->{}, $self, , state => $self->{}); } } } elsif ($self->{} eq ) { # 1983 my $v = Encode::decode ($self->{}, $self->{}, Encode::FB_QUIET ()); if (length $v) { push @{$self->{}}, split //, $v; $r = shift @{$self->{}}; } else { if (length $self->{}) { $error = true; } else { $r = null; $self->{} ->($self->{}, $self, , state => $self->{}); } } } elsif ($self->{} eq ) { # 1978 my $v = Encode::decode ($self->{}, $self->{}, Encode::FB_QUIET ()); if (length $v) { push @{$self->{}}, split //, $v; $r = shift @{$self->{}}; } else { if (length $self->{}) { $error = true; } else { $r = null; $self->{} ->($self->{}, $self, , state => $self->{}); } } } else { $error = true; } } if ($error) { $r = substr $self->{}, 0, 1, ''; my $etype = ; if (($self->{} eq or $self->{} eq ) and $r =~ /^[\x21-\x7E]/ and $self->{} =~ s/^([\x21-\x7E])//) { $r .= $1; $etype = ; } elsif ($r eq "\x1B" and $self->{} =~ s/^\(H//) { # Old 0201 $r .= "(H"; $self->{} = ; } $self->{} ->($self->{}, $self, $etype, octets => \$r); } } # A }__; } @Test: @@QName: MCISO2022JPDecodeHandle.1.test @@PerlDef: my $impl; __CODE{createImplForTest:: $impl => $impl}__; my @testdata = ( { id => q, in => q<>, out1 => [null], out2 => [null], }, { id => q, in => qq<\x00>, out1 => [null, "\x00", null], out2 => [null, "\x00", null], }, { id => q, in => qq<\x0D>, out1 => [null, "\x0D", null], out2 => [null, "\x0D", null], }, # Error? { id => q<0A>, in => qq<\x0A>, out1 => [null, "\x0A", null], out2 => [null, "\x0A", null], }, # Error? { id => q, in => qq<\x0E>, out1 => [null, "\x0E", []], out2 => [null, "\x0E", []], }, { id => q, in => qq<\x0F>, out1 => [null, "\x0F", []], out2 => [null, "\x0F", []], }, { id => q, in => qq<\x1B>, out1 => [null, "\x1B", []], out2 => [null, "\x1B", []], }, { id => q, in => q, out1 => [null, "a", null], out2 => [null, "a", null], }, { id => q, in => qq<\x20>, out1 => [null, "\x20", null], out2 => [null, "\x20", null], }, { id => q, in => qq<\x5C>, out1 => [null, "\x5C", null], out2 => [null, "\x5C", null], }, { id => q, in => qq<\x7E>, out1 => [null, "\x7E", null], out2 => [null, "\x7E", null], }, { id => q, in => qq<\x7F>, out1 => [null, "\x7F", null], out2 => [null, "\x7F", null], }, { id => q, in => qq<\x80>, out1 => [null, "\x80", []], out2 => [null, "\x80", []], }, { id => q, in => qq<\x8C>, out1 => [null, "\x8C", []], out2 => [null, "\x8C", []], }, { id => q, in => qq<\x8E>, out1 => [null, "\x8E", []], out2 => [null, "\x8E", []], }, { id => q, in => qq<\x8F>, out1 => [null, "\x8F", []], out2 => [null, "\x8F", []], }, { id => q, in => qq<\xA0>, out1 => [null, "\xA0", []], out2 => [null, "\xA0", []], }, { id => q, in => qq<\xA1>, out1 => [null, "\xA1", []], out2 => [null, "\xA1", []], }, { id => q, in => qq<\xA2>, out1 => [null, "\xA2", []], out2 => [null, "\xA2", []], }, { id => q, in => qq<\xdf>, out1 => [null, "\xDF", []], out2 => [null, "\xDF", []], }, { id => q, in => qq<\xe0>, out1 => [null, "\xE0", []], out2 => [null, "\xE0", []], }, { id => q, in => qq<\xEF>, out1 => [null, "\xEF", []], out2 => [null, "\xEF", []], }, { id => q, in => qq<\xF0>, out1 => [null, "\xF0", []], out2 => [null, "\xF0", []], }, { id => q, in => qq<\xFC>, out1 => [null, "\xFC", []], out2 => [null, "\xFC", []], }, { id => q, in => qq<\xFD>, out1 => [null, "\xFD", []], out2 => [null, "\xFD", []], }, { id => q, in => qq<\xFE>, out1 => [null, "\xFE", []], out2 => [null, "\xFE", []], }, { id => q, in => qq<\xFF>, out1 => [null, "\xFF", []], out2 => [null, "\xFF", []], }, { id => q, in => qq<\x00\x00>, out1 => [null, "\x00", null, "\x00", null], out2 => [null, "\x00", null, "\x00", null], }, { id => q, in => qq<\x0D\x0A>, out1 => [null, "\x0D", null, "\x0A", null], out2 => [null, "\x0D", null, "\x0A", null], }, { id => q, in => qq<\x1B\x1B>, out1 => [null, "\x1B", [], "\x1B", []], out2 => [null, "\x1B", [], "\x1B", []], }, { id => q, in => qq<\x1B\x20>, out1 => [null, "\x1B", [], "\x20", null], out2 => [null, "\x1B", [], "\x20", null], }, { id => q, in => qq<\x1B\x24>, out1 => [null, "\x1B", [], "\x24", null], out2 => [null, "\x1B", [], "\x24", null], }, { id => q, in => qq<\x1B\x28>, out1 => [null, "\x1B", [], "\x28", null], out2 => [null, "\x1B", [], "\x28", null], }, { id => q, in => qq<\x20\x20>, out1 => [null, "\x20", null, "\x20", null], out2 => [null, "\x20", null, "\x20", null], }, { id => q, in => qq, out1 => [null, "a", null, "b", null], out2 => [null, "a", null, "b", null], }, { id => q<8040>, in => qq<\x80\x40>, out1 => [null, "\x80", [], "\x40", null], out2 => [null, "\x80", [], "\x40", null], }, { id => q<1B2440>, in => qq<\x1B\x24\x40>, out1 => [null], out2 => [null], eof_error => [], }, { id => q<1B2442>, in => qq<\x1B\x24\x42>, out1 => [null], out2 => [null], eof_error => [], }, { id => q<1B2840>, in => qq<\x1B\x28\x40>, out1 => [null, "\x1B", [], "(", null, "\x40", null], out2 => [null, "\x1B", [], "(", null, "\x40", null], }, { id => q<1B2842>, in => qq<\x1B\x28\x42>, out1 => [null], out2 => [null], }, { id => q<1B284A>, in => qq<\x1B\x28\x4A>, out1 => [null], out2 => [null], eof_error => [], }, { id => q<1B$B1B(B>, in => qq<\x1B\x24\x42\x1B\x28\x42>, out1 => [null], out2 => [null], }, { id => q<1B(B1B(B>, in => qq<\x1B\x28\x42\x1B\x28\x42>, out1 => [null], out2 => [null], }, { id => q<1B(Ba1B(B>, in => qq<\x1B\x28\x42a\x1B\x28\x42>, out1 => [null, "a", null], out2 => [null, "a", null], }, { id => q<1B(Ba1B(B1B(B>, in => qq<\x1B\x28\x42a\x1B\x28\x42\x1B\x28\x42>, out1 => [null, "a", null], out2 => [null, "a", null], }, { id => q<1B$42!!1B2842>, in => qq<\x1B\x24\x42!!\x1B\x28\x42>, out1 => [null, "\x{3000}", null], out2 => [null, "\x{3000}", null], }, { id => q<1B$4221211B284A>, in => qq<\x1B\x24\x42!!\x1B\x28\x4A>, out1 => [null, "\x{3000}", null], out2 => [null, "\x{3000}", null], eof_error => [], }, { id => q<1B$4021211B2842>, in => qq<\x1B\x24\x40!!\x1B\x28\x42>, out1 => [null, "\x{3000}", null], out2 => [null, "\x{3000}", null], }, { id => q<1B$402121211B2842>, in => qq<\x1B\x24\x40!!!\x1B\x28\x42>, out1 => [null, "\x{3000}", null, "!", []], out2 => [null, "\x{3000}", null, "!", []], }, { id => q<1B$4021211B2442!!1B2842>, in => qq<\x1B\x24\x40!!\x1B\x24\x42!!\x1B\x28\x42>, out1 => [null, "\x{3000}", null, "\x{3000}", null], out2 => [null, "\x{3000}", null, "\x{3000}", null], }, { id => q<1B$4021211B2440!!1B2842>, in => qq<\x1B\x24\x40!!\x1B\x24\x40!!\x1B\x28\x42>, out1 => [null, "\x{3000}", null, "\x{3000}", null], out2 => [null, "\x{3000}", null, "\x{3000}", null], }, { id => q<1B$@!"1B(B\~|>, in => qq<\x1B\x24\x40!"\x1B(B\\~|>, out1 => [null, "\x{3001}", null, "\x5C", null, "\x7E", null, "|", null], out2 => [null, "\x{3001}", null, "\x5C", null, "\x7E", null, "|", null], }, { id => q<1B$B!"1B(J\~|1B(B>, in => qq<\x1B\x24\x42!"\x1B(J\\~|\x1B(B>, out1 => [null, "\x{3001}", null, "\xA5", null, "\x{203E}", null, "|", null], out2 => [null, "\x{3001}", null, "\xA5", null, "\x{203E}", null, "|", null], }, { id => q<78compat.3022(16-02)>, in => qq<\x1B\$\@\x30\x22\x1B\$B\x30\x22\x1B(B>, out1 => [null, "\x{555E}", null, "\x{5516}", null], out2 => [null, "\x{5516}", null, "\x{5516}", null], }, { id => q, in => qq<\x1B\$\@\x22\x39\x1B\$B\x22\x39\x1B(B>, out1 => [null, "\x22\x39", [], "\x22\x39", []], out2 => [null, "\x22\x39", [], "\x22\x39", []], }, { id => q<83add.223A>, in => qq<\x1B\$\@\x22\x3A\x1B\$B\x22\x3A\x1B(B>, out1 => [null, "\x22\x3A", [], "\x{2208}", null], out2 => [null, "\x{2208}", null, "\x{2208}", null], }, { id => q<83add.2840>, in => qq<\x1B\$\@\x28\x40\x1B\$B\x28\x40\x1B(B>, out1 => [null, "\x28\x40", [], "\x{2542}", null], out2 => [null, "\x{2542}", null, "\x{2542}", null], }, { id => q<83add.7421>, in => qq<\x1B\$\@\x74\x21\x1B\$B\x74\x21\x1B(B>, out1 => [null, "\x74\x21", [], "\x{582F}", null], out2 => [null, "\x{5C2D}", null, "\x{582F}", null], }, { id => q<83swap.3033>, in => qq<\x1B\$\@\x30\x33\x1B\$B\x30\x33\x1B(B>, out1 => [null, "\x{9C3A}", null, "\x{9BF5}", null], out2 => [null, "\x{9C3A}", null, "\x{9BF5}", null], }, { id => q<83swap.724D>, in => qq<\x1B\$\@\x72\x4D\x1B\$B\x72\x4D\x1B(B>, out1 => [null, "\x{9BF5}", null, "\x{9C3A}", null], out2 => [null, "\x{9BF5}", null, "\x{9C3A}", null], }, { id => q<90add.7425>, in => qq<\x1B\$\@\x74\x25\x1B\$B\x74\x25\x1B(B>, out1 => [null, "\x74\x25", [], "\x74\x25", []], out2 => [null, "\x{51DC}", null, "\x{51DC}", null], }, { id => q<90add.7426>, in => qq<\x1B\$\@\x74\x26\x1B\$B\x74\x26\x1B(B>, out1 => [null, "\x74\x26", [], "\x74\x26", []], out2 => [null, "\x{7199}", null, "\x{7199}", null], }, ); for my $testdata (@testdata) { for my $c (1..2) { my $byte = $testdata->{in}; my $error; my $i = 0; open my $fh, '<', \$byte; my $efh = $impl-> ([null, , ]->[$c], $fh, sub { my (null, null, $etype, %opt) = @_; $error = [$etype, \%opt]; }); while (@{$testdata->{"out$c"}}) { if ($i != 0) { my $c = shift @{$testdata->{"out$c"}}; $test->id ("$testdata->{id}.$i.$c"); $test->assert_equals ($efh->getc, $c); } my $v = shift @{$testdata->{"out$c"}}; $test->id ("$testdata->{id}.$i.error.$c"); if (defined $v) { $test->assert_not_null ($error); $test->assert_equals ($error->[0], $v->[0]); } else { $test->assert_null ($error->[0]); } $error = null; $i++; } $test->id ("$testdata->{id}.eof.$c"); $test->assert_null ($efh->getc); my $v = $testdata->{eof_error}; $test->id ("$testdata->{id}.$i.error.$c"); if (defined $v) { $test->assert_not_null ($error); $test->assert_equals ($error->[0], $v->[0]); } else { $test->assert_null ($error->[0]); } $error = null; }} # testdata ##MCISO2022JPDecodeHandle PropDef: @QName: mce|state @mce:key: s @enDesc: ISO-2022-JP state. ResourceDef: @QName: mce|State.2440 @mce:key: s1 @enDesc: ISO-2022-JP state: ESC 2/4 4/0. @DISCore:resourceType: DISCore|Resource @DISCore:resourceType: DISCore|Property ResourceDef: @QName: mce|State.2442 @mce:key: s2 @enDesc: ISO-2022-JP state: ESC 2/4 4/2. @DISCore:resourceType: DISCore|Resource @DISCore:resourceType: DISCore|Property ResourceDef: @QName: mce|State.2842 @mce:key: s3 @enDesc: ISO-2022-JP state: ESC 2/8 4/2. @DISCore:resourceType: DISCore|Resource ResourceDef: @QName: mce|State.284A @mce:key: s4 @enDesc: ISO-2022-JP state: ESC 2/8 4/10. @DISCore:resourceType: DISCore|Resource ResourceDef: @QName: String @AliasFor: str|DOMString PropDef: @QName: mce|onerror @mce:key: onerr PropDef: @QName: mce|inputEncoding @mce:key: ie PropDef: @QName: mce|hasBOM @mce:key: bom PropDef: @QName: mce|continue @mce:key: cc @enDesc: Whether the contains octets that might be part of characters. PropDef: @QName: mce|charset @mce:key: cs PropDef: @QName: mce|perlEncodingName @mce:key: enc PropDef: @QName: mce|filehandle @mce:key: fh RPropDef: @QName: mce|key @subsetOf: DISPerl|propHashKey PropDef: @QName: mce|characterQueue @mce:key: cq PropDef: @QName: mce|byteBuffer @mce:key: bb ElementTypeBinding: @Name: PropDef @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: DISCore|Property @@ForCheck: =ManakaiDOM|all ElementTypeBinding: @Name: RPropDef @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: DISSource|ResourceProperty @@ForCheck: =ManakaiDOM|all ElementTypeBinding: @Name: nullCase @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: ManakaiDOM|InCase @@Value: @@@is-null:1 @@@ContentType: DISCore|String ElementTypeBinding: @Name: TrueCase @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: ManakaiDOM|InCase @@Value: @@@@: 1 @@@ContentType: DISCore|Boolean @@Type: idl|boolean||ManakaiDOM|all ElementTypeBinding: @Name: FalseCase @ElementType: dis:ResourceDef @ShadowContent: @@DISCore:resourceType: ManakaiDOM|InCase @@Value: @@@@: 0 @@@ContentType: DISCore|Boolean @@Type: idl|boolean||ManakaiDOM|all ResourceDef: @DISCore:resourceType: cs|CharsetSet @cs:moduleRef: DISlib|Charset