#?PESRC/1.0 Name: ISO2022::EUCJACompatible ShortDescription: An Encode module of EUC (8-bit ISO/IEC 2022 based compatible coding system) for Japanese POD:DESCRIPTION: Due to the historical reason, some of current implementions of JIS coded character sets support pairs of same characters dupplicately, with halfwidth/fullwidth property. Although this situation is undesirable, we can't ignore such historical implementions and latest JIS coded character set standards admire such dupulicate encoding to and only to "keep compatibility with current practice." This module provides encoder and decoder for such coding systems that comform to JIS and that based on EUC (an 8-bit ISO/IEC 2022) structure. Those coding systems SHOULD not be used for new implemention or new data. They may not comform to future version of JIS or other standards. { Name: euc-japan-1990-fullwidth Alias: euc-jp-1990-fullwidth ajec Cversion: C:bit=8 C:G0=G94:B ## ASCII C:G1=G94n:B@ ## JIS X 0208-1990 C:G2=G94:I ## JIS X 0201 Katakana C:G3=G94n:D ## JIS X 0212-1990 C:designate:*:default=-1 C:option:{Ginvoke_to_left}=[1,0,0,0] C:option:{Ginvoke_by_single_shift}=[0,0,1,1] C:option:{C1invoke_to_right}=1 C:option:{reset}->{Gdesignation}=0 C:option:{reset}->{Ginvoke}=0 C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}] Encode: =>ucs_to_ascii ucs_to_jisx0208_1990_open_ascii ucs_to_jisx0212_1990_irv ucs_to_jisx0201_katakana_hw ->iso2022:C Decode: <-iso2022:C <=jisx0208_1990_open_ascii_to_ucs jisx0212_1990_irv_to_ucs jisx0201_katakana_hw_to_ucs Description: EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese, ISO/IEC 646 IRV + JIS X 0208-1990 + JIS X 0212-1990 + JIS X 0201-1990 Katakana. Some characters defined in JIS X 0208 and all characters defined in JIS X 0201 are mapped to FULLWIDTH or HALFWIDTH characters of UCS as specified by JIS X 0221-1995 and JIS X 0208:1997. This encoding is same as C except that this encoding has private use areas as described in JIS X 0208-1990 and JIS X 0212-1990 (as the "free area") and defined in AJEC. (See "UI-OSF Application Platform Profile for Japanese Environment".) Its mapping to UCS private use area is defined by CDE/Motif WG. Even 1990 version of JIS coded character set standards allows "free area", current version of JIS does not. It should not be used without compatibility purpose. This encoding is a "compatible" version of C defined in Encode::ISO2022::EUCJA. } { Name: eucJP-open Alias: eucJP-ascii Cversion: C:bit=8 C:G0=G94:B ## ASCII C:G1=G94n:B@ ## JIS X 0208-1990 C:G2=G94:I ## JIS X 0201 Katakana C:G3=G94n:D ## JIS X 0212-1990 C:designate:*:default=-1 C:option:{Ginvoke_to_left}=[1,0,0,0] C:option:{Ginvoke_by_single_shift}=[0,0,1,1] C:option:{C1invoke_to_right}=1 C:option:{reset}->{Gdesignation}=0 C:option:{reset}->{Ginvoke}=0 C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}] Encode: =>ucs_to_ascii ucs_to_jisx0208_1990_open_ascii ucs_to_jisx0212_1990_open_ascii ucs_to_jisx0201_katakana_hw ->iso2022:C Decode: <-iso2022:C <=jisx0208_1990_open_ascii_to_ucs jisx0212_1990_open_ascii_to_ucs jisx0201_katakana_hw_to_ucs Description: EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese. This encoding is same as C except that this encoding has IBM extended characters as defined by CDE/Motif WG. Even 1990 version of JIS coded character set standards allows "free area", current version of JIS does not. It should not be used without compatibility purpose. } { Name: eucJP-0201 Cversion: C:bit=8 C:G0=G94:J ## JIS X 0201 Roman C:G1=G94n:B@ ## JIS X 0208-1990 C:G2=G94:I ## JIS X 0201 Katakana C:G3=G94n:D ## JIS X 0212-1990 C:designate:*:default=-1 C:option:{Ginvoke_to_left}=[1,0,0,0] C:option:{Ginvoke_by_single_shift}=[0,0,1,1] C:option:{C1invoke_to_right}=1 C:option:{reset}->{Gdesignation}=0 C:option:{reset}->{Ginvoke}=0 C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}] Encode: =>ucs_to_jisx0201_latin ucs_to_jisx0208_1990_open_0201 ucs_to_jisx0212_1990_open_0201 ucs_to_jisx0201_katakana_hw ->iso2022:C Decode: <-iso2022:C <=jisx0201_latin_to_ucs jisx0208_1990_open_02101_to_ucs jisx0212_1990_open_0201_to_ucs jisx0201_katakana_hw_to_ucs Description: ISO/IEC 2022 based 8-bit encoding for Japanese. This encoding is same as C but use JIS X 0201 roman set instead of ASCII, as defined by CDE/Motif WG. EUC is defined to use ASCII but this coding system use JIS X 0201 roman, so this coding system SHOULD NOT be used without compatibility purpose. } { Name: euc-japan-1997-fullwidth Alias: euc-jp-1997-fullwidth Cversion: C:bit=8 C:G0=G94:B ## ASCII C:G1=G94n:B@ ## JIS X 0208:1997 C:G2=G94:I ## JIS X 0201 Katakana C:G3=G94n:D ## JIS X 0212-1990 C:designate:*:default=-1 C:option:{Ginvoke_to_left}=[1,0,0,0] C:option:{Ginvoke_by_single_shift}=[0,0,1,1] C:option:{C1invoke_to_right}=1 C:option:{reset}->{Gdesignation}=0 C:option:{reset}->{Ginvoke}=0 C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}] Encode: =>ucs_to_ascii ucs_to_jisx0208_1997_irv ucs_to_jisx0212_1990_irv ucs_to_jisx0201_katakana_hw ->iso2022:C Decode: <-iso2022:C <=jisx0208_1997_irv_to_ucs jisx0212_1990_irv_to_ucs jisx0201_katakana_hw_to_ucs Description: EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese, ISO/IEC 646 IRV + JIS X 0208:1997 + JIS X 0212-1990 + JIS X 0201:1997 Katakana. Some characters defined in JIS X 0208 and all characters defined in JIS X 0201 are mapped to FULLWIDTH or HALFWIDTH characters of UCS as specified by JIS X 0221-1995 and JIS X 0208:1997. This encoding is a "compatible" version of C defined in Encode::ISO2022::EUCJA. } { Name: euc-jisx0213-fullwidth Alias: euc-japan-2000-fullwidth euc-jp-2000-fullwidth Cversion: C:bit=8 C:G0=G94:B ## ASCII C:G1=G94n:O ## JIS X 0213:2000 plane 1 C:G2=G94:I ## JIS X 0201 Katakana C:G3=G94n:P ## JIS X 0213:2000 plane 2 C:designate:*:default=-1 C:option:{Ginvoke_to_left}=[1,0,0,0] C:option:{Ginvoke_by_single_shift}=[0,0,1,1] C:option:{C1invoke_to_right}=1 C:option:{reset}->{Gdesignation}=0 C:option:{reset}->{Ginvoke}=0 C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'O'}] Encode: =>ucs_to_ascii ucs_to_jisx0213_2000_1_irv ucs_to_jisx0213_2000_2 ucs_to_jisx0201_katakana_hw ->iso2022:C Decode: <-iso2022:C <=jisx0213_2000_1_irv_to_ucs jisx0213_2000_2_to_ucs jisx0201_katakana_hw_to_ucs Description: EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese, ISO/IEC 646 IRV + JIS X 0213:2000 + JIS X 0201:1997 Katakana. Some characters defined in JIS X 0213 and all characters defined in JIS X 0201 are mapped to FULLWIDTH or HALFWIDTH characters of UCS as specified by JIS X 0213:2000. This encoding is a "compatible" version of C defined in Encode::ISO2022::EUCJA. } POD:EXAMPLE: use %%MYSELF%%; while (<>) { print "FW-> : ". Encode::encode ('euc-jp-1997', Encode::decode ('euc-jp-1997-fullwidth', $_)); print "FW->FW: ". Encode::encode ('euc-jp-1997-fullwidth', Encode::decode ('euc-jp-1997-fullwidth', $_)); print " ->FW: ". Encode::encode ('euc-jp-1997-fullwidth', Encode::decode ('euc-jp-1997', $_)); print " -> : ". Encode::encode ('euc-jp-1997', Encode::decode ('euc-jp-1997', $_)); } POD:SEE ALSO: %%ReferenceJISX0212_1995%% %%ReferenceJISX0208_1997%% %%ReferenceJISX0213_2000%% L "UI-OSF Application Platform Profile for Japanese Environment Version 1.1", UI-OSF Japanese Localization Group, 1993-05-21. (In Japanese) or (In English). "OSF/JVC Recommended Code Set Conversion Specification between Japanese EUC and Shift-JIS, and Survey on Actual Situation of Japanese Code Sets", OSF/JVC CDE/Motif Technical WG, 1996-01-19. (In Japanese) or (In English). "Problems and Solutions for Unicode and User/Vendor Defined Characters", TOG/JVC CDE/Motif Technical WG, 1996-10-25. (In Japanese) or (In English). POD:LICENSE: Copyright %%YEAR%% Wakaba %%PerlLicense%%