Encode/ISO2022/EUCJACompatible.esr

#?PESRC/1.0
Name:
        ISO2022::EUCJACompatible
ShortDescription:
        An Encode module for compatibility variants of
        Japanese EUC (8-bit ISO/IEC 2022 based coding system)
POD:DESCRIPTION:
        For historical reasons, some current implementations
        of JIS coded character sets encode the same character
        twice, as a pair distinguished only by a
        halfwidth/fullwidth property.
        
        Although this situation is undesirable, such historical
        implementations cannot be ignored, and the latest JIS
        coded character set standards admit such duplicate
        encoding solely to "keep compatibility with
        current practice."
        
        This module provides encoders and decoders for such
        coding systems that conform to JIS and are based
        on the EUC (8-bit ISO/IEC 2022) structure.
        
        Those coding systems SHOULD NOT be used for new
        implementations or new data.  They may not conform
        to future versions of JIS or other standards.

{
Name:
        euc-japan-1990-fullwidth
Alias:
        euc-jp-1990-fullwidth ajec
Cversion:
        C:bit=8
        C:G0=G94:B      ## ASCII
        C:G1=G94n:B@    ## JIS X 0208-1990
        C:G2=G94:I      ## JIS X 0201 Katakana
        C:G3=G94n:D     ## JIS X 0212-1990
        C:designate:*:default=-1
        C:option:{Ginvoke_to_left}=[1,0,0,0]
        C:option:{Ginvoke_by_single_shift}=[0,0,1,1]
        C:option:{C1invoke_to_right}=1
        C:option:{reset}->{Gdesignation}=0
        C:option:{reset}->{Ginvoke}=0
        C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}]
Encode:
        =>ucs_to_ascii ucs_to_jisx0208_1990_open_ascii ucs_to_jisx0212_1990_irv ucs_to_jisx0201_katakana_hw
        ->iso2022:C
Decode:
        <-iso2022:C
        <=jisx0208_1990_open_ascii_to_ucs jisx0212_1990_irv_to_ucs jisx0201_katakana_hw_to_ucs
Description:
        EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese,
        ISO/IEC 646 IRV + JIS X 0208-1990 + JIS X 0212-1990
        + JIS X 0201-1990 Katakana.  Some characters defined
        in JIS X 0208 and all characters defined in JIS X 0201
        are mapped to FULLWIDTH or HALFWIDTH characters of UCS
        as specified by JIS X 0221-1995 and JIS X 0208:1997.
        
        This encoding is the same as C<euc-japan-1997-fullwidth>
        except that it has private use areas, as
        described in JIS X 0208-1990 and JIS X 0212-1990
        (as the "free area") and defined in AJEC.  (See "UI-OSF
        Application Platform Profile for Japanese Environment".)
        Their mapping to the UCS private use area is defined by
        the CDE/Motif WG.
        
        Although the 1990 versions of the JIS coded character set
        standards allow a "free area", the current versions do not.
        This encoding should be used only for compatibility purposes.
        
        This encoding is a "compatible" version of
        C<euc-japan-1990> defined in Encode::ISO2022::EUCJA.
}

{
Name:
        eucJP-open
Alias:
        eucJP-ascii
Cversion:
        C:bit=8
        C:G0=G94:B      ## ASCII
        C:G1=G94n:B@    ## JIS X 0208-1990
        C:G2=G94:I      ## JIS X 0201 Katakana
        C:G3=G94n:D     ## JIS X 0212-1990
        C:designate:*:default=-1
        C:option:{Ginvoke_to_left}=[1,0,0,0]
        C:option:{Ginvoke_by_single_shift}=[0,0,1,1]
        C:option:{C1invoke_to_right}=1
        C:option:{reset}->{Gdesignation}=0
        C:option:{reset}->{Ginvoke}=0
        C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}]
Encode:
        =>ucs_to_ascii ucs_to_jisx0208_1990_open_ascii ucs_to_jisx0212_1990_open_ascii ucs_to_jisx0201_katakana_hw
        ->iso2022:C
Decode:
        <-iso2022:C
        <=jisx0208_1990_open_ascii_to_ucs jisx0212_1990_open_ascii_to_ucs jisx0201_katakana_hw_to_ucs
Description:
        EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese.
        
        This encoding is the same as C<euc-japan-1990-fullwidth>
        except that it has the IBM extended characters
        as defined by the CDE/Motif WG.
        
        Although the 1990 versions of the JIS coded character set
        standards allow a "free area", the current versions do not.
        This encoding should be used only for compatibility purposes.
}

{
Name:
        eucJP-0201
Cversion:
        C:bit=8
        C:G0=G94:J      ## JIS X 0201 Roman
        C:G1=G94n:B@    ## JIS X 0208-1990
        C:G2=G94:I      ## JIS X 0201 Katakana
        C:G3=G94n:D     ## JIS X 0212-1990
        C:designate:*:default=-1
        C:option:{Ginvoke_to_left}=[1,0,0,0]
        C:option:{Ginvoke_by_single_shift}=[0,0,1,1]
        C:option:{C1invoke_to_right}=1
        C:option:{reset}->{Gdesignation}=0
        C:option:{reset}->{Ginvoke}=0
        C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}]
Encode:
        =>ucs_to_jisx0201_latin ucs_to_jisx0208_1990_open_0201 ucs_to_jisx0212_1990_open_0201 ucs_to_jisx0201_katakana_hw
        ->iso2022:C
Decode:
        <-iso2022:C
        <=jisx0201_latin_to_ucs jisx0208_1990_open_0201_to_ucs jisx0212_1990_open_0201_to_ucs jisx0201_katakana_hw_to_ucs
Description:
        ISO/IEC 2022 based 8-bit encoding for Japanese.
        
        This encoding is the same as C<eucJP-ascii> but uses the
        JIS X 0201 Roman set instead of ASCII, as defined by the
        CDE/Motif WG.
        
        EUC is defined to use ASCII but this coding system uses
        JIS X 0201 Roman, so this coding system SHOULD NOT be
        used except for compatibility purposes.
}

{
Name:
        euc-japan-1997-fullwidth
Alias:
        euc-jp-1997-fullwidth
Cversion:
        C:bit=8
        C:G0=G94:B      ## ASCII
        C:G1=G94n:B@    ## JIS X 0208:1997
        C:G2=G94:I      ## JIS X 0201 Katakana
        C:G3=G94n:D     ## JIS X 0212-1990
        C:designate:*:default=-1
        C:option:{Ginvoke_to_left}=[1,0,0,0]
        C:option:{Ginvoke_by_single_shift}=[0,0,1,1]
        C:option:{C1invoke_to_right}=1
        C:option:{reset}->{Gdesignation}=0
        C:option:{reset}->{Ginvoke}=0
        C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}]
Encode:
        =>ucs_to_ascii ucs_to_jisx0208_1997_irv ucs_to_jisx0212_1990_irv ucs_to_jisx0201_katakana_hw
        ->iso2022:C
Decode:
        <-iso2022:C
        <=jisx0208_1997_irv_to_ucs jisx0212_1990_irv_to_ucs jisx0201_katakana_hw_to_ucs
Description:
        EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese,
        ISO/IEC 646 IRV + JIS X 0208:1997 + JIS X 0212-1990
        + JIS X 0201:1997 Katakana.  Some characters defined
        in JIS X 0208 and all characters defined in JIS X 0201
        are mapped to FULLWIDTH or HALFWIDTH characters of UCS
        as specified by JIS X 0221-1995 and JIS X 0208:1997.
        
        This encoding is a "compatible" version of
        C<euc-japan-1997> defined in Encode::ISO2022::EUCJA.
}

{
Name:
        euc-jisx0213-fullwidth
Alias:
        euc-japan-2000-fullwidth euc-jp-2000-fullwidth
Cversion:
        C:bit=8
        C:G0=G94:B      ## ASCII
        C:G1=G94n:O     ## JIS X 0213:2000 plane 1
        C:G2=G94:I      ## JIS X 0201 Katakana
        C:G3=G94n:P     ## JIS X 0213:2000 plane 2
        C:designate:*:default=-1
        C:option:{Ginvoke_to_left}=[1,0,0,0]
        C:option:{Ginvoke_by_single_shift}=[0,0,1,1]
        C:option:{C1invoke_to_right}=1
        C:option:{reset}->{Gdesignation}=0
        C:option:{reset}->{Ginvoke}=0
        C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'O'}]
Encode:
        =>ucs_to_ascii ucs_to_jisx0213_2000_1_irv ucs_to_jisx0213_2000_2 ucs_to_jisx0201_katakana_hw
        ->iso2022:C
Decode:
        <-iso2022:C
        <=jisx0213_2000_1_irv_to_ucs jisx0213_2000_2_to_ucs jisx0201_katakana_hw_to_ucs
Description:
        EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese,
        ISO/IEC 646 IRV + JIS X 0213:2000 + JIS X 0201:1997
        Katakana.  Some characters defined in JIS X 0213 and
        all characters defined in JIS X 0201 are mapped to
        FULLWIDTH or HALFWIDTH characters of UCS as specified
        by JIS X 0213:2000.
        
        This encoding is a "compatible" version of
        C<euc-jisx0213> defined in Encode::ISO2022::EUCJA.
}

POD:EXAMPLE:
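          use %%MYSELF%%;
          ## Convert each input line in all four combinations of the
          ## plain and "-fullwidth" variants (FW = fullwidth variant).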
          while (<>) {
            print "FW->  : ". Encode::encode ('euc-jp-1997', Encode::decode ('euc-jp-1997-fullwidth', $_));
            print "FW->FW: ". Encode::encode ('euc-jp-1997-fullwidth', Encode::decode ('euc-jp-1997-fullwidth', $_));
            print "  ->FW: ". Encode::encode ('euc-jp-1997-fullwidth', Encode::decode ('euc-jp-1997', $_));
            print "  ->  : ". Encode::encode ('euc-jp-1997', Encode::decode ('euc-jp-1997', $_));
          }
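        
        The EUC structure shared by the coding systems above
        (G0 in the left half, G1 in the right half, G2 and G3
        reached through the single shifts SS2 and SS3) can be
        sketched without this module.  The following is only an
        illustration and is not the decoder this module uses;
        C<split_euc> is a hypothetical helper name, and treating
        every octet below 0x80 as G0 is a simplification.
        
          use strict;
          use warnings;
          
          ## Hypothetical helper: split an EUC octet string into
          ## [G-set, octets] pairs.  Any other octet (for example
          ## a C1 control other than SS2/SS3) is reported as an error.
          sub split_euc {
            my ($octets) = @_;
            my @chars;
            pos ($octets) = 0;
            while ($octets =~ /\G(?:([\x00-\x7F])
                                  |\x8E([\xA1-\xFE])
                                  |\x8F([\xA1-\xFE]{2})
                                  |([\xA1-\xFE]{2}))/gcx) {
              push @chars, defined $1 ? ['G0', $1]   ## left half (incl. C0, SP, DEL)
                         : defined $2 ? ['G2', $2]   ## one octet after SS2 (0x8E)
                         : defined $3 ? ['G3', $3]   ## two octets after SS3 (0x8F)
                         :              ['G1', $4];  ## two octets in the right half
            }
            die "Malformed EUC at offset " . pos ($octets) . "\n"
              if pos ($octets) < length $octets;
            return @chars;
          }
          
          my @c = split_euc ("A\xA4\xA2\x8E\xB1\x8F\xB0\xA1");
          print join (' ', map { $_->[0] } @c), "\n";   ## G0 G1 G2 G3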

POD:SEE ALSO:
        %%ReferenceJISX0212_1995%%
        
        %%ReferenceJISX0208_1997%%
        
        %%ReferenceJISX0213_2000%%
        
        L<Encode::ISO2022::EUCJA>
        
        "UI-OSF Application Platform Profile for Japanese Environment
        Version 1.1", UI-OSF Japanese Localization Group, 1993-05-21.
        <http://www.li18nux.org/~numa/uocjle-a4.pdf> (In Japanese)
        or <http://www.li18nux.org/~numa/uocjleE.pdf> (In English).
        
        "OSF/JVC Recommended Code Set Conversion Specification
        between Japanese EUC and Shift-JIS, and
        Survey on Actual Situation of Japanese Code Sets",
        OSF/JVC CDE/Motif Technical WG, 1996-01-19.
        <http://www.opengroup.or.jp/jvc/cde/sjis-euc.html> (In Japanese)
        or <http://www.opengroup.or.jp/jvc/cde/sjis-euc-e.html> (In English).
        
        "Problems and Solutions for Unicode and User/Vendor Defined Characters",
        TOG/JVC CDE/Motif Technical WG, 1996-10-25.
        <http://www.opengroup.or.jp/jvc/cde/ucs-conv.html> (In Japanese)
        or <http://www.opengroup.or.jp/jvc/cde/ucs-conv-e.html> (In English).

POD:LICENSE:
        Copyright %%YEAR%% Wakaba <w@suika.fam.cx>
        
        %%PerlLicense%%