Contents of /perl/lib/Encode/ISO2022/EUCJACompatible.esr

Revision 1.1
Wed Nov 6 09:29:16 2002 UTC (22 years, 1 month ago) by wakaba
Branch: MAIN
CVS Tags: HEAD
2002-11-06  Wakaba <w@suika.fam.cx>

	* SevenBit.esr, EightBit.esr, JUNET.esr,
	JUNETCompatible.esr, EUCJA.esr, EUCJACompatible.esr,
	EUCKR.esr, EUCZH.esr: New files.
	* SevenBit.pm, EightBit.pm: Removed.
	(Now these modules are auto-generated from *.esr files.)

1 wakaba 1.1 #?PESRC/1.0
2     Name:
3     ISO2022::EUCJACompatible
4     ShortDescription:
5     An Encode module of EUC (8-bit ISO/IEC 2022
6     based compatible coding system) for Japanese
7     POD:DESCRIPTION:
8     For historical reasons, some current
9     implementations of JIS coded character sets encode
10     certain characters twice, as pairs distinguished only
11     by a halfwidth/fullwidth property.
12    
13     Although this situation is undesirable, we can't ignore
14     such historical implementations, and the latest JIS coded
15     character set standards permit such duplicate
16     encoding solely to "keep compatibility with
17     current practice."
18    
19     This module provides encoders and decoders for those
20     coding systems that conform to JIS and are based
21     on the EUC (8-bit ISO/IEC 2022) structure.
22    
23     These coding systems SHOULD NOT be used for new
24     implementations or new data. They may not conform
25     to future versions of JIS or other standards.
26    
27     {
28     Name:
29     euc-japan-1990-fullwidth
30     Alias:
31     euc-jp-1990-fullwidth ajec
32     Cversion:
33     C:bit=8
34     C:G0=G94:B ## ASCII
35     C:G1=G94n:B@ ## JIS X 0208-1990
36     C:G2=G94:I ## JIS X 0201 Katakana
37     C:G3=G94n:D ## JIS X 0212-1990
38     C:designate:*:default=-1
39     C:option:{Ginvoke_to_left}=[1,0,0,0]
40     C:option:{Ginvoke_by_single_shift}=[0,0,1,1]
41     C:option:{C1invoke_to_right}=1
42     C:option:{reset}->{Gdesignation}=0
43     C:option:{reset}->{Ginvoke}=0
44     C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}]
45     Encode:
46     =>ucs_to_ascii ucs_to_jisx0208_1990_open_ascii ucs_to_jisx0212_1990_irv ucs_to_jisx0201_katakana_hw
47     ->iso2022:C
48     Decode:
49     <-iso2022:C
50     <=jisx0208_1990_open_ascii_to_ucs jisx0212_1990_irv_to_ucs jisx0201_katakana_hw_to_ucs
51     Description:
52     EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese,
53     ISO/IEC 646 IRV + JIS X 0208-1990 + JIS X 0212-1990
54     + JIS X 0201-1990 Katakana. Some characters defined
55     in JIS X 0208 and all characters defined in JIS X 0201
56     are mapped to FULLWIDTH or HALFWIDTH characters of UCS
57     as specified by JIS X 0221-1995 and JIS X 0208:1997.
58    
59     This encoding is the same as C<euc-japan-1997-fullwidth>
60     except that it has private use areas, which are
61     described in JIS X 0208-1990 and JIS X 0212-1990
62     (as the "free area") and defined in AJEC. (See "UI-OSF
63     Application Platform Profile for Japanese Environment".)
64     Their mapping to the UCS private use area is defined by
65     the CDE/Motif WG.
66    
67     Although the 1990 versions of the JIS coded character set
68     standards allow a "free area", the current versions do not.
69     This encoding should be used only for compatibility purposes.
70    
71     This encoding is a "compatible" version of
72     C<euc-japan-1990> defined in Encode::ISO2022::EUCJA.
73     }
74    
75     {
76     Name:
77     eucJP-open
78     Alias:
79     eucJP-ascii
80     Cversion:
81     C:bit=8
82     C:G0=G94:B ## ASCII
83     C:G1=G94n:B@ ## JIS X 0208-1990
84     C:G2=G94:I ## JIS X 0201 Katakana
85     C:G3=G94n:D ## JIS X 0212-1990
86     C:designate:*:default=-1
87     C:option:{Ginvoke_to_left}=[1,0,0,0]
88     C:option:{Ginvoke_by_single_shift}=[0,0,1,1]
89     C:option:{C1invoke_to_right}=1
90     C:option:{reset}->{Gdesignation}=0
91     C:option:{reset}->{Ginvoke}=0
92     C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}]
93     Encode:
94     =>ucs_to_ascii ucs_to_jisx0208_1990_open_ascii ucs_to_jisx0212_1990_open_ascii ucs_to_jisx0201_katakana_hw
95     ->iso2022:C
96     Decode:
97     <-iso2022:C
98     <=jisx0208_1990_open_ascii_to_ucs jisx0212_1990_open_ascii_to_ucs jisx0201_katakana_hw_to_ucs
99     Description:
100     EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese.
101    
102     This encoding is the same as C<euc-japan-1990-fullwidth>
103     except that it also has the IBM extended characters,
104     as defined by the CDE/Motif WG.
105    
106     Although the 1990 versions of the JIS coded character set
107     standards allow a "free area", the current versions do not.
108     This encoding should be used only for compatibility purposes.
109     }
110    
111     {
112     Name:
113     eucJP-0201
114     Cversion:
115     C:bit=8
116     C:G0=G94:J ## JIS X 0201 Roman
117     C:G1=G94n:B@ ## JIS X 0208-1990
118     C:G2=G94:I ## JIS X 0201 Katakana
119     C:G3=G94n:D ## JIS X 0212-1990
120     C:designate:*:default=-1
121     C:option:{Ginvoke_to_left}=[1,0,0,0]
122     C:option:{Ginvoke_by_single_shift}=[0,0,1,1]
123     C:option:{C1invoke_to_right}=1
124     C:option:{reset}->{Gdesignation}=0
125     C:option:{reset}->{Ginvoke}=0
126     C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}]
127     Encode:
128     =>ucs_to_jisx0201_latin ucs_to_jisx0208_1990_open_0201 ucs_to_jisx0212_1990_open_0201 ucs_to_jisx0201_katakana_hw
129     ->iso2022:C
130     Decode:
131     <-iso2022:C
132     <=jisx0201_latin_to_ucs jisx0208_1990_open_0201_to_ucs jisx0212_1990_open_0201_to_ucs jisx0201_katakana_hw_to_ucs
133     Description:
134     ISO/IEC 2022 based 8-bit encoding for Japanese.
135    
136     This encoding is the same as C<eucJP-ascii> but uses the JIS X 0201
137     Roman set instead of ASCII, as defined by the CDE/Motif WG.
138    
139     EUC is defined to use ASCII, but this coding system uses
140     JIS X 0201 Roman, so it SHOULD NOT be
141     used except for compatibility purposes.
142     }
143    
144     {
145     Name:
146     euc-japan-1997-fullwidth
147     Alias:
148     euc-jp-1997-fullwidth
149     Cversion:
150     C:bit=8
151     C:G0=G94:B ## ASCII
152     C:G1=G94n:B@ ## JIS X 0208:1997
153     C:G2=G94:I ## JIS X 0201 Katakana
154     C:G3=G94n:D ## JIS X 0212-1990
155     C:designate:*:default=-1
156     C:option:{Ginvoke_to_left}=[1,0,0,0]
157     C:option:{Ginvoke_by_single_shift}=[0,0,1,1]
158     C:option:{C1invoke_to_right}=1
159     C:option:{reset}->{Gdesignation}=0
160     C:option:{reset}->{Ginvoke}=0
161     C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B@'}]
162     Encode:
163     =>ucs_to_ascii ucs_to_jisx0208_1997_irv ucs_to_jisx0212_1990_irv ucs_to_jisx0201_katakana_hw
164     ->iso2022:C
165     Decode:
166     <-iso2022:C
167     <=jisx0208_1997_irv_to_ucs jisx0212_1990_irv_to_ucs jisx0201_katakana_hw_to_ucs
168     Description:
169     EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese,
170     ISO/IEC 646 IRV + JIS X 0208:1997 + JIS X 0212-1990
171     + JIS X 0201:1997 Katakana. Some characters defined
172     in JIS X 0208 and all characters defined in JIS X 0201
173     are mapped to FULLWIDTH or HALFWIDTH characters of UCS
174     as specified by JIS X 0221-1995 and JIS X 0208:1997.
175    
176     This encoding is a "compatible" version of
177     C<euc-japan-1997> defined in Encode::ISO2022::EUCJA.
178     }
179    
180     {
181     Name:
182     euc-jisx0213-fullwidth
183     Alias:
184     euc-japan-2000-fullwidth euc-jp-2000-fullwidth
185     Cversion:
186     C:bit=8
187     C:G0=G94:B ## ASCII
188     C:G1=G94n:O ## JIS X 0213:2000 plane 1
189     C:G2=G94:I ## JIS X 0201 Katakana
190     C:G3=G94n:P ## JIS X 0213:2000 plane 2
191     C:designate:*:default=-1
192     C:option:{Ginvoke_to_left}=[1,0,0,0]
193     C:option:{Ginvoke_by_single_shift}=[0,0,1,1]
194     C:option:{C1invoke_to_right}=1
195     C:option:{reset}->{Gdesignation}=0
196     C:option:{reset}->{Ginvoke}=0
197     C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'O'}]
198     Encode:
199     =>ucs_to_ascii ucs_to_jisx0213_2000_1_irv ucs_to_jisx0213_2000_2 ucs_to_jisx0201_katakana_hw
200     ->iso2022:C
201     Decode:
202     <-iso2022:C
203     <=jisx0213_2000_1_irv_to_ucs jisx0213_2000_2_to_ucs jisx0201_katakana_hw_to_ucs
204     Description:
205     EUC (ISO/IEC 2022 based 8-bit encoding) for Japanese,
206     ISO/IEC 646 IRV + JIS X 0213:2000 + JIS X 0201:1997
207     Katakana. Some characters defined in JIS X 0213 and
208     all characters defined in JIS X 0201 are mapped to
209     FULLWIDTH or HALFWIDTH characters of UCS as specified
210     by JIS X 0213:2000.
211    
212     This encoding is a "compatible" version of
213     C<euc-jisx0213> defined in Encode::ISO2022::EUCJA.
214     }
215    
216     POD:EXAMPLE:
217     use %%MYSELF%%;
218     while (<>) {
219     print "FW-> : ". Encode::encode ('euc-jp-1997', Encode::decode ('euc-jp-1997-fullwidth', $_));
220     print "FW->FW: ". Encode::encode ('euc-jp-1997-fullwidth', Encode::decode ('euc-jp-1997-fullwidth', $_));
221     print " ->FW: ". Encode::encode ('euc-jp-1997-fullwidth', Encode::decode ('euc-jp-1997', $_));
222     print " -> : ". Encode::encode ('euc-jp-1997', Encode::decode ('euc-jp-1997', $_));
223     }
224    
225     POD:SEE ALSO:
226     %%ReferenceJISX0212_1995%%
227    
228     %%ReferenceJISX0208_1997%%
229    
230     %%ReferenceJISX0213_2000%%
231    
232     L<Encode::ISO2022::EUCJA>
233    
234     "UI-OSF Application Platform Profile for Japanese Environment
235     Version 1.1", UI-OSF Japanese Localization Group, 1993-05-21.
236     <http://www.li18nux.org/~numa/uocjle-a4.pdf> (In Japanese)
237     or <http://www.li18nux.org/~numa/uocjleE.pdf> (In English).
238    
239     "OSF/JVC Recommended Code Set Conversion Specification
240     between Japanese EUC and Shift-JIS, and
241     Survey on Actual Situation of Japanese Code Sets",
242     OSF/JVC CDE/Motif Technical WG, 1996-01-19.
243     <http://www.opengroup.or.jp/jvc/cde/sjis-euc.html> (In Japanese)
244     or <http://www.opengroup.or.jp/jvc/cde/sjis-euc-e.html> (In English).
245    
246     "Problems and Solutions for Unicode and User/Vendor Defined Characters",
247     TOG/JVC CDE/Motif Technical WG, 1996-10-25.
248     <http://www.opengroup.or.jp/jvc/cde/ucs-conv.html> (In Japanese)
249     or <http://www.opengroup.or.jp/jvc/cde/ucs-conv-e.html> (In English).
250    
251     POD:LICENSE:
252     Copyright %%YEAR%% Wakaba <w@suika.fam.cx>
253    
254     %%PerlLicense%%
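
The five definitions above share the same EUC layout in their Cversion blocks: G0 is a 94-character set read from the left (7-bit) half, G1 is a multibyte set read from the right (8-bit) half, and G2 and G3 are reached through single shifts. The following is a minimal Perl sketch of how that layout appears at the octet level. It assumes the conventional EUC reading of those options (SS2 = 0x8E selects G2 for one character, SS3 = 0x8F selects G3 for one character); `classify_euc` is a hypothetical helper written for illustration only and is not part of Encode::ISO2022.

```perl
use strict;
use warnings;

# Split an EUC-style octet string into [G-set, octets] pairs.
# Hypothetical helper for illustration; it does no charset lookup
# and does not validate trailing octets.
sub classify_euc {
    my ($octets) = @_;
    my @out;
    my $i   = 0;
    my $len = length $octets;
    while ($i < $len) {
        my $b = ord substr($octets, $i, 1);
        my ($set, $n);
        if    ($b == 0x8E)               { ($set, $n) = ('G2', 2) }  # SS2 + 1 octet
        elsif ($b == 0x8F)               { ($set, $n) = ('G3', 3) }  # SS3 + 2 octets
        elsif ($b >= 0xA1 && $b <= 0xFE) { ($set, $n) = ('G1', 2) }  # 2-octet set in the right half
        elsif ($b < 0x80)                { ($set, $n) = ('G0', 1) }  # left half (incl. C0 controls)
        else                             { ($set, $n) = ('??', 1) }  # other C1 or invalid octet
        push @out, [$set, substr($octets, $i, $n)];
        $i += $n;
    }
    return @out;
}

# "A" (G0), HIRAGANA LETTER A (G1, A4 A2),
# halfwidth KATAKANA LETTER A (G2, SS2 + B1)
my $sample = "\x41\xA4\xA2\x8E\xB1";
for my $tok (classify_euc($sample)) {
    printf "%s: %s\n", $tok->[0],
        join ' ', map { sprintf '%02X', ord } split //, $tok->[1];
}
```

The sketch only identifies which G-set each octet sequence belongs to. Whether a G0 or G2 character is then mapped to a HALFWIDTH or FULLWIDTH character of UCS is decided by the mapping tables named in the Encode: and Decode: lines, which is where the "compatible" encodings differ from those in Encode::ISO2022::EUCJA.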
