#?PESRC/1.0
Name:
ISO2022::JUNETCompatible
ShortDescription:
An Encode module for 7-bit ISO/IEC 2022
based compatible coding systems for Japanese
POD:DESCRIPTION:
For historical reasons, some current
implementations of JIS coded character sets support
pairs of identical characters in duplicate, distinguished
only by a halfwidth/fullwidth property.

Although this situation is undesirable, we can't ignore
such historical implementations, and the latest JIS coded
character set standards allow such duplicate
encoding solely to "keep compatibility with
current practice."

This module provides an encoder and decoder for
coding systems that conform to JIS and that are based
on the 7-bit ISO/IEC 2022 structure.

Those coding systems SHOULD NOT be used for new
implementations or new data. They may not conform
to future versions of JIS or other standards.

{
Name:
iso-2022-jp-fullwidth
Cversion:
C:bit=7
C:G0=G94:B
C:designate:*:default=-1
C:designate:G94:B=0 ## ASCII
C:designate:G94:J=0 ## JIS X 0201 Roman
C:designate:G94n:@=0 ## JIS X 0208-1978
C:designate:G94n:B=0 ## JIS X 0208-1983
C:designate:G94n:B@=0 ## JIS X 0208-1990
C:option:{use_revision}=0
C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B'}]
Encode:Prepare:
C:GR=undef
C:C1=C1:~
C:G1=G96:~
Encode:
=>ucs_to_ascii ucs_to_jisx0208_1983_irv ucs_to_jisx0208_1978_irv ucs_to_jisx0201_latin
->iso2022:C
Decode:
<-iso2022:C
<=jisx0208_1983_irv_to_ucs jisx0208_1978_irv_to_ucs jisx0201_latin_to_ucs jisx0201_katakana_to_ucs jisx0212_1990_irv_to_ucs jisx0208_1997_irv_to_ucs
Description:
ISO/IEC 2022 based 7-bit encoding for Japanese,
ASCII + JIS X 0201 + JIS X 0208-1978 + JIS X 0208-1983.
Some characters defined in JIS X 0208 are mapped to the FULLWIDTH
area of UCS as specified in JIS X 0208:1997.

This encoding is a "compatible" version of
C<iso-2022-jp> defined in Encode::ISO2022::JUNET.

When decoding, mapping tables from the coded character
sets listed below to UCS are also loaded so that
incorrectly labeled data can be restored.

JIS X 0201 Katakana coded character set,
JIS X 0212-1990, JIS X 0213:2000

Note that for Windows users, Encode::ISO2022::CP932
may be useful when trying to restore broken "ISO-2022-JP"
data.
}

{
Name:
iso-2022-jp-3-fullwidth
Cversion:
C:bit=7
C:G0=G94:B
C:designate:*:default=-1
C:designate:G94:B=0 ## ASCII
#C:designate:G94n:B=0 ## JIS X 0213:2000 plane 1
C:designate:G94n:O=0 ## JIS X 0213:2000 plane 1
C:designate:G94n:P=0 ## JIS X 0213:2000 plane 2
C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'O'}]
Encode:Prepare:
C:GR=undef
C:C1=C1:~
C:G1=G96:~
Encode:
=>ucs_to_ascii ucs_to_jisx0213_2000_1_irv ucs_to_jisx0213_2000_2
->iso2022:C
Decode:
<-iso2022:C
<=jisx0213_2000_1_irv_to_ucs jisx0213_2000_2_to_ucs jisx0208_1997_irv_to_ucs
Description:
ISO/IEC 2022 based 7-bit encoding for Japanese,
ISO/IEC 646 IRV + JIS X 0213:2000. Some characters
defined in JIS X 0213 are mapped to the FULLWIDTH
area of UCS as specified in JIS X 0213:2000.

This encoding is a "compatible" version of
C<iso-2022-jp-3> defined in Encode::ISO2022::JUNET.
}

POD:SEE ALSO:
%%ReferenceJISX0212_1995%%

%%ReferenceJISX0208_1997%%

%%ReferenceJISX0213_2000%%

%%ReferenceRFC1468%%

L<Encode::ISO2022::JUNET>

L<Encode::ISO2022::EightBit>, L<Encode::ISO2022::EUCJACompatible>,
L<Encode::SJIS::JISCompatible>

POD:LICENSE:
Copyright %%YEAR%% Wakaba <w@suika.fam.cx>

%%PerlLicense%%