| 1 |
wakaba |
1.1 |
#?PESRC/1.0 |
| 2 |
|
|
Name: |
| 3 |
|
|
SJIS::JIS |
| 4 |
|
|
ShortDescription: |
| 5 |
|
|
The Encode module for shift JIS coding systems |
| 6 |
|
|
Description: |
| 7 |
|
|
This module defines conversion between the Perl internal |
| 8 |
|
|
representation and shift JIS coding systems defined |
| 9 |
|
|
in JIS (Japanese Industrial Standards) standards. |
| 10 |
|
|
|
| 11 |
|
|
{ |
| 12 |
wakaba |
1.2 |
AliasOf: |
| 13 |
|
|
shift_jisx0213 |
| 14 |
|
|
Alias: |
| 15 |
|
|
sjis s-jis x-sjis x_sjis x-sjis-jp shift-jis shiftjis shift.jis x-shiftjis x-shift-jis x-shift_jis |
| 16 |
|
|
Description: |
| 17 |
|
|
"Shift JIS" coding system. |
| 18 |
|
|
|
| 19 |
|
|
Since this name is ambiguous (it can now refer to all or any |
| 20 |
|
|
of the shift JIS coding system family), this name should not |
| 21 |
|
|
be used to address a specific coding system. In this module, |
| 22 |
|
|
this is treated as an alias for the shift JIS with the |
| 23 |
|
|
latest official definition, currently of JIS X 0213:2000 |
| 24 |
|
|
Appendix 1 (with implementation level 4). |
| 25 |
|
|
|
| 26 |
|
|
Note that the name "Shift_JIS" is not associated with |
| 27 |
|
|
this name, because the IANA registry [IANAREG] assigns |
| 28 |
|
|
it to a shift JIS defined by JIS X 0208:1997. |
| 29 |
|
|
} |
| 30 |
|
|
|
| 31 |
|
|
{ |
| 32 |
|
|
AliasOf: |
| 33 |
|
|
shift_jisx0213-ascii |
| 34 |
|
|
Alias: |
| 35 |
|
|
sjis-ascii shift-jis-ascii |
| 36 |
|
|
Description: |
| 37 |
|
|
Same as sjis but with ASCII (ISO/IEC 646 IRV) instead of |
| 38 |
|
|
the JIS X 0201 Roman (or Latin) set. |
| 39 |
|
|
|
| 40 |
|
|
In spite of the history of shift JIS, ASCII is sometimes |
| 41 |
|
|
used instead of the JIS X 0201 Roman set, for compatibility |
| 42 |
|
|
with the ASCII world. |
| 43 |
|
|
} |
| 44 |
|
|
|
| 45 |
|
|
{ |
| 46 |
wakaba |
1.1 |
Name: |
| 47 |
|
|
shift-jis-1997 |
| 48 |
|
|
Alias: |
| 49 |
|
|
shift_jis csshiftjis japanese-shift-jis |
| 50 |
|
|
Cversion: |
| 51 |
|
|
C:bit=8 |
| 52 |
|
|
C:G0=G94:J ## JIS X 0201:1997 Graphic character set for Latin letters |
| 53 |
|
|
C:G1=G94n:B@ ## JIS X 0208:1997 |
| 54 |
|
|
C:G2=G94:I ## JIS X 0201:1997 Graphic character set for Katakana |
| 55 |
wakaba |
1.3 |
C:G3=G94n:~ ## undefined |
| 56 |
wakaba |
1.1 |
C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B', revision => '@'}] |
| 57 |
|
|
Encode: |
| 58 |
|
|
=>ucs_to_jisx0201_latin ucs_to_jisx0208_1997 ucs_to_jisx0201_katakana |
| 59 |
|
|
->sjis:C |
| 60 |
|
|
Decode: |
| 61 |
|
|
<-sjis:C |
| 62 |
|
|
<=jisx0201_latin_to_ucs jisx0208_1997_to_ucs jisx0201_katakana_to_ucs |
| 63 |
|
|
Description: |
| 64 |
|
|
Shift coded representation defined by JIS X 0208:1997 Appendix 1. |
| 65 |
|
|
|
| 66 |
|
|
Note that although the IANA registry [IANAREG] assigns the name "MS_Kanji" |
| 67 |
|
|
as an alias of "Shift_JIS", in this module this name is not aliased |
| 68 |
|
|
for the shift-coded representation of JIS X 0208, since it is more |
| 69 |
|
|
appropriate to associate this name with |
| 70 |
|
|
the Microsoft-defined Shift JIS. (Microsoft's shift JIS is a superset of |
| 71 |
|
|
the "shifted" form of JIS X 0208-1990, so decoding causes no problems, |
| 72 |
|
|
even though MS Shift JIS does not conform to JIS X 0208:1997.) |
| 73 |
|
|
} |
| 74 |
|
|
|
| 75 |
|
|
{ |
| 76 |
|
|
Name: |
| 77 |
|
|
shift-jis-1997-ascii |
| 78 |
|
|
Alias: |
| 79 |
|
|
shift_jis-ascii |
| 80 |
|
|
Cversion: |
| 81 |
|
|
C:bit=8 |
| 82 |
|
|
C:G0=G94:B ## ISO/IEC 646:1991 IRV |
| 83 |
|
|
C:G1=G94n:B@ ## JIS X 0208:1997 |
| 84 |
|
|
C:G2=G94:I ## JIS X 0201:1997 Graphic character set for Katakana |
| 85 |
wakaba |
1.3 |
C:G3=G94n:~ ## undefined |
| 86 |
wakaba |
1.1 |
C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'B', revision => '@'}] |
| 87 |
|
|
Encode: |
| 88 |
|
|
=>ucs_to_ascii ucs_to_jisx0208_1997 ucs_to_jisx0201_katakana |
| 89 |
|
|
->sjis:C |
| 90 |
|
|
Decode: |
| 91 |
|
|
<-sjis:C |
| 92 |
|
|
<=jisx0208_1997_to_ucs jisx0201_katakana_to_ucs |
| 93 |
|
|
Description: |
| 94 |
|
|
Same as shift-jis-1997 but with ASCII (ISO/IEC 646 IRV) |
| 95 |
|
|
instead of the JIS X 0201:1997 Latin character set. |
| 96 |
|
|
|
| 97 |
|
|
Note that this coding system does NOT conform to |
| 98 |
|
|
JIS X 0208:1997 Appendix 1. |
| 99 |
|
|
} |
| 100 |
|
|
|
| 101 |
|
|
{ |
| 102 |
|
|
Name: |
| 103 |
|
|
shift_jisx0213 |
| 104 |
|
|
Alias: |
| 105 |
|
|
japanese-shift-jisx0213 shift-jisx0213 x-shift_jisx0213 shift-jis-3 shift-jis-2000 sjisx0213 |
| 106 |
|
|
Cversion: |
| 107 |
|
|
C:bit=8 |
| 108 |
|
|
C:G0=G94:J ## JIS X 0201:1997 Graphic character set for Latin letters |
| 109 |
|
|
C:G1=G94n:O ## JIS X 0213:2000 plane 1 |
| 110 |
|
|
C:G2=G94:I ## JIS X 0201:1997 Graphic character set for Katakana |
| 111 |
|
|
C:G3=G94n:P ## JIS X 0213:2000 plane 2 |
| 112 |
|
|
C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'O'}] |
| 113 |
|
|
Encode: |
| 114 |
|
|
=>ucs_to_jisx0201_latin ucs_to_jisx0213_2000_1 ucs_to_jisx0213_2000_2 ucs_to_jisx0201_katakana |
| 115 |
|
|
->sjis:C |
| 116 |
|
|
Decode: |
| 117 |
|
|
<-sjis:C |
| 118 |
|
|
<=jisx0201_latin_to_ucs jisx0213_2000_1_to_ucs jisx0213_2000_2_to_ucs jisx0201_katakana_to_ucs |
| 119 |
|
|
Description: |
| 120 |
|
|
Shift_JISX0213 coded representation, defined by |
| 121 |
|
|
JIS X 0213:2000 Appendix 1 (implementation level 4). |
| 122 |
|
|
} |
| 123 |
|
|
|
| 124 |
|
|
{ |
| 125 |
|
|
Name: |
| 126 |
|
|
shift_jisx0213-ascii |
| 127 |
|
|
Alias: |
| 128 |
|
|
shift-jis-2000-ascii |
| 129 |
|
|
Cversion: |
| 130 |
|
|
C:bit=8 |
| 131 |
|
|
C:G0=G94:B ## ISO/IEC 646:1991 IRV |
| 132 |
|
|
C:G1=G94n:O ## JIS X 0213:2000 plane 1 |
| 133 |
|
|
C:G2=G94:I ## JIS X 0201:1997 Graphic character set for Katakana |
| 134 |
|
|
C:G3=G94n:P ## JIS X 0213:2000 plane 2 |
| 135 |
|
|
C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'O'}] |
| 136 |
|
|
Encode: |
| 137 |
|
|
=>ucs_to_ascii ucs_to_jisx0213_2000_1 ucs_to_jisx0213_2000_2 ucs_to_jisx0201_katakana |
| 138 |
|
|
->sjis:C |
| 139 |
|
|
Decode: |
| 140 |
|
|
<-sjis:C |
| 141 |
|
|
<=jisx0213_2000_1_to_ucs jisx0213_2000_2_to_ucs jisx0201_katakana_to_ucs |
| 142 |
|
|
Description: |
| 143 |
|
|
Same as Shift_JISX0213 but using ASCII (ISO/IEC 646 IRV) |
| 144 |
|
|
instead of the JIS X 0201:1997 Latin character set. |
| 145 |
|
|
(Alias: shift-jis-2000-ascii) |
| 146 |
|
|
|
| 147 |
|
|
Note that this coding system does NOT conform to |
| 148 |
|
|
JIS X 0213:2000 Appendix 1. |
| 149 |
|
|
} |
| 150 |
|
|
|
| 151 |
|
|
{ |
| 152 |
|
|
Name: |
| 153 |
|
|
shift_jisx0213-plane1 |
| 154 |
|
|
Cversion: |
| 155 |
|
|
C:bit=8 |
| 156 |
|
|
C:G0=G94:J ## JIS X 0201:1997 Graphic character set for Latin letters |
| 157 |
|
|
C:G1=G94n:O ## JIS X 0213:2000 plane 1 |
| 158 |
|
|
C:G2=G94:I ## JIS X 0201:1997 Graphic character set for Katakana |
| 159 |
wakaba |
1.3 |
C:G3=G94n:~ ## undefined |
| 160 |
wakaba |
1.1 |
C:option:{undef_char}=["\x22\x2E", {type => 'G94n', charset => 'O'}] |
| 161 |
|
|
Encode: |
| 162 |
|
|
=>ucs_to_jisx0201_latin ucs_to_jisx0213_2000_1 ucs_to_jisx0201_katakana |
| 163 |
|
|
->sjis:C |
| 164 |
|
|
Decode: |
| 165 |
|
|
<-sjis:C |
| 166 |
|
|
<=jisx0201_latin_to_ucs jisx0213_2000_1_to_ucs jisx0201_katakana_to_ucs |
| 167 |
|
|
Description: |
| 168 |
|
|
Shift_JISX0213-plane1 coded representation defined by JIS X 0213:2000 |
| 169 |
|
|
Appendix 1 (i.e. implementation level 3). |
| 170 |
|
|
} |
| 171 |
|
|
|
| 172 |
|
|
POD:ENCODING:POSTAMBLE: |
| 173 |
|
|
Shift JISes using alternative character names defined |
| 174 |
|
|
by JIS X 0208:1997 or JIS X 0213:2000 (so some characters |
| 175 |
|
|
are mapped to the HALFWIDTH or FULLWIDTH area of UCS) are not |
| 176 |
|
|
included in this module but are provided by L<Encode::SJIS::JISCompatible>. |
| 177 |
|
|
|
| 178 |
|
|
Shift JISes with JIS X 0208-1978, -1983, -1990 are not |
| 179 |
|
|
included in this module since these standards did NOT |
| 180 |
|
|
define them (but some vendors did implement them as non-standard |
| 181 |
|
|
encodings). They are defined in Encode::SJIS::* modules |
| 182 |
|
|
other than this one. |
| 183 |
|
|
|
| 184 |
|
|
Although they are Shift JISes defined in a *JIS* TR, the Shift JIS |
| 185 |
|
|
variants defined by JIS TR X 0015 "XML Japanese profile" |
| 186 |
|
|
are not supported by this module, but by other Encode::SJIS::* |
| 187 |
|
|
modules. The Shift JISes that are given names by JIS TR X 0015 |
| 188 |
|
|
are all vendor-defined variants of Shift JIS and are NOT part |
| 189 |
|
|
of the formal JIS coded character set standards, NOR do they conform to |
| 190 |
|
|
those standards. (It might sound strange that a JIS TR defines |
| 191 |
|
|
names for encodings clearly not conforming to JIS standards. |
| 192 |
|
|
The JIS TR "standardization" process is much looser than that of JIS standards.) |
| 193 |
|
|
|
| 194 |
|
|
POD:SEE ALSO: |
| 195 |
|
|
%%ReferenceJISX0201_1997%% |
| 196 |
|
|
|
| 197 |
|
|
%%ReferenceJISX0208_1997%% |
| 198 |
|
|
|
| 199 |
|
|
%%ReferenceJISX0213_2000%% |
| 200 |
|
|
|
| 201 |
|
|
L<Encode::SJIS> |
| 202 |
|
|
|
| 203 |
|
|
%%ReferenceIANAREG%% |
| 204 |
|
|
|
| 205 |
|
|
POD:LICENSE: |
| 206 |
|
|
Copyright %%YEAR%% Nanashi-san <nanashi-san@nanashi.invalid> |
| 207 |
|
|
|
| 208 |
|
|
%%PerlLicense%% |