Contents of /www/2005/pre-id/iso-2022-jp.txt

Revision 1.1, Fri Mar 11 05:36:57 2005 UTC, by wakaba

1 * Note:
2 * The name is likely to become "ISO-2022-JP" rather than "junet-code".
3 --
4
5 Network Working Group [authors]
6 Internet Draft [organizations]
7 10th April 1992
8
9
10 JUNET Japanese Character Encoding for Internet Messages
11
12
13 Status of this Memo
14
15 This draft document will be submitted to the RFC editor as an
16 informational document. This is a working document only; it should
17 neither be cited nor quoted in any formal document. This document
18 will expire before 10th October 1992. Distribution of this memo is
19 unlimited. Please send comments to net-char@sra.co.jp.
20
21
22 Introduction
23
24 This document describes the encoding used in plain text electronic
25 mail and network news in several Japanese networks. It was first
26 specified by and used in JUNET [JUNET]. The encoding is now also
27 widely used in Japanese IP communities and the Japanese BITNET
28 community (BITNETJP).
29
30 This document provides a name for the encoding which is intended to
31 be used in the "charset" parameter field of MIME [MIME] messages.
32
33 This document only describes the encoding of plain text. The encoding
34 of other subtypes of text, such as rich text, is not discussed here.
35
36
37 Informal Description
38
39 The message body starts in ASCII, and switches to Japanese characters
40 through an escape sequence. For example, the escape sequence ESC $ B
41 (three bytes) indicates that the bytes following this escape sequence
42 are Japanese characters, which are encoded in two bytes each. To
43 switch back to ASCII, the escape sequence ESC ( B is used.
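
As an illustration of the switching described above, the following is a
minimal sketch in Python. It assumes that Python's standard "iso2022_jp"
codec emits the same escape sequences as the encoding described here; the
codec name is Python's and is not defined by this document. The sample
string and variable names are illustrative only.

```python
ESC = b"\x1b"

# Mixed ASCII and Japanese sample text (three Kanji written as escapes
# so that this source stays in ASCII).
text = "abc \u65e5\u672c\u8a9e xyz"

# Assumption: Python's "iso2022_jp" codec matches the encoding in this memo.
encoded = text.encode("iso2022_jp")

# The Japanese run is bracketed by ESC $ B ... ESC ( B.
start = encoded.index(ESC + b"$B")
end = encoded.index(ESC + b"(B")
kanji_bytes = encoded[start + 3:end]

print(encoded)
print(len(kanji_bytes))  # number of bytes used by the three Kanji

# Every byte stays within the 7-bit range.
assert all(b < 0x80 for b in encoded)
# Decoding restores the original text.
assert encoded.decode("iso2022_jp") == text
```

The byte count printed for the Japanese run can be compared against the
"two bytes each" statement above.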
44
45 The following table gives the escape sequences and the character sets
46 used in JUNET messages.
47
48     ESC ( B     ASCII
49     ESC ( J     JIS X 0201-1976 (left-hand part)
50     ESC $ @     JIS X 0208-1978
51     ESC $ B     JIS X 0208-1983
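
The table above can be read as a lookup from a three-byte escape sequence
to a character set. The following sketch (in Python) splits a byte string
into runs by character set. The names ESCAPES and split_segments are
hypothetical, and the sketch assumes the input contains only the four
escape sequences listed in the table.

```python
# Escape sequences from the table: ESC followed by two more bytes.
# Each entry maps to (character set name, bytes per character).
ESCAPES = {
    b"\x1b(B": ("ASCII", 1),
    b"\x1b(J": ("JIS X 0201-1976 (left-hand part)", 1),
    b"\x1b$@": ("JIS X 0208-1978", 2),
    b"\x1b$B": ("JIS X 0208-1983", 2),
}


def split_segments(data: bytes):
    """Split data into (charset name, bytes per character, payload) runs."""
    segments = []
    charset, width = ESCAPES[b"\x1b(B"]  # a message starts in ASCII
    payload = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x1B:
            seq = bytes(data[i:i + 3])
            if seq not in ESCAPES:
                raise ValueError(f"unknown escape sequence at offset {i}")
            if payload:
                segments.append((charset, width, bytes(payload)))
                payload.clear()
            charset, width = ESCAPES[seq]
            i += 3
        else:
            payload.append(data[i])
            i += 1
    if payload:
        segments.append((charset, width, bytes(payload)))
    return segments


for name, width, payload in split_segments(b"abc \x1b$BF|K\\8l\x1b(B xyz"):
    print(name, width, payload)
```

Escape sequences themselves are not returned; only the bytes between them
are, labelled with the character set in effect.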
52
61 The left-hand part of JIS X 0201-1976 is identical to ASCII except
62 for backslash (\) and tilde (~). The backslash is replaced by the Yen
63 sign, and the tilde is replaced by macron (overline). This set is
64 Japan's national variant of ISO 646.
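
The two differences from ASCII can be sketched as a small translation
table. The function name decode_jis_roman is hypothetical, and the choice
of Unicode code points (U+00A5 YEN SIGN and U+203E OVERLINE) is an
assumption made for illustration; this document does not specify a
mapping to any other character set.

```python
# Code positions where the left-hand part of JIS X 0201 differs from ASCII.
# Assumption: the Unicode targets below are one common choice, not part of
# the encoding described in this memo.
JIS_ROMAN_DIFFERENCES = {
    0x5C: "\u00a5",  # backslash position holds the Yen sign
    0x7E: "\u203e",  # tilde position holds the overline
}


def decode_jis_roman(data: bytes) -> str:
    """Decode 7-bit bytes as the left-hand part of JIS X 0201."""
    if any(b > 0x7F for b in data):
        raise ValueError("not 7-bit data")
    return "".join(JIS_ROMAN_DIFFERENCES.get(b, chr(b)) for b in data)


print(decode_jis_roman(b"100\\ ~"))
```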
65
66 The JIS X 0208 character sets consist of Kanji, Hiragana, Katakana
67 and some other symbols and characters. Each character takes up two
68 bytes.
69
70 For further details about the JIS Japanese national character set
71 standards, refer to the JIS standards themselves. For further
72 information about the escape sequences, see ISO 2022 [ISO2022].
73
74 If there are JIS X 0208 characters on a line, there must be a switch
75 to ASCII or to the left-hand part of JIS X 0201 before the end of the
76 line (i.e. before the CRLF). This means that the next line starts in
77 the character set that was switched to before the end of the previous
78 line. Other restrictions are given in the Formal Description below.
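
The end-of-line rule above can be checked by tracking which kind of
character set is in effect while scanning a line. The following is a
sketch only; the function name ends_in_single_byte_mode is hypothetical,
and the line is assumed to be passed without its CRLF.

```python
SINGLE_BYTE = (b"\x1b(B", b"\x1b(J")   # ASCII, JIS X 0201 left-hand part
DOUBLE_BYTE = (b"\x1b$@", b"\x1b$B")   # JIS X 0208-1978, JIS X 0208-1983


def ends_in_single_byte_mode(line: bytes) -> bool:
    """Return True if the line (without CRLF) ends in a single-byte set."""
    double_byte = False  # every line is expected to start in single-byte mode
    i = 0
    while i < len(line):
        seq = line[i:i + 3]
        if seq in SINGLE_BYTE:
            double_byte = False
            i += 3
        elif seq in DOUBLE_BYTE:
            double_byte = True
            i += 3
        else:
            # Skip one character: two bytes in a JIS X 0208 run, else one.
            i += 2 if double_byte else 1
    return not double_byte


print(ends_in_single_byte_mode(b"\x1b$BF|K\\8l\x1b(B"))
print(ends_in_single_byte_mode(b"abc\x1b$BF|"))
```

The first call describes a line that switches back before its end; the
second describes a line that would violate the rule.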
79
80
81 Formal Description
82
83 This section provides a formal description of the JUNET encoding. In
84 the event that this description is not consistent with the above
85 informal description, this formal description shall take precedence.
86
87 The notational conventions used here are identical to those used in
88 RFC 822 [RFC822].
89
90 The * (asterisk) convention is as follows:
91
92 l*m something
93
94 meaning at least l and at most m somethings, with l and m taking
95 default values of 0 and infinity, respectively.
96
97
98 line = *text *1( *segment single-byte-seq *text ) CRLF
99
100 segment = single-byte-segment / double-byte-segment
101
102 single-byte-segment = single-byte-seq 1*text
103
104 double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )
105
106 single-byte-seq = ESC "(" ( "B" / "J" )
107
108 double-byte-seq = ESC "$" ( "@" / "B" )
109
117                                            ; ( Octal,  Decimal.)
118
119 ESC         = <ISO 2022 ESC, escape>       ; (    33,       27.)
120
121 one-of-94   = <any char in 94-char set>    ; (41-176,  33.-126.)
122
123 CHAR        = <any ASCII character>        ; ( 0-177,   0.-127.)
124
125 text        = <any CHAR, including bare    ; => atoms, specials,
126                CR & bare LF, but NOT       ;    comments and
127                including CRLF>             ;    quoted-strings are
128                                            ;    NOT recognized.
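
The grammar above can be transcribed into a regular expression, with "*"
becoming "*", "1*" becoming "+", and "*1( ... )" becoming an optional
group. The following Python sketch does so. It makes one assumption that
the grammar does not state: ESC is excluded from "text", so that escape
sequences are never consumed as ordinary characters. All names are
hypothetical.

```python
import re

# text: any CHAR except ESC (assumption), and never the start of a CRLF.
TEXT = rb"(?!\r\n)[\x00-\x1a\x1c-\x7f]"
ONE_OF_94 = rb"[\x21-\x7e]"
SINGLE_BYTE_SEQ = rb"\x1b\((?:B|J)"
DOUBLE_BYTE_SEQ = rb"\x1b\$(?:@|B)"

# single-byte-segment = single-byte-seq 1*text
SINGLE_BYTE_SEGMENT = SINGLE_BYTE_SEQ + rb"(?:" + TEXT + rb")+"
# double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )
DOUBLE_BYTE_SEGMENT = DOUBLE_BYTE_SEQ + rb"(?:" + ONE_OF_94 + ONE_OF_94 + rb")+"
SEGMENT = rb"(?:" + SINGLE_BYTE_SEGMENT + rb"|" + DOUBLE_BYTE_SEGMENT + rb")"

# line = *text *1( *segment single-byte-seq *text ) CRLF
LINE = re.compile(
    rb"(?:" + TEXT + rb")*"
    + rb"(?:" + SEGMENT + rb"*" + SINGLE_BYTE_SEQ + rb"(?:" + TEXT + rb")*)?"
    + rb"\r\n"
)


def is_valid_line(line: bytes) -> bool:
    """Return True if the line, including its CRLF, matches the grammar."""
    return LINE.fullmatch(line) is not None


print(is_valid_line(b"abc \x1b$BF|K\\8l\x1b(B xyz\r\n"))
print(is_valid_line(b"\x1b$BF|\r\n"))
```

The second call fails to match because the line ends without a
single-byte escape sequence, which the grammar requires once any segment
has appeared.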
129
130
131 Additional restrictions that are difficult to describe in the above
132 are as follows.
133
134 Adjacent segments should have different escape sequences. For
135 example, the following is not recommended:
136
137 ESC $ B .... ESC $ B ....
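
A check for this recommendation can be sketched as follows. The function
name repeated_escapes is hypothetical; it reports the byte offset of each
escape sequence that is identical to the one before it on the same line.

```python
import re

# The four escape sequences used by the encoding.
ESCAPE = re.compile(rb"\x1b(?:\([BJ]|\$[@B])")


def repeated_escapes(line: bytes):
    """Return offsets of escape sequences that repeat the previous one."""
    offsets = []
    previous = None
    for match in ESCAPE.finditer(line):
        if match.group() == previous:
            offsets.append(match.start())
        previous = match.group()
    return offsets


print(repeated_escapes(b"\x1b$BF|\x1b$BK\\\x1b(B"))
print(repeated_escapes(b"\x1b$BF|\x1b(Babc"))
```

The sketch does not treat the implicit ASCII state at the start of a line
as a previous escape sequence; whether a leading ESC ( B should also be
flagged is left open here.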
138
139
140 MIME Considerations
141
142 The name given to the JUNET character encoding is "junet-code". This
143 name is intended to be used in MIME messages as follows:
144
145 Content-Type: text/plain; charset=junet-code
146
147 The JUNET encoding is already in 7-bit form, so the correct "transfer
148 encoding" to use is:
149
150 Content-Transfer-Encoding: 7bit
151
152 It should be noted that applying the Base64 or Quoted-Printable
153 encoding will render the message unreadable in current JUNET
154 software.
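
The header fields above can be produced with Python's standard email
package, as in the sketch below. Note the assumption: Python does not
know the name "junet-code", so the sketch uses the name "iso-2022-jp",
under which Python's library provides an encoding with the same escape
sequences. The sample body text is illustrative only.

```python
from email.mime.text import MIMEText

# Three Kanji, written as escapes so that this source stays in ASCII.
body = "\u65e5\u672c\u8a9e\n"

# Assumption: "iso-2022-jp" stands in for the name "junet-code" used in
# this memo, because Python's email package has no entry for the latter.
msg = MIMEText(body, "plain", "iso-2022-jp")

print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
```

The library selects the transfer encoding itself; the second line printed
can be compared against the "7bit" value given above.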
155
156
157 Background Information
158
159 The JUNET encoding was described in the JUNET User's Guide [JUNET]
160 (JUNET Riyou No Tebiki Dai Ippan).
161
162 The encoding is based on the particular usage of ISO 2022 [ISO2022]
163 announced by 4/1. However, the escape sequence normally used for this
164 announcement is not included in JUNET messages.
165
166
173 References
174
175 [ISO2022] International Organization for Standardization (ISO), "Information
176 processing -- ISO 7-bit and 8-bit coded character sets -- Code
177 extension techniques", International Standard, 1986, Ref. No. ISO
178 2022-1986 (E)
179
180 [JUNET] JUNET Riyou No Tebiki Sakusei Iin Kai (JUNET User's Guide
181 Drafting Committee), "JUNET Riyou No Tebiki (Dai Ippan)" ("JUNET
182 User's Guide (First Edition)"), February 1988
183
184 [MIME] Nathaniel Borenstein and Ned Freed, "MIME (Multipurpose
185 Internet Mail Extensions): Mechanisms for Specifying and Describing
186 the Format of Internet Message Bodies", Internet Draft, March 1992,
187 draft-ietf-822ext-messagebodies-06.txt
188
189 [RFC822] David H. Crocker, "Standard for the Format of ARPA Internet
190 Text Messages", Internet standard, August 1982, RFC 822
191
192
193 Security Considerations
194
195 Security considerations are not discussed in this memo.
196
197
198 Authors' Addresses
199
200 [authors]
201 [organizations]
202 [addresses]