Revision 1.1, Fri Mar 11 05:36:57 2005 UTC, by wakaba

1     * Note:
2     * The name is likely to become "ISO-2022-JP" rather than "junet-code".
3     --
4    
5     Network Working Group [authors]
6     Internet Draft [organizations]
7     10th April 1992
8    
9    
10     JUNET Japanese Character Encoding for Internet Messages
11    
12    
13     Status of this Memo
14    
15     This draft document will be submitted to the RFC editor as an
16     informational document. This is a working document only; it should
17     neither be cited nor quoted in any formal document. This document
18     will expire before 10th October 1992. Distribution of this memo is
19     unlimited. Please send comments to net-char@sra.co.jp.
20    
21    
22     Introduction
23    
24     This document describes the encoding used in plain text electronic
25     mail and network news in several Japanese networks. It was first
26     specified by and used in JUNET [JUNET]. The encoding is now also
27     widely used in Japanese IP communities and the Japanese BITNET
28     community (BITNETJP).
29    
30     This document provides a name for the encoding which is intended to
31     be used in the "charset" parameter field of MIME [MIME] messages.
32    
33     This document only describes the encoding of plain text. The encoding
34     of other subtypes of text, such as rich text, is not discussed here.
35    
36    
37     Informal Description
38    
39     The message body starts in ASCII, and switches to Japanese characters
40     through an escape sequence. For example, the escape sequence ESC $ B
41     (three bytes) indicates that the bytes following this escape sequence
42     are Japanese characters, which are encoded in two bytes each. To
43     switch back to ASCII, the escape sequence ESC ( B is used.
44    
45     The following table gives the escape sequences and the character sets
46     used in JUNET messages.
47    
48     ESC ( B    ASCII
49     ESC ( J    JIS X 0201-1976 (left-hand part)
50     ESC $ @    JIS X 0208-1978
51     ESC $ B    JIS X 0208-1983
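
As an illustration of the table above, the following minimal Python sketch
lists the four escape sequences and shows two of them in an encoded string.
It assumes that Python's built-in "iso2022_jp" codec, which implements the
later ISO-2022-JP form of this encoding, produces the same bytes as the
JUNET encoding for this input. The name ESCAPE_SEQUENCES and the sample text
are hypothetical and do not come from this memo.

```python
# Escape sequences from the table above, keyed by their raw bytes.
ESC = b"\x1b"
ESCAPE_SEQUENCES = {
    ESC + b"(B": "ASCII",
    ESC + b"(J": "JIS X 0201-1976 (left-hand part)",
    ESC + b"$@": "JIS X 0208-1978",
    ESC + b"$B": "JIS X 0208-1983",
}

# Assumption: Python's "iso2022_jp" codec matches the JUNET encoding for
# this sample (ASCII text followed by five Hiragana and an ASCII "!").
sample = "Hello, \u3053\u3093\u306b\u3061\u306f!"
encoded = sample.encode("iso2022_jp")

# The Japanese run is bracketed by ESC $ B and ESC ( B.
print(encoded)
for sequence, name in ESCAPE_SEQUENCES.items():
    print(sequence, name, sequence in encoded)
```

Every byte of the result is below 128, which is why the encoding can travel
over 7-bit mail transports.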
52    
61     The left-hand part of JIS X 0201-1976 is identical to ASCII except
62     for backslash (\) and tilde (~). The backslash is replaced by the Yen
63     sign, and the tilde is replaced by macron (overline). This set is
64     Japan's national variant of ISO 646.
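
The two differences from ASCII can be sketched in Python as follows. The
function name decode_jis_roman is hypothetical, and the Unicode targets
(U+00A5 for the Yen sign, U+203E for the overline) are an assumption about a
reasonable mapping; this memo does not define one.

```python
# Code positions where the left-hand part of JIS X 0201 differs from ASCII.
JIS_ROMAN_DIFFERENCES = {
    0x5C: "\u00a5",  # backslash position holds the Yen sign
    0x7E: "\u203e",  # tilde position holds the overline (macron)
}

def decode_jis_roman(data: bytes) -> str:
    """Decode 7-bit bytes as the left-hand part of JIS X 0201."""
    chars = []
    for byte in data:
        if byte > 0x7F:
            raise ValueError(f"not a 7-bit byte: {byte:#04x}")
        # All other positions are identical to ASCII.
        chars.append(JIS_ROMAN_DIFFERENCES.get(byte, chr(byte)))
    return "".join(chars)

print(decode_jis_roman(b"\\1,000 ~"))
```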
65    
66     The JIS X 0208 character sets consist of Kanji, Hiragana, Katakana
67     and some other symbols and characters. Each character takes up two
68     bytes.
69    
70     For further details about the JIS Japanese national character set
71     standards, refer to the JIS standards themselves. For further
72     information about the escape sequences, see ISO 2022 [ISO2022].
73    
74     If there are JIS X 0208 characters on a line, there must be a switch
75     to ASCII or to the left-hand part of JIS X 0201 before the end of the
76     line (i.e. before the CRLF). This means that the next line starts in
77     the character set that was switched to before the end of the previous
78     line. Other restrictions are given in the Formal Description below.
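
The end-of-line rule can be checked with a short Python sketch like the one
below. The function name ends_in_single_byte_set is hypothetical. The sketch
inspects only the last escape sequence on a line and assumes the line is
passed without its CRLF.

```python
ESC = b"\x1b"

def ends_in_single_byte_set(line: bytes) -> bool:
    """Return True if the last escape sequence on the line selects ASCII
    or JIS X 0201, or if the line contains no escape sequence at all."""
    pos = line.rfind(ESC)
    if pos == -1:
        # No switch on this line: it stays in the single-byte set that
        # the previous line ended in.
        return True
    seq = line[pos:pos + 3]
    if seq in (b"\x1b(B", b"\x1b(J"):
        return True
    if seq in (b"\x1b$@", b"\x1b$B"):
        return False
    raise ValueError(f"unknown escape sequence: {seq!r}")

print(ends_in_single_byte_set(b"\x1b$B$3$s\x1b(B"))  # switched back
print(ends_in_single_byte_set(b"\x1b$B$3$s"))        # still double-byte
```

Scanning for ESC is safe here because the bytes of a double-byte character
lie in the range 33 to 126 and therefore never equal ESC (27).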
79    
80    
81     Formal Description
82    
83     This section provides a formal description of the JUNET encoding. In
84     the event that this description is not consistent with the above
85     informal description, this formal description shall take precedence.
86    
87     The notational conventions used here are identical to those used in
88     RFC 822 [RFC822].
89    
90     The * (asterisk) convention is as follows:
91    
92     l*m something
93    
94     meaning at least l and at most m somethings, with l and m taking
95     default values of 0 and infinity, respectively.
96    
97    
98     line = *text *1( *segment single-byte-seq *text ) CRLF
99    
100     segment = single-byte-segment / double-byte-segment
101    
102     single-byte-segment = single-byte-seq 1*text
103    
104     double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )
105    
106     single-byte-seq = ESC "(" ( "B" / "J" )
107    
108     double-byte-seq = ESC "$" ( "@" / "B" )
109    
117     ; ( Octal, Decimal.)
118    
119     ESC = <ISO 2022 ESC, escape> ; ( 33, 27.)
120    
121     one-of-94 = <any char in 94-char set> ; (41-176, 33.-126.)
122    
123     CHAR = <any ASCII character> ; ( 0-177, 0.-127.)
124    
125     text = <any CHAR, including bare    ; => atoms, specials,
126             CR & bare LF, but NOT       ;    comments and
127             including CRLF>             ;    quoted-strings are
128                                         ;    NOT recognized.
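
One possible reading of the grammar above is sketched below in Python. It
splits a single line (without its CRLF) into segments and rejects input that
the rules do not allow. The names TOKEN, SINGLE, DOUBLE and split_segments
are hypothetical, and the default starting set of ASCII is an assumption; a
line may also start in JIS X 0201 if the previous line ended there.

```python
import re

# One token is either an escape sequence from the grammar or a run of
# bytes that contains no ESC.
TOKEN = re.compile(rb"\x1b\([BJ]|\x1b\$[@B]|[^\x1b]+")
SINGLE = {b"\x1b(B": "ASCII", b"\x1b(J": "JIS X 0201"}
DOUBLE = {b"\x1b$@": "JIS X 0208-1978", b"\x1b$B": "JIS X 0208-1983"}

def split_segments(line: bytes, initial: str = "ASCII"):
    """Split one line (without CRLF) into (character set, payload) pairs."""
    segments = []
    current = initial  # assumed: the line starts in a single-byte set
    pos = 0
    while pos < len(line):
        match = TOKEN.match(line, pos)
        if match is None:
            raise ValueError(f"unrecognized escape sequence at byte {pos}")
        token = match.group()
        pos = match.end()
        if token in SINGLE:
            current = SINGLE[token]
        elif token in DOUBLE:
            current = DOUBLE[token]
        else:
            if current in DOUBLE.values():
                # double-byte-segment: pairs of one-of-94 (33.-126.)
                if len(token) % 2 or any(not 33 <= b <= 126 for b in token):
                    raise ValueError("malformed double-byte segment")
            elif any(b > 127 for b in token):
                raise ValueError("text must consist of 7-bit CHARs")
            segments.append((current, token))
    if current in DOUBLE.values():
        raise ValueError("line must end in a single-byte character set")
    return segments

print(split_segments(b"Hi \x1b$B$3$s\x1b(B!"))
```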
129    
130    
131     Additional restrictions that are difficult to describe in the above
132     are as follows.
133    
134     Adjacent segments should have different escape sequences. For
135     example, the following is not recommended:
136    
137     ESC $ B .... ESC $ B ....
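
A check for this recommendation could look like the Python sketch below. The
names ESCAPE and redundant_escapes are hypothetical. The sketch compares each
escape sequence only with the one before it on the same line, so it does not
flag a leading ESC ( B on a line that already starts in ASCII.

```python
import re

# The four escape sequences permitted by the grammar.
ESCAPE = re.compile(rb"\x1b(?:\([BJ]|\$[@B])")

def redundant_escapes(line: bytes):
    """Return byte offsets of escape sequences that repeat the
    immediately preceding escape sequence on the same line."""
    offsets = []
    previous = None
    for match in ESCAPE.finditer(line):
        if match.group() == previous:
            offsets.append(match.start())
        previous = match.group()
    return offsets

# ESC $ B .... ESC $ B .... as in the example above, then back to ASCII.
print(redundant_escapes(b"\x1b$B$3\x1b$B$s\x1b(B"))
```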
138    
139    
140     MIME Considerations
141    
142     The name given to the JUNET character encoding is "junet-code". This
143     name is intended to be used in MIME messages as follows:
144    
145     Content-Type: text/plain; charset=junet-code
146    
147     The JUNET encoding is already in 7-bit form, so the correct "transfer
148     encoding" to use is:
149    
150     Content-Transfer-Encoding: 7bit
151    
152     It should be noted that applying the Base64 or Quoted-Printable
153     encoding will render the message unreadable in current JUNET
154     software.
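
The two header fields can be assembled with Python's standard email package,
as in the minimal sketch below. It assumes that Python's "iso2022_jp" codec
stands in for the JUNET encoding, since Python has no codec named
"junet-code"; the message body is a made-up sample.

```python
from email.message import Message

# Assumption: Python's "iso2022_jp" codec stands in for the JUNET encoding.
body = "\u3053\u3093\u306b\u3061\u306f\r\n".encode("iso2022_jp")

# The encoded body is already in 7-bit form.
assert all(byte < 0x80 for byte in body)

msg = Message()
msg["Content-Type"] = "text/plain; charset=junet-code"
msg["Content-Transfer-Encoding"] = "7bit"
# The payload is stored as-is; no Base64 or Quoted-Printable is applied.
msg.set_payload(body.decode("ascii"))

print(msg.as_string())
```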
155    
156    
157     Background Information
158    
159     The JUNET encoding was described in the JUNET User's Guide [JUNET]
160     (JUNET Riyou No Tebiki Dai Ippan).
161    
162     The encoding is based on the particular usage of ISO 2022 [ISO2022]
163     announced by 4/1. However, the escape sequence normally used for this
164     announcement is not included in JUNET messages.
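
The notation 4/1 gives a column and a row in the ISO 2022 code table. The
Python sketch below converts such a position to a byte value. The function
name column_row_to_byte is hypothetical, and the assumption that the
announcement takes the general ISO 2022 form ESC 2/0 F, with 4/1 as the final
byte F, comes from ISO 2022 and not from this memo.

```python
def column_row_to_byte(position: str) -> int:
    """Convert ISO 2022 column/row notation such as "4/1" to a byte value."""
    column, row = (int(part) for part in position.split("/"))
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError(f"out of range: {position}")
    return column * 16 + row

# Assumption: the announcement has the general ISO 2022 form ESC 2/0 F,
# with 4/1 as the final byte F.
announcer = bytes([0x1B, column_row_to_byte("2/0"), column_row_to_byte("4/1")])
print(announcer)
```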
165    
166    
173     References
174    
175     [ISO2022] International Organization for Standardization (ISO), "Information
176     processing -- ISO 7-bit and 8-bit coded character sets -- Code
177     extension techniques", International Standard, 1986, Ref. No. ISO
178     2022-1986 (E)
179    
180     [JUNET] JUNET Riyou No Tebiki Sakusei Iin Kai (JUNET User's Guide
181     Drafting Committee), "JUNET Riyou No Tebiki (Dai Ippan)" ("JUNET
182     User's Guide (First Edition)"), February 1988
183    
184     [MIME] Nathaniel Borenstein and Ned Freed, "MIME (Multipurpose
185     Internet Mail Extensions): Mechanisms for Specifying and Describing
186     the Format of Internet Message Bodies", Internet Draft, March 1992,
187     draft-ietf-822ext-messagebodies-06.txt
188    
189     [RFC822] David H. Crocker, "Standard for the Format of ARPA Internet
190     Text Messages", Internet standard, August 1982, RFC 822
191    
192    
193     Security Considerations
194    
195     Security considerations are not discussed in this memo.
196    
197    
198     Authors' Addresses
199    
200     [authors]
201     [organizations]
202     [addresses]