* Note:
* The name is likely to become "ISO-2022-JP" rather than "junet-code".
--

Network Working Group                                          [authors]
Internet Draft                                           [organizations]
                                                         10th April 1992


        JUNET Japanese Character Encoding for Internet Messages


Status of this Memo

   This draft document will be submitted to the RFC editor as an
   informational document.  This is a working document only; it should
   neither be cited nor quoted in any formal document.  This document
   will expire before 10th October 1992.  Distribution of this memo is
   unlimited.  Please send comments to net-char@sra.co.jp.


Introduction

   This document describes the encoding used in plain text electronic
   mail and network news in several Japanese networks. It was first
   specified by and used in JUNET [JUNET]. The encoding is now also
   widely used in Japanese IP communities and the Japanese BITNET
   community (BITNETJP).

   This document provides a name for the encoding which is intended to
   be used in the "charset" parameter field of MIME [MIME] messages.

   This document only describes the encoding of plain text. The encoding
   of other subtypes of text, such as rich text, is not discussed here.


Informal Description

   The message body starts in ASCII, and switches to Japanese characters
   through an escape sequence. For example, the escape sequence ESC $ B
   (three bytes) indicates that the bytes following this escape sequence
   are Japanese characters, which are encoded in two bytes each.  To
   switch back to ASCII, the escape sequence ESC ( B is used.
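
   The switching described above can be observed with a short Python
   sketch.  It assumes that Python's built-in "iso2022_jp" codec
   produces the encoding described in this document; that codec and the
   variable names are illustrative and are not part of this memo.

```python
# Assumption: Python's built-in "iso2022_jp" codec matches this encoding.
text = "abc \u65e5\u672c\u8a9e xyz"   # ASCII, three Kanji, ASCII
encoded = text.encode("iso2022_jp")

ESC = b"\x1b"
start = encoded.index(ESC + b"$B")    # switch to two-byte Japanese characters
end = encoded.index(ESC + b"(B")      # switch back to ASCII

japanese = encoded[start + 3:end]     # bytes between the two escape sequences
print(encoded[:start])                # the ASCII text before the switch
print(len(japanese))                  # two bytes for each Japanese character
print(encoded[end + 3:])              # the ASCII text after switching back
```

   Each escape sequence occupies three bytes, so the Japanese text
   begins three bytes after the position of ESC $ B.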

   The following table gives the escape sequences and the character sets
   used in JUNET messages.

           ESC ( B         ASCII
           ESC ( J         JIS X 0201-1976 (left-hand part)
           ESC $ @         JIS X 0208-1978
           ESC $ B         JIS X 0208-1983
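
   As a minimal sketch, the table above can be held in a lookup table
   and used to split a byte string into runs of a single character set.
   The names ESCAPES and split_segments are hypothetical, and escape
   sequences that are not in the table are simply left in the data.

```python
ESC = b"\x1b"

# Escape sequences from the table, mapped to (character set, bytes per character).
ESCAPES = {
    ESC + b"(B": ("ASCII", 1),
    ESC + b"(J": ("JIS X 0201-1976 (left-hand part)", 1),
    ESC + b"$@": ("JIS X 0208-1978", 2),
    ESC + b"$B": ("JIS X 0208-1983", 2),
}

def split_segments(data: bytes) -> list:
    """Split data into (character set, raw bytes) runs; text starts in ASCII."""
    segments = []
    charset = "ASCII"
    start = 0
    i = 0
    while i < len(data):
        seq = data[i:i + 3]
        if seq in ESCAPES:
            if i > start:
                segments.append((charset, data[start:i]))
            charset = ESCAPES[seq][0]
            i += 3
            start = i
        else:
            i += 1
    if start < len(data):
        segments.append((charset, data[start:]))
    return segments

example = b"abc" + ESC + b"$B" + b"F|K\\" + ESC + b"(B" + b"xyz"
for name, raw in split_segments(example):
    print(name, raw)
```

   Scanning one byte at a time is safe inside two-byte text because ESC
   (octal 33) lies outside the range used by two-byte characters.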


   The left-hand part of JIS X 0201-1976 is identical to ASCII except
   for backslash (\) and tilde (~). The backslash is replaced by the Yen
   sign, and the tilde is replaced by macron (overline). This set is
   Japan's national variant of ISO 646.
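
   The two differences from ASCII can be sketched as a small decoding
   table.  The Unicode code points used here (U+00A5 for the Yen sign,
   U+203E for the overline) and the function name are assumptions made
   for illustration; this memo does not define a mapping to Unicode.

```python
# Positions where the left-hand part of JIS X 0201 differs from ASCII.
JISX0201_DIFFERENCES = {
    0x5C: "\u00a5",  # backslash position holds the Yen sign
    0x7E: "\u203e",  # tilde position holds the overline
}

def decode_jisx0201_left(data: bytes) -> str:
    """Decode 7-bit bytes as the left-hand part of JIS X 0201."""
    chars = []
    for byte in data:
        if byte > 0x7F:
            raise ValueError("not a 7-bit byte: %#04x" % byte)
        # Every other position is identical to ASCII.
        chars.append(JISX0201_DIFFERENCES.get(byte, chr(byte)))
    return "".join(chars)
```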

   The JIS X 0208 character sets consist of Kanji, Hiragana, Katakana
   and some other symbols and characters. Each character takes up two
   bytes.

   For further details about the JIS Japanese national character set
   standards, refer to the JIS standards themselves. For further
   information about the escape sequences, see ISO 2022 [ISO2022].

   If there are JIS X 0208 characters on a line, there must be a switch
   to ASCII or to the left-hand part of JIS X 0201 before the end of the
   line (i.e. before the CRLF). This means that the next line starts in
   the character set that was switched to before the end of the previous
   line. Other restrictions are given in the Formal Description below.
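
   The end-of-line rule can be checked with a sketch such as the
   following.  It assumes the line is passed without its CRLF and that
   the line starts in a single-byte character set; the function name is
   hypothetical.

```python
ESC = b"\x1b"
SINGLE_BYTE = {ESC + b"(B", ESC + b"(J"}   # ASCII, JIS X 0201 (left-hand part)
DOUBLE_BYTE = {ESC + b"$@", ESC + b"$B"}   # JIS X 0208-1978, JIS X 0208-1983

def ends_in_single_byte_mode(line: bytes) -> bool:
    """Return True if the line (without CRLF) ends in a single-byte set."""
    double = False
    i = 0
    while i < len(line):
        seq = line[i:i + 3]
        if seq in DOUBLE_BYTE:
            double = True
            i += 3
        elif seq in SINGLE_BYTE:
            double = False
            i += 3
        else:
            i += 1
    return not double
```

   A line for which this returns False is missing the required switch
   back to ASCII or JIS X 0201 before its CRLF.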


Formal Description

   This section provides a formal description of the JUNET encoding. In
   the event that this description is not consistent with the above
   informal description, this formal description shall take precedence.

   The notational conventions used here are identical to those used in
   RFC 822 [RFC822].

   The * (asterisk) convention is as follows:

           l*m something

   meaning at least l and at most m somethings, with l and m taking
   default values of 0 and infinity, respectively.


   line                = *text *1( *segment single-byte-seq *text ) CRLF

   segment             = single-byte-segment / double-byte-segment

   single-byte-segment = single-byte-seq 1*text

   double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )

   single-byte-seq     = ESC "(" ( "B" / "J" )

   double-byte-seq     = ESC "$" ( "@" / "B" )


                                                    ; ( Octal, Decimal.)

   ESC                 = <ISO 2022 ESC, escape>     ; (    33,      27.)

   one-of-94           = <any char in 94-char set>  ; (41-176, 33.-126.)

   CHAR                = <any ASCII character>      ; ( 0-177,  0.-127.)

   text                = <any CHAR, including bare  ; => atoms, specials,
                          CR & bare LF, but NOT     ;  comments and
                          including CRLF>           ;  quoted-strings are
                                                    ;  NOT recognized.
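
   As an illustration, the double-byte-segment rule can be transcribed
   into a regular expression.  This is a sketch of that one rule only,
   not a parser for the complete grammar, and the names used are
   hypothetical.

```python
import re

# double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )
# double-byte-seq     = ESC "$" ( "@" / "B" )
# one-of-94           = any byte from octal 41 to 176 (0x21 to 0x7E)
DOUBLE_BYTE_SEGMENT = re.compile(rb"\x1b\$[@B](?:[\x21-\x7e]{2})+")

def is_double_byte_segment(data: bytes) -> bool:
    """Return True if data is exactly one well-formed double-byte-segment."""
    return DOUBLE_BYTE_SEGMENT.fullmatch(data) is not None
```

   The pattern rejects an empty segment, an odd number of bytes after
   the escape sequence, and any byte outside the 94-character range
   (for example, a space).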


   Additional restrictions that are difficult to express in the grammar
   above are as follows.

   Adjacent segments should have different escape sequences. For
   example, the following is not recommended:

           ESC $ B .... ESC $ B ....
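
   A sketch of how a sender might detect this case follows.  It reports
   every escape sequence that is identical to the escape sequence
   immediately before it; the function name is hypothetical, and the
   check covers only the four escape sequences listed in this memo.

```python
import re

# The four escape sequences: ESC ( B, ESC ( J, ESC $ @, ESC $ B.
ESCAPE = re.compile(rb"\x1b(?:\([BJ]|\$[@B])")

def redundant_escapes(data: bytes) -> list:
    """Return offsets of escape sequences that repeat the previous one."""
    offsets = []
    previous = None
    for match in ESCAPE.finditer(data):
        if match.group() == previous:
            offsets.append(match.start())
        previous = match.group()
    return offsets
```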


MIME Considerations

   The name given to the JUNET character encoding is "junet-code". This
   name is intended to be used in MIME messages as follows:

           Content-Type: text/plain; charset=junet-code

   The JUNET encoding is already in 7-bit form, so the correct "transfer
   encoding" to use is:

           Content-Transfer-Encoding: 7bit

   It should be noted that applying the Base64 or Quoted-Printable
   encoding will render the message unreadable in current JUNET
   software.
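
   The following sketch builds the two header fields shown above by
   hand and confirms that the body needs no further transfer encoding.
   It assumes that Python's "iso2022_jp" codec stands in for the
   encoding described here; no mail library is used, other header
   fields are omitted, and the function name is hypothetical.

```python
def build_message(body: str) -> bytes:
    """Return the two header fields, a blank line, and the 7-bit body."""
    # Assumption: Python's "iso2022_jp" codec matches this encoding.
    payload = body.encode("iso2022_jp")
    if any(byte > 0x7F for byte in payload):
        raise ValueError("body is not 7-bit clean")
    headers = (
        b"Content-Type: text/plain; charset=junet-code\r\n"
        b"Content-Transfer-Encoding: 7bit\r\n"
        b"\r\n"
    )
    return headers + payload

message = build_message("\u65e5\u672c\u8a9e\r\n")
print(message)
```

   Because every byte of the payload is below octal 200, the body can
   be sent as it is.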


Background Information

   The JUNET encoding was described in the JUNET User's Guide [JUNET]
   (JUNET Riyou No Tebiki Dai Ippan).

   The encoding is based on the particular usage of ISO 2022 [ISO2022]
   announced by 4/1. However, the escape sequence normally used for this
   announcement is not included in JUNET messages.


References

   [ISO2022] International Organization for Standardization (ISO),
   "Information processing -- ISO 7-bit and 8-bit coded character sets
   -- Code extension techniques", International Standard, 1986, Ref.
   No. ISO 2022-1986 (E)

   [JUNET] JUNET Riyou No Tebiki Sakusei Iin Kai (JUNET User's Guide
   Drafting Committee), "JUNET Riyou No Tebiki (Dai Ippan)" ("JUNET
   User's Guide (First Edition)"), February 1988

   [MIME] Nathaniel Borenstein and Ned Freed, "MIME (Multipurpose
   Internet Mail Extensions): Mechanisms for Specifying and Describing
   the Format of Internet Message Bodies", Internet Draft, March 1992,
   draft-ietf-822ext-messagebodies-06.txt

   [RFC822] David H. Crocker, "Standard for the Format of ARPA Internet
   Text Messages", Internet Standard, August 1982, RFC 822


Security Considerations

   Security considerations are not discussed in this memo.


Authors' Addresses

   [authors]
   [organizations]
   [addresses]

