<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Char::Class::RFC1815 - Regular Expression Character Classes - RFC1815</title>
<link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />
<link rev="made" href="mailto:admin@suika.fam.cx" />
</head>

<body>

<p><a name="__index__"></a></p>
<!-- INDEX BEGIN -->

<ul>

<li><a href="#name">NAME</a></li>
<li><a href="#description">DESCRIPTION</a></li>
<li><a href="#collection_names">COLLECTION NAMES</a></li>
<li><a href="#example">EXAMPLE</a></li>
<li><a href="#see_also">SEE ALSO</a></li>
<li><a href="#license">LICENSE</a></li>
</ul>
<!-- INDEX END -->

<hr />
<p>
</p>
<h1><a name="name">NAME</a></h1>
<p>Char::Class::RFC1815 - Regular Expression Character Classes - <code>RFC1815</code></p>
<p>
</p>
<hr />
<h1><a name="description">DESCRIPTION</a></h1>
<p>Two ISO/IEC 10646 character repertoires, as defined by RFC 1815.</p>
<p>RFC 1815 defines two profiled text encoding schemes based on
ISO/IEC 10646.  Because the full ISO/IEC 10646 character set is so
large, implementing all of it is (or at least was, when RFC 1815
was published) too difficult, so several profiled (restricted) coded
character sets based on ISO/IEC 10646 were defined, those of RFC 1815
among them.  (For details of these schemes, see RFC 1815.)</p>
<p>Both encoding schemes, like most such schemes, use the two-octet
BMP form (i.e. UCS-2), the only realistic encoding form at the time.
That form should be regarded as obsolete today, but the repertoires
can still be useful for interoperability.</p>
<dl>
<dt><strong><a name="item_iso_2d10646">ISO-10646</a></strong><br />
</dt>
<dd>
The repertoire of the charset <a href="#item_iso_2d10646"><code>ISO-10646</code></a> is the same as that of ISO/IEC 8859-1
(MIME name: <code>ISO-8859-1</code>, also known as Latin-1).  In most situations
you can simply write <code>[ -ÿ]</code>.
</dd>
<p></p>
<dt><strong><a name="item_iso_2d10646_2dj_2d1">ISO-10646-J-1</a></strong><br />
</dt>
<dd>
The repertoire of the charset <a href="#item_iso_2d10646_2dj_2d1"><code>ISO-10646-J-1</code></a> is a superset of
BASIC JAPANESE plus FULLWIDTH ALPHANUMERICS plus HALFWIDTH KATAKANA,
as defined by JIS X 0221 Appendix 1.
</dd>
<dd>
<p>This repertoire includes all characters available in the Japanese
edition of Windows NT 3.51.  (Note that more characters are available
in later versions of Windows NT.)</p>
</dd>
<dd>
<p>The description in RFC 1815 does not mention any characters in U+21xx
or U+24xx.  This is probably a bug, since ISO-10646-J-1 is defined
as an alternative to JIS X 0208, which includes those characters.
This module therefore includes them as well.</p>
</dd>
<dd>
<p>Since the description in RFC 1815 is ambiguous, most of the character list
of <a href="#item_iso_2d10646_2dj_2d1"><code>ISO-10646-J-1</code></a> is taken from the list in JIS X 0221 Appendix 1.
(Note that RFC 1815 does not refer to JIS X 0221-1995 because of
the publication schedules of the memo and the standard.)</p>
</dd>
<p></p></dl>
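<p>The <code>[ -ÿ]</code> class given for ISO-10646 contains a literal <code>ÿ</code>,
so its meaning depends on how the source file is encoded.  As a minimal sketch
(not part of this module; the helper name <code>is_latin1_repertoire</code> is
hypothetical), the same range can be written with hexadecimal escapes.  Like
<code>[ -ÿ]</code>, it spans U+0020 through U+00FF and does not exclude the
control positions in between:</p>
<pre>
  use strict;
  use warnings;

  # Hypothetical helper, for illustration only.
  # True if every character of the string lies in U+0020..U+00FF.
  sub is_latin1_repertoire {
      my ($s) = @_;
      return $s =~ /\A[\x{20}-\x{FF}]*\z/ ? 1 : 0;
  }

  for my $s ("plain ASCII", "caf\x{E9}", "\x{3042}") {
      printf "%s\n", is_latin1_repertoire($s) ? "inside" : "outside";
  }</pre>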
<p>
</p>
<hr />
<h1><a name="collection_names">COLLECTION NAMES</a></h1>
<dl>
<dt><strong><a name="item_inrfc1815iso10646j1"><code>InRFC1815ISO10646J1</code></a></strong><br />
</dt>
</dl>
<p>
</p>
<hr />
<h1><a name="example">EXAMPLE</a></h1>
<pre>
  use Char::Class::RFC1815;
  if ($s =~ /\p{InRFC1815ISO10646J1}/) {
    print "Match!\n";
  }</pre>
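<p>Property names like the one used above appear to follow Perl's
user-defined property convention, in which a subroutine whose name begins
with <code>In</code> or <code>Is</code> returns a list of code point ranges.
How this module builds its own property is not shown here; the sketch below
only illustrates the mechanism, using a hypothetical property
<code>InExampleKana</code> with illustrative ranges that are not the module's
actual list:</p>
<pre>
  use strict;
  use warnings;

  # Hypothetical property, for illustration only; not defined by
  # Char::Class::RFC1815.  Each line is "start TAB end" in hexadecimal.
  sub InExampleKana {
      return join '',
          "3041\t3096\n",   # hiragana letters
          "30A1\t30FA\n",   # katakana letters
          "FF61\tFF9F\n";   # halfwidth katakana and punctuation
  }

  my $s = "\x{3042}\x{FF71}";
  if ($s =~ /\A\p{InExampleKana}+\z/) {
      print "Match!\n";
  }</pre>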
<p>
</p>
<hr />
<h1><a name="see_also">SEE ALSO</a></h1>
<p>RFC 1815 &lt;urn:ietf:rfc:1815&gt;</p>
<p><a href="../../Char/Class/UCS.html">the Char::Class::UCS manpage</a></p>
<p><a href="../../Char/Class/JISX0221.html">the Char::Class::JISX0221 manpage</a></p>
<p>
</p>
<hr />
<h1><a name="license">LICENSE</a></h1>
<p>Copyright 2007 Wakaba &lt;<a href="mailto:w@suika.fam.cx">w@suika.fam.cx</a>&gt;</p>
<p>This library and the library generated by it are free software;
you can redistribute them and/or modify them under the same
terms as Perl itself.</p>

</body>

</html>