Contents of /markup/html/whatpm/Whatpm/Charset/UniversalCharDet.html

Revision 1.2
Sun Oct 5 06:42:04 2008 UTC (16 years, 8 months ago) by wakaba
Branch: MAIN
CVS Tags: HEAD
Changes since 1.1: +59 -50 lines
File MIME type: text/html
++ ChangeLog	5 Oct 2008 06:41:29 -0000
2008-10-05  Wakaba  <wakaba@suika.fam.cx>

	* readme.en.html: Missing link to Whatpm::RDFXML module is added.
	Typo fixed.  Noted that Media Query Level 3 is not supported yet.
	Linked to WebHACC supported standards documentation.

++ whatpm/Whatpm/Charset/ChangeLog	5 Oct 2008 06:41:56 -0000
2008-10-05  Wakaba  <wakaba@suika.fam.cx>

	* UniversalCharDet.pod: Typo fixed.

1 <?xml version="1.0" ?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml">
4 <head>
5 <title>Whatpm::Charset::UniversalCharDet - A Perl Interface to universalchardet
6 Character Encoding Detection</title>
7 <link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />
8 <meta http-equiv="content-type" content="text/html; charset=utf-8" />
9 <link rev="made" href="mailto:wakaba@suika.fam.cx" />
10 </head>
11
12 <body>
13
14
15 <!-- INDEX BEGIN -->
16 <div name="index">
17 <p><a name="__index__"></a></p>
18
19 <ul>
20
21 <li><a href="#name">NAME</a></li>
22 <li><a href="#synopsis">SYNOPSIS</a></li>
23 <li><a href="#description">DESCRIPTION</a></li>
24 <li><a href="#method">METHOD</a></li>
25 <li><a href="#dependency">DEPENDENCY</a></li>
26 <li><a href="#troubleshooting">TROUBLESHOOTING</a></li>
27 <li><a href="#see_also">SEE ALSO</a></li>
28 <li><a href="#author">AUTHOR</a></li>
29 <li><a href="#license">LICENSE</a></li>
30 </ul>
31
32 <hr name="index" />
33 </div>
34 <!-- INDEX END -->
35
36 <p>
37 </p>
38 <h1><a name="name">NAME</a></h1>
39 <p>Whatpm::Charset::UniversalCharDet - A Perl Interface to universalchardet
40 Character Encoding Detection</p>
41 <p>
42 </p>
43 <hr />
44 <h1><a name="synopsis">SYNOPSIS</a></h1>
45 <pre>
46 require Whatpm::Charset::UniversalCharDet;
47 $charset_name = Whatpm::Charset::UniversalCharDet
48 -&gt;detect_byte_string ($byte_string);
49 # $charset_name: charset name (in lowercase) or undef</pre>
50 <p>
51 </p>
52 <hr />
53 <h1><a name="description">DESCRIPTION</a></h1>
54 <p>The <code>Whatpm::Charset::UniversalCharDet</code> module is a Perl interface to
55 the universalchardet character encoding detector.</p>
56 <p>universalchardet was originally developed by the Mozilla project
57 and later ported to other platforms. The <code>Whatpm::Charset::UniversalCharDet</code>
58 module provides a Perl interface to Universal Encoding Detector,
59 a Python port of Mozilla's universalchardet code. A future
60 version of this module might provide an interface to another
61 port of universalchardet.</p>
62 <p>
63 </p>
64 <hr />
65 <h1><a name="method">METHOD</a></h1>
66 <dl>
67 <dt><strong><a name="detect_byte_string" class="item"><em>$charset</em> = Whatpm::Charset::UniversalCharDet-&gt;detect_byte_string (<em>$s</em>)</a></strong>
68
69 <dd>
70 <p>Detect the character encoding of the specified byte string.</p>
71 </dd>
72 <dl>
73 <dt><strong><a name="_s" class="item"><em>$s</em></a></strong>
74
75 <dd>
76 <p>The byte string.</p>
77 </dd>
78 </li>
79 <dt><strong><a name="_charset" class="item"><em>$charset</em></a></strong>
80
81 <dd>
82 <p>The name of the character encoding detected by universalchardet,
83 in lowercase.
84 If no character encoding can be detected (for example, because no
85 implementation of universalchardet is found), <code>undef</code> is returned.</p>
86 </dd>
87 <dd>
88 <p>For the list of supported encodings, see the documentation for
89 Universal Encoding Detector
90 &lt;http://chardet.feedparser.org/docs/supported-encodings.html&gt;.</p>
91 </dd>
92 </li>
93 </dl>
94 </dl>
95 <p>
96 </p>
97 <hr />
98 <h1><a name="dependency">DEPENDENCY</a></h1>
99 <dl>
100 <dt><strong><a name="inline_python" class="item"><a href="../../Inline/Python.html">the Inline::Python manpage</a></a></strong>
101
102 <dd>
103 <p>A Perl module available at CPAN
104 &lt;http://search.cpan.org/~neilw/Inline-Python-0.22/&gt;.</p>
105 </dd>
106 <dd>
107 <p>To install the module using <em>CPAN.pm</em>:</p>
108 </dd>
109 <dd>
110 <pre>
111 root# perl -MCPAN -eshell
112 cpan&gt; install Inline::Python</pre>
113 </dd>
114 </li>
115 <dt><strong><a name="python" class="item">Python</a></strong>
116
117 <dd>
118 <p>Available at &lt;http://www.python.org/download/&gt;.</p>
119 </dd>
120 </li>
121 <dt><strong><a name="universal_encoding_detector" class="item">Universal Encoding Detector</a></strong>
122
123 <dd>
124 <p>Available at &lt;http://chardet.feedparser.org/download/&gt;.</p>
125 </dd>
126 <dd>
127 <p>Expand the archive and then execute <code>python setup.py install</code>
128 in the expanded directory.</p>
129 </dd>
130 </li>
131 </dl>
132 <p>
133 </p>
134 <hr />
135 <h1><a name="troubleshooting">TROUBLESHOOTING</a></h1>
136 <p>The <code>Whatpm::Charset::UniversalCharDet</code> module does not raise
137 an error even when it fails to load the universalchardet library;
138 it simply reports the error message with <code>warn</code>.</p>
139 <p>This behavior can be changed by setting the flag
140 <code>$Whatpm::Charset::UniversalCharDet::DEBUG</code> to a true value, which makes any error
141 invoke <code>die</code> instead of <code>warn</code>.</p>
142 <p>Common error messages are as follows:</p>
143 <dl>
144 <dt><strong><a name="can_t_locate_inline_pm_in_inc" class="item">Can't locate Inline.pm in @INC</a></strong>
145
146 <dd>
147 <p>The <em>Inline</em> module is not installed.</p>
148 </dd>
149 </li>
150 <dt><strong><a name="error_you_have_specified_python_as_an_inline_programming_language" class="item">Error. You have specified 'Python' as an Inline programming language.</a></strong>
151
152 <dd>
153 <p>The <a href="../../Inline/Python.html">Inline::Python</a> module is not installed.</p>
154 </dd>
155 </li>
156 <dt><strong><a name="couldn_t_find_an_appropriate_directory_for_inline_to_use" class="item">Couldn't find an appropriate DIRECTORY for Inline to use.</a></strong>
157
158 <dd>
159 <p>The temporary directory for the <em>Inline</em> module is not available.
160 See <a href="../../Inline/Python.html#the_inline_directory">The Inline DIRECTORY in the Inline::Python manpage</a> or
161 &lt;http://search.cpan.org/~ingy/Inline-0.44/Inline.pod#The_Inline_DIRECTORY&gt;.</p>
162 </dd>
163 </li>
164 <dt><strong><a name="error_py_eval_raised_an_exception" class="item">Error -- py_eval raised an exception</a></strong>
165
166 <dd>
167 <p>Universal Encoding Detector is not installed.</p>
168 </dd>
169 </li>
170 </dl>
171 <p>
172 </p>
173 <hr />
174 <h1><a name="see_also">SEE ALSO</a></h1>
175 <p>UNIVCHARDET - SuikaWiki
176 &lt;http://suika.fam.cx/gate/2005/sw/UNIVCHARDET&gt;</p>
177 <p>Universal Encoding Detector: character encoding auto-detection in Python
178 &lt;http://chardet.feedparser.org/&gt;</p>
179 <p>A composite approach to language/encoding detection
180 &lt;http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html&gt;</p>
181 <p>
182 </p>
183 <hr />
184 <h1><a name="author">AUTHOR</a></h1>
185 <p>Wakaba &lt;<a href="mailto:w@suika.fam.cx">w@suika.fam.cx</a>&gt;.</p>
186 <p>
187 </p>
188 <hr />
189 <h1><a name="license">LICENSE</a></h1>
190 <p>Copyright 2007-2008 Wakaba &lt;<a href="mailto:w@suika.fam.cx">w@suika.fam.cx</a>&gt;</p>
191 <p>This library is free software; you can redistribute it
192 and/or modify it under the same terms as Perl itself.</p>
193
194 </body>
195
196 </html>
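
The manpage above says that detect_byte_string returns undef when no encoding can be detected and that load failures are only reported with warn unless the DEBUG flag is set. A minimal sketch of a caller that copes with both cases follows. The helper name detect_or_default and the fallback value 'windows-1252' are hypothetical and chosen only for illustration; they are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Whatpm): returns the detected charset
# name in lowercase, or $fallback when nothing could be detected.
sub detect_or_default {
    my ($bytes, $fallback) = @_;

    # Load the module lazily inside eval so that a missing module, or a
    # die triggered when the DEBUG flag is true, does not abort the caller.
    my $name = eval {
        require Whatpm::Charset::UniversalCharDet;
        Whatpm::Charset::UniversalCharDet->detect_byte_string($bytes);
    };

    # undef covers both "nothing detected" and "detector unavailable".
    return defined $name ? $name : $fallback;
}

# Example call; the fallback here is an arbitrary assumption.
my $bytes = "\xE3\x81\x82\xE3\x81\x84\xE3\x81\x86";
print detect_or_default($bytes, 'windows-1252'), "\n";
```

Wrapping the call in eval is one design choice among several: it treats an unavailable detector the same as an inconclusive result, which suits callers that always need some charset name. Callers that want to distinguish the two cases could inspect $@ after the eval instead.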
