Contents of /markup/html/whatpm/Whatpm/Charset/UniversalCharDet.html



Revision 1.2
Sun Oct 5 06:42:04 2008 UTC (16 years, 6 months ago) by wakaba
Branch: MAIN
CVS Tags: HEAD
Changes since 1.1: +59 -50 lines
File MIME type: text/html
++ ChangeLog	5 Oct 2008 06:41:29 -0000
2008-10-05  Wakaba  <wakaba@suika.fam.cx>

	* readme.en.html: Missing link to Whatpm::RDFXML module is added.
	Typo fixed.  Noted that Media Query Level 3 is not supported yet.
	Linked to WebHACC supported standards documentation.

++ whatpm/Whatpm/Charset/ChangeLog	5 Oct 2008 06:41:56 -0000
2008-10-05  Wakaba  <wakaba@suika.fam.cx>

	* UniversalCharDet.pod: Typo fixed.

1 wakaba 1.2 <?xml version="1.0" ?>
2 wakaba 1.1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3     <html xmlns="http://www.w3.org/1999/xhtml">
4     <head>
5     <title>Whatpm::Charset::UniversalCharDet - A Perl Interface to universalchardet
6     Character Encoding Detection</title>
7     <link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />
8 wakaba 1.2 <meta http-equiv="content-type" content="text/html; charset=utf-8" />
9     <link rev="made" href="mailto:wakaba@suika.fam.cx" />
10 wakaba 1.1 </head>
11    
12     <body>
13    
14 wakaba 1.2
15     <!-- INDEX BEGIN -->
16     <div name="index">
17 wakaba 1.1 <p><a name="__index__"></a></p>
18    
19     <ul>
20    
21     <li><a href="#name">NAME</a></li>
22     <li><a href="#synopsis">SYNOPSIS</a></li>
23     <li><a href="#description">DESCRIPTION</a></li>
24     <li><a href="#method">METHOD</a></li>
25     <li><a href="#dependency">DEPENDENCY</a></li>
26     <li><a href="#troubleshooting">TROUBLESHOOTING</a></li>
27     <li><a href="#see_also">SEE ALSO</a></li>
28     <li><a href="#author">AUTHOR</a></li>
29     <li><a href="#license">LICENSE</a></li>
30     </ul>
31 wakaba 1.2
32     <hr name="index" />
33     </div>
34 wakaba 1.1 <!-- INDEX END -->
35    
36     <p>
37     </p>
38     <h1><a name="name">NAME</a></h1>
39     <p>Whatpm::Charset::UniversalCharDet - A Perl Interface to universalchardet
40     Character Encoding Detection</p>
41     <p>
42     </p>
43     <hr />
44     <h1><a name="synopsis">SYNOPSIS</a></h1>
45     <pre>
46     require Whatpm::Charset::UniversalCharDet;
47     $charset_name = Whatpm::Charset::UniversalCharDet
48     -&gt;detect_byte_string ($byte_string);
49     # $charset_name: charset name (in lowercase) or undef</pre>
50     <p>
51     </p>
52     <hr />
53     <h1><a name="description">DESCRIPTION</a></h1>
54     <p>The <code>Whatpm::Charset::UniversalCharDet</code> module is a Perl interface to
55     the universalchardet character encoding detector.</p>
56     <p>universalchardet was originally developed by the Mozilla project
57     and later ported to other platforms. The <code>Whatpm::Charset::UniversalCharDet</code>
58     module provides a Perl interface to Universal Encoding Detector,
59     a Python port of Mozilla's universalchardet code. Future
60     versions of this module might provide an interface to another
61     port of universalchardet.</p>
62     <p>
63     </p>
64     <hr />
65     <h1><a name="method">METHOD</a></h1>
66     <dl>
67 wakaba 1.2 <dt><strong><a name="detect_byte_string" class="item"><em>$charset</em> = Whatpm::Charset::UniversalCharDet-&gt;detect_byte_string (<em>$s</em>)</a></strong>
68    
69 wakaba 1.1 <dd>
70 wakaba 1.2 <p>Detect the character encoding of the specified byte string.</p>
71 wakaba 1.1 </dd>
72     <dl>
73 wakaba 1.2 <dt><strong><a name="_s" class="item"><em>$s</em></a></strong>
74    
75 wakaba 1.1 <dd>
76 wakaba 1.2 <p>The byte string.</p>
77 wakaba 1.1 </dd>
78 wakaba 1.2 </li>
79     <dt><strong><a name="_charset" class="item"><em>$charset</em></a></strong>
80    
81 wakaba 1.1 <dd>
82 wakaba 1.2 <p>The name of the character encoding detected by universalchardet,
83 wakaba 1.1 in lowercase.
84     If no character encoding can be detected (for example, because no
85 wakaba 1.2 universalchardet implementation is found), <code>undef</code> is returned.</p>
86 wakaba 1.1 </dd>
87     <dd>
88     <p>For the list of supported encodings, see the documentation for
89     Universal Encoding Detector
90     &lt;http://chardet.feedparser.org/docs/supported-encodings.html&gt;.</p>
91     </dd>
92 wakaba 1.2 </li>
93     </dl>
94 wakaba 1.1 </dl>
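<p>The return-value contract described above (a lowercase encoding name, or an
undefined value when nothing is detected or no detector is available) can be
sketched on the Python side as follows. This is only an illustration, not the
module's actual code: the function name <code>detect_byte_string</code> is borrowed
from the Perl method, and the sketch assumes that Universal Encoding Detector
is importable as <code>chardet</code> and exposes <code>chardet.detect</code>.</p>

```python
def detect_byte_string(byte_string):
    """Return the detected encoding name in lowercase, or None.

    None plays the role of Perl's undef: it is returned both when the
    detector cannot decide and when the detector is not installed.
    """
    try:
        # Assumption: Universal Encoding Detector is installed as "chardet".
        import chardet
    except ImportError:
        return None

    # chardet.detect returns a dict with "encoding" and "confidence" keys;
    # "encoding" is None when no encoding could be determined.
    result = chardet.detect(byte_string)
    encoding = result.get("encoding")
    return encoding.lower() if encoding else None


if __name__ == "__main__":
    sample = "Whatpm \u6587\u5b57\u30b3\u30fc\u30c9".encode("utf-8")
    print(detect_byte_string(sample))
```

<p>Treating a missing detector the same way as an undetectable input keeps
callers simple: they only need to check for a missing value before falling
back to a default encoding.</p>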
95     <p>
96     </p>
97     <hr />
98     <h1><a name="dependency">DEPENDENCY</a></h1>
99     <dl>
100 wakaba 1.2 <dt><strong><a name="inline_python" class="item"><a href="../../Inline/Python.html">the Inline::Python manpage</a></a></strong>
101    
102 wakaba 1.1 <dd>
103 wakaba 1.2 <p>A Perl module available at CPAN
104     &lt;http://search.cpan.org/~neilw/Inline-Python-0.22/&gt;.</p>
105 wakaba 1.1 </dd>
106     <dd>
107     <p>To install the module using <em>CPAN.pm</em>:</p>
108     </dd>
109     <dd>
110     <pre>
111     root# perl -MCPAN -eshell
112     cpan&gt; install Inline::Python</pre>
113     </dd>
114 wakaba 1.2 </li>
115     <dt><strong><a name="python" class="item">Python</a></strong>
116    
117 wakaba 1.1 <dd>
118 wakaba 1.2 <p>Available at &lt;http://www.python.org/download/&gt;.</p>
119 wakaba 1.1 </dd>
120 wakaba 1.2 </li>
121     <dt><strong><a name="universal_encoding_detector" class="item">Universal Encoding Detector</a></strong>
122    
123 wakaba 1.1 <dd>
124 wakaba 1.2 <p>Available at &lt;http://chardet.feedparser.org/download/&gt;.</p>
125 wakaba 1.1 </dd>
126     <dd>
127     <p>Expand the archive and then execute <code>python setup.py install</code>
128     in the expanded directory.</p>
129     </dd>
130 wakaba 1.2 </li>
131     </dl>
132 wakaba 1.1 <p>
133     </p>
134     <hr />
135     <h1><a name="troubleshooting">TROUBLESHOOTING</a></h1>
136     <p>The <code>Whatpm::Charset::UniversalCharDet</code> module does not raise
137     an error even when it fails to load the universalchardet library;
138     it simply <code>warn</code>s with the error message.</p>
139 wakaba 1.2 <p>This behavior can be changed by setting the flag
140     <code>$Whatpm::Charset::UniversalCharDet::DEBUG</code> to a true value, which makes
141     any error invoke <code>die</code> instead of <code>warn</code>.</p>
142     <p>Common error messages are as follows:</p>
143 wakaba 1.1 <dl>
144 wakaba 1.2 <dt><strong><a name="can_t_locate_inline_pm_in_inc" class="item">Can't locate Inline.pm in @INC</a></strong>
145    
146 wakaba 1.1 <dd>
147 wakaba 1.2 <p>The <em>Inline</em> module is not installed.</p>
148 wakaba 1.1 </dd>
149 wakaba 1.2 </li>
150     <dt><strong><a name="error_you_have_specified_python_as_an_inline_programming_language" class="item">Error. You have specified 'Python' as an Inline programming language.</a></strong>
151    
152 wakaba 1.1 <dd>
153 wakaba 1.2 <p>The <a href="../../Inline/Python.html">Inline::Python</a> module is not installed.</p>
154 wakaba 1.1 </dd>
155 wakaba 1.2 </li>
156     <dt><strong><a name="couldn_t_find_an_appropriate_directory_for_inline_to_use" class="item">Couldn't find an appropriate DIRECTORY for Inline to use.</a></strong>
157    
158 wakaba 1.1 <dd>
159 wakaba 1.2 <p>The temporary directory for the <em>Inline</em> module is not available.
160 wakaba 1.1 See <a href="../../Inline/Python.html#the_inline_directory">The Inline DIRECTORY in the Inline::Python manpage</a> or
161 wakaba 1.2 &lt;http://search.cpan.org/~ingy/Inline-0.44/Inline.pod#The_Inline_DIRECTORY&gt;.</p>
162 wakaba 1.1 </dd>
163 wakaba 1.2 </li>
164     <dt><strong><a name="error_py_eval_raised_an_exception" class="item">Error -- py_eval raised an exception</a></strong>
165    
166 wakaba 1.1 <dd>
167 wakaba 1.2 <p>Universal Encoding Detector is not installed.</p>
168 wakaba 1.1 </dd>
169 wakaba 1.2 </li>
170     </dl>
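<p>The warn-by-default, die-when-debugging policy described in this section can
be sketched as below. This is a hedged illustration in Python rather than the
module's Perl code: <code>report_error</code> and <code>load_detector</code> are hypothetical
names, the <code>debug</code> argument stands in for the
<code>$Whatpm::Charset::UniversalCharDet::DEBUG</code> flag, and the detector is assumed
to be importable as <code>chardet</code>.</p>

```python
import warnings


def report_error(message, debug=False):
    """Raise when debug is true (like die); otherwise only warn."""
    if debug:
        raise RuntimeError(message)
    warnings.warn(message)
    return None


def load_detector(debug=False):
    """Return the detector module, or None if it cannot be loaded."""
    try:
        # Assumption: Universal Encoding Detector is installed as "chardet".
        import chardet
    except ImportError as error:
        return report_error(str(error), debug)
    return chardet
```

<p>With the flag off, a missing dependency only produces a warning and the
caller receives no detector; with the flag on, the same failure stops the
program so that the underlying error message is not overlooked.</p>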
171 wakaba 1.1 <p>
172     </p>
173     <hr />
174     <h1><a name="see_also">SEE ALSO</a></h1>
175     <p>UNIVCHARDET - SuikaWiki
176     &lt;http://suika.fam.cx/gate/2005/sw/UNIVCHARDET&gt;</p>
177     <p>Universal Encoding Detector: character encoding auto-detection in Python
178     &lt;http://chardet.feedparser.org/&gt;</p>
179     <p>A composite approach to language/encoding detection
180     &lt;http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html&gt;</p>
181     <p>
182     </p>
183     <hr />
184     <h1><a name="author">AUTHOR</a></h1>
185     <p>Wakaba &lt;<a href="mailto:w@suika.fam.cx">w@suika.fam.cx</a>&gt;.</p>
186     <p>
187     </p>
188     <hr />
189     <h1><a name="license">LICENSE</a></h1>
190 wakaba 1.2 <p>Copyright 2007-2008 Wakaba &lt;<a href="mailto:w@suika.fam.cx">w@suika.fam.cx</a>&gt;</p>
191 wakaba 1.1 <p>This library is free software; you can redistribute it
192     and/or modify it under the same terms as Perl itself.</p>
193    
194     </body>
195    
196     </html>
