=head1 NAME

Whatpm::Charset::UniversalCharDet - A Perl Interface to universalchardet
Character Encoding Detection

=head1 SYNOPSIS

  require Whatpm::Charset::UniversalCharDet;
  $charset_name = Whatpm::Charset::UniversalCharDet
      ->detect_byte_string ($byte_string);
  # $charset_name: charset name (in lowercase) or undef

=head1 DESCRIPTION

The C<Whatpm::Charset::UniversalCharDet> module is a Perl interface to
universalchardet, a character encoding detection library.
universalchardet was originally developed by the Mozilla project and
has since been ported to other platforms.

The C<Whatpm::Charset::UniversalCharDet> module provides a Perl
interface to Universal Encoding Detector, a Python port of Mozilla's
universalchardet code.  A future version of this module might provide
an interface to another port of universalchardet.

=head1 METHOD

=over 4

=item I<$charset> = Whatpm::Charset::UniversalCharDet->detect_byte_string (I<$s>)

Detect the character encoding of the specified byte string.

=over 4

=item I<$s>

The byte string.

=item I<$charset>

The name of the character encoding detected by universalchardet, in
lowercase.  If no character encoding can be detected (for example,
because no implementation of universalchardet is found), C<undef> is
returned.  For the list of supported encodings, see the documentation
for Universal Encoding Detector.

=back

=back

=head1 DEPENDENCY

=over 4

=item L<Inline::Python>

A Perl module available at CPAN.  To install the module using L<CPAN>:

  root# perl -MCPAN -eshell
  cpan> install Inline::Python

=item Python

Available at .

=item Universal Encoding Detector

Available at .  Expand the archive and then execute C in the expanded
directory.

=back

=head1 TROUBLESHOOTING

The C<Whatpm::Charset::UniversalCharDet> module does not raise an error
even when it fails to load the universalchardet library; it simply
C<warn>s the error message.  This behavior can be changed by setting
the flag C<$Whatpm::Charset::UniversalCharDet::DEBUG> to a true value,
which makes any error invoke C<die> instead of C<warn>.

Common error messages are as follows:

=over 4

=item Can't locate Inline.pm in @INC

Module L<Inline> is not installed.

=item Error. You have specified 'Python' as an Inline programming language.

Module L<Inline::Python> is not installed.

=item Couldn't find an appropriate DIRECTORY for Inline to use.

The temporary directory for the L<Inline> module is not available.
See L<Inline>.

=item Error -- py_eval raised an exception

Universal Encoding Detector is not installed.

=back

=head1 SEE ALSO

UNIVCHARDET - SuikaWiki

Universal Encoding Detector: character encoding auto-detection in
Python

A composite approach to language/encoding detection

=head1 AUTHOR

Wakaba.

=head1 LICENSE

Copyright 2007-2008 Wakaba

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.

=cut

## $Date: 2008/10/05 06:42:04 $
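
=head1 EXAMPLE

The following is a minimal sketch, not part of the module itself, of
how the C<detect_byte_string> method and the
C<$Whatpm::Charset::UniversalCharDet::DEBUG> flag documented in this
file might be used together.  The helper name C<detect_or_default> and
the fallback value are hypothetical.  The sketch assumes only what is
stated here: the method returns a lowercase charset name, or an
undefined value when nothing can be detected.  The C<require> is
wrapped in C<eval> so that the sketch still runs when the module or
one of its dependencies is not installed.

  use strict;
  use warnings;

  # Hypothetical helper: returns the detected charset name, or $default
  # when detection is unavailable or yields no result.
  sub detect_or_default {
    my ($bytes, $default) = @_;

    # Load the module lazily; a missing module counts as "no result".
    my $loaded = eval { require Whatpm::Charset::UniversalCharDet; 1 };
    return $default unless $loaded;

    # Guard the call as well, on the assumption that errors may become
    # fatal once the DEBUG flag is set.
    my $charset = eval {
      Whatpm::Charset::UniversalCharDet->detect_byte_string ($bytes);
    };
    return defined $charset ? $charset : $default;
  }

  # Optionally set the flag from the TROUBLESHOOTING section while
  # debugging an installation:
  # $Whatpm::Charset::UniversalCharDet::DEBUG = 1;

  my $name = detect_or_default ("\xE3\x81\x82\xE3\x81\x84", 'iso-8859-1');
  print "charset: $name\n";

Choosing the fallback in the caller keeps the decision about a default
encoding out of the detector, which only reports what it finds.

=cut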