Char::Normalize::FullwidthHalfwidth - Fullwidth/halfwidth character normalization
    use Char::Normalize::FullwidthHalfwidth qw/normalize_width/;
    $s = <>;
    normalize_width (\$s);
    print $s;
The Char::Normalize::FullwidthHalfwidth module provides a function that normalizes fullwidth/halfwidth compatibility characters into their canonical representations.
This module provides a single function, normalize_width. It can be imported into a package through the standard Exporter mechanism:
use Char::Normalize::FullwidthHalfwidth qw/normalize_width/;
Note that the use statement does not export anything unless the function name is explicitly specified.
Alternatively, you can invoke the function in its fully qualified form as:
    require Char::Normalize::FullwidthHalfwidth;
    Char::Normalize::FullwidthHalfwidth::normalize_width (\$scalarref);
normalize_width ($scalarref)
Normalizes the fullwidth/halfwidth characters in the scalar referenced by the argument into their preferred form. The argument must be a scalar reference. The referenced scalar is modified in place and is treated as a character string (possibly with the utf8 flag set), not a byte string. The function returns the scalar reference.
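The calling convention described above (scalar reference in, in-place modification, reference returned, character strings rather than bytes) can be illustrated with a small stand-in. This is only a sketch: normalize_width_sketch is a hypothetical name, it is not the module's implementation, and it handles just U+3000 and the fullwidth ASCII range. Input bytes are assumed to be UTF-8 and are decoded with the core Encode module before the call.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for normalize_width (assumption: NOT the module's code).
# Covers only U+3000 and U+FF01..U+FF5E to show the calling convention.
sub normalize_width_sketch {
    my ($ref) = @_;
    die "argument must be a scalar reference" unless ref($ref) eq 'SCALAR';
    # U+3000 -> U+0020; U+FF01..U+FF5E -> U+0021..U+007E (constant offset 0xFEE0)
    $$ref =~ tr/\x{3000}\x{FF01}-\x{FF5E}/\x{0020}\x{0021}-\x{007E}/;
    return $ref;    # same reference that was passed in
}

# Fullwidth "Hello", IDEOGRAPHIC SPACE, fullwidth "!"
my $s = "\x{FF28}\x{FF45}\x{FF4C}\x{FF4C}\x{FF4F}\x{3000}\x{FF01}";
my $ret = normalize_width_sketch(\$s);
print "$s\n";       # prints "Hello !"

# Byte input must be decoded into a character string first.
my $bytes = "\xEF\xBC\xA1";            # UTF-8 encoding of U+FF21 FULLWIDTH LATIN CAPITAL LETTER A
my $text  = decode('UTF-8', $bytes);
normalize_width_sketch(\$text);
print "$text\n";    # prints "A"
```

Passing the undecoded bytes instead would leave them unchanged in this sketch, because none of the three bytes is itself a fullwidth code point.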
The function performs the following conversions:
U+3000 IDEOGRAPHIC SPACE (so-called fullwidth space)
    Replaced by a U+0020 SPACE (so-called halfwidth space) character.
U+FF01..U+FF5E (so-called fullwidth ASCII characters)
    Replaced by the corresponding character in the range U+0021..U+007E (so-called halfwidth ASCII characters).
U+FF61..U+FF9F (halfwidth Katakana)
    Replaced by the corresponding so-called fullwidth Katakana (or ideographic punctuation). Note that U+FF9E HALFWIDTH KATAKANA VOICED SOUND MARK and U+FF9F HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK are replaced by U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK and U+309A COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK respectively, not by their spacing variants.
U+FFE0..U+FFE6 (fullwidth symbols)
    Replaced by the corresponding canonical character.
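The Katakana and symbol conversions listed above can be sketched with a lookup table. The table below is a deliberately partial, hypothetical one (convert_sketch and %map are illustrative names, not part of the module), built from the Unicode character names. It also shows one consequence of mapping the sound marks to the combining characters U+3099 and U+309A: a later NFC pass with the core Unicode::Normalize module composes them with the preceding Katakana letter.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# Partial, illustrative table (assumption: not the module's actual data).
my %map = (
    "\x{FF61}" => "\x{3002}",   # HALFWIDTH IDEOGRAPHIC FULL STOP -> IDEOGRAPHIC FULL STOP
    "\x{FF76}" => "\x{30AB}",   # HALFWIDTH KATAKANA LETTER KA    -> KATAKANA LETTER KA
    "\x{FF8A}" => "\x{30CF}",   # HALFWIDTH KATAKANA LETTER HA    -> KATAKANA LETTER HA
    "\x{FF9E}" => "\x{3099}",   # voiced sound mark               -> combining form
    "\x{FF9F}" => "\x{309A}",   # semi-voiced sound mark          -> combining form
    "\x{FFE0}" => "\x{00A2}",   # FULLWIDTH CENT SIGN             -> CENT SIGN
    "\x{FFE5}" => "\x{00A5}",   # FULLWIDTH YEN SIGN              -> YEN SIGN
);

my $class = qr/[\x{FF61}-\x{FF9F}\x{FFE0}-\x{FFE6}]/;

# Replace each character found in the table; leave unmapped ones as they are.
sub convert_sketch {
    my ($ref) = @_;
    $$ref =~ s/($class)/exists($map{$1}) ? $map{$1} : $1/ge;
    return $ref;
}

# Render a string as a list of code points for display.
sub codepoints {
    return join ' ', map { sprintf('U+%04X', ord($_)) } split(//, $_[0]);
}

# Halfwidth KA + voiced mark, HA + semi-voiced mark, full stop, fullwidth yen
my $s = "\x{FF76}\x{FF9E}\x{FF8A}\x{FF9F}\x{FF61}\x{FFE5}";
convert_sketch(\$s);
print codepoints($s), "\n";        # U+30AB U+3099 U+30CF U+309A U+3002 U+00A5

my $composed = NFC($s);
print codepoints($composed), "\n"; # U+30AC U+30D1 U+3002 U+00A5
```

Whether to apply NFC afterwards is left to the caller in this sketch; the text above only states that the combining marks, not the spacing ones, are produced.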
Not all compatibility characters in the Halfwidth and Fullwidth Forms block of the Unicode Standard are currently supported. In particular, the halfwidth Hangul letters (U+FFA0..U+FFDC) are not converted to their fullwidth equivalents. A future version of this module is expected to address this by extending the conversion table.
Author: Wakaba <w@suika.fam.cx>.
This module was originally developed as part of SuikaWiki http://suika.fam.cx/~wakaba/wiki/sw/n/SuikaWiki.
Copyright 2008 Wakaba <w@suika.fam.cx>
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.