Whatpm (beta)

Introduction

Whatpm is a work-in-progress set of Perl modules for Web hypertext application technologies. It is part of the manakai project.

Modules
Whatpm::CacheManifest
An HTML5 cache manifest parser.
Whatpm::Charset::UniversalCharDet
A Perl interface to the universalchardet character encoding detection library.
Whatpm::ContentChecker
A conformance checker for DOM5 HTML (the in-memory representation of a document), with partial support for Atom 1.0. (See also demo.)
Whatpm::ContentType
An implementation of the HTML5 content type sniffing algorithm.
Whatpm::CSS::SelectorsParser
A parser for groups of selectors. (See also demo.)
Whatpm::CSS::SelectorsSerializer
A serializer for groups of selectors. (See also specification and demo.)
Whatpm::CSS::Tokenizer
A CSS tokenizer. (See also demo.)
Whatpm::HTML
An implementation of the HTML5 document and fragment parsing algorithms. It can be used to convert an arbitrary string into a DOM. (See also demo.)
Whatpm::HTML::Serializer
An implementation of the HTML5 fragment serialization algorithm. (See also demo.)
Whatpm::HTMLTable
An implementation of the HTML5 table algorithm. It can be used to extract a table structure from a DOM table element node. (See also demo.)
Whatpm::IMTChecker
An Internet Media Type (aka MIME type) label conformance checker.
Whatpm::URIChecker
An IRI reference conformance checker.
Whatpm::XMLSerializer
A simple XML serializer.

Note that all of these modules are works in progress and have a number of unresolved problems.

Note also that some modules have no documentation yet.

Specification
SSFT Specification
The specification for the serialization format used for testing Selectors-related modules.
manakai's Selectors Extensions
The specification for :-manakai-* pseudo-classes implemented by Selectors-related modules.

Documentation
List of error types
Descriptions of the errors that Whatpm modules report to callback functions.
Selectors object
Description of the data structure for Selectors, as produced by Whatpm::CSS::SelectorsParser (its output) and consumed by Whatpm::CSS::SelectorsSerializer (its input).
List of predefined user data names
List of user data names defined by Whatpm modules.

Demo

Dependency

Perl 5.8 or later
A recent stable release of Perl 5.8 (or later) is recommended.
Some modules require the Encode modules, which are part of the standard Perl distribution.
Modules from manakai-core
Error
Module Whatpm::HTML requires Error, which is bundled with manakai-core.
Message::IMT::InternetMediaType
Module Whatpm::IMTChecker depends on Message::IMT::InternetMediaType, which is part of manakai-core.
Message::URI::URIReference
Modules Whatpm::URIChecker and Whatpm::CacheManifest depend on Message::URI::URIReference, which is part of manakai-core.
Message::Charset::Info
Module Whatpm::ContentChecker depends on Message::Charset::Info, which is part of manakai-core.
Message::DOM::DOMImplementation and related modules
The tests for module Whatpm::ContentChecker depend on Message::DOM::DOMImplementation and related modules in manakai-core. These modules are needed only for running the tests, not for using the module.
manakai charlib
Module Whatpm::Charset::DecodeHandle depends on modules in manakai charlib for decoding Japanese character encodings. See the documentation for manakai charlib for more information.
Python, Perl Inline::Python module, and Universal Encoding Detector
For the module Whatpm::Charset::UniversalCharDet to be useful, these software packages must be installed on the system. See the documentation for more information.
JSON
The tests for modules Whatpm::HTML and Whatpm::CSS::Tokenizer depend on JSON and related modules. These modules are needed only for running the tests, not for using the modules.
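
As a rough aid, the following sketch reports which of the Perl modules named in the list above can be loaded on the current system. It is not part of Whatpm. The helper name is_available and the output format are assumptions made for illustration, and a module that is installed but fails to compile is also reported as not loadable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Perl modules named in the dependency list above.
# Several are needed only by particular Whatpm modules or only by tests.
my @modules = qw(
  Encode
  Error
  Message::IMT::InternetMediaType
  Message::URI::URIReference
  Message::Charset::Info
  Message::DOM::DOMImplementation
  Inline::Python
  JSON
);

# Hypothetical helper: returns 1 if the module can be loaded, 0 otherwise.
sub is_available {
  my ($module) = @_;
  # Convert Foo::Bar to Foo/Bar.pm so that require accepts a runtime name.
  (my $file = "$module.pm") =~ s{::}{/}g;
  return eval { require $file; 1 } ? 1 : 0;
}

for my $module (@modules) {
  printf "%-36s %s\n", $module,
    is_available($module) ? 'found' : 'not loadable';
}
```

A module reported as not loadable matters only if the Whatpm module that depends on it is actually used. For example, JSON is needed only when running the tests.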

Distribution

The development version of Whatpm may be found in the CVS repository.

TO DO

Acknowledgments

Thanks to the html5lib team for their HTML5 parser test data.

Author

Wakaba.

License

Copyright 2007 Wakaba.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.