Whatpm is a work-in-progress set of Perl modules for Web hypertext application technologies. It is part of the manakai project.
Whatpm supports various Web standard technologies, including HTML, XHTML, XML, CSS, HTTP, and URL.
An Atom feed for manakai/Whatpm updates is available.
Note that all of these modules are work in progress and have a number of unresolved problems.
Note also that some modules have no documentation yet.
Modules related to HTML and XHTML are as follows:
Whatpm::HTML::Parser
Whatpm::HTML is deprecated in favor of Whatpm::HTML::Parser.
Whatpm::HTML::Serializer
Whatpm::HTML::Table
table element node. (See also demo.)
Whatpm::HTML::ParserData
Modules for XML support are as follows:
Whatpm::XML::Parser
An XML parser with non-draconian error handling. It can
construct a DOM tree from XML 1.0/1.1 documents that do not
rely on external entities (including the external subset entity) and
that do not contain a general entity reference to an entity whose
replacement text contains the character & or <. It also supports
XML namespaces.
It does not stop constructing the DOM tree even if it detects a
well-formedness or namespace well-formedness error. It recovers from
errors in a manner similar to HTML5's tokenization algorithm. The
combination of this module and a future extension to the
Whatpm::ContentChecker framework is expected to provide a means to
detect all well-formedness and validity errors, if desired.
(See also demo.)
Whatpm::XMLSerializer
The module for conformance checking of a DOM tree (i.e., an in-memory representation of an HTML or XML document) is as follows:
Whatpm::ContentChecker
Whatpm::HTML::Dumper
For these modules, a DOM implementation that supports manakai's
Perl binding of DOM is necessary to represent a document in memory.
The manakai-core package contains such an implementation,
Message::DOM::Implementation, but it should also be possible to use
any other implementation that supports the binding.
In addition, the following module is available:
Whatpm::Atom::Aggregator
Modules for CSS and related technologies are as follows:
Whatpm::CSS::Colors
Whatpm::CSS::Cascade
Whatpm::CSS::MediaQueryParser
Whatpm::CSS::MediaQuerySerializer
Whatpm::CSS::Parser
Whatpm::CSS::SelectorsParser
Whatpm::CSS::SelectorsSerializer
Whatpm::CSS::Tokenizer
To represent a CSSOM tree, the Whatpm::CSS::Parser module uses
modules in the manakai-core package. Those modules also provide the
serializer for the CSSOM tree, in the form of the standard css_text
CSSOM attribute.
Modules for WebVTT are as follows:
Whatpm::WebVTT::Parser
Whatpm::WebVTT::Serializer
Whatpm::WebVTT::Checker
Modules for HTTP and related technologies are as follows:
Whatpm::ContentType
Whatpm::IMTChecker
Message::MIME::Type instead.
Support for parsing HTTP headers and the like is not yet available.
The module for URL support is as follows:
Whatpm::URIChecker
Support for HTML5's realistic definition of URL is not available yet.
The following modules provide support for other Web-related technologies:
Whatpm::CacheManifest
Whatpm::Charset::DecodeHandle
Whatpm::Charset::UnicodeChecker
Whatpm::Charset::UniversalCharDet
Whatpm::LangTag
Whatpm::RDFXML
Whatpm::WebIDL
; character.
The modules listed above, which are included in the Whatpm package,
can be used directly by loading them with use or require and then
invoking their native interfaces. For more information on those
native interfaces, see the documentation and the source code of each
module.
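As a minimal sketch of the loading step described above, the following script tries to require a few of the listed modules at run time and reports which ones are installed. The helper name try_load is hypothetical and is not part of Whatpm; the script does not call any Whatpm interface, since those are described in each module's own documentation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Whatpm): try to load a module by
# name at run time and return 1 on success, 0 on failure.
sub try_load {
    my ($module) = @_;
    # Convert "Foo::Bar" into the file path "Foo/Bar.pm" expected by require.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Module names taken from the lists above.
for my $module (qw(Whatpm::HTML::Parser Whatpm::CSS::Parser Whatpm::ContentChecker)) {
    printf "%s: %s\n", $module, try_load($module) ? 'loaded' : 'not installed';
}
```

When a module is known to be installed, a plain compile-time "use Whatpm::HTML::Parser;" is the simpler choice; the run-time require form shown here is useful when a module is optional.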
In addition, some of the functionality provided by those modules can
be accessed via standardized DOM interfaces implemented by modules
included in the manakai-core package. See the documentation of the
module Message::DOM::DOMImplementation for how to access the DOM
interfaces.
The table below summarizes the relationship between Whatpm modules and DOM methods/attributes implemented by manakai-core modules:
Whatpm module | DOM methods/attributes |
---|---|
Whatpm::CSS::Cascade | get_computed_style (ViewCSS), current_style (ElementCSS) |
Whatpm::CSS::Parser, Whatpm::CSS::Serializer | CSSStyleDeclaration's attributes and methods, css_text (CSSOM interfaces) |
Whatpm::CSS::SelectorsParser | query_selector, query_selector_all (DocumentSelector, ElementSelector) |
Whatpm::CSS::SelectorsParser, Whatpm::CSS::SelectorsSerializer | selector_text (CSSStyleRule) |
Whatpm::HTML, Whatpm::HTML::Serializer, Whatpm::XML::Parser, Whatpm::XMLSerializer | inner_html (HTMLDocument, Element) |
For a description of the functionality provided by each module, see the module's pod documentation. HTML versions of the pod documentation are linked from the list of modules above.
Additional documents are available for some topics:
Whatpm::CSS::SelectorsParser (as output), and Whatpm::CSS::SelectorsSerializer (as input).
The following specifications define Whatpm-specific formats and extensions:
-manakai-* properties and property values implemented by CSS-related modules.
:-manakai-* pseudo-classes implemented by Selectors-related modules.
See also a list of applications using modules in the manakai-core package; some of them indirectly use Whatpm modules via DOM interfaces provided by manakai-core.
Encode modules, which are part of the standard Perl distribution.
Whatpm::Charset::DecodeHandle depends on modules in manakai charlib for decoding Japanese character encodings. See the documentation for manakai charlib for more information.
Inline::Python module, and Universal Encoding Detector
For Whatpm::Charset::UniversalCharDet to be meaningful, this software is required on the system. See the documentation for more information.
The manakai git repository includes all Whatpm Perl module files. You can clone the repository with:
$ git clone https://suika.suikawiki.org/gate/git/bare/manakai.git/
Whatpm::ContentType, HTML tokenization, HTML tree construction, Whatpm::ContentChecker).
meta. ## TODO and ## ISSUE in each module.)
See also the bug tracking system.
Thanks to the html5lib team for their HTML5 parser test data.
Copyright 2007‐2012 Wakaba <wakaba@suikawiki.org>.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.