| 13 |
<div class="section" id="introduction"> |
<div class="section" id="introduction"> |
| 14 |
<h2>Introduction</h2> |
<h2>Introduction</h2> |
| 15 |
|
|
| 16 |
<p><dfn>Whatpm</dfn> is a <em>work-in-progress</em> set of <m>P</m>erl |
<p><dfn>Whatpm</dfn> is a <em>work-in-progress</em> set of |
| 17 |
<m>m</m>odules for <m>W</m>eb <m>h</m>ypertext <m>a</m>pplication |
<mark>P</mark>erl <mark>m</mark>odules for <mark>W</mark>eb |
| 18 |
<m>t</m>echnologies. It is part |
<mark>h</mark>ypertext <mark>a</mark>pplication |
| 19 |
of the <a href="http://suika.fam.cx/www/2006/manakai/" rel=up>manakai</a> |
<mark>t</mark>echnologies. It is part of the <a |
| 20 |
|
href="http://suika.fam.cx/www/2006/manakai/" rel=up>manakai</a> |
| 21 |
project.</p> |
project.</p> |
| 22 |
|
|
| 23 |
|
<p>Whatpm supports various Web standard technologies, including <a |
| 24 |
|
href="#modules-html">HTML, XHTML</a>, <a href="#modules-xml">XML</a>, |
| 25 |
|
<a href="#modules-css">CSS</a>, <a href="#modules-http">HTTP</a>, and |
| 26 |
|
<a href="#modules-url">URL</a>. |
| 27 |
|
</div> |
| 28 |
|
|
| 29 |
|
<div class=section id=modules> |
| 30 |
|
<h2>Modules</h2> |
| 31 |
|
|
| 32 |
|
<div class=section id=modules-html-xml> |
| 33 |
|
<h3>Modules for HTML and XML</h3> |
| 34 |
|
|
| 35 |
|
<p id=modules-html>Modules related to HTML and XHTML are as follows: |
| 36 |
|
<dl> |
| 37 |
|
<dt id=module-whatpm-html><a href="Whatpm/HTML.html"><code>Whatpm::HTML</code></a></dt> |
| 38 |
|
<dd>An implementation of HTML5 document and fragment |
| 39 |
|
parsing algorithms. It can be used |
| 40 |
|
to convert an arbitrary string into a |
| 41 |
|
<abbr title="Document Object Model">DOM</abbr>. (See also |
| 42 |
|
<a href="#demo-html-parser">demo</a>.)</dd> |
| 43 |
|
<dt id=module-whatpm-html-serializer><a href="Whatpm/HTML/Serializer.html"><code>Whatpm::HTML::Serializer</code></a></dt> |
| 44 |
|
<dd>An implementation of the HTML5 fragment serialization algorithm. |
| 45 |
|
(See also <a href="#demo-html-parser">demo</a>.)</dd> |
| 46 |
|
<dt><a href="Whatpm/HTMLTable.html"><code>Whatpm::HTMLTable</code></a></dt> |
| 47 |
|
<dd>An implementation of the HTML5 table algorithm. It can be |
| 48 |
|
used to extract a table structure from a DOM <code>table</code> |
| 49 |
|
element node. (See also <a href="#demo-html-table">demo</a>.)</dd> |
| 50 |
|
</dl> |
| 51 |
|
|
| 52 |
|
<p id=modules-xml>The module for <i>tentative</i> XML support is as follows: |
| 53 |
|
<dl> |
| 54 |
|
<dt><a href="Whatpm/XMLSerializer.html"><code>Whatpm::XMLSerializer</code></a></dt> |
| 55 |
|
<dd>A simple XML serializer.</dd> |
| 56 |
|
</dl> |
| 57 |
|
|
| 58 |
|
<p>A <i>real</i> XML parser and serializer are not available yet. |
| 59 |
|
|
| 60 |
|
<p id=modules-cc>The module for conformance checking of a DOM tree (i.e. |
| 61 |
|
an in-memory representation of an HTML or XML document) is as follows: |
| 62 |
<dl> |
<dl> |
|
<dt>Modules</dt> |
|
|
<dd><dl> |
|
|
<dt><a href="Whatpm/CacheManifest.html"><code>Whatpm::CacheManifest</code></a></dt> |
|
|
<dd>An |
|
|
<a href="http://www.whatwg.org/specs/web-apps/current-work/#manifests">HTML5 |
|
|
cache manifest</a> parser.</dd> |
|
|
<dt id=whatpm-charset-universalchardet><a href="Whatpm/Charset/UniversalCharDet.html"><code>Whatpm::Charset::UniversalCharDet</code></a></dt> |
|
|
<dd>A Perl interface to universalchardet character encoding detection |
|
|
library.</dd> |
|
| 63 |
<dt><a href="Whatpm/ContentChecker.html"><code>Whatpm::ContentChecker</code></a></dt> |
<dt><a href="Whatpm/ContentChecker.html"><code>Whatpm::ContentChecker</code></a></dt> |
| 64 |
<dd>A DOM5 HTML (in-memory representation of a document) conformance |
<dd>A DOM5 HTML (in-memory representation of a document) conformance |
| 65 |
checker with partial support for Atom 1.0. (See also |
checker with partial support for Atom 1.0. (See also |
| 66 |
<a href="#demo-html-parser">demo</a>.)</dd> |
<a href="#demo-html-parser">demo</a>.)</dd> |
| 67 |
<dt><a href="Whatpm/ContentType.html"><code>Whatpm::ContentType</code></a></dt> |
</dl> |
| 68 |
<dd>An implementation of the HTML5 Content Type sniffing algorithm.</dd> |
|
| 69 |
|
<p>Currently, conformance checking of HTML/XHTML and Atom documents |
| 70 |
|
is supported. |
| 71 |
|
</div> |
| 72 |
|
|
| 73 |
<dt><a href="Whatpm/CSS/Cascade.html"><code>Whatpm::CSS::Cascade</code></a> |
<div class=section id=modules-css> |
| 74 |
<dd>A media-independent implementation of CSS cascading and value |
<h3>Modules for CSS</h3> |
|
computation. (See also <a href="#demo-css-parser">demo</a>.) |
|
|
<dt><a href="Whatpm/CSS/Parser.html"><code>Whatpm::CSS::Parser</code></a> |
|
|
<dd>A CSS parser that constructs CSSOM trees from style sheets. (See |
|
|
also <a href="#demo-css-parser">demo</a>.) |
|
| 75 |
|
|
| 76 |
|
<p>Modules for CSS and related technologies are as follows: |
| 77 |
|
<dl> |
| 78 |
|
<dt><a href="Whatpm/CSS/Cascade.html"><code>Whatpm::CSS::Cascade</code></a> |
| 79 |
|
<dd>A media-independent implementation of CSS cascading and value |
| 80 |
|
computation. (See also <a href="#demo-css-parser">demo</a>.) |
| 81 |
|
<dt><a href="Whatpm/CSS/Parser.html"><code>Whatpm::CSS::Parser</code></a> |
| 82 |
|
<dd>A CSS parser that constructs CSSOM trees from style sheets. (See |
| 83 |
|
also <a href="#demo-css-parser">demo</a>.) |
| 84 |
<dt><a href="Whatpm/CSS/SelectorsParser.html"><code>Whatpm::CSS::SelectorsParser</code></a></dt> |
<dt><a href="Whatpm/CSS/SelectorsParser.html"><code>Whatpm::CSS::SelectorsParser</code></a></dt> |
| 85 |
<dd>A <a href="http://www.w3.org/TR/css3-selectors/#grouping">group of |
<dd>A <a href="http://www.w3.org/TR/css3-selectors/#grouping">group of |
| 86 |
selectors</a> parser. (See also <a href="#demo-css-parser">demo</a>.)</dd> |
selectors</a> parser. (See also <a href="#demo-css-parser">demo</a>.)</dd> |
| 90 |
and <a href="#demo-css-parser">demo</a>.)</dd> |
and <a href="#demo-css-parser">demo</a>.)</dd> |
| 91 |
<dt><a href="Whatpm/CSS/Tokenizer.html"><code>Whatpm::CSS::Tokenizer</code></a></dt> |
<dt><a href="Whatpm/CSS/Tokenizer.html"><code>Whatpm::CSS::Tokenizer</code></a></dt> |
| 92 |
<dd>A CSS tokenizer. (See also <a href="#demo-css-parser">demo</a>.)</dd> |
<dd>A CSS tokenizer. (See also <a href="#demo-css-parser">demo</a>.)</dd> |
| 93 |
<dt id=module-whatpm-html><a href="Whatpm/HTML.html"><code>Whatpm::HTML</code></a></dt> |
</dl> |
| 94 |
<dd>An implementation of HTML5 document and fragment |
</div> |
| 95 |
parsing algorithms. It can be used |
|
| 96 |
to convert an arbitrary string into a |
<div class=section id=modules-http> |
| 97 |
<abbr title="Document Object Model">DOM</abbr>. (See also |
<h3>Modules for HTTP</h3> |
| 98 |
<a href="#demo-html-parser">demo</a>.)</dd> |
|
| 99 |
<dt id=module-whatpm-html-serializer><a href="Whatpm/HTML/Serializer.html"><code>Whatpm::HTML::Serializer</code></a></dt> |
<p>Modules for HTTP and related technologies are as follows: |
| 100 |
<dd>An implementation of the HTML5 fragment serialization algorithm. |
<dl> |
| 101 |
(See also <a href="#demo-html-parser">demo</a>.)</dd> |
<dt><a href="Whatpm/ContentType.html"><code>Whatpm::ContentType</code></a></dt> |
| 102 |
<dt><a href="Whatpm/HTMLTable.html"><code>Whatpm::HTMLTable</code></a></dt> |
<dd>An implementation of the HTML5 Content Type sniffing algorithm.</dd> |
|
<dd>An implementation of the HTML5 table algorithm. It can be |
|
|
used to extract a table structure from a DOM <code>table</code> |
|
|
element node. (See also <a href="#demo-html-table">demo</a>.)</dd> |
|
| 103 |
<dt><a href="Whatpm/IMTChecker.html"><code>Whatpm::IMTChecker</code></a></dt> |
<dt><a href="Whatpm/IMTChecker.html"><code>Whatpm::IMTChecker</code></a></dt> |
| 104 |
<dd>An Internet Media Type (<abbr>aka</abbr> MIME type) label |
<dd>An Internet Media Type (<abbr>aka</abbr> MIME type) label |
| 105 |
conformance checker.</dd> |
conformance checker.</dd> |
| 106 |
|
</dl> |
| 107 |
|
|
| 108 |
|
<p>Currently, support for parsing HTTP headers and the like is not |
| 109 |
|
yet available. |
| 110 |
|
</div> |
| 111 |
|
|
| 112 |
|
<div class=section id=modules-url> |
| 113 |
|
<h3>Module for URL</h3> |
| 114 |
|
|
| 115 |
|
<p>The module for URL support is as follows: |
| 116 |
|
<dl> |
| 117 |
<dt><a href="Whatpm/URIChecker.html"><code>Whatpm::URIChecker</code></a></dt> |
<dt><a href="Whatpm/URIChecker.html"><code>Whatpm::URIChecker</code></a></dt> |
| 118 |
<dd>An IRI reference conformance checker.</dd> |
<dd>An IRI reference conformance checker.</dd> |
| 119 |
|
</dl> |
| 120 |
|
|
| 121 |
|
<p>Support for HTML5's realistic definition of URL is not available yet. |
| 122 |
|
</div> |
| 123 |
|
|
| 124 |
|
<div class=section id=modules-misc> |
| 125 |
|
<h3>Modules for other technologies</h3> |
| 126 |
|
|
| 127 |
|
<p>The following modules provide support for other Web-related technologies: |
| 128 |
|
<dl> |
| 129 |
|
<dt><a href="Whatpm/CacheManifest.html"><code>Whatpm::CacheManifest</code></a></dt> |
| 130 |
|
<dd>An |
| 131 |
|
<a href="http://www.whatwg.org/specs/web-apps/current-work/#manifests">HTML5 |
| 132 |
|
cache manifest</a> parser.</dd> |
| 133 |
|
<dt id=whatpm-charset-universalchardet><a href="Whatpm/Charset/UniversalCharDet.html"><code>Whatpm::Charset::UniversalCharDet</code></a></dt> |
| 134 |
|
<dd>A Perl interface to universalchardet character encoding detection |
| 135 |
|
library.</dd> |
| 136 |
|
<dt><a href="Whatpm/LangTag.html"><code>Whatpm::LangTag</code></a> |
| 137 |
|
<dd>A language tag parser and conformance checker, supporting both |
| 138 |
|
the older RFC 3066 definition and the newer RFC 4646 definition. (See also |
| 139 |
|
<a href="#demo-langtag">demo</a>.) |
| 140 |
<dt><a href="Whatpm/WebIDL.html"><code>Whatpm::WebIDL</code></a></dt> |
<dt><a href="Whatpm/WebIDL.html"><code>Whatpm::WebIDL</code></a></dt> |
| 141 |
<dd>A WebIDL fragment parser. It parses an IDL fragment, whether conforming |
<dd>A WebIDL fragment parser. It parses an IDL fragment, whether conforming |
| 142 |
or not, and constructs a DOM-like object model for further processing. |
or not, and constructs a DOM-like object model for further processing. |
| 143 |
A non-conforming (or broken) IDL fragment-like string is parsed using |
A non-conforming (or broken) IDL fragment-like string is parsed using |
| 144 |
CSS-like error-tolerant parsing rules, e.g. ignoring anything until the next |
CSS-like error-tolerant parsing rules, e.g. ignoring anything until the next |
| 145 |
<code>;</code> character. |
<code>;</code> character. |
| 146 |
|
</dl> |
| 147 |
<dt><a href="Whatpm/LangTag.html"><code>Whatpm::LangTag</code></a> |
</div> |
|
<dd>A language tag parser and conformance checker, supporting both |
|
|
the older RFC 3066 definition and the newer RFC 4646 definition. (See also |
|
|
<a href="#demo-langtag">demo</a>.) |
|
|
|
|
|
<dt><a href="Whatpm/XMLSerializer.html"><code>Whatpm::XMLSerializer</code></a></dt> |
|
|
<dd>A simple XML serializer.</dd> |
|
|
</dl> |
|
| 148 |
|
|
| 149 |
<p>Note that all of these modules are <em>work in progress</em> |
<p>Note that all of these modules are <em>work in progress</em> |
| 150 |
and have <a href="#todo">a number of unresolved problems</a>.</p> |
and have <a href="#todo">a number of unresolved problems</a>.</p> |
| 151 |
|
|
| 152 |
<p>Note also that some modules have no documentation for now.</p> |
<p>Note also that some modules have no documentation for now.</p> |
| 153 |
</dd> |
|
| 154 |
<dt id=spec>Specifications</dt> |
<!-- Whatpm::ContentChecker::*, Whatpm::H2H, Whatpm::NanoDOM, and |
| 155 |
<dd><dl> |
Whatpm::XMLParser are intentionally omitted from the list. --> |
| 156 |
<dt id=spec-ssft><a href="http://suika.fam.cx/www/markup/selectors/ssft/ssft"><abbr title="Selectors Serialization Format for Testing">SSFT</abbr> |
</div> |
| 157 |
Specification</a></dt> |
|
| 158 |
<dd>The specification for the serialization format used for |
<div class=section id=documents> |
| 159 |
testing Selectors-related modules.</dd> |
<h2>Documents</h2> |
| 160 |
<dt id=spec-manakai-selectors><a href="http://suika.fam.cx/gate/2005/sw/manakai/Selectors%20Extensions">manakai's |
|
| 161 |
Selectors Extensions</a></dt> |
<p>For a description of the functionality provided by each module, see the |
| 162 |
<dd>The specification for <code>:-manakai-<var>*</var></code> |
<abbr>pod</abbr> documentation of that module. HTML versions of the |
| 163 |
pseudo-classes implemented by Selectors-related modules.</dd> |
<abbr>pod</abbr> documentation are linked from the <a |
| 164 |
</dl></dd> |
href="#modules">list of modules above</a>. |
| 165 |
<dt>Documentation</dt> |
|
| 166 |
<dd><dl> |
<p>In addition, there are separate documents on some topics: |
| 167 |
|
<dl> |
| 168 |
<dt><a href="http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types">List of error types</a></dt> |
<dt><a href="http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types">List of error types</a></dt> |
| 169 |
|
<!-- @@ TODO: Need to update the link - the document above is out of date --> |
| 170 |
<dd>Description of the errors reported to callback functions by Whatpm |
<dd>Description of the errors reported to callback functions by Whatpm |
| 171 |
modules.</dd> |
modules.</dd> |
| 172 |
<dt><a href="Whatpm/CSS/selectors-object">Selectors object</a></dt> |
|
| 173 |
<dd>Description of the data structure for Selectors, as implemented by |
<dt><a href="Whatpm/CSS/selectors-object">Selectors object</a></dt> |
| 174 |
<a href="Whatpm/CSS/SelectorsParser.html"><code>Whatpm::CSS::SelectorsParser</code></a> |
<dd>Description of the data structure for Selectors, as implemented by |
| 175 |
(as output), and |
<a href="Whatpm/CSS/SelectorsParser.html"><code>Whatpm::CSS::SelectorsParser</code></a> |
| 176 |
<a href="Whatpm/CSS/SelectorsSerializer.html"><code>Whatpm::CSS::SelectorsSerializer</code></a> |
(as output), and |
| 177 |
(as input)<!--, and |
<a href="Whatpm/CSS/SelectorsSerializer.html"><code>Whatpm::CSS::SelectorsSerializer</code></a> |
| 178 |
<a href="http://suika.fam.cx/www/manakai-core/lib/Message/DOM/SelectorsAPI.html"><code>Message::DOM::SelectorsAPI</code></a>-->.</dd> |
(as input)<!--, and |
| 179 |
<dt id=doc-user-data-names><a href="http://suika.fam.cx/gate/2005/sw/manakai/Predefined%20User%20Data%20Names">List of predefined user data names</a></dt> |
<a href="http://suika.fam.cx/www/manakai-core/lib/Message/DOM/SelectorsAPI.html"><code>Message::DOM::SelectorsAPI</code></a>-->.</dd> |
| 180 |
<dd>List of user data names defined by Whatpm modules.</dd> |
|
| 181 |
<dt id=doc-handles><a href="Whatpm/Charset/handles">Handle objects</a> |
<dt id=doc-user-data-names><a href="http://suika.fam.cx/gate/2005/sw/manakai/Predefined%20User%20Data%20Names">List of predefined user data names</a></dt> |
| 182 |
<dd>Description of character or byte stream input handle interfaces. |
<dd>List of user data names defined by Whatpm modules.</dd> |
| 183 |
</dl> |
|
| 184 |
</dd> |
<dt id=doc-handles><a href="Whatpm/Charset/handles">Handle objects</a> |
| 185 |
|
<dd>Description of character or byte stream input handle interfaces. |
| 186 |
|
</dl> |
| 187 |
|
|
| 188 |
|
<p>The following specifications define Whatpm-specific formats and extensions: |
| 189 |
|
<dl id=spec> |
| 190 |
|
<dt id=spec-ssft><a href="http://suika.fam.cx/www/markup/selectors/ssft/ssft"><abbr title="Selectors Serialization Format for Testing">SSFT</abbr> |
| 191 |
|
Specification</a></dt> |
| 192 |
|
<dd>The specification for the serialization format used for |
| 193 |
|
testing Selectors-related modules.</dd> |
| 194 |
|
|
| 195 |
|
<dt id=spec-manakai-selectors><a href="http://suika.fam.cx/gate/2005/sw/manakai/Selectors%20Extensions">manakai's |
| 196 |
|
Selectors Extensions</a></dt> |
| 197 |
|
<dd>The specification for <code>:-manakai-<var>*</var></code> |
| 198 |
|
pseudo-classes implemented by Selectors-related modules.</dd> |
| 199 |
</dl> |
</dl> |
|
<!-- Whatpm::ContentChecker::*, Whatpm::H2H, Whatpm::NanoDOM, and |
|
|
Whatpm::XMLParser are intentionally omitted from the list. --> |
|
| 200 |
</div> |
</div> |
| 201 |
|
|
| 202 |
<div class="section" id="demo"> |
<div class="section" id="demo"> |