Diff of /markup/html/whatpm/readme.en.html

revision 1.3 by wakaba, Sun May 20 11:12:25 2007 UTC revision 1.24 by wakaba, Mon Sep 15 12:22:09 2008 UTC
# Line 13  Technologies (beta)</title> Line 13  Technologies (beta)</title>
13  <div class="section" id="introduction">  <div class="section" id="introduction">
14  <h2>Introduction</h2>  <h2>Introduction</h2>
15    
16  <p><dfn>Whatpm</dfn> is a <em>work-in-progress</em> set of Perl modules for  <p><dfn>Whatpm</dfn> is a <em>work-in-progress</em> set of <m>P</m>erl
17  Web hypertext application technologies.</p>  <m>m</m>odules for <m>W</m>eb <m>h</m>ypertext <m>a</m>pplication
18    <m>t</m>echnologies.  It is part
19  <p>It currently contains three Perl modules:</p>  of the <a href="http://suika.fam.cx/www/2006/manakai/" rel=up>manakai</a>
20    project.</p>
21    
22  <dl>  <dl>
23  <dt><a href="Whatpm/ContentType.html"><code>Whatpm::ContentType</code></a></dt>  <dt>Modules</dt>
24    <dd>An implementation of HTML5 Content Type sniffing algorithm.</dd>  <dd><dl>
25  <dt><a href="Whatpm/HTML.html"><code>Whatpm::HTML</code></a></dt>  <dt><a href="Whatpm/CacheManifest.html"><code>Whatpm::CacheManifest</code></a></dt>
26    <dd>An implementation of HTML5 parsing algorithm and    <dd>An
27    <code>innerHTML</code> serialization.</dd>    <a href="http://www.whatwg.org/specs/web-apps/current-work/#manifests">HTML5
28      cache manifest</a> parser.</dd>
29    <dt id=whatpm-charset-universalchardet><a href="Whatpm/Charset/UniversalCharDet.html"><code>Whatpm::Charset::UniversalCharDet</code></a></dt>
30      <dd>A Perl interface to universalchardet character encoding detection
31      library.</dd>
32  <dt><a href="Whatpm/ContentChecker.html"><code>Whatpm::ContentChecker</code></a></dt>  <dt><a href="Whatpm/ContentChecker.html"><code>Whatpm::ContentChecker</code></a></dt>
33    <dd>A DOM5 HTML (in-memory representation of a document) conformance    <dd>A DOM5 HTML (in-memory representation of a document) conformance
34    checker.</dd>    checker with a partial support for Atom 1.0.  (See also
35      <a href="#demo-html-parser">demo</a>.)</dd>
36    <dt><a href="Whatpm/ContentType.html"><code>Whatpm::ContentType</code></a></dt>
37      <dd>An implementation of HTML5 Content Type sniffing algorithm.</dd>
38    <dt><a href="Whatpm/CSS/SelectorsParser.html"><code>Whatpm::CSS::SelectorsParser</code></a></dt>
39      <dd>A <a href="http://www.w3.org/TR/css3-selectors/#grouping">group of
40      selectors</a> parser.  (See also <a href="#demo-css-parser">demo</a>.)</dd>
41    <dt><a href="Whatpm/CSS/SelectorsSerializer.html"><code>Whatpm::CSS::SelectorsSerializer</code></a></dt>
42      <dd>A <a href="http://www.w3.org/TR/css3-selectors/#grouping">group of
43      selectors</a> serializer.  (See also <a href="#spec-ssft">specification</a>
44      and <a href="#demo-css-parser">demo</a>.)</dd>
45    <dt><a href="Whatpm/CSS/Tokenizer.html"><code>Whatpm::CSS::Tokenizer</code></a></dt>
46      <dd>A CSS tokenizer.  (See also <a href="#demo-css-parser">demo</a>.)</dd>
47    <dt id=module-whatpm-html><a href="Whatpm/HTML.html"><code>Whatpm::HTML</code></a></dt>
48      <dd>An implementation of HTML5 document and fragment
49      parsing algorithms.  It can be used
50      to convert an arbitrary string into a
51      <abbr title="Document Object Model">DOM</abbr>.  (See also
52      <a href="#demo-html-parser">demo</a>.)</dd>
53    <dt id=module-whatpm-html-serializer><a href="Whatpm/HTML/Serializer.html"><code>Whatpm::HTML::Serializer</code></a></dt>
54      <dd>An implementation of HTML5 fragment serialization algorithm.
55      (See also <a href="#demo-html-parser">demo</a>.)</dd>
56    <dt><a href="Whatpm/HTMLTable.html"><code>Whatpm::HTMLTable</code></a></dt>
57      <dd>An implementation of the HTML5 table algorithm.  It can be
58      used to extract a table structure from a DOM <code>table</code>
59      element node.  (See also <a href="#demo-html-table">demo</a>.)</dd>
60    <dt><a href="Whatpm/IMTChecker.html"><code>Whatpm::IMTChecker</code></a></dt>
61      <dd>An Internet Media Type (<abbr>aka</abbr> MIME type) label
62      conformance checker.</dd>
63    <dt><a href="Whatpm/URIChecker.html"><code>Whatpm::URIChecker</code></a></dt>
64      <dd>An IRI reference conformance checker.</dd>
65    
66    <dt><a href="Whatpm/WebIDL.html"><code>Whatpm::WebIDL</code></a></dt>
67      <dd>A WebIDL fragment parser.  It parses an IDL fragment, whether conforming
68      or not, and constructs a DOM-like object model for further processing.
69      Non-conforming (or broken) IDL fragment-like string will be parsed using
70      CSS-like error-tolerant parsing rules, e.g. ignoring anything until next
71      <code>;</code> character.
72    
73    <dt><a href="Whatpm/XMLSerializer.html"><code>Whatpm::XMLSerializer</code></a></dt>
74      <dd>A simple XML serializer.</dd>
75      </dl>
76    
77      <p>Note that all of these modules are <em>work in progress</em>
78      and have <a href="#todo">a number of unresolved problems</a>.</p>
79    
80      <p>Note also that some modules have no documentation for now.</p>
81      </dd>
82    <dt id=spec>Specifications</dt>
83      <dd><dl>
84        <dt id=spec-ssft><a href="http://suika.fam.cx/www/markup/selectors/ssft/ssft"><abbr title="Selectors Serialization Format for Testing">SSFT</abbr>
85        Specification</a></dt>
86          <dd>The specification for the serialization format used for
87          testing Selectors-related modules.</dd>
88        <dt id=spec-manakai-selectors"><a href="http://suika.fam.cx/gate/2005/sw/manakai/Selectors%20Extensions">manakai's
89        Selectors Extensions</a></dt>
90          <dd>The specification for <code>:-manakai-<var>*</var></code>
91          pseudo-classes implemented by Selectors-related modules.</dd>
92      </dl></dd>
93    <dt>Documentations</dt>
94      <dd><dl>
95    <dt><a href="http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types">List of error types</a></dt>
96      <dd>Description of errors to be notified to callback functions by Whatpm
97      modules.</dd>
98        <dt><a href="Whatpm/CSS/selectors-object">Selectors object</a></dt>
99          <dd>Description of data structure for Selectors, as implemented by
100          <a href="Whatpm/CSS/SelectorsParser.html"><code>Whatpm::CSS::SelectorsParser</code></a>
101          (as output), and
102          <a href="Whatpm/CSS/SelectorsSerializer.html"><code>Whatpm::CSS::SelectorsSerializer</code></a>
103          (as input)<!--, and
104          <a href="http://suika.fam.cx/www/manakai-core/lib/Message/DOM/SelectorsAPI.html"><code>Message::DOM::SelectorsAPI</code></a>-->.</dd>
105        <dt id=doc-user-data-names><a href="http://suika.fam.cx/gate/2005/sw/manakai/Predefined%20User%20Data%20Names">List of predefined user data names</a></dt>
106          <dd>List of user data names defined by Whatpm modules.</dd>
107        <dt id=doc-handles><a href="Whatpm/Charset/handles">Handle objects</a>
108          <dd>Description of character or byte stream input handle interfaces.
109        </dl>
110      </dd>
111  </dl>  </dl>
112  </div>  </div>
113    
114  <div class="section" id="demo">  <div class="section" id="demo">
115  <h2>Demo</h2>  <h2>Demo</h2>
116    
117  <p><a href="http://suika.fam.cx/gate/2007/html/parser-interface">HTML5 parser  <ul>
118  and checker demo</a></p>  <li id=demo-html-parser-nanodom><a href="http://suika.fam.cx/gate/2007/html/parser-interface">HTML5 parser
119    and checker demo</a>
120    (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/html/parser.cgi">source</a>,
121    with <a href="Whatpm/NanoDOM.html">a lightweight non-conforming
122    DOM implementation</a>)</li>
123    <li id=demo-html-parser-manakai><a href="http://suika.fam.cx/gate/2007/html/parser-manakai-interface">HTML5
124    parser and checker demo, with manakai's DOM implementation</a>
125    (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/html/parser-manakai.cgi">source</a>)</li>
126    <li id=demo-html-table><a href="http://suika.fam.cx/gate/2007/html/table-interface">HTML5 table
127    structure visualization demo</a>
128    (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/html/table.cgi">source</a>)</li>
129    <li id=demo-css-parser><a href="http://suika.fam.cx/gate/2007/css/parser-interface">CSS tokenizer
130    demo</a>
131    (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/css/parser.cgi">source</a>)</li>
132    </ul>
133    </div>
134    
135    <div class="section" id="dependency">
136    <h2>Dependency</h2>
137    
138    <dl>
139    <dt id=dependency-perl>Perl 5.8 or later</dt>
140      <dd>It is recommended to use newer stable release of Perl 5.8 (or
141      later).</dd>
142      <dd id=dependency-encode>Some modules require <code>Encode</code>
143      modules, which are part of standard Perl distribution.</dd>
144    <dt id=dependency-manakai-core>Modules from
145    <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a></dt>
146      <dd>
147        <dl>
148    <dt id=dependency-error><a href="http://search.cpan.org/author/SHLOMIF/Error-0.17009/lib/Error.pm"><code>Error</code></a></dt>
149      <dd>Module <code>Whatpm::HTML</code> requires <code>Error</code>,
150      which is bundled in
151      <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
152    <dt><code>Message::IMT::InternetMediaType</code></dt>
153      <dd>Module <code>Whatpm::IMTChecker</code> depends on
154      <code>Message::IMT::InternetMediaType</code>, which is part of
155      <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
156    <dt><code>Message::URI::URIReference</code></dt>
157      <dd>Modules <code>Whatpm::URIChecker</code> and
158      <code>Whatpm::CacheManifest</code> depend on
159      <a href="http://suika.fam.cx/www/manakai-core/lib/Message/URI/URIReference.html"><code>Message::URI::URIReference</code></a>,
160      which is part of
161      <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
162      <dt><code>Message::Charset::Info</code></dt>
163        <dd>Module <code>Whatpm::ContentChecker</code> depends on
164        <a href="http://suika.fam.cx/www/manakai-core/lib/Message/Charset/Info.html"><code>Message::Charset::Info</code></a>,
165        which is part of
166        <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
167    <dt><code>Message::DOM::DOMImplementation</code>
168      <dd>Module <code>Whatpm::URIChecker</code> depends on
169      <code>Message::DOM::DOMImplementation</code>,
170        which is part of
171        <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.
172    <dt><code>Message::DOM::DOMImplementation</code> and related modules</dt>
173      <dd><em>Testing</em> for module <code>Whatpm::ContentChecker</code>
174      depends on <code>Message::DOM::DOMImplementation</code> and related modules
175      in <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.
176      They are not required in practice.</dd>
177        </dl>
178      </dd>
179    <dt><a href="http://suika.fam.cx/www/manakai-charlib/readme">manakai
180    charlib</a></dt>
181      <dd>Module <code>Whatpm::Charset::DecodeHandle</code> depends on
182      modules in <a href="http://suika.fam.cx/www/manakai-charlib/readme">manakai
183      charlib</a> for decoding of <em>Japanese character encodings</em>.
184      See the documentation for
185      <a href="http://suika.fam.cx/www/manakai-charlib/readme">manakai
186      charlib</a> for more information.</dd>
187    <dt><a href="http://www.python.org/">Python</a>, Perl
188    <a href="http://search.cpan.org/~neilw/Inline-Python-0.22/"><code>Inline::Python</code></a>
189    module, and <a href="http://chardet.feedparser.org/">Universal Encoding
190    Detector</a></dt>
191      <dd>For the module <code>Whatpm::Charset::UniversalCharDet</code> being
192      meaningful, these softwares are requires on the system.  See the
193      <a href="Whatpm/Charset/UniversalCharDet.html#dependency">documentation</a>
194      for more information.</dd>
195    <dt><a href="http://search.cpan.org/~makamaka/JSON-1.14/"><code>JSON</code></a></dt>
196      <dd><em>Testing</em> for modules <code>Whatpm::HTML</code> and
197      <code>Whatpm::CSS::Tokenizer</code>
198      depends on <a href="http://search.cpan.org/~makamaka/JSON-1.14/"><code>JSON</code> and related modules</a>.
199      They are not required in practice.</dd>
200    </dl>
201  </div>  </div>
202    
203  <div class="section" id="download">  <div class="section" id="download">
# Line 55  repository</a>.</p> Line 218  repository</a>.</p>
218      <a href="t/tokenizer-result">HTML tokenization</a>,      <a href="t/tokenizer-result">HTML tokenization</a>,
219      <a href="t/tree-construction-result">HTML tree construction</a>,      <a href="t/tree-construction-result">HTML tree construction</a>,
220      <a href="t/content-checker-result"><code>Whatpm::ContentChecker</code></a>).</li>      <a href="t/content-checker-result"><code>Whatpm::ContentChecker</code></a>).</li>
221      <li>Merge with the <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>
222          code tree.
223    <li>Charset detection.</li>    <li>Charset detection.</li>
224    <li>Table validation.</li>    <li>Validation for <code>meta</code>.</li>
225    <li>Validation for <code>rel</code>, <code>meta</code>.</li>    <li>Validation for media queries, IRIs (against URI schemes), language tags,
   <li>Validation for media types, media queries, IRIs, language tags,  
226      and so on.</li>      and so on.</li>
227    <li><q>Whatpm</q> is a code name in fact.  Please let me know    <li>Documentations are missing for some features.</li>
228      if you have a better name.</li>    <li>XML parser<!-- with application cache selection algorithm hook-->.</li>
229    <li>In addition, each module has its own TO DO items.    <li>In addition, each module has its own TO DO items.
230      (Search for <q>## TODO</q> and <q>## ISSUE</q> in each module.)</li>      (Search for <q>## TODO</q> and <q>## ISSUE</q> in each module.)</li>
231  </ul>  </ul>
232  </div>  </div>
233    
234    <div class=section id=acknowledgments>
235    <h2>Acknowledgments</h2>
236    
237    <p>Thanks to the <a href="http://code.google.com/p/html5lib/">html5lib</a>
238    team for their
239    <a href="http://html5lib.googlecode.com/svn/trunk/testdata/">HTML5
240    parser test data</a>.</p>
241    </div>
242    
243  <div class="section" id="author">  <div class="section" id="author">
244  <h2>Author</h2>  <h2>Author</h2>
245    
246  <p><a href="http://suika.fam.cx/~wakaba/who?">Wakaba</a>.</p>  <p><a href="http://suika.fam.cx/~wakaba/who?" rel="author">Wakaba</a>.</p>
247  </div>  </div>
248    
249  <div class="section" id="license">  <div class="section" id="license">
250  <h2>License</h2>  <h2>License</h2>
251    
252  <p>Copyright 2007 Wakaba &lt;w@suika.fam.cx></p>  <p>Copyright 2007-2008 Wakaba
253    <code class="mail">&lt;<a href="mailto:w@suika.fam.cx"
254        rel="author">w@suika.fam.cx</a>></code>.</p>
255    
256  <p>This library is free software; you can redistribute it and/or modify  <p>This library is free software; you can redistribute it and/or modify
257  it under the same terms as Perl itself.</p>  it under the same terms as Perl itself.</p>
