Diff of /markup/html/whatpm/readme.en.html

revision 1.5 by wakaba, Mon Jul 16 04:51:21 2007 UTC revision 1.26 by wakaba, Wed Sep 17 06:16:06 2008 UTC
# Line 13  Technologies (beta)</title> Line 13  Technologies (beta)</title>
13  <div class="section" id="introduction">  <div class="section" id="introduction">
14  <h2>Introduction</h2>  <h2>Introduction</h2>
15    
16  <p><dfn>Whatpm</dfn> is a <em>work-in-progress</em> set of Perl modules for  <p><dfn>Whatpm</dfn> is a <em>work-in-progress</em> set of
17  Web hypertext application technologies.</p>  <mark>P</mark>erl <mark>m</mark>odules for <mark>W</mark>eb
18    <mark>h</mark>ypertext <mark>a</mark>pplication
19    <mark>t</mark>echnologies.  It is part of the <a
20    href="http://suika.fam.cx/www/2006/manakai/" rel=up>manakai</a>
21    project.</p>
22    
23    <p>Whatpm supports various Web standard technologies, including <a
24    href="#modules-html">HTML, XHTML</a>, <a href="#modules-xml">XML</a>,
25    <a href="#modules-css">CSS</a>, <a href="#modules-http">HTTP</a>, and
26    <a href="#modules-url">URL</a>.
27    </div>
28    
29    <div class=section id=modules>
30    <h2>Modules</h2>
31    
32    <div class=section id=modules-html-xml>
33    <h3>Modules for HTML and XML</h3>
34    
35    <p id=modules-html>Modules related to HTML and XHTML are as follows:
36    <dl>
37    <dt id=module-whatpm-html><a href="Whatpm/HTML.html"><code>Whatpm::HTML</code></a></dt>
38      <dd>An implementation of HTML5 document and fragment
39      parsing algorithms.  It can be used
40      to convert an arbitrary string into a
41      <abbr title="Document Object Model">DOM</abbr>.  (See also
42      <a href="#demo-html-parser">demo</a>.)</dd>
43    <dt id=module-whatpm-html-serializer><a href="Whatpm/HTML/Serializer.html"><code>Whatpm::HTML::Serializer</code></a></dt>
44      <dd>An implementation of HTML5 fragment serialization algorithm.
45      (See also <a href="#demo-html-parser">demo</a>.)</dd>
46    <dt><a href="Whatpm/HTMLTable.html"><code>Whatpm::HTMLTable</code></a></dt>
47      <dd>An implementation of the HTML5 table algorithm.  It can be
48      used to extract a table structure from a DOM <code>table</code>
49      element node.  (See also <a href="#demo-html-table">demo</a>.)</dd>
50    </dl>
51    
52    <p id=modules-xml>The module for <i>tentative</i> XML support is as follows:
53    <dl>
54    <dt><a href="Whatpm/XMLSerializer.html"><code>Whatpm::XMLSerializer</code></a></dt>
55      <dd>A simple XML serializer.</dd>
56    </dl>
57    
58    <p>A <i>real</i> XML parser and serializer are not available yet.
59    
60    <p id=modules-cc>The module for conformance checking of a DOM tree (i.e.
61    an in-memory representation of an HTML or XML document) is as follows:
62  <dl>  <dl>
63  <dt><a href="Whatpm/ContentChecker.html"><code>Whatpm::ContentChecker</code></a></dt>  <dt><a href="Whatpm/ContentChecker.html"><code>Whatpm::ContentChecker</code></a></dt>
64    <dd>A DOM5 HTML (in-memory representation of a document) conformance    <dd>A DOM5 HTML (in-memory representation of a document) conformance
65    checker.</dd>    checker with partial support for Atom 1.0.  (See also
66      <a href="#demo-html-parser">demo</a>.)</dd>
67    </dl>
68    
69    <p>Currently, conformance checking of HTML/XHTML and Atom documents
70    is supported.
71    </div>
72    
73    <div class=section id=modules-css>
74    <h3>Modules for CSS</h3>
75    
76    <p>Modules for CSS and related technologies are as follows:
77    <dl>
78    <dt><a href="Whatpm/CSS/Cascade.html"><code>Whatpm::CSS::Cascade</code></a>
79      <dd>A media-independent implementation of CSS cascading and value
80      computation.  (See also <a href="#demo-css-parser">demo</a>.)
81    <dt><a href="Whatpm/CSS/Parser.html"><code>Whatpm::CSS::Parser</code></a>
82      <dd>A CSS parser that constructs CSSOM trees from style sheets.  (See
83      also <a href="#demo-css-parser">demo</a>.)
84    <dt><a href="Whatpm/CSS/SelectorsParser.html"><code>Whatpm::CSS::SelectorsParser</code></a></dt>
85      <dd>A <a href="http://www.w3.org/TR/css3-selectors/#grouping">group of
86      selectors</a> parser.  (See also <a href="#demo-css-parser">demo</a>.)</dd>
87    <dt><a href="Whatpm/CSS/SelectorsSerializer.html"><code>Whatpm::CSS::SelectorsSerializer</code></a></dt>
88      <dd>A <a href="http://www.w3.org/TR/css3-selectors/#grouping">group of
89      selectors</a> serializer.  (See also <a href="#spec-ssft">specification</a>
90      and <a href="#demo-css-parser">demo</a>.)</dd>
91    <dt><a href="Whatpm/CSS/Tokenizer.html"><code>Whatpm::CSS::Tokenizer</code></a></dt>
92      <dd>A CSS tokenizer.  (See also <a href="#demo-css-parser">demo</a>.)</dd>
93    </dl>
94    </div>
95    
96    <div class=section id=modules-http>
97    <h3>Modules for HTTP</h3>
98    
99    <p>Modules for HTTP and related technologies are as follows:
100    <dl>
101  <dt><a href="Whatpm/ContentType.html"><code>Whatpm::ContentType</code></a></dt>  <dt><a href="Whatpm/ContentType.html"><code>Whatpm::ContentType</code></a></dt>
102    <dd>An implementation of HTML5 Content Type sniffing algorithm.</dd>    <dd>An implementation of HTML5 Content Type sniffing algorithm.</dd>
 <dt><a href="Whatpm/HTML.html"><code>Whatpm::HTML</code></a></dt>  
   <dd>An implementation of HTML5 parsing algorithm and  
   <code>innerHTML</code> serialization.</dd>  
 <dt><a href="Whatpm/HTMLTable.html"><code>Whatpm::HTMLTable</code></a></dt>  
   <dd>An implementation of the HTML5 table algorithm.</dd>  
103  <dt><a href="Whatpm/IMTChecker.html"><code>Whatpm::IMTChecker</code></a></dt>  <dt><a href="Whatpm/IMTChecker.html"><code>Whatpm::IMTChecker</code></a></dt>
104    <dd>An Internet Media Type (<abbr>aka</abbr> MIME type) label    <dd>An Internet Media Type (<abbr>aka</abbr> MIME type) label
105    conformance checker.</dd>    conformance checker.</dd>
106    </dl>
107    
108    <p>Currently, support for parsing HTTP headers and the like is not
109    yet available.
110    </div>
111    
112    <div class=section id=modules-url>
113    <h3>Module for URL</h3>
114    
115    <p>The module for URL support is as follows:
116    <dl>
117  <dt><a href="Whatpm/URIChecker.html"><code>Whatpm::URIChecker</code></a></dt>  <dt><a href="Whatpm/URIChecker.html"><code>Whatpm::URIChecker</code></a></dt>
118    <dd>An IRI reference conformance checker.</dd>    <dd>An IRI reference conformance checker.</dd>
119  <dt><a href="Whatpm/XMLSerializer.html"><code>Whatpm::XMLSerializer</code></a></dt>  </dl>
120    <dd>A simple XML serializer.</dd>  
121    <p>Support for HTML5's realistic definition of URL is not available yet.
122    </div>
123    
124    <div class=section id=modules-misc>
125    <h3>Modules for other technologies</h3>
126    
127    <p>The following modules provide support for other Web-related technologies:
128    <dl>
129    <dt><a href="Whatpm/CacheManifest.html"><code>Whatpm::CacheManifest</code></a></dt>
130      <dd>An
131      <a href="http://www.whatwg.org/specs/web-apps/current-work/#manifests">HTML5
132      cache manifest</a> parser.</dd>
133    <dt id=whatpm-charset-universalchardet><a href="Whatpm/Charset/UniversalCharDet.html"><code>Whatpm::Charset::UniversalCharDet</code></a></dt>
134      <dd>A Perl interface to the universalchardet character encoding detection
135      library.</dd>
136      <dt><a href="Whatpm/LangTag.html"><code>Whatpm::LangTag</code></a>
137        <dd>A language tag parser and conformance checker, supporting both
138        the older RFC 3066 definition and the newer RFC 4646 definition.  (See also
139        <a href="#demo-langtag">demo</a>.)
140    <dt><a href="Whatpm/WebIDL.html"><code>Whatpm::WebIDL</code></a></dt>
141      <dd>A WebIDL fragment parser.  It parses an IDL fragment, whether conforming
142      or not, and constructs a DOM-like object model for further processing.
143      A non-conforming (or broken) IDL fragment-like string is parsed using
144      CSS-like error-tolerant parsing rules, e.g. ignoring anything until the next
145      <code>;</code> character.
146    </dl>
147    </div>
148    
149      <p>Note that all of these modules are <em>work in progress</em>
150      and have <a href="#todo">a number of unresolved problems</a>.</p>
151    
152      <p>Note also that some modules have no documentation for now.</p>
153    
154    <!-- Whatpm::ContentChecker::*, Whatpm::H2H, Whatpm::NanoDOM, and
155         Whatpm::XMLParser are intentionally omitted from the list. -->
156    </div>
157    
158    <div class=section id=documents>
159    <h2>Documents</h2>
160    
161    <p>For a description of the functionality provided by each module, see
162    the <abbr>pod</abbr> documentation of the module.  HTML versions of the
163    <abbr>pod</abbr> documentation are linked from the <a
164    href="#modules">list of modules above</a>.
165    
166    <p>In addition, there are separate documents on some topics:
167    <dl>
168  <dt><a href="http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types">List of error types</a></dt>  <dt><a href="http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types">List of error types</a></dt>
169    <!-- @@ TODO: Need to update the link - the document above is out of date -->
170      <dd>Description of errors to be notified to callback functions by Whatpm
171      modules.</dd>
172    
173    <dt><a href="Whatpm/CSS/selectors-object">Selectors object</a></dt>
174      <dd>Description of data structure for Selectors, as implemented by
175      <a href="Whatpm/CSS/SelectorsParser.html"><code>Whatpm::CSS::SelectorsParser</code></a>
176      (as output), and
177      <a href="Whatpm/CSS/SelectorsSerializer.html"><code>Whatpm::CSS::SelectorsSerializer</code></a>
178      (as input)<!--, and
179      <a href="http://suika.fam.cx/www/manakai-core/lib/Message/DOM/SelectorsAPI.html"><code>Message::DOM::SelectorsAPI</code></a>-->.</dd>
180    
181    <dt id=doc-user-data-names><a href="http://suika.fam.cx/gate/2005/sw/manakai/Predefined%20User%20Data%20Names">List of predefined user data names</a></dt>
182      <dd>List of user data names defined by Whatpm modules.</dd>
183    
184    <dt id=doc-handles><a href="Whatpm/Charset/handles">Handle objects</a>
185      <dd>Description of character or byte stream input handle interfaces.
186  </dl>  </dl>
187    
188  <p>Note that all of these modules are <em>work in progress</em>  <p>The following specifications define Whatpm-specific formats and extensions:
189  and have <a href="#todo">a number of unresolved problems</a>.</p>  <dl id=spec>
190    <dt id=spec-ssft><a href="http://suika.fam.cx/www/markup/selectors/ssft/ssft"><abbr title="Selectors Serialization Format for Testing">SSFT</abbr>
191    Specification</a></dt>
192      <dd>The specification for the serialization format used for
193      testing Selectors-related modules.</dd>
194    
195    <dt id=spec-manakai-selectors><a href="http://suika.fam.cx/gate/2005/sw/manakai/Selectors%20Extensions">manakai's
196    Selectors Extensions</a></dt>
197      <dd>The specification for <code>:-manakai-<var>*</var></code>
198      pseudo-classes implemented by Selectors-related modules.</dd>
199    </dl>
200  </div>  </div>
201    
202  <div class="section" id="demo">  <div class="section" id="demo">
203  <h2>Demo</h2>  <h2>Demo</h2>
204    
205  <ul>  <ul>
206  <li><a href="http://suika.fam.cx/gate/2007/html/parser-interface">HTML5 parser  <li id=demo-html-parser-nanodom><a href="http://suika.fam.cx/gate/2007/html/parser-interface">HTML5 parser
207  and checker demo</a></li>  and checker demo</a>
208  <li><a href="http://suika.fam.cx/gate/2007/html/table-interface">HTML5 table  (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/html/parser.cgi">source</a>,
209  structure visualization demo</a></li>  with <a href="Whatpm/NanoDOM.html">a lightweight non-conforming
210    DOM implementation</a>)</li>
211    <li id=demo-html-parser-manakai><a href="http://suika.fam.cx/gate/2007/html/parser-manakai-interface">HTML5
212    parser and checker demo, with manakai's DOM implementation</a>
213    (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/html/parser-manakai.cgi">source</a>)</li>
214    <li id=demo-html-table><a href="http://suika.fam.cx/gate/2007/html/table-interface">HTML5 table
215    structure visualization demo</a>
216    (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/html/table.cgi">source</a>)</li>
217    
218    <li id=demo-css-parser><a href="http://suika.fam.cx/gate/2007/css/parser-interface">CSS
219    tokenizer, parser, and computed style computation demo</a>
220    (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/css/parser.cgi">source</a>)</li>
221    
222    <li id=demo-langtag><a href="http://suika.fam.cx/gate/2007/langtag/langtag-demo-interface">Language
223    tag parsing and conformance checking demo</a>
224    (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/langtag/langtag-demo.cgi">source</a>)
225    </ul>
226    </div>
227    
228    <div class=section id=applications>
229    <h2>Application</h2>
230    
231    <ul>
232    <li><a href="http://suika.fam.cx/gate/2007/html/cc/"><abbr>WebHACC</abbr>
233    (Web hypertext application conformance checker)</a>
234  </ul>  </ul>
235  </div>  </div>
236    
237    <div class="section" id="dependency">
238    <h2>Dependency</h2>
239    
240    <dl>
241    <dt id=dependency-perl>Perl 5.8 or later</dt>
242      <dd>It is recommended to use a recent stable release of Perl 5.8 (or
243      later).</dd>
244      <dd id=dependency-encode>Some modules require the <code>Encode</code>
245      modules, which are part of the standard Perl distribution.</dd>
246    <dt id=dependency-manakai-core>Modules from
247    <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a></dt>
248      <dd>
249        <dl>
250    <dt id=dependency-error><a href="http://search.cpan.org/author/SHLOMIF/Error-0.17009/lib/Error.pm"><code>Error</code></a></dt>
251      <dd>Module <code>Whatpm::HTML</code> requires <code>Error</code>,
252      which is bundled in
253      <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
254    <dt><code>Message::IMT::InternetMediaType</code></dt>
255      <dd>Module <code>Whatpm::IMTChecker</code> depends on
256      <code>Message::IMT::InternetMediaType</code>, which is part of
257      <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
258    <dt><code>Message::URI::URIReference</code></dt>
259      <dd>Modules <code>Whatpm::URIChecker</code> and
260      <code>Whatpm::CacheManifest</code> depend on
261      <a href="http://suika.fam.cx/www/manakai-core/lib/Message/URI/URIReference.html"><code>Message::URI::URIReference</code></a>,
262      which is part of
263      <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
264      <dt><code>Message::Charset::Info</code></dt>
265        <dd>Module <code>Whatpm::ContentChecker</code> depends on
266        <a href="http://suika.fam.cx/www/manakai-core/lib/Message/Charset/Info.html"><code>Message::Charset::Info</code></a>,
267        which is part of
268        <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
269    <dt><code>Message::DOM::DOMImplementation</code>
270      <dd>Module <code>Whatpm::URIChecker</code> depends on
271      <code>Message::DOM::DOMImplementation</code>,
272        which is part of
273        <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.
274    <dt><code>Message::DOM::DOMImplementation</code> and related modules</dt>
275      <dd><em>Testing</em> for module <code>Whatpm::ContentChecker</code>
276      depends on <code>Message::DOM::DOMImplementation</code> and related modules
277      in <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.
278      They are not required in practice.</dd>
279        </dl>
280      </dd>
281    <dt><a href="http://suika.fam.cx/www/manakai-charlib/readme">manakai
282    charlib</a></dt>
283      <dd>Module <code>Whatpm::Charset::DecodeHandle</code> depends on
284      modules in <a href="http://suika.fam.cx/www/manakai-charlib/readme">manakai
285      charlib</a> for decoding of <em>Japanese character encodings</em>.
286      See the documentation for
287      <a href="http://suika.fam.cx/www/manakai-charlib/readme">manakai
288      charlib</a> for more information.</dd>
289    <dt><a href="http://www.python.org/">Python</a>, Perl
290    <a href="http://search.cpan.org/~neilw/Inline-Python-0.22/"><code>Inline::Python</code></a>
291    module, and <a href="http://chardet.feedparser.org/">Universal Encoding
292    Detector</a></dt>
293      <dd>For the module <code>Whatpm::Charset::UniversalCharDet</code> to be
294      useful, this software must be installed on the system.  See the
295      <a href="Whatpm/Charset/UniversalCharDet.html#dependency">documentation</a>
296      for more information.</dd>
297    <dt><a href="http://search.cpan.org/~makamaka/JSON-1.14/"><code>JSON</code></a></dt>
298      <dd><em>Testing</em> for modules <code>Whatpm::HTML</code> and
299      <code>Whatpm::CSS::Tokenizer</code>
300      depends on <a href="http://search.cpan.org/~makamaka/JSON-1.14/"><code>JSON</code> and related modules</a>.
301      They are not required in practice.</dd>
302    </dl>
303    </div>
304    
305  <div class="section" id="download">  <div class="section" id="download">
306  <h2>Distribution</h2>  <h2>Distribution</h2>
307    
# Line 59  structure visualization demo</a></li> Line 309  structure visualization demo</a></li>
309  <a href="http://suika.fam.cx/gate/cvs/markup/html/whatpm/">CVS  <a href="http://suika.fam.cx/gate/cvs/markup/html/whatpm/">CVS
310  repository</a>.</p>  repository</a>.</p>
311    
312    <p><a href="http://suika.fam.cx/gate/cvs/markup/html/whatpm/whatpm.tar.gz?tarball=1">The
313    latest development version of Whatpm</a> is also available as a
314    tarball.
315    
316  </div>  </div>
317    
318  <div class="section" id="todo">  <div class="section" id="todo">
# Line 70  repository</a>.</p> Line 324  repository</a>.</p>
324      <a href="t/tokenizer-result">HTML tokenization</a>,      <a href="t/tokenizer-result">HTML tokenization</a>,
325      <a href="t/tree-construction-result">HTML tree construction</a>,      <a href="t/tree-construction-result">HTML tree construction</a>,
326      <a href="t/content-checker-result"><code>Whatpm::ContentChecker</code></a>).</li>      <a href="t/content-checker-result"><code>Whatpm::ContentChecker</code></a>).</li>
327      <li>Merge with the <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>
328          code tree.
329    <li>Charset detection.</li>    <li>Charset detection.</li>
330    <li>Validation for <code>meta</code>.</li>    <li>Validation for <code>meta</code>.</li>
331    <li>Validation for media queries, IRIs (against URI schemes), language tags,    <li>Validation for media queries, IRIs (against URI schemes), language tags,
332      and so on.</li>      and so on.</li>
333    <li>Documentations are missing for some features.</li>    <li>Documentations are missing for some features.</li>
334    <li><q>Whatpm</q> is a code name in fact.  Please let me know    <li>XML parser<!-- with application cache selection algorithm hook-->.</li>
     if you have a better name.</li>  
335    <li>In addition, each module has its own TO DO items.    <li>In addition, each module has its own TO DO items.
336      (Search for <q>## TODO</q> and <q>## ISSUE</q> in each module.)</li>      (Search for <q>## TODO</q> and <q>## ISSUE</q> in each module.)</li>
337  </ul>  </ul>
338  </div>  </div>
339    
340    <div class=section id=acknowledgments>
341    <h2>Acknowledgments</h2>
342    
343    <p>Thanks to the <a href="http://code.google.com/p/html5lib/">html5lib</a>
344    team for their
345    <a href="http://html5lib.googlecode.com/svn/trunk/testdata/">HTML5
346    parser test data</a>.</p>
347    </div>
348    
349  <div class="section" id="author">  <div class="section" id="author">
350  <h2>Author</h2>  <h2>Author</h2>
351    
# Line 91  repository</a>.</p> Line 355  repository</a>.</p>
355  <div class="section" id="license">  <div class="section" id="license">
356  <h2>License</h2>  <h2>License</h2>
357    
358  <p>Copyright 2007 Wakaba  <p>Copyright 2007-2008 Wakaba
359  <code class="mail">&lt;<a href="mailto:w@suika.fam.cx"  <code class="mail">&lt;<a href="mailto:w@suika.fam.cx"
360      rel="author">w@suika.fam.cx</a>></code>.</p>      rel="author">w@suika.fam.cx</a>></code>.</p>
361    
