Contents of /markup/html/whatpm/readme.en.html

Revision 1.30
Mon Oct 13 06:18:30 2008 UTC by wakaba
Branch: MAIN
Changes since 1.29: +5 -2 lines
File MIME type: text/html
++ whatpm/t/ChangeLog	13 Oct 2008 06:18:26 -0000
2008-10-13  Wakaba  <wakaba@suika.fam.cx>

	* tokenizer-test-2.dat: A test result was wrong.

++ whatpm/Whatpm/ChangeLog	13 Oct 2008 06:17:59 -0000
2008-10-13  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src: Steps for CDATA/RCDATA elements in tree
	construction stage synced with the spec (HTML5 revisions 2139 and
	2302).

1 <!DOCTYPE html>
2 <html lang="en">
3 <head>
4 <title>Whatpm &mdash; Perl Modules for Web Hypertext Application
5 Technologies (beta)</title>
6 <link rel="stylesheet" href="http://suika.fam.cx/www/style/html/xhtml">
7 <link rel="license" href="#license">
8 <link rel="author" href="#author">
9 </head>
10 <body>
11 <h1>Whatpm &mdash; Perl modules for Web hypertext application technologies
12 (<em>beta</em>)</h1>
13
14 <div class="section" id="introduction">
15 <h2>Introduction</h2>
16
17 <p><dfn>Whatpm</dfn> is a <em>work-in-progress</em> set of
18 <mark>P</mark>erl <mark>m</mark>odules for <mark>W</mark>eb
19 <mark>h</mark>ypertext <mark>a</mark>pplication
20 <mark>t</mark>echnologies. It is part of the <a
21 href="http://suika.fam.cx/www/2006/manakai/" rel=up>manakai</a>
22 project.</p>
23
24 <p>Whatpm supports various Web standard technologies, including <a
25 href="#modules-html">HTML, XHTML</a>, <a href="#modules-xml">XML</a>,
26 <a href="#modules-css">CSS</a>, <a href="#modules-http">HTTP</a>, and
27 <a href="#modules-url">URL</a>.
28 </div>
29
30 <div class=section id=modules>
31 <h2>Modules</h2>
32
33 <p>Note that all of these modules are <em>work in progress</em>
34 and have <a href="#todo">a number of unresolved problems</a>.</p>
35
36 <p>Note also that some modules have no documentation for now.</p>
37
38 <div class=section id=modules-html-xml>
39 <h3>Modules for HTML and XML</h3>
40
41 <p id=modules-html>Modules related to HTML and XHTML are as follows:
42 <dl>
43 <dt id=module-whatpm-html><a href="Whatpm/HTML.html"><code>Whatpm::HTML</code></a></dt>
44 <dd>An implementation of the HTML5 document and fragment
45 parsing algorithms. It can be used
46 to convert an arbitrary string into a
47 <abbr title="Document Object Model">DOM</abbr>. (See also
48 <a href="#demo-html-parser">demo</a>.)</dd>
49 <dt id=module-whatpm-html-serializer><a href="Whatpm/HTML/Serializer.html"><code>Whatpm::HTML::Serializer</code></a></dt>
50 <dd>An implementation of the HTML5 fragment serialization algorithm.
51 (See also <a href="#demo-html-parser">demo</a>.)</dd>
52 <dt><a href="Whatpm/HTMLTable.html"><code>Whatpm::HTMLTable</code></a></dt>
53 <dd>An implementation of the HTML5 table algorithm. It can be
54 used to extract a table structure from a DOM <code>table</code>
55 element node. (See also <a href="#demo-html-table">demo</a>.)</dd>
56 </dl>
57
58 <p id=modules-xml>The module for <i>tentative</i> XML support is as follows:
59 <dl>
60 <dt><a href="Whatpm/XMLSerializer.html"><code>Whatpm::XMLSerializer</code></a></dt>
61 <dd>A simple XML serializer.</dd>
62 </dl>
63
64 <p>A <i>real</i> XML parser and serializer are not available yet.
65
66 <p id=modules-cc>The module for conformance checking of a DOM tree (i.e.
67 an in-memory representation of an HTML or XML document) is as follows:
68 <dl>
69
70 <dt><a href="Whatpm/ContentChecker.html"><code>Whatpm::ContentChecker</code></a></dt>
71
72 <dd>A DOM5 HTML (in-memory representation of a document) conformance
73 checker with partial support for Atom 1.0. (See also <a
74 href="#demo-html-parser">demo</a> and <a
75 href="#app-webhacc">application</a>.)
76
77 </dl>
78
79 <p>Currently, conformance checking of HTML/XHTML and Atom documents
80 is supported.
81
82 <p>These modules require a DOM implementation that supports manakai's
83 Perl binding<!-- @@ TODO: ref --> of DOM to represent a
84 document in memory. The <a
85 href="http://suika.fam.cx/www/manakai-core/doc/web/">manakai-core</a>
86 package contains such an implementation,
87 <code>Message::DOM::DOMImplementation</code><!-- @@ TODO: ref -->, but it
88 should also be possible to use any other implementation that supports
89 the binding.
90
91 </div>
92
93 <div class=section id=modules-css>
94 <h3>Modules for CSS</h3>
95
96 <p>Modules for CSS and related technologies are as follows:
97 <dl>
98 <dt><a href="Whatpm/CSS/Cascade.html"><code>Whatpm::CSS::Cascade</code></a>
99 <dd>A media-independent implementation of CSS cascading and value
100 computations. (See also <a href="#demo-css-parser">demo</a>.)
101
102 <dt><a href="Whatpm/CSS/MediaQueryParser.pm"><code>Whatpm::CSS::MediaQueryParser</code></a>
103
104 <dd>A media query parser. Note that only CSS 2.1 media types are
105 supported at the moment.
106
107 <dt><a href="Whatpm/CSS/MediaQuerySerializer.pm"><code>Whatpm::CSS::MediaQuerySerializer</code></a>
108
109 <dd>A media query serializer. Note that only CSS 2.1 media types
110 are supported at the moment.
111
112 <dt><a href="Whatpm/CSS/Parser.html"><code>Whatpm::CSS::Parser</code></a>
113 <dd>A CSS parser that constructs CSSOM trees from style sheets. (See
114 also <a href="#demo-css-parser">demo</a>.)
115 <dt><a href="Whatpm/CSS/SelectorsParser.html"><code>Whatpm::CSS::SelectorsParser</code></a></dt>
116 <dd>A <a href="http://www.w3.org/TR/css3-selectors/#grouping">group of
117 selectors</a> parser. (See also <a href="#demo-css-parser">demo</a>.)</dd>
118 <dt><a href="Whatpm/CSS/SelectorsSerializer.html"><code>Whatpm::CSS::SelectorsSerializer</code></a></dt>
119 <dd>A <a href="http://www.w3.org/TR/css3-selectors/#grouping">group of
120 selectors</a> serializer. (See also <a href="#spec-ssft">specification</a>
121 and <a href="#demo-css-parser">demo</a>.)</dd>
122 <dt><a href="Whatpm/CSS/Tokenizer.html"><code>Whatpm::CSS::Tokenizer</code></a></dt>
123 <dd>A CSS tokenizer. (See also <a href="#demo-css-parser">demo</a>.)</dd>
124 </dl>
125
126 <p>To represent a CSSOM tree, the <code>Whatpm::CSS::Parser</code> module
127 uses modules in the <a
128 href="http://suika.fam.cx/www/manakai-core/doc/web/">manakai-core</a>
129 package. Those modules also provide the serializer for the
130 CSSOM tree, in the form of the standard <code>css_text</code> CSSOM
131 attribute.
132
133 </div>
134
135 <div class=section id=modules-http>
136 <h3>Modules for HTTP</h3>
137
138 <p>Modules for HTTP and related technologies are as follows:
139 <dl>
140 <dt><a href="Whatpm/ContentType.html"><code>Whatpm::ContentType</code></a></dt>
141 <dd>An implementation of the HTML5 content type sniffing algorithm.</dd>
142 <dt><a href="Whatpm/IMTChecker.html"><code>Whatpm::IMTChecker</code></a></dt>
143 <dd>An Internet Media Type (<abbr>aka</abbr> MIME type) label
144 conformance checker.</dd>
145 </dl>
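<p>As a rough illustration of what content type sniffing involves, the
sketch below guesses a media type from the first bytes of a resource. It is
written in Python, it is not the <code>Whatpm::ContentType</code> interface,
and it covers only a handful of well-known signatures; the function name
<code>sniff_type</code> and the abbreviated signature table are assumptions
made for this example.</p>

```python
# Illustrative signature table (an assumption for this sketch); the HTML5
# algorithm defines a longer table and additional rules.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff_type(head: bytes, default: str = "application/octet-stream") -> str:
    """Guess a media type from the leading bytes of a resource.

    Hypothetical helper: checks fixed byte signatures first, then looks for
    an HTML start after optional leading whitespace.
    """
    for magic, media_type in SIGNATURES:
        if head.startswith(magic):
            return media_type
    # Markup may be preceded by whitespace, so strip it before comparing.
    stripped = head.lstrip(b"\t\n\x0c\r ")
    if stripped[:5].lower() == b"<html" or stripped[:14].lower() == b"<!doctype html":
        return "text/html"
    return default
```

<p>A caller would pass only the first few hundred bytes of the entity body,
since all of the signatures checked here sit at the very start of the data.</p>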
146
147 <p>Support for parsing HTTP headers and the like is not
148 yet available.
149 </div>
150
151 <div class=section id=modules-url>
152 <h3>Module for URL</h3>
153
154 <p>The module for URL support is as follows:
155 <dl>
156 <dt><a href="Whatpm/URIChecker.html"><code>Whatpm::URIChecker</code></a></dt>
157 <dd>An IRI reference conformance checker.</dd>
158 </dl>
159
160 <p>Support for HTML5's realistic definition of URL is not available yet.
161 </div>
162
163 <div class=section id=modules-misc>
164 <h3>Modules for other technologies</h3>
165
166 <p>The following modules provide support for other Web-related technologies:
167 <dl>
168 <dt><a href="Whatpm/CacheManifest.html"><code>Whatpm::CacheManifest</code></a></dt>
169 <dd>An
170 <a href="http://www.whatwg.org/specs/web-apps/current-work/#manifests">HTML5
171 cache manifest</a> parser.</dd>
172 <dt><a href="Whatpm/Charset/DecodeHandle.html"><code>Whatpm::Charset::DecodeHandle</code></a>
173 <dd>A filehandle-like wrapper <a href="#doc-handles">interface</a> for
174 decoding a byte stream encoded in some character encoding.
175 <dt><a href="Whatpm/Charset/UnicodeChecker.html"><code>Whatpm::Charset::UnicodeChecker</code></a>
176 <dd>A Unicode character string checker.
177 <dt id=whatpm-charset-universalchardet><a href="Whatpm/Charset/UniversalCharDet.html"><code>Whatpm::Charset::UniversalCharDet</code></a></dt>
178 <dd>A Perl interface to the universalchardet character encoding detection
179 library.</dd>
180 <dt><a href="Whatpm/LangTag.html"><code>Whatpm::LangTag</code></a>
181 <dd>A language tag parser and conformance checker, supporting both the
182 older RFC 3066 definition and the newer RFC 4646 definition. (See also
183 <a href="#demo-langtag">demo</a>.)
184
185 <dt><a href="Whatpm/RDFXML.html"><code>Whatpm::RDFXML</code></a>
186 <dd>An implementation of RDF/XML by which RDF triples can be extracted
187 from RDF/XML documents.
188
189 <dt><a href="Whatpm/WebIDL.html"><code>Whatpm::WebIDL</code></a></dt>
190 <dd>A WebIDL fragment parser. It parses an IDL fragment, whether conforming
191 or not, and constructs a DOM-like object model for further processing.
192 A non-conforming (or broken) IDL fragment-like string is parsed using
193 CSS-like error-tolerant parsing rules, e.g. ignoring anything until the next
194 <code>;</code> character.
195 </dl>
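<p>The error-tolerant rule mentioned for <code>Whatpm::WebIDL</code>
(ignore everything up to the next <code>;</code> and continue) can be
sketched as follows. This is a Python illustration over a toy grammar, not
WebIDL and not the module's actual code; the function name
<code>parse_declarations</code> and the two-identifier declaration form are
assumptions made for the example.</p>

```python
import re

# Toy grammar (an assumption for illustration): each declaration consists of
# two identifiers, "<type> <name>", terminated by ";".
DECL = re.compile(r"\s*([A-Za-z_]\w*)\s+([A-Za-z_]\w*)\s*$")


def parse_declarations(source: str):
    """Return (declarations, errors), resynchronizing at each ';'."""
    declarations, errors = [], []
    pos = 0
    while pos < len(source):
        end = source.find(";", pos)
        if end == -1:
            # No terminator left: report trailing non-blank text and stop.
            if source[pos:].strip():
                errors.append(source[pos:].strip())
            break
        chunk = source[pos:end]
        match = DECL.match(chunk)
        if match:
            declarations.append((match.group(1), match.group(2)))
        elif chunk.strip():
            # Broken input: record it, ignore it up to the ';', and carry on.
            errors.append(chunk.strip())
        pos = end + 1
    return declarations, errors
```

<p>With this approach a single broken declaration costs only the text up to
its terminator; the declarations before and after it are still returned.</p>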
196 </div>
197
198 <!-- Whatpm::ContentChecker::*, Whatpm::H2H, Whatpm::NanoDOM, and
199 Whatpm::XMLParser are intentionally omitted from the list. -->
200 </div>
201
202 <div class=section id=documents>
203 <h2>Documents</h2>
204
205 <p>For a description of the functionality provided by each module, see
206 the <abbr>pod</abbr> documentation of the module. HTML versions of the
207 <abbr>pod</abbr> documentation are linked from the <a
208 href="#modules">list of modules above</a>.
209
210 <p>In addition, there are separate documents on some topics:
211 <dl>
212
213 <dt><a href="http://suika.fam.cx/gate/2007/html/standards">Standards
214 supported by WebHACC</a>
215
216 <dd>List and description of Web standards supported by the WebHACC
217 conformance checker. Although it documents WebHACC,
218 it also applies to Whatpm in general (note that WebHACC is an
219 interactive user interface for the conformance checking feature
220 provided by Whatpm).
221
222 <dt><a href="http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types">List of error types</a></dt>
223 <!-- @@ TODO: Need to update the link - the document above is out of date -->
224 <dd>Description of the errors reported to callback functions by Whatpm
225 modules.</dd>
226
227 <dt><a href="Whatpm/CSS/selectors-object">Selectors object</a></dt>
228 <dd>Description of data structure for Selectors, as implemented by
229 <a href="Whatpm/CSS/SelectorsParser.html"><code>Whatpm::CSS::SelectorsParser</code></a>
230 (as output), and
231 <a href="Whatpm/CSS/SelectorsSerializer.html"><code>Whatpm::CSS::SelectorsSerializer</code></a>
232 (as input)<!--, and
233 <a href="http://suika.fam.cx/www/manakai-core/lib/Message/DOM/SelectorsAPI.html"><code>Message::DOM::SelectorsAPI</code></a>-->.</dd>
234
235 <dt id=doc-user-data-names><a href="http://suika.fam.cx/gate/2005/sw/manakai/Predefined%20User%20Data%20Names">List of predefined user data names</a></dt>
236 <dd>List of user data names defined by Whatpm modules.</dd>
237
238 <dt id=doc-handles><a href="Whatpm/Charset/handles">Handle objects</a>
239 <dd>Description of character or byte stream input handle interfaces.
240 </dl>
241
242 <p>The following specifications define Whatpm-specific formats and extensions:
243 <dl id=spec>
244 <dt id=spec-ssft><a href="http://suika.fam.cx/www/markup/selectors/ssft/ssft"><abbr title="Selectors Serialization Format for Testing">SSFT</abbr>
245 Specification</a></dt>
246 <dd>The specification for the serialization format used for
247 testing Selectors-related modules.</dd>
248
249 <dt><a href="http://suika.fam.cx/gate/2005/sw/manakai/CSS%20Extensions">manakai's
250 CSS extensions</a>
251 <dd>The specification for <code>-manakai-<var>*</var></code> properties
252 and property values implemented by CSS-related modules.
253 <dt id=spec-manakai-selectors><a href="http://suika.fam.cx/gate/2005/sw/manakai/Selectors%20Extensions">manakai's
254 Selectors extensions</a>
255 <dd>The specification for <code>:-manakai-<var>*</var></code>
256 pseudo-classes implemented by Selectors-related modules.</dd>
257 </dl>
258 </div>
259
260 <div class="section" id="demo">
261 <h2>Demo</h2>
262
263 <ul id=demo-html-parser>
264 <li id=demo-html-parser-nanodom><a href="http://suika.fam.cx/gate/2007/html/parser-interface">HTML5 parser
265 and checker demo</a>
266 (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/html/parser.cgi">source</a>,
267 with <a href="Whatpm/NanoDOM.html">a lightweight non-conforming
268 DOM implementation</a>)</li>
269 <li id=demo-html-parser-manakai><a href="http://suika.fam.cx/gate/2007/html/parser-manakai-interface">HTML5
270 parser and checker demo, with manakai's DOM implementation</a>
271 (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/html/parser-manakai.cgi">source</a>)</li>
272 <li id=demo-html-table><a href="http://suika.fam.cx/gate/2007/html/table-interface">HTML5 table
273 structure visualization demo</a>
274 (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/html/table.cgi">source</a>)</li>
275
276 <li id=demo-css-parser><a href="http://suika.fam.cx/gate/2007/css/parser-interface">CSS
277 tokenizer, parser, and computed style computation demo</a>
278 (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/css/parser.cgi">source</a>)</li>
279
280 <li id=demo-langtag><a href="http://suika.fam.cx/gate/2007/langtag/langtag-demo-interface">Language
281 tag parsing and conformance checking demo</a>
282 (<a href="http://suika.fam.cx/gate/cvs/*checkout*/webroot/gate/2007/langtag/langtag-demo.cgi">source</a>)
283 </ul>
284 </div>
285
286 <div class=section id=applications>
287 <h2>Applications</h2>
288
289 <ul>
290
291 <li id=app-webhacc><a
292 href="http://suika.fam.cx/gate/2007/html/cc/"><abbr>WebHACC</abbr>
293 (Web hypertext application conformance checker)</a> (See also <a
294 href="http://suika.fam.cx/gate/2007/html/cc-about"><cite>about
295 WebHACC</cite></a>).
296
297 <li><a href="http://suika.fam.cx/www/webidl2tests/readme">wttjs</a>, a
298 WebIDL ECMAScript binding test suite generator.
299
300 </ul>
301 </div>
302
303 <div class="section" id="dependency">
304 <h2>Dependency</h2>
305
306 <dl>
307 <dt id=dependency-perl>Perl 5.8 or later</dt>
308 <dd>A recent stable release of Perl 5.8 (or
309 later) is recommended.</dd>
310 <dd id=dependency-encode>Some modules require the <code>Encode</code>
311 modules, which are part of the standard Perl distribution.</dd>
312 <dt id=dependency-manakai-core>Modules from
313 <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a></dt>
314 <dd>
315 <dl>
316 <dt id=dependency-error><a href="http://search.cpan.org/author/SHLOMIF/Error-0.17009/lib/Error.pm"><code>Error</code></a></dt>
317 <dd>Module <code>Whatpm::HTML</code> requires <code>Error</code>,
318 which is bundled in
319 <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
320 <dt><code>Message::IMT::InternetMediaType</code></dt>
321 <dd>Module <code>Whatpm::IMTChecker</code> depends on
322 <code>Message::IMT::InternetMediaType</code>, which is part of
323 <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
324 <dt><code>Message::URI::URIReference</code></dt>
325 <dd>Modules <code>Whatpm::URIChecker</code> and
326 <code>Whatpm::CacheManifest</code> depend on
327 <a href="http://suika.fam.cx/www/manakai-core/lib/Message/URI/URIReference.html"><code>Message::URI::URIReference</code></a>,
328 which is part of
329 <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
330 <dt><code>Message::Charset::Info</code></dt>
331 <dd>Module <code>Whatpm::ContentChecker</code> depends on
332 <a href="http://suika.fam.cx/www/manakai-core/lib/Message/Charset/Info.html"><code>Message::Charset::Info</code></a>,
333 which is part of
334 <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.</dd>
335 <dt><code>Message::DOM::DOMImplementation</code>
336 <dd>Module <code>Whatpm::URIChecker</code> depends on
337 <code>Message::DOM::DOMImplementation</code>,
338 which is part of
339 <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.
340 <dt><code>Message::DOM::DOMImplementation</code> and related modules</dt>
341 <dd><em>Testing</em> for module <code>Whatpm::ContentChecker</code>
342 depends on <code>Message::DOM::DOMImplementation</code> and related modules
343 in <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>.
344 They are not required for any practical use of those modules.
345 </dl>
346 </dd>
347 <dt><a href="http://suika.fam.cx/www/manakai-charlib/readme">manakai
348 charlib</a></dt>
349 <dd>Module <code>Whatpm::Charset::DecodeHandle</code> depends on
350 modules in <a href="http://suika.fam.cx/www/manakai-charlib/readme">manakai
351 charlib</a> for decoding of <em>Japanese character encodings</em>.
352 See the documentation for
353 <a href="http://suika.fam.cx/www/manakai-charlib/readme">manakai
354 charlib</a> for more information.</dd>
355 <dt><a href="http://www.python.org/">Python</a>, Perl
356 <a href="http://search.cpan.org/~neilw/Inline-Python-0.22/"><code>Inline::Python</code></a>
357 module, and <a href="http://chardet.feedparser.org/">Universal Encoding
358 Detector</a></dt>
359 <dd>For the module <code>Whatpm::Charset::UniversalCharDet</code> to be
360 useful, this software must be installed on the system. See the
361 <a href="Whatpm/Charset/UniversalCharDet.html#dependency">documentation</a>
362 for more information.</dd>
363 <dt><a href="http://search.cpan.org/~makamaka/JSON-1.14/"><code>JSON</code></a></dt>
364 <dd><em>Testing</em> for modules <code>Whatpm::HTML</code> and
365 <code>Whatpm::CSS::Tokenizer</code>
366 depends on <a href="http://search.cpan.org/~makamaka/JSON-1.14/"><code>JSON</code> and related modules</a>.
367 They are not required for any practical use of those modules.
368 </dl>
369 </div>
370
371 <div class="section" id="download">
372 <h2>Distribution</h2>
373
374 <p>The development version of Whatpm may be found in the
375 <a href="http://suika.fam.cx/gate/cvs/markup/html/whatpm/">CVS
376 repository</a>.</p>
377
378 <p><a href="http://suika.fam.cx/gate/cvs/markup/html/whatpm/whatpm.tar.gz?tarball=1">The
379 latest development version of Whatpm</a> is also available as a
380 tarball.
381
382 </div>
383
384 <div class="section" id="todo">
385 <h2>TO DO</h2>
386
387 <ul>
388 <li>Bug fix (Test results:
389 <a href="t/content-type-result"><code>Whatpm::ContentType</code></a>,
390 <a href="t/tokenizer-result">HTML tokenization</a>,
391 <a href="t/tree-construction-result">HTML tree construction</a>,
392 <a href="t/content-checker-result"><code>Whatpm::ContentChecker</code></a>).</li>
393 <li>Merge with the <a href="http://suika.fam.cx/www/2006/manakai/">manakai-core</a>
394 code tree.
395 <li>Charset detection.</li>
396 <li>Validation for <code>meta</code>.</li>
397 <li>Validation for media queries (level 3), IRIs (against URI schemes),
398 and so on.</li>
399 <li>Documentation is missing for some features.</li>
400 <li>XML parser<!-- with application cache selection algorithm hook-->.</li>
401 <li>In addition, each module has its own TO DO items.
402 (Search for <q>## TODO</q> and <q>## ISSUE</q> in each module.)</li>
403 </ul>
404 </div>
405
406 <div class=section id=acknowledgments>
407 <h2>Acknowledgments</h2>
408
409 <p>Thanks to the <a href="http://code.google.com/p/html5lib/">html5lib</a>
410 team for their
411 <a href="http://html5lib.googlecode.com/svn/trunk/testdata/">HTML5
412 parser test data</a>.</p>
413 </div>
414
415 <div class="section" id="author">
416 <h2>Author</h2>
417
418 <p><a href="http://suika.fam.cx/~wakaba/who?" rel="author">Wakaba</a>.</p>
419 </div>
420
421 <div class="section" id="license">
422 <h2>License</h2>
423
424 <p>Copyright 2007-2008 Wakaba
425 <code class="mail">&lt;<a href="mailto:w@suika.fam.cx"
426 rel="author">w@suika.fam.cx</a>&gt;</code>.</p>
427
428 <p>This library is free software; you can redistribute it and/or modify
429 it under the same terms as Perl itself.</p>
430 </div>
431
432 </body>
433 </html>
