Contents of /markup/html/whatpm/Whatpm/HTML.html


Revision 1.7
Sun Nov 11 04:59:35 2007 UTC (16 years, 11 months ago) by wakaba
Branch: MAIN
Changes since 1.6: +1 -25 lines
File MIME type: text/html
++ ChangeLog	11 Nov 2007 04:59:27 -0000
2007-11-11  Wakaba  <wakaba@suika.fam.cx>

	* readme.en.html: Link to |Whatpm::HTML::Serializer|.

++ whatpm/Whatpm/ChangeLog	11 Nov 2007 04:59:14 -0000
2007-11-11  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pod (get_inner_html): Removed.

	* Makefile (HTML-all, HTML-clean): New.

2007-11-11  Wakaba  <wakaba@suika.fam.cx>

	* HTML.pm.src (get_inner_html): Removed (moved to HTML/Serializer.pm).

++ whatpm/Whatpm/HTML/ChangeLog	11 Nov 2007 04:58:48 -0000
2007-11-11  Wakaba  <wakaba@suika.fam.cx>

	* Serializer.pod: New file.

	* Makefile: New file.

2007-11-11  Wakaba  <wakaba@suika.fam.cx>

	* Serializer.pm: New module (split from ../HTML.pm.src).

2007-11-11  Wakaba  <wakaba@suika.fam.cx>

	* ChangeLog: New file.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Whatpm::HTML - An HTML Parser and Serializer</title>
<link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />
<link rev="made" href="mailto:admin@suika.fam.cx" />
</head>

<body>

<p><a name="__index__"></a></p>
<!-- INDEX BEGIN -->

<ul>

<li><a href="#name">NAME</a></li>
<li><a href="#synopsis">SYNOPSIS</a></li>
<li><a href="#description">DESCRIPTION</a></li>
<li><a href="#methods">METHODS</a></li>
<li><a href="#lowlevel_interface">LOW-LEVEL INTERFACE</a></li>
<ul>

<li><a href="#application_cache_selection_algorithm_hook">Application Cache Selection Algorithm Hook</a></li>
</ul>

<li><a href="#error_reports">ERROR REPORTS</a></li>
<li><a href="#to_do">TO DO</a></li>
<li><a href="#see_also">SEE ALSO</a></li>
<li><a href="#author">AUTHOR</a></li>
<li><a href="#license">LICENSE</a></li>
</ul>
<!-- INDEX END -->

<hr />
<p>
</p>
<h1><a name="name">NAME</a></h1>
<p>Whatpm::HTML - An HTML Parser and Serializer</p>
<p>
</p>
<hr />
<h1><a name="synopsis">SYNOPSIS</a></h1>
<pre>
  use Whatpm::HTML;
  use Whatpm::NanoDOM;

  my $s = q&lt;&lt;!DOCTYPE html&gt;&lt;html&gt;...&lt;/html&gt;&gt;;
  my $doc = Whatpm::NanoDOM::Document-&gt;new; # or any other empty DOM |Document| object
  my $on_error = sub {
    my %error = @_;
    warn $error{type}, &quot;\n&quot;;
  };

  Whatpm::HTML-&gt;parse_string ($s =&gt; $doc, $on_error);

  ## Now, |$doc| is the DOM representation of |$s|.</pre>
<p>
</p>
<hr />
<h1><a name="description">DESCRIPTION</a></h1>
<p>The <code>Whatpm::HTML</code> module contains an HTML parser and serializer.</p>
<p>The HTML parser can be used to construct the DOM tree representation
of an HTML document. Parsing and tree construction are done
as described in the Web Applications 1.0 specification.</p>
<p>The HTML serializer can be used to obtain the HTML document representation
of a DOM tree (or a tree fragment thereof). The serialization
is performed as described in the Web Applications 1.0 specification
for the <code>innerHTML</code> DOM attribute.</p>
<p>This module is part of Whatpm - Perl Modules for
Web Hypertext Application Technologies.</p>
<p>
</p>
<hr />
<h1><a name="methods">METHODS</a></h1>
<dl>
<dt><strong><a name="item_parse_string">[<em>$doc</em> =] Whatpm::HTML-&gt;parse_string (<em>$s</em>, <em>$doc</em>[, <em>$onerror</em>]);</a></strong><br />
</dt>
<dd>
Parse a string <em>$s</em> as an HTML document.
</dd>
<dd>
<p>The first argument, <em>$s</em>, MUST be a string. It is parsed
as a sequence of characters representing an HTML document.</p>
</dd>
<dd>
<p>The second argument, <em>$doc</em>, MUST be an empty read-write
DOM <code>Document</code> object. The HTML DOM tree is constructed
onto this <code>Document</code> object.</p>
</dd>
<dd>
<p>The third argument, <em>$onerror</em>, MUST be a reference to
the error handler code. Whenever a parse error is detected,
this code is invoked with an argument that contains a
string, not necessarily a helpful one, that might describe what is wrong.
The code MAY throw an exception, so that the whole parsing
process aborts. Otherwise, the parser will continue to
process the input. The code MUST NOT modify <em>$s</em> or <em>$doc</em>.
If it does, then the result is undefined.
This argument is optional; if it is omitted, the string for each
parse error is reported with <code>warn</code>.</p>
</dd>
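<dd>
<p>As a minimal sketch of such a handler, the following code collects
error types in an array instead of reporting them with <code>warn</code>.
It assumes, as in the SYNOPSIS, that the handler receives key/value pairs
including <code>type</code>. The names <code>@errors</code> and
<code>$collect_errors</code> and the error type string are hypothetical,
and the handler is invoked by hand so that the sketch runs without a parser.</p>
</dd>
<dd>
<pre>
  use strict;
  use warnings;

  my @errors;
  ## Hypothetical handler: records each error type and returns normally,
  ## so that the parser would continue to process the input.
  my $collect_errors = sub {
    my %error = @_;   # assumed key/value pairs, as in the SYNOPSIS
    push @errors, $error{type};
    return;
  };

  ## With a parser, the handler would be passed as the third argument:
  ##   Whatpm::HTML-&gt;parse_string ($s =&gt; $doc, $collect_errors);
  ## Here it is invoked by hand with a made-up error type.
  $collect_errors-&gt;(type =&gt; 'hypothetical-error-type');
  print scalar (@errors), &quot; error(s) recorded\n&quot;;</pre>
</dd>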
<dd>
<p><strong>NOTE</strong>: To be a conforming user agent, the code MUST either
abort the processing by throwing an exception at the first
invocation or MUST continue the processing until the parser
stops.</p>
</dd>
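<dd>
<p>The first of these two options might be sketched as follows: the handler
throws with <code>die</code> at its first invocation and the caller catches
the exception with <code>eval</code>. The name <code>$abort_on_error</code>
and the error type string are hypothetical, the key/value argument form is
assumed from the SYNOPSIS, and the handler is invoked by hand here in place
of a real <code>parse_string</code> call.</p>
</dd>
<dd>
<pre>
  use strict;
  use warnings;

  ## Hypothetical handler: aborts at the first invocation by throwing.
  my $abort_on_error = sub {
    my %error = @_;   # assumed key/value pairs, as in the SYNOPSIS
    die &quot;Parse error: $error{type}\n&quot;;
  };

  ## With a parser, the call to be wrapped would be:
  ##   eval { Whatpm::HTML-&gt;parse_string ($s =&gt; $doc, $abort_on_error); 1 }
  ## Here the handler is invoked by hand with a made-up error type.
  my $ok = eval { $abort_on_error-&gt;(type =&gt; 'hypothetical-error-type'); 1 };
  print &quot;Aborted: $@&quot; unless $ok;</pre>
</dd>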
<dd>
<p>The method returns the DOM <code>Document</code> object (i.e. the second argument).</p>
</dd>
<dd>
<p>Note that the <code>Whatpm::NanoDOM</code> module provides a non-conforming
implementation of DOM that only implements a subset that
is necessary for the purpose of <code>Whatpm::HTML</code>'s parsing and
serializing.
With this module, creating a new HTML <code>Document</code> object
from a string containing an HTML document might be coded as:</p>
</dd>
<dd>
<pre>
  use Whatpm::HTML;
  use Whatpm::NanoDOM;
  my $doc = Whatpm::HTML-&gt;parse_string
      ($s =&gt; Whatpm::NanoDOM::Document-&gt;new, $onerror);</pre>
</dd>
<p></p></dl>
<p>
</p>
<hr />
<h1><a name="lowlevel_interface">LOW-LEVEL INTERFACE</a></h1>
<p>@@ TBW</p>
<p>
</p>
<h2><a name="application_cache_selection_algorithm_hook">Application Cache Selection Algorithm Hook</a></h2>
<p>Once a parser <em>$p</em> is instantiated by the <code>new</code> method,
a <code>CODE</code> reference can be set to <code>$p-&gt;{application_cache_selection}</code>.
That <code>CODE</code> will be called back when the application cache selection
algorithm MUST be run per HTML5. By default,
<code>$p-&gt;{application_cache_selection}</code> is set to an empty subroutine.</p>
<p>The subroutine will be invoked with an argument <em>manifest_uri</em>,
which is set to the manifest URI when the algorithm MUST be invoked
with a manifest URI, or is set to <code>undef</code> when the algorithm MUST
be invoked without a manifest URI.</p>
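<p>A hook of this kind might be sketched as follows. The plain hash
reference standing in for the parser object, the variable
<code>$selected</code>, and the example manifest URI are all assumptions
made so that the sketch runs on its own; with a real parser, <em>$p</em>
would be the object returned by <code>new</code>.</p>
<pre>
  use strict;
  use warnings;

  ## Stand-in for a parser object (assumption); a real |$p| would be
  ## obtained from the |new| method.
  my $p = {};

  my $selected;
  $p-&gt;{application_cache_selection} = sub {
    my $manifest_uri = shift;   # the manifest URI, or |undef|
    $selected = defined $manifest_uri
        ? &quot;manifest: $manifest_uri&quot;
        : 'no manifest';
  };

  ## Invoked by hand here; the parser would call it when the algorithm
  ## is to be run.
  $p-&gt;{application_cache_selection}-&gt;('http://example.com/app.manifest');
  print &quot;$selected\n&quot;;</pre>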
<p>
</p>
<hr />
<h1><a name="error_reports">ERROR REPORTS</a></h1>
<p>@@ TBW</p>
<p>The list of the error types is available in
Whatpm Error Types &lt;http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types&gt;.</p>
<p>
</p>
<hr />
<h1><a name="to_do">TO DO</a></h1>
<p>Tokenizer should emit a sequence of character tokens as one token
to improve performance.</p>
<p>A method that accepts a byte stream as an input.</p>
<p>Charset detection algorithm.</p>
<p>Documentation for the setter of inner_html.</p>
<p>And there are many &quot;TODO&quot;s and &quot;ISSUE&quot;s in the source code.</p>
<p>
</p>
<hr />
<h1><a name="see_also">SEE ALSO</a></h1>
<p>Whatpm &lt;http://suika.fam.cx/www/markup/html/whatpm/readme&gt;.</p>
<p>Whatpm Error Types
&lt;http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types&gt;.</p>
<p>HTML5 &lt;http://whatwg.org/html5&gt;.</p>
<p><a href="../Whatpm/HTML/Serializer.html">the Whatpm::HTML::Serializer manpage</a>.</p>
<p><a href="../Whatpm/NanoDOM.html">the Whatpm::NanoDOM manpage</a>.</p>
<p><a href="../Whatpm/ContentChecker/HTML.html">the Whatpm::ContentChecker::HTML manpage</a>.</p>
<p>
</p>
<hr />
<h1><a name="author">AUTHOR</a></h1>
<p>Wakaba &lt;<a href="mailto:w@suika.fam.cx">w@suika.fam.cx</a>&gt;.</p>
<p>
</p>
<hr />
<h1><a name="license">LICENSE</a></h1>
<p>Copyright 2007 Wakaba &lt;<a href="mailto:w@suika.fam.cx">w@suika.fam.cx</a>&gt;</p>
<p>This library is free software; you can redistribute it
and/or modify it under the same terms as Perl itself.</p>

</body>

</html>
