Contents of /markup/html/whatpm/Whatpm/HTML.html

Revision 1.4
Sun Nov 4 03:20:34 2007 UTC by wakaba
Branch: MAIN
Changes since 1.3: +7 -7 lines
File MIME type: text/html
++ whatpm/t/ChangeLog	4 Nov 2007 03:11:55 -0000
2007-11-04  Wakaba  <wakaba@suika.fam.cx>

	* content-model-4.dat: New tests for rel=up (HTML5 revision 1112)
	and rel=noreferer (HTML5 revision 1118).

++ whatpm/Whatpm/ChangeLog	4 Nov 2007 03:11:13 -0000
2007-11-04  Wakaba  <wakaba@suika.fam.cx>

	* mklinktypelist.pl: Support for rel=noreferer (HTML5 revision 1118).

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Whatpm::HTML - An HTML Parser</title>
<link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />
<link rev="made" href="mailto:admin@suika.fam.cx" />
</head>

<body>

<p><a name="__index__"></a></p>
<!-- INDEX BEGIN -->

<ul>

<li><a href="#name">NAME</a></li>
<li><a href="#synopsis">SYNOPSIS</a></li>
<li><a href="#description">DESCRIPTION</a></li>
<li><a href="#methods">METHODS</a></li>
<li><a href="#to_do">TO DO</a></li>
<li><a href="#see_also">SEE ALSO</a></li>
<li><a href="#author">AUTHOR</a></li>
<li><a href="#license">LICENSE</a></li>
</ul>
<!-- INDEX END -->

<hr />
<p>
</p>
<h1><a name="name">NAME</a></h1>
<p>Whatpm::HTML - An HTML Parser</p>
<p>
</p>
<hr />
<h1><a name="synopsis">SYNOPSIS</a></h1>
<pre>
use Whatpm::HTML;

my $s = q&lt;&lt;!DOCTYPE html&gt;&lt;html&gt;...&lt;/html&gt;&gt;;
# $doc = an empty DOM |Document| object
my $onerror = sub {
  my $error_code = shift;
  warn $error_code, &quot;\n&quot;;
};

Whatpm::HTML-&gt;parse_string ($s =&gt; $doc, $onerror);

## Then, |$doc| is the DOM representation of |$s|.</pre>
<p>
</p>
<hr />
<h1><a name="description">DESCRIPTION</a></h1>
<p>The <code>Whatpm::HTML</code> module contains an HTML parser and serializer.</p>
<p>The HTML parser can be used to construct the DOM tree representation
of an HTML document. Parsing and tree construction are done
as described in the Web Applications 1.0 specification.</p>
<p>The HTML serializer can be used to obtain the HTML document representation
of a DOM tree (or a fragment thereof). The serialization
is performed as described in the Web Applications 1.0 specification
for the <code>innerHTML</code> DOM attribute.</p>
<p>This module is part of Whatpm - Perl Modules for
Web Hypertext Application Technologies.</p>
<p>
</p>
<hr />
<h1><a name="methods">METHODS</a></h1>
<dl>
<dt><strong><a name="item_parse_string">[<em>$doc</em> =] Whatpm::HTML-&gt;parse_string (<em>$s</em>, <em>$doc</em>[, <em>$onerror</em>]);</a></strong><br />
</dt>
<dd>
Parse a string <em>$s</em> as an HTML document.
</dd>
<dd>
<p>The first argument, <em>$s</em>, MUST be a string. It is parsed
as a sequence of characters representing an HTML document.</p>
</dd>
<dd>
<p>The second argument, <em>$doc</em>, MUST be an empty read-write
DOM <code>Document</code> object. The HTML DOM tree is constructed
onto this <code>Document</code> object.</p>
</dd>
<dd>
<p>The third argument, <em>$onerror</em>, MUST be a reference to
the error handler code. Whenever a parse error is detected,
this code is invoked with one argument: a string, not necessarily
informative, that might describe what is wrong.
The code MAY throw an exception, so that the whole parsing
process aborts. Otherwise, the parser will continue to
process the input. The code MUST NOT modify <em>$s</em> or <em>$doc</em>.
If it does, then the result is undefined.
This argument is optional; if it is omitted, the string for
each parse error is passed to <code>warn</code>.</p>
</dd>
<dd>
<p><strong>NOTE</strong>: To be a conforming user agent, the code MUST either
abort the processing by throwing an exception at its first
invocation or let the processing continue until the parser
stops.</p>
</dd>
<dd>
<p>The method returns the DOM <code>Document</code> object (i.e. the second argument).</p>
</dd>
<dd>
<p>Note that the <code>Whatpm::NanoDOM</code> module provides a non-conforming
implementation of DOM that only implements the subset that
is necessary for <code>Whatpm::HTML</code>'s parsing and
serializing.
With this module, creating a new HTML <code>Document</code> object
from a string containing an HTML document might be coded as:</p>
</dd>
<dd>
<pre>
use Whatpm::HTML;
use Whatpm::NanoDOM;
my $doc = Whatpm::HTML-&gt;parse_string
    ($s =&gt; Whatpm::NanoDOM::Document-&gt;new, $onerror);</pre>
</dd>
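<dd>
<p>As a rough illustration of the two handler behaviours described above,
the following sketch first collects every reported error and lets the
parser continue, then aborts at the first error by calling <code>die</code>.
The sample input and the names <code>@errors</code>, <code>$collect</code>,
and <code>$abort</code> are assumptions made for this example only; it also
assumes that <code>Whatpm::NanoDOM</code> is installed and that the handler
receives the error string as its first argument, as in the SYNOPSIS:</p>
</dd>
<dd>
<pre>
use strict;
use warnings;
use Whatpm::HTML;
use Whatpm::NanoDOM;

## Deliberately malformed input (hypothetical sample).
my $s = q{&lt;p&gt;Hello&lt;/b&gt;};

## Handler 1: record each error string and let the parser continue.
my @errors;
my $collect = sub {
  my $error = shift;
  push @errors, $error;
};
my $doc = Whatpm::HTML-&gt;parse_string
    ($s =&gt; Whatpm::NanoDOM::Document-&gt;new, $collect);
warn scalar (@errors), &quot; parse error(s) reported\n&quot;;

## Handler 2: abort at the first error by throwing an exception.
my $abort = sub {
  my $error = shift;
  die &quot;Parse error: $error\n&quot;;
};
my $doc2 = eval {
  Whatpm::HTML-&gt;parse_string
      ($s =&gt; Whatpm::NanoDOM::Document-&gt;new, $abort);
};
warn &quot;Parsing aborted: $@&quot; unless defined $doc2;</pre>
</dd>
<dd>
<p>Both handlers stay within the rules given for <em>$onerror</em>: neither
modifies <em>$s</em> or <em>$doc</em>, and the second one throws at its first
invocation rather than at some later one.</p>
</dd>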
<p></p>
<dt><strong><a name="item_get_inner_html"><em>$s</em> = Whatpm::HTML-&gt;get_inner_html (<em>$node</em>[, <em>$onerror</em>]);</a></strong><br />
</dt>
<dd>
Return the HTML serialization of a DOM node <em>$node</em>.
</dd>
<dd>
<p>The first argument, <em>$node</em>, MUST be a DOM <code>Document</code>,
<code>Element</code>, or <code>DocumentFragment</code> node.</p>
</dd>
<dd>
<p>The second argument, <em>$onerror</em>, MUST be a reference to the
error handling code. This code will be invoked if a descendant
of <em>$node</em> is not an <code>Element</code>, <code>Text</code>, <code>CDATASection</code>,
<code>Comment</code>, <code>DocumentType</code>, or <code>EntityReference</code> node, that is,
in the case where an <code>INVALID_STATE_ERR</code> exception MUST be thrown.
The code will be invoked with one argument, which is the node
whose type is invalid.
The argument <em>$onerror</em> is optional; if it is omitted, any erroneous
node is simply ignored.</p>
</dd>
<dd>
<p>The method returns a reference to the <code>inner_html</code> attribute
value, i.e. the HTML serialization of <em>$node</em>.</p>
</dd>
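<dd>
<p>A minimal sketch of serializing a parsed document might look like the
following. The sample input and the handler bodies are assumptions made for
this example, and the result is dereferenced on the basis of the statement
above that the method returns a reference:</p>
</dd>
<dd>
<pre>
use strict;
use warnings;
use Whatpm::HTML;
use Whatpm::NanoDOM;

## Build a document to serialize (hypothetical sample input);
## parse errors are silently ignored here.
my $doc = Whatpm::HTML-&gt;parse_string
    (q{&lt;!DOCTYPE html&gt;&lt;title&gt;Test&lt;/title&gt;&lt;p&gt;Hello}
     =&gt; Whatpm::NanoDOM::Document-&gt;new, sub { });

## Report any descendant node that cannot be serialized.
my $onerror = sub {
  my $node = shift;
  warn &quot;Node cannot be serialized: $node\n&quot;;
};

## The return value is a reference to the serialized string.
my $html_ref = Whatpm::HTML-&gt;get_inner_html ($doc, $onerror);
print $$html_ref, &quot;\n&quot;;</pre>
</dd>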
<p></p></dl>
<p>
</p>
<hr />
<h1><a name="to_do">TO DO</a></h1>
<p>Tokenizer should emit a sequence of character tokens as one token
to improve performance.</p>
<p>A method that accepts a byte stream as an input.</p>
<p>Charset detection algorithm.</p>
<p>Documentation for the setter of inner_html.</p>
<p>And there are many &quot;TODO&quot;s and &quot;ISSUE&quot;s in the source code.</p>
<p>
</p>
<hr />
<h1><a name="see_also">SEE ALSO</a></h1>
<p>Whatpm
&lt;http://suika.fam.cx/www/markup/html/whatpm/readme&gt;</p>
<p>Web Applications 1.0 Working Draft (aka HTML5)
&lt;http://whatwg.org/html5&gt;. (Revision 792, 1 May 2007)</p>
<p><a href="../Whatpm/NanoDOM.html">the Whatpm::NanoDOM manpage</a></p>
<p>
</p>
<hr />
<h1><a name="author">AUTHOR</a></h1>
<p>Wakaba &lt;<a href="mailto:w@suika.fam.cx">w@suika.fam.cx</a>&gt;.</p>
<p>
</p>
<hr />
<h1><a name="license">LICENSE</a></h1>
<p>Copyright 2007 Wakaba &lt;<a href="mailto:w@suika.fam.cx">w@suika.fam.cx</a>&gt;</p>
<p>This library is free software; you can redistribute it
and/or modify it under the same terms as Perl itself.</p>

</body>

</html>
