
Diff of /markup/html/whatpm/Whatpm/HTML.html


revision 1.3 by wakaba, Wed May 2 13:44:33 2007 UTC revision 1.8 by wakaba, Sun Nov 11 08:39:42 2007 UTC
# Line 1  Line 1 
1  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
2  <html xmlns="http://www.w3.org/1999/xhtml">  <html xmlns="http://www.w3.org/1999/xhtml">
3  <head>  <head>
4  <title>Whatpm::HTML - An HTML Parser</title>  <title>Whatpm::HTML - An HTML Parser and Serializer</title>
5  <link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />  <link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />
6  <link rev="made" href="mailto:admin@suika.fam.cx" />  <link rev="made" href="mailto:admin@suika.fam.cx" />
7  </head>  </head>
# Line 17  Line 17 
17          <li><a href="#synopsis">SYNOPSIS</a></li>          <li><a href="#synopsis">SYNOPSIS</a></li>
18          <li><a href="#description">DESCRIPTION</a></li>          <li><a href="#description">DESCRIPTION</a></li>
19          <li><a href="#methods">METHODS</a></li>          <li><a href="#methods">METHODS</a></li>
20            <li><a href="#lowlevel_interface">LOW-LEVEL INTERFACE</a></li>
21            <ul>
22    
23                    <li><a href="#application_cache_selection_algorithm_hook">Application Cache Selection Algorithm Hook</a></li>
24            </ul>
25    
26            <li><a href="#error_reports">ERROR REPORTS</a></li>
27          <li><a href="#to_do">TO DO</a></li>          <li><a href="#to_do">TO DO</a></li>
28            <li><a href="#dependency">DEPENDENCY</a></li>
29          <li><a href="#see_also">SEE ALSO</a></li>          <li><a href="#see_also">SEE ALSO</a></li>
30          <li><a href="#author">AUTHOR</a></li>          <li><a href="#author">AUTHOR</a></li>
31          <li><a href="#license">LICENSE</a></li>          <li><a href="#license">LICENSE</a></li>
# Line 28  Line 36 
36  <p>  <p>
37  </p>  </p>
38  <h1><a name="name">NAME</a></h1>  <h1><a name="name">NAME</a></h1>
39  <p>Whatpm::HTML - An HTML Parser</p>  <p>Whatpm::HTML - An HTML Parser and Serializer</p>
40  <p>  <p>
41  </p>  </p>
42  <hr />  <hr />
# Line 39  Line 47 
47    my $s = q&lt;&lt;!DOCTYPE html&gt;&lt;html&gt;...&lt;/html&gt;&gt;;    my $s = q&lt;&lt;!DOCTYPE html&gt;&lt;html&gt;...&lt;/html&gt;&gt;;
48    # $doc = an empty DOM |Document| object    # $doc = an empty DOM |Document| object
49    my $on_error = sub {    my $on_error = sub {
50      my $error_code = shift;      my %error = @_;
51      warn $error_code, &quot;\n&quot;;      warn $error{type}, &quot;\n&quot;;
52    };    };
53        
54    Whatpm::HTML-&gt;parse_string ($s =&gt; $doc, $on_error);    Whatpm::HTML-&gt;parse_char_string ($s =&gt; $doc, $on_error);
55        
56    ## Then, |$doc| is the DOM representation of |$s|.</pre>    ## Now, |$doc| is the DOM representation of |$s|.</pre>
57  <p>  <p>
58  </p>  </p>
59  <hr />  <hr />
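The synopsis hunk above changes the error handler from taking a single error code to taking a list of name/value pairs. As a minimal sketch of the revision 1.8 calling convention, the handler below is invoked by hand rather than by the parser; the `type` key comes from the synopsis, while the value `example-error-type` and the `@seen` array are made up for illustration.

```perl
use strict;
use warnings;

# Collect reported error types instead of printing them.
my @seen;
my $on_error = sub {
  my %error = @_;             # revision 1.8 style: named arguments
  push @seen, $error{type};
};

# Stand-in for the parser's call; 'example-error-type' is a made-up value,
# not an entry from the Whatpm Error Types list.
$on_error->(type => 'example-error-type');

print join (', ', @seen), "\n";
```

A handler written for revision 1.3 (`my $error_code = shift;`) would receive the string `type` as its first argument under this convention, so callers need to update their handlers when moving to `parse_char_string`.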
# Line 65  Web Hypertext Application Technologies.< Line 73  Web Hypertext Application Technologies.<
73  <hr />  <hr />
74  <h1><a name="methods">METHODS</a></h1>  <h1><a name="methods">METHODS</a></h1>
75  <dl>  <dl>
76  <dt><strong><a name="item_parse_string">[<em>$doc</em> =] Whatpm::HTML-&gt;parse_string (<em>$s</em>, <em>$doc</em>[, <em>$onerror</em>]);</a></strong><br />  <dt><strong><a name="item_parse_char_string">[<em>$doc</em> =] Whatpm::HTML-&gt;parse_char_string (<em>$s</em>, <em>$doc</em>[, <em>$onerror</em>]);</a></strong><br />
77  </dt>  </dt>
78  <dd>  <dd>
79  Parse a string <em>$s</em> as an HTML document.  Parse a string <em>$s</em> as an HTML document.
# Line 102  stops.</p> Line 110  stops.</p>
110  </dd>  </dd>
111  <dd>  <dd>
112  <p>Note that the <code>Whatpm::NanoDOM</code> module provides a non-conforming  <p>Note that the <code>Whatpm::NanoDOM</code> module provides a non-conforming
113  implementation of DOM that only implements the subset that  implementation of DOM that only implements a subset that
114  is necessary for the purpose of <code>Whatpm::HTML</code>'s parsing and  is necessary for the purpose of <code>Whatpm::HTML</code>'s parsing and
115  serializing.  serializing.
116  With this module, creating a new HTML <code>Document</code> object  With this module, creating a new HTML <code>Document</code> object
# Line 112  from a string containing HTML document m Line 120  from a string containing HTML document m
120  <pre>  <pre>
121    use Whatpm::HTML;    use Whatpm::HTML;
122    use Whatpm::NanoDOM;    use Whatpm::NanoDOM;
123    my $doc = Whatpm::HTML-&gt;parse_string    my $doc = Whatpm::HTML-&gt;parse_char_string
124        ($s =&gt; Whatpm::NanoDOM::Document-&gt;new, $onerror);</pre>        ($s =&gt; Whatpm::NanoDOM::Document-&gt;new, $onerror);</pre>
125  </dd>  </dd>
 <p></p>  
 <dt><strong><a name="item_get_inner_html"><em>$s</em> = Whatpm::HTML-&gt;get_inner_html (<em>$node</em>[, <em>$onerror</em>]);</a></strong><br />  
 </dt>  
 <dd>  
 Return the HTML serialization of a DOM node <em>$node</em>.  
 </dd>  
 <dd>  
 <p>The first argument, <em>$node</em>, MUST be a DOM <code>Document</code>,  
 <code>Node</code>, or <code>DocumentFragment</code> object.</p>  
 </dd>  
 <dd>  
 <p>The second argument, <em>$onerror</em>, MUST be a reference to the  
 error handling code.  This code will be invoked if a descendant  
 of <em>$node</em> is not of <code>Element</code>, <code>Text</code>, <code>CDATASection</code>,  
 <code>Comment</code>, <code>DocumentType</code>, or <code>EntityReference</code> so  
 that <code>INVALID_STATE_ERR</code> MUST be thrown.  
 The code will be invoked with an argument, which is the node  
 whose type is invalid.    
 This argument is optional; if missing, any such  
 node is simply ignored.</p>  
 </dd>  
 <dd>  
 <p>The method returns a reference to the <code>inner_html</code> attribute  
 value, i.e. the HTML serialization of the <em>$node</em>.</p>  
 </dd>  
126  <p></p></dl>  <p></p></dl>
127  <p>  <p>
128  </p>  </p>
129  <hr />  <hr />
130    <h1><a name="lowlevel_interface">LOW-LEVEL INTERFACE</a></h1>
131    <p>@@ TBW</p>
132    <p>
133    </p>
134    <h2><a name="application_cache_selection_algorithm_hook">Application Cache Selection Algorithm Hook</a></h2>
135    <p>Once a parser <em>$p</em> is instantiated by method <code>new</code>,
136    a <code>CODE</code> reference can be set to <code>$p-&gt;{application_cache_selection}</code>.
137    That <code>CODE</code> will be called back when the application cache selection
138    algorithm MUST be run per HTML5.  By default,
139    <code>$p-&gt;{application_cache_selection}</code> is set to an empty subroutine.</p>
140    <p>The subroutine will be invoked with an argument <em>manifest_uri</em>,
141    which is set to the manifest URI when the algorithm MUST be invoked
142    with a manifest URI, or is set to <code>undef</code> when the algorithm MUST
143    be invoked without a manifest URI.</p>
144    <p>
145    </p>
146    <hr />
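The hook described in the section above can be sketched as follows. This is an illustration under stated assumptions, not the module's own code: `$p` is a plain hash standing in for a parser object (a real one would come from the `new` method), the manifest URI is an example value, and the callback is invoked by hand to simulate the two cases the text describes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parser object; by default the hook is an
# empty subroutine, as the documentation states.
my $p = {application_cache_selection => sub { }};

my @calls;
$p->{application_cache_selection} = sub {
  my $manifest_uri = shift;   # undef when there is no manifest URI
  push @calls,
      defined $manifest_uri ? "manifest: $manifest_uri" : 'no manifest';
};

# Simulate the two invocation forms: with and without a manifest URI.
$p->{application_cache_selection}->('http://example.com/app.manifest');
$p->{application_cache_selection}->(undef);

print "$_\n" for @calls;
```

Testing `defined $manifest_uri` rather than its truth value keeps the two cases distinct even if a URI string happens to be false in Perl's boolean sense.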
147    <h1><a name="error_reports">ERROR REPORTS</a></h1>
148    <p>@@ TBW</p>
149    <p>The list of the error types is available in
150    Whatpm Error Types &lt;http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types&gt;.</p>
151    <p>
152    </p>
153    <hr />
154  <h1><a name="to_do">TO DO</a></h1>  <h1><a name="to_do">TO DO</a></h1>
155    <p>Documentation for parse_byte_string.</p>
156  <p>Tokenizer should emit a sequence of character tokens as one token  <p>Tokenizer should emit a sequence of character tokens as one token
157  to improve performance.</p>  to improve performance.</p>
158  <p>A method that accepts a byte stream as an input.</p>  <p>A method that accepts a byte stream as an input.</p>
159  <p>Charset detection algorithm.</p>  <p>Charset detection algorithm.</p>
160  <p>Setting inner_html.</p>  <p>Documentation for the setter of inner_html.</p>
161  <p>And there are many ``TODO''s and ``ISSUE''s in the source code.</p>  <p>And there are many ``TODO''s and ``ISSUE''s in the source code.</p>
162  <p>  <p>
163  </p>  </p>
164  <hr />  <hr />
165    <h1><a name="dependency">DEPENDENCY</a></h1>
166    <p>This module requires <em>Error</em>.  That module is available at CPAN
167    &lt;http://search.cpan.org/author/SHLOMIF/Error-0.17009/lib/Error.pm&gt;.
168    It is also part of the manakai-core distribution
169    &lt;http://suika.fam.cx/www/2006/manakai/&gt;.</p>
170    <p>
171    </p>
172    <hr />
173  <h1><a name="see_also">SEE ALSO</a></h1>  <h1><a name="see_also">SEE ALSO</a></h1>
174  <p>Whatpm  <p>Whatpm &lt;http://suika.fam.cx/www/markup/html/whatpm/readme&gt;.</p>
175  &lt;http://suika.fam.cx/www/markup/html/whatpm/readme&gt;</p>  <p>Whatpm Error Types
176  <p>Web Applications 1.0 Working Draft (aka HTML5)  &lt;http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types&gt;.</p>
177  &lt;http://whatwg.org/html5&gt;.  (Revision 792, 1 May 2007)</p>  <p>HTML5 &lt;http://whatwg.org/html5&gt;.</p>
178  <p><a href="../Whatpm/NanoDOM.html">the Whatpm::NanoDOM manpage</a></p>  <p><a href="../Whatpm/HTML/Serializer.html">the Whatpm::HTML::Serializer manpage</a>.</p>
179    <p><a href="../Whatpm/NanoDOM.html">the Whatpm::NanoDOM manpage</a>.</p>
180    <p><a href="../Whatpm/ContentChecker/HTML.html">the Whatpm::ContentChecker::HTML manpage</a>.</p>
181  <p>  <p>
182  </p>  </p>
183  <hr />  <hr />

