Diff of /markup/html/whatpm/Whatpm/HTML.html


revision 1.2 by wakaba, Tue May 1 10:47:37 2007 UTC revision 1.6 by wakaba, Sun Nov 4 04:34:30 2007 UTC
# Line 1  Line 1 
1  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
2  <html xmlns="http://www.w3.org/1999/xhtml">  <html xmlns="http://www.w3.org/1999/xhtml">
3  <head>  <head>
4  <title>Whatpm::HTML - An HTML Parser</title>  <title>Whatpm::HTML - An HTML Parser and Serializer</title>
5  <link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />  <link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />
6  <link rev="made" href="mailto:admin@suika.fam.cx" />  <link rev="made" href="mailto:admin@suika.fam.cx" />
7  </head>  </head>
# Line 17  Line 17 
17          <li><a href="#synopsis">SYNOPSIS</a></li>          <li><a href="#synopsis">SYNOPSIS</a></li>
18          <li><a href="#description">DESCRIPTION</a></li>          <li><a href="#description">DESCRIPTION</a></li>
19          <li><a href="#methods">METHODS</a></li>          <li><a href="#methods">METHODS</a></li>
20            <li><a href="#lowlevel_interface">LOW-LEVEL INTERFACE</a></li>
21            <ul>
22    
23                    <li><a href="#application_cache_selection_algorithm_hook">Application Cache Selection Algorithm Hook</a></li>
24            </ul>
25    
26            <li><a href="#error_reports">ERROR REPORTS</a></li>
27          <li><a href="#to_do">TO DO</a></li>          <li><a href="#to_do">TO DO</a></li>
28          <li><a href="#see_also">SEE ALSO</a></li>          <li><a href="#see_also">SEE ALSO</a></li>
29          <li><a href="#author">AUTHOR</a></li>          <li><a href="#author">AUTHOR</a></li>
# Line 28  Line 35 
35  <p>  <p>
36  </p>  </p>
37  <h1><a name="name">NAME</a></h1>  <h1><a name="name">NAME</a></h1>
38  <p>Whatpm::HTML - An HTML Parser</p>  <p>Whatpm::HTML - An HTML Parser and Serializer</p>
39  <p>  <p>
40  </p>  </p>
41  <hr />  <hr />
# Line 39  Line 46 
46    my $s = q&lt;&lt;!DOCTYPE html&gt;&lt;html&gt;...&lt;/html&gt;&gt;;    my $s = q&lt;&lt;!DOCTYPE html&gt;&lt;html&gt;...&lt;/html&gt;&gt;;
47    # $doc = an empty DOM |Document| object    # $doc = an empty DOM |Document| object
48    my $on_error = sub {    my $on_error = sub {
49      my $error_code = shift;      my %error = @_;
50      warn $error_code, &quot;\n&quot;;      warn $error{type}, &quot;\n&quot;;
51    };    };
52        
53    Whatpm::HTML-&gt;parse_string ($s =&gt; $doc, $on_error);    Whatpm::HTML-&gt;parse_string ($s =&gt; $doc, $on_error);
54        
55    ## Then, |$doc| is the DOM representation of |$s|.</pre>    ## Now, |$doc| is the DOM representation of |$s|.</pre>
56  <p>  <p>
57  </p>  </p>
58  <hr />  <hr />
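
The revised synopsis passes error details to the handler as a key/value list rather than a single error code. The following is a minimal sketch of a handler that collects errors instead of warning. It assumes only the `type` key shown in the synopsis. The error type strings and the direct calls to the handler are hypothetical stand-ins for what the parser would do.

```perl
use strict;
use warnings;

my @errors;

# Handler in the style of the revised synopsis: the arguments arrive
# as a flat key/value list, so they are copied into a hash first.
my $on_error = sub {
  my %error = @_;
  push @errors, $error{type};
};

# Hypothetical stand-in for the parser invoking the handler; these
# error type strings are invented for illustration only.
$on_error->(type => 'example-error-one');
$on_error->(type => 'example-error-two');

print scalar (@errors), " error(s): @errors\n";
```

Collecting the types in an array keeps the handler free of side effects on STDERR, which may be convenient when the caller wants to report all errors after parsing has finished.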
# Line 92  This argument is optional; if missing, a Line 99  This argument is optional; if missing, a
99  parse error makes that string being <code>warn</code>ed.</p>  parse error makes that string being <code>warn</code>ed.</p>
100  </dd>  </dd>
101  <dd>  <dd>
102    <p><strong>NOTE</strong>: To be a conforming user agent, the code MUST either
103    abort the processing by throwing an exception at the first
104    invocation or MUST continue the processing until the parser
105    stops.</p>
106    </dd>
107    <dd>
108  <p>The method returns the DOM <code>Document</code> object (i.e. the second argument).</p>  <p>The method returns the DOM <code>Document</code> object (i.e. the second argument).</p>
109  </dd>  </dd>
110  <dd>  <dd>
111  <p>Note that the <code>Whatpm::NanoDOM</code> module provides a non-conforming  <p>Note that the <code>Whatpm::NanoDOM</code> module provides a non-conforming
112  implementation of DOM that only implements the subset that  implementation of DOM that only implements a subset that
113  is necessary for the purpose of <code>Whatpm::HTML</code>'s parsing and  is necessary for the purpose of <code>Whatpm::HTML</code>'s parsing and
114  serializing.  serializing.
115  With this module, creating a new HTML <code>Document</code> object  With this module, creating a new HTML <code>Document</code> object
116  from a string containing HTML document can be coded as:</p>  from a string containing HTML document might be coded as:</p>
117  </dd>  </dd>
118  <dd>  <dd>
119  <pre>  <pre>
# Line 117  Return the HTML serialization of a DOM n Line 130  Return the HTML serialization of a DOM n
130  </dd>  </dd>
131  <dd>  <dd>
132  <p>The first argument, <em>$node</em>, MUST be a DOM <code>Document</code>,  <p>The first argument, <em>$node</em>, MUST be a DOM <code>Document</code>,
133  <code>Node</code>, or <code>DocumentFragment</code> object.</p>  <code>Element</code>, or <code>DocumentFragment</code> node.</p>
134  </dd>  </dd>
135  <dd>  <dd>
136  <p>The second argument, <em>$onerror</em>, MUST be a reference to the  <p>The second argument, <em>$onerror</em>, MUST be a reference to the
137  error handling code.  This code will be invoked if a descendant  error handling code.  This code will be invoked if a descendant
138  of <code>$node</code> is not of <code>Element</code>, <code>Text</code>, <code>CDATASection</code>,  of <em>$node</em> is neither of <code>Element</code>, <code>Text</code>, <code>CDATASection</code>,
139  <code>Comment</code>, <code>DocumentType</code>, or <code>EntityReference</code> so  <code>Comment</code>, <code>DocumentType</code>, nor <code>EntityReference</code>, so
140  that <code>INVALID_STATE_ERR</code> MUST be thrown.  that an <code>INVALID_STATE_ERR</code> exception MUST be thrown.
141  The code will be invoked with an argument, which is the node  The code will be invoked with an argument, which is the node
142  whose type is invalid.    whose type is invalid.  
143  This argument is optional; if missing, any such  The argument <em>$onerror</em> is optional; if missing, any erroneous
144  node is simply ignored.</p>  node is simply ignored.</p>
145  </dd>  </dd>
146  <dd>  <dd>
147  <p>The method returns a reference to the <code>inner_html</code> attribute  <p>The method returns a reference to the <code>inner_html</code> attribute
148  value, i.e. the HTML serialization of the <code>$node</code>.</p>  value, i.e. the HTML serialization of the <em>$node</em>.</p>
149  </dd>  </dd>
150  <p></p></dl>  <p></p></dl>
151  <p>  <p>
152  </p>  </p>
153  <hr />  <hr />
154    <h1><a name="lowlevel_interface">LOW-LEVEL INTERFACE</a></h1>
155    <p>@@ TBW</p>
156    <p>
157    </p>
158    <h2><a name="application_cache_selection_algorithm_hook">Application Cache Selection Algorithm Hook</a></h2>
159    <p>Once a parser <em>$p</em> is instantiated by method <code>new</code>,
160    a <code>CODE</code> reference can be set to <code>$p-&gt;{application_cache_selection}</code>.
161    That <code>CODE</code> will be called back when the application cache selection
162    algorithm MUST be run per HTML5.  By default,
163    <code>$p-&gt;{application_cache_selection}</code> is set to an empty subroutine.</p>
164    <p>The subroutine will be invoked with an argument <em>manifest_uri</em>,
165    which is set to the manifest URI when the algorithm MUST be invoked
166    with a manifest URI, or is set to <code>undef</code> when the algorithm MUST
167    be invoked without a manifest URI.</p>
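
The hook described above might be installed as in the following sketch. The plain hash is a stand-in for a parser object returned by `new`, used here only so that the example runs without Whatpm installed. The manifest URI and the direct invocations are hypothetical; in real use the parser would call the hook itself.

```perl
use strict;
use warnings;

# Stand-in for a parser instantiated by |new|; per the text above, the
# hook defaults to an empty subroutine.
my $p = {application_cache_selection => sub { }};

my @selected;

# Replace the default hook.  The single argument is the manifest URI,
# or |undef| when the algorithm is invoked without one.
$p->{application_cache_selection} = sub {
  my $manifest_uri = shift;
  push @selected, defined $manifest_uri ? $manifest_uri : '(no manifest)';
};

# Hypothetical invocations, imitating what the parser would do.
$p->{application_cache_selection}->('http://www.example.com/app.manifest');
$p->{application_cache_selection}->(undef);

print "$_\n" for @selected;
```

Testing the argument with `defined` rather than for truth distinguishes the "no manifest" case from an empty-string URI.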
168    <p>
169    </p>
170    <hr />
171    <h1><a name="error_reports">ERROR REPORTS</a></h1>
172    <p>@@ TBW</p>
173    <p>The list of the error types is available in
174    Whatpm Error Types &lt;http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types&gt;.</p>
175    <p>
176    </p>
177    <hr />
178  <h1><a name="to_do">TO DO</a></h1>  <h1><a name="to_do">TO DO</a></h1>
179  <p>Tokenizer should emit a sequence of character tokens as one token  <p>Tokenizer should emit a sequence of character tokens as one token
180  to improve performance.</p>  to improve performance.</p>
181  <p>A method that accepts a byte stream as an input.</p>  <p>A method that accepts a byte stream as an input.</p>
182  <p>Charset detection algorithm.</p>  <p>Charset detection algorithm.</p>
183  <p>Setting inner_html.</p>  <p>Documentation for the setter of inner_html.</p>
184  <p>And there are many ``TODO''s and ``ISSUE''s in the source code.</p>  <p>And there are many ``TODO''s and ``ISSUE''s in the source code.</p>
185  <p>  <p>
186  </p>  </p>
187  <hr />  <hr />
188  <h1><a name="see_also">SEE ALSO</a></h1>  <h1><a name="see_also">SEE ALSO</a></h1>
189  <p>Web Applications 1.0 Working Draft (aka HTML5)  <p>Whatpm &lt;http://suika.fam.cx/www/markup/html/whatpm/readme&gt;.</p>
190  &lt;http://whatwg.org/html5&gt;.  (Revision 792, 1 May 2007)</p>  <p>Whatpm Error Types
191  <p><a href="../Whatpm/NanoDOM.html">the Whatpm::NanoDOM manpage</a></p>  &lt;http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types&gt;.</p>
192    <p>HTML5 &lt;http://whatwg.org/html5&gt;.</p>
193    <p><a href="../Whatpm/NanoDOM.html">the Whatpm::NanoDOM manpage</a>.</p>
194    <p><a href="../Whatpm/ContentChecker/HTML.html">the Whatpm::ContentChecker::HTML manpage</a>.</p>
195  <p>  <p>
196  </p>  </p>
197  <hr />  <hr />
