Diff of /markup/html/whatpm/Whatpm/HTML.html

revision 1.1 by wakaba, Tue May 1 10:36:06 2007 UTC revision 1.7 by wakaba, Sun Nov 11 04:59:35 2007 UTC
# Line 1  Line 1 
1  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
2  <html xmlns="http://www.w3.org/1999/xhtml">  <html xmlns="http://www.w3.org/1999/xhtml">
3  <head>  <head>
4  <title>What::HTML - An HTML Parser</title>  <title>Whatpm::HTML - An HTML Parser and Serializer</title>
5  <link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />  <link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" />
6  <link rev="made" href="mailto:admin@suika.fam.cx" />  <link rev="made" href="mailto:admin@suika.fam.cx" />
7  </head>  </head>
# Line 17  Line 17 
17          <li><a href="#synopsis">SYNOPSIS</a></li>          <li><a href="#synopsis">SYNOPSIS</a></li>
18          <li><a href="#description">DESCRIPTION</a></li>          <li><a href="#description">DESCRIPTION</a></li>
19          <li><a href="#methods">METHODS</a></li>          <li><a href="#methods">METHODS</a></li>
20            <li><a href="#lowlevel_interface">LOW-LEVEL INTERFACE</a></li>
21            <ul>
22    
23                    <li><a href="#application_cache_selection_algorithm_hook">Application Cache Selection Algorithm Hook</a></li>
24            </ul>
25    
26            <li><a href="#error_reports">ERROR REPORTS</a></li>
27          <li><a href="#to_do">TO DO</a></li>          <li><a href="#to_do">TO DO</a></li>
28          <li><a href="#see_also">SEE ALSO</a></li>          <li><a href="#see_also">SEE ALSO</a></li>
29          <li><a href="#author">AUTHOR</a></li>          <li><a href="#author">AUTHOR</a></li>
# Line 28  Line 35 
35  <p>  <p>
36  </p>  </p>
37  <h1><a name="name">NAME</a></h1>  <h1><a name="name">NAME</a></h1>
38  <p>What::HTML - An HTML Parser</p>  <p>Whatpm::HTML - An HTML Parser and Serializer</p>
39  <p>  <p>
40  </p>  </p>
41  <hr />  <hr />
42  <h1><a name="synopsis">SYNOPSIS</a></h1>  <h1><a name="synopsis">SYNOPSIS</a></h1>
43  <pre>  <pre>
44    use What::HTML;    use Whatpm::HTML;
45        
46    my $s = q&lt;&lt;!DOCTYPE html&gt;&lt;html&gt;...&lt;/html&gt;&gt;;    my $s = q&lt;&lt;!DOCTYPE html&gt;&lt;html&gt;...&lt;/html&gt;&gt;;
47    # $doc = an empty DOM |Document| object    # $doc = an empty DOM |Document| object
48    my $on_error = sub {    my $on_error = sub {
49      my $error_code = shift;      my %error = @_;
50      warn $error_code, &quot;\n&quot;;      warn $error{type}, &quot;\n&quot;;
51    };    };
52        
53    What::HTML-&gt;parse_string ($s =&gt; $doc, $onerror);    Whatpm::HTML-&gt;parse_string ($s =&gt; $doc, $onerror);
54        
55    ## Then, |$doc| is the DOM representation of |$s|.</pre>    ## Now, |$doc| is the DOM representation of |$s|.</pre>
56  <p>  <p>
57  </p>  </p>
58  <hr />  <hr />
59  <h1><a name="description">DESCRIPTION</a></h1>  <h1><a name="description">DESCRIPTION</a></h1>
60  <p>The <code>What::HTML</code> module contains HTML parser and serializer.</p>  <p>The <code>Whatpm::HTML</code> module contains HTML parser and serializer.</p>
61  <p>The HTML parser can be used to construct the DOM tree representation  <p>The HTML parser can be used to construct the DOM tree representation
62  from an HTML document.  The parsing and tree construction are done  from an HTML document.  The parsing and tree construction are done
63  as described in the Web Application 1.0 specification.</p>  as described in the Web Application 1.0 specification.</p>
# Line 58  as described in the Web Application 1.0 Line 65  as described in the Web Application 1.0
65  of a DOM tree (or a tree fragment thereof).  The serialization  of a DOM tree (or a tree fragment thereof).  The serialization
66  is performed as described in the Web Applications 1.0 specification  is performed as described in the Web Applications 1.0 specification
67  for <code>innerHTML</code> DOM attribute.</p>  for <code>innerHTML</code> DOM attribute.</p>
68  <p>This module is part of WHAT.pm - Perl Modules for  <p>This module is part of Whatpm - Perl Modules for
69  Web Hypertext Application Technologies.</p>  Web Hypertext Application Technologies.</p>
70  <p>  <p>
71  </p>  </p>
72  <hr />  <hr />
73  <h1><a name="methods">METHODS</a></h1>  <h1><a name="methods">METHODS</a></h1>
74  <dl>  <dl>
75  <dt><strong><a name="item_parse_string">[<em>$doc</em> =] What::HTML-&gt;parse_string (<em>$s</em>, <em>$doc</em>[, <em>$onerror</em>]);</a></strong><br />  <dt><strong><a name="item_parse_string">[<em>$doc</em> =] Whatpm::HTML-&gt;parse_string (<em>$s</em>, <em>$doc</em>[, <em>$onerror</em>]);</a></strong><br />
76  </dt>  </dt>
77  <dd>  <dd>
78  Parse a string <em>$s</em> as an HTML document.  Parse a string <em>$s</em> as an HTML document.
# Line 92  This argument is optional; if missing, a Line 99  This argument is optional; if missing, a
99  parse error makes that string being <code>warn</code>ed.</p>  parse error makes that string being <code>warn</code>ed.</p>
100  </dd>  </dd>
101  <dd>  <dd>
102    <p><strong>NOTE</strong>: To be a conforming user agent, the code MUST either
103    abort the processing by throwing an exception at the first
104    invocation or MUST continue the processing until the parser
105    stops.</p>
106    </dd>
107    <dd>
108  <p>The method returns the DOM <code>Document</code> object (i.e. the second argument).</p>  <p>The method returns the DOM <code>Document</code> object (i.e. the second argument).</p>
109  </dd>  </dd>
110  <dd>  <dd>
111  <p>Note that the <code>What::NanoDOM</code> module provides a non-conforming  <p>Note that the <code>Whatpm::NanoDOM</code> module provides a non-conforming
112  implementation of DOM that only implements the subset that  implementation of DOM that only implements a subset that
113  is necessary for the purpose of <code>What::HTML</code>'s parsing and  is necessary for the purpose of <code>Whatpm::HTML</code>'s parsing and
114  serializing.  serializing.
115  With this module, creating a new HTML <code>Document</code> object  With this module, creating a new HTML <code>Document</code> object
116  from a string containing HTML document can be coded as:</p>  from a string containing HTML document might be coded as:</p>
117  </dd>  </dd>
118  <dd>  <dd>
119  <pre>  <pre>
120    use What::HTML;    use Whatpm::HTML;
121    use What::NanoDOM;    use Whatpm::NanoDOM;
122    my $doc = What::HTML-&gt;parse_string ($s =&gt; What::NanoDOM-&gt;new, $onerror);</pre>    my $doc = Whatpm::HTML-&gt;parse_string
123  </dd>        ($s =&gt; Whatpm::NanoDOM::Document-&gt;new, $onerror);</pre>
 <p></p>  
 <dt><strong><a name="item_get_inner_html"><em>$s</em> = What::HTML-&gt;get_inner_html (<em>$node</em>[, <em>$onerror</em>]);</a></strong><br />  
 </dt>  
 <dd>  
 Return the HTML serialization of a DOM node <em>$node</em>.  
 </dd>  
 <dd>  
 <p>The first argument, <em>$node</em>, MUST be a DOM <code>Document</code>,  
 <code>Node</code>, or <code>DocumentFragment</code> object.</p>  
 </dd>  
 <dd>  
 <p>The second argument, <em>$onerror</em>, MUST be a reference to the  
 error handling code.  This code will be invoked if a descendant  
 of <code>$node</code> is not of <code>Element</code>, <code>Text</code>, <code>CDATASection</code>,  
 <code>Comment</code>, <code>DocumentType</code>, or <code>EntityReference</code> so  
 that <code>INVALID_STATE_ERR</code> MUST be thrown.  
 The code will be invoked with an argument, which is the node  
 whose type is invalid.    
 This argument is optional; if missing, any such  
 node is simply ignored.</p>  
 </dd>  
 <dd>  
 <p>The method returns the <code>inner_html</code> attribute  
 value, i.e. the HTML serialization of the <code>$node</code>.</p>  
124  </dd>  </dd>
125  <p></p></dl>  <p></p></dl>
126  <p>  <p>
127  </p>  </p>
128  <hr />  <hr />
129    <h1><a name="lowlevel_interface">LOW-LEVEL INTERFACE</a></h1>
130    <p>@@ TBW</p>
131    <p>
132    </p>
133    <h2><a name="application_cache_selection_algorithm_hook">Application Cache Selection Algorithm Hook</a></h2>
134    <p>Once a parser <em>$p</em> is instantiated by method <code>new</code>,
135    a <code>CODE</code> reference can be set to <code>$p-&gt;{application_cache_selection}</code>.
136    That <code>CODE</code> will be called back when the application cache selection
137    algorithm MUST be run per HTML5.  By default,
138    <code>$p-&gt;{application_cache_selection}</code> is set to an empty subroutine.</p>
139    <p>The subroutine will be invoked with an argument <em>manifest_uri</em>,
140    which is set to the manifest URI when the algorithm MUST be invoked
141    with a manifest URI, or is set to <code>undef</code> when the algorithm MUST
142    be invoked without no manifest URI.</p>
143    <p>
144    </p>
145    <hr />
146    <h1><a name="error_reports">ERROR REPORTS</a></h1>
147    <p>@@ TBW</p>
148    <p>The list of the error types is available in
149    Whatpm Error Types &lt;http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types&gt;.</p>
150    <p>
151    </p>
152    <hr />
153  <h1><a name="to_do">TO DO</a></h1>  <h1><a name="to_do">TO DO</a></h1>
154  <p>Tokenizer should emit a sequence of character tokens as one token  <p>Tokenizer should emit a sequence of character tokens as one token
155  to improve performance.</p>  to improve performance.</p>
156  <p>A method that accepts a byte stream as an input.</p>  <p>A method that accepts a byte stream as an input.</p>
157  <p>Charset detection algorithm.</p>  <p>Charset detection algorithm.</p>
158  <p>Setting inner_html.</p>  <p>Documentation for the setter of inner_html.</p>
159  <p>And there are many ``TODO''s and ``ISSUE''s in the source code.</p>  <p>And there are many ``TODO''s and ``ISSUE''s in the source code.</p>
160  <p>  <p>
161  </p>  </p>
162  <hr />  <hr />
163  <h1><a name="see_also">SEE ALSO</a></h1>  <h1><a name="see_also">SEE ALSO</a></h1>
164  <p>Web Applications 1.0 Working Draft (aka HTML5)  <p>Whatpm &lt;http://suika.fam.cx/www/markup/html/whatpm/readme&gt;.</p>
165  &lt;http://whatwg.org/html5&gt;.  (Revision 792, 1 May 2007)</p>  <p>Whatpm Error Types
166  <p><a href="../What/NanoDOM.html">the What::NanoDOM manpage</a></p>  &lt;http://suika.fam.cx/gate/2005/sw/Whatpm%20Error%20Types&gt;.</p>
167    <p>HTML5 &lt;http://whatwg.org/html5&gt;.</p>
168    <p><a href="../Whatpm/HTML/Serializer.html">the Whatpm::HTML::Serializer manpage</a>.</p>
169    <p><a href="../Whatpm/NanoDOM.html">the Whatpm::NanoDOM manpage</a>.</p>
170    <p><a href="../Whatpm/ContentChecker/HTML.html">the Whatpm::ContentChecker::HTML manpage</a>.</p>
171  <p>  <p>
172  </p>  </p>
173  <hr />  <hr />
