| 1 |
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> |
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> |
| 2 |
<html xmlns="http://www.w3.org/1999/xhtml"> |
<html xmlns="http://www.w3.org/1999/xhtml"> |
| 3 |
<head> |
<head> |
| 4 |
<title>What::HTML - An HTML Parser</title> |
<title>Whatpm::HTML - An HTML Parser</title> |
| 5 |
<link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" /> |
<link rel="stylesheet" href="http://suika.fam.cx/www/style/html/pod.css" type="text/css" /> |
| 6 |
<link rev="made" href="mailto:admin@suika.fam.cx" /> |
<link rev="made" href="mailto:admin@suika.fam.cx" /> |
| 7 |
</head> |
</head> |
| 28 |
<p> |
<p> |
| 29 |
</p> |
</p> |
| 30 |
<h1><a name="name">NAME</a></h1> |
<h1><a name="name">NAME</a></h1> |
| 31 |
<p>What::HTML - An HTML Parser</p> |
<p>Whatpm::HTML - An HTML Parser</p> |
| 32 |
<p> |
<p> |
| 33 |
</p> |
</p> |
| 34 |
<hr /> |
<hr /> |
| 35 |
<h1><a name="synopsis">SYNOPSIS</a></h1> |
<h1><a name="synopsis">SYNOPSIS</a></h1> |
| 36 |
<pre> |
<pre> |
| 37 |
use What::HTML; |
use Whatpm::HTML; |
| 38 |
|
|
| 39 |
my $s = q<<!DOCTYPE html><html>...</html>>; |
my $s = q<<!DOCTYPE html><html>...</html>>; |
| 40 |
# $doc = an empty DOM |Document| object |
# $doc = an empty DOM |Document| object |
| 43 |
warn $error_code, "\n"; |
warn $error_code, "\n"; |
| 44 |
}; |
}; |
| 45 |
|
|
| 46 |
What::HTML->parse_string ($s => $doc, $onerror); |
Whatpm::HTML->parse_string ($s => $doc, $onerror); |
| 47 |
|
|
| 48 |
## Then, |$doc| is the DOM representation of |$s|.</pre> |
## Then, |$doc| is the DOM representation of |$s|.</pre> |
| 49 |
<p> |
<p> |
| 50 |
</p> |
</p> |
| 51 |
<hr /> |
<hr /> |
| 52 |
<h1><a name="description">DESCRIPTION</a></h1> |
<h1><a name="description">DESCRIPTION</a></h1> |
| 53 |
<p>The <code>What::HTML</code> module contains an HTML parser and serializer.</p> |
<p>The <code>Whatpm::HTML</code> module contains an HTML parser and serializer.</p> |
| 54 |
<p>The HTML parser can be used to construct the DOM tree representation |
<p>The HTML parser can be used to construct the DOM tree representation |
| 55 |
from an HTML document. The parsing and tree construction are done |
from an HTML document. The parsing and tree construction are done |
| 56 |
as described in the Web Applications 1.0 specification.</p> |
as described in the Web Applications 1.0 specification.</p> |
| 58 |
of a DOM tree (or a tree fragment thereof). The serialization |
of a DOM tree (or a tree fragment thereof). The serialization |
| 59 |
is performed as described in the Web Applications 1.0 specification |
is performed as described in the Web Applications 1.0 specification |
| 60 |
for the <code>innerHTML</code> DOM attribute.</p> |
for the <code>innerHTML</code> DOM attribute.</p> |
| 61 |
<p>This module is part of WHAT.pm - Perl Modules for |
<p>This module is part of Whatpm - Perl Modules for |
| 62 |
Web Hypertext Application Technologies.</p> |
Web Hypertext Application Technologies.</p> |
| 63 |
<p> |
<p> |
| 64 |
</p> |
</p> |
| 65 |
<hr /> |
<hr /> |
| 66 |
<h1><a name="methods">METHODS</a></h1> |
<h1><a name="methods">METHODS</a></h1> |
| 67 |
<dl> |
<dl> |
| 68 |
<dt><strong><a name="item_parse_string">[<em>$doc</em> =] What::HTML->parse_string (<em>$s</em>, <em>$doc</em>[, <em>$onerror</em>]);</a></strong><br /> |
<dt><strong><a name="item_parse_string">[<em>$doc</em> =] Whatpm::HTML->parse_string (<em>$s</em>, <em>$doc</em>[, <em>$onerror</em>]);</a></strong><br /> |
| 69 |
</dt> |
</dt> |
| 70 |
<dd> |
<dd> |
| 71 |
Parse a string <em>$s</em> as an HTML document. |
Parse a string <em>$s</em> as an HTML document. |
| 95 |
<p>The method returns the DOM <code>Document</code> object (i.e. the second argument).</p> |
<p>The method returns the DOM <code>Document</code> object (i.e. the second argument).</p> |
| 96 |
</dd> |
</dd> |
| 97 |
<dd> |
<dd> |
| 98 |
<p>Note that the <code>What::NanoDOM</code> module provides a non-conforming |
<p>Note that the <code>Whatpm::NanoDOM</code> module provides a non-conforming |
| 99 |
implementation of DOM that only implements the subset that |
implementation of DOM that only implements the subset that |
| 100 |
is necessary for the purpose of <code>What::HTML</code>'s parsing and |
is necessary for the purpose of <code>Whatpm::HTML</code>'s parsing and |
| 101 |
serializing. |
serializing. |
| 102 |
With this module, creating a new HTML <code>Document</code> object |
With this module, creating a new HTML <code>Document</code> object |
| 103 |
from a string containing an HTML document can be coded as:</p> |
from a string containing an HTML document can be coded as:</p> |
| 104 |
</dd> |
</dd> |
| 105 |
<dd> |
<dd> |
| 106 |
<pre> |
<pre> |
| 107 |
use What::HTML; |
use Whatpm::HTML; |
| 108 |
use What::NanoDOM; |
use Whatpm::NanoDOM; |
| 109 |
my $doc = What::HTML->parse_string ($s => What::NanoDOM->new, $onerror);</pre> |
my $doc = Whatpm::HTML->parse_string |
| 110 |
|
($s => Whatpm::NanoDOM::Document->new, $onerror);</pre> |
| 111 |
</dd> |
</dd> |
| 112 |
<p></p> |
<p></p> |
| 113 |
<dt><strong><a name="item_get_inner_html"><em>$s</em> = What::HTML->get_inner_html (<em>$node</em>[, <em>$onerror</em>]);</a></strong><br /> |
<dt><strong><a name="item_get_inner_html"><em>$s</em> = Whatpm::HTML->get_inner_html (<em>$node</em>[, <em>$onerror</em>]);</a></strong><br /> |
| 114 |
</dt> |
</dt> |
| 115 |
<dd> |
<dd> |
| 116 |
Return the HTML serialization of a DOM node <em>$node</em>. |
Return the HTML serialization of a DOM node <em>$node</em>. |
| 131 |
node is simply ignored.</p> |
node is simply ignored.</p> |
| 132 |
</dd> |
</dd> |
| 133 |
<dd> |
<dd> |
| 134 |
<p>The method returns the <code>inner_html</code> attribute |
<p>The method returns a reference to the <code>inner_html</code> attribute |
| 135 |
value, i.e. the HTML serialization of the <code>$node</code>.</p> |
value, i.e. the HTML serialization of the <code>$node</code>.</p> |
| 136 |
</dd> |
</dd> |
| 137 |
<p></p></dl> |
<p></p></dl> |
| 151 |
<h1><a name="see_also">SEE ALSO</a></h1> |
<h1><a name="see_also">SEE ALSO</a></h1> |
| 152 |
<p>Web Applications 1.0 Working Draft (aka HTML5) |
<p>Web Applications 1.0 Working Draft (aka HTML5) |
| 153 |
<http://whatwg.org/html5>. (Revision 792, 1 May 2007)</p> |
<http://whatwg.org/html5>. (Revision 792, 1 May 2007)</p> |
| 154 |
<p><a href="../What/NanoDOM.html">the What::NanoDOM manpage</a></p> |
<p><a href="../Whatpm/NanoDOM.html">the Whatpm::NanoDOM manpage</a></p> |
| 155 |
<p> |
<p> |
| 156 |
</p> |
</p> |
| 157 |
<hr /> |
<hr /> |