manakai

Whatpm::XML::Parser

An XML parser

SYNOPSIS

  use Whatpm::XML::Parser;
  use Message::DOM::DOMImplementation;
  $parser = Whatpm::XML::Parser->new;
  $dom = Message::DOM::DOMImplementation->new;
  $doc = $dom->create_document;
  
  $parser->parse_char_string ($chars => $doc);

  ## Or, just use DOM attribute:
  $doc->inner_html ($chars);

DESCRIPTION

The Whatpm::XML::Parser module is an implementation of the XML parser. The parser is not Draconian - the parser does not holt on well-formedness errors. It implements a variant of XML5 proposal, which defines error handling for ill-formed XML documents.

METHODS

It is recommended to use standard DOM interface, such as inner_html method of the Document object, to parse an XML string, where possible. The Whatpm::XML::Parser module, which, in fact, is used to implement the inner_html method, offers more control on how parser behaves, which would not be useful unless you are writing a complex user agent such as browser or validator.

The Whatpm::XML::Parser module provides following methods:

$parser = Whatpm::XML::Parser->new

Create a new parser.

$parser->parse_char_string ($chars => $doc)

Parse a string of characters (i.e. a possibly utf8-flagged string) as XML and construct the DOM tree.

The first argument to the method must be a string to parse. It may or may not be a valid or well-formed XML document.

The second argument to the method must be a DOM Document object (Message::DOM::Document). Any child nodes of the document is first removed by the parser.

$code = $parser->onerror
$parser->onerror ($new_code)

Get or set the error handler for the parser. Any parse error, as well as warning and information, is reported to the handler. See Whatpm::Errors for more information.

Parsed document structure is reflected to the Document object specified as an argument to parse methods.

SEE ALSO

Message::DOM::Document, Message::DOM::Element.

Whatpm::ContentChecker.

Whatpm::HTML::Parser.

SPECIFICATIONS

[XML]

XML 1.0 <http://www.w3.org/TR/xml/>.

XML 1.1 <http://www.w3.org/TR/xml11/>.

Namespaces in XML 1.0 <http://www.w3.org/TR/xml-names/>.

Namespaces in XML 1.1 <http://www.w3.org/TR/xml-names11/>.

XML Information Set <http://www.w3.org/TR/xml-infoset/>.

DOM Level 3 Core - Infoset Mapping <http://www.w3.org/TR/DOM-Level-3-Core/infoset-mapping.html>.

XML5. See <https://suika.suikawiki.org/~wakaba/wiki/sw/n/XML5> for references.

[HTML]

HTML Living Standard -Parsing XHTML documents <http://www.whatwg.org/specs/web-apps/current-work/#parsing-xhtml-documents>.

[XMLCC]

manakai's XML Conformance Checking <https://suika.suikawiki.org/www/markup/xml/xmlcc/xmlcc-work>.

[DTDEF]

DOM Document Type Definition Module <https://suika.suikawiki.org/www/markup/xml/domdtdef/domdtdef-work>.

[MANAKAI]

manakai DOM Extensions <https://suika.suikawiki.org/~wakaba/wiki/sw/n/manakai%20DOM%20Extensions>.

AUTHOR

Wakaba <wakaba@suikawiki.org>.

LICENSE

Copyright 2007-2012 Wakaba <wakaba@suikawiki.org>.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.