manakai

Whatpm::ContentChecker

DOM Conformance Checker

SYNOPSIS

  use Whatpm::ContentChecker;
  
  Whatpm::ContentChecker->check_document ($doc, sub {
    my %arg = @_;
    warn get_node_path ($arg{node}), ": ",
        ($arg{level} || "Error"), ": ",
        $arg{type}, "\n";
  });
  
  Whatpm::ContentChecker->check_element ($element, sub {
    my %arg = @_;
    warn get_node_path ($arg{node}), ": ",
        ($arg{level} || "Error"), ": ",
        $arg{type}, "\n";
  });
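
The function get_node_path used above is not defined in this document. The following is a minimal sketch of such a helper, assuming that each node responds to the methods node_name and parent_node; both the helper and the My::MockNode stand-in class are hypothetical and are not part of Whatpm::ContentChecker.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a slash-separated path of node names from
# the topmost ancestor down to $node.  It assumes that every node has
# the methods node_name and parent_node.
sub get_node_path {
  my ($node) = @_;
  my @names;
  while (defined $node) {
    unshift @names, $node->node_name;
    $node = $node->parent_node;
  }
  return '/' . join '/', @names;
}

# Minimal stand-in node class, only for trying the helper out.
package My::MockNode;
sub new { my ($class, %opt) = @_; return bless {%opt}, $class }
sub node_name { $_[0]->{name} }
sub parent_node { $_[0]->{parent} }

package main;

my $html = My::MockNode->new (name => 'html');
my $body = My::MockNode->new (name => 'body', parent => $html);
my $p    = My::MockNode->new (name => 'p', parent => $body);

print get_node_path ($p), "\n";  # prints "/html/body/p"
```

With a real DOM implementation the path would start at whatever node has no parent, typically the document node.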

DESCRIPTION

The Perl module Whatpm::ContentChecker provides methods for validating an entire DOM tree, or part of one, against relevant Web standards such as HTML.

METHODS

This module contains two class methods:

Whatpm::ContentChecker->check_document ($document, $onerror)

Checks a document, $document, and its descendants for conformance. If an error or a warning is detected, the $onerror CODE reference is invoked with the same named arguments as for the method check_element.

Whatpm::ContentChecker->check_element ($element, $onerror)

Checks an element, $element, and its descendants for conformance. If an error or a warning is detected, the $onerror CODE reference is invoked with the following named arguments:

level (Might be undef)

A string that describes the severity of the error or warning. For the list of severities, see <https://suika.suikawiki.org/gate/2005/sw/Whatpm%20Error%20Types>.

node (Always specified)

The node at which the error or warning is detected.

type (Always specified)

A string which describes the type of the error or warning. For the list of the errors and warnings, see <https://suika.suikawiki.org/gate/2005/sw/Whatpm%20Error%20Types>.

text (Sometimes specified)

An optional string that supplements the type of the error, e.g. an element name.

value (Sometimes specified)

An optional string in which the error occurs, for example a URL extracted from a complex attribute value in which a conformance error is detected.
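
As an illustration of the named arguments listed above, the following sketch collects every reported error into an array instead of printing it at once. The helpers make_error_collector and format_error are hypothetical, and the callback is invoked by hand with made-up argument values (including the level string), so the code runs without a DOM tree.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a callback usable as $onerror, together
# with a reference to the array that the callback fills.
sub make_error_collector {
  my @errors;
  my $onerror = sub {
    my %arg = @_;
    push @errors, {
      level => $arg{level} || 'Error',  # level might be undef
      node  => $arg{node},              # always specified
      type  => $arg{type},              # always specified
      text  => $arg{text},              # sometimes specified
      value => $arg{value},             # sometimes specified
    };
  };
  return ($onerror, \@errors);
}

# Hypothetical helper: formats one collected record as a single line.
sub format_error {
  my ($e) = @_;
  my $line = "$e->{level}: $e->{type}";
  $line .= " ($e->{text})" if defined $e->{text};
  $line .= " [$e->{value}]" if defined $e->{value};
  return $line;
}

my ($onerror, $errors) = make_error_collector ();

# With the real module the callback would be passed to the checker:
#   Whatpm::ContentChecker->check_element ($element, $onerror);
# Here it is called directly with placeholder values.
$onerror->(node => 'dummy node', type => 'example type', text => 'p');
$onerror->(node => 'dummy node', type => 'another type',
           level => 'example-level', value => 'http://example.com/');

print format_error ($_), "\n" for @$errors;
```

Collecting the records first makes it possible to sort or filter them, for example by level, before anything is shown to the user.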

SUPPORTED STANDARDS

Whatpm::ContentChecker - XML 1.0, XML 1.1, XML Namespaces 1.0, XML Namespaces 1.1, xml:base, xml:id.

Whatpm::ContentChecker::HTML - Web Applications 1.0 (including HTML Living Standard and HTML5), manakai's Conformance Checking Guideline for Obsolete HTML Elements and Attributes.

Whatpm::ContentChecker::Atom - Atom 1.0, Atom Threading Extension.

For more information, see <https://suika.suikawiki.org/gate/2007/html/standards>.

BUGS

This conformance checker is a work in progress; it might not detect every error in the DOM tree, and it might report an error for a node that is in fact conforming.

NOTES ON IMPLEMENTATION DETAILS

This section is not complete.

This section describes various internal data structures used in Whatpm::ContentChecker and relevant modules. These data structures are not public interfaces -- they should not be accessed or modified by applications. They are documented here only for the convenience of developers.

The $self->{flag} Structure

$self->{flag}->{has_label}

This flag is set to a true value if and only if there is a label element ancestor of the current node.

$self->{flag}->{has_labelable}

This flag is set to 1 if and only if the nearest ancestor label element has the for attribute and no labelable form-associated element that is a descendant of that label element precedes the current node in tree order.

This flag is set to 2 if and only if a labelable form-associated element that is a descendant of the nearest ancestor label element of the current node precedes the current node in tree order.

Otherwise, this flag is set to a false value. However, when the current node has no ancestor label element, i.e. when $self->{flag}->{has_label} is false, the value of $self->{flag}->{has_labelable} is undefined.

The $element_state Structure

$element_state->{has_label_original}

Used to preserve the value of $self->{flag}->{has_label} at the time of invocation of the method element_start for the element being checked.

$element_state->{has_labelable_original}

Used to preserve the value of $self->{flag}->{has_labelable} at the time of invocation of the method element_start for the element being checked.
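
The interplay of the two flags and their preserved copies can be sketched with a toy checker. This is a simplified model and not the actual Whatpm code: the class My::ToyChecker, the method element_end, the restore step, and the hash-based element representation are all assumptions made for illustration.

```perl
use strict;
use warnings;

package My::ToyChecker;  # hypothetical class, not part of Whatpm

sub new { my ($class) = @_; return bless {flag => {}}, $class }

# $el is a plain hash such as {name => 'label', for => 'id'}.
sub element_start {
  my ($self, $el, $element_state) = @_;
  if ($el->{name} eq 'label') {
    # Preserve the flags as they are when element_start is invoked.
    $element_state->{has_label_original} = $self->{flag}->{has_label};
    $element_state->{has_labelable_original}
        = $self->{flag}->{has_labelable};
    $self->{flag}->{has_label} = 1;
    # 1 if the label has a for attribute, a false value otherwise.
    $self->{flag}->{has_labelable} = defined $el->{for} ? 1 : 0;
  } elsif ($el->{labelable} and $self->{flag}->{has_label}) {
    # From here on, a labelable element precedes every later node
    # inside the nearest ancestor label.
    $self->{flag}->{has_labelable} = 2;
  }
}

# Assumed counterpart of element_start: restores the preserved flags
# when a label element has been fully processed.
sub element_end {
  my ($self, $el, $element_state) = @_;
  return unless $el->{name} eq 'label';
  $self->{flag}->{has_label} = $element_state->{has_label_original};
  $self->{flag}->{has_labelable}
      = $element_state->{has_labelable_original};
}

package main;

my $checker = My::ToyChecker->new;
my $label = {name => 'label'};                  # no for attribute
my $input = {name => 'input', labelable => 1};
my (%label_state, %input_state);

$checker->element_start ($label, \%label_state);
my $inside_before = $checker->{flag}->{has_labelable};  # false
$checker->element_start ($input, \%input_state);
$checker->element_end ($input, \%input_state);
my $inside_after = $checker->{flag}->{has_labelable};   # 2
$checker->element_end ($label, \%label_state);
my $outside = $checker->{flag}->{has_label};    # restored to undef
```

In this model only label elements save and restore the flags, so a value of 2 set by a labelable element remains visible to the nodes that follow it inside the same label.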

SEE ALSO

Whatpm::ContentChecker::Atom

Whatpm::ContentChecker::HTML

<https://suika.suikawiki.org/gate/2005/sw/Whatpm%20Error%20Types>

AUTHOR

Wakaba <wakaba@suikawiki.org>

LICENSE

Copyright 2007-2011 Wakaba <wakaba@suikawiki.org>.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.