manakai

Whatpm::CSS::SelectorsParser

A Selectors Parser

SYNOPSIS

  use Whatpm::CSS::SelectorsParser;
  my $parser = Whatpm::CSS::SelectorsParser->new;
  $parsed_selectors = $parser->parse_string ($selectors);

DESCRIPTION

Whatpm::CSS::SelectorsParser is a parser for Selectors, the element pattern language used in CSS. It parses a Selectors string into a parsed data structure if the input is valid, or reports a parse error otherwise. In addition, it provides a method to compute the specificity of a parsed selector.

METHODS

$parser = Whatpm::CSS::SelectorsParser->new

Creates a new instance of the Selectors parser.

$parsed = $parser->parse_string ($selectors)

Parses a character string. If it is a valid group of selectors, the method returns the parsed group of selectors data structure. Otherwise, it returns undef.

$specificity = $parser->get_selector_specificity ($parsed_selector)

Returns the specificity of a parsed selector data structure. Note that the input has to be a selector, not a group of selectors.

The return value is an array reference with four values: The style attribute flag (always 0), a, b, and c.
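
Because the specificity is a four-item array reference, two specificities can be compared item by item, most significant first. The following is a minimal sketch of such a comparison; the function name compare_specificity and the sample values are assumptions for illustration and are not part of this module.

```perl
use strict;
use warnings;

# Compare two specificity array references of the form
# [style_flag, a, b, c], most significant item first.
# Returns -1, 0, or 1, like the <=> operator.
# (Hypothetical helper; not provided by Whatpm::CSS::SelectorsParser.)
sub compare_specificity {
  my ($x, $y) = @_;
  for my $i (0 .. 3) {
    my $cmp = $x->[$i] <=> $y->[$i];
    return $cmp if $cmp;
  }
  return 0;
}

# Hand-written sample values, not actual output of the parser.
my $one_id       = [0, 1, 0, 0];
my $many_classes = [0, 0, 3, 2];
print compare_specificity ($one_id, $many_classes), "\n";
```

In real use, the two array references would come from get_selector_specificity, called once for each selector in a parsed group.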

PARAMETERS

The following parameters can be specified on a parser object:

$parser->{href} = URL

The URL in which the input selectors string is found, if available. This value is used to set the href parameter to the $parser->{onerror} handler.

$parser->{lookup_namespace_uri} = CODE

The CODE reference used to resolve namespace prefixes and to obtain the default namespace.

The code is invoked during parsing with one argument. If the argument is undef, the code must return the default namespace URL. Otherwise, it must return the namespace URL bound to the specified namespace prefix.

If the namespace URL is explicitly given for the prefix (or the default namespace), the URL must be returned. If the prefix (or the default namespace) is bound to the null namespace, the empty string must be returned. (Note that this is incompatible with the lookup_namespace_uri method on the Node object.) Otherwise, i.e. if the namespace prefix (or the default namespace) is not bound to any namespace, undef must be returned.
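
The three possible return values (a URL, the empty string, or undef) can be served from a single lookup table. The following is a minimal sketch under that assumption; the %binding table, its prefixes, and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical bindings. The key '' stands for the default namespace.
# A value of '' means "bound to the null namespace"; a missing key
# means "not bound to any namespace".
my %binding = (
  ''     => 'http://www.w3.org/1999/xhtml',
  'svg'  => 'http://www.w3.org/2000/svg',
  'none' => '',
);

my $lookup_namespace_uri = sub {
  my ($prefix) = @_;
  # An undef argument asks for the default namespace.
  my $key = defined $prefix ? $prefix : '';
  return $binding{$key}; # undef if the key does not exist
};

# With a parser object, the handler would be installed as:
#   $parser->{lookup_namespace_uri} = $lookup_namespace_uri;
```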

$parser->{onerror} = CODE

The CODE reference to which any errors and warnings during parsing are reported. The code receives the following name-value pairs:

type (string, always specified)

A short string describing the kind of the error. Descriptions of error types are available at <https://suika.suikawiki.org/gate/2007/html/error-description#{type}>, where {type} is an error type string.

For the list of error types, see <https://suika.suikawiki.org/gate/2007/html/error-description#langtag-errors>.

level (string, always specified)

A character representing the level or severity of the error, which is one of the following characters: m (violation of a MUST-level requirement), s (violation of a SHOULD-level requirement), w (a warning), and i (an informational notification).

token (always specified)

A Whatpm::CSS::Tokenizer token where the error is detected.

uri (a reference to string, possibly missing)

The URL in which the input selectors string is found. The value is always the same as $parser->{href} in this parser.

value (string, possibly missing)

A part of the input, in which an error is detected.
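
Since the handler receives name-value pairs, it can copy its arguments into a hash. The following is a minimal sketch of a handler that collects formatted messages; the @errors array and the message format are assumptions for illustration.

```perl
use strict;
use warnings;

my @errors; # collected messages (hypothetical storage)

my $onerror = sub {
  my %opt = @_; # name-value pairs: type, level, token, uri, value
  my $message = "[$opt{level}] $opt{type}";
  $message .= qq{ near "$opt{value}"} if defined $opt{value};
  if (defined $opt{uri}) {
    # uri is documented as a reference to a string; a plain string
    # is also tolerated here, as a defensive assumption.
    my $uri = ref $opt{uri} ? ${$opt{uri}} : $opt{uri};
    $message .= " in $uri" if defined $uri;
  }
  push @errors, $message;
};

# With a parser object, the handler would be installed as:
#   $parser->{onerror} = $onerror;
```

Collecting messages instead of printing them lets the caller decide, after parse_string returns, whether m-level errors should be treated differently from warnings.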

$parser->{pseudo_class} = {class_name => 1, class_name => 1, ...}

The list of pseudo-classes supported by the implementation, represented as a hash reference, where the hash key is the lowercased pseudo-class name and the hash value is a boolean representing whether the pseudo-class is supported or not. Any pseudo-class that is not supported by both the parser and the implementation (as declared by this parameter) is ignored, and the entire group of selectors is considered invalid for the purpose of parsing.

$parser->{pseudo_element} = {element_name => 1, element_name => 1, ...}

The list of pseudo-elements supported by the implementation, represented as a hash reference, where the hash key is the lowercased pseudo-element name and the hash value is a boolean representing whether the pseudo-element is supported or not. Any pseudo-element that is not supported by both the parser and the implementation (as declared by this parameter) is ignored, and the entire group of selectors is considered invalid for the purpose of parsing.
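
Because the hash keys have to be lowercased names, it can be convenient to build both hashes from plain lists. The following is a minimal sketch; the names in @classes and @elements are hypothetical examples of what a host implementation might support.

```perl
use strict;
use warnings;

# Hypothetical lists of what the host implementation can match.
my @classes  = qw(hover Active first-child);
my @elements = qw(first-line Before);

# Lowercase each name and map it to a true value.
my $pseudo_class   = { map { (lc ($_) => 1) } @classes };
my $pseudo_element = { map { (lc ($_) => 1) } @elements };

# With a parser object, the parameters would be set as:
#   $parser->{pseudo_class} = $pseudo_class;
#   $parser->{pseudo_element} = $pseudo_element;
```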

DATA STRUCTURES

A group of selectors

The parse_string method returns an array reference, which contains one or more selector data structures. They correspond to the selectors in the original group of selectors string, in order.

A selector

A selector is represented as an array reference, which contains pairs of a combinator constant and a sequence of simple selectors data structure. They correspond to the sequences of simple selectors and the combinators that appear in the original selector string, in order. Note that the first (index 0) item is always the descendant combinator constant.

The constants below represent the types of combinators.

DESCENDANT_COMBINATOR

A descendant combinator.

CHILD_COMBINATOR

A child combinator.

ADJACENT_SIBLING_COMBINATOR

An adjacent sibling combinator.

GENERAL_SIBLING_COMBINATOR

A general sibling combinator.

The exporter tag :combinator can be used to export all of these constants:

  use Whatpm::CSS::SelectorsParser qw(:combinator);
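
A selector can be walked two items at a time. The following is a minimal sketch that assumes the pairs are stored flat in the array, i.e. (combinator, sequence, combinator, sequence, ...); the function name selector_pairs is hypothetical, and the strings in the sample data are stand-ins for the real combinator constants and simple selector data structures.

```perl
use strict;
use warnings;

# Split a selector array reference into [combinator, sequence] pairs,
# assuming the flat layout (combinator, sequence, combinator, ...).
# (Hypothetical helper; not provided by Whatpm::CSS::SelectorsParser.)
sub selector_pairs {
  my ($selector) = @_;
  my @pairs;
  for (my $i = 0; $i < @$selector; $i += 2) {
    push @pairs, [$selector->[$i], $selector->[$i + 1]];
  }
  return @pairs;
}

# Stand-in data: placeholder strings, not actual output of the parser.
my $selector = ['DESCENDANT', ['item 1'], 'CHILD', ['item 2', 'item 3']];
for my $pair (selector_pairs ($selector)) {
  my ($combinator, $sequence) = @$pair;
  print "$combinator: ", scalar @$sequence, " simple selector(s)\n";
}
```

With real parser output, the combinator in each pair would be compared against the exported constants, for example with == CHILD_COMBINATOR.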

A sequence of simple selectors

A sequence of simple selectors is represented as an array reference, which contains simple selector data structures. They correspond to the simple selectors in the original sequence of simple selectors string, in order.

A simple selector

A simple selector is represented as an array reference whose first (index 0) item is the type of simple selector and the following items are arguments to the simple selector.

The constants below represent the types of simple selectors (or parts of simple selectors).

NAMESPACE_SELECTOR

The namespace specification in a type or universal selector. The first argument (item of index 1) is the namespace URL (or undef for the null namespace).

LOCAL_NAME_SELECTOR

The local name specification in a type selector. The first argument (item of index 1) is the local name.

ID_SELECTOR

An ID selector. The first argument (item of index 1) is the ID.

CLASS_SELECTOR

A class selector. The first argument (item of index 1) is the class.

PSEUDO_CLASS_SELECTOR

A pseudo-class selector. The first argument (item of index 1) is the pseudo-class name in lowercase. If the pseudo-class takes a string or identifier argument (e.g. :lang() or :contains()), the second argument (item of index 2) is that argument (with no case folding). Otherwise, if the pseudo-class takes an an+b argument (e.g. :nth-child()), the second argument (item of index 2) represents the a value and the third argument (item of index 3) represents the b value; even an incomplete argument is normalized to this form. If the pseudo-class takes a simple selector (e.g. :not()), any arguments (the zero or more items with index 2 or more) are simple selector data structures.

For example, the simple selector data structure for :NOT(a|b) would contain four items: constant PSEUDO_CLASS_SELECTOR, string not, the namespace selector for the namespace a, the local name selector with local name b.

PSEUDO_ELEMENT_SELECTOR

A pseudo-element specification. The first argument (item of index 1) is the pseudo-element name in lowercase.

ATTRIBUTE_SELECTOR

An attribute selector. The first argument (item of index 1) is the attribute name. The second argument (item of index 2) is the type of matching.

The constants below represent the types of matches used in attribute selectors.

EXISTS_MATCH

Match by the existence of an attribute.

EQUALS_MATCH

Exact match. The third argument (item of index 3) is the expected value.

INCLUDES_MATCH

Includes match (typically used for class attributes). The third argument (item of index 3) is the expected value.

DASH_MATCH

Dash match (typically used for language tag attributes). The third argument (item of index 3) is the expected value.

PREFIX_MATCH

Prefix match. The third argument (item of index 3) is the expected value.

SUFFIX_MATCH

Suffix match. The third argument (item of index 3) is the expected value.

SUBSTRING_MATCH

Substring match. The third argument (item of index 3) is the expected value.

The exporter tag :match can be used to export all of these constants:

  use Whatpm::CSS::SelectorsParser qw(:match);

The exporter tag :selector can be used to export all of the simple selector type constants:

  use Whatpm::CSS::SelectorsParser qw(:selector);

SEE ALSO

Selectors <http://www.w3.org/TR/selectors/>.

The CSS syntax <http://www.w3.org/TR/CSS21/syndata.html>.

The style attribute specificity <http://www.w3.org/TR/CSS21/cascade.html#specificity>.

Supported standards - Selectors <https://suika.suikawiki.org/gate/2007/html/standards#selectors>.

AUTHOR

Wakaba <wakaba@suikawiki.org>.

LICENSE

Copyright 2007-2011 Wakaba <wakaba@suikawiki.org>.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.