Contents of /messaging/manakai/lib/Message/DOM/XMLParser.dis

Revision 1.1
Wed Dec 28 03:50:17 2005 UTC by wakaba
Branch: MAIN
++ manakai/lib/Message/Util/ChangeLog	28 Dec 2005 03:36:58 -0000
2005-12-27  Wakaba  <wakaba@suika.fam.cx>

	* PerlCode.dis (PCReference): New interface.

++ manakai/lib/Message/Util/DIS/ChangeLog	28 Dec 2005 03:49:57 -0000
2005-12-28  Wakaba  <wakaba@suika.fam.cx>

	* DPG.dis (plCodeFragment): Pushes |pop @ch| rather
	than |$ch| if an acception clause follows a code block
	generated from a child match statement so that the tokenizer
	does the correct thing.  Support for the |builtin:nestedBlockAsText|
	special rule is removed.  |pg:ifTrueStatement| is
	supported.  |pg:returnStatement| is supported.

2005-12-27  Wakaba  <wakaba@suika.fam.cx>

	* DPG.dis (VALIDITY_ERR): New error code.
	($merge_adjacent_range): New subfunction.  Adjacent character
	ranges are now merged so that |if| conditions in
	scanner methods are simplified.
	($array_uniq): New subfunction.  Filters states so that
	state numbers don't explode.
	(plCodeFragment): Now a DPG parser can have more than
	one tokenizer.  The |pg:qMatch| rvalue is supported.
	(DPGMatchStatementElement): The match statement
	now accepts attributes.
	(DPGRuleRefStatementElement): The ruleref statement
	now accepts attributes.
	(AttributeSpecificationList): The attribute specification
	now accepts a |NAME| as an attribute value as well
	as |STRING|s.
	(CodeBlock): The |pg:qLexmodeStatement| is supported.
	(lexmode.default): |QKEYWORD| token type is added.
	(ErrDef): Errors |unique-lexmode-name-error|, |rule-not-defined-error|,
	|lexmode-not-defined-error| and properties added.

2005-12-25  Wakaba  <wakaba@suika.fam.cx>

	* DPG.dis (dpgGetAttributeList): New method.
	(plCodeFragment): |:terminator?| option for a match
	block is supported.  |pg:matchElseBlock| is implemented.

++ manakai/lib/Message/DOM/ChangeLog	28 Dec 2005 03:36:24 -0000
2005-12-28  Wakaba  <wakaba@suika.fam.cx>

	* XMLParser.dis: Some modifications made.

2005-12-27  Wakaba  <wakaba@suika.fam.cx>

	* DOMLS.dis (PARSE_ERR, SERIALIZE_ERR): They are now
	global named resources.

	* Makefile: Rules to make |XMLParser.pm| are added.

	* XMLParser.dis: New file.

2005-12-24  Wakaba  <wakaba@suika.fam.cx>

	* DOMCore.dis (ManakaiDOMError._FORMATTER_PACKAGE_): Error
	message formatter can now vary by error types.
	(DOMLocator.utf32Offset): New (manakai extended) attribute.
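
The 2005-12-27 DPG.dis entry above mentions merging adjacent character ranges (so that scanner |if| conditions become simpler) and filtering duplicate states. The following is only a minimal standalone Perl sketch of those two ideas, not the code in DPG.dis: the names `merge_adjacent_ranges` and `array_uniq`, and the `[$first, $last]` representation of a range, are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not taken from DPG.dis): merge character ranges
# that overlap or touch.  Each range is an array reference
# [$first, $last] of code points.
sub merge_adjacent_ranges {
  my @sorted = sort { $a->[0] <=> $b->[0] or $a->[1] <=> $b->[1] }
               map { [ @$_ ] } @_;   # copy so the caller's data is untouched
  my @merged;
  for my $range (@sorted) {
    if (@merged and $range->[0] <= $merged[-1]->[1] + 1) {
      # Overlapping or directly adjacent: extend the previous range.
      $merged[-1]->[1] = $range->[1] if $range->[1] > $merged[-1]->[1];
    } else {
      push @merged, $range;
    }
  }
  return @merged;
}

# Hypothetical helper: drop duplicate values, keeping first-seen order.
sub array_uniq {
  my %seen;
  return grep { not $seen{$_}++ } @_;
}

# U+0041..U+005A, U+005B..U+0060 and U+0061..U+007A touch each other,
# so they collapse into a single range.
my @ranges = merge_adjacent_ranges ([0x61, 0x7A], [0x41, 0x5A], [0x5B, 0x60]);
printf "U+%04X..U+%04X\n", @$_ for @ranges;
```

With merged ranges, a generated scanner condition needs one pair of comparisons per merged range instead of one pair per original range, which is presumably the simplification the entry refers to.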

1 wakaba 1.1 Module:
2     @QName: MDOM|XMLParser
3     @Namespace:
4     http://suika.fam.cx/~wakaba/archive/2004/dom/xml-parser#
5    
6     @FullName:
7     @@lang:en
8     @@@: XML Parser
9    
10     @DISCore:author: DISCore|Wakaba
11     @License: license|Perl+MPL
12     @Date:
13     $Date: 2005/11/24 15:05:48 $
14    
15     @DefaultFor: ManakaiDOM|ManakaiDOMLatest
16    
17     @Require:
18     @@Module:
19     @@@QName: MDOM|DOMLS
20     @@@WithFor: ManakaiDOM|ManakaiDOMLatest
21    
22     Namespace:
23     @dis:
24     http://suika.fam.cx/~wakaba/archive/2004/8/18/lang#dis--
25     @DOMCore:
26     http://suika.fam.cx/~wakaba/archive/2004/8/18/dom-core#
27     @DOMMain:
28     http://suika.fam.cx/~wakaba/archive/2004/dom/main#
29     @dx:
30     http://suika.fam.cx/~wakaba/archive/2005/manakai/Util/Error/DOMException#
31     @ecore:
32     http://suika.fam.cx/~wakaba/archive/2005/manakai/Util/Error/Core/
33     @f:
34     http://suika.fam.cx/~wakaba/archive/2004/dom/feature#
35     @idl:
36     http://suika.fam.cx/~wakaba/archive/2004/dis/IDL#
37     @infoset:
38     http://www.w3.org/2001/04/infoset#
39     @lang:
40     http://suika.fam.cx/~wakaba/archive/2004/8/18/lang#
41     @license:
42     http://suika.fam.cx/~wakaba/archive/2004/8/18/license#
43     @LSEV:
44     http://www.w3.org/2002/DOMLS
45     @ManakaiDOM:
46     http://suika.fam.cx/~wakaba/archive/2004/8/18/manakai-dom#
47     @ManakaiDOMLS:
48     http://suika.fam.cx/~wakaba/archive/2004/mdom-ls#
49     @MDOM:
50     http://suika.fam.cx/~wakaba/archive/2004/8/18/manakai-dom#ManakaiDOM.
51     @MDOMX:
52     http://suika.fam.cx/~wakaba/archive/2004/8/4/manakai-dom-exception#
53     @rdf:
54     http://www.w3.org/1999/02/22-rdf-syntax-ns#
55     @rdfs:
56     http://www.w3.org/2000/01/rdf-schema#
57     @t:
58     http://suika.fam.cx/~wakaba/archive/2004/dom/tree#
59     @xml:
60     http://www.w3.org/XML/1998/namespace
61     @xmlns:
62     http://www.w3.org/2000/xmlns/
63     @xp:
64     http://suika.fam.cx/~wakaba/archive/2004/dom/xml-parser#
65    
66     ## -- Features
67    
68     ElementTypeBinding:
69     @Name: FeatureDef
70     @ElementType:
71     dis:ResourceDef
72     @ShadowContent:
73     @@rdf:type: f|Feature
74     @@For: =ManakaiDOM|all
75    
76     ElementTypeBinding:
77     @Name: FeatureVerDef
78     @ElementType:
79     dis:ResourceDef
80     @ShadowContent:
81     @@rdf:type: f|Feature
82    
83     ElementTypeBinding:
84     @Name: featureQName
85     @ElementType:
86     f:name
87     @ShadowContent:
88     @@ContentType: DISCore|QName
89    
90     ResourceDef:
91     @QName: DOMString
92     @AliasFor: DOMMain|DOMString
93     @For: ManakaiDOM|DOM
94    
95     ResourceDef:
96     @QName: Node
97     @AliasFor: t|Node
98     @For: ManakaiDOM|DOM
99    
100     ResourceDef:
101     @QName: Element
102     @AliasFor: t|Element
103     @For: ManakaiDOM|DOM
104    
105     ResourceDef:
106     @QName: Document
107     @AliasFor: t|Document
108     @For: ManakaiDOM|DOM
109    
110     ElementTypeBinding:
111     @Name: ClsDef
112     @ElementType:
113     dis:ResourceDef
114     @ShadowContent:
115     @@rdf:type:
116     @@@@: dis|MultipleResource
117     @@@ForCheck: !ManakaiDOM|ForIF !ManakaiDOM|ForClass
118     @@resourceFor:
119     @@@@: ManakaiDOM|ForClass
120     @@@ForCheck: ManakaiDOM|ManakaiDOM !=ManakaiDOM|ManakaiDOM
121     @@For: ManakaiDOM|DOM3
122     @@For: =ManakaiDOM|ManakaiDOM
123    
124     @@rdf:type:
125     @@@@: DISLang|Class
126     @@@ForCheck: ManakaiDOM|ForClass
127    
128     ElementTypeBinding:
129     @Name: ClsQName
130     @ElementType:
131     dis:QName
132     @ShadowContent:
133     @@ForCheck: ManakaiDOM|ForClass
134    
135     ElementTypeBinding:
136     @Name: ClsISA
137     @ElementType:
138     dis:ISA
139     @ShadowContent:
140     @@ForCheck: ManakaiDOM|ForClass
141    
142     ElementTypeBinding:
143     @Name: nullCase
144     @ElementType:
145     dis:ResourceDef
146     @ShadowContent:
147     @@rdf:type: ManakaiDOM|InCase
148     @@Value:
149     @@@is-null:1
150    
151     ResourceDef:
152     @QName: LSParser
153     @AliasFor: DOMLS|LSParser
154     @For: ManakaiDOM|DOM3
155    
156     ClsDef:
157     @ClsQName: ManakaiXMLParser
158    
159     @Implement: DOMLS|LSParser
160    
161     @f:implements:
162     @@@: DOMLS|LSFeature30
163     @@For: ManakaiDOM|DOM3
164    
165     @DISLang:role: DOMLS|ParserRole
166    
167     @Attr:
168     @@Name: domConfig
169     @@enDesc:
170     The configuration of the parser.
171    
172     @@Get:
173     @@@Type: DOMCore|DOMConfiguration
174     @@@enDesc: The DOM configuration object.
175     @@@PerlDef:
176     __CODE{DOMCore|getConfigObject::
177     $target => $self,
178     $targetHash => $self,
179     $targetType => {<IFName::LSParser>},
180     $result => $r,
181     }__;
182    
183     @Method:
184     @@ManakaiDOM:isForInternal:1
185     @@ForCheck: ManakaiDOM|ForClass
186     @@Operator: DISPerl|NewMethod
187     @@enDesc:
188     Creates a new instance of the object.
189     @@Param:
190     @@@Name: impl
191     @@@Type: DOMLS|GLSImplementation
192     @@@enDesc:
193     The implementation from which the parser is created.
194     @@Param:
195     @@@Name: features
196     @@@Type: DOMString
197     @@@dis:actualType: f|FeaturesString
198     @@@enDesc:
199     The set of features requested for the parser.
200     @@Return:
201     @@@Type: DOMMain|DOMObject
202     @@@dis:actualType: LSParser
203     @@@enDesc:
204     The newly created parser.
205     @@@PerlDef:
206     $r = bless {
207     <H::DOMCore:implementation> => $impl,
208     }, $self;
209    
210     @Method:
211     @@Name: parseString
212     @@enImplNote:
213     Non-standard - to be removed
214    
215     @@Param:
216     @@@Name: sourceText
217     @@@Type: DOMString
218     @@Return:
219     @@@Type: Document
220     @@@PerlDef:
221    
222     $self->{char} = [];
223     $self->{token} = [];
224     $self->{source} = $sourceText;
225    
226     __DEEP{
227     $r = $self->_parse_DocumentEntity
228     ($self->{<H::DOMCore:implementation>});
229     }__;
230    
231     @Method:
232     @@Name: shiftChar
233     @@ManakaiDOM:isForInternal:1
234     @@ForCheck: ManakaiDOM|ForClass
235     @@enDesc:
236     Returns the next character.
237     @@Return:
238     @@@Type: idl|long||ManakaiDOM|all
239     @@@enDesc:
240     The code position number of the next character, if any,
241     or <CODE::-1>.
242     @@@PerlDef:
243     if (@{$self->{char}}) {
244     $r = shift @{$self->{char}};
245     } else {
246     my $char = substr ($self->{source}, pos ($self->{source}), 1);
247     pos ($self->{source})++;
248    
249     if (length $char) {
250     $r = ord $char;
251     } else {
252     $r = -1;
253     }
254     }
255    
256     @Method:
257     @@ManakaiDOM:isForInternal: 1
258     @@Operator: ManakaiDOM|MUErrorHandler
259     @@enDesc:
260     When a <IF::ecore|ErrorInterface||ManakaiDOM|Perl> is <Perl::report>ed,
261     then this method is invoked.
262    
263     The method calls the <cfg::DOMCore|error-handler> if the error is a
264     <IF::DOMCore|DOMError>. Otherwise, the error is re-thrown so that the
265     corresponding <Perl::catch> clause, if any, can catch the error.
266     @@Param:
267     @@@Name: err
268     @@@Type: ecore|ErrorInterface||ManakaiDOM|Perl
269     @@@enDesc:
270     The reported error object.
271     @@Return:
272     @@@Type: DISPerl|Any
273     @@@enDesc:
274     If the <P::err> is a <IF::DOMCore|DOMError>, then the return value
275     of the error handler.
276    
277     {NOTE:: If the error is thrown, the method never returns.
278     }
279     @@@nullCase:
280     @@@@enDesc:
281     No error handler.
282     @@@PerlDef:
283     if ($err->isa (<IFName::DOMCore|DOMError||ManakaiDOM|ManakaiDOM>)) {
284     __DEEP{
285     A: {
286     my $cfg = $self-><AG::LSParser.domConfig>;
287     my $h = $cfg-><M::DOMCore|DOMConfiguration.getParameter>
288     ('error-handler');
289     $r = $h-><M::DOMCore|DOMErrorHandler.handleError> ($err);
290     } # A
291     }__;
292     } else {
293     $err-><M::ecore|ErrorInterface||ManakaiDOM|Perl.throw>;
294     }
295    
296     @DISPerl:dpgDef:
297    
298     /*
299     XML Document Entity
300    
301     document := prolog element *Misc
302     - *Char RestrictedChar *Char ;; [1]
303     */
304     rule DocumentEntity ($impl) : standalone {
305     my $doc : return;
306    
307     lang:Perl {
308     $doc = $impl-><M::DOMImpl.createDocument>;
309     $doc-><AS::Document.strictErrorChecking> (false);
310     }
311    
312     /*
313     prolog := XMLDecl? *Misc [doctypedecl *Misc] ;; [22]
314     */
315     ?lexmode 'DocumentStart';
316    
317     &XMLDeclarationOpt ($doc => $doc);
318    
319     ?lexmode 'DocumentProlog';
320    
321     // *Misc
322     ~* (COMO) {
323     &_CommentDeclaration_ ($doc => $doc, $parent => $doc);
324    
325     ~ (MDC) {
326     ?lexmode DocumentProlog;
327     } else {
328     ?lexmode DocumentProlog;
329     }
330     } (PIO) {
331     &_ProcessingInstruction_ ($doc => $doc, $parent => $doc);
332    
333     ~ (PIC) {
334     ?lexmode 'DocumentProlog';
335     } else {
336     ?lexmode DocumentProlog;
337     }
338     } (S) {
339     //
340     }
341    
342     // doctypedecl
343     ~? (MDO) {
344     &_DocumentTypeDeclaration_ ($doc => $doc);
345    
346     ~ (MDC) { }
347     }
348    
349     ?lexmode 'DocumentMisc';
350    
351     // *Misc
352     ~* (COMO) {
353     &_CommentDeclaration_ ($doc => $doc, $parent => $doc);
354    
355     ~ (MDC) {
356     ?lexmode DocumentMisc;
357     } else {
358     ?lexmode DocumentMisc;
359     }
360     } (PIO) {
361     &_ProcessingInstruction_ ($doc => $doc, $parent => $doc);
362    
363     ~ (PIC) {
364     ?lexmode 'DocumentMisc';
365     } else {
366     ?lexmode DocumentMisc;
367     }
368     } (S) {
369     //
370     }
371    
372     // Document element
373     ~ (STAGO) {
374     &Element_ ($doc => $doc, $parent => $doc)
375     : unshift-current-token;
376     ~ (TAGC) {
377     ?lexmode DocumentEnd;
378     } else {
379     ?lexmode DocumentEnd;
380     }
381     } else {
382     ?lexmode 'DocumentEnd';
383     }
384    
385     // *Misc
386     ~* (COMO) {
387     &_CommentDeclaration_ ($doc => $doc, $parent => $doc);
388    
389     ~ (MDC) {
390     ?lexmode DocumentEnd;
391     } else {
392     ?lexmode DocumentEnd;
393     }
394     } (PIO) {
395     &_ProcessingInstruction_ ($doc => $doc, $parent => $doc);
396     ~ (PIC) {
397     ?lexmode 'DocumentEnd';
398     } else {
399     ?lexmode DocumentEnd;
400     }
401     } (S) {
402     //
403     }
404    
405     ~ (#EOF) { }
406    
407     lang:Perl {
408     if ($self->{has_error}) {
409     __EXCEPTION{DOMLS|PARSE_ERR::
410     }__;
411     }
412    
413     $doc-><AS::Document.strictErrorChecking> (true);
414     }
415     } // DocumentEntity
416    
417     /*
418     XML Declaration
419    
420     XMLDecl := '<?xml' VersionInfo
421     [EncodingDecl]
422     [SDDecl]
423     [S] '?>' ;; [23]
424    
425     NOTE: XML declaration is optional in XML 1.0
426     while it is required in XML 1.1.
427     */
428    
429     rule XMLDeclarationOpt ($doc) {
430     ~? (XDO) {
431     &_XMLDeclaration ($doc => $doc);
432     }
433     } // XMLDeclarationOpt
434    
435     rule XMLDeclaration ($doc) {
436     ~ (XDO) {
437     &_XMLDeclaration ($doc => $doc);
438     }
439     } // XMLDeclaration
440    
441     rule _XMLDeclaration ($doc) {
442     ?lexmode 'TAG';
443    
444     // TODO: implement this
445    
446     } // _XMLDeclaration
447    
448     /*
449     Document Type Declaration
450     */
451     rule _DocumentTypeDeclaration_ ($doc) {
452     ?lexmode 'DocumentTypeDeclaration';
453    
454     ~ (Name == 'DOCTYPE') { }
455    
456     ~ (S) { }
457    
458     // Document type name
459     ~ (Name) {
460    
461     }
462    
463     // TODO: Implement this
464    
465     // ~ (MDC) { }
466     } // _DocumentTypeDeclaration_
467    
468     /*
469     Comment Declaration
470    
471     Comment := '<!--' *(Char - '-' / '-' (Char - '-'))
472     '-->' ;; [15]
473     */
474     rule _CommentDeclaration_ ($doc, $parent) {
475     ?lexmode 'CommentDeclaration';
476    
477     ~? (STRING) {
478     lang:Perl ($data => $token.value) {
479     my $com = $doc-><M::Document.createComment> ($data);
480     $parent-><M::Node.appendChild> ($com);
481     }
482     } else {
483     lang:Perl {
484     my $com = $doc-><M::Document.createComment> ('');
485     $parent-><M::Node.appendChild> ($com);
486     }
487     }
488    
489     ~ (COM) {
490     ?lexmode MarkupDeclaration;
491     } else {
492     ?lexmode MarkupDeclaration;
493     }
494    
495     // ~ (MDC) { }
496     } // _CommentDeclaration_
497    
498     /*
499     Processing Instruction
500    
501     PI := '<?' PITarget [S *Char - *Char '?>' *Char]
502     '?>' ;; [16]
503     */
504     rule _ProcessingInstruction_ ($doc, $parent) {
505     ?lexmode 'PIName';
506    
507     my $pi;
508    
509     ~ (Name) {
510     lang:Perl ($name => $token.value) {
511     if (lc $name eq 'xml') {
512     ## TODO: Well-formedness (syntax) error
513     }
514     ## TODO: Namespace well-formedness
515     $pi = $doc-><M::Document.createProcessingInstruction>
516     ($name);
517     }
518     }
519    
520     ~ (S) {
521     ?lexmode 'PIData';
522    
523     my $tdata;
524    
525     ~? (DATA) {
526     lang:Perl ($data => $token.value) {
527     $tdata = $data;
528     }
529     } else {
530     lang:Perl {
531     $tdata = '';
532     }
533     }
534    
535     ~? (QUESTION) {
536     lang:Perl ($data => $token.value) {
537     $tdata .= $data;
538     }
539     }
540    
541     lang:Perl {
542     $pi-><AS::Node.nodeValue> ($tdata);
543     }
544     }
545    
546     lang:Perl {
547     $parent-><M::Node.appendChild> ($pi);
548     ## TODO: PIs in document type declaration subsets
549     }
550    
551     // ~ (PIC) { }
552     } // _ProcessingInstruction_
553    
554     /*
555     Element content parsing mode
556    
557     element := EmptyElemTag /
558     STag content ETag ;; [39]
559     content := (CharData / element / Reference / CDSect /
560     PI / Comment) ;; [43]
561     */
562     rule Element_ ($doc, $parent) : standalone {
563     ?lexmode 'ElementContent';
564    
565     my $node; // Current "parent" node
566     my $nodes; // Node stack (w/o $node)
567     my $type; // Current "parent" element type QName
568     my $types; // Element type stack (w/o $type)
569     my $ns; // Current in-scope namespace bindings
570     my $nses; // Namespace binding stack (w/o $ns)
571    
572     lang:Perl {
573     $node = $parent;
574     $nodes = [];
575     $type = '';
576     $types = [];
577     $ns = {
578     xml => <Q::xml:>,
579     xmlns => <Q::xmlns:>,
580     };
581     $nses = [];
582     }
583    
584     ~* : name => CONTENT
585     (CharData) {
586     // Character data
587     lang:Perl ($data => $token.value) {
588     if (index ($data, ']]>') > -1) {
589     ## TODO: Well-formedness (syntax) error
590     }
591     $node-><M::Node.appendChild>
592     ($doc-><M::Document.createTextNode> ($data));
593     }
594     } (STAGO) {
595     // Start tag or empty element tag
596    
597     ?lexmode 'StartTag';
598    
599     ~ (Name) {
600     my $attrs;
601     lang:Perl ($name => $token.value) {
602     push @{$types}, $type;
603     $type = $name;
604     $attrs = {};
605     }
606    
607     ~? (S) {
608     &AttributeSpecificationList
609     ($doc => $doc, $attrs => $attrs);
610     }
611    
612     my $el;
613    
614     lang:Perl {
615     push @{$nses}, $ns;
616     $ns = {%$ns};
617    
618     my %gattr;
619     my %lattr;
620     for my $atqname (keys %$attrs) {
621     my ($pfx, $lname) = split /:/, $atqname;
622     if (defined $lname) { ## Global attribute
623     ## TODO: Namespace well-formedness (lname is NCName)
624     if ($pfx eq 'xmlns') {
625     my $nsuri = $attrs->{$atqname}->{value};
626     if ($lname eq 'xml' and
627     $nsuri ne <Q::xml:>) {
628     ## TODO: error
629     } elsif ($lname eq 'xmlns') {
630     ## TODO: error
631     }
632     if ($nsuri eq '') {
633     ## TODO: error in XML 1.0
634     } elsif ($nsuri eq <Q::xml:> and
635     $lname ne 'xml') {
636     ## TODO: error
637     } elsif ($nsuri eq <Q::xmlns:>) {
638     ## TODO: error
639     }
640     $ns->{$lname} = $attrs->{$atqname}->{value};
641     delete $ns->{$lname} unless length $ns->{$lname};
642     } elsif ($pfx eq '') {
643     ## TODO: pfx is not NCName error
644     } else {
645     if ($gattr{$pfx}->{$lname}) {
646     ## TODO: Namespace well-formedness error
647     }
648     }
649     $gattr{$pfx}->{$lname} = $attrs->{$atqname};
650     } else { ## Local attribute
651     if ($pfx eq 'xmlns') {
652     $ns->{''} = $attrs->{xmlns}->{value};
653     delete $ns->{''} unless length $ns->{''};
654     } else {
655     $lattr{$pfx} = $attrs->{$atqname};
656     }
657     }
658     }
659    
660     my ($pfx, $lname) = split /:/, $type;
661     my $nsuri;
662     ## TODO: lname is NCName?
663     if (defined $lname) { ## Prefixed namespace
664     if ($pfx eq '') {
665     ## TODO: pfx is not NCName error
666     }
667     if (defined $ns->{$pfx}) {
668     $nsuri = $ns->{$pfx};
669     } else {
670     ## TODO: namespace ill-formed
671     }
672     } else { ## Default namespace
673     $nsuri = $ns->{''};
674     }
675    
676     $el = $doc-><M::Document.createElementNS>
677     ($nsuri, $type);
678    
679     if ($attrs->{xmlns}) {
680     my $attr = $doc-><M::Document.createAttributeNS>
681     (<Q::xmlns:>, 'xmlns');
682     for (@{$attrs->{xmlns}->{nodes}}) {
683     $attr-><M::Node.appendChild> ($_);
684     }
685     $el-><M::Element.setAttributeNodeNS> ($attr);
686     }
687    
688     for my $lname (keys %lattr) {
689     my $attr = $doc-><M::Document.createAttributeNS>
690     (null, $lname);
691     for (@{$lattr{$lname}->{nodes}}) {
692     $attr-><M::Node.appendChild> ($_);
693     }
694     $el-><M::Element.setAttributeNodeNS> ($attr);
695     }
696    
697     for my $pfx (keys %gattr) {
698     for my $lname (keys %{$gattr{$pfx}}) {
699     my $attr = $doc-><M::Document.createAttributeNS>
700     ($ns->{$pfx}, $pfx.':'.$lname);
701     for (@{$gattr{$pfx}->{$lname}->{nodes}}) {
702     $attr-><M::Node.appendChild> ($_);
703     }
704     $el-><M::Element.setAttributeNodeNS> ($attr);
705     }
706     }
707    
708     $node-><M::Node.appendChild> ($el);
709     }
710    
711     ~ (TAGC) {
712     lang:Perl {
713     push @{$nodes}, $node;
714     $node = $el;
715     }
716     ?lexmode ElementContent;
717     } (MTAGC) {
718     lang:Perl {
719     $ns = pop @{$nses};
720     $type = pop @{$types};
721     }
722     ?lexmode ElementContent;
723     } else {
724     ?lexmode ElementContent;
725     }
726     } else {
727     ?lexmode ElementContent;
728     }
729    
730     } (ETAGO) {
731     // End tag
732    
733     ?lexmode 'EndTag';
734    
735     my $is_docel;
736    
737     ~ (Name) {
738     lang:Perl ($name => $token.value) {
739     if ($name eq $type) {
740     $type = pop @{$types};
741     if ($type eq '') {
742     $is_docel = true;
743     }
744     $node = pop @{$nodes};
745     $ns = pop @{$nses};
746     } else {
747     ## TODO: Element type match well-formedness error
748     }
749     }
750     }
751    
752     ~? (S) { }
753    
754     if-true ($is_docel) {
755     lang:Perl {
756     if (@{$types}) {
757     ## WF error
758     }
759     }
760     return;
761     }
762    
763     ~ (TAGC) {
764     ?lexmode ElementContent;
765     } else {
766     ?lexmode 'ElementContent';
767     }
768    
769     } (HCRO) {
770     &_HexadecimalCharacterReference_
771     ($doc => $doc, $parent => $node);
772    
773     ~ (REFC) {
774     ?lexmode 'ElementContent';
775     } else {
776     ?lexmode ElementContent;
777     }
778     } (CRO) {
779     &_NumericCharacterReference_
780     ($doc => $doc, $parent => $node);
781    
782     ~ (REFC) {
783     ?lexmode 'ElementContent';
784     } else {
785     ?lexmode ElementContent;
786     }
787     } (ERO) {
788     &_GeneralEntityReference_
789     ($doc => $doc, $parent => $node);
790    
791     ~ (REFC) {
792     ?lexmode 'ElementContent';
793     } else {
794     ?lexmode ElementContent;
795     }
796     } (CDO) {
797     &_CommentDeclaration_ ($doc => $doc, $parent => $node);
798    
799     ~ (MDC) {
800     ?lexmode ElementContent;
801     } else {
802     ?lexmode ElementContent;
803     }
804     } (CDSO) {
805     &_CDATASection_ ($doc => $doc, $parent => $node);
806    
807     ~ (MSE) {
808     ?lexmode 'ElementContent';
809     } else {
810     ?lexmode ElementContent;
811     }
812     } (PIO) {
813     &_ProcessingInstruction_ ($doc => $doc, $parent => $node);
814    
815     ~ (PIC) {
816     ?lexmode 'ElementContent';
817     } else {
818     ?lexmode ElementContent;
819     }
820     }
821     } // Element_
822    
823     rule AttributeSpecificationList ($doc, $attrs)
824     : standalone
825     {
826     ?lexmode 'StartTag';
827    
828     my $i;
829     lang:Perl {
830     $i = 0;
831     }
832    
833     ~* (Name) {
834     my $atqname;
835     lang:Perl ($name => $token.value) {
836     $atqname = $name;
837     }
838    
839     ~? (S) { }
840     ~ (VI) { }
841     ~? (S) { }
842    
843     my $vals;
844     lang:Perl {
845     if ($attrs->{$atqname}) {
846     ## TODO: Unique attr name well-formedness error
847     }
848    
849     $vals = $attrs->{$atqname} = {
850     nodes => [],
851     value => '',
852     index => $i++,
853     };
854     }
855    
856     ~ (LIT) {
857     &_AttributeValueSpecification_
858     ($doc => $doc, $vals => $vals);
859    
860     ~ (LIT) {
861     ?lexmode StartTag;
862     } else {
863     ?lexmode StartTag;
864     }
865     } (LITA) {
866     &_AttributeValueSpecificationA_
867     ($doc => $doc, $vals => $vals);
868    
869     ~ (LITA) {
870     ?lexmode StartTag;
871     } else {
872     ?lexmode StartTag;
873     }
874     }
875     } (S) : separator : terminator? { }
876     } // AttributeSpecificationList
877    
878     rule _AttributeValueSpecification_ ($doc, $vals) {
879     // ~ (LIT) { }
880     ?lexmode 'AttributeValueLiteral';
881    
882     ~* (STRING) {
883     lang:Perl ($value => $token.value) {
884     $value =~ s/[\x09\x0A\x0D]/ /g;
885     my $text = $doc-><M::Document.createTextNode> ($value);
886     push @{$vals->{nodes}}, $text;
887     $vals->{value} .= $value;
888     }
889     } (HCRO) {
890     &_HexadecimalCharacterReferenceV_
891     ($doc => $doc, $vals => $vals);
892    
893     ~ (REFC) {
894     ?lexmode AttributeValueLiteral;
895     } else {
896     ?lexmode AttributeValueLiteral;
897     }
898     } (CRO) {
899     &_NumericCharacterReferenceV_
900     ($doc => $doc, $vals => $vals);
901    
902     ~ (REFC) {
903     ?lexmode AttributeValueLiteral;
904     } else {
905     ?lexmode AttributeValueLiteral;
906     }
907     } (ERO) {
908     // TODO: Attribute value normalization
909     &_GeneralEntityReferenceV_
910     ($doc => $doc, $vals => $vals);
911    
912     ~ (REFC) {
913     ?lexmode AttributeValueLiteral;
914     } else {
915     ?lexmode AttributeValueLiteral;
916     }
917     }
918    
919     // ~ (LIT) { } (LITA) { }
920     } // _AttributeValueSpecification_
921    
922     rule _AttributeValueSpecificationA_ ($doc, $vals) {
923     // ~ (LITA) { }
924     ?lexmode 'AttributeValueLiteralA';
925    
926     ~* (STRING) {
927     lang:Perl ($value => $token.value) {
928     $value =~ s/[\x09\x0A\x0D]/ /g;
929     my $text = $doc-><M::Document.createTextNode> ($value);
930     push @{$vals->{nodes}}, $text;
931     $vals->{value} .= $value;
932     }
933     } (HCRO) {
934     &_HexadecimalCharacterReferenceV_
935     ($doc => $doc, $vals => $vals);
936    
937     ~ (REFC) {
938     ?lexmode AttributeValueLiteralA;
939     } else {
940     ?lexmode AttributeValueLiteralA;
941     }
942     } (CRO) {
943     &_NumericCharacterReferenceV_
944     ($doc => $doc, $vals => $vals);
945    
946     ~ (REFC) {
947     ?lexmode AttributeValueLiteralA;
948     } else {
949     ?lexmode AttributeValueLiteralA;
950     }
951     } (ERO) {
952     // TODO: Attribute value normalization
953     &_GeneralEntityReferenceV_
954     ($doc => $doc, $vals => $vals);
955    
956     ~ (REFC) {
957     ?lexmode AttributeValueLiteralA;
958     } else {
959     ?lexmode AttributeValueLiteralA;
960     }
961     }
962    
963     // ~ (LITA) { }
964     } // _AttributeValueSpecificationA_
965    
966     /*
967     CDATA Section Content Parsing Mode
968     */
969     rule _CDATASection_ ($doc, $parent) {
970     ?lexmode 'CDATASectionContent';
971    
972     my $cdata;
973    
974     ~? (CData1) {
975     lang:Perl ($data => $token.value) {
976     $cdata = $data;
977     }
978     } else {
979     lang:Perl {
980     $cdata = '';
981     }
982     }
983    
984     ~? (CData2) {
985     lang:Perl ($data => $token.value) {
986     $cdata .= $data;
987     }
988     }
989    
990     lang:Perl {
991     my $cdsect = $doc-><M::Document.createCDATASection>
992     ($cdata);
993     $parent-><M::Node.appendChild> ($cdsect);
994     }
995    
996     // ~ (MSE) { }
997     } // _CDATASection_
998    
999     rule _NumericCharacterReference_ ($doc, $parent) {
1000     ?lexmode 'NumericCharacterReference';
1001    
1002     ~ (NUMBER) {
1003     lang:Perl ($num => $token.value) {
1004     ## TODO: [WFC: Legal Character]
1005     my $ncr = $doc-><M::Document.createTextNode>
1006     (chr (0+$num));
1007     $parent-><M::Node.appendChild> ($ncr);
1008     }
1009     }
1010    
1011     // ~ (REFC) { }
1012     } // _NumericCharacterReference_
1013    
1014     rule _NumericCharacterReferenceV_ ($doc, $vals) {
1015     ?lexmode 'NumericCharacterReference';
1016    
1017     ~ (NUMBER) {
1018     lang:Perl ($num => $token.value) {
1019     ## TODO: [WFC: Legal Character]
1020     my $ncr = $doc-><M::Document.createTextNode>
1021     (my $char = chr (0+$num));
1022     push @{$vals->{nodes}}, $ncr;
1023     $vals->{value} .= $char;
1024     }
1025     }
1026    
1027     // ~ (REFC) { }
1028     } // _NumericCharacterReferenceV_
1029    
1030     rule _HexadecimalCharacterReference_ ($doc, $parent) {
1031     ?lexmode 'HexadecimalCharacterReference';
1032    
1033     ~ (Hex) {
1034     lang:Perl ($num => $token.value) {
1035     ## TODO: [WFC: Legal Character]
1036     my $ncr = $doc-><M::Document.createTextNode>
1037     (chr hex $num);
1038     $parent-><M::Node.appendChild> ($ncr);
1039     }
1040     }
1041    
1042     // ~ (REFC) { }
1043     } // _HexadecimalCharacterReference_
1044    
1045     rule _HexadecimalCharacterReferenceV_ ($doc, $vals) {
1046     ?lexmode 'HexadecimalCharacterReference';
1047    
1048     ~ (Hex) {
1049     lang:Perl ($num => $token.value) {
1050     ## TODO: [WFC: Legal Character]
1051     my $ncr = $doc-><M::Document.createTextNode>
1052     (my $char = chr hex $num);
1053     push @{$vals->{nodes}}, $ncr;
1054     $vals->{value} .= $char;
1055     }
1056     }
1057    
1058     // ~ (REFC) { }
1059     } // _HexadecimalCharacterReferenceV_
1060    
1061     rule _GeneralEntityReference_ ($doc, $parent) {
1062     // TODO: Expansion
1063     ?lexmode 'EntityReference';
1064    
1065     ~ (Name) {
1066     lang:Perl ($name => $token.value) {
1067     ## TODO: Namespace well-formedness
1068     ## TODO: Entity declared constraints
1069     my $er = $doc-><M::Document.createEntityReference>
1070     ($name);
1071     $parent-><M::Node.appendChild> ($er);
1072     }
1073     }
1074    
1075     // ~ (REFC) { }
1076     } // _GeneralEntityReference_
1077    
1078     rule _GeneralEntityReferenceV_ ($doc, $vals) {
1079     // TODO: Expansion
1080     ?lexmode 'EntityReference';
1081    
1082     ~ (Name) {
1083     lang:Perl ($name => $token.value) {
1084     ## TODO: Namespace well-formedness
1085     ## TODO: Entity declared constraints
1086     my $er = $doc-><M::Document.createEntityReference>
1087     ($name);
1088     push @{$vals->{nodes}}, $er;
1089     }
1090     }
1091    
1092     // ~ (REFC) { }
1093     } // _GeneralEntityReferenceV_
1094    
1095    
1096     /*
1097     XML Name
1098     */
1099     lexmode Name {
1100     $NameStartChar10 := [
1101     '_' ':'
1102     // Letter
1103     // BaseChar
1104     U+0041..U+005A U+0061..U+007A U+00C0..U+00D6
1105     U+00D8..U+00F6 U+00F8..U+00FF U+0100..U+0131
1106     U+0134..U+013E U+0141..U+0148 U+014A..U+017E
1107     U+0180..U+01C3 U+01CD..U+01F0 U+01F4..U+01F5
1108     U+01FA..U+0217 U+0250..U+02A8 U+02BB..U+02C1
1109     U+0386 U+0388..U+038A U+038C U+038E..U+03A1
1110     U+03A3..U+03CE U+03D0..U+03D6 U+03DA U+03DC
1111     U+03DE U+03E0 U+03E2..U+03F3 U+0401..U+040C
1112     U+040E..U+044F U+0451..U+045C U+045E..U+0481
1113     U+0490..U+04C4 U+04C7..U+04C8 U+04CB..U+04CC
1114     U+04D0..U+04EB U+04EE..U+04F5 U+04F8..U+04F9
1115     U+0531..U+0556 U+0559 U+0561..U+0586
1116     U+05D0..U+05EA U+05F0..U+05F2 U+0621..U+063A
1117     U+0641..U+064A U+0671..U+06B7 U+06BA..U+06BE
1118     U+06C0..U+06CE U+06D0..U+06D3 U+06D5
1119     U+06E5..U+06E6 U+0905..U+0939 U+093D
1120     U+0958..U+0961 U+0985..U+098C U+098F..U+0990
1121     U+0993..U+09A8 U+09AA..U+09B0 U+09B2
1122     U+09B6..U+09B9 U+09DC..U+09DD U+09DF..U+09E1
1123     U+09F0..U+09F1 U+0A05..U+0A0A U+0A0F..U+0A10
1124     U+0A13..U+0A28 U+0A2A..U+0A30 U+0A32..U+0A33
1125     U+0A35..U+0A36 U+0A38..U+0A39 U+0A59..U+0A5C
1126     U+0A5E U+0A72..U+0A74 U+0A85..U+0A8B U+0A8D
1127     U+0A8F..U+0A91 U+0A93..U+0AA8 U+0AAA..U+0AB0
1128     U+0AB2..U+0AB3 U+0AB5..U+0AB9 U+0ABD U+0AE0
1129     U+0B05..U+0B0C U+0B0F..U+0B10 U+0B13..U+0B28
1130     U+0B2A..U+0B30 U+0B32..U+0B33 U+0B36..U+0B39
1131     U+0B3D U+0B5C..U+0B5D U+0B5F..U+0B61
1132     U+0B85..U+0B8A U+0B8E..U+0B90 U+0B92..U+0B95
1133     U+0B99..U+0B9A U+0B9C U+0B9E..U+0B9F
1134     U+0BA3..U+0BA4 U+0BA8..U+0BAA U+0BAE..U+0BB5
1135     U+0BB7..U+0BB9 U+0C05..U+0C0C U+0C0E..U+0C10
1136     U+0C12..U+0C28 U+0C2A..U+0C33 U+0C35..U+0C39
1137     U+0C60..U+0C61 U+0C85..U+0C8C U+0C8E..U+0C90
1138     U+0C92..U+0CA8 U+0CAA..U+0CB3 U+0CB5..U+0CB9
1139     U+0CDE U+0CE0..U+0CE1 U+0D05..U+0D0C
1140     U+0D0E..U+0D10 U+0D12..U+0D28 U+0D2A..U+0D39
1141     U+0D60..U+0D61 U+0E01..U+0E2E U+0E30
1142     U+0E32..U+0E33 U+0E40..U+0E45 U+0E81..U+0E82
1143     U+0E84 U+0E87..U+0E88 U+0E8A U+0E8D
1144     U+0E94..U+0E97 U+0E99..U+0E9F U+0EA1..U+0EA3
1145     U+0EA5 U+0EA7 U+0EAA..U+0EAB U+0EAD..U+0EAE
1146     U+0EB0 U+0EB2..U+0EB3 U+0EBD U+0EC0..U+0EC4
1147     U+0F40..U+0F47 U+0F49..U+0F69 U+10A0..U+10C5
1148     U+10D0..U+10F6 U+1100 U+1102..U+1103
1149     U+1105..U+1107 U+1109 U+110B..U+110C
1150     U+110E..U+1112 U+113C U+113E U+1140 U+114C
1151     U+114E U+1150 U+1154..U+1155 U+1159
1152     U+115F..U+1161 U+1163 U+1165 U+1167 U+1169
1153     U+116D..U+116E U+1172..U+1173 U+1175 U+119E
1154     U+11A8 U+11AB U+11AE..U+11AF U+11B7..U+11B8
1155     U+11BA U+11BC..U+11C2 U+11EB U+11F0 U+11F9
1156     U+1E00..U+1E9B U+1EA0..U+1EF9 U+1F00..U+1F15
1157     U+1F18..U+1F1D U+1F20..U+1F45 U+1F48..U+1F4D
1158     U+1F50..U+1F57 U+1F59 U+1F5B U+1F5D
1159     U+1F5F..U+1F7D U+1F80..U+1FB4 U+1FB6..U+1FBC
1160     U+1FBE U+1FC2..U+1FC4 U+1FC6..U+1FCC
1161     U+1FD0..U+1FD3 U+1FD6..U+1FDB U+1FE0..U+1FEC
1162     U+1FF2..U+1FF4 U+1FF6..U+1FFC U+2126
1163     U+212A..U+212B U+212E U+2180..U+2182
1164     U+3041..U+3094 U+30A1..U+30FA U+3105..U+312C
1165     U+AC00..U+D7A3
1166     // Ideographic
1167     U+4E00..U+9FA5 U+3007 U+3021..U+3029
1168     ];
1169     $NameChar10 := [
1170     '.' '-' '_' ':'
1171     // Letter
1172     // BaseChar
1173     U+0041..U+005A U+0061..U+007A U+00C0..U+00D6
1174     U+00D8..U+00F6 U+00F8..U+00FF U+0100..U+0131
1175     U+0134..U+013E U+0141..U+0148 U+014A..U+017E
1176     U+0180..U+01C3 U+01CD..U+01F0 U+01F4..U+01F5
1177     U+01FA..U+0217 U+0250..U+02A8 U+02BB..U+02C1
1178     U+0386 U+0388..U+038A U+038C U+038E..U+03A1
1179     U+03A3..U+03CE U+03D0..U+03D6 U+03DA U+03DC
1180     U+03DE U+03E0 U+03E2..U+03F3 U+0401..U+040C
1181     U+040E..U+044F U+0451..U+045C U+045E..U+0481
1182     U+0490..U+04C4 U+04C7..U+04C8 U+04CB..U+04CC
1183     U+04D0..U+04EB U+04EE..U+04F5 U+04F8..U+04F9
1184     U+0531..U+0556 U+0559 U+0561..U+0586
1185     U+05D0..U+05EA U+05F0..U+05F2 U+0621..U+063A
1186     U+0641..U+064A U+0671..U+06B7 U+06BA..U+06BE
1187     U+06C0..U+06CE U+06D0..U+06D3 U+06D5
1188     U+06E5..U+06E6 U+0905..U+0939 U+093D
1189     U+0958..U+0961 U+0985..U+098C U+098F..U+0990
1190     U+0993..U+09A8 U+09AA..U+09B0 U+09B2
1191     U+09B6..U+09B9 U+09DC..U+09DD U+09DF..U+09E1
1192     U+09F0..U+09F1 U+0A05..U+0A0A U+0A0F..U+0A10
1193     U+0A13..U+0A28 U+0A2A..U+0A30 U+0A32..U+0A33
1194     U+0A35..U+0A36 U+0A38..U+0A39 U+0A59..U+0A5C
1195     U+0A5E U+0A72..U+0A74 U+0A85..U+0A8B U+0A8D
1196     U+0A8F..U+0A91 U+0A93..U+0AA8 U+0AAA..U+0AB0
1197     U+0AB2..U+0AB3 U+0AB5..U+0AB9 U+0ABD U+0AE0
1198     U+0B05..U+0B0C U+0B0F..U+0B10 U+0B13..U+0B28
1199     U+0B2A..U+0B30 U+0B32..U+0B33 U+0B36..U+0B39
1200     U+0B3D U+0B5C..U+0B5D U+0B5F..U+0B61
1201     U+0B85..U+0B8A U+0B8E..U+0B90 U+0B92..U+0B95
1202     U+0B99..U+0B9A U+0B9C U+0B9E..U+0B9F
1203     U+0BA3..U+0BA4 U+0BA8..U+0BAA U+0BAE..U+0BB5
1204     U+0BB7..U+0BB9 U+0C05..U+0C0C U+0C0E..U+0C10
1205     U+0C12..U+0C28 U+0C2A..U+0C33 U+0C35..U+0C39
1206     U+0C60..U+0C61 U+0C85..U+0C8C U+0C8E..U+0C90
1207     U+0C92..U+0CA8 U+0CAA..U+0CB3 U+0CB5..U+0CB9
1208     U+0CDE U+0CE0..U+0CE1 U+0D05..U+0D0C
1209     U+0D0E..U+0D10 U+0D12..U+0D28 U+0D2A..U+0D39
1210     U+0D60..U+0D61 U+0E01..U+0E2E U+0E30
1211     U+0E32..U+0E33 U+0E40..U+0E45 U+0E81..U+0E82
1212     U+0E84 U+0E87..U+0E88 U+0E8A U+0E8D
1213     U+0E94..U+0E97 U+0E99..U+0E9F U+0EA1..U+0EA3
1214     U+0EA5 U+0EA7 U+0EAA..U+0EAB U+0EAD..U+0EAE
1215     U+0EB0 U+0EB2..U+0EB3 U+0EBD U+0EC0..U+0EC4
1216     U+0F40..U+0F47 U+0F49..U+0F69 U+10A0..U+10C5
1217     U+10D0..U+10F6 U+1100 U+1102..U+1103
1218     U+1105..U+1107 U+1109 U+110B..U+110C
1219     U+110E..U+1112 U+113C U+113E U+1140 U+114C
1220     U+114E U+1150 U+1154..U+1155 U+1159
1221     U+115F..U+1161 U+1163 U+1165 U+1167 U+1169
1222     U+116D..U+116E U+1172..U+1173 U+1175 U+119E
1223     U+11A8 U+11AB U+11AE..U+11AF U+11B7..U+11B8
1224     U+11BA U+11BC..U+11C2 U+11EB U+11F0 U+11F9
1225     U+1E00..U+1E9B U+1EA0..U+1EF9 U+1F00..U+1F15
1226     U+1F18..U+1F1D U+1F20..U+1F45 U+1F48..U+1F4D
1227     U+1F50..U+1F57 U+1F59 U+1F5B U+1F5D
1228     U+1F5F..U+1F7D U+1F80..U+1FB4 U+1FB6..U+1FBC
1229     U+1FBE U+1FC2..U+1FC4 U+1FC6..U+1FCC
1230     U+1FD0..U+1FD3 U+1FD6..U+1FDB U+1FE0..U+1FEC
1231     U+1FF2..U+1FF4 U+1FF6..U+1FFC U+2126
1232     U+212A..U+212B U+212E U+2180..U+2182
1233     U+3041..U+3094 U+30A1..U+30FA U+3105..U+312C
1234     U+AC00..U+D7A3
1235     // Ideographic
1236     U+4E00..U+9FA5 U+3007 U+3021..U+3029
1237     // Digit
1238     U+0030..U+0039 U+0660..U+0669 U+06F0..U+06F9
1239     U+0966..U+096F U+09E6..U+09EF U+0A66..U+0A6F
1240     U+0AE6..U+0AEF U+0B66..U+0B6F U+0BE7..U+0BEF
1241     U+0C66..U+0C6F U+0CE6..U+0CEF U+0D66..U+0D6F
1242     U+0E50..U+0E59 U+0ED0..U+0ED9 U+0F20..U+0F29
1243     // CombiningChar
1244     U+0300..U+0345 U+0360..U+0361 U+0483..U+0486
1245     U+0591..U+05A1 U+05A3..U+05B9 U+05BB..U+05BD
1246     U+05BF U+05C1..U+05C2 U+05C4 U+064B..U+0652
1247     U+0670 U+06D6..U+06DC U+06DD..U+06DF
1248     U+06E0..U+06E4 U+06E7..U+06E8 U+06EA..U+06ED
1249     U+0901..U+0903 U+093C U+093E..U+094C U+094D
1250     U+0951..U+0954 U+0962..U+0963 U+0981..U+0983
1251     U+09BC U+09BE U+09BF U+09C0..U+09C4
1252     U+09C7..U+09C8 U+09CB..U+09CD U+09D7
1253     U+09E2..U+09E3 U+0A02 U+0A3C U+0A3E U+0A3F
1254     U+0A40..U+0A42 U+0A47..U+0A48 U+0A4B..U+0A4D
1255     U+0A70..U+0A71 U+0A81..U+0A83 U+0ABC
1256     U+0ABE..U+0AC5 U+0AC7..U+0AC9 U+0ACB..U+0ACD
1257     U+0B01..U+0B03 U+0B3C U+0B3E..U+0B43
1258     U+0B47..U+0B48 U+0B4B..U+0B4D U+0B56..U+0B57
1259     U+0B82..U+0B83 U+0BBE..U+0BC2 U+0BC6..U+0BC8
1260     U+0BCA..U+0BCD U+0BD7 U+0C01..U+0C03
1261     U+0C3E..U+0C44 U+0C46..U+0C48 U+0C4A..U+0C4D
1262     U+0C55..U+0C56 U+0C82..U+0C83 U+0CBE..U+0CC4
1263     U+0CC6..U+0CC8 U+0CCA..U+0CCD U+0CD5..U+0CD6
1264     U+0D02..U+0D03 U+0D3E..U+0D43 U+0D46..U+0D48
1265     U+0D4A..U+0D4D U+0D57 U+0E31 U+0E34..U+0E3A
1266     U+0E47..U+0E4E U+0EB1 U+0EB4..U+0EB9
1267     U+0EBB..U+0EBC U+0EC8..U+0ECD U+0F18..U+0F19
1268     U+0F35 U+0F37 U+0F39 U+0F3E U+0F3F
1269     U+0F71..U+0F84 U+0F86..U+0F8B U+0F90..U+0F95
1270     U+0F97 U+0F99..U+0FAD U+0FB1..U+0FB7 U+0FB9
1271     U+20D0..U+20DC U+20E1 U+302A..U+302F U+3099
1272     U+309A
1273     // Extender
1274     U+00B7 U+02D0 U+02D1 U+0387 U+0640 U+0E46
1275     U+0EC6 U+3005 U+3031..U+3035 U+309D..U+309E
1276     U+30FC..U+30FE
1277     ];
1278    
1279     $NameStartChar11 := [
1280     ':' '_'
1281     'A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I' 'J' 'K' 'L' 'M'
1282     'N' 'O' 'P' 'Q' 'R' 'S' 'T' 'U' 'V' 'W' 'X' 'Y' 'Z'
1283     'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm'
1284     'n' 'o' 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z'
1285     U+00C0..U+00D6 U+00D8..U+00F6 U+00F8..U+02FF
1286     U+0370..U+037D U+037F..U+1FFF U+200C..U+200D
1287     U+2070..U+218F U+2C00..U+2FEF U+3001..U+D7FF
1288     U+F900..U+FDCF U+FDF0..U+FFFD U+10000..U+EFFFF
1289     ];
1290     $NameChar11 := [
1291     '-' '.' '0' '1' '2' '3' '4' '5' '6' '7' '8' '9'
1292     U+00B7 U+0300..U+036F U+203F..U+2040
1293     // NameStartChar
1294     ':' '_'
1295     'A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I' 'J' 'K' 'L' 'M'
1296     'N' 'O' 'P' 'Q' 'R' 'S' 'T' 'U' 'V' 'W' 'X' 'Y' 'Z'
1297     'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm'
1298     'n' 'o' 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z'
1299     U+00C0..U+00D6 U+00D8..U+00F6 U+00F8..U+02FF
1300     U+0370..U+037D U+037F..U+1FFF U+200C..U+200D
1301     U+2070..U+218F U+2C00..U+2FEF U+3001..U+D7FF
1302     U+F900..U+FDCF U+FDF0..U+FFFD U+10000..U+EFFFF
1303     ];
1304     Name : value := $NameStartChar11 $NameChar11*;
1305     } // Name
1306    
1307     /*
1308     Space
1309     */
1310     lexmode S {
1311     S := [U+0009 U+000A U+000D U+0020]+;
1312     } // S
1313    
1314     /*
1315     Document end scanning mode
1316     */
1317     lexmode DocumentEnd
1318     : standalone
1319     : extends => 'S'
1320     {
1321     /*
1322     Processing instruction
1323     */
1324     PIO := ['<'] ['?'];
1325    
1326     /*
1327     Comment declaration
1328     */
1329     CDO := ['<'] ['!'] ['-'] ['-'];
1330     } // DocumentEnd
1331    
1332     /*
1333     Document misc scanning mode
1334    
1335     This mode scans |Misc| constructions as well
1336     as the document element's start tag.
1337     */
1338     lexmode DocumentMisc
1339     : standalone
1340     : extends => 'DocumentEnd'
1341     {
1342     /*
1343     Document element start tag
1344     */
1345     STAGO := ['<'];
1346     } // DocumentMisc
1347    
1348     /*
1349     Document prolog scanning mode
1350     */
1351     lexmode DocumentProlog
1352     : standalone
1353     : extends => 'DocumentMisc'
1354     {
1355     /*
1356     |DOCTYPE| declaration
1357     */
1358     MDO := ['<'] ['!'];
1359     } // DocumentProlog
1360    
1361     /*
1362     Document start scanning mode
1363     */
1364     lexmode DocumentStart
1365     : initial
1366     : standalone
1367     : extends => 'DocumentProlog'
1368     {
1369     /*
1370     XML declaration
1371     */
1372     XDO := ['<'] ['?'] ['x'] ['m'] ['l'];
1373     } // DocumentStart
1374    
1375     /*
1376     Markup declaration scanning mode
1377    
1378     This mode is used to recognize the |MDC| that terminates
1379     a comment declaration, and also serves as the base |lexmode|
1380     for, e.g., the document type declaration scanning mode.
1381     */
1382     lexmode MarkupDeclaration
1383     : standalone
1384     : extends => 'Name'
1385     {
1386     /*
1387     Markup declaration close
1388     */
1389     MDC := ['>'];
1390    
1391     /*
1392     Literal open
1393     */
1394     LIT := ['"'];
1395    
1396     /*
1397     Alternative literal open
1398     */
1399     LITA := [U+0027];
1400     } // MarkupDeclaration
1401    
1402     lexmode DocumentTypeDeclaration
1403     : standalone
1404     : extends => 'MarkupDeclaration'
1405     {
1406     /*
1407     Declaration subset open
1408     */
1409     DSO := ['['];
1410    
1411     /*
1412     Declaration subset close
1413     */
1414     DSC := [']'];
1415     } // DocumentTypeDeclaration
1416    
1417     /*
1418     Comment declaration scanning mode
1419     */
1420     lexmode CommentDeclaration
1421     : standalone
1422     {
1423     /*
1424     Comment close
1425     */
1426     COM := ['-'] ['-'];
1427    
1428     /*
1429     Comment data
1430     */
1431     $string := ['-']? [^'-'];
1432     STRING : value := $string+;
1433     } // CommentDeclaration
1434    
1435     /*
1436     Processing instruction name and |S| scanning mode
1437     */
1438     lexmode PIName
1439     : standalone
1440     : extends => 'Name'
1441     : extends => 'S'
1442     {
1443     /*
1444     Processing instruction close
1445     */
1446     PIC := ['?'] ['>'];
1447     } // PIName
1448    
1449     /*
1450     Processing instruction data scanning mode
1451     */
1452     lexmode PIData
1453     : standalone
1454     {
1455     /*
1456     Processing instruction close
1457     */
1458     PIC := ['?'] ['>'];
1459    
1460     /*
1461     Processing instruction target data
1462     */
1463     $data0 := ['?']+ [^'>' '?'];
1464     $data1 := $data0? [^ '?'];
1465     DATA : value := $data1+;
1466    
1467     QUESTION : value := ['?']+;
1468     } // PIData
1469    
1470     /*
1471     Content of element scanning mode
1472     */
1473     lexmode ElementContent
1474     : standalone
1475     {
1476     /*
1477     Character data and/or |MSE|
1478     */
1479     CharData : value := [^'<' '&']+;
1480    
1481     /*
1482     Start tag open
1483     */
1484     STAGO := ['<'];
1485    
1486     /*
1487     End tag open
1488     */
1489     ETAGO := ['<'] ['/'];
1490    
1491     /*
1492     Hexadecimal character reference open
1493     */
1494     HCRO := ['&'] ['#'] ['x'];
1495    
1496     /*
1497     Numeric character reference open
1498     */
1499     CRO := ['&'] ['#'];
1500    
1501     /*
1502     General entity reference open
1503     */
1504     ERO := ['&'];
1505    
1506     /*
1507     Comment declaration open
1508     */
1509     CDO := ['<'] ['!'] ['-'] ['-'];
1510    
1511     /*
1512     CDATA section open
1513     */
1514     CDSO := ['<'] ['!'] ['[']
1515     ['C'] ['D'] ['A'] ['T'] ['A'] ['['];
1516    
1517     /*
1518     Processing instruction open
1519     */
1520     PIO := ['<'] ['?'];
1521     } // ElementContent
1522    
1523     /*
1524     CDATA section content scanning mode
1525     */
1526     lexmode CDATASectionContent
1527     : standalone
1528     {
1529     /*
1530     Markup section end
1531     */
1532     MSE := [']'] [']'] ['>'];
1533    
1534     /*
1535     Character data
1536    
1537     [^']']+
1538     [']'] [^']']+;
1539     [']'] [']']+ [^']' '>']+
1540     */
1541     $cd0 := [']'] [']']+ [^']' '>']+;
1542     $cd1 := $cd0* [']']? [^']']+;
1543     CData1 : value := $cd1+;
1544     CData2 : value := $cd0+;
1545     } // CDATASectionContent
1546    
1547     lexmode EntityReference
1548     : standalone
1549     : extends => 'Name'
1550     {
1551     /*
1552     Reference close
1553     */
1554     REFC := [';'];
1555     } // EntityReference
1556    
1557     lexmode NumericCharacterReference
1558     : standalone
1559     {
1560     /*
1561     Decimal number
1562     */
1563     $digit := ['0' '1' '2' '3' '4' '5' '6' '7' '8' '9'];
1564     NUMBER : value := $digit+;
1565    
1566     /*
1567     Reference close
1568     */
1569     REFC := [';'];
1570     } // NumericCharacterReference
1571    
1572     lexmode HexadecimalCharacterReference
1573     : standalone
1574     {
1575     /*
1576     Hexadecimal number
1577     */
1578     $hexdigit := ['0' '1' '2' '3' '4' '5' '6' '7' '8' '9'
1579     'A' 'B' 'C' 'D' 'E' 'F'
1580     'a' 'b' 'c' 'd' 'e' 'f'];
1581     Hex : value := $hexdigit+;
1582    
1583     /*
1584     Reference close
1585     */
1586     REFC := [';'];
1587     } // HexadecimalCharacterReference
1588    
1589     lexmode StartTag
1590     : standalone
1591     : extends => 'Name'
1592     : extends => 'S'
1593     {
1594    
1595     /*
1596     Value indicator
1597     */
1598     VI := ['='];
1599    
1600     /*
1601     Literal open
1602     */
1603     LIT := ['"'];
1604     LITA := [U+0027];
1605    
1606     /*
1607     Tag close
1608     */
1609     TAGC := ['>'];
1610    
1611     /*
1612     Empty element tag close
1613     */
1614     MTAGC := ['/'] ['>'];
1615     } // StartTag
1616    
1617     lexmode EndTag
1618     : standalone
1619     : extends => 'Name'
1620     : extends => 'S'
1621     {
1622     /*
1623     Tag close
1624     */
1625     TAGC := ['>'];
1626     } // EndTag
1627    
1628     lexmode AttributeValueLiteral_ {
1629     ERO := ['&'];
1630     CRO := ['&'] ['#'];
1631     HCRO := ['&'] ['#'] ['x'];
1632     } // AttributeValueLiteral_
1633    
1634     lexmode AttributeValueLiteral
1635     : standalone
1636     : extends => 'AttributeValueLiteral_'
1637     {
1638     LIT := ['"'];
1639     STRING : value := [^'"' '&' '<'];
1640     } // AttributeValueLiteral
1641    
1642     lexmode AttributeValueLiteralA
1643     : standalone
1644     : extends => 'AttributeValueLiteral_'
1645     {
1646     LIT := [U+0027];
1647     STRING : value := [^U+0027 '&' '<'];
1648     } // AttributeValueLiteralA
1649    
1650     token-error default : default {
1651     lang:Perl {
1652     my $location = {
1653     utf32_offset => pos ($self->{source}),
1654     };
1655     my $continue = __DOMCore:ERROR{xp|bad-token-error::
1656     xp|error-token => {$token},
1657     DOMCore|location => {$location},
1658     xp|source-text => {\($self->{source})},
1659     }__;
1660     unless ($continue) {
1661     __EXCEPTION{DOMLS|PARSE_ERR::
1662     }__;
1663     }
1664     $self->{has_error} = true;
1665     }
1666     } // default
1667     ##ManakaiXMLParser
1668    
1669    
1670     ElementTypeBinding:
1671     @Name: RuleDef
1672     @ElementType:
1673     dis:ResourceDef
1674     @ShadowContent:
1675     @@ForCheck: ManakaiDOM|ForClass
1676     @@rdf:type: Muf2003|RuleDefClass
1677    
1678     ElementTypeBinding:
1679     @Name: RuleParam
1680     @ElementType:
1681     dis:ResourceDef
1682     @ShadowContent:
1683     @@rdf:type: Muf2003|RuleParameter
1684    
1685     ElementTypeBinding:
1686     @Name: enImplNote
1687     @ElementType:
1688     dis:ImplNote
1689     @ShadowContent:
1690     @@lang:en
1691    
1692     ElementTypeBinding:
1693     @Name: ErrDef
1694     @ElementType:
1695     dis:ResourceDef
1696     @ShadowContent:
1697     @@rdf:type: DOMCore|DOMErrorType
1698     @@For: ManakaiDOM|DOM3
1699     @@ecore:textFormatter: ManakaiXMLParserExceptionFormatter
1700    
1701     ErrDef:
1702     @QName: xp|bad-token-error
1703     @enDesc:
1704     The parser encountered a token whose type is not
1705     allowed there.
1706     @DOMCore:severity: DOMCore|SEVERITY_ERROR
1707     @enMufDef:
1708     Token |%xp-error-token-type;|%xp-error-token-value
1709     (prefix => { (|}, suffix => {|)}); is not
1710     allowed %xp-error-lines (prefix => {(|}, suffix => {|)});
1711    
1712     PropDef:
1713     @QName: xp|error-token
1714     @enDesc:
1715     The token where the parser found an error.
1716    
1717     PropDef:
1718     @QName: xp|source-text
1719     @enDesc:
1720     A reference to the original source text, if available.
1721    
1722     ElementTypeBinding:
1723     @Name:enMufDef
1724     @ElementType:
1725     ecore:defaultMessage
1726     @ShadowContent:
1727     @@lang:en
1728     @@ContentType:
1729     lang:muf
1730    
1731     ResourceDef:
1732     @QName: DOMImpl
1733     @AliasFor: DOMCore|DOMImplementation
1734     @For: ManakaiDOM|DOM
1735    
1736     ElementTypeBinding:
1737     @Name: Attr
1738     @ElementType:
1739     dis:ResourceDef
1740     @ShadowContent:
1741     @@rdf:type: DISLang|Attribute
1742     @@ForCheck: !=ManakaiDOM|ManakaiDOM
1743    
1744     ElementTypeBinding:
1745     @Name: Get
1746     @ElementType:
1747     dis:ResourceDef
1748     @ShadowContent:
1749     @@rdf:type: DISLang|AttributeGet
1750    
1751     ElementTypeBinding:
1752     @Name: Set
1753     @ElementType:
1754     dis:ResourceDef
1755     @ShadowContent:
1756     @@rdf:type: DISLang|AttributeSet
1757    
1758     ElementTypeBinding:
1759     @Name: enDesc
1760     @ElementType:
1761     dis:Description
1762     @ShadowContent:
1763     @@lang:en
1764    
1765     ElementTypeBinding:
1766     @Name: Method
1767     @ElementType:
1768     dis:ResourceDef
1769     @ShadowContent:
1770     @@rdf:type: DISLang|Method
1771     @@For: !=ManakaiDOM|ManakaiDOM
1772    
1773     ElementTypeBinding:
1774     @Name: Return
1775     @ElementType:
1776     dis:ResourceDef
1777     @ShadowContent:
1778     @@rdf:type: DISLang|MethodReturn
1779    
1780     ElementTypeBinding:
1781     @Name: Param
1782     @ElementType:
1783     dis:ResourceDef
1784     @ShadowContent:
1785     @@rdf:type: DISLang|MethodParameter
1786    
1787     ElementTypeBinding:
1788     @Name: PerlDef
1789     @ElementType:
1790     dis:Def
1791     @ShadowContent:
1792     @@ContentType: lang|Perl
1793    
1794     ElementTypeBinding:
1795     @Name: PropDef
1796     @ElementType:
1797     dis:ResourceDef
1798     @ShadowContent:
1799     @@rdf:type: rdf|Property
1800    
1801     ClsDef:
1802     @ClsQName: ManakaiXMLParserExceptionFormatter
1803    
1804     @ClsISA: ecore|MUErrorFormatter||ManakaiDOM|Perl
1805    
1806     @RuleDef:
1807     @@Name: xp-error-token-type
1808     @@enDesc:
1809     The type of the token the parser encountered.
1810    
1811     @@Method:
1812     @@@Name: after
1813     @@@Param:
1814     @@@@Name: name
1815     @@@@Type: DOMString
1816     @@@@enDesc: The name of the method.
1817     @@@Param:
1818     @@@@Name: p
1819     @@@@Type: DISPerl|HASH
1820     @@@@enDesc: The set of the parameters to the method.
1821     @@@Param:
1822     @@@@Name: o
1823     @@@@Type: DISPerl|HASH
1824     @@@@enDesc: The option value.
1825     @@@Return:
1826     @@@@PerlDef:
1827     $p->{-result} = $o->{<H::xp|error-token>}->{type}
1828     if defined $o->{<H::xp|error-token>}->{type};
1829    
1830     @RuleDef:
1831     @@Name: xp-error-token-value
1832     @@enDesc:
1833     The value of the token the parser encountered, if any.
1834    
1835     @@Method:
1836     @@@Name: after
1837     @@@Param:
1838     @@@@Name: name
1839     @@@@Type: DOMString
1840     @@@@enDesc: The name of the method.
1841     @@@Param:
1842     @@@@Name: p
1843     @@@@Type: DISPerl|HASH
1844     @@@@enDesc: The set of the parameters to the method.
1845     @@@Param:
1846     @@@@Name: o
1847     @@@@Type: DISPerl|HASH
1848     @@@@enDesc: The option value.
1849     @@@Return:
1850     @@@@PerlDef:
1851     $p->{-result} = $o->{<H::xp|error-token>}->{value}
1852     if defined $o->{<H::xp|error-token>}->{value};
1853    
1854     @RuleDef:
1855     @@Name: xp-error-lines
1856     @@enDesc:
1857     A copy of the fragment of the source text that contains the line
1858     where the error occurred, if available.
1859    
1860     @@Method:
1861     @@@Name: after
1862     @@@Param:
1863     @@@@Name: name
1864     @@@@Type: DOMString
1865     @@@@enDesc: The name of the method.
1866     @@@Param:
1867     @@@@Name: p
1868     @@@@Type: DISPerl|HASH
1869     @@@@enDesc: The set of the parameters to the method.
1870     @@@Param:
1871     @@@@Name: o
1872     @@@@Type: DISPerl|HASH
1873     @@@@enDesc: The option value.
1874     @@@Return:
1875     @@@@PerlDef:
1876     my $pos = $o-><AG::DOMCore|DOMError.location>
1877     -><AG::DOMCore|DOMLocator.utf32Offset>;
1878     if ($pos > -1) {
1879     my $src = $o->{<H::xp|source-text>};
1880     my $start = $pos;
1881     $start = rindex ($$src, "\x0A", $start - 1) for 0..2;
1882     $start++;
1883     my $end = $pos;
1884     $end >= 0 and $end = index ($$src, "\x0A", $end + 1) for 0..2;
1885     $end = length $$src if $end < 0;
1886     $p->{-result} = substr $$src, $start, $end - $start;
1887     }
1888     ##XMLParserExceptionFormatter
