Contents of /messaging/manakai/lib/Message/DOM/XMLParser.dis


Revision 1.2
Wed Dec 28 11:10:56 2005 UTC by wakaba
Branch: MAIN
Changes since 1.1: +21 -36 lines
++ manakai/lib/Message/Util/DIS/ChangeLog	28 Dec 2005 11:09:37 -0000
	* DPG.dis: |?default-token| statement is supported.

2005-12-28  Wakaba  <wakaba@suika.fam.cx>

++ manakai/lib/Message/DOM/ChangeLog	28 Dec 2005 11:08:59 -0000
	* XMLParser.dis (DocumentEntity): Typos fixed.
	(|lexmode|s): New |?default-token| statements are used
	so that tokenizer description has been simplified
	and CDATA sections now can be parsed.

2005-12-28  Wakaba  <wakaba@suika.fam.cx>

1 Module:
2 @QName: MDOM|XMLParser
3 @Namespace:
4 http://suika.fam.cx/~wakaba/archive/2004/dom/xml-parser#
5
6 @FullName:
7 @@lang:en
8 @@@: XML Parser
9
10 @DISCore:author: DISCore|Wakaba
11 @License: license|Perl+MPL
12 @Date:
13 $Date: 2005/12/28 03:50:17 $
14
15 @DefaultFor: ManakaiDOM|ManakaiDOMLatest
16
17 @Require:
18 @@Module:
19 @@@QName: MDOM|DOMLS
20 @@@WithFor: ManakaiDOM|ManakaiDOMLatest
21
22 Namespace:
23 @dis:
24 http://suika.fam.cx/~wakaba/archive/2004/8/18/lang#dis--
25 @DOMCore:
26 http://suika.fam.cx/~wakaba/archive/2004/8/18/dom-core#
27 @DOMMain:
28 http://suika.fam.cx/~wakaba/archive/2004/dom/main#
29 @dx:
30 http://suika.fam.cx/~wakaba/archive/2005/manakai/Util/Error/DOMException#
31 @ecore:
32 http://suika.fam.cx/~wakaba/archive/2005/manakai/Util/Error/Core/
33 @f:
34 http://suika.fam.cx/~wakaba/archive/2004/dom/feature#
35 @idl:
36 http://suika.fam.cx/~wakaba/archive/2004/dis/IDL#
37 @infoset:
38 http://www.w3.org/2001/04/infoset#
39 @lang:
40 http://suika.fam.cx/~wakaba/archive/2004/8/18/lang#
41 @license:
42 http://suika.fam.cx/~wakaba/archive/2004/8/18/license#
43 @LSEV:
44 http://www.w3.org/2002/DOMLS
45 @ManakaiDOM:
46 http://suika.fam.cx/~wakaba/archive/2004/8/18/manakai-dom#
47 @ManakaiDOMLS:
48 http://suika.fam.cx/~wakaba/archive/2004/mdom-ls#
49 @MDOM:
50 http://suika.fam.cx/~wakaba/archive/2004/8/18/manakai-dom#ManakaiDOM.
51 @MDOMX:
52 http://suika.fam.cx/~wakaba/archive/2004/8/4/manakai-dom-exception#
53 @rdf:
54 http://www.w3.org/1999/02/22-rdf-syntax-ns#
55 @rdfs:
56 http://www.w3.org/2000/01/rdf-schema#
57 @t:
58 http://suika.fam.cx/~wakaba/archive/2004/dom/tree#
59 @xml:
60 http://www.w3.org/XML/1998/namespace
61 @xmlns:
62 http://www.w3.org/2000/xmlns/
63 @xp:
64 http://suika.fam.cx/~wakaba/archive/2004/dom/xml-parser#
65
66 ## -- Features
67
68 ElementTypeBinding:
69 @Name: FeatureDef
70 @ElementType:
71 dis:ResourceDef
72 @ShadowContent:
73 @@rdf:type: f|Feature
74 @@For: =ManakaiDOM|all
75
76 ElementTypeBinding:
77 @Name: FeatureVerDef
78 @ElementType:
79 dis:ResourceDef
80 @ShadowContent:
81 @@rdf:type: f|Feature
82
83 ElementTypeBinding:
84 @Name: featureQName
85 @ElementType:
86 f:name
87 @ShadowContent:
88 @@ContentType: DISCore|QName
89
90 ResourceDef:
91 @QName: DOMString
92 @AliasFor: DOMMain|DOMString
93 @For: ManakaiDOM|DOM
94
95 ResourceDef:
96 @QName: Node
97 @AliasFor: t|Node
98 @For: ManakaiDOM|DOM
99
100 ResourceDef:
101 @QName: Element
102 @AliasFor: t|Element
103 @For: ManakaiDOM|DOM
104
105 ResourceDef:
106 @QName: Document
107 @AliasFor: t|Document
108 @For: ManakaiDOM|DOM
109
110 ElementTypeBinding:
111 @Name: ClsDef
112 @ElementType:
113 dis:ResourceDef
114 @ShadowContent:
115 @@rdf:type:
116 @@@@: dis|MultipleResource
117 @@@ForCheck: !ManakaiDOM|ForIF !ManakaiDOM|ForClass
118 @@resourceFor:
119 @@@@: ManakaiDOM|ForClass
120 @@@ForCheck: ManakaiDOM|ManakaiDOM !=ManakaiDOM|ManakaiDOM
121 @@For: ManakaiDOM|DOM3
122 @@For: =ManakaiDOM|ManakaiDOM
123
124 @@rdf:type:
125 @@@@: DISLang|Class
126 @@@ForCheck: ManakaiDOM|ForClass
127
128 ElementTypeBinding:
129 @Name: ClsQName
130 @ElementType:
131 dis:QName
132 @ShadowContent:
133 @@ForCheck: ManakaiDOM|ForClass
134
135 ElementTypeBinding:
136 @Name: ClsISA
137 @ElementType:
138 dis:ISA
139 @ShadowContent:
140 @@ForCheck: ManakaiDOM|ForClass
141
142 ElementTypeBinding:
143 @Name: nullCase
144 @ElementType:
145 dis:ResourceDef
146 @ShadowContent:
147 @@rdf:type: ManakaiDOM|InCase
148 @@Value:
149 @@@is-null:1
150
151 ResourceDef:
152 @QName: LSParser
153 @AliasFor: DOMLS|LSParser
154 @For: ManakaiDOM|DOM3
155
156 ClsDef:
157 @ClsQName: ManakaiXMLParser
158
159 @Implement: DOMLS|LSParser
160
161 @f:implements:
162 @@@: DOMLS|LSFeature30
163 @@For: ManakaiDOM|DOM3
164
165 @DISLang:role: DOMLS|ParserRole
166
167 @Attr:
168 @@Name: domConfig
169 @@enDesc:
170 The configuration of the parser.
171
172 @@Get:
173 @@@Type: DOMCore|DOMConfiguration
174 @@@enDesc: The DOM configuration object.
175 @@@PerlDef:
176 __CODE{DOMCore|getConfigObject::
177 $target => $self,
178 $targetHash => $self,
179 $targetType => {<IFName::LSParser>},
180 $result => $r,
181 }__;
182
183 @Method:
184 @@ManakaiDOM:isForInternal:1
185 @@ForCheck: ManakaiDOM|ForClass
186 @@Operator: DISPerl|NewMethod
187 @@enDesc:
188 Creates a new instance of the object.
189 @@Param:
190 @@@Name: impl
191 @@@Type: DOMLS|GLSImplementation
192 @@@enDesc:
193 The implementation from which the parser is created.
194 @@Param:
195 @@@Name: features
196 @@@Type: DOMString
197 @@@dis:actualType: f|FeaturesString
198 @@@enDesc:
199 The set of features requested for the parser.
200 @@Return:
201 @@@Type: DOMMain|DOMObject
202 @@@dis:actualType: LSParser
203 @@@enDesc:
204 The newly created parser.
205 @@@PerlDef:
206 $r = bless {
207 <H::DOMCore:implementation> => $impl,
208 }, $self;
209
210 @Method:
211 @@Name: parseString
212 @@enImplNote:
213 Non-standard - to be removed
214
215 @@Param:
216 @@@Name: sourceText
217 @@@Type: DOMString
218 @@Return:
219 @@@Type: Document
220 @@@PerlDef:
221
222 $self->{char} = [];
223 $self->{token} = [];
224 $self->{source} = $sourceText;
225
226 __DEEP{
227 $r = $self->_parse_DocumentEntity
228 ($self->{<H::DOMCore:implementation>});
229 }__;
230
231 @Method:
232 @@Name: shiftChar
233 @@ManakaiDOM:isForInternal:1
234 @@ForCheck: ManakaiDOM|ForClass
235 @@enDesc:
236 Returns the next character.
237 @@Return:
238 @@@Type: idl|long||ManakaiDOM|all
239 @@@enDesc:
240 The code position number of the next character, if any,
241 or <CODE::-1> if the end of the source text has been reached.
242 @@@PerlDef:
243 if (@{$self->{char}}) {
244 $r = shift @{$self->{char}};
245 } else {
246 my $char = substr ($self->{source}, pos ($self->{source}), 1);
247 pos ($self->{source})++;
248
249 if (length $char) {
250 $r = ord $char;
251 } else {
252 $r = -1;
253 }
254 }
255
256 @Method:
257 @@ManakaiDOM:isForInternal: 1
258 @@Operator: ManakaiDOM|MUErrorHandler
259 @@enDesc:
260 When a <IF::ecore|ErrorInterface||ManakaiDOM|Perl> is <Perl::report>ed,
261 then this method is invoked.
262
263 The method calls the <cfg::DOMCore|error-handler> if the error is of
264 <IF::DOMCore|DOMError>. Otherwise, the error is re-thrown so that
265 the corresponding <Perl::catch> clause, if any, can catch the error.
266 @@Param:
267 @@@Name: err
268 @@@Type: ecore|ErrorInterface||ManakaiDOM|Perl
269 @@@enDesc:
270 The reported error object.
271 @@Return:
272 @@@Type: DISPerl|Any
273 @@@enDesc:
274 If the <P::err> is a <IF::DOMCore|DOMError>, then the return value
275 of the error handler.
276
277 {NOTE:: If the error is thrown, the method never returns.
278 }
279 @@@nullCase:
280 @@@@enDesc:
281 No error handler.
282 @@@PerlDef:
283 if ($err->isa (<IFName::DOMCore|DOMError||ManakaiDOM|ManakaiDOM>)) {
284 __DEEP{
285 A: {
286 my $cfg = $self-><AG::LSParser.domConfig>;
287 my $h = $cfg-><M::DOMCore|DOMConfiguration.getParameter>
288 ('error-handler');
289 $r = $h-><M::DOMCore|DOMErrorHandler.handleError> ($err);
290 } # A
291 }__;
292 } else {
293 $err-><M::ecore|ErrorInterface||ManakaiDOM|Perl.throw>;
294 }
295
296 @DISPerl:dpgDef:
297
298 /*
299 XML Document Entity
300
301 document := prolog element *Misc
302 - *Char RestrictedChar *Char ;; [1]
303 */
304 rule DocumentEntity ($impl) : standalone {
305 my $doc : return;
306
307 lang:Perl {
308 $doc = $impl-><M::DOMImpl.createDocument>;
309 $doc-><AS::Document.strictErrorChecking> (false);
310 }
311
312 /*
313 prolog := XMLDecl? *Misc [doctypedecl *Misc] ;; [22]
314 */
315 ?lexmode 'DocumentStart';
316
317 &XMLDeclarationOpt ($doc => $doc);
318
319 ?lexmode 'DocumentProlog';
320
321 // *Misc
322 ~* (CDO) {
323 &_CommentDeclaration_ ($doc => $doc, $parent => $doc);
324
325 ~ (MDC) {
326 ?lexmode DocumentProlog;
327 } else {
328 ?lexmode DocumentProlog;
329 }
330 } (PIO) {
331 &_ProcessingInstruction_ ($doc => $doc, $parent => $doc);
332
333 ~ (PIC) {
334 ?lexmode 'DocumentProlog';
335 } else {
336 ?lexmode DocumentProlog;
337 }
338 } (S) {
339 //
340 }
341
342 // doctypedecl
343 ~? (MDO) {
344 &_DocumentTypeDeclaration_ ($doc => $doc);
345
346 ~ (MDC) { }
347 }
348
349 ?lexmode 'DocumentMisc';
350
351 // *Misc
352 ~* (CDO) {
353 &_CommentDeclaration_ ($doc => $doc, $parent => $doc);
354
355 ~ (MDC) {
356 ?lexmode DocumentMisc;
357 } else {
358 ?lexmode DocumentMisc;
359 }
360 } (PIO) {
361 &_ProcessingInstruction_ ($doc => $doc, $parent => $doc);
362
363 ~ (PIC) {
364 ?lexmode 'DocumentMisc';
365 } else {
366 ?lexmode DocumentMisc;
367 }
368 } (S) {
369 //
370 }
371
372 // Document element
373 ~ (STAGO) {
374 &Element_ ($doc => $doc, $parent => $doc)
375 : unshift-current-token;
376 ~ (TAGC) {
377 ?lexmode DocumentEnd;
378 } else {
379 ?lexmode DocumentEnd;
380 }
381 } else {
382 ?lexmode 'DocumentEnd';
383 }
384
385 // *Misc
386 ~* (CDO) {
387 &_CommentDeclaration_ ($doc => $doc, $parent => $doc);
388
389 ~ (MDC) {
390 ?lexmode DocumentEnd;
391 } else {
392 ?lexmode DocumentEnd;
393 }
394 } (PIO) {
395 &_ProcessingInstruction_ ($doc => $doc, $parent => $doc);
396 ~ (PIC) {
397 ?lexmode 'DocumentEnd';
398 } else {
399 ?lexmode DocumentEnd;
400 }
401 } (S) {
402 //
403 }
404
405 ~ (#EOF) { }
406
407 lang:Perl {
408 if ($self->{has_error}) {
409 __EXCEPTION{DOMLS|PARSE_ERR::
410 }__;
411 }
412
413 $doc-><AS::Document.strictErrorChecking> (true);
414 }
415 } // DocumentEntity
416
417 /*
418 XML Declaration
419
420 XMLDecl := '<?xml' VersionInfo
421 [EncodingDecl]
422 [SDDecl]
423 [S] '?>' ;; [23]
424
425 NOTE: XML declaration is optional in XML 1.0
426 while it is required in XML 1.1.
427 */
428
429 rule XMLDeclarationOpt ($doc) {
430 ~? (XDO) {
431 &_XMLDeclaration ($doc => $doc);
432 }
433 } // XMLDeclarationOpt
434
435 rule XMLDeclaration ($doc) {
436 ~ (XDO) {
437 &_XMLDeclaration ($doc => $doc);
438 }
439 } // XMLDeclaration
440
441 rule _XMLDeclaration ($doc) {
442 ?lexmode 'TAG';
443
444 // TODO: implement this
445
446 } // _XMLDeclaration
447
448 /*
449 Document Type Declaration
450 */
451 rule _DocumentTypeDeclaration_ ($doc) {
452 ?lexmode 'DocumentTypeDeclaration';
453
454 ~ (Name == 'DOCTYPE') { }
455
456 ~ (S) { }
457
458 // Document type name
459 ~ (Name) {
460
461 }
462
463 // TODO: Implement this
464
465 // ~ (MDC) { }
466 } // _DocumentTypeDeclaration_
467
468 /*
469 Comment Declaration
470
471 Comment := '<!--' *(Char - '-' / '-' (Char - '-'))
472 '-->' ;; [15]
473 */
474 rule _CommentDeclaration_ ($doc, $parent) {
475 ?lexmode 'CommentDeclaration';
476
477 ~? (STRING) {
478 lang:Perl ($data => $token.value) {
479 my $com = $doc-><M::Document.createComment> ($data);
480 $parent-><M::Node.appendChild> ($com);
481 }
482 } else {
483 lang:Perl {
484 my $com = $doc-><M::Document.createComment> ('');
485 $parent-><M::Node.appendChild> ($com);
486 }
487 }
488
489 ~ (COM) {
490 ?lexmode MarkupDeclaration;
491 } else {
492 ?lexmode MarkupDeclaration;
493 }
494
495 // ~ (MDC) { }
496 } // _CommentDeclaration_
497
498 /*
499 Processing Instruction
500
501 PI := '<?' PITarget [S *Char - *Char '?>' *Char]
502 '?>' ;; [16]
503 */
504 rule _ProcessingInstruction_ ($doc, $parent) {
505 ?lexmode 'PIName';
506
507 my $pi;
508
509 ~ (Name) {
510 lang:Perl ($name => $token.value) {
511 if (lc $name eq 'xml') {
512 ## TODO: Well-formedness (syntax) error
513 }
514 ## TODO: Namespace well-formedness
515 $pi = $doc-><M::Document.createProcessingInstruction>
516 ($name);
517 }
518 }
519
520 ~ (S) {
521 ?lexmode 'PIData';
522
523 my $tdata;
524
525 ~? (DATA) {
526 lang:Perl ($data => $token.value) {
527 $tdata = $data;
528 }
529 } else {
530 lang:Perl {
531 $tdata = '';
532 }
533 }
534
535 lang:Perl {
536 $pi-><AS::Node.nodeValue> ($tdata);
537 }
538 }
539
540 lang:Perl {
541 $parent-><M::Node.appendChild> ($pi);
542 ## TODO: PIs in document type declaration subsets
543 }
544
545 // ~ (PIC) { }
546 } // _ProcessingInstruction_
547
548 /*
549 Element content parsing mode
550
551 element := EmptyElemTag /
552 STag content ETag ;; [39]
553 content := (CharData / element / Reference / CDSect /
554 PI / Comment) ;; [43]
555 */
556 rule Element_ ($doc, $parent) : standalone {
557 ?lexmode 'ElementContent';
558
559 my $node; // Current "parent" node
560 my $nodes; // Node stack (w/o $node)
561 my $type; // Current "parent" element type QName
562 my $types; // Element type stack (w/o $type)
563 my $ns; // Current in-scope namespace bindings
564 my $nses; // Namespace binding stack (w/o $ns)
565
566 lang:Perl {
567 $node = $parent;
568 $nodes = [];
569 $type = '';
570 $types = [];
571 $ns = {
572 xml => <Q::xml:>,
573 xmlns => <Q::xmlns:>,
574 };
575 $nses = [];
576 }
577
578 ~* : name => CONTENT
579 (CharData) {
580 // Character data
581 lang:Perl ($data => $token.value) {
582 if (index ($data, ']]>') > -1) {
583 ## TODO: Well-formedness (syntax) error
584 }
585 $node-><M::Node.appendChild>
586 ($doc-><M::Document.createTextNode> ($data));
587 }
588 } (STAGO) {
589 // Start tag or empty element tag
590
591 ?lexmode 'StartTag';
592
593 ~ (Name) {
594 my $attrs;
595 lang:Perl ($name => $token.value) {
596 push @{$types}, $type;
597 $type = $name;
598 $attrs = {};
599 }
600
601 ~? (S) {
602 &AttributeSpecificationList
603 ($doc => $doc, $attrs => $attrs);
604 }
605
606 my $el;
607
608 lang:Perl {
609 push @{$nses}, $ns;
610 $ns = {%$ns};
611
612 my %gattr;
613 my %lattr;
614 for my $atqname (keys %$attrs) {
615 my ($pfx, $lname) = split /:/, $atqname;
616 if (defined $lname) { ## Global attribute
617 ## TODO: Namespace well-formedness (lname is NCName)
618 if ($pfx eq 'xmlns') {
619 my $nsuri = $attrs->{$atqname}->{value};
620 if ($lname eq 'xml' and
621 $nsuri ne <Q::xml:>) {
622 ## TODO: error
623 } elsif ($lname eq 'xmlns') {
624 ## TODO: error
625 }
626 if ($nsuri eq '') {
627 ## TODO: error in XML 1.0
628 } elsif ($nsuri eq <Q::xml:> and
629 $lname ne 'xml') {
630 ## TODO: error
631 } elsif ($nsuri eq <Q::xmlns:>) {
632 ## TODO: error
633 }
634 $ns->{$lname} = $attrs->{$atqname}->{value};
635 delete $ns->{$lname} unless length $ns->{$lname};
636 } elsif ($pfx eq '') {
637 ## TODO: pfx is not NCName error
638 } else {
639 if ($gattr{$pfx}->{$lname}) {
640 ## TODO: Namespace well-formedness error
641 }
642 }
643 $gattr{$pfx}->{$lname} = $attrs->{$atqname};
644 } else { ## Local attribute
645 if ($pfx eq 'xmlns') {
646 $ns->{''} = $attrs->{xmlns}->{value};
647 delete $ns->{''} unless length $ns->{''};
648 } else {
649 $lattr{$pfx} = $attrs->{$atqname};
650 }
651 }
652 }
653
654 my ($pfx, $lname) = split /:/, $type;
655 my $nsuri;
656 ## TODO: lname is NCName?
657 if (defined $lname) { ## Prefixed namespace
658 if ($pfx eq '') {
659 ## TODO: pfx is not NCName error
660 }
661 if (defined $ns->{$pfx}) {
662 $nsuri = $ns->{$pfx};
663 } else {
664 ## TODO: namespace ill-formed
665 }
666 } else { ## Default namespace
667 $nsuri = $ns->{''};
668 }
669
670 $el = $doc-><M::Document.createElementNS>
671 ($nsuri, $type);
672
673 if ($attrs->{xmlns}) {
674 my $attr = $doc-><M::Document.createAttributeNS>
675 (<Q::xmlns:>, 'xmlns');
676 for (@{$attrs->{xmlns}->{nodes}}) {
677 $attr-><M::Node.appendChild> ($_);
678 }
679 $el-><M::Element.setAttributeNodeNS> ($attr);
680 }
681
682 for my $lname (keys %lattr) {
683 my $attr = $doc-><M::Document.createAttributeNS>
684 (null, $lname);
685 for (@{$lattr{$lname}->{nodes}}) {
686 $attr-><M::Node.appendChild> ($_);
687 }
688 $el-><M::Element.setAttributeNodeNS> ($attr);
689 }
690
691 for my $pfx (keys %gattr) {
692 for my $lname (keys %{$gattr{$pfx}}) {
693 my $attr = $doc-><M::Document.createAttributeNS>
694 ($ns->{$pfx}, $pfx.':'.$lname);
695 for (@{$gattr{$pfx}->{$lname}->{nodes}}) {
696 $attr-><M::Node.appendChild> ($_);
697 }
698 $el-><M::Element.setAttributeNodeNS> ($attr);
699 }
700 }
701
702 $node-><M::Node.appendChild> ($el);
703 }
704
705 ~ (TAGC) {
706 lang:Perl {
707 push @{$nodes}, $node;
708 $node = $el;
709 }
710 ?lexmode ElementContent;
711 } (MTAGC) {
712 lang:Perl {
713 $ns = pop @{$nses};
714 $type = pop @{$types};
715 }
716 ?lexmode ElementContent;
717 } else {
718 ?lexmode ElementContent;
719 }
720 } else {
721 ?lexmode ElementContent;
722 }
723
724 } (ETAGO) {
725 // End tag
726
727 ?lexmode 'EndTag';
728
729 my $is_docel;
730
731 ~ (Name) {
732 lang:Perl ($name => $token.value) {
733 if ($name eq $type) {
734 $type = pop @{$types};
735 if ($type eq '') {
736 $is_docel = true;
737 }
738 $node = pop @{$nodes};
739 $ns = pop @{$nses};
740 } else {
741 ## TODO: Element type match well-formedness error
742 }
743 }
744 }
745
746 ~? (S) { }
747
748 if-true ($is_docel) {
749 lang:Perl {
750 if (@{$types}) {
751 ## WF error
752 }
753 }
754 return;
755 }
756
757 ~ (TAGC) {
758 ?lexmode ElementContent;
759 } else {
760 ?lexmode 'ElementContent';
761 }
762
763 } (HCRO) {
764 &_HexadecimalCharacterReference_
765 ($doc => $doc, $parent => $node);
766
767 ~ (REFC) {
768 ?lexmode 'ElementContent';
769 } else {
770 ?lexmode ElementContent;
771 }
772 } (CRO) {
773 &_NumericCharacterReference_
774 ($doc => $doc, $parent => $node);
775
776 ~ (REFC) {
777 ?lexmode 'ElementContent';
778 } else {
779 ?lexmode ElementContent;
780 }
781 } (ERO) {
782 &_GeneralEntityReference_
783 ($doc => $doc, $parent => $node);
784
785 ~ (REFC) {
786 ?lexmode 'ElementContent';
787 } else {
788 ?lexmode ElementContent;
789 }
790 } (CDO) {
791 &_CommentDeclaration_ ($doc => $doc, $parent => $node);
792
793 ~ (MDC) {
794 ?lexmode ElementContent;
795 } else {
796 ?lexmode ElementContent;
797 }
798 } (CDSO) {
799 &_CDATASection_ ($doc => $doc, $parent => $node);
800
801 ~ (MSE) {
802 ?lexmode 'ElementContent';
803 } else {
804 ?lexmode ElementContent;
805 }
806 } (PIO) {
807 &_ProcessingInstruction_ ($doc => $doc, $parent => $node);
808
809 ~ (PIC) {
810 ?lexmode 'ElementContent';
811 } else {
812 ?lexmode ElementContent;
813 }
814 }
815 } // Element_
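/*
  Editorial sketch (not part of XMLParser.dis): the start-tag handling in
  Element_ copies the in-scope namespace map, applies any xmlns and
  xmlns:prefix attributes, and then resolves the element type name against
  the result. The Python below restates that idea in isolation. The
  function names push_scope and resolve_element are hypothetical, the
  error cases marked TODO in the rule are not handled, and QNames are
  split at the first colon only.

```python
XML_NS = "http://www.w3.org/XML/1998/namespace"
XMLNS_NS = "http://www.w3.org/2000/xmlns/"

# Initial bindings, mirroring the `$ns` hash set up at the top of Element_.
ROOT_SCOPE = {"xml": XML_NS, "xmlns": XMLNS_NS}


def push_scope(parent_scope, attrs):
    """Return a new prefix -> namespace URI map for one element.

    `attrs` maps attribute QNames to their (already normalized) values.
    The parent map is copied so that popping the scope is just a matter
    of going back to the parent map.
    """
    scope = dict(parent_scope)
    for qname, value in attrs.items():
        prefix, sep, local = qname.partition(":")
        if sep and prefix == "xmlns":
            key = local              # xmlns:foo="..."
        elif not sep and qname == "xmlns":
            key = ""                 # default namespace declaration
        else:
            continue                 # ordinary attribute
        if value:
            scope[key] = value
        else:
            scope.pop(key, None)     # an empty value removes the binding
    return scope


def resolve_element(qname, scope):
    """Return the namespace URI for an element QName, or None if unbound."""
    prefix, sep, _local = qname.partition(":")
    return scope.get(prefix) if sep else scope.get("")
```

  For example, push_scope(ROOT_SCOPE, {"xmlns:p": "urn:b"}) yields a map in
  which resolve_element("p:e", ...) gives "urn:b", while an unprefixed name
  resolves to None until a default namespace is declared.
*/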
816
817 rule AttributeSpecificationList ($doc, $attrs)
818 : standalone
819 {
820 ?lexmode 'StartTag';
821
822 my $i;
823 lang:Perl {
824 $i = 0;
825 }
826
827 ~* (Name) {
828 my $atqname;
829 lang:Perl ($name => $token.value) {
830 $atqname = $name;
831 }
832
833 ~? (S) { }
834 ~ (VI) { }
835 ~? (S) { }
836
837 my $vals;
838 lang:Perl {
839 if ($attrs->{$atqname}) {
840 ## TODO: Unique attr name well-formedness error
841 }
842
843 $vals = $attrs->{$atqname} = {
844 nodes => [],
845 value => '',
846 index => $i++,
847 };
848 }
849
850 ~ (LIT) {
851 &_AttributeValueSpecification_
852 ($doc => $doc, $vals => $vals);
853
854 ~ (LIT) {
855 ?lexmode StartTag;
856 } else {
857 ?lexmode StartTag;
858 }
859 } (LITA) {
860 &_AttributeValueSpecificationA_
861 ($doc => $doc, $vals => $vals);
862
863 ~ (LITA) {
864 ?lexmode StartTag;
865 } else {
866 ?lexmode StartTag;
867 }
868 }
869 } (S) : separator : terminator? { }
870 } // AttributeSpecificationList
871
872 rule _AttributeValueSpecification_ ($doc, $vals) {
873 // ~ (LIT) { }
874 ?lexmode 'AttributeValueLiteral';
875
876 ~* (STRING) {
877 lang:Perl ($value => $token.value) {
878 $value =~ s/[\x09\x0A\x0D]/ /g;
879 my $text = $doc-><M::Document.createTextNode> ($value);
880 push @{$vals->{nodes}}, $text;
881 $vals->{value} .= $value;
882 }
883 } (HCRO) {
884 &_HexadecimalCharacterReferenceV_
885 ($doc => $doc, $vals => $vals);
886
887 ~ (REFC) {
888 ?lexmode AttributeValueLiteral;
889 } else {
890 ?lexmode AttributeValueLiteral;
891 }
892 } (CRO) {
893 &_NumericCharacterReferenceV_
894 ($doc => $doc, $vals => $vals);
895
896 ~ (REFC) {
897 ?lexmode AttributeValueLiteral;
898 } else {
899 ?lexmode AttributeValueLiteral;
900 }
901 } (ERO) {
902 // TODO: Attribute value normalization
903 &_GeneralEntityReferenceV_
904 ($doc => $doc, $vals => $vals);
905
906 ~ (REFC) {
907 ?lexmode AttributeValueLiteral;
908 } else {
909 ?lexmode AttributeValueLiteral;
910 }
911 }
912
913 // ~ (LIT) { } (LITA) { }
914 } // _AttributeValueSpecification_
915
916 rule _AttributeValueSpecificationA_ ($doc, $vals) {
917 // ~ (LITA) { }
918 ?lexmode 'AttributeValueLiteralA';
919
920 ~* (STRING) {
921 lang:Perl ($value => $token.value) {
922 $value =~ s/[\x09\x0A\x0D]/ /g;
923 my $text = $doc-><M::Document.createTextNode> ($value);
924 push @{$vals->{nodes}}, $text;
925 $vals->{value} .= $value;
926 }
927 } (HCRO) {
928 &_HexadecimalCharacterReferenceV_
929 ($doc => $doc, $vals => $vals);
930
931 ~ (REFC) {
932 ?lexmode AttributeValueLiteralA;
933 } else {
934 ?lexmode AttributeValueLiteralA;
935 }
936 } (CRO) {
937 &_NumericCharacterReferenceV_
938 ($doc => $doc, $vals => $vals);
939
940 ~ (REFC) {
941 ?lexmode AttributeValueLiteralA;
942 } else {
943 ?lexmode AttributeValueLiteralA;
944 }
945 } (ERO) {
946 // TODO: Attribute value normalization
947 &_GeneralEntityReferenceV_
948 ($doc => $doc, $vals => $vals);
949
950 ~ (REFC) {
951 ?lexmode AttributeValueLiteralA;
952 } else {
953 ?lexmode AttributeValueLiteralA;
954 }
955 }
956
957 // ~ (LITA) { }
958 } // _AttributeValueSpecificationA_
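/*
  Editorial sketch (not part of XMLParser.dis): both attribute value rules
  replace each TAB, LF, and CR in a literal STRING chunk with a single
  space before appending it to the accumulated value. A minimal Python
  equivalent of that one substitution follows. The function name is
  hypothetical, and the further normalization steps the rules still mark
  as TODO are not covered.

```python
import re

# Same character class as the Perl substitution s/[\x09\x0A\x0D]/ /g.
_LITERAL_WHITESPACE = re.compile(r"[\x09\x0A\x0D]")


def normalize_literal_chunk(chunk: str) -> str:
    """Replace every TAB, LF, or CR with one U+0020 SPACE."""
    return _LITERAL_WHITESPACE.sub(" ", chunk)
```

  Each matched character is replaced individually, so a CR LF pair in the
  chunk becomes two spaces, exactly as the Perl substitution would do.
*/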
959
960 /*
961 CDATA Section Content Parsing Mode
962 */
963 rule _CDATASection_ ($doc, $parent) {
964 ?lexmode 'CDATASectionContent';
965
966 my $cdata;
967
968 ~? (CData) {
969 lang:Perl ($data => $token.value) {
970 $cdata = $data;
971 }
972 } else {
973 lang:Perl {
974 $cdata = '';
975 }
976 }
977
978 lang:Perl {
979 my $cdsect = $doc-><M::Document.createCDATASection>
980 ($cdata);
981 $parent-><M::Node.appendChild> ($cdsect);
982 }
983
984 // ~ (MSE) { }
985 } // _CDATASection_
986
987 rule _NumericCharacterReference_ ($doc, $parent) {
988 ?lexmode 'NumericCharacterReference';
989
990 ~ (NUMBER) {
991 lang:Perl ($num => $token.value) {
992 ## TODO: [WFC: Legal Character]
993 my $ncr = $doc-><M::Document.createTextNode>
994 (chr (0+$num));
995 $parent-><M::Node.appendChild> ($ncr);
996 }
997 }
998
999 // ~ (REFC) { }
1000 } // _NumericCharacterReference_
1001
1002 rule _NumericCharacterReferenceV_ ($doc, $vals) {
1003 ?lexmode 'NumericCharacterReference';
1004
1005 ~ (NUMBER) {
1006 lang:Perl ($num => $token.value) {
1007 ## TODO: [WFC: Legal Character]
1008 my $ncr = $doc-><M::Document.createTextNode>
1009 (my $char = chr (0+$num));
1010 push @{$vals->{nodes}}, $ncr;
1011 $vals->{value} .= $char;
1012 }
1013 }
1014
1015 // ~ (REFC) { }
1016 } // _NumericCharacterReferenceV_
1017
1018 rule _HexadecimalCharacterReference_ ($doc, $parent) {
1019 ?lexmode 'HexadecimalCharacterReference';
1020
1021 ~ (Hex) {
1022 lang:Perl ($num => $token.value) {
1023 ## TODO: [WFC: Legal Character]
1024 my $ncr = $doc-><M::Document.createTextNode>
1025 (chr hex $num);
1026 $parent-><M::Node.appendChild> ($ncr);
1027 }
1028 }
1029
1030 // ~ (REFC) { }
1031 } // _HexadecimalCharacterReference_
1032
1033 rule _HexadecimalCharacterReferenceV_ ($doc, $vals) {
1034 ?lexmode 'HexadecimalCharacterReference';
1035
1036 ~ (Hex) {
1037 lang:Perl ($num => $token.value) {
1038 ## TODO: [WFC: Legal Character]
1039 my $ncr = $doc-><M::Document.createTextNode>
1040 (my $char = chr hex $num);
1041 push @{$vals->{nodes}}, $ncr;
1042 $vals->{value} .= $char;
1043 }
1044 }
1045
1046 // ~ (REFC) { }
1047 } // _HexadecimalCharacterReferenceV_
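/*
  Editorial sketch (not part of XMLParser.dis): the four character
  reference rules above turn a NUMBER or Hex token into a one-character
  string (chr (0+$num) and chr hex $num in Perl). The Python below shows
  the same conversions, together with one possible shape for the
  "Legal Character" check that the rules leave as TODO. The function names
  are hypothetical, and the check uses the XML 1.0 Char production, which
  is an assumption about which XML version would be enforced.

```python
def decode_decimal_reference(digits: str) -> str:
    """Decode the digits of &#NNN; into the referenced character."""
    return chr(int(digits, 10))


def decode_hex_reference(hex_digits: str) -> str:
    """Decode the hex digits of &#xHHH; into the referenced character."""
    return chr(int(hex_digits, 16))


def is_xml10_char(code_point: int) -> bool:
    """Return True if code_point matches the XML 1.0 Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )
```

  A caller would typically test is_xml10_char(int(digits)) before decoding
  and report a well-formedness error when it returns False.
*/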
1048
1049 rule _GeneralEntityReference_ ($doc, $parent) {
1050 // TODO: Expansion
1051 ?lexmode 'EntityReference';
1052
1053 ~ (Name) {
1054 lang:Perl ($name => $token.value) {
1055 ## TODO: Namespace well-formedness
1056 ## TODO: Entity declared constraints
1057 my $er = $doc-><M::Document.createEntityReference>
1058 ($name);
1059 $parent-><M::Node.appendChild> ($er);
1060 }
1061 }
1062
1063 // ~ (REFC) { }
1064 } // _GeneralEntityReference_
1065
1066 rule _GeneralEntityReferenceV_ ($doc, $vals) {
1067 // TODO: Expansion
1068 ?lexmode 'EntityReference';
1069
1070 ~ (Name) {
1071 lang:Perl ($name => $token.value) {
1072 ## TODO: Namespace well-formedness
1073 ## TODO: Entity declared constraints
1074 my $er = $doc-><M::Document.createEntityReference>
1075 ($name);
1076 push @{$vals->{nodes}}, $er;
1077 }
1078 }
1079
1080 // ~ (REFC) { }
1081 } // _GeneralEntityReferenceV_
1082
1083
1084 /*
1085 XML Name
1086 */
1087 lexmode Name {
1088 $NameStartChar10 := [
1089 '_' ':'
1090 // Letter
1091 // BaseChar
1092 U+0041..U+005A U+0061..U+007A U+00C0..U+00D6
1093 U+00D8..U+00F6 U+00F8..U+00FF U+0100..U+0131
1094 U+0134..U+013E U+0141..U+0148 U+014A..U+017E
1095 U+0180..U+01C3 U+01CD..U+01F0 U+01F4..U+01F5
1096 U+01FA..U+0217 U+0250..U+02A8 U+02BB..U+02C1
1097 U+0386 U+0388..U+038A U+038C U+038E..U+03A1
1098 U+03A3..U+03CE U+03D0..U+03D6 U+03DA U+03DC
1099 U+03DE U+03E0 U+03E2..U+03F3 U+0401..U+040C
1100 U+040E..U+044F U+0451..U+045C U+045E..U+0481
1101 U+0490..U+04C4 U+04C7..U+04C8 U+04CB..U+04CC
1102 U+04D0..U+04EB U+04EE..U+04F5 U+04F8..U+04F9
1103 U+0531..U+0556 U+0559 U+0561..U+0586
1104 U+05D0..U+05EA U+05F0..U+05F2 U+0621..U+063A
1105 U+0641..U+064A U+0671..U+06B7 U+06BA..U+06BE
1106 U+06C0..U+06CE U+06D0..U+06D3 U+06D5
1107 U+06E5..U+06E6 U+0905..U+0939 U+093D
1108 U+0958..U+0961 U+0985..U+098C U+098F..U+0990
1109 U+0993..U+09A8 U+09AA..U+09B0 U+09B2
1110 U+09B6..U+09B9 U+09DC..U+09DD U+09DF..U+09E1
1111 U+09F0..U+09F1 U+0A05..U+0A0A U+0A0F..U+0A10
1112 U+0A13..U+0A28 U+0A2A..U+0A30 U+0A32..U+0A33
1113 U+0A35..U+0A36 U+0A38..U+0A39 U+0A59..U+0A5C
1114 U+0A5E U+0A72..U+0A74 U+0A85..U+0A8B U+0A8D
1115 U+0A8F..U+0A91 U+0A93..U+0AA8 U+0AAA..U+0AB0
1116 U+0AB2..U+0AB3 U+0AB5..U+0AB9 U+0ABD U+0AE0
1117 U+0B05..U+0B0C U+0B0F..U+0B10 U+0B13..U+0B28
1118 U+0B2A..U+0B30 U+0B32..U+0B33 U+0B36..U+0B39
1119 U+0B3D U+0B5C..U+0B5D U+0B5F..U+0B61
1120 U+0B85..U+0B8A U+0B8E..U+0B90 U+0B92..U+0B95
1121 U+0B99..U+0B9A U+0B9C U+0B9E..U+0B9F
1122 U+0BA3..U+0BA4 U+0BA8..U+0BAA U+0BAE..U+0BB5
1123 U+0BB7..U+0BB9 U+0C05..U+0C0C U+0C0E..U+0C10
1124 U+0C12..U+0C28 U+0C2A..U+0C33 U+0C35..U+0C39
1125 U+0C60..U+0C61 U+0C85..U+0C8C U+0C8E..U+0C90
1126 U+0C92..U+0CA8 U+0CAA..U+0CB3 U+0CB5..U+0CB9
1127 U+0CDE U+0CE0..U+0CE1 U+0D05..U+0D0C
1128 U+0D0E..U+0D10 U+0D12..U+0D28 U+0D2A..U+0D39
1129 U+0D60..U+0D61 U+0E01..U+0E2E U+0E30
1130 U+0E32..U+0E33 U+0E40..U+0E45 U+0E81..U+0E82
1131 U+0E84 U+0E87..U+0E88 U+0E8A U+0E8D
1132 U+0E94..U+0E97 U+0E99..U+0E9F U+0EA1..U+0EA3
1133 U+0EA5 U+0EA7 U+0EAA..U+0EAB U+0EAD..U+0EAE
1134 U+0EB0 U+0EB2..U+0EB3 U+0EBD U+0EC0..U+0EC4
1135 U+0F40..U+0F47 U+0F49..U+0F69 U+10A0..U+10C5
1136 U+10D0..U+10F6 U+1100 U+1102..U+1103
1137 U+1105..U+1107 U+1109 U+110B..U+110C
1138 U+110E..U+1112 U+113C U+113E U+1140 U+114C
1139 U+114E U+1150 U+1154..U+1155 U+1159
1140 U+115F..U+1161 U+1163 U+1165 U+1167 U+1169
1141 U+116D..U+116E U+1172..U+1173 U+1175 U+119E
1142 U+11A8 U+11AB U+11AE..U+11AF U+11B7..U+11B8
1143 U+11BA U+11BC..U+11C2 U+11EB U+11F0 U+11F9
1144 U+1E00..U+1E9B U+1EA0..U+1EF9 U+1F00..U+1F15
1145 U+1F18..U+1F1D U+1F20..U+1F45 U+1F48..U+1F4D
1146 U+1F50..U+1F57 U+1F59 U+1F5B U+1F5D
1147 U+1F5F..U+1F7D U+1F80..U+1FB4 U+1FB6..U+1FBC
1148 U+1FBE U+1FC2..U+1FC4 U+1FC6..U+1FCC
1149 U+1FD0..U+1FD3 U+1FD6..U+1FDB U+1FE0..U+1FEC
1150 U+1FF2..U+1FF4 U+1FF6..U+1FFC U+2126
1151 U+212A..U+212B U+212E U+2180..U+2182
1152 U+3041..U+3094 U+30A1..U+30FA U+3105..U+312C
1153 U+AC00..U+D7A3
1154 // Ideographic
1155 U+4E00..U+9FA5 U+3007 U+3021..U+3029
1156 ];
1157 $NameChar10 := [
1158 '.' '-' '_' ':'
1159 // Letter
1160 // BaseChar
1161 U+0041..U+005A U+0061..U+007A U+00C0..U+00D6
1162 U+00D8..U+00F6 U+00F8..U+00FF U+0100..U+0131
1163 U+0134..U+013E U+0141..U+0148 U+014A..U+017E
1164 U+0180..U+01C3 U+01CD..U+01F0 U+01F4..U+01F5
1165 U+01FA..U+0217 U+0250..U+02A8 U+02BB..U+02C1
1166 U+0386 U+0388..U+038A U+038C U+038E..U+03A1
1167 U+03A3..U+03CE U+03D0..U+03D6 U+03DA U+03DC
1168 U+03DE U+03E0 U+03E2..U+03F3 U+0401..U+040C
1169 U+040E..U+044F U+0451..U+045C U+045E..U+0481
1170 U+0490..U+04C4 U+04C7..U+04C8 U+04CB..U+04CC
1171 U+04D0..U+04EB U+04EE..U+04F5 U+04F8..U+04F9
1172 U+0531..U+0556 U+0559 U+0561..U+0586
1173 U+05D0..U+05EA U+05F0..U+05F2 U+0621..U+063A
1174 U+0641..U+064A U+0671..U+06B7 U+06BA..U+06BE
1175 U+06C0..U+06CE U+06D0..U+06D3 U+06D5
1176 U+06E5..U+06E6 U+0905..U+0939 U+093D
1177 U+0958..U+0961 U+0985..U+098C U+098F..U+0990
1178 U+0993..U+09A8 U+09AA..U+09B0 U+09B2
1179 U+09B6..U+09B9 U+09DC..U+09DD U+09DF..U+09E1
1180 U+09F0..U+09F1 U+0A05..U+0A0A U+0A0F..U+0A10
1181 U+0A13..U+0A28 U+0A2A..U+0A30 U+0A32..U+0A33
1182 U+0A35..U+0A36 U+0A38..U+0A39 U+0A59..U+0A5C
1183 U+0A5E U+0A72..U+0A74 U+0A85..U+0A8B U+0A8D
1184 U+0A8F..U+0A91 U+0A93..U+0AA8 U+0AAA..U+0AB0
1185 U+0AB2..U+0AB3 U+0AB5..U+0AB9 U+0ABD U+0AE0
1186 U+0B05..U+0B0C U+0B0F..U+0B10 U+0B13..U+0B28
1187 U+0B2A..U+0B30 U+0B32..U+0B33 U+0B36..U+0B39
1188 U+0B3D U+0B5C..U+0B5D U+0B5F..U+0B61
1189 U+0B85..U+0B8A U+0B8E..U+0B90 U+0B92..U+0B95
1190 U+0B99..U+0B9A U+0B9C U+0B9E..U+0B9F
1191 U+0BA3..U+0BA4 U+0BA8..U+0BAA U+0BAE..U+0BB5
1192 U+0BB7..U+0BB9 U+0C05..U+0C0C U+0C0E..U+0C10
1193 U+0C12..U+0C28 U+0C2A..U+0C33 U+0C35..U+0C39
1194 U+0C60..U+0C61 U+0C85..U+0C8C U+0C8E..U+0C90
1195 U+0C92..U+0CA8 U+0CAA..U+0CB3 U+0CB5..U+0CB9
1196 U+0CDE U+0CE0..U+0CE1 U+0D05..U+0D0C
1197 U+0D0E..U+0D10 U+0D12..U+0D28 U+0D2A..U+0D39
1198 U+0D60..U+0D61 U+0E01..U+0E2E U+0E30
1199 U+0E32..U+0E33 U+0E40..U+0E45 U+0E81..U+0E82
1200 U+0E84 U+0E87..U+0E88 U+0E8A U+0E8D
1201 U+0E94..U+0E97 U+0E99..U+0E9F U+0EA1..U+0EA3
1202 U+0EA5 U+0EA7 U+0EAA..U+0EAB U+0EAD..U+0EAE
1203 U+0EB0 U+0EB2..U+0EB3 U+0EBD U+0EC0..U+0EC4
1204 U+0F40..U+0F47 U+0F49..U+0F69 U+10A0..U+10C5
1205 U+10D0..U+10F6 U+1100 U+1102..U+1103
1206 U+1105..U+1107 U+1109 U+110B..U+110C
1207 U+110E..U+1112 U+113C U+113E U+1140 U+114C
1208 U+114E U+1150 U+1154..U+1155 U+1159
1209 U+115F..U+1161 U+1163 U+1165 U+1167 U+1169
1210 U+116D..U+116E U+1172..U+1173 U+1175 U+119E
1211 U+11A8 U+11AB U+11AE..U+11AF U+11B7..U+11B8
1212 U+11BA U+11BC..U+11C2 U+11EB U+11F0 U+11F9
1213 U+1E00..U+1E9B U+1EA0..U+1EF9 U+1F00..U+1F15
1214 U+1F18..U+1F1D U+1F20..U+1F45 U+1F48..U+1F4D
1215 U+1F50..U+1F57 U+1F59 U+1F5B U+1F5D
1216 U+1F5F..U+1F7D U+1F80..U+1FB4 U+1FB6..U+1FBC
1217 U+1FBE U+1FC2..U+1FC4 U+1FC6..U+1FCC
1218 U+1FD0..U+1FD3 U+1FD6..U+1FDB U+1FE0..U+1FEC
1219 U+1FF2..U+1FF4 U+1FF6..U+1FFC U+2126
1220 U+212A..U+212B U+212E U+2180..U+2182
1221 U+3041..U+3094 U+30A1..U+30FA U+3105..U+312C
1222 U+AC00..U+D7A3
1223 // Ideographic
1224 U+4E00..U+9FA5 U+3007 U+3021..U+3029
1225 // Digit
1226 U+0030..U+0039 U+0660..U+0669 U+06F0..U+06F9
1227 U+0966..U+096F U+09E6..U+09EF U+0A66..U+0A6F
1228 U+0AE6..U+0AEF U+0B66..U+0B6F U+0BE7..U+0BEF
1229 U+0C66..U+0C6F U+0CE6..U+0CEF U+0D66..U+0D6F
1230 U+0E50..U+0E59 U+0ED0..U+0ED9 U+0F20..U+0F29
1231 // CombiningChar
1232 U+0300..U+0345 U+0360..U+0361 U+0483..U+0486
1233 U+0591..U+05A1 U+05A3..U+05B9 U+05BB..U+05BD
1234 U+05BF U+05C1..U+05C2 U+05C4 U+064B..U+0652
1235 U+0670 U+06D6..U+06DC U+06DD..U+06DF
1236 U+06E0..U+06E4 U+06E7..U+06E8 U+06EA..U+06ED
1237 U+0901..U+0903 U+093C U+093E..U+094C U+094D
1238 U+0951..U+0954 U+0962..U+0963 U+0981..U+0983
1239 U+09BC U+09BE U+09BF U+09C0..U+09C4
1240 U+09C7..U+09C8 U+09CB..U+09CD U+09D7
1241 U+09E2..U+09E3 U+0A02 U+0A3C U+0A3E U+0A3F
1242 U+0A40..U+0A42 U+0A47..U+0A48 U+0A4B..U+0A4D
1243 U+0A70..U+0A71 U+0A81..U+0A83 U+0ABC
1244 U+0ABE..U+0AC5 U+0AC7..U+0AC9 U+0ACB..U+0ACD
1245 U+0B01..U+0B03 U+0B3C U+0B3E..U+0B43
1246 U+0B47..U+0B48 U+0B4B..U+0B4D U+0B56..U+0B57
1247 U+0B82..U+0B83 U+0BBE..U+0BC2 U+0BC6..U+0BC8
1248 U+0BCA..U+0BCD U+0BD7 U+0C01..U+0C03
1249 U+0C3E..U+0C44 U+0C46..U+0C48 U+0C4A..U+0C4D
1250 U+0C55..U+0C56 U+0C82..U+0C83 U+0CBE..U+0CC4
1251 U+0CC6..U+0CC8 U+0CCA..U+0CCD U+0CD5..U+0CD6
1252 U+0D02..U+0D03 U+0D3E..U+0D43 U+0D46..U+0D48
1253 U+0D4A..U+0D4D U+0D57 U+0E31 U+0E34..U+0E3A
1254 U+0E47..U+0E4E U+0EB1 U+0EB4..U+0EB9
1255 U+0EBB..U+0EBC U+0EC8..U+0ECD U+0F18..U+0F19
1256 U+0F35 U+0F37 U+0F39 U+0F3E U+0F3F
1257 U+0F71..U+0F84 U+0F86..U+0F8B U+0F90..U+0F95
1258 U+0F97 U+0F99..U+0FAD U+0FB1..U+0FB7 U+0FB9
1259 U+20D0..U+20DC U+20E1 U+302A..U+302F U+3099
1260 U+309A
1261 // Extender
1262 U+00B7 U+02D0 U+02D1 U+0387 U+0640 U+0E46
1263 U+0EC6 U+3005 U+3031..U+3035 U+309D..U+309E
1264 U+30FC..U+30FE
1265 ];
1266
1267 $NameStartChar11 := [
1268 ':' '_'
1269 'A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I' 'J' 'K' 'L' 'M'
1270 'N' 'O' 'P' 'Q' 'R' 'S' 'T' 'U' 'V' 'W' 'X' 'Y' 'Z'
1271 'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm'
1272 'n' 'o' 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z'
1273 U+00C0..U+00D6 U+00D8..U+00F6 U+00F8..U+02FF
1274 U+0370..U+037D U+037F..U+1FFF U+200C..U+200D
1275 U+2070..U+218F U+2C00..U+2FEF U+3001..U+D7FF
1276 U+F900..U+FDCF U+FDF0..U+FFFD U+10000..U+EFFFF
1277 ];
1278 $NameChar11 := [
1279 '-' '.' '0' '1' '2' '3' '4' '5' '6' '7' '8' '9'
1280 U+00B7 U+0300..U+036F U+203F..U+2040
1281 // NameStartChar
1282 ':' '_'
1283 'A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I' 'J' 'K' 'L' 'M'
1284 'N' 'O' 'P' 'Q' 'R' 'S' 'T' 'U' 'V' 'W' 'X' 'Y' 'Z'
1285 'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm'
1286 'n' 'o' 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z'
1287 U+00C0..U+00D6 U+00D8..U+00F6 U+00F8..U+02FF
1288 U+0370..U+037D U+037F..U+1FFF U+200C..U+200D
1289 U+2070..U+218F U+2C00..U+2FEF U+3001..U+D7FF
1290 U+F900..U+FDCF U+FDF0..U+FFFD U+10000..U+EFFFF
1291 ];
1292 Name : value := $NameStartChar11 $NameChar11*;
1293 } // Name
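
The `Name` token above is one `$NameStartChar11` character followed by any number of `$NameChar11` characters. A minimal Python sketch of the same longest-match rule follows; the function name `match_name` and the range tables are illustrative assumptions, not part of the manakai code, and the ranges are copied from the two character classes above.

```python
# Ranges copied from $NameStartChar11 (inclusive code point pairs).
NAME_START_RANGES = [
    (0x3A, 0x3A), (0x5F, 0x5F),          # ':' and '_'
    (0x41, 0x5A), (0x61, 0x7A),          # 'A'..'Z', 'a'..'z'
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Additional ranges that $NameChar11 allows on top of NameStartChar.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2D), (0x2E, 0x2E), (0x30, 0x39),  # '-', '.', '0'..'9'
    (0xB7, 0xB7), (0x300, 0x36F), (0x203F, 0x2040),
]

NAME_CHAR_RANGES = NAME_START_RANGES + NAME_EXTRA_RANGES


def in_ranges(ch, ranges):
    """Return True if the code point of ch falls in any inclusive range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def match_name(text, pos=0):
    """Return the longest Name starting at pos, or None if none starts there."""
    if pos >= len(text) or not in_ranges(text[pos], NAME_START_RANGES):
        return None
    end = pos + 1
    while end < len(text) and in_ranges(text[end], NAME_CHAR_RANGES):
        end += 1
    return text[pos:end]
```

For instance, `match_name("xml:lang=")` stops before the `=`, because `=` is in neither character class.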
1294
1295 /*
1296 Space
1297 */
1298 lexmode S {
1299 S := [U+0009 U+000A U+000D U+0020]+;
1300 } // S
1301
1302 /*
1303 Document end scanning mode
1304 */
1305 lexmode DocumentEnd
1306 : standalone
1307 : extends => 'S'
1308 {
1309 /*
1310 Processing instruction
1311 */
1312 PIO := ['<'] ['?'];
1313
1314 /*
1315 Comment declaration
1316 */
1317 CDO := ['<'] ['!'] ['-'] ['-'];
1318 } // DocumentEnd
1319
1320 /*
1321 Document misc scanning mode
1322
1323 This mode scans |Misc| constructions as well
1324 as document element's start tag.
1325 */
1326 lexmode DocumentMisc
1327 : standalone
1328 : extends => 'DocumentEnd'
1329 {
1330 /*
1331 Document element start tag
1332 */
1333 STAGO := ['<'];
1334 } // DocumentMisc
1335
1336 /*
1337 Document prolog scanning mode
1338 */
1339 lexmode DocumentProlog
1340 : standalone
1341 : extends => 'DocumentMisc'
1342 {
1343 /*
1344 |DOCTYPE| declaration
1345 */
1346 MDO := ['<'] ['!'];
1347 } // DocumentProlog
1348
1349 /*
1350 Document start scanning mode
1351 */
1352 lexmode DocumentStart
1353 : initial
1354 : standalone
1355 : extends => 'DocumentProlog'
1356 {
1357 /*
1358 XML declaration
1359 */
1360 XDO := ['<'] ['?'] ['x'] ['m'] ['l'];
1361 } // DocumentStart
1362
1363 /*
1364 Markup declaration scanning mode
1365
1366 This mode is used to recognize |MDC| that terminates
1367 a comment declaration as well as the base |lexmode|
1368 for e.g. document type declaration scanning mode.
1369 */
1370 lexmode MarkupDeclaration
1371 : standalone
1372 : extends => 'Name'
1373 {
1374 /*
1375 Markup declaration close
1376 */
1377 MDC := ['>'];
1378
1379 /*
1380 Literal open
1381 */
1382 LIT := ['"'];
1383
1384 /*
1385 Alternative literal open
1386 */
1387 LITA := [U+0027];
1388 } // MarkupDeclaration
1389
1390 lexmode DocumentTypeDeclaration
1391 : standalone
1392 : extends => 'MarkupDeclaration'
1393 {
1394 /*
1395 Declaration subset open
1396 */
1397 DSO := ['['];
1398
1399 /*
1400 Declaration subset close
1401 */
1402 DSC := [']'];
1403 } // DocumentTypeDeclaration
1404
1405 /*
1406 Comment declaration scanning mode
1407 */
1408 lexmode CommentDeclaration
1409 : standalone
1410 {
1411 /*
1412 Comment close
1413 */
1414 COM := ['-'] ['-'];
1415
1416 /*
1417 Comment data
1418 */
1419 $string := ['-']? [^'-'];
1420 STRING : value := $string+;
1421 } // CommentDeclaration
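
In the `CommentDeclaration` mode above, `STRING` matches runs of characters in which a `-` is never immediately followed by another `-`, so the first `--` is always recognized as `COM`. The Python sketch below mirrors that rule with a regular expression; `scan_comment` is a hypothetical helper for illustration and is not part of the generated parser.

```python
import re

# Same shape as "$string := ['-']? [^'-']; STRING := $string+":
# an optional single hyphen followed by one non-hyphen character, repeated.
_COMMENT_DATA = re.compile(r"(?:-?[^-])+")
_COM = "--"


def scan_comment(text, pos=0):
    """Scan comment data starting at pos (just after '<!--').

    Returns (data, next_pos), where next_pos is the offset right after
    the closing '--', or None if no '--' follows the comment data.
    """
    m = _COMMENT_DATA.match(text, pos)
    data = ""
    if m is not None:
        data = m.group(0)
        pos = m.end()
    if text.startswith(_COM, pos):
        return data, pos + len(_COM)
    return None
```

The caller would still have to check that the next token is `MDC` (`>`), as the `MarkupDeclaration` mode does.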
1422
1423 /*
1424 Processing instruction name and |S| scanning mode
1425 */
1426 lexmode PIName
1427 : standalone
1428 : extends => 'Name'
1429 : extends => 'S'
1430 {
1431 /*
1432 Processing instruction close
1433 */
1434 PIC := ['?'] ['>'];
1435 } // PIName
1436
1437 /*
1438 Processing instruction data scanning mode
1439 */
1440 lexmode PIData
1441 : standalone
1442 {
1443 /*
1444 Processing instruction close
1445 */
1446 PIC := ['?'] ['>'];
1447
1448 /*
1449 Processing instruction target data
1450 */
1451 ?default-token DATA : value;
1452 } // PIData
1453
1454 /*
1455 Content of element scanning mode
1456 */
1457 lexmode ElementContent
1458 : standalone
1459 {
1460 /*
1461 Start tag open
1462 */
1463 STAGO := ['<'];
1464
1465 /*
1466 End tag open
1467 */
1468 ETAGO := ['<'] ['/'];
1469
1470 /*
1471 Hexadecimal character reference open
1472 */
1473 HCRO := ['&'] ['#'] ['x'];
1474
1475 /*
1476 Numeric character reference open
1477 */
1478 CRO := ['&'] ['#'];
1479
1480 /*
1481 General entity reference open
1482 */
1483 ERO := ['&'];
1484
1485 /*
1486 Comment declaration open
1487 */
1488 CDO := ['<'] ['!'] ['-'] ['-'];
1489
1490 /*
1491 CDATA section open
1492 */
1493 CDSO := ['<'] ['!'] ['[']
1494 ['C'] ['D'] ['A'] ['T'] ['A'] ['['];
1495
1496 /*
1497 Processing instruction open
1498 */
1499 PIO := ['<'] ['?'];
1500
1501 /*
1502 Markup section end
1503 */
1504 MSE := [']'] [']'] ['>'];
1505
1506 /*
1507 Character data
1508 */
1509 /*
1510 Character data and/or |MSE|
1511 */
1512 ?default-token CharData : value;
1513 } // ElementContent
1514
1515 /*
1516 CDATA section content scanning mode
1517 */
1518 lexmode CDATASectionContent
1519 : standalone
1520 {
1521 /*
1522 Markup section end
1523 */
1524 MSE := [']'] [']'] ['>'];
1525
1526 /*
1527 Character data
1528 */
1529 ?default-token CData : value;
1530 } // CDATASectionContent
1531
1532 lexmode EntityReference
1533 : standalone
1534 : extends => 'Name'
1535 {
1536 /*
1537 Reference close
1538 */
1539 REFC := [';'];
1540 } // EntityReference
1541
1542 lexmode NumericCharacterReference
1543 : standalone
1544 {
1545 /*
1546 Decimal number
1547 */
1548 $digit := ['0' '1' '2' '3' '4' '5' '6' '7' '8' '9'];
1549 NUMBER : value := $digit+;
1550
1551 /*
1552 Reference close
1553 */
1554 REFC := [';'];
1555 } // NumericCharacterReference
1556
1557 lexmode HexadecimalCharacterReference
1558 : standalone
1559 {
1560 /*
1561 Hexadecimal number
1562 */
1563 $hexdigit := ['0' '1' '2' '3' '4' '5' '6' '7' '8' '9'
1564 'A' 'B' 'C' 'D' 'E' 'F'
1565 'a' 'b' 'c' 'd' 'e' 'f'];
1566 Hex : value := $hexdigit+;
1567
1568 /*
1569 Reference close
1570 */
1571 REFC := [';'];
1572 } // HexadecimalCharacterReference
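
The two reference modes above accept a run of decimal digits after `CRO` (`&#`) or a run of hexadecimal digits after `HCRO` (`&#x`, lowercase `x` only), each terminated by `REFC` (`;`). As a rough illustration, the Python sketch below decodes a complete reference in one step; `decode_char_ref` is an assumed name, and the sketch only rejects code points beyond U+10FFFF rather than applying the full XML `Char` constraint.

```python
import re

# '&#x' + hex digits + ';'  or  '&#' + decimal digits + ';'
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def decode_char_ref(text):
    """Return the character named by a numeric reference, or None."""
    m = _CHAR_REF.fullmatch(text)
    if m is None:
        return None
    hex_digits, dec_digits = m.groups()
    if hex_digits is not None:
        code_point = int(hex_digits, 16)
    else:
        code_point = int(dec_digits, 10)
    if code_point > 0x10FFFF:
        return None  # outside the Unicode code space
    return chr(code_point)
```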
1573
1574 lexmode StartTag
1575 : standalone
1576 : extends => 'Name'
1577 : extends => 'S'
1578 {
1579
1580 /*
1581 Value indicator
1582 */
1583 VI := ['='];
1584
1585 /*
1586 Literal open
1587 */
1588 LIT := ['"'];
1589 LITA := [U+0027];
1590
1591 /*
1592 Tag close
1593 */
1594 TAGC := ['>'];
1595
1596 /*
1597 Empty element tag close
1598 */
1599 MTAGC := ['/'] ['>'];
1600 } // StartTag
1601
1602 lexmode EndTag
1603 : standalone
1604 : extends => 'Name'
1605 : extends => 'S'
1606 {
1607 /*
1608 Tag close
1609 */
1610 TAGC := ['>'];
1611 } // EndTag
1612
1613 lexmode AttributeValueLiteral_ {
1614 ERO := ['&'];
1615 CRO := ['&'] ['#'];
1616 HCRO := ['&'] ['#'] ['x'];
1617 } // AttributeValueLiteral_
1618
1619 lexmode AttributeValueLiteral
1620 : standalone
1621 : extends => 'AttributeValueLiteral_'
1622 {
1623 LIT := ['"'];
1624 STRING : value := [^'"' '&' '<'];
1625 } // AttributeValueLiteral
1626
1627 lexmode AttributeValueLiteralA
1628 : standalone
1629 : extends => 'AttributeValueLiteral_'
1630 {
1631 LIT := [U+0027];
1632 STRING : value := [^U+0027 '&' '<'];
1633 } // AttributeValueLiteralA
1634
1635 token-error default : default {
1636 lang:Perl {
1637 my $location = {
1638 utf32_offset => pos ($self->{source}),
1639 };
1640 my $continue = __DOMCore:ERROR{xp|bad-token-error::
1641 xp|error-token => {$token},
1642 DOMCore|location => {$location},
1643 xp|source-text => {\($self->{source})},
1644 }__;
1645 unless ($continue) {
1646 __EXCEPTION{DOMLS|PARSE_ERR::
1647 }__;
1648 }
1649 $self->{has_error} = true;
1650 }
1651 } // default
1652 ##ManakaiXMLParser
1653
1654
1655 ElementTypeBinding:
1656 @Name: RuleDef
1657 @ElementType:
1658 dis:ResourceDef
1659 @ShadowContent:
1660 @@ForCheck: ManakaiDOM|ForClass
1661 @@rdf:type: Muf2003|RuleDefClass
1662
1663 ElementTypeBinding:
1664 @Name: RuleParam
1665 @ElementType:
1666 dis:ResourceDef
1667 @ShadowContent:
1668 @@rdf:type: Muf2003|RuleParameter
1669
1670 ElementTypeBinding:
1671 @Name: enImplNote
1672 @ElementType:
1673 dis:ImplNote
1674 @ShadowContent:
1675 @@lang:en
1676
1677 ElementTypeBinding:
1678 @Name: ErrDef
1679 @ElementType:
1680 dis:ResourceDef
1681 @ShadowContent:
1682 @@rdf:type: DOMCore|DOMErrorType
1683 @@For: ManakaiDOM|DOM3
1684 @@ecore:textFormatter: ManakaiXMLParserExceptionFormatter
1685
1686 ErrDef:
1687 @QName: xp|bad-token-error
1688 @enDesc:
1689 The parser has encountered a token whose type is not
1690 allowed there.
1691 @DOMCore:severity: DOMCore|SEVERITY_ERROR
1692 @enMufDef:
1693 Token |%xp-error-token-type;|%xp-error-token-value
1694 (prefix => { (|}, suffix => {|)}); is not
1695 allowed %xp-error-lines (prefix => {(|}, suffix => {|)});
1696
1697 PropDef:
1698 @QName: xp|error-token
1699 @enDesc:
1700 The token where the parser found an error.
1701
1702 PropDef:
1703 @QName: xp|source-text
1704 @enDesc:
1705 A reference to the original source text, if available.
1706
1707 ElementTypeBinding:
1708 @Name: enMufDef
1709 @ElementType:
1710 ecore:defaultMessage
1711 @ShadowContent:
1712 @@lang:en
1713 @@ContentType:
1714 lang:muf
1715
1716 ResourceDef:
1717 @QName: DOMImpl
1718 @AliasFor: DOMCore|DOMImplementation
1719 @For: ManakaiDOM|DOM
1720
1721 ElementTypeBinding:
1722 @Name: Attr
1723 @ElementType:
1724 dis:ResourceDef
1725 @ShadowContent:
1726 @@rdf:type: DISLang|Attribute
1727 @@ForCheck: !=ManakaiDOM|ManakaiDOM
1728
1729 ElementTypeBinding:
1730 @Name: Get
1731 @ElementType:
1732 dis:ResourceDef
1733 @ShadowContent:
1734 @@rdf:type: DISLang|AttributeGet
1735
1736 ElementTypeBinding:
1737 @Name: Set
1738 @ElementType:
1739 dis:ResourceDef
1740 @ShadowContent:
1741 @@rdf:type: DISLang|AttributeSet
1742
1743 ElementTypeBinding:
1744 @Name: enDesc
1745 @ElementType:
1746 dis:Description
1747 @ShadowContent:
1748 @@lang:en
1749
1750 ElementTypeBinding:
1751 @Name: Method
1752 @ElementType:
1753 dis:ResourceDef
1754 @ShadowContent:
1755 @@rdf:type: DISLang|Method
1756 @@For: !=ManakaiDOM|ManakaiDOM
1757
1758 ElementTypeBinding:
1759 @Name: Return
1760 @ElementType:
1761 dis:ResourceDef
1762 @ShadowContent:
1763 @@rdf:type: DISLang|MethodReturn
1764
1765 ElementTypeBinding:
1766 @Name: Param
1767 @ElementType:
1768 dis:ResourceDef
1769 @ShadowContent:
1770 @@rdf:type: DISLang|MethodParameter
1771
1772 ElementTypeBinding:
1773 @Name: PerlDef
1774 @ElementType:
1775 dis:Def
1776 @ShadowContent:
1777 @@ContentType: lang|Perl
1778
1779 ElementTypeBinding:
1780 @Name: PropDef
1781 @ElementType:
1782 dis:ResourceDef
1783 @ShadowContent:
1784 @@rdf:type: rdf|Property
1785
1786 ClsDef:
1787 @ClsQName: ManakaiXMLParserExceptionFormatter
1788
1789 @ClsISA: ecore|MUErrorFormatter||ManakaiDOM|Perl
1790
1791 @RuleDef:
1792 @@Name: xp-error-token-type
1793 @@enDesc:
1794 The type of the token that the parser encountered.
1795
1796 @@Method:
1797 @@@Name: after
1798 @@@Param:
1799 @@@@Name: name
1800 @@@@Type: DOMString
1801 @@@@enDesc: The name of the method.
1802 @@@Param:
1803 @@@@Name: p
1804 @@@@Type: DISPerl|HASH
1805 @@@@enDesc: The set of the parameters to the method.
1806 @@@Param:
1807 @@@@Name: o
1808 @@@@Type: DISPerl|HASH
1809 @@@@enDesc: The option value.
1810 @@@Return:
1811 @@@@PerlDef:
1812 $p->{-result} = $o->{<H::xp|error-token>}->{type}
1813 if defined $o->{<H::xp|error-token>}->{type};
1814
1815 @RuleDef:
1816 @@Name: xp-error-token-value
1817 @@enDesc:
1818 The value of the token that the parser encountered, if any.
1819
1820 @@Method:
1821 @@@Name: after
1822 @@@Param:
1823 @@@@Name: name
1824 @@@@Type: DOMString
1825 @@@@enDesc: The name of the method.
1826 @@@Param:
1827 @@@@Name: p
1828 @@@@Type: DISPerl|HASH
1829 @@@@enDesc: The set of the parameters to the method.
1830 @@@Param:
1831 @@@@Name: o
1832 @@@@Type: DISPerl|HASH
1833 @@@@enDesc: The option value.
1834 @@@Return:
1835 @@@@PerlDef:
1836 $p->{-result} = $o->{<H::xp|error-token>}->{value}
1837 if defined $o->{<H::xp|error-token>}->{value};
1838
1839 @RuleDef:
1840 @@Name: xp-error-lines
1841 @@enDesc:
1842 A copy of the fragment of the source text that contains the line
1843 where the error occurred, if available.
1844
1845 @@Method:
1846 @@@Name: after
1847 @@@Param:
1848 @@@@Name: name
1849 @@@@Type: DOMString
1850 @@@@enDesc: The name of the method.
1851 @@@Param:
1852 @@@@Name: p
1853 @@@@Type: DISPerl|HASH
1854 @@@@enDesc: The set of the parameters to the method.
1855 @@@Param:
1856 @@@@Name: o
1857 @@@@Type: DISPerl|HASH
1858 @@@@enDesc: The option value.
1859 @@@Return:
1860 @@@@PerlDef:
1861 my $pos = $o-><AG::DOMCore|DOMError.location>
1862 -><AG::DOMCore|DOMLocator.utf32Offset>;
1863 if ($pos > -1) {
1864 my $src = $o->{<H::xp|source-text>};
1865 my $start = $pos;
1866 $start = rindex ($$src, "\x0A", $start - 1) for 0..2;
1867 $start++;
1868 my $end = $pos;
1869 for (0..2) { $end = index ($$src, "\x0A", $end + 1); last if $end < 0 } # stop at the first miss so the search does not restart from offset 0
1870 $end = length $$src if $end < 0;
1871 $p->{-result} = substr $$src, $start, $end - $start;
1872 }
1873 ##XMLParserExceptionFormatter
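
The `xp-error-lines` rule above returns the line containing the error offset together with up to two lines on either side. The Python sketch below reproduces that windowing under the assumption that lines are separated by U+000A only; `error_context` is a hypothetical name, and unlike a literal port it stops searching as soon as no further line break is found.

```python
def error_context(source, pos, lines_around=3):
    """Return the source lines surrounding the character offset pos.

    Walks back over up to lines_around line breaks and forward over up
    to lines_around line breaks, then slices the text between them.
    """
    start = pos
    for _ in range(lines_around):
        # Last line break strictly before the current start offset.
        start = source.rfind("\n", 0, max(start, 0))
        if start < 0:
            break
    start += 1  # first character after the line break (or offset 0)

    end = pos
    for _ in range(lines_around):
        # Next line break strictly after the current end offset.
        end = source.find("\n", end + 1)
        if end < 0:
            end = len(source)
            break
    return source[start:end]
```

With `lines_around=3`, an offset in the middle of a long text yields five lines: two before the error line, the error line itself, and two after it.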
