Diff of /markup/xml/xmlcc/xmlcc-work.en.html

revision 1.27 by wakaba, Sat Mar 29 10:12:46 2008 UTC revision 1.29 by wakaba, Fri Oct 17 05:57:26 2008 UTC
# Line 18  Line 18 
18    
19  <div class="header">  <div class="header">
20  <h1>manakai's <abbr>XML</abbr> Conformance Checking</h1>  <h1>manakai's <abbr>XML</abbr> Conformance Checking</h1>
21  <h2>Working Draft <time datetime=2008-03-29>29 March 2008</time></h2>  <h2>Working Draft <time datetime=2008-10-17>17 October 2008</time></h2>
22    
23  <dl class="versions-uri">  <dl class="versions-uri">
24  <dt>This Version</dt>  <dt>This Version</dt>
# Line 79  project.  It might be updated, replaced, Line 79  project.  It might be updated, replaced,
79  other documents at any time.  It is inappropriate to  other documents at any time.  It is inappropriate to
80  cite this document as other than <q>work in progress</q>.</p>  cite this document as other than <q>work in progress</q>.</p>
81    
82    <p>The scope of this specification is explicitly limited to the <a
83    href="http://suika.fam.cx/www/markup/html/whatpm/readme">Whatpm</a>
84    implementation.  It is <em>not</em> the purpose of this specification
85    to define a general guideline to parse or to check XML documents.
86    This specification does <em>not</em> try to define a new version of
87    XML at all.
88    
89    <p>This version of the specification supports the fourth edition of
90    XML 1.0 and the second edition of XML 1.1.  The fifth edition of XML
91    1.0 might be supported in a later version.  The XML namespaces
92    specifications are expected to be supported in a later version of this
93    specification.
94    
95  <p>Comments on this document are welcome and  <p>Comments on this document are welcome and
96  may be sent to the <a href="#author">author</a>.</p>  may be sent to the <a href="#author">author</a>.</p>
97    
# Line 92  normative version.</p> Line 105  normative version.</p>
105    
106  <p class=section-info>This section is <em>non‐normative</em>.</p>  <p class=section-info>This section is <em>non‐normative</em>.</p>
107    
108    <p>This specification defines how the parsing and the conformance
109    checking of XML documents should be implemented in the <a
110    href="http://suika.fam.cx/www/markup/html/whatpm/readme">Whatpm</a>
111    XML parser and conformance checker.
112    
113    <p>It is <em>not</em> the purpose of this specification to define,
114    e.g., how to parse XML documents in general; its scope is explicitly
115    limited to the <a
116    href="http://suika.fam.cx/www/markup/html/whatpm/readme">Whatpm</a>
117    implementation.
118    
119  <div class="issue ed">...  <div class="issue ed">...
120    
121  <p>Much of invalid (well-formed or not) XML document parsing and XML document  <p>Much of invalid (well-formed or not) XML document parsing and XML document
# Line 130  document, however, corresponding parts o Line 154  document, however, corresponding parts o
154  specification should be consulted instead.</p>  specification should be consulted instead.</p>
155  </div>  </div>
156    
157    <div class=section id=processing-model>
158    <h2>Processing Model</h2>
159    
160    <p>Conceptually, validation of an XML document is split into two
161    stages for the purpose of this specification: the <dfn
162    id=xml-document-parsing>XML document parsing</dfn> stage and the <dfn
163    id=dom-xml-conformance-checking>DOM XML conformance checking</dfn>
164    stage.
165    
166    <p>The input to the XML document parsing stage is a byte sequence
167    representing the XML document to be parsed (and any additional metadata),
168    and the output is a DOM tree representing the XML document and zero
169    or more <a href="#error">errors</a>.  The processor that implements
170    this stage is called a <dfn id=parser>parser</dfn>.  Requirements for a
171    parser are defined in the section <a href="#parsing-xml">Parsing an
172    XML Document</a>.
173    
174    <p>The input to the DOM XML conformance checking stage is a DOM tree,
175    and the output is zero or more <a href="#error">errors</a>.  The
176    processor that implements this stage is called a <dfn
177    id=conformance-checker>conformance checker</dfn>.  Requirements for a
178    conformance checker are defined in the section <a
179    href="#checking-dom">Checking an XML DOM Tree</a>.
180    
181    
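The two-stage model added in this revision can be sketched roughly as follows. This is only an illustration, not the Whatpm implementation: the names `Error`, `parse`, `check`, and `validate` are hypothetical, Python's `xml.dom.minidom` stands in for the parser, and the checker stage is an empty placeholder. Unlike the parser described in the draft, this stand-in stops at the first fatal error and then returns no DOM tree.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple
from xml.dom import minidom
from xml.parsers.expat import ExpatError


@dataclass
class Error:
    """Hypothetical error record; `category` mirrors the draft's error names."""
    category: str
    message: str


def parse(data: bytes) -> Tuple[Optional[minidom.Document], List[Error]]:
    """Stage 1 (XML document parsing): byte sequence -> DOM tree + errors."""
    try:
        return minidom.parseString(data), []
    except ExpatError as exc:
        # Simplification: expat gives up at the first fatal error.
        return None, [Error("xml-well-formedness-error", str(exc))]


def check(doc: minidom.Document) -> List[Error]:
    """Stage 2 (DOM XML conformance checking): DOM tree -> errors.

    Placeholder only; the real checks are those of the draft's
    "Checking an XML DOM Tree" section.
    """
    return []


def validate(data: bytes) -> List[Error]:
    """Run both stages and collect every error that was raised."""
    doc, errors = parse(data)
    if doc is not None:
        errors += check(doc)
    return errors
```

The point of the split is that the checker only ever sees a DOM tree, so it can be applied to trees that were not produced by the parser.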
182  <div class=section id=error-categories>  <div class=section id=error-categories>
183  <h2>Error Classification</h2>  <h3>Error Classification</h3>
184    
185    <p class=ed>An <dfn id=error>error</dfn> is ...
186    
187  <p class=ed>If a <code>Document</code> node has no  <p class=ed>If a <code>Document</code> node has no
188  xml-well-formedness-error, entity-error, and unknown-error,  xml-well-formedness-error, entity-error, and unknown-error,
# Line 232  can be easily serialized into a valid XM Line 282  can be easily serialized into a valid XM
282  when MUST or SHOULD).  when MUST or SHOULD).
283  </div>  </div>
284    
285  <p>The parser <em class=rfc2119>MAY</em> continue the parsing of the document  </div>
286  even after a fatal error (as defined by the relevant specification) is  
287  encountered.  How the parsing ought to be continued is not defined by this  </div>
 specification.  
288    
289    <div class=section id=parsing-xml>
290    <h2>Parsing an <abbr>XML</abbr> Document</h2>
291    
292    <p>When a byte stream that represents an XML document is given to a
293    parser, it <em class=rfc2119>MUST</em> create a DOM tree according to
294    relevant specifications <span class=ed>[XML10, XML11, XMLNAMES10,
295    XMLNAMES11, DOM3CORE, WEBDOMCORE, DOMDTDEF, MANAKAIDOMEXT]</span>.
296    
297    <p>The parser <em class=rfc2119>MAY</em> continue the parsing of the
298    document even after a fatal error (as defined by the relevant
299    specifications) is encountered.  How the parsing ought to be continued
300    is not defined by this specification.
301    
302    <!--
303    <div class="note memo informative">    <div class="note memo informative">
304    <p>It is expected that the XML5 specification <span class=ed>@@ ref</span>    <p>It is expected that the XML5 specification <span class=ed>@@ ref</span>
305    will define how the parser has to convert any string into DOM tree    will define how the parser has to convert any string into DOM tree
306    completely.    completely.
307    </div>    </div>
308    -->
309    
310  </div>  <p class=ed>A future version of this specification might define the
311    entire parser in terms of an input stream preprocessor, a tokenizer, and
312    a tree constructor.
313    
314  <div class=section id=parsing-xml>  <p>In addition, the following requirements apply to the parser:
 <h2>Parsing <abbr>XML</abbr> Document</h2>  
315    
 <p>When an <abbr>XML</abbr> document is parsed, the following clauses  
 are applied:</p>  
316  <dl class=switch>  <dl class=switch>
317  <dt>For each external entity (including the document entity and the external  <dt>For each external entity (including the document entity and the external
318  subset entity, if any)  subset entity, if any)
# Line 494  parser Line 557  parser
557    ignored. -->    ignored. -->
558    
559  <dt>For each parameter entity reference  <dt>For each parameter entity reference
560    <dd>If the declaration for the entity is not read (i.e. no declaration    <dd><p>Process as follows:
561    for the entity is processed or the external entity referenced by the      <ol>
562    declaration cannot be retrieved), then:      <li>If the declaration for the entity is <em>not</em> processed,
563        then:
564          <dl class=switch>
565          <dt>If the document contains no external entity or if the document
566          contains the <code>standalone</code> pseudo-attribute set to
567          <code>yes</code><!-- or the document contains no DTD -->
568            <dd>The parser <em class=rfc2119>MUST</em> raise an
569            <a href="#xml-well-formedness-error" id=wf-entdeclared-pe><code>xml-well-formedness-error</code></a>.
570          <dt>Otherwise
571            <dd>The parser <em class=rfc2119>MUST</em> raise an
572            <a href="#xml-validity-error" id=vc-entdeclared-pe><code>xml-validity-error</code></a>.
573          </dl>
574        <li>If the declaration for the entity <em>is</em> processed but the
575        referenced entity cannot be retrieved, then the parser
576        <em class=rfc2119>MUST</em> raise an
577        <span class=ed>@@ ??-error</span>.
578        </ol>
579    
580        <p>In either of the two cases above, process as follows:
581      <ul>      <ul>
582      <li>If the parameter entity is contained in a declaration, then the      <li>If the parameter entity reference is contained in a declaration, then
583      declaration <em class=rfc2119>MUST</em> be ignored <em>except</em> that      the declaration <em class=rfc2119>MUST</em> be ignored <em>except</em> that
584      any error before the parameter entity <em class=rfc2119>MUST</em> be      any error before the parameter entity <em class=rfc2119>MUST</em> be
585      raised as usual.      raised as usual.
586      <li>If the parameter entity is contained in the status portion of a      <li>If the parameter entity reference is contained in the status portion of
587      conditional section, then the conditional section      a conditional section, then the conditional section
588      <em class=rfc2119>MUST</em> be processed as if it were an      <em class=rfc2119>MUST</em> be processed as if it were an
589      <code>IGNORE</code>d section.      <code>IGNORE</code>d section.
590      <li>The parser <em class=rfc2119>MUST NOT</em> process any entity or      <li>The parser <em class=rfc2119>MUST NOT</em> process any entity or
# Line 523  parser Line 604  parser
604      </ul>      </ul>
605  <dt>For each general entity reference in an attribute value or in the content  <dt>For each general entity reference in an attribute value or in the content
606  of an element  of an element
607    <dd>If the declaration for the entity is not read (i.e. no declaration for    <dd><p>Process as follows:
608    the entity is processed or the external entity referenced by the declaration      <ol>
609    cannot be retrieved), then:      <li>If the <code>Name</code> of the entity reference is one of
610        <code>amp</code>, <code>lt</code>, <code>gt</code>, <code>quot</code>,
611        or <code>apos</code>, then abort these steps.
612        <li>If the declaration for the entity is <em>not</em> processed,
613        then:
614          <dl class=switch>
615          <dt>If the document contains no external entity or if the document
616          contains the <code>standalone</code> pseudo-attribute set to
617          <code>yes</code><!-- or the document contains no DTD -->
618            <dd>The parser <em class=rfc2119>MUST</em> raise an
619            <a href="#xml-well-formedness-error" id=wf-entdeclared-ge><code>xml-well-formedness-error</code></a>.
620          <dt>Otherwise
621            <dd>The parser <em class=rfc2119>MUST</em> raise an
622            <a href="#xml-validity-error" id=vc-entdeclared-ge><code>xml-validity-error</code></a>.
623          </dl>
624        <li>If the declaration for the entity <em>is</em> processed but the
625        referenced entity cannot be retrieved, then the parser
626        <em class=rfc2119>MUST</em> raise an
627        <span class=ed>@@ ??-error</span>.
628        </ol>
629    
630        <p>In either of the two cases above, process as follows:
631      <ul>      <ul>
632      <li>If the general entity reference is the first reference to an entity      <li>If the general entity reference is the first reference to an entity
633      that is not read, then the parser <em class=rfc2119>MUST</em> raise an      that is not read, then the parser <em class=rfc2119>MUST</em> raise an
# Line 540  of an element Line 642  of an element
642    <!-- In XML 1.0/1.1 spec, this is optional. -->    <!-- In XML 1.0/1.1 spec, this is optional. -->
643  </dl>  </dl>
644    
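The error selection that the revised steps describe for entity references can be sketched as a small decision function. This is an illustrative reading of the steps, not code from Whatpm: the function name and all parameter names are hypothetical, and the `"??-error"` value simply mirrors the draft's open `@@ ??-error` placeholder, which the draft has not yet resolved.

```python
from typing import Optional

# Predefined general entities, for which the steps are aborted.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# The draft leaves this category undecided ("@@ ??-error").
UNDECIDED = "??-error"


def entity_reference_error(
    name: str,
    *,
    is_parameter_entity: bool,
    declaration_processed: bool,
    entity_retrievable: bool,
    has_external_entity: bool,
    standalone_yes: bool,
) -> Optional[str]:
    """Return the error category to raise for a reference, or None."""
    # The predefined-name step appears only in the general entity steps.
    if not is_parameter_entity and name in PREDEFINED:
        return None
    if not declaration_processed:
        # No external entity, or standalone="yes": a well-formedness error;
        # otherwise the missing declaration is only a validity error.
        if not has_external_entity or standalone_yes:
            return "xml-well-formedness-error"
        return "xml-validity-error"
    if not entity_retrievable:
        return UNDECIDED
    return None
```

For example, an undeclared `&foo;` in a document with an unread external subset and no `standalone="yes"` yields `xml-validity-error`, while the same reference in a document without any external entity yields `xml-well-formedness-error`.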
645  <p class=ed>@@ MUST try to read external entity  <p>The parser <em class=rfc2119>MUST</em> try to read any entity
646    referenced by general or parameter entity references<!--,--> and the
647  <p>In addition, the parser has to check whether the  external subset entity, if any<!--, and any general entity declared-->
648  following constraints are met.  in the document type definition.
649    
650    <p><strong>Well-formedness constraints</strong>.  When the parser
651    detects a violation of one of certain well-formedness constraints, it
652    <em class=rfc2119>MUST</em> raise an <a
653    href="#xml-well-formedness-error"><code>xml-well-formedness-error</code></a>.
654    The list of such well-formedness constraints is as follows:
655    
 <p><strong>Well-formedness constraints</strong>.  For each violation to  
 one of constraints below, an  
 <a href="#xml-well-formedness-error"><code>xml-well-formedness-error</code></a>  
 <em class=rfc2119>MUST</em> be raised.  The list of well-formedness  
 constraints is below:  
656  <ul>  <ul>
657  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#wfc-PEinInternalSubset">Well-formedness constraint: PEs in Internal Subset</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#wfc-PEinInternalSubset">Well-formedness constraint: PEs in Internal Subset</a>
658  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#GIMatch">Well-formedness constraint: Element Type Match</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#GIMatch">Well-formedness constraint: Element Type Match</a>
# Line 562  constraints is below: Line 665  constraints is below:
665  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#indtd">Well-formedness constraint: In DTD</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#indtd">Well-formedness constraint: In DTD</a>
666  </ul>  </ul>
667    
668  <p><strong>Validity constraints</strong>.  For each violation to  <p><strong>Validity constraints</strong>.  When the parser detects a
669  one of constraints below, an  violation of one of certain validity constraints, it <em
670  <a href="#xml-validity-error"><code>xml-validity-error</code></a>.  class=rfc2119>MUST</em> raise an <a
671  <em class=rfc2119>MUST</em> be raised.  The list of validity  href="#xml-validity-error"><code>xml-validity-error</code></a>.  The
672  constraints is below:  list of such validity constraints is as follows:
673    
674  <ul>  <ul>
675  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinMarkupDecl">Validity constraint: Proper Declaration/PE Nesting</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinMarkupDecl">Validity constraint: Proper Declaration/PE Nesting</a>
676  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinGroup">Validity constraint: Proper Group/PE Nesting</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinGroup">Validity constraint: Proper Group/PE Nesting</a>
677  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#condsec-nesting">Validity constraint: Proper Conditional Section/PE Nesting</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#condsec-nesting">Validity constraint: Proper Conditional Section/PE Nesting</a>
678    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-check-rmd">Validity constraint: Standalone Document Declaration</a>
679  </ul>  </ul>
680    
681  <p><strong>Other criteria</strong>.  For each violation to  <p><strong>Other criteria</strong>.  If the parser detects a violation
682  one of constraints below, an  of one of certain additional constraints, it <em
683  <a href="#xml-misc-recommendation"><code>xml-misc-recommendation</code></a>  class=rfc2119>MUST</em> raise an <a
684  <em class=rfc2119>MUST</em> be raised.  The list of constraints is below:  href="#xml-misc-recommendation"><code>xml-misc-recommendation</code></a>.
685    The list of such constraints is as follows:
686    
687  <ul>  <ul>
688  <li><q>For interoperability, if a parameter-entity reference appears in a  <li><q>For interoperability, if a parameter-entity reference appears in a
689  <code>choice</code>, <code>seq</code>, or <code>Mixed</code> construct, its  <code>choice</code>, <code>seq</code>, or <code>Mixed</code> construct, its
# Line 588  or <code>,</code>).</q> Line 695  or <code>,</code>).</q>
695  text declaration.</q>  text declaration.</q>
696  </ul>  </ul>
697    
698    <p>The parser <em class=rfc2119>MUST</em> act as if it is a <a
699  <!--  href="http://www.w3.org/TR/2006/REC-xml-20060816/#dt-validating">validating
700    XML processor</a> for the purpose of informing of white space
701  @@ Need detailed review, but maybe should be in parsing phase  characters appearing in <a
702    href="http://www.w3.org/TR/2006/REC-xml-20060816/#dt-elemcontent">element
703  #vc-check-rmd Validity constraint: Standalone Document Declaration  content</a> (See <a
704    href="http://www.w3.org/TR/2006/REC-xml-20060816/#sec-white-space">Section
 @@ Need dtailed review  
   
 #wf-entdeclared Well-formedness constraint: Entity Declared  
 #vc-entdeclared Validity constraint: Entity Declared  
   
 -->  
   
 <p>The parser <em class=rfc2119>MUST</em> act as if it is a  
 <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#dt-validating">validating  
 XML processor</a> for the informing of white space appearing in  
 <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#dt-elemcontent">element  
 content</a> (See  
 <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#sec-white-space">Section  
705  2.10</a> of the XML specification).  2.10</a> of the XML specification).
706    
707    <div class="note memo informative">    <div class="note memo informative">
# Line 618  content</a> (See Line 712  content</a> (See
712    is not processed.    is not processed.
713    </div>    </div>
714    
715  <p>The parser <em class=rfc2119>MUST</em> raise an  <p>The parser <em class=rfc2119>MUST</em> raise at least one <a
716  <a href="#xml-well-formedness-error" id=wfe-syntax><code>xml-well-formedness-error</code></a>  href="#xml-well-formedness-error"
717  for any failure to match to a production rule in the XML specification.  id=wfe-syntax><code>xml-well-formedness-error</code></a> if the entity
718    it parses does not match the appropriate production rule in the XML
719    specification.  As an exception to this requirement, it <em
720    class=rfc2119>MAY</em> choose not to raise such an error if the error
721    will be raised by the conformance checker when the conformance checker
722    <a href="#algorithm-to-check-a-node" title="check a node">checks</a>
723    the <code>Document</code> object produced by the parser.
724    
725  <!--  <!--
726    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#ExtSubset">Well-formedness constraint: External Subset</a>    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#ExtSubset">Well-formedness constraint: External Subset</a>
727    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#PE-between-Decls">Well-formedness constraint: PE Between Declarations</a>    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#PE-between-Decls">Well-formedness constraint: PE Between Declarations</a>
# Line 635  for any failure to match to a production Line 736  for any failure to match to a production
736  </div>  </div>
737    
738  <div class="section" id=checking-dom>  <div class="section" id=checking-dom>
739  <h2>Checking <abbr>DOM</abbr></h2>  <h2>Checking an <abbr>XML</abbr> <abbr>DOM</abbr> Tree</h2>
740    
741  <p>The following algorithms and definitions are applied to  <p>The following algorithms and definitions are applied to
742  <abbr>XML</abbr> documents; especially, they are not applied  <abbr>XML</abbr> documents; especially, they are not applied

