Diff of /markup/xml/xmlcc/xmlcc-work.en.html

revision 1.26 by wakaba, Sat Mar 29 06:35:16 2008 UTC revision 1.29 by wakaba, Fri Oct 17 05:57:26 2008 UTC
# Line 18  Line 18 
18    
19  <div class="header">  <div class="header">
20  <h1>manakai's <abbr>XML</abbr> Conformance Checking</h1>  <h1>manakai's <abbr>XML</abbr> Conformance Checking</h1>
21  <h2>Working Draft <time datetime=2008-03-29>29 March 2008</time></h2>  <h2>Working Draft <time datetime=2008-10-17>17 October 2008</time></h2>
22    
23  <dl class="versions-uri">  <dl class="versions-uri">
24  <dt>This Version</dt>  <dt>This Version</dt>
# Line 79  project.  It might be updated, replaced, Line 79  project.  It might be updated, replaced,
79  other documents at any time.  It is inappropriate to  other documents at any time.  It is inappropriate to
80  cite this document as other than <q>work in progress</q>.</p>  cite this document as other than <q>work in progress</q>.</p>
81    
82    <p>The scope of this specification is explicitly limited to the <a
83    href="http://suika.fam.cx/www/markup/html/whatpm/readme">Whatpm</a>
84    implementation.  It is <em>not</em> the purpose of this specification
85    to define a general guideline to parse or to check XML documents.
86    This specification does <em>not</em> try to define a new version of
87    XML at all.
88    
89    <p>This version of the specification supports the fourth edition of
90    XML 1.0 and the second edition of XML 1.1.  The fifth edition of XML
91    1.0 might be supported in a later version.  The XML namespaces
92    specifications are expected to be supported in a later version of this
93    specification.
94    
95  <p>Comments on this document are welcome and  <p>Comments on this document are welcome and
96  may be sent to the <a href="#author">author</a>.</p>  may be sent to the <a href="#author">author</a>.</p>
97    
# Line 92  normative version.</p> Line 105  normative version.</p>
105    
106  <p class=section-info>This section is <em>non‐normative</em>.</p>  <p class=section-info>This section is <em>non‐normative</em>.</p>
107    
108  <div class="issue ed">...</div>  <p>This specification defines how the parsing and the conformance
109    checking of XML documents should be implemented in the <a
110    href="http://suika.fam.cx/www/markup/html/whatpm/readme">Whatpm</a>
111    XML parser and conformance checker.
112    
113    <p>It is <em>not</em> the purpose of this specification to define,
114    e.g., how to parse XML documents in general; its scope is explicitly
115    limited to the <a
116    href="http://suika.fam.cx/www/markup/html/whatpm/readme">Whatpm</a>
117    implementation.
118    
119    <div class="issue ed">...
120    
121    <p>Much of invalid (well-formed or not) XML document parsing and XML document
122    / XML DOM conformance is left undefined so that this document provides a
123    guideline for conformance checkers.
124    </div>
125    
126    
127  </div>  </div>
# Line 125  document, however, corresponding parts o Line 154  document, however, corresponding parts o
154  specification should be consulted instead.</p>  specification should be consulted instead.</p>
155  </div>  </div>
156    
157    <div class=section id=processing-model>
158    <h2>Processing Model</h2>
159    
160    <p>Conceptually, validation of an XML document is split into two
161    stages for the purpose of this specification: the <dfn
162    id=xml-document-parsing>XML document parsing</dfn> stage and the <dfn
163    id=dom-xml-conformance-checking>DOM XML conformance checking</dfn>
164    stage.
165    
166    <p>The input to the XML document parsing stage is a byte sequence
167    representing the XML document to be parsed (and any additional metadata),
168    and the outputs are a DOM tree representing the XML document and zero
169    or more <a href="#error">errors</a>.  The processor that implements
170    this stage is called a <dfn id=parser>parser</dfn>.  Requirements for a
171    parser are defined in the section <a href="#parsing-xml">Parsing an
172    XML Document</a>.
173    
174    <p>The input to the DOM XML conformance checking stage is a DOM tree,
175    and the output is zero or more <a href="#error">errors</a>.  The
176    processor that implements this stage is called a <dfn
177    id=conformance-checker>conformance checker</dfn>.  Requirements for a
178    conformance checker are defined in the section <a
179    href="#checking-dom">Checking an XML DOM Tree</a>.
180    
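The two-stage processing model described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Whatpm implementation: the names `Error`, `parse`, `check`, and `validate` are hypothetical, the standard-library `xml.dom.minidom` stands in for the parser, and the single rule in `check` (a document needs a document element) is a placeholder for the checks defined later in the specification.

```python
from dataclasses import dataclass
from typing import List, Tuple
from xml.dom import minidom
from xml.parsers.expat import ExpatError


@dataclass
class Error:
    """Hypothetical error record: an error category plus a message."""
    category: str
    message: str


def parse(data: bytes) -> Tuple[minidom.Document, List[Error]]:
    """XML document parsing stage: bytes in, DOM tree and errors out."""
    errors: List[Error] = []
    try:
        doc = minidom.parseString(data)
    except ExpatError as exc:
        # Assumption: report the failure and hand back an empty tree.
        errors.append(Error("xml-well-formedness-error", str(exc)))
        doc = minidom.Document()
    return doc, errors


def check(doc: minidom.Document) -> List[Error]:
    """DOM XML conformance checking stage: DOM tree in, errors out."""
    errors: List[Error] = []
    # Placeholder rule for illustration only.
    if doc.documentElement is None:
        errors.append(Error("xml-well-formedness-error",
                            "document has no document element"))
    return errors


def validate(data: bytes) -> List[Error]:
    """Run both stages and collect every error raised."""
    doc, errors = parse(data)
    return errors + check(doc)
```

Calling `validate(b"<a><b/></a>")` runs both stages on a small document; keeping the stages separate lets a DOM tree built by other means be passed to `check` directly.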
181    
182  <div class=section id=error-categories>  <div class=section id=error-categories>
183  <h2>Error Classification</h2>  <h3>Error Classification</h3>
184    
185    <p class=ed>An <dfn id=error>error</dfn> is ...
186    
187  <p class=ed>If a <code>Document</code> node has no  <p class=ed>If a <code>Document</code> node has no
188  xml-well-formedness-error, entity-error, and unknown-error,  xml-well-formedness-error, entity-error, and unknown-error,
# Line 222  can be easily serialized into a valid XM Line 277  can be easily serialized into a valid XM
277  #dt-interop for interoperability</p>  #dt-interop for interoperability</p>
278    
279  <p>TODO: XML 1.1, XML Namespace 1.0/1.1, xml:base, xml:id  <p>TODO: XML 1.1, XML Namespace 1.0/1.1, xml:base, xml:id
280    
281    <p>TODO: XML "error"/"fatal error" is not always non-conforming (only
282    when MUST or SHOULD).
283    </div>
284    
285  </div>  </div>
286    
287  </div>  </div>
288    
289  <div class=section id=parsing-xml>  <div class=section id=parsing-xml>
290  <h2>Parsing <abbr>XML</abbr> Document</h2>  <h2>Parsing an <abbr>XML</abbr> Document</h2>
291    
292    <p>When a byte stream that represents an XML document is given to a
293    parser, it <em class=rfc2119>MUST</em> create a DOM tree according to
294    relevant specifications <span class=ed>[XML10, XML11, XMLNAMES10,
295    XMLNAMES11, DOM3CORE, WEBDOMCORE, DOMDTDEF, MANAKAIDOMEXT]</span>.
296    
297    <p>The parser <em class=rfc2119>MAY</em> continue the parsing of the
298    document even after a fatal error (as defined by the relevant
299    specifications) is encountered.  How the parsing ought to be continued
300    is not defined by this specification.
301    
302    <!--
303      <div class="note memo informative">
304      <p>It is expected that the XML5 specification <span class=ed>@@ ref</span>
305      will define how the parser has to convert any string into DOM tree
306      completely.
307      </div>
308    -->
309    
310    <p class=ed>A future version of this specification might define the
311    entire parser in terms of an input stream preprocessor, a tokenizer, and
312    a tree constructor.
313    
314    <p>In addition, the following requirements apply to the parser:
315    
 <p>When an <abbr>XML</abbr> document is parsed, the following clauses  
 are applied:</p>  
316  <dl class=switch>  <dl class=switch>
317    <dt>For each external entity (including the document entity and the external
318    subset entity, if any)
319      <dd>If there is a byte sequence that is not legal in the encoding in use,
320      then the parser <em class=rfc2119>MUST</em> raise an
321      <a href="#xml-misc-error" id=xme-illegal-bytes><code>xml-misc-error</code></a>.
322      <!--
323         <q>It is a fatal error when an XML processor encounters an entity with an
324         encoding that it is unable to process. It is a fatal error if an XML
325         entity is determined (via default, encoding declaration, or higher-level
326         protocol) to be in a certain encoding but contains byte sequences that are
327         not legal in that encoding.</q>
328      -->
329    
330      <dd>If it is the document entity or a general entity, then:
331        <ul>
332        <li>If the input byte sequence for the entity begins with the
333        <abbr title="BYTE ORDER MARK" class=charname>BOM</abbr>, then the parser
334        <em class=rfc2119>MUST</em> set the <span class=ed>BOM flag</span> of
335        the node corresponding to the entity (the <code>Document</code> node
336        for the document entity or an <code>Entity</code> node for a general
337        entity) to <code>true</code>.
338        <!--
339          <q>Entities encoded in UTF-16 MUST and entities encoded in UTF-8 MAY
340          begin with the Byte Order Mark</q>
341        -->
342        <span class=ed>@@ flag must be checked later</span>
343        <!-- <?xml encoding=""?> must be reflected to xmlEncoding; this should be
344        enforced by DOM Core spec. -->
345        </ul>
346      <dd>If it is a parameter entity or the external subset entity, then:
347        <ul>
348        <li>If the character encoding of the entity is <code>UTF-16</code> but
349        the input byte stream for the entity does not begin with the
350        <abbr title="BYTE ORDER MARK" class=charname>BOM</abbr>, then the parser
351        <em class=rfc2119>MUST</em> raise an
352        <a href="#xml-misc-error" id=xme-pe-bom><code>xml-misc-error</code></a>.
353        <li class=ed>@@ encoding="" preferred name?
354    <!--
355    "In an encoding declaration, the values "UTF-8", "UTF-16", "ISO-10646-UCS-2", and "ISO-10646-UCS-4" SHOULD be used for the various encodings and transformations of Unicode / ISO/IEC 10646, the values "ISO-8859-1", "ISO-8859-2", ... "ISO-8859-n" (where n is the part number) SHOULD be used for the parts of ISO 8859, and the values "ISO-2022-JP", "Shift_JIS", and "EUC-JP" SHOULD be used for the various encoded forms of JIS X-0208-1997. It is RECOMMENDED that character encodings registered (as charsets) with the Internet Assigned Numbers Authority [IANA-CHARSETS], other than those just listed, be referred to using their registered names; other encodings SHOULD use names starting with an "x-" prefix."
356    
357    In addition, this should be checked later for Document and Entity nodes.
358    -->
359        </ul>
360    
361  <dt>For the document  <dt>For the document
362    <dd>If the <abbr>XML</abbr> document does not begin with an    <dd>If the <abbr>XML</abbr> document does not begin with an
363    <abbr>XML</abbr> declaration, then the parser <em class=rfc2119>MUST</em>    <abbr>XML</abbr> declaration, then the parser <em class=rfc2119>MUST</em>
364    raise an    raise an
365    <a href="#xml-misc-recommentation" id=xmr-xml-decl><code>xml-misc-recommendation</code></a>.    <a href="#xml-misc-recommentation" id=xmr-xml-decl><code>xml-misc-recommendation</code></a>.
366      <dd>If the document does not contain the document type declaration, or
367      if it does but the document type definition does not contain entity
368      declaration for any of <code>amp</code>, <code>lt</code>, <code>gt</code>,
369      <code>apos</code>, or <code>quot</code>, then the parser
370      <em class=rfc2119>MUST</em> raise
371      <a href="#xml-misc-recommentation" id=xmr-predefined-decl><code>xml-misc-recommendation</code></a>(s).
372      <!--
373        <q>For interoperability, valid documents SHOULD declare the entities
374        amp, lt, gt, apos, quot, in the form specified in 4.6 Predefined
375        Entities.</q>
376      -->
377  <dt>For the document type declaration  <dt>For the document type declaration
378    <dd class=ed>@@ read external entity    <dd class=ed>@@ read external entity
379    <dd>The <code>entities</code> attribute of the <code>DocumentType</code>    <dd>The <code>entities</code> attribute of the <code>DocumentType</code>
# Line 401  parser Line 538  parser
538    type declaration as <code>EMPTY</code> content, then the parser    type declaration as <code>EMPTY</code> content, then the parser
539    <em class=rfc2119>MUST</em> raise an    <em class=rfc2119>MUST</em> raise an
540    <a href="#xml-misc-recommentation" id=xmr-empty-not-emptyelemtag><code>xml-misc-recommendation</code></a>.    <a href="#xml-misc-recommentation" id=xmr-empty-not-emptyelemtag><code>xml-misc-recommendation</code></a>.
541    <dt>For each attribute
542      <dd><p>The parser <em class=rfc2119>MUST</em> set the <code>value</code>
543      attribute of the <code>Attr</code> node created for the attribute to the
544      normalized value of the attribute.
545    
546        <div class="note memo informative">
547        <p>That is, any entity reference has to be expanded.  Unexpanded entity
548        references in attribute values are discarded.
549        </div>
550    <dt>For each <code>xml:space</code> attribute
551      <dd>The parser <em class=rfc2119>MUST</em> set the <code>value</code>
552      attribute of the <code>Attr</code> node created for the
553      <code>xml:space</code> attribute to the normalized value of the
554      attribute even if the normalized value is different from
555      <code>default</code> or <code>preserve</code>.
556      <!-- In XML 1.0/1.1 specification, the attribute specification MAY be
557      ignored. -->
558    
559  <dt>For each parameter entity reference  <dt>For each parameter entity reference
560    <dd>If the declaration for the entity is not read (i.e. no declaration    <dd><p>Process as follows:
561    for the entity is processed or the external entity referenced by the      <ol>
562    declaration cannot be retrieved), then:      <li>If the declaration for the entity is <em>not</em> processed,
563        then:
564          <dl class=switch>
565          <dt>If the document contains no external entity or if the document
566          contains the <code>standalone</code> pseudo-attribute set to
567          <code>yes</code><!-- or the document contains no DTD -->
568            <dd>The parser <em class=rfc2119>MUST</em> raise an
569            <a href="#xml-well-formedness-error" id=wf-entdeclared-pe><code>xml-well-formedness-error</code></a>.
570          <dt>Otherwise
571            <dd>The parser <em class=rfc2119>MUST</em> raise an
572            <a href="#xml-validity-error" id=vc-entdeclared-pe><code>xml-validity-error</code></a>.
573          </dl>
574        <li>If the declaration for the entity <em>is</em> processed but the
575        referenced entity cannot be retrieved, then the parser
576        <em class=rfc2119>MUST</em> raise an
577        <span class=ed>@@ ??-error</span>.
578        </ol>
579    
580        <p>In either of the two cases above, process as follows:
581      <ul>      <ul>
582      <li>If the parameter entity is contained in a declaration, then the      <li>If the parameter entity reference is contained in a declaration, then
583      declaration <em class=rfc2119>MUST</em> be ignored <em>except</em> that      the declaration <em class=rfc2119>MUST</em> be ignored <em>except</em> that
584      any error before the parameter entity <em class=rfc2119>MUST</em> be      any error before the parameter entity <em class=rfc2119>MUST</em> be
585      raised as usual.      raised as usual.
586      <li>If the parameter entity is contained in the status portion of a      <li>If the parameter entity reference is contained in the status portion of
587      conditional section, then the conditional section      a conditional section, then the conditional section
588      <em class=rfc2119>MUST</em> be processed as if it were an      <em class=rfc2119>MUST</em> be processed as if it were an
589      <code>IGNORE</code>d section.      <code>IGNORE</code>d section.
590      <li>The parser <em class=rfc2119>MUST NOT</em> process any entity or      <li>The parser <em class=rfc2119>MUST NOT</em> process any entity or
# Line 432  parser Line 604  parser
604      </ul>      </ul>
605  <dt>For each general entity reference in an attribute value or in the content  <dt>For each general entity reference in an attribute value or in the content
606  of an element  of an element
607    <dd>If the declaration for the entity is not read (i.e. no declaration for    <dd><p>Process as follows:
608    the entity is processed or the external entity referenced by the declaration      <ol>
609    cannot be retrieved), then:      <li>If the <code>Name</code> of the entity reference is one of
610        <code>amp</code>, <code>lt</code>, <code>gt</code>, <code>quot</code>,
611        or <code>apos</code>, then abort these steps.
612        <li>If the declaration for the entity is <em>not</em> processed,
613        then:
614          <dl class=switch>
615          <dt>If the document contains no external entity or if the document
616          contains the <code>standalone</code> pseudo-attribute set to
617          <code>yes</code><!-- or the document contains no DTD -->
618            <dd>The parser <em class=rfc2119>MUST</em> raise an
619            <a href="#xml-well-formedness-error" id=wf-entdeclared-ge><code>xml-well-formedness-error</code></a>.
620          <dt>Otherwise
621            <dd>The parser <em class=rfc2119>MUST</em> raise an
622            <a href="#xml-validity-error" id=vc-entdeclared-ge><code>xml-validity-error</code></a>.
623          </dl>
624        <li>If the declaration for the entity <em>is</em> processed but the
625        referenced entity cannot be retrieved, then the parser
626        <em class=rfc2119>MUST</em> raise an
627        <span class=ed>@@ ??-error</span>.
628        </ol>
629    
630        <p>In either of the two cases above, process as follows:
631      <ul>      <ul>
632      <li>If the general entity reference is the first reference to an entity      <li>If the general entity reference is the first reference to an entity
633      that is not read, then the parser <em class=rfc2119>MUST</em> raise an      that is not read, then the parser <em class=rfc2119>MUST</em> raise an
# Line 442  of an element Line 635  of an element
635      <span class=ed>@@ entity declared WFC?</span>      <span class=ed>@@ entity declared WFC?</span>
636      <li class=ed>An unexpanded entity reference node <em class=rfc2119>MUST</em> be inserted into the current node.      <li class=ed>An unexpanded entity reference node <em class=rfc2119>MUST</em> be inserted into the current node.
637      </ul>      </ul>
 </dl>  
638    
639  <p class=ed>@@ MUST try to read external entity  <dt>For each comment <em>outside</em> of document type declaration
640      <dd>A <code>Comment</code> node <em class=rfc2119>MUST</em> be created
641      and inserted appropriately.
642      <!-- In XML 1.0/1.1 spec, this is optional. -->
643    </dl>
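The numbered steps given in the list above for entity references can be sketched as one classification function. This is a hedged illustration, not the Whatpm code: the function name and its boolean parameters are hypothetical, and the string `"??-error"` simply mirrors the draft's own unresolved placeholder for the case where a declared entity cannot be retrieved.

```python
from typing import Optional

# Names that abort the steps for general entity references.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def classify_reference(name: str,
                       declaration_processed: bool,
                       entity_retrieved: bool,
                       has_external_entity: bool,
                       standalone_yes: bool,
                       is_general: bool = True) -> Optional[str]:
    """Return the error category to raise for a reference, or None."""
    # Step for general entity references only: predefined names are exempt.
    if is_general and name in PREDEFINED:
        return None
    if not declaration_processed:
        # No external entity, or standalone="yes": well-formedness error.
        if not has_external_entity or standalone_yes:
            return "xml-well-formedness-error"
        # Otherwise the declaration might live in an unread entity.
        return "xml-validity-error"
    if not entity_retrieved:
        # The draft leaves this category open ("@@ ??-error").
        return "??-error"
    return None
```

For example, `classify_reference("foo", False, False, False, False)` yields the well-formedness category, while the same reference in a document that has an external entity and no `standalone="yes"` yields the validity category.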
644    
645  <p>In addition, the parser has to check whether the  <p>The parser <em class=rfc2119>MUST</em> try to read any entity
646  following constraints are met.  referenced by general or parameter entity references<!--,--> and the
647    external subset entity, if any<!--, and any general entity declared-->
648    in the document type definition.
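When an entity is read as required above, the BOM rules stated earlier for each external entity come into play. A minimal sketch, assuming the entity's bytes, its determined encoding name, and its kind are already known; the function name, the `kind` strings, and the returned dictionary are hypothetical, and setting the flag to `False` when no BOM is present is an assumption the draft does not state.

```python
import codecs

UTF16_BOMS = (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)
ANY_BOM = UTF16_BOMS + (codecs.BOM_UTF8,)


def check_entity_bom(data: bytes, encoding: str, kind: str) -> dict:
    """Apply the BOM rules to one external entity.

    kind is one of "document", "general", "parameter", "external-subset".
    """
    result = {"bom_flag": None, "errors": []}
    if kind in ("document", "general"):
        # The BOM flag belongs to the Document or Entity node.
        result["bom_flag"] = data.startswith(ANY_BOM)
    elif kind in ("parameter", "external-subset"):
        # UTF-16 entities without a BOM are reported as xml-misc-error.
        if encoding.upper() == "UTF-16" and not data.startswith(UTF16_BOMS):
            result["errors"].append("xml-misc-error")
    return result
```

The check is deliberately separate from decoding, so that illegal byte sequences in the chosen encoding can be reported independently.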
649    
650    <p><strong>Well-formedness constraints</strong>.  When the parser
651    detects a violation of one of certain well-formedness constraints, it
652    <em class=rfc2119>MUST</em> raise an <a
653    href="#xml-well-formedness-error"><code>xml-well-formedness-error</code></a>.
654    The list of such well-formedness constraints is as follows:
655    
 <p><strong>Well-formedness constraints</strong>.  For each violation to  
 one of constraints below, an  
 <a href="#xml-well-formedness-error"><code>xml-well-formedness-error</code></a>  
 <em class=rfc2119>MUST</em> be raised.  The list of well-formedness  
 constraints is below:  
656  <ul>  <ul>
657  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#wfc-PEinInternalSubset">Well-formedness constraint: PEs in Internal Subset</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#wfc-PEinInternalSubset">Well-formedness constraint: PEs in Internal Subset</a>
658  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#GIMatch">Well-formedness constraint: Element Type Match</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#GIMatch">Well-formedness constraint: Element Type Match</a>
# Line 466  constraints is below: Line 665  constraints is below:
665  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#indtd">Well-formedness constraint: In DTD</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#indtd">Well-formedness constraint: In DTD</a>
666  </ul>  </ul>
667    
668  <p><strong>Validity constraints</strong>.  For each violation to  <p><strong>Validity constraints</strong>.  When the parser detects a
669  one of constraints below, an  violation of one of certain validity constraints, it <em
670  <a href="#xml-validity-error"><code>xml-validity-error</code></a>.  class=rfc2119>MUST</em> raise an <a
671  <em class=rfc2119>MUST</em> be raised.  The list of validity  href="#xml-validity-error"><code>xml-validity-error</code></a>.  The
672  constraints is below:  list of such validity constraints is as follows:
673    
674  <ul>  <ul>
675  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinMarkupDecl">Validity constraint: Proper Declaration/PE Nesting</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinMarkupDecl">Validity constraint: Proper Declaration/PE Nesting</a>
676  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinGroup">Validity constraint: Proper Group/PE Nesting</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinGroup">Validity constraint: Proper Group/PE Nesting</a>
677  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#condsec-nesting">Validity constraint: Proper Conditional Section/PE Nesting</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#condsec-nesting">Validity constraint: Proper Conditional Section/PE Nesting</a>
678    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-check-rmd">Validity constraint: Standalone Document Declaration</a>
679  </ul>  </ul>
680    
681  <p><strong>Other criteria</strong>.  For each violation to  <p><strong>Other criteria</strong>.  If the parser detects a violation
682  one of constraints below, an  of one of certain additional constraints, it <em
683  <a href="#xml-misc-recommendation"><code>xml-misc-recommendation</code></a>  class=rfc2119>MUST</em> raise an <a
684  <em class=rfc2119>MUST</em> be raised.  The list of constraints is below:  href="#xml-misc-recommendation"><code>xml-misc-recommendation</code></a>.
685    The list of such constraints is as follows:
686    
687  <ul>  <ul>
688  <li><q>For interoperability, if a parameter-entity reference appears in a  <li><q>For interoperability, if a parameter-entity reference appears in a
689  <code>choice</code>, <code>seq</code>, or <code>Mixed</code> construct, its  <code>choice</code>, <code>seq</code>, or <code>Mixed</code> construct, its
# Line 492  or <code>,</code>).</q> Line 695  or <code>,</code>).</q>
695  text declaration.</q>  text declaration.</q>
696  </ul>  </ul>
697    
698    <p>The parser <em class=rfc2119>MUST</em> act as if it were a <a
699    href="http://www.w3.org/TR/2006/REC-xml-20060816/#dt-validating">validating
700    XML processor</a> for the purpose of reporting white space
701    characters appearing in <a
702    href="http://www.w3.org/TR/2006/REC-xml-20060816/#dt-elemcontent">element
703    content</a> (See <a
704    href="http://www.w3.org/TR/2006/REC-xml-20060816/#sec-white-space">Section
705    2.10</a> of the XML specification).
706    
707      <div class="note memo informative">
708      <p>In other words, the <code>isElementContentWhitespace</code> attribute
709      of <code>Text</code> nodes has to be set appropriately.  Note that the
710      value of the attribute will be set to <code>false</code> for any
711      <code>Text</code> node in the content of an element whose declaration
712      is not processed.
713      </div>
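The note above can be illustrated with a small predicate. This is a sketch under assumptions: the function name is hypothetical, and `element_content_names` is an assumed set holding the names of elements whose processed declarations specify element content (children only, so not `EMPTY`, `ANY`, or mixed content).

```python
# White space characters as defined by the S production of XML.
XML_WHITESPACE = " \t\r\n"


def is_element_content_whitespace(text: str,
                                  parent_name: str,
                                  element_content_names: set) -> bool:
    """Value for Text.isElementContentWhitespace in this sketch."""
    # An element whose declaration is not processed never qualifies.
    if parent_name not in element_content_names:
        return False
    # Only a non-empty run of white space characters qualifies.
    return text != "" and all(ch in XML_WHITESPACE for ch in text)
```

With `element_content_names = {"list"}`, a `Text` node holding only a line break inside `list` gets `True`, while the same text inside an undeclared element gets `False`.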
714    
715    <p>The parser <em class=rfc2119>MUST</em> raise at least one <a
716    href="#xml-well-formedness-error"
717    id=wfe-syntax><code>xml-well-formedness-error</code></a> if the entity
718    it parses does not match the appropriate production rule in the XML
719    specification.  As an exception to this requirement, it <em
720    class=rfc2119>MAY</em> choose not to raise such an error if the error
721    will be raised by the conformance checker when the conformance checker
722    <a href="#algorithm-to-check-a-node" title="check a node">checks</a>
723    the <code>Document</code> object produced by the parser.
724    
725  <!--  <!--
726      <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#ExtSubset">Well-formedness constraint: External Subset</a>
727  @@ Need detailed review, but maybe should be in parsing phase    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#PE-between-Decls">Well-formedness constraint: PE Between Declarations</a>
   
 #vc-check-rmd Validity constraint: Standalone Document Declaration  
   
 @@ Need detailed review  
   
 #wf-entdeclared Well-formedness constraint: Entity Declared  
 #vc-entdeclared Validity constraint: Entity Declared  
 "For interoperability, valid documents SHOULD declare the entities amp, lt, gt, apos, quot, in the form specified in 4.6 Predefined Entities."  
   
 @@ flagged and then reported in DOM check phase  
   
 "Entities encoded in UTF-16 MUST and entities encoded in UTF-8 MAY begin with the Byte Order Mark"  
 "In the absence of external character encoding information (such as MIME headers), parsed entities which are stored in an encoding other than UTF-8 or UTF-16 MUST begin with a text declaration"  
 "In an encoding declaration, the values "UTF-8", "UTF-16", "ISO-10646-UCS-2", and "ISO-10646-UCS-4" SHOULD be used for the various encodings and transformations of Unicode / ISO/IEC 10646, the values "ISO-8859-1", "ISO-8859-2", ... "ISO-8859-n" (where n is the part number) SHOULD be used for the parts of ISO 8859, and the values "ISO-2022-JP", "Shift_JIS", and "EUC-JP" SHOULD be used for the various encoded forms of JIS X-0208-1997. It is RECOMMENDED that character encodings registered (as charsets) with the Internet Assigned Numbers Authority [IANA-CHARSETS], other than those just listed, be referred to using their registered names; other encodings SHOULD use names starting with an "x-" prefix."  
   
 @@ in parsing phase  
   
 "It is a fatal error when an XML processor encounters an entity with an encoding that it is unable to process. It is a fatal error if an XML entity is determined (via default, encoding declaration, or higher-level protocol) to be in a certain encoding but contains byte sequences that are not legal in that encoding."  
   
 @@ We should spell out what the parser should do where the XML specification  
 leaves it undefined.  For example: a comment must be converted to a Comment node,  
 an illegal xml:space value must be preserved, and so on.  
   
 Warn <!ENTITY % xml... ...>  
   
728  -->  -->
729    
 <p>The parser <em class=rfc2119>MUST</em> raise an  
 <a href="#xml-well-formedness-error" id=wfe-syntax><code>xml-well-formedness-error</code></a>  
 for any failure to match to a production rule in the XML specification.  
730  <!--  <!--
731    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#ExtSubset">Well-formedness constraint: External Subset</a>    Impossible to test:
732    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#PE-between-Decls">Well-formedness constraint: PE Between Declarations</a>      "In the absence of external character encoding information (such as MIME
733        headers), parsed entities which are stored in an encoding other than UTF-8
734        or UTF-16 MUST begin with a text declaration"
735  -->  -->
736  </div>  </div>
737    
738  <div class="section" id=checking-dom>  <div class="section" id=checking-dom>
739  <h2>Checking <abbr>DOM</abbr></h2>  <h2>Checking an <abbr>XML</abbr> <abbr>DOM</abbr> Tree</h2>
740    
741  <p>The following algorithms and definitions are applied to  <p>The following algorithms and definitions are applied to
742  <abbr>XML</abbr> documents; especially, they are not applied  <abbr>XML</abbr> documents; especially, they are not applied
# Line 1151  parsed entity)</dt> Line 1354  parsed entity)</dt>
1354      <li>If the <code>childNodes</code> list of <var>n</var> contains      <li>If the <code>childNodes</code> list of <var>n</var> contains
1355      any nodes, then raise an      any nodes, then raise an
1356      <a href="#xml-well-formedness-error" id=wfe-pi-child><code>xml-well-formedness-error</code></a>.</li>      <a href="#xml-well-formedness-error" id=wfe-pi-child><code>xml-well-formedness-error</code></a>.</li>
1357        <li class=ed>@@ Warn if not declared
1358      </ol>      </ol>
1359    </dd>    </dd>
1360  <dt>If <var>n</var> is a <code>Text</code> node</dt>  <dt>If <var>n</var> is a <code>Text</code> node</dt>
