Diff of /markup/xml/xmlcc/xmlcc-work.en.html

revision 1.20 by wakaba, Sat Dec 1 15:08:18 2007 UTC revision 1.25 by wakaba, Sat Mar 29 02:22:57 2008 UTC
# Line 18  Line 18 
18    
19  <div class="header">  <div class="header">
20  <h1>manakai's <abbr>XML</abbr> Conformance Checking</h1>  <h1>manakai's <abbr>XML</abbr> Conformance Checking</h1>
21  <h2>Working Draft <time datetime=2007-12-01>1 December 2007</time></h2>  <h2>Working Draft <time datetime=2008-03-29>29 March 2008</time></h2>
22    
23  <dl class="versions-uri">  <dl class="versions-uri">
24  <dt>This Version</dt>  <dt>This Version</dt>
# Line 42  Line 42 
42        >w@suika.fam.cx</a>&gt;</code></dd>        >w@suika.fam.cx</a>&gt;</code></dd>
43  </dl>  </dl>
44    
45  <p class="copyright" lang="en">&#xA9; <time>2007</time> <a  <p class="copyright" lang="en">&#xA9; <time>2007</time>‐<time>2008</time> <a
46      href="http://suika.fam.cx/~wakaba/who?" rel="author">Wakaba</a>.      href="http://suika.fam.cx/~wakaba/who?" rel="author">Wakaba</a>.
47  Permission is granted to copy, distribute and/or modify this document  Permission is granted to copy, distribute and/or modify this document
48  under the terms of the <a rel="license"  under the terms of the <a rel="license"
# Line 118  else in this specification is normative. Line 118  else in this specification is normative.
118  <p><span class=ed>Algorithm is normative but non-normative</span>.  <p><span class=ed>Algorithm is normative but non-normative</span>.
119  In addition, the order in which <a href="#errors">errors</a> are  In addition, the order in which <a href="#errors">errors</a> are
120  raised is undefined.</p>  raised is undefined.</p>
121    
122    <p>This document sometimes cites parts of <abbr>XML</abbr> 1.0 specification
123    by hyperlinks.  When the document being processed is an <abbr>XML</abbr> 1.1
124    document, however, corresponding parts of the <abbr>XML</abbr> 1.1
125    specification should be consulted instead.</p>
126  </div>  </div>
127    
128    
# Line 209  can be easily serialized into a valid XM Line 214  can be easily serialized into a valid XM
214    
215  <div class=ed><p>@@ TODO: #dt-atuseroption at user option  <div class=ed><p>@@ TODO: #dt-atuseroption at user option
216  (MAY or MUST), #dt-compat for compatibility,  (MAY or MUST), #dt-compat for compatibility,
217  #dt-interop for interoperability</p></div>  #dt-interop for interoperability</p>
218    
219    <p>TODO: XML 1.1, XML Namespace 1.0/1.1, xml:base, xml:id
220    </div>
221    
222  </div>  </div>
223    
224  <div class=section id=parsing-xml>  <div class=section id=parsing-xml>
225  <h2>Parsing <abbr>XML</abbr> Document</h2>  <h2>Parsing <abbr>XML</abbr> Document</h2>
226    
227  <ul>  <p>When an <abbr>XML</abbr> document is parsed, the following clauses
228  <li>If the <abbr>XML</abbr> document does not begin with an  are applied:</p>
229  <abbr>XML</abbr> declaration, then raise an  <dl class=switch>
230  <a href="#xml-misc-recommentation" id=xmr-xml-decl><code>xml-misc-recommendation</code></a>.</li>  <dt>For the document
231  <li>If the replacement text of an entity declaration is    <dd>If the <abbr>XML</abbr> document does not begin with an
232  <code>&lt;</code>, then raise an    <abbr>XML</abbr> declaration, then the parser <em class=rfc2119>MUST</em>
233  <a href="#xml-misc-warning" id=xmw-entity-value-lt><code>xml-misc-warning</code></a>.<!--    raise an
234      <a href="#xml-misc-recommentation" id=xmr-xml-decl><code>xml-misc-recommendation</code></a>.
235    <dt>For the document type declaration
236      <dd class=ed>@@ read external entity
237    <dt>For each internal general entity declaration being processed by the parser
238      <dd>If the
239      <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-EntityValue"><code>EntityValue</code></a>
240      part of the general entity declaration contains a bare <code>U+003C</code>
241      <code>LESS-THAN SIGN</code> (<code>&lt;</code>) character, then the parser
242      <em class=rfc2119>MUST</em> raise an
243      <a href="#xml-misc-warning" id=xmw-entity-value-lt><code>xml-misc-warning</code></a>.<!--
244  "strongly advised to avoid" in a Note in Section 2.3 of [XML10], [XML11].  "strongly advised to avoid" in a Note in Section 2.3 of [XML10], [XML11].
245  --></li>  -->
246  <li>If there is an element type declaration whose <code>Name</code>  <dt>For each element type declaration being processed by the parser
247  value is already declared, then raise an    <dd>If there is another processed element type declaration whose
248  <a href="#xml-validity-error" id=vc-edunique><code>xml-validity-error</code></a>.</li>    <code>Name</code> is equal to the <code>Name</code> of the element type
249  <li>If attribute definition whose <code>Name</code> is    declaration, then the parser <em class=rfc2119>MUST</em> raise an
250  <code>xml:space</code> has <span class=ed>declared type different from    <a href="#xml-validity-error" id=vc-edunique><code>xml-validity-error</code></a>.
251  (default|preserve), (default), or (preserve)</span>, then raise an  <dt>For each attribute definition list declaration being processed by the
252  <a href="#xml-misc-error" id=xme-ad-xml-space><code>xml-misc-error</code></a>.  parser
253  <span class=ed>@@ duplication with    <dd>If there is another processed attribute definition list declaration whose
254  <a href="#xml-at-xml-space">#xml-at-xml-space</a>.<!--    <code>Name</code> is equal to the <code>Name</code> of the attribute
255  <!ATTLIST e xml:space CDATA #IMPLIED xml:space CDATA #IMPLIED> --></span></li>    definition list declaration, then the parser <em class=rfc2119>MUST</em>
256  <li>If an empty-element tag is used for an element which is <em>not</em>    raise an
257  declared <code>EMPTY</code>, then raise an    <a href="#xml-misc-warning" id=xme-attlist-unique><code>xml-misc-warning</code></a>.
258  <a href="#xml-misc-recommentation" id=xmr-emptyelemtag-not-empty><code>xml-misc-recommendation</code></a>.</li>    <dd>For each attribute definition in the attribute definition list
259  <li>If an empty-element tag is <em>not</em> used for an element which is    declaration, if there is another processed attribute definition whose
260  declared <code>EMPTY</code>, then raise an    <code>Name</code> is equal to the <code>Name</code> of the attribute
261  <a href="#xml-misc-recommentation" id=xmr-empty-not-emptyelemtag><code>xml-misc-recommendation</code></a>.</li>    definition (whether or not in the same attribute definition list
262      declaration), then the parser <em class=rfc2119>MUST</em> raise an
263      <a href="#xml-misc-warning" id=xme-attrdef-unique><code>xml-misc-warning</code></a>.
264      <!--
265        <q>For interoperability, an XML processor <em class=rfc2119>MAY</em> at
266        user option issue a warning when more than one attribute-list declaration
267        is provided for a given element type, or more than one attribute definition
268        is provided for a given attribute, but this is not an error.</q>
269      -->
270    <!--
271      NOTE: <!ATTLIST a xml:space (default) #IMPLIED xml:space CDATA #IMPLIED>
272      will not be warned.
273    -->
274    
275    <dt>For each entity declaration being processed by the parser
276      <dd>If the entity declaration declares a parameter entity and the
277      <code>Name</code> of the entity begins with the string <code>xml</code>
278      (in any combination of upper- and lowercase letters), then the parser
279      <em class=rfc2119>MUST</em> raise an
280      <a href="#xml-misc-warning" id=xmw-reserved-pe-name><code>xml-misc-warning</code></a>.
281      <dd>If the entity declaration contains the <code>EntityValue</code>, then
282      for each occurrence of any reference to an unparsed entity in the
283      <code>EntityValue</code>, the parser <em class=rfc2119>MUST</em> raise an
284      <a href="#xml-misc-error" id=xme-unparsed-in-ev><code>xml-misc-error</code></a>.
285      <!--
286        <q>It is an error for a reference to an unparsed entity to appear in the
287        EntityValue in an entity declaration.</q>
288      -->
289    
290    <dt>For each notation declaration being processed by the parser
291      <dd>If there is another processed notation declaration whose
292      <code>Name</code> is equal to the <code>Name</code> of the notation
293      declaration, then the parser <em class=rfc2119>MUST</em> raise an
294      <a href="#xml-validity-error" id=vc-uniquenotationname><code>xml-validity-error</code></a>.
295      <!-- <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#UniqueNotationName">Validity constraint: Unique Notation Name</a> -->
296    
297    <dt>For each empty-element tag
298      <dd>If the <code>Name</code> of the tag is not declared by a processed
299      element type declaration as <code>EMPTY</code> content, then the parser
300      <em class=rfc2119>MUST</em> raise an
301      <a href="#xml-misc-recommentation" id=xmr-emptyelemtag-not-empty><code>xml-misc-recommendation</code></a>.
302    <dt>For each start-tag
303      <dd>If the <code>Name</code> of the tag is declared by a processed element
304      type declaration as <code>EMPTY</code> content, then the parser
305      <em class=rfc2119>MUST</em> raise an
306      <a href="#xml-misc-recommentation" id=xmr-empty-not-emptyelemtag><code>xml-misc-recommendation</code></a>.
307    
308    <dt>For each parameter entity reference
309      <dd>If the declaration for the entity is not read (i.e. no declaration
310      for the entity is processed or the external entity referenced by the
311      declaration cannot be retrieved), then:
312        <ul>
313        <li>If the parameter entity is contained in a declaration, then the
314        declaration <em class=rfc2119>MUST</em> be ignored <em>except</em> that
315        any error before the parameter entity <em class=rfc2119>MUST</em> be
316        raised as usual.
317        <li>If the parameter entity is contained in the status portion of a
318        conditional section, then the conditional section
319        <em class=rfc2119>MUST</em> be processed as if it were an
320        <code>IGNORE</code>d section.
321        <li>The parser <em class=rfc2119>MUST NOT</em> process any entity or
322        attribute-list declaration after the parameter entity reference in the DTD
323        <em>except</em> when the <code>standalone</code> pseudo-attribute of the
324        XML declaration (if any) is set to <code>yes</code>.
325        <!-- This requirement is enforced for internal DTD subset case in
326        XML 1.0/1.1 specification (section 5.1) but not for any other cases. -->
327        <!-- According to this definition, element type declarations, notation
328        declarations, and PIs ARE processed. -->
329        <li>If the parameter entity reference is the first reference to an entity
330        that is not read, then the parser <em class=rfc2119>MUST</em> raise an
331        <a href="#entity-error" id=ee-unread-pe><code>entity-error</code></a>.
332        <li>The <code>allDeclarationsProcessed</code> <span class=ed>@@ ref</span>
333        attribute of the <code>Document</code> node <em class=rfc2119>MUST</em> be
334        set to <code>false</code>.
335        </ul>
336    <dt>For each general entity reference in an attribute value or in the content
337    of an element
338      <dd>If the declaration for the entity is not read (i.e. no declaration for
339      the entity is processed or the external entity referenced by the declaration
340      cannot be retrieved), then:
341        <ul>
342        <li>If the general entity reference is the first reference to an entity
343        that is not read, then the parser <em class=rfc2119>MUST</em> raise an
344        <a href="#entity-error" id=ee-unread-ge><code>entity-error</code></a>.
345        <span class=ed>@@ entity declared WFC?</span>
346        <li class=ed>An unexpanded entity reference node <em class=rfc2119>MUST</em> be inserted into the current node.
347        </ul>
348    </dl>
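
The uniqueness and reserved-name clauses in the list above can be sketched as follows. This is a minimal illustration in Python, not manakai's implementation: the `Declaration` record, the `check_declarations` function, and the tuple-based error reporting are hypothetical names introduced here. The sketch covers only duplicate element type, attribute definition list, and notation declarations plus reserved parameter entity names, and it reports each duplicate at the second and later occurrences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    """Hypothetical record for one declaration processed by the parser."""
    kind: str  # "element", "attlist", "notation", or "parameter-entity"
    name: str  # the Name of the declaration


# Error level raised when the same Name is declared again, per declaration kind.
DUPLICATE_LEVEL = {
    "element": "xml-validity-error",
    "notation": "xml-validity-error",
    "attlist": "xml-misc-warning",
}


def check_declarations(declarations):
    """Return (error level, kind, name) tuples in processing order."""
    errors = []
    seen = set()
    for decl in declarations:
        key = (decl.kind, decl.name)
        if decl.kind in DUPLICATE_LEVEL:
            if key in seen:
                errors.append((DUPLICATE_LEVEL[decl.kind], decl.kind, decl.name))
            seen.add(key)
        # Parameter entity names beginning with "xml" (any case) are reserved.
        if decl.kind == "parameter-entity" and decl.name.lower().startswith("xml"):
            errors.append(("xml-misc-warning", decl.kind, decl.name))
    return errors
```

Keeping the error level in a table keyed by declaration kind mirrors the way the clauses differ only in the class of error they raise.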
349    
350    <p class=ed>@@ MUST try to read external entity
351    
352    <p>In addition, the parser has to check whether the
353    following constraints are met.
354    
355    <p><strong>Well-formedness constraints</strong>.  For each violation of
356    one of the constraints below, an
357    <a href="#xml-well-formedness-error"><code>xml-well-formedness-error</code></a>
358    <em class=rfc2119>MUST</em> be raised.  The list of well-formedness
359    constraints is below:
360    <ul>
361    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#wfc-PEinInternalSubset">Well-formedness constraint: PEs in Internal Subset</a>
362    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#GIMatch">Well-formedness constraint: Element Type Match</a>
363    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#uniqattspec">Well-formedness constraint: Unique Att Spec</a>
364    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NoExternalRefs">Well-formedness constraint: No External Entity References</a>
365    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#CleanAttrVals">Well-formedness constraint: No &lt; in Attribute Values</a>
366    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#wf-Legalchar">Well-formedness constraint: Legal Character</a>
367    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#textent">Well-formedness constraint: Parsed Entity</a>
368    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#norecursion">Well-formedness constraint: No Recursion</a>
369    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#indtd">Well-formedness constraint: In DTD</a>
370    </ul>
371    
372    <p><strong>Validity constraints</strong>.  For each violation of
373    one of the constraints below, an
374    <a href="#xml-validity-error"><code>xml-validity-error</code></a>
375    <em class=rfc2119>MUST</em> be raised.  The list of validity
376    constraints is below:
377    <ul>
378    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinMarkupDecl">Validity constraint: Proper Declaration/PE Nesting</a>
379    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinGroup">Validity constraint: Proper Group/PE Nesting</a>
380    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#condsec-nesting">Validity constraint: Proper Conditional Section/PE Nesting</a>
381    </ul>
382    
383    <p><strong>Other criteria</strong>.  For each violation of
384    one of the constraints below, an
385    <a href="#xml-misc-recommendation"><code>xml-misc-recommendation</code></a>
386    <em class=rfc2119>MUST</em> be raised.  The list of constraints is below:
387    <ul>
388    <li><q>For interoperability, if a parameter-entity reference appears in a
389    <code>choice</code>, <code>seq</code>, or <code>Mixed</code> construct, its
390    replacement text <em class=rfc2119>SHOULD</em> contain at least one non-blank
391    character, and neither the first nor last non-blank character of the
392    replacement text <em class=rfc2119>SHOULD</em> be a connector (<code>|</code>
393    or <code>,</code>).</q>
394    <li><q>External parsed entities <em class=rfc2119>SHOULD</em> each begin with a
395    text declaration.</q>
396    </ul>
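
The first criterion in the list above, on parameter entity replacement text inside a `choice`, `seq`, or `Mixed` construct, can be sketched as a small predicate. This is an illustrative Python sketch under stated assumptions: the function name `pe_replacement_text_ok` is hypothetical, and "non-blank" is assumed to mean any character other than the white space characters of the XML `S` production.

```python
# Connectors named in the quoted sentence.
CONNECTORS = {"|", ","}
# Assumption: "blank" means the white space characters of the XML S production.
XML_WHITESPACE = " \t\r\n"


def pe_replacement_text_ok(replacement_text):
    """Return True if the replacement text satisfies the interoperability advice."""
    stripped = replacement_text.strip(XML_WHITESPACE)
    if not stripped:
        # No non-blank character at all.
        return False
    # Neither the first nor the last non-blank character may be a connector.
    return stripped[0] not in CONNECTORS and stripped[-1] not in CONNECTORS
```

A checker following this draft would raise an `xml-misc-recommendation` whenever the predicate returns `False` for a reference appearing in one of those constructs.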
397    
398    
399  <!--  <!--
400    
 #vc-PEinMarkupDecl Validity constraint: Proper Declaration/PE Nesting  
 #wfc-PEinInternalSubset Well-formedness constraint: PEs in Internal Subset  
 #ExtSubset Well-formedness constraint: External Subset  
 #PE-between-Decls Well-formedness constraint: PE Between Declarations  
 #GIMatch Well-formedness constraint: Element Type Match  
 #uniqattspec Well-formedness constraint: Unique Att Spec  
 #NoExternalRefs Well-formedness constraint: No External Entity References  
 #CleanAttrVals Well-formedness constraint: No < in Attribute Values  
 #vc-PEinGroup Validity constraint: Proper Group/PE Nesting  
 "For interoperability, if a parameter-entity reference appears in a choice, seq, or Mixed construct, its replacement text SHOULD contain at least one non-blank character, and neither the first nor last non-blank character of the replacement text SHOULD be a connector (| or ,)."  
 "For interoperability, an XML processor MAY at user option issue a warning when more than one attribute-list declaration is provided for a given element type, or more than one attribute definition is provided for a given attribute, but this is not an error."  
 #condsec-nesting Validity constraint: Proper Conditional Section/PE Nesting  
 #wf-Legalchar Well-formedness constraint: Legal Character  
 #textent Well-formedness constraint: Parsed Entity  
 #norecursion Well-formedness constraint: No Recursion  
 #indtd Well-formedness constraint: In DTD  
 "External parsed entities SHOULD each begin with a text declaration."  
 "It is an error for a reference to an unparsed entity to appear in the EntityValue in an entity declaration."  
 #UniqueNotationName Validity constraint: Unique Notation Name  
401    
402  @@ Need detailed review, but maybe should be in parsing phase  @@ Need detailed review, but maybe should be in parsing phase
403    
# Line 284  declared <code>EMPTY</code>, then raise Line 420  declared <code>EMPTY</code>, then raise
420    
421  "It is a fatal error when an XML processor encounters an entity with an encoding that it is unable to process. It is a fatal error if an XML entity is determined (via default, encoding declaration, or higher-level protocol) to be in a certain encoding but contains byte sequences that are not legal in that encoding."  "It is a fatal error when an XML processor encounters an entity with an encoding that it is unable to process. It is a fatal error if an XML entity is determined (via default, encoding declaration, or higher-level protocol) to be in a certain encoding but contains byte sequences that are not legal in that encoding."
422    
423    @@ We should spell out what the parser should do where the XML specification
424    leaves behavior undefined.  For example: Comment must be converted to a Comment node,
425    illegal xml:space value must be preserved, and so on.
426    
427    Warn <!ENTITY % xml... ...>
428    
429    -->
430    
431    <p>The parser <em class=rfc2119>MUST</em> raise an
432    <a href="#xml-well-formedness-error" id=wfe-syntax><code>xml-well-formedness-error</code></a>
433    for any failure to match a production rule in the XML specification.
434    <!--
435      <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#ExtSubset">Well-formedness constraint: External Subset</a>
436      <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#PE-between-Decls">Well-formedness constraint: PE Between Declarations</a>
437  -->  -->
 </ul>  
438  </div>  </div>
439    
440  <div class="section" id=checking-dom>  <div class="section" id=checking-dom>
# Line 367  character that is <em>not</em> in the ch Line 516  character that is <em>not</em> in the ch
516  case combination), then raise an  case combination), then raise an
517  <a href="#xml-misc-warning" id=xmw-reserved-name><code>xml-misc-warning</code></a>.  <a href="#xml-misc-warning" id=xmw-reserved-name><code>xml-misc-warning</code></a>.
518  <span class=ed>@@ except for attribute names <code>xml:lang</code>,  <span class=ed>@@ except for attribute names <code>xml:lang</code>,
519  <code>xml:space</code>, <code>xml:base</code>, <code>xml:id</code>,  <code>xml:space</code><!--, <code>xml:base</code>, <code>xml:id</code>,
520  <code>xmlns</code>, <code>xmlns:<var>*</var></code>,  <code>xmlns</code>, <code>xmlns:<var>*</var></code>,
521  pi name <code>xml-stylesheet</code>.</span><!--  pi name <code>xml-stylesheet</code>-->.</span><!--
522  "names beginning with a match to (('X'|'x')('M'|'m')('L'|'l')) are reserved for standardization in this or future versions of this specification.":  "names beginning with a match to (('X'|'x')('M'|'m')('L'|'l')) are reserved for standardization in this or future versions of this specification.":
523  xmlns, xml-stylesheet, xml:base and xml:id specifications violate this sentence!  xmlns, xml-stylesheet, xml:base and xml:id specifications violate this sentence!
524  --></li>  --></li>
# Line 559  following:</p> Line 708  following:</p>
708      <li>If <code>nodeName</code> attribute of <var>n</var> is      <li>If <code>nodeName</code> attribute of <var>n</var> is
709      <code>xml:space</code> <span class=ed>@@ or {xml namespace}:space ?</span>      <code>xml:space</code> <span class=ed>@@ or {xml namespace}:space ?</span>
710      and <span class=ed>its declared type is different from (default|preserve),      and <span class=ed>its declared type is different from (default|preserve),
711      (default), or (preserve)</span>, then raise an      (preserve|default), (default), or (preserve)</span>, then raise an
712      <a href="#xml-misc-error" id=xme-at-xml-space><code>xml-misc-error</code></a>.</li>      <a href="#xml-misc-error" id=xme-at-xml-space><code>xml-misc-error</code></a>.</li>
713      <li>For each node <dfn id=var-ad-nc><var>n<sub><var>c</var></sub></var></dfn> in the      <li>For each node <dfn id=var-ad-nc><var>n<sub><var>c</var></sub></var></dfn> in the
714      <code>childNodes</code> list of <var>n</var>,      <code>childNodes</code> list of <var>n</var>,
# Line 1052  PARENTHESIS</code> (<code class=char>)</ Line 1201  PARENTHESIS</code> (<code class=char>)</
1201  (<code class=char>.</code>)</li>  (<code class=char>.</code>)</li>
1202  <li><code class=char>U+002F</code> <code class=charname>SOLIDUS</code>  <li><code class=char>U+002F</code> <code class=charname>SOLIDUS</code>
1203  (<code class=char>/</code>)</li>  (<code class=char>/</code>)</li>
1204    <li><code class=char>U+0030</code> <code class=charname>DIGIT ZERO</code>
1205    (<code class=char>0</code>) .. <code class=char>U+0039</code>
1206    <code class=charname>DIGIT NINE</code> (<code class=char>9</code>)</li>
1207  <li><code class=char>U+003A</code> <code class=charname>COLON</code>  <li><code class=char>U+003A</code> <code class=charname>COLON</code>
1208  (<code class=char>:</code>)</li>  (<code class=char>:</code>)</li>
1209  <li><code class=char>U+003B</code> <code class=charname>SEMICOLON</code>  <li><code class=char>U+003B</code> <code class=charname>SEMICOLON</code>
# Line 1062  PARENTHESIS</code> (<code class=char>)</ Line 1214  PARENTHESIS</code> (<code class=char>)</
1214  (<code class=char>?</code>)</li>  (<code class=char>?</code>)</li>
1215  <li><code class=char>U+0040</code> <code class=charname>COMMERCIAL AT</code>  <li><code class=char>U+0040</code> <code class=charname>COMMERCIAL AT</code>
1216  (<code class=char>@</code>)</li>  (<code class=char>@</code>)</li>
1217    <li><code class=char>U+0041</code> <code class=charname>LATIN CAPITAL LETTER
1218    A</code> (<code class=char>A</code>) .. <code class=char>U+005A</code>
1219    <code class=charname>LATIN CAPITAL LETTER Z</code>
1220    (<code class=char>Z</code>)</li>
1221  <li><code class=char>U+005F</code> <code class=charname>LOW LINE</code>  <li><code class=char>U+005F</code> <code class=charname>LOW LINE</code>
1222  (<code class=char>_</code>)</li>  (<code class=char>_</code>)</li>
1223    <li><code class=char>U+0061</code> <code class=charname>LATIN SMALL LETTER
1224    A</code> (<code class=char>a</code>) .. <code class=char>U+007A</code>
1225    <code class=charname>LATIN SMALL LETTER Z</code>
1226    (<code class=char>z</code>)</li>
1227  </ul>  </ul>
1228  <div class="note memo">  <div class="note memo">
1229  <p>This character class contains all characters allowed in the production rule  <p>This character class contains all characters allowed in the production rule
