Diff of /markup/xml/xmlcc/xmlcc-work.en.html

revision 1.19 by wakaba, Sat Dec 1 14:51:45 2007 UTC revision 1.26 by wakaba, Sat Mar 29 06:35:16 2008 UTC
# Line 18  Line 18 
18    
19  <div class="header">  <div class="header">
20  <h1>manakai's <abbr>XML</abbr> Conformance Checking</h1>  <h1>manakai's <abbr>XML</abbr> Conformance Checking</h1>
21  <h2>Working Draft <time datetime=2007-12-01>1 December 2007</time></h2>  <h2>Working Draft <time datetime=2008-03-29>29 March 2008</time></h2>
22    
23  <dl class="versions-uri">  <dl class="versions-uri">
24  <dt>This Version</dt>  <dt>This Version</dt>
# Line 42  Line 42 
42        >w@suika.fam.cx</a>&gt;</code></dd>        >w@suika.fam.cx</a>&gt;</code></dd>
43  </dl>  </dl>
44    
45  <p class="copyright" lang="en">&#xA9; <time>2007</time> <a  <p class="copyright" lang="en">&#xA9; <time>2007</time>‐<time>2008</time> <a
46      href="http://suika.fam.cx/~wakaba/who?" rel="author">Wakaba</a>.      href="http://suika.fam.cx/~wakaba/who?" rel="author">Wakaba</a>.
47  Permission is granted to copy, distribute and/or modify this document  Permission is granted to copy, distribute and/or modify this document
48  under the terms of the <a rel="license"  under the terms of the <a rel="license"
# Line 118  else in this specification is normative. Line 118  else in this specification is normative.
118  <p><span class=ed>Algorithm is normative but non-normative</span>.  <p><span class=ed>Algorithm is normative but non-normative</span>.
119  In addition, the order in which <a href="#errors">errors</a> are  In addition, the order in which <a href="#errors">errors</a> are
120  raised is undefined.</p>  raised is undefined.</p>
121    
122    <p>This document sometimes cites parts of the <abbr>XML</abbr> 1.0
123    specification by hyperlink.  When the document being processed is an
124    <abbr>XML</abbr> 1.1 document, however, the corresponding parts of the
125    <abbr>XML</abbr> 1.1 specification should be consulted instead.</p>
126  </div>  </div>
127    
128    
# Line 205  can be easily serialized into a valid XM Line 210  can be easily serialized into a valid XM
210    violate to any well‐formedness constraint in XML    violate to any well‐formedness constraint in XML
211    specification <cite class="bibref normative">[<a href="#ref-XML10">XML10</a>,    specification <cite class="bibref normative">[<a href="#ref-XML10">XML10</a>,
212    <a href="#ref-XML11">XML11</a>]</cite>.</p></dd>    <a href="#ref-XML11">XML11</a>]</cite>.</p></dd>
213    <dt><dfn id=misc-info><code>misc-info</code></dfn>
214      <dd><p>A <code>misc-info</code> is raised when status information
215      about the parsing or checking process that is considered useful for
216      debugging and similar purposes is available.  It by no means implies
217      that the document is non-conforming.
218  </dl>  </dl>
219    
220  <div class=ed><p>@@ TODO: #dt-atuseroption at user option  <div class=ed><p>@@ TODO: #dt-atuseroption at user option
221  (MAY or MUST), #dt-compat for compatibility,  (MAY or MUST), #dt-compat for compatibility,
222  #dt-interop for interoperability</p></div>  #dt-interop for interoperability</p>
223    
224    <p>TODO: XML 1.1, XML Namespace 1.0/1.1, xml:base, xml:id
225    </div>
226    
227  </div>  </div>
228    
229  <div class=section id=parsing-xml>  <div class=section id=parsing-xml>
230  <h2>Parsing <abbr>XML</abbr> Document</h2>  <h2>Parsing <abbr>XML</abbr> Document</h2>
231    
232  <ul>  <p>When an <abbr>XML</abbr> document is parsed, the following clauses
233  <li>If the <abbr>XML</abbr> document does not begin with an  are applied:</p>
234  <abbr>XML</abbr> declaration, then raise an  <dl class=switch>
235  <a href="#xml-misc-recommentation" id=xmr-xml-decl><code>xml-misc-recommendation</code></a>.</li>  <dt>For the document
236  <li>If the replacement text of an entity declaration is    <dd>If the <abbr>XML</abbr> document does not begin with an
237  <code>&lt;</code>, then raise an    <abbr>XML</abbr> declaration, then the parser <em class=rfc2119>MUST</em>
238  <a href="#xml-misc-warning" id=xmw-entity-value-lt><code>xml-misc-warning</code></a>.<!--    raise an
239      <a href="#xml-misc-recommentation" id=xmr-xml-decl><code>xml-misc-recommendation</code></a>.
240    <dt>For the document type declaration
241      <dd class=ed>@@ read external entity
242      <dd>The <code>entities</code> attribute of the <code>DocumentType</code>
243      node <em class=rfc2119>MUST</em> contain a <code>NamedNodeMap</code> object
244      whose first five items are as follows:
245        <ol start=0>
246        <li>An <code>Entity</code> node whose <code>nodeName</code> attribute
247        is <code>amp</code>.  It contains a <code>Text</code> node whose
248        <code>data</code> attribute is set to <code>&amp;</code>.
249        <li>An <code>Entity</code> node whose <code>nodeName</code> attribute
250        is <code>lt</code>.  It contains a <code>Text</code> node whose
251        <code>data</code> attribute is set to <code>&lt;</code>.
252        <li>An <code>Entity</code> node whose <code>nodeName</code> attribute
253        is <code>gt</code>.  It contains a <code>Text</code> node whose
254        <code>data</code> attribute is set to <code>></code>.
255        <li>An <code>Entity</code> node whose <code>nodeName</code> attribute
256        is <code>quot</code>.  It contains a <code>Text</code> node whose
257        <code>data</code> attribute is set to <code>"</code>.
258        <li>An <code>Entity</code> node whose <code>nodeName</code> attribute
259        is <code>apos</code>.  It contains a <code>Text</code> node whose
260        <code>data</code> attribute is set to <code>'</code>.
261        </ol>
262    <dt>For each internal general entity declaration being processed by the parser
263      <dd>If the
264      <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-EntityValue"><code>EntityValue</code></a>
265      part of the general entity declaration contains a bare <code>U+003C</code>
266      <code>LESS-THAN SIGN</code> (<code>&lt;</code>) character, then the parser
267      <em class=rfc2119>MUST</em> raise an
268      <a href="#xml-misc-warning" id=xmw-entity-value-lt><code>xml-misc-warning</code></a>.<!--
269  "strongly advised to avoid" in a Note in Section 2.3 of [XML10], [XML11].  "strongly advised to avoid" in a Note in Section 2.3 of [XML10], [XML11].
270  --></li>  -->
271  <li>If there is an element type declaration whose <code>Name</code>  <dt>For each element type declaration being processed by the parser
272  value is already declared, then raise an    <dd>If there is another processed element type declaration whose
273  <a href="#xml-validity-error" id=vc-edunique><code>xml-validity-error</code></a>.</li>    <code>Name</code> is equal to the <code>Name</code> of the element type
274  <li>If attribute definition whose <code>Name</code> is    declaration, then the parser <em class=rfc2119>MUST</em> raise an
275  <code>xml:space</code> has <span class=ed>declared type different from    <a href="#xml-validity-error" id=vc-edunique><code>xml-validity-error</code></a>.
276  (default|preserve), (default), or (preserve)</span>, then raise an  <dt>For each attribute definition list declaration being processed by the
277  <a href="#xml-misc-error" id=xme-ad-xml-space><code>xml-misc-error</code></a>.  parser
278  <span class=ed>@@ duplication with    <dd>If there is another processed attribute definition list declaration whose
279  <a href="#xml-at-xml-space">#xml-at-xml-space</a>.<!--    <code>Name</code> is equal to the <code>Name</code> of the attribute
280  <!ATTLIST e xml:space CDATA #IMPLIED xml:space CDATA #IMPLIED> --></span></li>    definition list declaration, then the parser <em class=rfc2119>MUST</em>
281  <li>If an empty-element tag is used for an element which is <em>not</em>    raise an
282  declared <code>EMPTY</code>, then raise an    <a href="#xml-misc-warning" id=xme-attlist-unique><code>xml-misc-warning</code></a>.
283  <a href="#xml-misc-recommentation" id=xmr-emptyelemtag-not-empty><code>xml-misc-recommendation</code></a>.</li>    <dd>For each attribute definition in the attribute definition list
284  <li>If an empty-element tag is <em>not</em> used for an element which is    declaration, if there is another processed attribute definition whose
285  declared <code>EMPTY</code>, then raise an    <code>Name</code> is equal to the <code>Name</code> of the attribute
286  <a href="#xml-misc-recommentation" id=xmr-empty-not-emptyelemtag><code>xml-misc-recommendation</code></a>.</li>    definition (whether or not in the same attribute definition list
287      declaration), then the parser <em class=rfc2119>MUST</em> raise an
288      <a href="#xml-misc-warning" id=xme-attrdef-unique><code>xml-misc-warning</code></a>.
289      <!--
290        <q>For interoperability, an XML processor <em class=rfc2119>MAY</em> at
291        user option issue a warning when more than one attribute-list declaration
292        is provided for a given element type, or more than one attribute definition
293        is provided for a given attribute, but this is not an error.</q>
294      -->
295  <!--  <!--
296      NOTE: <!ATTLIST a xml:space (default) #IMPLIED xml:space CDATA #IMPLIED>
297      will not be warned.
298    -->
299    
300    <dt>For each entity declaration being processed by the parser
301      <dd>Handle as follows:
302        <ol>
303        <li><p>If the entity declaration declares a general entity, the following
304        is applied:
305          <dl>
306          <dt>If the <code>Name</code> is <code>lt</code> or <code>amp</code>
307            <dd><p>If the entity declaration does not declare an internal entity,
308            or if the replacement text of the entity is not the escaped form of
309            <code>&lt;</code> (if <code>lt</code>) or <code>&amp;</code> (if
310            <code>amp</code>), then the parser <em class=rfc2119>MUST</em> raise an
311            <a href="#xml-misc-error" id=xme-double-escape><code>xml-misc-error</code></a>.
312    
313              <div class="note memo informative">
314              <p>In other words, the character in the <code>EntityValue</code>
315              has to be double-escaped.
316              </div>
317          <dt>If the <code>Name</code> is <code>gt</code>, <code>quot</code>, or
318          <code>apos</code>
319            <dd><p>If the entity declaration does not declare an internal entity,
320            or if the replacement text of the entity is not equal to or not the
321            escaped form of <code>></code> (if <code>gt</code>), <code>"</code> (if
322            <code>quot</code>), or <code>'</code> (if <code>apos</code>), then the
323            parser <em class=rfc2119>MUST</em> raise an
324            <a href="#xml-misc-error" id=xme-single-escape><code>xml-misc-error</code></a>.
325    
326              <div class="note memo informative">
327              <p>In other words, the character in the <code>EntityValue</code>
328              has to be single- or double-escaped.
329              </div>
330          </dl>
331          <!--
332            <q>If the entities lt or amp are declared, they MUST be declared as internal entities whose replacement text is a character reference to the respective character (less-than sign or ampersand) being escaped; the double escaping is REQUIRED for these entities so that references to them produce a well-formed result. If the entities gt, apos, or quot are declared, they MUST be declared as internal entities whose replacement text is the single character being escaped (or a character reference to that character; the double escaping here is OPTIONAL but harmless).</q>
333          -->
334    
335        <li><p>If the entity declaration has to be ignored because an entity
336        with the same <code>Name</code> has already been declared,
337        then the parser <em class=rfc2119>MUST</em> raise a
338        <a href="#misc-info" id=mi-ent-unique><code>misc-info</code></a>
339        and abort these steps.
340    
341        <div class="informative note memo">
342        <p>Five predefined entities, i.e. <code>amp</code>, <code>lt</code>,
343        <code>gt</code>, <code>quot</code>, and <code>apos</code>, are always
344        declared implicitly and therefore any declaration for such an entity
345        always raises an
346        <a href="#misc-info" id=mi-ent-unique><code>misc-info</code></a>.
347        </div>  
348    
349        <li><p>If the entity declaration declares a parameter entity and the
350        <code>Name</code> of the entity begins with the string <code>xml</code>
351        (in any combination of upper- and lowercase letters), then the parser
352        <em class=rfc2119>MUST</em> raise an
353        <a href="#xml-misc-warning" id=xmw-reserved-pe-name><code>xml-misc-warning</code></a>.
354    
355        <li><p>If the entity declaration contains an <code>EntityValue</code>,
356        then for each occurrence of a reference to an unparsed entity in the
357        <code>EntityValue</code>, the parser <em class=rfc2119>MUST</em> raise an
358        <a href="#xml-misc-error" id=xme-unparsed-in-ev><code>xml-misc-error</code></a>.
359        <!--
360          <q>It is an error for a reference to an unparsed entity to appear in the
361          EntityValue in an entity declaration.</q>
362        -->
363        <li><p>If the entity declaration declares a general entity, then an
364        <code>Entity</code> node <em class=rfc2119>MUST</em> be created and
365        appended to the <code>NamedNodeMap</code> object in the
366        <code>entities</code> attribute of the <code>DocumentType</code> node.
367        
368        <p class=ed>Read the external entity
369    
370        <p>If the replacement text of the entity is read, then parse the
371        replacement text as if it were referenced from the content of an
372        element (with no namespace bindings).  If no <span class=ed>@@ parse error</span>
373        is raised by the parsing process, then the nodes generated by the
374        parsing <em class=rfc2119>MUST</em> be appended to the <code>Entity</code>
375        node.  The parse error <em class=rfc2119>MUST NOT</em> be propagated to
376        the entire parsing process.  Other kinds of errors
377        <em class=rfc2119>MUST</em> be propagated.  The first parse error
378        <em class=rfc2119>MUST</em> abort the internal parsing process.
379        <span class=ed>@@ better wording</span>
380    
381        <p class=ed>@@ prop
382        
383        <p>Then, the <code>Entity</code> node and its descendants
384        <em class=rfc2119>MUST</em> be marked as read-only.
385      </ol>
386    
387    <dt>For each notation declaration being processed by the parser
388      <dd>If there is another processed notation declaration whose
389      <code>Name</code> is equal to the <code>Name</code> of the notation
390      declaration, then the parser <em class=rfc2119>MUST</em> raise an
391      <a href="#xml-validity-error" id=vc-uniquenotationname><code>xml-validity-error</code></a>.
392      <!-- <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#UniqueNotationName">Validity constraint: Unique Notation Name</a> -->
393    
394    <dt>For each empty-element tag
395      <dd>If the <code>Name</code> of the tag is not declared by a processed
396      element type declaration as <code>EMPTY</code> content, then the parser
397      <em class=rfc2119>MUST</em> raise an
398      <a href="#xml-misc-recommentation" id=xmr-emptyelemtag-not-empty><code>xml-misc-recommendation</code></a>.
399    <dt>For each start-tag
400      <dd>If the <code>Name</code> of the tag is declared by a processed element
401      type declaration as <code>EMPTY</code> content, then the parser
402      <em class=rfc2119>MUST</em> raise an
403      <a href="#xml-misc-recommentation" id=xmr-empty-not-emptyelemtag><code>xml-misc-recommendation</code></a>.
404    
405    <dt>For each parameter entity reference
406      <dd>If the declaration for the entity is not read (i.e. no declaration
407      for the entity is processed or the external entity referenced by the
408      declaration cannot be retrieved), then:
409        <ul>
410        <li>If the parameter entity is contained in a declaration, then the
411        declaration <em class=rfc2119>MUST</em> be ignored <em>except</em> that
412        any error before the parameter entity <em class=rfc2119>MUST</em> be
413        raised as usual.
414        <li>If the parameter entity is contained in the status portion of a
415        conditional section, then the conditional section
416        <em class=rfc2119>MUST</em> be processed as if it were an
417        <code>IGNORE</code>d section.
418        <li>The parser <em class=rfc2119>MUST NOT</em> process any entity or
419        attribute-list declaration after the parameter entity reference in the DTD
420        <em>except</em> when the <code>standalone</code> pseudo-attribute of the
421        XML declaration (if any) is set to <code>yes</code>.
422        <!-- This requirement is enforced for internal DTD subset case in
423        XML 1.0/1.1 specification (section 5.1) but not for any other cases. -->
424        <!-- According to this definition, element type declarations, notation
425        declarations, and PIs ARE processed. -->
426        <li>If the parameter entity reference is the first reference to an entity
427        that is not read, then the parser <em class=rfc2119>MUST</em> raise an
428        <a href="#entity-error" id=ee-unread-pe><code>entity-error</code></a>.
429        <li>The <code>allDeclarationsProcessed</code> <span class=ed>@@ ref</span>
430        attribute of the <code>Document</code> node <em class=rfc2119>MUST</em> be
431        set to <code>false</code>.
432        </ul>
433    <dt>For each general entity reference in an attribute value or in the content
434    of an element
435      <dd>If the declaration for the entity is not read (i.e. no declaration for
436      the entity is processed or the external entity referenced by the declaration
437      cannot be retrieved), then:
438        <ul>
439        <li>If the general entity reference is the first reference to an entity
440        that is not read, then the parser <em class=rfc2119>MUST</em> raise an
441        <a href="#entity-error" id=ee-unread-ge><code>entity-error</code></a>.
442        <span class=ed>@@ entity declared WFC?</span>
443        <li class=ed>An unexpanded entity reference node <em class=rfc2119>MUST</em> be inserted into the current node.
444        </ul>
445    </dl>
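The clause above for declarations of the predefined entities can be illustrated with a minimal Python sketch. It is not part of the specification or of manakai; the function names (`replacement_text`, `predefined_declaration_ok`) are hypothetical, and the sketch assumes that the declaration is an internal entity and that only character references in the `EntityValue` need to be expanded to obtain the replacement text.

```python
import re

# Matches a decimal or hexadecimal character reference such as &#60; or &#x3C;.
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")

PREDEFINED = {"lt": "<", "amp": "&", "gt": ">", "quot": '"', "apos": "'"}
# For lt and amp the replacement text has to be a character reference.
DOUBLE_ESCAPE_REQUIRED = {"lt", "amp"}


def _decode(match):
    """Return the character denoted by a matched character reference."""
    hex_digits, dec_digits = match.groups()
    return chr(int(hex_digits, 16) if hex_digits else int(dec_digits))


def replacement_text(entity_value):
    """Expand character references once (assumption: no PE references)."""
    return _CHAR_REF.sub(_decode, entity_value)


def predefined_declaration_ok(name, entity_value):
    """True if no xml-misc-error would be raised for this declaration."""
    char = PREDEFINED[name]
    text = replacement_text(entity_value)
    match = _CHAR_REF.fullmatch(text)
    if match is not None and _decode(match) == char:
        return True  # escaped form: acceptable for all five names
    # The bare character is acceptable only for gt, quot and apos.
    return name not in DOUBLE_ESCAPE_REQUIRED and text == char
```

For example, `<!ENTITY lt "&#38;#60;">` passes this check, while `<!ENTITY lt "&#60;">` does not, because its replacement text is the bare `<` character.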
446    
447    <p class=ed>@@ MUST try to read external entity
448    
449  #vc-PEinMarkupDecl Validity constraint: Proper Declaration/PE Nesting  <p>In addition, the parser has to check whether the
450  #wfc-PEinInternalSubset Well-formedness constraint: PEs in Internal Subset  following constraints are met.
451  #ExtSubset Well-formedness constraint: External Subset  
452  #PE-between-Decls Well-formedness constraint: PE Between Declarations  <p><strong>Well-formedness constraints</strong>.  For each violation of
453  #GIMatch Well-formedness constraint: Element Type Match  one of the constraints below, an
454  #uniqattspec Well-formedness constraint: Unique Att Spec  <a href="#xml-well-formedness-error"><code>xml-well-formedness-error</code></a>
455  #NoExternalRefs Well-formedness constraint: No External Entity References  <em class=rfc2119>MUST</em> be raised.  The list of well-formedness
456  #CleanAttrVals Well-formedness constraint: No < in Attribute Values  constraints is below:
457  #vc-PEinGroup Validity constraint: Proper Group/PE Nesting  <ul>
458  "For interoperability, if a parameter-entity reference appears in a choice, seq, or Mixed construct, its replacement text SHOULD contain at least one non-blank character, and neither the first nor last non-blank character of the replacement text SHOULD be a connector (| or ,)."  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#wfc-PEinInternalSubset">Well-formedness constraint: PEs in Internal Subset</a>
459  "For interoperability, an XML processor MAY at user option issue a warning when more than one attribute-list declaration is provided for a given element type, or more than one attribute definition is provided for a given attribute, but this is not an error."  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#GIMatch">Well-formedness constraint: Element Type Match</a>
460  #condsec-nesting Validity constraint: Proper Conditional Section/PE Nesting  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#uniqattspec">Well-formedness constraint: Unique Att Spec</a>
461  #wf-Legalchar Well-formedness constraint: Legal Character  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NoExternalRefs">Well-formedness constraint: No External Entity References</a>
462  #textent Well-formedness constraint: Parsed Entity  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#CleanAttrVals">Well-formedness constraint: No &lt; in Attribute Values</a>
463  #norecursion Well-formedness constraint: No Recursion  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#wf-Legalchar">Well-formedness constraint: Legal Character</a>
464  #indtd Well-formedness constraint: In DTD  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#textent">Well-formedness constraint: Parsed Entity</a>
465  "External parsed entities SHOULD each begin with a text declaration."  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#norecursion">Well-formedness constraint: No Recursion</a>
466  "It is an error for a reference to an unparsed entity to appear in the EntityValue in an entity declaration."  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#indtd">Well-formedness constraint: In DTD</a>
467  #UniqueNotationName Validity constraint: Unique Notation Name  </ul>
468    
469    <p><strong>Validity constraints</strong>.  For each violation of
470    one of the constraints below, an
471    <a href="#xml-validity-error"><code>xml-validity-error</code></a>
472    <em class=rfc2119>MUST</em> be raised.  The list of validity
473    constraints is below:
474    <ul>
475    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinMarkupDecl">Validity constraint: Proper Declaration/PE Nesting</a>
476    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinGroup">Validity constraint: Proper Group/PE Nesting</a>
477    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#condsec-nesting">Validity constraint: Proper Conditional Section/PE Nesting</a>
478    </ul>
479    
480    <p><strong>Other criteria</strong>.  For each violation of
481    one of the constraints below, an
482    <a href="#xml-misc-recommendation"><code>xml-misc-recommendation</code></a>
483    <em class=rfc2119>MUST</em> be raised.  The list of constraints is below:
484    <ul>
485    <li><q>For interoperability, if a parameter-entity reference appears in a
486    <code>choice</code>, <code>seq</code>, or <code>Mixed</code> construct, its
487    replacement text <em class=rfc2119>SHOULD</em> contain at least one non-blank
488    character, and neither the first nor last non-blank character of the
489    replacement text <em class=rfc2119>SHOULD</em> be a connector (<code>|</code>
490    or <code>,</code>).</q>
491    <li><q>External parsed entities <em class=rfc2119>SHOULD</em> each begin with a
492    text declaration.</q>
493    </ul>
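The first recommendation quoted above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `pe_group_text_ok` is hypothetical, and "blank" is taken to mean the XML white space characters (space, tab, carriage return, line feed).

```python
XML_WHITE_SPACE = " \t\r\n"
CONNECTORS = "|,"


def pe_group_text_ok(replacement_text):
    """True if the replacement text of a parameter entity referenced in a
    choice, seq, or Mixed construct follows the quoted recommendation."""
    stripped = replacement_text.strip(XML_WHITE_SPACE)
    if not stripped:
        # No non-blank character at all.
        return False
    # Neither the first nor the last non-blank character may be a connector.
    return stripped[0] not in CONNECTORS and stripped[-1] not in CONNECTORS
```

A checker built along these lines would raise an `xml-misc-recommendation` whenever the function returns `False`.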
494    
495    
496    <!--
497    
498  @@ Need detailed review, but maybe should be in parsing phase  @@ Need detailed review, but maybe should be in parsing phase
499    
# Line 272  declared <code>EMPTY</code>, then raise Line 504  declared <code>EMPTY</code>, then raise
504  #wf-entdeclared Well-formedness constraint: Entity Declared  #wf-entdeclared Well-formedness constraint: Entity Declared
505  #vc-entdeclared Validity constraint: Entity Declared  #vc-entdeclared Validity constraint: Entity Declared
506  "For interoperability, valid documents SHOULD declare the entities amp, lt, gt, apos, quot, in the form specified in 4.6 Predefined Entities."  "For interoperability, valid documents SHOULD declare the entities amp, lt, gt, apos, quot, in the form specified in 4.6 Predefined Entities."
 "If the entities lt or amp are declared, they MUST be declared as internal entities whose replacement text is a character reference to the respective character (less-than sign or ampersand) being escaped; the double escaping is REQUIRED for these entities so that references to them produce a well-formed result. If the entities gt, apos, or quot are declared, they MUST be declared as internal entities whose replacement text is the single character being escaped (or a character reference to that character; the double escaping here is OPTIONAL but harmless)."  
507    
508  @@ flagged and then reported in DOM check phase  @@ flagged and then reported in DOM check phase
509    
# Line 284  declared <code>EMPTY</code>, then raise Line 515  declared <code>EMPTY</code>, then raise
515    
516  "It is a fatal error when an XML processor encounters an entity with an encoding that it is unable to process. It is a fatal error if an XML entity is determined (via default, encoding declaration, or higher-level protocol) to be in a certain encoding but contains byte sequences that are not legal in that encoding."  "It is a fatal error when an XML processor encounters an entity with an encoding that it is unable to process. It is a fatal error if an XML entity is determined (via default, encoding declaration, or higher-level protocol) to be in a certain encoding but contains byte sequences that are not legal in that encoding."
517    
518    @@ We should spell out what the parser should do where the XML specification
519    leaves it undefined.  For example: a comment must be converted to a Comment
520    node, an illegal xml:space value must be preserved, and so on.
521    
522    Warn <!ENTITY % xml... ...>
523    
524    -->
525    
526    <p>The parser <em class=rfc2119>MUST</em> raise an
527    <a href="#xml-well-formedness-error" id=wfe-syntax><code>xml-well-formedness-error</code></a>
528    for any failure to match a production rule in the XML specification.
529    <!--
530      <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#ExtSubset">Well-formedness constraint: External Subset</a>
531      <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#PE-between-Decls">Well-formedness constraint: PE Between Declarations</a>
532  -->  -->
 </ul>  
533  </div>  </div>
534    
535  <div class="section" id=checking-dom>  <div class="section" id=checking-dom>
# Line 367  character that is <em>not</em> in the ch Line 611  character that is <em>not</em> in the ch
611  case combination), then raise an  case combination), then raise an
612  <a href="#xml-misc-warning" id=xmw-reserved-name><code>xml-misc-warning</code></a>.  <a href="#xml-misc-warning" id=xmw-reserved-name><code>xml-misc-warning</code></a>.
613  <span class=ed>@@ except for attribute names <code>xml:lang</code>,  <span class=ed>@@ except for attribute names <code>xml:lang</code>,
614  <code>xml:space</code>, <code>xml:base</code>, <code>xml:id</code>,  <code>xml:space</code><!--, <code>xml:base</code>, <code>xml:id</code>,
615  <code>xmlns</code>, <code>xmlns:<var>*</var></code>,  <code>xmlns</code>, <code>xmlns:<var>*</var></code>,
616  pi name <code>xml-stylesheet</code>.</span><!--  pi name <code>xml-stylesheet</code>-->.</span><!--
617  "names beginning with a match to (('X'|'x')('M'|'m')('L'|'l')) are reserved for standardization in this or future versions of this specification.":  "names beginning with a match to (('X'|'x')('M'|'m')('L'|'l')) are reserved for standardization in this or future versions of this specification.":
618  xmlns, xml-stylesheet, xml:base and xml:id specifications violate this sentence!  xmlns, xml-stylesheet, xml:base and xml:id specifications violate this sentence!
619  --></li>  --></li>
# Line 393  following algorithm <em class=rfc2119>MU Line 637  following algorithm <em class=rfc2119>MU
637  a public identifier (<dfn id=var-pid><var>pid</var></dfn>)</dfn>, the  a public identifier (<dfn id=var-pid><var>pid</var></dfn>)</dfn>, the
638  following algorithm <em class=rfc2119>MUST</em> be used:</p>  following algorithm <em class=rfc2119>MUST</em> be used:</p>
639  <ol>  <ol>
640    <li>If <var>pid</var> is <code>null</code>, abort these steps.</li>  <li>If <var>pid</var> is <code>null</code>, abort these steps.</li>
641    <li>If <var>pid</var> contains any character  <li>If <var>pid</var> contains a character that is <em>not</em> in the
642    that is outside of the range of <code>#x20 | #xD | #xA |  character class <a href="#class-PubidChar"><code>PubidChar</code></a>, then
643    [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]</code><!-- @@ TODO: formal def -->,  raise an
644    then it is an  <a href="#xml-well-formedness-error" id=wfe-pubid-char><code>xml-well-formedness-error</code></a>.</li>
   <a href="#xml-well-formedness-error" id=wfe-pubid-char><code>xml-well-formedness-error</code></a>.</li>  
645    <li>If <var>pid</var> contains one of <code class=char>U+0009</code>    <li>If <var>pid</var> contains one of <code class=char>U+0009</code>
646    <code class=charname>CHARACTER TABULATION</code>,    <code class=charname>CHARACTER TABULATION</code>,
647    <code class=char>U+000A</code> <code class=charname>CARRIAGE RETURN</code>,    <code class=char>U+000A</code> <code class=charname>CARRIAGE RETURN</code>,
# Line 414  following algorithm <em class=rfc2119>MU Line 657  following algorithm <em class=rfc2119>MU
657    <span class=ed>Is this really a roundtripness problem?  XML spec    <span class=ed>Is this really a roundtripness problem?  XML spec
658    does only define the way to match public identifiers in fact, no    does only define the way to match public identifiers in fact, no
659    canonical form.</span></li>    canonical form.</span></li>
   <li class=ed>@@ Should we check formal-public-identifierness?</li>  
660  </ol>  </ol>
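The character check in the second step of this algorithm can be sketched in Python as follows. The character class is the one spelled out in revision 1.19 (`#x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]`); the function name `invalid_pubid_chars` is hypothetical, and the sketch covers only that one step, not the rest of the algorithm.

```python
import re

# Any character outside the PubidChar class as listed in revision 1.19.
_NON_PUBID_CHAR = re.compile(r"[^\x20\x0D\x0Aa-zA-Z0-9\-'()+,./:=?;!*#@$_%]")


def invalid_pubid_chars(pid):
    """Return the sorted list of distinct characters in pid that are not
    in PubidChar.  A null (None) pid aborts the check with no errors."""
    if pid is None:
        return []
    return sorted(set(_NON_PUBID_CHAR.findall(pid)))
```

A non-empty result corresponds to the case in which an `xml-well-formedness-error` is raised.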
661    
662  <p>To  <p>To
# Line 561  following:</p> Line 803  following:</p>
803      <li>If <code>nodeName</code> attribute of <var>n</var> is      <li>If <code>nodeName</code> attribute of <var>n</var> is
804      <code>xml:space</code> <span class=ed>@@ or {xml namespace}:space ?</span>      <code>xml:space</code> <span class=ed>@@ or {xml namespace}:space ?</span>
805      and <span class=ed>its declared type is different from (default|preserve),      and <span class=ed>its declared type is different from (default|preserve),
806      (default), or (preserve)</span>, then raise an      (preserve|default), (default), or (preserve)</span>, then raise an
807      <a href="#xml-misc-error" id=xme-at-xml-space><code>xml-misc-error</code></a>.</li>      <a href="#xml-misc-error" id=xme-at-xml-space><code>xml-misc-error</code></a>.</li>
808      <li>For each node <dfn id=var-ad-nc><var>n<sub><var>c</var></sub></var></dfn> in the      <li>For each node <dfn id=var-ad-nc><var>n<sub><var>c</var></sub></var></dfn> in the
809      <code>childNodes</code> list of <var>n</var>,      <code>childNodes</code> list of <var>n</var>,
# Line 716  following:</p> Line 958  following:</p>
958      <a href="#algorithm-to-check-a-node" title="check a node">check the      <a href="#algorithm-to-check-a-node" title="check a node">check the
959      node</a> recursively.</li>      node</a> recursively.</li>
960      <li class=ed>@@ externally declared?</li>      <li class=ed>@@ externally declared?</li>
961        <li>If the <code>NamedNodeMap</code> object in the <code>entities</code>
962        attribute of <var>n</var> does not contain an <code>Entity</code> node
963        for each of the <code>nodeName</code> values <code>amp</code>,
964        <code>lt</code>, <code>gt</code>, <code>apos</code>, and <code>quot</code>,
965        then raise
966        <a href="#xml-misc-recommentation" id=xmr-predefined><code>xml-misc-recommendation</code></a>(s).</li>
967      </ol>      </ol>
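<p>The predefined-entity step above can be sketched as follows.  This is
only an illustration under stated assumptions, not the manakai
implementation: the function names are hypothetical, the
<code>NamedNodeMap</code> is simplified to a plain list of
<code>nodeName</code> strings, and reporting one recommendation per
missing name is an assumption, since the step above does not say how
many are raised.</p>

```python
# The five entity names the step above looks for.
PREDEFINED_ENTITY_NAMES = ("amp", "lt", "gt", "apos", "quot")


def missing_predefined_entities(entity_names):
    """Return the predefined entity names absent from entity_names.

    entity_names stands in for the nodeName values of the Entity nodes
    in the entities NamedNodeMap (a hypothetical simplification).
    """
    declared = set(entity_names)
    return [name for name in PREDEFINED_ENTITY_NAMES if name not in declared]


def check_predefined_entities(entity_names):
    """Yield one ("xml-misc-recommendation", name) pair per missing name.

    Raising one recommendation per missing name is an assumption.
    """
    for name in missing_predefined_entities(entity_names):
        yield ("xml-misc-recommendation", name)
```

<p>For example, a document type that declares only <code>amp</code> and
<code>lt</code> would be reported for <code>gt</code>, <code>apos</code>,
and <code>quot</code> under this sketch.</p>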
968    </dd>    </dd>
969  <dt>If <var>n</var> is an <code>Element</code> node</dt>  <dt>If <var>n</var> is an <code>Element</code> node</dt>
# Line 996  as amended by Line 1244  as amended by
1244  contains the following characters:</p>  contains the following characters:</p>
1245  <ul class=ed>  <ul class=ed>
1246  </ul>  </ul>
1247    <div class="note memo">
1248    <p>This character class contains all characters allowed as the first character
1249    of a string matching the production rule
1250    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-Name"><code>Name</code></a>
1251    of <abbr>XML</abbr> 1.0
1252    <cite class="bibref normative">[<a href="#ref-XML10">XML10</a>]</cite>.</p>
1253    </div>
1254    
1255  <p>The character class <dfn id=class-NameChar10><code>NameChar10</code></dfn>  <p>The character class <dfn id=class-NameChar10><code>NameChar10</code></dfn>
1256  contains the following characters:</p>  contains the following characters:</p>
# Line 1004  contains the following characters:</p> Line 1259  contains the following characters:</p>
1259  <a href="#class-NameStartChar10">NameStartChar10</a>.</li>  <a href="#class-NameStartChar10">NameStartChar10</a>.</li>
1260  <li class=ed></li>  <li class=ed></li>
1261  </ul>  </ul>
1262    <div class="note memo">
1263    <p>This character class contains all characters allowed as the second
1264    or any later character of a string matching the production rule
1265    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-Name"><code>Name</code></a>
1266    of <abbr>XML</abbr> 1.0
1267    <cite class="bibref normative">[<a href="#ref-XML10">XML10</a>]</cite>.</p>
1268    </div>
1269    
1270    <p>The character class <dfn id=class-PubidChar><code>PubidChar</code></dfn>
1271    contains the following characters:</p>
1272    <ul>
1273    <li><code class=char>U+0009</code> <code class=charname>CHARACTER
1274    TABULATION</code></li>
1275    <li><code class=char>U+000A</code> <code class=charname>LINE FEED</code></li>
1276    <li><code class=char>U+000D</code> <code class=charname>CARRIAGE
1277    RETURN</code></li>
1278    <li><code class=char>U+0020</code> <code class=charname>SPACE</code></li>
1279    <li><code class=char>U+0021</code> <code class=charname>EXCLAMATION MARK</code>
1280    (<code class=char>!</code>)</li>
1281    <li><code class=char>U+0023</code> <code class=charname>NUMBER SIGN</code>
1282    (<code class=char>#</code>)</li>
1283    <li><code class=char>U+0024</code> <code class=charname>DOLLAR SIGN</code>
1284    (<code class=char>$</code>)</li>
1285    <li><code class=char>U+0025</code> <code class=charname>PERCENT SIGN</code>
1286    (<code class=char>%</code>)</li>
1287    <li><code class=char>U+0027</code> <code class=charname>APOSTROPHE</code>
1288    (<code class=char>'</code>)</li>
1289    <li><code class=char>U+0028</code> <code class=charname>LEFT PARENTHESIS</code>
1290    (<code class=char>(</code>)</li>
1291    <li><code class=char>U+0029</code> <code class=charname>RIGHT
1292    PARENTHESIS</code> (<code class=char>)</code>)</li>
1293    <li><code class=char>U+002A</code> <code class=charname>ASTERISK</code>
1294    (<code class=char>*</code>)</li>
1295    <li><code class=char>U+002B</code> <code class=charname>PLUS SIGN</code>
1296    (<code class=char>+</code>)</li>
1297    <li><code class=char>U+002C</code> <code class=charname>COMMA</code>
1298    (<code class=char>,</code>)</li>
1299    <li><code class=char>U+002D</code> <code class=charname>HYPHEN-MINUS</code>
1300    (<code class=char>-</code>)</li>
1301    <li><code class=char>U+002E</code> <code class=charname>FULL STOP</code>
1302    (<code class=char>.</code>)</li>
1303    <li><code class=char>U+002F</code> <code class=charname>SOLIDUS</code>
1304    (<code class=char>/</code>)</li>
1305    <li><code class=char>U+0030</code> <code class=charname>DIGIT ZERO</code>
1306    (<code class=char>0</code>) .. <code class=char>U+0039</code>
1307    <code class=charname>DIGIT NINE</code> (<code class=char>9</code>)</li>
1308    <li><code class=char>U+003A</code> <code class=charname>COLON</code>
1309    (<code class=char>:</code>)</li>
1310    <li><code class=char>U+003B</code> <code class=charname>SEMICOLON</code>
1311    (<code class=char>;</code>)</li>
1312    <li><code class=char>U+003D</code> <code class=charname>EQUALS SIGN</code>
1313    (<code class=char>=</code>)</li>
1314    <li><code class=char>U+003F</code> <code class=charname>QUESTION MARK</code>
1315    (<code class=char>?</code>)</li>
1316    <li><code class=char>U+0040</code> <code class=charname>COMMERCIAL AT</code>
1317    (<code class=char>@</code>)</li>
1318    <li><code class=char>U+0041</code> <code class=charname>LATIN CAPITAL LETTER
1319    A</code> (<code class=char>A</code>) .. <code class=char>U+005A</code>
1320    <code class=charname>LATIN CAPITAL LETTER Z</code>
1321    (<code class=char>Z</code>)</li>
1322    <li><code class=char>U+005F</code> <code class=charname>LOW LINE</code>
1323    (<code class=char>_</code>)</li>
1324    <li><code class=char>U+0061</code> <code class=charname>LATIN SMALL LETTER
1325    A</code> (<code class=char>a</code>) .. <code class=char>U+007A</code>
1326    <code class=charname>LATIN SMALL LETTER Z</code>
1327    (<code class=char>z</code>)</li>
1328    </ul>
1329    <div class="note memo">
1330    <p>This character class contains all characters allowed in the production rule
1331    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-PubidChar"><code>PubidChar</code></a>
1332    of <abbr>XML</abbr> 1.0
1333    <cite class="bibref normative">[<a href="#ref-XML10">XML10</a>]</cite>.</p>
1334    </div>
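<p>A membership test for the <code>PubidChar</code> class listed above
might look like the following sketch.  The names
<code>PUBID_CHARS</code> and <code>non_pubid_chars</code> are
hypothetical and not part of this specification.  The set follows the
list above by code point, so it includes <code>U+0009</code>; note that
the <code>PubidChar</code> production of <abbr>XML</abbr> 1.0 itself
does not list <code>U+0009</code>.</p>

```python
import string

# Characters of the PubidChar class, following the list above by code point.
# U+0009 is present because the list above includes it; the XML 1.0
# PubidChar production does not list U+0009.
PUBID_CHARS = frozenset(
    "\u0009\u000A\u000D\u0020"   # tab, line feed, carriage return, space
    "!#$%'()*+,-./"              # U+0021, U+0023..U+0025, U+0027..U+002F
    ":;=?@_"                     # U+003A, U+003B, U+003D, U+003F, U+0040, U+005F
    + string.digits              # U+0030..U+0039
    + string.ascii_uppercase     # U+0041..U+005A
    + string.ascii_lowercase     # U+0061..U+007A
)


def non_pubid_chars(public_id):
    """Return (index, character) pairs for characters outside PUBID_CHARS."""
    return [(i, ch) for i, ch in enumerate(public_id) if ch not in PUBID_CHARS]
```

<p>An empty result means every character of the public identifier is in
the class; a checker could raise one error per returned pair, although
how errors are grouped is left open here.</p>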
1335    
1336  </div>  </div>
1337    
