Diff of /markup/xml/xmlcc/xmlcc-work.en.html

revision 1.25 by wakaba, Sat Mar 29 02:22:57 2008 UTC revision 1.28 by wakaba, Sat Mar 29 11:46:02 2008 UTC
# Line 92  normative version.</p> Line 92  normative version.</p>
92    
93  <p class=section-info>This section is <em>non-normative</em>.</p>  <p class=section-info>This section is <em>non-normative</em>.</p>
94    
95  <div class="issue ed">...</div>  <div class="issue ed">...
96    
97    <p>Much of the parsing of invalid (well-formed or not) XML documents, and of
98    XML document / XML DOM conformance, is left undefined, so this document
99    provides a guideline for conformance checkers.
100    </div>
101    
102    
103  </div>  </div>
# Line 210  can be easily serialized into a valid XM Line 215  can be easily serialized into a valid XM
215    violate any well-formedness constraint in XML    violate any well-formedness constraint in XML
216    specification <cite class="bibref normative">[<a href="#ref-XML10">XML10</a>,    specification <cite class="bibref normative">[<a href="#ref-XML10">XML10</a>,
217    <a href="#ref-XML11">XML11</a>]</cite>.</p></dd>    <a href="#ref-XML11">XML11</a>]</cite>.</p></dd>
218    <dt><dfn id=misc-info><code>misc-info</code></dfn>
219      <dd><p>A <code>misc-info</code> is raised when status information
220      about the parsing or checking process that is considered useful, for
221      example for debugging, is available.  It by no means implies
222      non-conformance of the document.
223  </dl>  </dl>
224    
225  <div class=ed><p>@@ TODO: #dt-atuseroption at user option  <div class=ed><p>@@ TODO: #dt-atuseroption at user option
# Line 217  can be easily serialized into a valid XM Line 227  can be easily serialized into a valid XM
227  #dt-interop for interoperability</p>  #dt-interop for interoperability</p>
228    
229  <p>TODO: XML 1.1, XML Namespace 1.0/1.1, xml:base, xml:id  <p>TODO: XML 1.1, XML Namespace 1.0/1.1, xml:base, xml:id
230    
231    <p>TODO: XML "error"/"fatal error" is not always non-conforming (only
232    when MUST or SHOULD).
233  </div>  </div>
234    
235    <p>The parser <em class=rfc2119>MAY</em> continue the parsing of the document
236    even after a fatal error (as defined by the relevant specification) is
237    encountered.  How the parsing ought to be continued is not defined by this
238    specification.
239    
240      <div class="note memo informative">
241      <p>It is expected that the XML5 specification <span class=ed>@@ ref</span>
242      will fully define how the parser has to convert any string into a DOM
243      tree.
244      </div>
245    
246  </div>  </div>
247    
248  <div class=section id=parsing-xml>  <div class=section id=parsing-xml>
# Line 227  can be easily serialized into a valid XM Line 251  can be easily serialized into a valid XM
251  <p>When an <abbr>XML</abbr> document is parsed, the following clauses  <p>When an <abbr>XML</abbr> document is parsed, the following clauses
252  are applied:</p>  are applied:</p>
253  <dl class=switch>  <dl class=switch>
254    <dt>For each external entity (including the document entity and the external
255    subset entity, if any)
256      <dd>If there is a byte sequence that is not legal in the encoding in use,
257      then the parser <em class=rfc2119>MUST</em> raise an
258      <a href="#xml-misc-error" id=xme-illegal-bytes><code>xml-misc-error</code></a>.
259      <!--
260         <q>It is a fatal error when an XML processor encounters an entity with an
261         encoding that it is unable to process. It is a fatal error if an XML
262         entity is determined (via default, encoding declaration, or higher-level
263         protocol) to be in a certain encoding but contains byte sequences that are
264         not legal in that encoding.</q>
265      -->
266    
267      <dd>If it is the document entity or a general entity, then:
268        <ul>
269        <li>If the input byte sequence for the entity begins with the
270        <abbr title="BYTE ORDER MARK" class=charname>BOM</abbr>, then the parser
271        <em class=rfc2119>MUST</em> set the <span class=ed>BOM flag</span> of
272        the node corresponding to the entity (the <code>Document</code> node
273        for the document entity or an <code>Entity</code> node for a general
274        entity) to <code>true</code>.
275        <!--
276          <q>Entities encoded in UTF-16 MUST and entities encoded in UTF-8 MAY
277          begin with the Byte Order Mark</q>
278        -->
279        <span class=ed>@@ flag must be checked later</span>
280        <!-- <?xml encoding=""?> must be reflected to xmlEncoding; this should be
281        enforced by DOM Core spec. -->
282        </ul>
283      <dd>If it is a parameter entity or the external subset entity, then:
284        <ul>
285        <li>If the character encoding of the entity is <code>UTF-16</code> but
286        the input byte stream for the entity does not begin with the
287        <abbr title="BYTE ORDER MARK" class=charname>BOM</abbr>, then the parser
288        <em class=rfc2119>MUST</em> raise an
289        <a href="#xml-misc-error" id=xme-pe-bom><code>xml-misc-error</code></a>.
290        <li class=ed>@@ encoding="" preferred name?
291    <!--
292    "In an encoding declaration, the values "UTF-8", "UTF-16", "ISO-10646-UCS-2", and "ISO-10646-UCS-4" SHOULD be used for the various encodings and transformations of Unicode / ISO/IEC 10646, the values "ISO-8859-1", "ISO-8859-2", ... "ISO-8859-n" (where n is the part number) SHOULD be used for the parts of ISO 8859, and the values "ISO-2022-JP", "Shift_JIS", and "EUC-JP" SHOULD be used for the various encoded forms of JIS X-0208-1997. It is RECOMMENDED that character encodings registered (as charsets) with the Internet Assigned Numbers Authority [IANA-CHARSETS], other than those just listed, be referred to using their registered names; other encodings SHOULD use names starting with an "x-" prefix."
293    
294    In addition, this should be checked later for Document and Entity nodes.
295    -->
296        </ul>
297    
298  <dt>For the document  <dt>For the document
299    <dd>If the <abbr>XML</abbr> document does not begin with an    <dd>If the <abbr>XML</abbr> document does not begin with an
300    <abbr>XML</abbr> declaration, then the parser <em class=rfc2119>MUST</em>    <abbr>XML</abbr> declaration, then the parser <em class=rfc2119>MUST</em>
301    raise an    raise an
302    <a href="#xml-misc-recommentation" id=xmr-xml-decl><code>xml-misc-recommendation</code></a>.    <a href="#xml-misc-recommentation" id=xmr-xml-decl><code>xml-misc-recommendation</code></a>.
303      <dd>If the document does not contain the document type declaration, or
304      if it does but the document type definition does not contain entity
305      declaration for any of <code>amp</code>, <code>lt</code>, <code>gt</code>,
306      <code>apos</code>, or <code>quot</code>, then the parser
307      <em class=rfc2119>MUST</em> raise
308      <a href="#xml-misc-recommentation" id=xmr-predefined-decl><code>xml-misc-recommendation</code></a>(s).
309      <!--
310        <q>For interoperability, valid documents SHOULD declare the entities
311        amp, lt, gt, apos, quot, in the form specified in 4.6 Predefined
312        Entities.</q>
313      -->
314  <dt>For the document type declaration  <dt>For the document type declaration
315    <dd class=ed>@@ read external entity    <dd class=ed>@@ read external entity
316      <dd>The <code>entities</code> attribute of the <code>DocumentType</code>
317      node <em class=rfc2119>MUST</em> contain a <code>NamedNodeMap</code> object
318      whose first five items are as follows:
319        <ol start=0>
320        <li>An <code>Entity</code> node whose <code>nodeName</code> attribute
321        is <code>amp</code>.  It contains a <code>Text</code> node whose
322        <code>data</code> attribute is set to <code>&amp;</code>.
323        <li>An <code>Entity</code> node whose <code>nodeName</code> attribute
324        is <code>lt</code>.  It contains a <code>Text</code> node whose
325        <code>data</code> attribute is set to <code>&lt;</code>.
326        <li>An <code>Entity</code> node whose <code>nodeName</code> attribute
327        is <code>gt</code>.  It contains a <code>Text</code> node whose
328        <code>data</code> attribute is set to <code>></code>.
329        <li>An <code>Entity</code> node whose <code>nodeName</code> attribute
330        is <code>quot</code>.  It contains a <code>Text</code> node whose
331        <code>data</code> attribute is set to <code>"</code>.
332        <li>An <code>Entity</code> node whose <code>nodeName</code> attribute
333        is <code>apos</code>.  It contains a <code>Text</code> node whose
334        <code>data</code> attribute is set to <code>'</code>.
335        </ol>
336  <dt>For each internal general entity declaration being processed by the parser  <dt>For each internal general entity declaration being processed by the parser
337    <dd>If the    <dd>If the
338    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-EntityValue"><code>EntityValue</code></a>    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-EntityValue"><code>EntityValue</code></a>
# Line 273  parser Line 372  parser
372  -->  -->
373    
374  <dt>For each entity declaration being processed by the parser  <dt>For each entity declaration being processed by the parser
375    <dd>If the entity declaration declares a parameter entity and the    <dd>Handle as follows:
376    <code>Name</code> of the entity begins with the string <code>xml</code>      <ol>
377    (in any combination of upper- and lowercase letters), then the parser      <li><p>If the entity declaration declares a general entity, the following
378    <em class=rfc2119>MUST</em> raise an      is applied:
379    <a href="#xml-misc-warning" id=xmw-reserved-pe-name><code>xml-misc-warning</code></a>.        <dl>
380    <dd>If the entity declaration contains the <code>EntityValue</code>, then        <dt>If the <code>Name</code> is <code>lt</code> or <code>amp</code>
381    for each occurrence of any references to unparsed entities in the          <dd><p>If the entity declaration does not declare an internal entity,
382    <code>EntityValue</code>, the parser <em class=rfc2119>MUST</em> raise an          or if the replacement text of the entity is not the escaped form of
383    <a href="#xml-misc-error" id=xme-unparsed-in-ev><code>xml-misc-error</code></a>.          <code>&lt;</code> (if <code>lt</code>) or <code>&amp;</code> (if
384    <!--          <code>amp</code>), then the parser <em class=rfc2119>MUST</em> raise an
385      <q>It is an error for a reference to an unparsed entity to appear in the          <a href="#xml-misc-error" id=xme-double-escape><code>xml-misc-error</code></a>.
386      EntityValue in an entity declaration.</q>  
387    -->            <div class="note memo informative">
388              <p>In other words, the character in the <code>EntityValue</code>
389              has to be double-escaped.
390              </div>
391          <dt>If the <code>Name</code> is <code>gt</code>, <code>quot</code>, or
392          <code>apos</code>
393            <dd><p>If the entity declaration does not declare an internal entity,
394            or if the replacement text of the entity is not equal to or not the
395            escaped form of <code>></code> (if <code>gt</code>), <code>"</code> (if
396            <code>quot</code>), or <code>'</code> (if <code>apos</code>), then the
397            parser <em class=rfc2119>MUST</em> raise an
398            <a href="#xml-misc-error" id=xme-single-escape><code>xml-misc-error</code></a>.
399    
400              <div class="note memo informative">
401              <p>In other words, the character in the <code>EntityValue</code>
402              has to be single- or double-escaped.
403              </div>
404          </dl>
405          <!--
406            <q>If the entities lt or amp are declared, they MUST be declared as internal entities whose replacement text is a character reference to the respective character (less-than sign or ampersand) being escaped; the double escaping is REQUIRED for these entities so that references to them produce a well-formed result. If the entities gt, apos, or quot are declared, they MUST be declared as internal entities whose replacement text is the single character being escaped (or a character reference to that character; the double escaping here is OPTIONAL but harmless).</q>
407          -->
408    
409        <li><p>If the entity declaration has to be ignored because an entity with
410        the same <code>Name</code> has already been declared,
411        then the parser <em class=rfc2119>MUST</em> raise a
412        <a href="#misc-info" id=mi-ent-unique><code>misc-info</code></a>
413        and abort these steps.
414    
415        <div class="informative note memo">
416        <p>Five predefined entities, i.e. <code>amp</code>, <code>lt</code>,
417        <code>gt</code>, <code>quot</code>, and <code>apos</code>, are always
418        declared implicitly and therefore any declaration for such an entity
419        always raises an
420        <a href="#misc-info" id=mi-ent-unique><code>misc-info</code></a>.
421        </div>  
422    
423        <li><p>If the entity declaration declares a parameter entity and the
424        <code>Name</code> of the entity begins with the string <code>xml</code>
425        (in any combination of upper- and lowercase letters), then the parser
426        <em class=rfc2119>MUST</em> raise an
427        <a href="#xml-misc-warning" id=xmw-reserved-pe-name><code>xml-misc-warning</code></a>.
428    
429        <li><p>If the entity declaration contains the <code>EntityValue</code>,
430        then for each occurrence of any references to unparsed entities in the
431        <code>EntityValue</code>, the parser <em class=rfc2119>MUST</em> raise an
432        <a href="#xml-misc-error" id=xme-unparsed-in-ev><code>xml-misc-error</code></a>.
433        <!--
434          <q>It is an error for a reference to an unparsed entity to appear in the
435          EntityValue in an entity declaration.</q>
436        -->
437        <li><p>If the entity declaration declares a general entity, then an
438        <code>Entity</code> node <em class=rfc2119>MUST</em> be created and
439        appended to the <code>NamedNodeMap</code> object in the
440        <code>entities</code> attribute of the <code>DocumentType</code> node.
441        
442        <p class=ed>Read the external entity
443    
444        <p>If the replacement text of the entity is read, then parse the
445        replacement text as if it were referenced from the content of an
446        element (with no namespace bindings).  If no <span class=ed>@@ parse error</span>
447        is raised by the parsing process, then the nodes generated by the
448        parsing <em class=rfc2119>MUST</em> be appended to the <code>Entity</code>
449        node.  The parse error <em class=rfc2119>MUST NOT</em> be propagated to
450        the entire parsing process.  Other kinds of errors
451        <em class=rfc2119>MUST</em> be propagated.  The first parse error
452        <em class=rfc2119>MUST</em> abort the internal parsing process.
453        <span class=ed>@@ better wording</span>
454    
455        <p class=ed>@@ prop
456        
457        <p>Then, the <code>Entity</code> node and its descendants
458        <em class=rfc2119>MUST</em> be marked as read-only.
459      </ol>
460    
461  <dt>For each notation declaration being processed by the parser  <dt>For each notation declaration being processed by the parser
462    <dd>If there is another processed notation declaration whose    <dd>If there is another processed notation declaration whose
# Line 304  parser Line 475  parser
475    type declaration as <code>EMPTY</code> content, then the parser    type declaration as <code>EMPTY</code> content, then the parser
476    <em class=rfc2119>MUST</em> raise an    <em class=rfc2119>MUST</em> raise an
477    <a href="#xml-misc-recommentation" id=xmr-empty-not-emptyelemtag><code>xml-misc-recommendation</code></a>.    <a href="#xml-misc-recommentation" id=xmr-empty-not-emptyelemtag><code>xml-misc-recommendation</code></a>.
478    <dt>For each attribute
479      <dd><p>The parser <em class=rfc2119>MUST</em> set the normalized value of
480      the attribute to the <code>value</code> attribute of the <code>Attr</code>
481      node created for the attribute.
482    
483        <div class="note memo informative">
484        <p>That is, any entity reference has to be expanded.  Unexpanded entity
485        references in attribute values are discarded.
486        </div>
487    <dt>For each <code>xml:space</code> attribute
488      <dd>The parser <em class=rfc2119>MUST</em> set the normalized value of
489      the <code>xml:space</code> attribute to the <code>value</code> attribute
490      of the <code>Attr</code> node created for the attribute even if the
491      normalized value is different from <code>default</code> or
492      <code>preserve</code>.
493      <!-- In XML 1.0/1.1 specification, the attribute specification MAY be
494      ignored. -->
495    
496  <dt>For each parameter entity reference  <dt>For each parameter entity reference
497    <dd>If the declaration for the entity is not read (i.e. no declaration    <dd><p>Process as follows:
498    for the entity is processed or the external entity referenced by the      <ol>
499    declaration cannot be retrieved), then:      <li>If the declaration for the entity is <em>not</em> processed,
500        then:
501          <dl class=switch>
502          <dt>If the document contains no external entity or if the document
503          contains the <code>standalone</code> pseudo-attribute set to
504          <code>yes</code><!-- or the document contains no DTD -->
505            <dd>The parser <em class=rfc2119>MUST</em> raise an
506            <a href="#xml-well-formedness-error" id=wf-entdeclared-pe><code>xml-well-formedness-error</code></a>.
507          <dt>Otherwise
508            <dd>The parser <em class=rfc2119>MUST</em> raise an
509            <a href="#xml-validity-error" id=vc-entdeclared-pe><code>xml-validity-error</code></a>.
510          </dl>
511        <li>If the declaration for the entity <em>is</em> processed but the
512        referenced entity cannot be retrieved, then the parser
513        <em class=rfc2119>MUST</em> raise an
514        <span class=ed>@@ ??-error</span>.
515        </ol>
516    
517        <p>In either of the two cases above, process as follows:
518      <ul>      <ul>
519      <li>If the parameter entity is contained in a declaration, then the      <li>If the parameter entity reference is contained in a declaration, then
520      declaration <em class=rfc2119>MUST</em> be ignored <em>except</em> that      the declaration <em class=rfc2119>MUST</em> be ignored <em>except</em> that
521      any error before the parameter entity <em class=rfc2119>MUST</em> be      any error before the parameter entity <em class=rfc2119>MUST</em> be
522      raised as usual.      raised as usual.
523      <li>If the parameter entity is contained in the status portion of a      <li>If the parameter entity reference is contained in the status portion of
524      conditional section, then the conditional section      a conditional section, then the conditional section
525      <em class=rfc2119>MUST</em> be processed as if it were an      <em class=rfc2119>MUST</em> be processed as if it were an
526      <code>IGNORE</code>d section.      <code>IGNORE</code>d section.
527      <li>The parser <em class=rfc2119>MUST NOT</em> process any entity or      <li>The parser <em class=rfc2119>MUST NOT</em> process any entity or
# Line 335  parser Line 541  parser
541      </ul>      </ul>
542  <dt>For each general entity reference in an attribute value or in the content  <dt>For each general entity reference in an attribute value or in the content
543  of an element  of an element
544    <dd>If the declaration for the entity is not read (i.e. no declaration for    <dd><p>Process as follows:
545    the entity is processed or the external entity referenced by the declaration      <ol>
546    cannot be retrieved), then:      <li>If the <code>Name</code> of the entity reference is either
547        <code>amp</code>, <code>lt</code>, <code>gt</code>, <code>quot</code>,
548        or <code>apos</code>, then abort these steps.
549        <li>If the declaration for the entity is <em>not</em> processed,
550        then:
551          <dl class=switch>
552          <dt>If the document contains no external entity or if the document
553          contains the <code>standalone</code> pseudo-attribute set to
554          <code>yes</code><!-- or the document contains no DTD -->
555            <dd>The parser <em class=rfc2119>MUST</em> raise an
556            <a href="#xml-well-formedness-error" id=wf-entdeclared-ge><code>xml-well-formedness-error</code></a>.
557          <dt>Otherwise
558            <dd>The parser <em class=rfc2119>MUST</em> raise an
559            <a href="#xml-validity-error" id=vc-entdeclared-ge><code>xml-validity-error</code></a>.
560          </dl>
561        <li>If the declaration for the entity <em>is</em> processed but the
562        referenced entity cannot be retrieved, then the parser
563        <em class=rfc2119>MUST</em> raise an
564        <span class=ed>@@ ??-error</span>.
565        </ol>
566    
567        <p>In either of the two cases above, process as follows:
568      <ul>      <ul>
569      <li>If the general entity reference is the first reference to an entity      <li>If the general entity reference is the first reference to an entity
570      that is not read, then the parser <em class=rfc2119>MUST</em> raise an      that is not read, then the parser <em class=rfc2119>MUST</em> raise an
# Line 345  of an element Line 572  of an element
572      <span class=ed>@@ entity declared WFC?</span>      <span class=ed>@@ entity declared WFC?</span>
573      <li class=ed>An unexpanded entity reference node <em class=rfc2119>MUST</em> be inserted into the current node.      <li class=ed>An unexpanded entity reference node <em class=rfc2119>MUST</em> be inserted into the current node.
574      </ul>      </ul>
575    
576    <dt>For each comment <em>outside</em> of document type declaration
577      <dd>A <code>Comment</code> node <em class=rfc2119>MUST</em> be created
578      and inserted appropriately.
579      <!-- In XML 1.0/1.1 spec, this is optional. -->
580  </dl>  </dl>
581    
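The choice between the two error types for a reference to an entity whose declaration is not processed, as described in the clauses above, might be sketched as follows. This is only an illustration: the function name and its parameters are hypothetical and not part of this document; only the error names are taken from it.

```python
# Hypothetical helper; names and signature are assumptions for illustration.
PREDEFINED_ENTITIES = frozenset({"amp", "lt", "gt", "quot", "apos"})


def classify_undeclared_reference(name, has_external_entity, standalone_yes,
                                  is_general=True):
    """Return the error to raise for a reference to an entity whose
    declaration is not processed, or None if no error applies."""
    # References to the five predefined general entities abort the steps.
    if is_general and name in PREDEFINED_ENTITIES:
        return None
    # No external entity, or standalone="yes": well-formedness error.
    if not has_external_entity or standalone_yes:
        return "xml-well-formedness-error"
    # Otherwise the declaration might exist in an unread external entity.
    return "xml-validity-error"
```

For example, `classify_undeclared_reference("foo", True, False)` selects the validity error, since the declaration could be located in an external entity that was not read.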
582  <p class=ed>@@ MUST try to read external entity  <p>The parser <em class=rfc2119>MUST</em> try to read any entity referenced
583    by general or parameter entity references, the external subset entity, if any,
584    and any general entity declared in the document type definition.
585    
586  <p>In addition, the parser has to check whether the  <p>In addition, the parser has to check whether the
587  following constraints are met.  following constraints are met.
# Line 378  constraints is below: Line 612  constraints is below:
612  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinMarkupDecl">Validity constraint: Proper Declaration/PE Nesting</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinMarkupDecl">Validity constraint: Proper Declaration/PE Nesting</a>
613  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinGroup">Validity constraint: Proper Group/PE Nesting</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-PEinGroup">Validity constraint: Proper Group/PE Nesting</a>
614  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#condsec-nesting">Validity constraint: Proper Conditional Section/PE Nesting</a>  <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#condsec-nesting">Validity constraint: Proper Conditional Section/PE Nesting</a>
615    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#vc-check-rmd">Validity constraint: Standalone Document Declaration</a>
616  </ul>  </ul>
617    
618  <p><strong>Other criteria</strong>.  For each violation to  <p><strong>Other criteria</strong>.  For each violation to
# Line 395  or <code>,</code>).</q> Line 630  or <code>,</code>).</q>
630  text declaration.</q>  text declaration.</q>
631  </ul>  </ul>
632    
633    <p>The parser <em class=rfc2119>MUST</em> act as if it is a
634  <!--  <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#dt-validating">validating
635    XML processor</a> for the informing of white space appearing in
636    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#dt-elemcontent">element
637  @@ Need detailed review, but maybe should be in parsing phase  content</a> (See
638    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#sec-white-space">Section
639  #vc-check-rmd Validity constraint: Standalone Document Declaration  2.10</a> of the XML specification).
640    
641  @@ Need detailed review    <div class="note memo informative">
642      <p>In other words, the <code>isElementContentWhitespace</code> attribute
643  #wf-entdeclared Well-formedness constraint: Entity Declared    of <code>Text</code> nodes has to be set appropriately.  Note that the
644  #vc-entdeclared Validity constraint: Entity Declared    value of the attribute will be set to <code>false</code> for any
645  "For interoperability, valid documents SHOULD declare the entities amp, lt, gt, apos, quot, in the form specified in 4.6 Predefined Entities."    <code>Text</code> node in the content of an element whose declaration
646  "If the entities lt or amp are declared, they MUST be declared as internal entities whose replacement text is a character reference to the respective character (less-than sign or ampersand) being escaped; the double escaping is REQUIRED for these entities so that references to them produce a well-formed result. If the entities gt, apos, or quot are declared, they MUST be declared as internal entities whose replacement text is the single character being escaped (or a character reference to that character; the double escaping here is OPTIONAL but harmless)."    is not processed.
647      </div>
 @@ flagged and then reported in DOM check phase  
   
 "Entities encoded in UTF-16 MUST and entities encoded in UTF-8 MAY begin with the Byte Order Mark"  
 "In the absence of external character encoding information (such as MIME headers), parsed entities which are stored in an encoding other than UTF-8 or UTF-16 MUST begin with a text declaration"  
 "In an encoding declaration, the values "UTF-8", "UTF-16", "ISO-10646-UCS-2", and "ISO-10646-UCS-4" SHOULD be used for the various encodings and transformations of Unicode / ISO/IEC 10646, the values "ISO-8859-1", "ISO-8859-2", ... "ISO-8859-n" (where n is the part number) SHOULD be used for the parts of ISO 8859, and the values "ISO-2022-JP", "Shift_JIS", and "EUC-JP" SHOULD be used for the various encoded forms of JIS X-0208-1997. It is RECOMMENDED that character encodings registered (as charsets) with the Internet Assigned Numbers Authority [IANA-CHARSETS], other than those just listed, be referred to using their registered names; other encodings SHOULD use names starting with an "x-" prefix."  
   
 @@ in parsing phase  
   
 "It is a fatal error when an XML processor encounters an entity with an encoding that it is unable to process. It is a fatal error if an XML entity is determined (via default, encoding declaration, or higher-level protocol) to be in a certain encoding but contains byte sequences that are not legal in that encoding."  
   
 @@ We should phrase out what the parser should do where the XML specification  
 left undefined.  For example: Comment must be converted to a Comment node,  
 illegal xml:space value must be preserved, so on.  
   
 Warn <!ENTITY % xml... ...>  
   
 -->  
648    
649  <p>The parser <em class=rfc2119>MUST</em> raise an  <p>The parser <em class=rfc2119>MUST</em> raise an
650  <a href="#xml-well-formedness-error" id=wfe-syntax><code>xml-well-formedness-error</code></a>  <a href="#xml-well-formedness-error" id=wfe-syntax><code>xml-well-formedness-error</code></a>
651  for any failure to match to a production rule in the XML specification.  for any failure to match a production rule in the XML specification.
652  <!--  <!--
653    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#ExtSubset">Well-formedness constraint: External Subset</a>    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#ExtSubset">Well-formedness constraint: External Subset</a>
654    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#PE-between-Decls">Well-formedness constraint: PE Between Declarations</a>    <li><a href="http://www.w3.org/TR/2006/REC-xml-20060816/#PE-between-Decls">Well-formedness constraint: PE Between Declarations</a>
655  -->  -->
656    
657    <!--
658      Impossible to test:
659        "In the absence of external character encoding information (such as MIME
660        headers), parsed entities which are stored in an encoding other than UTF-8
661        or UTF-16 MUST begin with a text declaration"
662    -->
663  </div>  </div>
664    
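The rule in the entity declaration clause of this section for explicit declarations of the five predefined entities might be sketched as follows. This is a minimal sketch under assumptions: the function names are hypothetical, and the replacement text is assumed to be available as a string after the `EntityValue` has been processed.

```python
import re

# Matches exactly one decimal or hexadecimal character reference.
_CHAR_REF = re.compile(r"&#(?:([0-9]+)|x([0-9A-Fa-f]+));")
_EXPECTED_CHAR = {"lt": "<", "amp": "&", "gt": ">", "quot": '"', "apos": "'"}
_DOUBLE_ESCAPE_REQUIRED = frozenset({"lt", "amp"})


def is_char_ref_to(text, char):
    """True if text is a single character reference to char."""
    m = _CHAR_REF.fullmatch(text)
    if m is None:
        return False
    code = int(m.group(1)) if m.group(1) is not None else int(m.group(2), 16)
    return code == ord(char)


def violates_predefined_rule(name, is_internal, replacement_text):
    """True if an xml-misc-error has to be raised for this declaration."""
    char = _EXPECTED_CHAR.get(name)
    if char is None:
        return False  # not one of the five predefined entities
    if not is_internal:
        return True  # must be declared as an internal entity
    if is_char_ref_to(replacement_text, char):
        return False  # escaped form is always acceptable
    if name in _DOUBLE_ESCAPE_REQUIRED:
        return True  # lt and amp require the escaped form
    return replacement_text != char  # gt, quot, apos may be the bare character
```

Under these assumptions, a declaration such as `<!ENTITY lt "&#38;#60;">` yields the replacement text `&#60;` and passes, while `<!ENTITY gt ">">` passes with the bare character.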
665  <div class="section" id=checking-dom>  <div class="section" id=checking-dom>
# Line 863  following:</p> Line 1088  following:</p>
1088      <a href="#algorithm-to-check-a-node" title="check a node">check the      <a href="#algorithm-to-check-a-node" title="check a node">check the
1089      node</a> recursively.</li>      node</a> recursively.</li>
1090      <li class=ed>@@ externally declared?</li>      <li class=ed>@@ externally declared?</li>
1091        <li>If the <code>NamedNodeMap</code> object in the <code>entities</code>
1092        attribute of <var>n</var> does not contain <code>Entity</code> nodes
1093        whose <code>nodeName</code> attributes are <code>amp</code>,
1094        <code>lt</code>, <code>gt</code>, <code>apos</code>, and <code>quot</code>
1095        then raise
1096        <a href="#xml-misc-recommentation" id=xmr-predefined><code>xml-misc-recommendation</code></a>(s).
1097      </ol>      </ol>
1098    </dd>    </dd>
1099  <dt>If <var>n</var> is an <code>Element</code> node</dt>  <dt>If <var>n</var> is an <code>Element</code> node</dt>
# Line 1050  parsed entity)</dt> Line 1281  parsed entity)</dt>
1281      <li>If the <code>childNodes</code> list of <var>n</var> contains      <li>If the <code>childNodes</code> list of <var>n</var> contains
1282      any nodes, then raise an      any nodes, then raise an
1283      <a href="#xml-well-formedness-error" id=wfe-pi-child><code>xml-well-formedness-error</code></a>.</li>      <a href="#xml-well-formedness-error" id=wfe-pi-child><code>xml-well-formedness-error</code></a>.</li>
1284        <li class=ed>@@ Warn if not declared
1285      </ol>      </ol>
1286    </dd>    </dd>
1287  <dt>If <var>n</var> is a <code>Text</code> node</dt>  <dt>If <var>n</var> is a <code>Text</code> node</dt>
