Diff of /markup/xml/xmlcc/xmlcc-work.en.html

revision 1.19 by wakaba, Sat Dec 1 14:51:45 2007 UTC revision 1.23 by wakaba, Sun Mar 16 12:37:17 2008 UTC
# Line 18  Line 18 
18    
19  <div class="header">  <div class="header">
20  <h1>manakai's <abbr>XML</abbr> Conformance Checking</h1>  <h1>manakai's <abbr>XML</abbr> Conformance Checking</h1>
21  <h2>Working Draft <time datetime=2007-12-01>1 December 2007</time></h2>  <h2>Working Draft <time datetime=2008-03-16>16 March 2008</time></h2>
22    
23  <dl class="versions-uri">  <dl class="versions-uri">
24  <dt>This Version</dt>  <dt>This Version</dt>
# Line 42  Line 42 
42        >w@suika.fam.cx</a>&gt;</code></dd>        >w@suika.fam.cx</a>&gt;</code></dd>
43  </dl>  </dl>
44    
45  <p class="copyright" lang="en">&#xA9; <time>2007</time> <a  <p class="copyright" lang="en">&#xA9; <time>2007</time>&#x2010;<time>2008</time> <a
46      href="http://suika.fam.cx/~wakaba/who?" rel="author">Wakaba</a>.      href="http://suika.fam.cx/~wakaba/who?" rel="author">Wakaba</a>.
47  Permission is granted to copy, distribute and/or modify this document  Permission is granted to copy, distribute and/or modify this document
48  under the terms of the <a rel="license"  under the terms of the <a rel="license"
# Line 118  else in this specification is normative. Line 118  else in this specification is normative.
118  <p><span class=ed>Algorithm is normative but non-normative</span>.  <p><span class=ed>Algorithm is normative but non-normative</span>.
119  In addition, the order in which <a href="#errors">errors</a> are  In addition, the order in which <a href="#errors">errors</a> are
120  raised is undefined.</p>  raised is undefined.</p>
121    
122    <p>This document sometimes cites parts of <abbr>XML</abbr> 1.0 specification
123    by hyperlinks.  When the document being processed is an <abbr>XML</abbr> 1.1
124    document, however, corresponding parts of the <abbr>XML</abbr> 1.1
125    specification should be consulted instead.</p>
126  </div>  </div>
127    
128    
# Line 209  can be easily serialized into a valid XM Line 214  can be easily serialized into a valid XM
214    
215  <div class=ed><p>@@ TODO: #dt-atuseroption at user option  <div class=ed><p>@@ TODO: #dt-atuseroption at user option
216  (MAY or MUST), #dt-compat for compatibility,  (MAY or MUST), #dt-compat for compatibility,
217  #dt-interop for interoperability</p></div>  #dt-interop for interoperability</p>
218    
219    <p>TODO: XML 1.1, XML Namespace 1.0/1.1, xml:base, xml:id
220    </div>
221    
222  </div>  </div>
223    
224  <div class=section id=parsing-xml>  <div class=section id=parsing-xml>
225  <h2>Parsing <abbr>XML</abbr> Document</h2>  <h2>Parsing <abbr>XML</abbr> Document</h2>
226    
227  <ul>  <p>When an <abbr>XML</abbr> document is parsed, the following clauses
228  <li>If the <abbr>XML</abbr> document does not begin with an  are applied:</p>
229  <abbr>XML</abbr> declaration, then raise an  <dl>
230  <a href="#xml-misc-recommentation" id=xmr-xml-decl><code>xml-misc-recommendation</code></a>.</li>  <dt>For each document
231  <li>If the replacement text of an entity declaration is    <dd>If the <abbr>XML</abbr> document does not begin with an
232  <code>&lt;</code>, then raise an    <abbr>XML</abbr> declaration, then the parser <em class=rfc2119>MUST</em>
233  <a href="#xml-misc-warning" id=xmw-entity-value-lt><code>xml-misc-warning</code></a>.<!--    raise an
234      <a href="#xml-misc-recommentation" id=xmr-xml-decl><code>xml-misc-recommendation</code></a>.
235    <dt>For each internal general entity declaration processed by the parser
236      <dd>If the
237      <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-EntityValue"><code>EntityValue</code></a>
238      part of the general entity declaration contains a bare <code>U+003C</code>
239      <code>LESS-THAN SIGN</code> (<code>&lt;</code>) character, then the parser
240      <em class=rfc2119>MUST</em> raise an
241      <a href="#xml-misc-warning" id=xmw-entity-value-lt><code>xml-misc-warning</code></a>.<!--
242  "strongly advised to avoid" in a Note in Section 2.3 of [XML10], [XML11].  "strongly advised to avoid" in a Note in Section 2.3 of [XML10], [XML11].
243  --></li>  -->
244  <li>If there is an element type declaration whose <code>Name</code>  <dt>For each element type declaration processed by the parser
245  value is already declared, then raise an    <dd>If there is another element type declaration whose <code>Name</code>
246  <a href="#xml-validity-error" id=vc-edunique><code>xml-validity-error</code></a>.</li>    is equal to the <code>Name</code> of the element type declaration, then
247  <li>If attribute definition whose <code>Name</code> is    the parser <em class=rfc2119>MUST</em> raise an
248  <code>xml:space</code> has <span class=ed>declared type different from    <a href="#xml-validity-error" id=vc-edunique><code>xml-validity-error</code></a>.
249  (default|preserve), (default), or (preserve)</span>, then raise an  <!--
250  <a href="#xml-misc-error" id=xme-ad-xml-space><code>xml-misc-error</code></a>.    NOTE: <!ATTLIST a xml:space (default) #IMPLIED xml:space CDATA #IMPLIED>
251  <span class=ed>@@ duplication with    will not be warned.
252  <a href="#xml-at-xml-space">#xml-at-xml-space</a>.<!--  -->
253  <!ATTLIST e xml:space CDATA #IMPLIED xml:space CDATA #IMPLIED> --></span></li>  <dt>For each empty-element tag
254  <li>If an empty-element tag is used for an element which is <em>not</em>    <dd>If the <code>Name</code> of the tag is not declared by a processed
255  declared <code>EMPTY</code>, then raise an    element type declaration as <code>EMPTY</code> content, then the parser
256  <a href="#xml-misc-recommentation" id=xmr-emptyelemtag-not-empty><code>xml-misc-recommendation</code></a>.</li>    <em class=rfc2119>MUST</em> raise an
257  <li>If an empty-element tag is <em>not</em> used for an element which is    <a href="#xml-misc-recommentation" id=xmr-emptyelemtag-not-empty><code>xml-misc-recommendation</code></a>.
258  declared <code>EMPTY</code>, then raise an  <dt>For each start-tag
259  <a href="#xml-misc-recommentation" id=xmr-empty-not-emptyelemtag><code>xml-misc-recommendation</code></a>.</li>    <dd>If the <code>Name</code> of the tag is declared by a processed element
260      type declaration as <code>EMPTY</code> content, then the parser
261      <em class=rfc2119>MUST</em> raise an
262      <a href="#xml-misc-recommentation" id=xmr-empty-not-emptyelemtag><code>xml-misc-recommendation</code></a>.
263  <!--  <!--
264    
265  #vc-PEinMarkupDecl Validity constraint: Proper Declaration/PE Nesting  #vc-PEinMarkupDecl Validity constraint: Proper Declaration/PE Nesting
# Line 284  declared <code>EMPTY</code>, then raise Line 303  declared <code>EMPTY</code>, then raise
303    
304  "It is a fatal error when an XML processor encounters an entity with an encoding that it is unable to process. It is a fatal error if an XML entity is determined (via default, encoding declaration, or higher-level protocol) to be in a certain encoding but contains byte sequences that are not legal in that encoding."  "It is a fatal error when an XML processor encounters an entity with an encoding that it is unable to process. It is a fatal error if an XML entity is determined (via default, encoding declaration, or higher-level protocol) to be in a certain encoding but contains byte sequences that are not legal in that encoding."
305    
306    @@ We should spell out what the parser should do where the XML specification
307    leaves behavior undefined.  For example: a comment must be converted to a Comment node,
308    an illegal xml:space value must be preserved, and so on.
309    
310  -->  -->
311  </ul>  </dl>
312  </div>  </div>
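
The parsing clauses added in revision 1.23 can be illustrated with a minimal Python sketch. It is not manakai's implementation: the function names, the list-of-strings return value, and the `empty_declared` set (the names declared `EMPTY` by element type declarations already processed) are assumptions made for illustration. The input text is assumed to be already decoded, with any byte order mark removed.

```python
import re

# An XML declaration starts with "<?xml" followed by white space.
XML_DECL_START = re.compile(r"<\?xml[ \t\r\n]")


def check_document_start(text):
    """Per-document clause: a missing XML declaration is a recommendation."""
    if XML_DECL_START.match(text):
        return []
    return ["xml-misc-recommendation"]


def check_entity_value(entity_value):
    """Per internal general entity declaration: warn on a bare '<'.

    entity_value is the literal text between the quotes, so a character
    reference such as "&#60;" does not count as a bare '<'.
    """
    return ["xml-misc-warning"] if "<" in entity_value else []


def check_tag(name, is_empty_element_tag, empty_declared):
    """Per tag: the tag form should agree with the EMPTY declaration.

    An empty-element tag for a name not declared EMPTY, or a start-tag
    for a name declared EMPTY, raises a recommendation.
    """
    declared_empty = name in empty_declared
    if is_empty_element_tag != declared_empty:
        return ["xml-misc-recommendation"]
    return []
```

For example, `check_tag("p", True, {"br"})` reports a recommendation because `p` is written as an empty-element tag without being declared `EMPTY`, while `check_tag("br", True, {"br"})` reports nothing.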
313    
314  <div class="section" id=checking-dom>  <div class="section" id=checking-dom>
# Line 393  following algorithm <em class=rfc2119>MU Line 416  following algorithm <em class=rfc2119>MU
416  a public identifier (<dfn id=var-pid><var>pid</var></dfn>)</dfn>, the  a public identifier (<dfn id=var-pid><var>pid</var></dfn>)</dfn>, the
417  following algorithm <em class=rfc2119>MUST</em> be used:</p>  following algorithm <em class=rfc2119>MUST</em> be used:</p>
418  <ol>  <ol>
419    <li>If <var>pid</var> is <code>null</code>, abort these steps.</li>  <li>If <var>pid</var> is <code>null</code>, abort these steps.</li>
420    <li>If <var>pid</var> contains any character  <li>If <var>pid</var> contains a character that is <em>not</em> in the
421    that is outside of the range of <code>#x20 | #xD | #xA |  character class <a href="#class-PubidChar"><code>PubidChar</code></a>, then
422    [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]</code><!-- @@ TODO: formal def -->,  raise an
423    then it is an  <a href="#xml-well-formedness-error" id=wfe-pubid-char><code>xml-well-formedness-error</code></a>.</li>
   <a href="#xml-well-formedness-error" id=wfe-pubid-char><code>xml-well-formedness-error</code></a>.</li>  
424    <li>If <var>pid</var> contains one of <code class=char>U+0009</code>    <li>If <var>pid</var> contains one of <code class=char>U+0009</code>
425    <code class=charname>CHARACTER TABULATION</code>,    <code class=charname>CHARACTER TABULATION</code>,
426    <code class=char>U+000A</code> <code class=charname>LINE FEED</code>,    <code class=char>U+000A</code> <code class=charname>LINE FEED</code>,
# Line 414  following algorithm <em class=rfc2119>MU Line 436  following algorithm <em class=rfc2119>MU
436    <span class=ed>Is this really a roundtripness problem?  XML spec    <span class=ed>Is this really a roundtripness problem?  XML spec
437    does only define the way to match public identifiers in fact, no    does only define the way to match public identifiers in fact, no
438    canonical form.</span></li>    canonical form.</span></li>
   <li class=ed>@@ Should we check formal-public-identifierness?</li>  
439  </ol>  </ol>
440    
441  <p>To  <p>To
# Line 561  following:</p> Line 582  following:</p>
582      <li>If <code>nodeName</code> attribute of <var>n</var> is      <li>If <code>nodeName</code> attribute of <var>n</var> is
583      <code>xml:space</code> <span class=ed>@@ or {xml namespace}:space ?</span>      <code>xml:space</code> <span class=ed>@@ or {xml namespace}:space ?</span>
584      and <span class=ed>its declared type is different from (default|preserve),      and <span class=ed>its declared type is different from (default|preserve),
585      (default), or (preserve)</span>, then raise an      (preserve|default), (default), or (preserve)</span>, then raise an
586      <a href="#xml-misc-error" id=xme-at-xml-space><code>xml-misc-error</code></a>.</li>      <a href="#xml-misc-error" id=xme-at-xml-space><code>xml-misc-error</code></a>.</li>
587      <li>For each node <dfn id=var-ad-nc><var>n<sub><var>c</var></sub></var></dfn> in the      <li>For each node <dfn id=var-ad-nc><var>n<sub><var>c</var></sub></var></dfn> in the
588      <code>childNodes</code> list of <var>n</var>,      <code>childNodes</code> list of <var>n</var>,
# Line 996  as amended by Line 1017  as amended by
1017  contains the following characters:</p>  contains the following characters:</p>
1018  <ul class=ed>  <ul class=ed>
1019  </ul>  </ul>
1020    <div class="note memo">
1021    <p>This character class contains all characters allowed as the first character
1022    of a string matching the production rule
1023    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-Name"><code>Name</code></a>
1024    of <abbr>XML</abbr> 1.0
1025    <cite class="bibref normative">[<a href="#ref-XML10">XML10</a>]</cite>.</p>
1026    </div>
1027    
1028  <p>The character class <dfn id=class-NameChar10><code>NameChar10</code></dfn>  <p>The character class <dfn id=class-NameChar10><code>NameChar10</code></dfn>
1029  contains the following characters:</p>  contains the following characters:</p>
# Line 1004  contains the following characters:</p> Line 1032  contains the following characters:</p>
1032  <a href="#class-NameStartChar10">NameStartChar10</a>.</li>  <a href="#class-NameStartChar10">NameStartChar10</a>.</li>
1033  <li class=ed></li>  <li class=ed></li>
1034  </ul>  </ul>
1035    <div class="note memo">
1036    <p>This character class contains all characters allowed as the second
1037    or any subsequent character of a string matching the production rule
1038    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-Name"><code>Name</code></a>
1039    of <abbr>XML</abbr> 1.0
1040    <cite class="bibref normative">[<a href="#ref-XML10">XML10</a>]</cite>.</p>
1041    </div>
1042    
1043    <p>The character class <dfn id=class-PubidChar><code>PubidChar</code></dfn>
1044    contains the following characters:</p>
1045    <ul>
1046    <li><code class=char>U+0009</code> <code class=charname>CHARACTER
1047    TABULATION</code></li>
1048    <li><code class=char>U+000A</code> <code class=charname>LINE FEED</code></li>
1049    <li><code class=char>U+000D</code> <code class=charname>CARRIAGE
1050    RETURN</code></li>
1051    <li><code class=char>U+0020</code> <code class=charname>SPACE</code></li>
1052    <li><code class=char>U+0021</code> <code class=charname>EXCLAMATION MARK</code>
1053    (<code class=char>!</code>)</li>
1054    <li><code class=char>U+0023</code> <code class=charname>NUMBER SIGN</code>
1055    (<code class=char>#</code>)</li>
1056    <li><code class=char>U+0024</code> <code class=charname>DOLLAR SIGN</code>
1057    (<code class=char>$</code>)</li>
1058    <li><code class=char>U+0025</code> <code class=charname>PERCENT SIGN</code>
1059    (<code class=char>%</code>)</li>
1060    <li><code class=char>U+0027</code> <code class=charname>APOSTROPHE</code>
1061    (<code class=char>'</code>)</li>
1062    <li><code class=char>U+0028</code> <code class=charname>LEFT PARENTHESIS</code>
1063    (<code class=char>(</code>)</li>
1064    <li><code class=char>U+0029</code> <code class=charname>RIGHT
1065    PARENTHESIS</code> (<code class=char>)</code>)</li>
1066    <li><code class=char>U+002A</code> <code class=charname>ASTERISK</code>
1067    (<code class=char>*</code>)</li>
1068    <li><code class=char>U+002B</code> <code class=charname>PLUS SIGN</code>
1069    (<code class=char>+</code>)</li>
1070    <li><code class=char>U+002C</code> <code class=charname>COMMA</code>
1071    (<code class=char>,</code>)</li>
1072    <li><code class=char>U+002D</code> <code class=charname>HYPHEN-MINUS</code>
1073    (<code class=char>-</code>)</li>
1074    <li><code class=char>U+002E</code> <code class=charname>FULL STOP</code>
1075    (<code class=char>.</code>)</li>
1076    <li><code class=char>U+002F</code> <code class=charname>SOLIDUS</code>
1077    (<code class=char>/</code>)</li>
1078    <li><code class=char>U+0030</code> <code class=charname>DIGIT ZERO</code>
1079    (<code class=char>0</code>) .. <code class=char>U+0039</code>
1080    <code class=charname>DIGIT NINE</code> (<code class=char>9</code>)</li>
1081    <li><code class=char>U+003A</code> <code class=charname>COLON</code>
1082    (<code class=char>:</code>)</li>
1083    <li><code class=char>U+003B</code> <code class=charname>SEMICOLON</code>
1084    (<code class=char>;</code>)</li>
1085    <li><code class=char>U+003D</code> <code class=charname>EQUAL SIGN</code>
1086    (<code class=char>=</code>)</li>
1087    <li><code class=char>U+003F</code> <code class=charname>QUESTION MARK</code>
1088    (<code class=char>?</code>)</li>
1089    <li><code class=char>U+0040</code> <code class=charname>COMMERCIAL AT</code>
1090    (<code class=char>@</code>)</li>
1091    <li><code class=char>U+0041</code> <code class=charname>LATIN CAPITAL LETTER
1092    A</code> (<code class=char>A</code>) .. <code class=char>U+005A</code>
1093    <code class=charname>LATIN CAPITAL LETTER Z</code>
1094    (<code class=char>Z</code>)</li>
1095    <li><code class=char>U+005F</code> <code class=charname>LOW LINE</code>
1096    (<code class=char>_</code>)</li>
1097    <li><code class=char>U+0061</code> <code class=charname>LATIN SMALL LETTER
1098    A</code> (<code class=char>a</code>) .. <code class=char>U+007A</code>
1099    <code class=charname>LATIN SMALL LETTER Z</code>
1100    (<code class=char>z</code>)</li>
1101    </ul>
1102    <div class="note memo">
1103    <p>This character class contains all characters allowed in the production rule
1104    <a href="http://www.w3.org/TR/2006/REC-xml-20060816/#NT-PubidChar"><code>PubidChar</code></a>
1105    of <abbr>XML</abbr> 1.0
1106    <cite class="bibref normative">[<a href="#ref-XML10">XML10</a>]</cite>.</p>
1107    </div>
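
As a rough illustration, the `PubidChar` class and the first two steps of the public identifier check that refers to it might look like the following Python sketch. The names `PUBID_CHARS` and `check_public_identifier` are hypothetical, and returning a list of error class names is an assumption about how errors are reported. The later, round-trip related steps of the algorithm are not covered.

```python
import string

# The PubidChar class as listed in this document. U+0009 is included
# because this document's list includes it; the PubidChar production
# of XML 1.0 itself does not list U+0009.
PUBID_CHARS = frozenset(
    "\t\n\r "
    + string.ascii_letters
    + string.digits
    + "-'()+,./:=?;!*#@$_%"
)


def check_public_identifier(pid):
    """Return the error classes raised for a public identifier.

    Step 1: a null (None) identifier aborts the check.
    Step 2: any character outside PubidChar is a well-formedness error.
    """
    if pid is None:
        return []
    if any(ch not in PUBID_CHARS for ch in pid):
        return ["xml-well-formedness-error"]
    return []
```

A typical formal public identifier such as `-//W3C//DTD XHTML 1.0 Strict//EN` passes this check, whereas an identifier containing `<` or `"` does not.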
1108    
1109  </div>  </div>
1110    
