| 1 |
|
| 2 |
|
| 3 |
|
| 4 |
INTERNET DRAFT to be NEWS sec. - |
| 5 |
|
| 6 |
|
| 7 |
|
| 8 |
|
| 9 |
|
| 10 |
News Article Format and Transmission |
| 11 |
|
| 12 |
Henry Spencer |
| 13 |
|
| 14 |
|
| 15 |
|
| 16 |
Status of this Memo |
| 17 |
|
| 18 |
This document is intended to become an Internet Draft. |
| 19 |
Internet Drafts are working documents of the Internet Engi- |
| 20 |
neering Task Force (IETF), its Areas, and its Working |
| 21 |
Groups. Note that other groups may also distribute working |
| 22 |
documents as Internet Drafts. |
| 23 |
|
| 24 |
Internet Drafts are draft documents valid for a maximum of |
| 25 |
six months. Internet Drafts may be updated, replaced, or |
| 26 |
obsoleted by other documents at any time. It is not appro- |
| 27 |
priate to use Internet Drafts as reference material or to |
| 28 |
cite them other than as a "working draft" or "work in |
| 29 |
progress". |
| 30 |
|
| 31 |
Please check the I-D abstract listing contained in each |
| 32 |
Internet Draft directory to learn the current status of this |
| 33 |
or any other Internet Draft. (Actually, this draft is at |
| 34 |
too early a stage to even be listed there yet.) |
| 35 |
|
| 36 |
It is hoped that a later version of this Draft will obsolete |
| 37 |
RFC 1036 and will become an Internet standard. |
| 38 |
|
| 39 |
References to the "successor to this Draft" refer not to |
| 40 |
later versions of this draft, but to a hypothetical future |
| 41 |
rewrite of this Draft (in the same way that this Draft is a |
| 42 |
rewrite of RFC 1036). |
| 43 |
|
| 44 |
Distribution of this memo is unlimited. |
| 45 |
|
| 46 |
|
| 47 |
Abstract |
| 48 |
|
| 49 |
This Draft defines the format and procedures for interchange |
| 50 |
of network news articles. It is hoped that a later version |
| 51 |
of this Draft will obsolete RFC 1036, reflecting more recent |
| 52 |
experience and accommodating future directions. |
| 53 |
|
| 54 |
Network news articles resemble mail messages but are broad- |
| 55 |
cast to potentially-large audiences, using a flooding algo- |
| 56 |
rithm that propagates one copy to each interested host (or |
| 57 |
group thereof), typically stores only one copy per host, and |
| 58 |
does not require any central administration or systematic |
| 59 |
registration of interested users. Network news originated |
| 60 |
as the medium of communication for Usenet, circa 1980. |
| 61 |
|
| 62 |
|
| 63 |
|
| 64 |
2 June 1994 - 1 - expires 15 July 1994 |
| 65 |
|
| 73 |
Since then Usenet has grown explosively, and many Internet |
| 74 |
sites participate in it. In addition, the news technology |
| 75 |
is now in widespread use for other purposes, on the Internet |
| 76 |
and elsewhere. |
| 77 |
|
| 78 |
This Draft primarily codifies and organizes existing prac- |
| 79 |
tice. A few small extensions have been added in an attempt |
| 80 |
to solve problems that are considered serious. Major exten- |
| 81 |
sions (e.g. cryptographic authentication) that need signifi- |
| 82 |
cant development effort are left to be undertaken as inde- |
| 83 |
pendent efforts. |
| 84 |
|
| 85 |
|
| 86 |
Table of Contents |
| 87 |
|
| 88 |
TBW |
| 89 |
|
| 90 |
|
| 91 |
1. Introduction |
| 92 |
|
| 93 |
Network news articles resemble mail messages but are broad- |
| 94 |
cast to potentially-large audiences, using a flooding algo- |
| 95 |
rithm that propagates one copy to each interested host (or |
| 96 |
group thereof), typically stores only one copy per host,
| 97 |
and does not require any central administration or system- |
| 98 |
atic registration of interested users. Network news origi- |
| 99 |
nated as the medium of communication for Usenet, circa 1980. |
| 100 |
Since then Usenet has grown explosively, and many Internet |
| 101 |
sites participate in it. In addition, the news technology |
| 102 |
is now in widespread use for other purposes, on the Internet |
| 103 |
and elsewhere. |
| 104 |
|
| 105 |
The earliest news interchange used the so-called "A News" |
| 106 |
article format. Shortly thereafter, an article format |
| 107 |
vaguely resembling Internet mail was devised and used |
| 108 |
briefly. Both of those formats are completely obsolete; |
| 109 |
they are documented in appendix A for historical reasons |
| 110 |
only. With publication of RFC 850 [rrr] in 1983, news arti- |
| 111 |
cles came to closely resemble Internet mail messages, with |
| 112 |
some restrictions and some additional headers. RFC 1036 |
| 113 |
[rrr] in 1987 updated RFC 850 without making major changes. |
| 114 |
|
| 115 |
In the years since, the RFC 1036 article format
| 116 |
has proven quite satisfactory, although minor extensions |
| 117 |
appear desirable to match recent developments in areas such |
| 118 |
as multi-media mail. RFC 1036 itself has not proven quite |
| 119 |
so satisfactory. It is often rather vague and does not |
| 120 |
address some issues at all; this has caused significant |
| 121 |
interoperability problems at times, and implementations have |
| 122 |
diverged somewhat. Worse, although it was intended primar- |
| 123 |
ily to document existing practice, it did not precisely |
| 124 |
match existing practice even at the time it was published, |
| 125 |
and the deviations have grown since. |
| 126 |
|
| 139 |
This Draft attempts to specify the format of articles, and |
| 140 |
the procedures used to exchange them and process them, in |
| 141 |
sufficient detail to allow full interoperability. In addi- |
| 142 |
tion, some tentative suggestions are made about directions |
| 143 |
for future development, in an attempt to avert unnecessary |
| 144 |
divergence and consequent loss of interoperability. Major |
| 145 |
extensions (e.g. cryptographic authentication) that need |
| 146 |
significant development effort are left to be undertaken as |
| 147 |
independent efforts. |
| 148 |
|
| 149 |
NOTE: One question this all may raise is: why is |
| 150 |
there no News-Version header, analogous to MIME- |
| 151 |
Version, specifying a version number corresponding |
| 152 |
to this specification? The answer is: it doesn't |
| 153 |
appear to be useful, given news's backward- |
| 154 |
compatibility constraints. The major use of a |
| 155 |
version number is indicating which of several |
| 156 |
INCOMPATIBLE interpretations is relevant. The |
| 157 |
impossibility of orchestrating any sort of simul- |
| 158 |
taneous change over news's installed base makes it |
| 159 |
necessary to avoid such incompatible changes (as |
| 160 |
opposed to extensions) entirely. MIME has a ver- |
| 161 |
sion number mostly because it introduced incompat- |
| 162 |
ible changes to the interpretation of several |
| 163 |
"Content-" headers. This Draft attempts no |
| 164 |
changes in interpretation and it appears doubtful |
| 165 |
that future Drafts will find it feasible to intro- |
| 166 |
duce any. |
| 167 |
|
| 168 |
UNRESOLVED ISSUE: Should this be reconsidered? |
| 169 |
Only if the header has SPECIFIC IDENTIFIABLE uses |
| 170 |
today. Otherwise it's just useless added bulk. |
| 171 |
|
| 172 |
As in this Draft's predecessors, the exact means used to |
| 173 |
transmit articles from one host to another is not specified. |
| 174 |
NNTP [rrr] is probably the most common transmission method |
| 175 |
on the Internet, but a number of others are known to be in |
| 176 |
use, including the UUCP protocol [rrr] extensively used in |
| 177 |
the early days of Usenet and still much used on its fringes |
| 178 |
today. |
| 179 |
|
| 180 |
Several of the mechanisms described in this Draft may seem |
| 181 |
somewhat strange or even bizarre at first reading. As with |
| 182 |
Internet mail, there is no reasonable possibility of updat- |
| 183 |
ing the entire installed base of news software promptly, so |
| 184 |
interoperability with old software is crucial and will |
| 185 |
remain so. Compatibility with existing practice and robust- |
| 186 |
ness in an imperfect world necessarily take priority over |
| 205 |
elegance. |
| 206 |
|
| 207 |
|
| 208 |
2. Definitions, Notations, and Conventions |
| 209 |
|
| 210 |
|
| 211 |
2.1. Textual Notations |
| 212 |
|
| 213 |
Throughout this Draft, "MAIL" is short for "RFC 822 [rrr] as |
| 214 |
amended by RFC 1123 [rrr]". (RFC 1123's amendments are |
| 215 |
mostly relatively small, but they are not insignificant.) |
| 216 |
See also the discussion in section 3 about this Draft's |
| 217 |
relationship to MAIL. "MIME" is short for "RFCs 1341 and |
| 218 |
1342" (or their updated replacements). |
| 219 |
|
| 220 |
UNRESOLVED ISSUE: Update these numbers. |
| 221 |
|
| 222 |
"ASCII" is short for "the ANSI X3.4 character set" [rrr]. |
| 223 |
While "ASCII" is often misused to refer to various character |
| 224 |
sets somewhat similar to X3.4, in this Draft, "ASCII" means |
| 225 |
X3.4 and only X3.4. |
| 226 |
|
| 227 |
NOTE: The name is traditional (to the point where |
| 228 |
the ANSI standard sanctions it) even though it is |
| 229 |
no longer an acronym for the name of the standard. |
| 230 |
|
| 231 |
NOTE: ASCII, X3.4, contains 128 characters, not |
| 232 |
all of them printable. Character sets with more |
| 233 |
characters are not ASCII, although they may |
| 234 |
include it as a subset. |
| 235 |
|
| 236 |
Certain words used to define the significance of individual |
| 237 |
requirements are capitalized. "MUST" means that the item is |
| 238 |
an absolute requirement of the specification. "SHOULD" |
| 239 |
means that the item is a strong recommendation: there may be |
| 240 |
valid reasons to ignore it in unusual circumstances, but |
| 241 |
this should be done only after careful study of the full |
| 242 |
implications and a firm conclusion that it is necessary, |
| 243 |
because there are serious disadvantages to doing so. "MAY" |
| 244 |
means that the item is truly optional, and implementors and |
| 245 |
users are warned that conformance is possible but not to be |
| 246 |
relied on. |
| 247 |
|
| 248 |
The term "compliant", applied to implementations etc., indi- |
| 249 |
cates satisfaction of all relevant "MUST" and "SHOULD" |
| 250 |
requirements. The term "conditionally compliant" indicates |
| 251 |
satisfaction of all relevant "MUST" requirements but viola- |
| 252 |
tion of at least one relevant "SHOULD" requirement. |
| 253 |
|
| 254 |
This Draft contains explanatory notes using the following |
| 255 |
format. These may be skipped by persons interested solely |
| 256 |
in the content of the specification. The purpose of the |
| 257 |
notes is to explain why choices were made, to place them in |
| 258 |
context, or to suggest possible implementation techniques. |
| 259 |
|
| 271 |
NOTE: While such explanatory notes may seem super- |
| 272 |
fluous in principle, they often help the less- |
| 273 |
than-omniscient reader grasp the purpose of the |
| 274 |
specification and the constraints involved. Given |
| 275 |
the limitations of natural language for descrip- |
| 276 |
tive purposes, this improves the probability that |
| 277 |
implementors and users will understand the true |
| 278 |
intent of the specification in cases where the |
| 279 |
wording is not entirely clear. |
| 280 |
|
| 281 |
All numeric values are given in decimal unless otherwise |
| 282 |
indicated. Octets are assumed to be unsigned values for |
| 283 |
this purpose. Large numbers are written using the North |
| 284 |
American convention, in which "," separates groups of three |
| 285 |
digits but otherwise has no significance. |
| 286 |
|
| 287 |
|
| 288 |
2.2. Syntax Notation |
| 289 |
|
| 290 |
Although the mechanisms specified in this Draft are all |
| 291 |
described in prose, most are also described formally in the |
| 292 |
modified BNF notation of RFC 822. Implementors will need to |
| 293 |
be familiar with this notation to fully understand this |
| 294 |
specification, and are referred to RFC 822 for a complete |
| 295 |
explanation of the modified BNF notation. Here is a brief |
| 296 |
illustrative example: |
| 297 |
|
| 298 |
     sentence  = clause *( punct clause ) "."
     punct     = ":" / ";"
     clause    = 1*word [ "(" clause ")" / "," 1*word ]
     word      = <any English word>
| 302 |
|
| 303 |
This defines a sentence as some clauses separated by puncts |
| 304 |
and ended by a period, a punct as a colon or semicolon, a |
| 305 |
clause as at least one <word> optionally followed by either |
| 306 |
a parenthesized clause or a comma and at least one more |
| 307 |
<word>, and a <word> as (informally) any English word. <> |
| 308 |
are used to enclose names when (and only when) distinguish- |
| 309 |
ing them from surrounding text is useful. The full form of |
| 310 |
the repetition notation is <m>"*"<n><thing>, denoting <m> |
| 311 |
through <n> repetitions of <thing>; <m> defaults to zero, |
| 312 |
<n> to infinity, and the "*" and <n> can be omitted if <m> |
| 313 |
and <n> are equal, so 1*word is one or more words, 1*5word |
| 314 |
is one through five words, and 2word is exactly two words. |
| 315 |
|
| 316 |
The character "\" is not special in any way in this nota- |
| 317 |
tion. |
| 318 |
|
| 319 |
This Draft is intended to be self-contained; all syntax |
| 320 |
rules used in it are defined within it, and a rule with the |
| 321 |
same name as one found in MAIL does not necessarily have the |
| 322 |
same definition. The lexical layer of MAIL is NOT, repeat |
| 323 |
NOT, used in this Draft, and its presence must not be |
| 324 |
assumed; notably, this Draft spells out all places where |
| 337 |
white space is permitted/required and all places where con- |
| 338 |
structs resembling MAIL comments can occur. |
| 339 |
|
| 340 |
NOTE: News parsers historically have been much |
| 341 |
less permissive than MAIL parsers. |
| 342 |
|
| 343 |
|
| 344 |
2.3. Definitions |
| 345 |
|
| 346 |
The term "character set", wherever it is used in this Draft, |
| 347 |
refers to a coded character set, in the sense of ISO charac- |
| 348 |
ter set standardization work, and must not be misinterpreted |
| 349 |
as meaning merely "a set of characters". |
| 350 |
|
| 351 |
In this Draft, ASCII character 32 is referred to as "blank"; |
| 352 |
the word "space" has a more generic meaning. |
| 353 |
|
| 354 |
An "article" is the unit of news, analogous to a MAIL "mes- |
| 355 |
sage". |
| 356 |
|
| 357 |
A "poster" is a human being (or software equivalent) submit- |
| 358 |
ting a possibly-compliant article to be "posted": made |
| 359 |
available for reading on all relevant hosts. A "posting |
| 360 |
agent" is software that assists posters to prepare articles, |
| 361 |
including determining whether the final article is compli- |
| 362 |
ant, passing it on to a relayer for posting if so, and |
| 363 |
returning it to the poster with an explanation if not. A |
| 364 |
"relayer" is software which receives allegedly-compliant |
| 365 |
articles from posting agents and/or other relayers, files |
| 366 |
copies in a "news database", and possibly passes copies on |
| 367 |
to other relayers. |
| 368 |
|
| 369 |
NOTE: While the same software may well function |
| 370 |
both as a relayer and as part of a posting agent, |
| 371 |
the two functions are distinct and should not be |
| 372 |
confused. The posting agent's purpose is (in |
| 373 |
part) to validate an article, supply header infor- |
| 374 |
mation that can or should be supplied automati- |
| 375 |
cally, and generally take reasonable actions in an |
| 376 |
attempt to transform the poster's submission into |
| 377 |
a compliant article. The relayer's purpose is to |
| 378 |
move already-compliant articles around efficiently |
| 379 |
without damaging them. |
| 380 |
|
| 381 |
A "reader" is a human being reading news articles. A "read- |
| 382 |
ing agent" is software which presents articles to a reader. |
| 383 |
|
| 384 |
NOTE: Informal usage often uses "reader" for both |
| 385 |
these meanings, but this introduces considerable |
| 386 |
potential for confusion and misunderstanding, so |
| 387 |
this Draft takes care to make the distinction. |
| 388 |
|
| 389 |
A "newsgroup" is a single news forum, a logical bulletin |
| 390 |
board, having a name and nominally intended for articles on |
| 403 |
a specific topic. An article is "posted to" a single news- |
| 404 |
group or several newsgroups. When an article is posted to |
| 405 |
more than one newsgroup, it is said to be "cross-posted"; |
| 406 |
note that this differs from posting the same text as part of |
| 407 |
each of several articles, one per newsgroup. A "hierarchy" |
| 408 |
is the set of all newsgroups whose names share a first com- |
| 409 |
ponent (see the name syntax in section 5.5). |
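As an informal illustration of the "hierarchy" definition, the following Python sketch extracts the first component of a newsgroup name. It assumes that components are separated by "." (the actual name syntax is given in section 5.5 and is not checked here); the function names are hypothetical.

```python
def hierarchy(newsgroup: str) -> str:
    """Return the first component of a newsgroup name.

    Assumption: components are separated by "."; the name is not
    validated against the syntax of section 5.5.
    """
    return newsgroup.split(".", 1)[0]


def same_hierarchy(a: str, b: str) -> bool:
    """True if two newsgroups share a first component.

    The comparison is exact, since newsgroup names are
    case-sensitive (section 2.5).
    """
    return hierarchy(a) == hierarchy(b)
```

For example, under these assumptions "rec.arts.books" and "rec.humor" are in the same hierarchy, while "comp.lang.c" and "Comp.lang.c" are not.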
| 410 |
|
| 411 |
A newsgroup may be "moderated", in which case submissions |
| 412 |
are not posted directly, but mailed to a "moderator" for |
| 413 |
consideration and possible posting. Moderators are typi- |
| 414 |
cally human but may be implemented partially or entirely in |
| 415 |
software. |
| 416 |
|
| 417 |
A "followup" is an article containing a response to the con- |
| 418 |
tents of an earlier article (the followup's "precursor"). A |
| 419 |
"followup agent" is a combination of reading agent and post- |
| 420 |
ing agent that aids in the preparation and posting of a fol- |
| 421 |
lowup. |
| 422 |
|
| 423 |
Text comparisons are "case-sensitive" if they consider |
| 424 |
uppercase letters (e.g. "A") different from lowercase let- |
| 425 |
ters (e.g. "a"), and "case-insensitive" if letters differing |
| 426 |
only in case (e.g. "A" and "a") are considered identical. |
| 427 |
Categories of text are said to be case-(in)sensitive if com- |
| 428 |
parisons of such texts to others are case-(in)sensitive. |
| 429 |
|
| 430 |
A "cooperating subnet" is a set of news-exchanging hosts |
| 431 |
which is sufficiently well-coordinated (typically via a cen- |
| 432 |
tral administration of some sort) that stronger assumptions |
| 433 |
can be made about hosts in the set than about news hosts in |
| 434 |
general. This is typically used to relax restrictions which |
| 435 |
are otherwise required for worst-case interoperability; mem- |
| 436 |
bers of a cooperating subnet MAY interchange articles that |
| 437 |
do not conform to this Draft's specifications, provided all |
| 438 |
members have agreed to this and provided the articles are |
| 439 |
not permitted to leak out of the subnet. The word "subnet" |
| 440 |
is used to emphasize that a cooperating subnet is typically |
| 441 |
not an isolated universe; care must be taken that traffic |
| 442 |
leaving the subnet complies with the restrictions of the |
| 443 |
larger net, not just those of the cooperating subnet. |
| 444 |
|
| 445 |
A "message ID" is a unique identifier for an article, usu- |
| 446 |
ally supplied by the posting agent which posted it. It dis- |
| 447 |
tinguishes the article from every other article ever posted |
| 448 |
anywhere (in theory). Articles with the same message ID are |
| 449 |
treated as identical copies of the same article even if they |
| 450 |
are not in fact identical. |
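One consequence of this definition is that a relayer can detect duplicates by message ID alone, without comparing article contents. The Python sketch below shows the idea with an in-memory set; the class and method names are hypothetical, and a real relayer would presumably need persistent storage and some expiry policy, neither of which is shown.

```python
class SeenMessageIDs:
    """Remember which message IDs have already been received."""

    def __init__(self):
        self._seen = set()

    def accept(self, message_id: str) -> bool:
        """Return True if the ID is new (and record it), False if it
        has been seen before.

        Articles with the same message ID are treated as copies of
        the same article, so the contents are never compared.
        """
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True
```

A second article arriving with an already-recorded message ID is rejected by accept() even if its text differs from the first.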
| 451 |
|
| 452 |
A "gateway" is software which receives news articles and |
| 453 |
converts them to messages of some other kind (e.g. mail to a |
| 454 |
mailing list), or vice-versa; in essence it is a translating |
| 455 |
relayer that straddles boundaries between different methods |
| 456 |
of message exchange. The most common type of gateway |
| 469 |
connects newsgroup(s) to mailing list(s), either unidirec- |
| 470 |
tionally or bidirectionally, but there are also gateways |
| 471 |
between news networks using this Draft's news format and |
| 472 |
those using other formats. |
| 473 |
|
| 474 |
A "control message" is an article which is marked as con- |
| 475 |
taining control information; a relayer receiving such an |
| 476 |
article will (subject to permissions etc.) take actions |
| 477 |
beyond just filing and passing on the article. |
| 478 |
|
| 479 |
NOTE: "Control article" would be more consistent |
| 480 |
terminology, but "control message" is already well |
| 481 |
established. |
| 482 |
|
| 483 |
An article's "reply address" is the address to which mailed |
| 484 |
replies should be sent. This is the address specified in |
| 485 |
the article's From header (see section 5.2), unless it also |
| 486 |
has a Reply-To header (see section 6.3). |
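The rule above can be sketched in Python as follows. The representation of headers as a dictionary keyed by exact header name is an assumption made for illustration only, and the header contents are returned unparsed.

```python
def reply_address(headers: dict) -> str:
    """Return the contents of the header that supplies the reply address.

    Assumption: `headers` maps exact header names (e.g. "From",
    "Reply-To") to their contents as strings.
    """
    if "Reply-To" in headers:
        # A Reply-To header, when present, overrides From.
        return headers["Reply-To"]
    return headers["From"]
```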
| 487 |
|
| 488 |
The notation (e.g.) "(ASCII 17)" following a name means |
| 489 |
"this name refers to the ASCII character having value 17". |
| 490 |
An "ASCII printable character" is an ASCII character in the |
| 491 |
range 33-126. An "ASCII control character" is an ASCII |
| 492 |
character in the range 0-31, or the character DEL (ASCII |
| 493 |
127). A "non-ASCII character" is a character having a value |
| 494 |
exceeding 127. |
| 495 |
|
| 496 |
NOTE: Blank is neither an "ASCII printable charac- |
| 497 |
ter" nor an "ASCII control character". |
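The character classes defined above partition the possible character values. A minimal Python sketch of that partition, using a hypothetical function name and returning descriptive strings chosen for this example, might look like:

```python
def classify(code: int) -> str:
    """Classify a character value using the definitions in this section."""
    if code < 0:
        raise ValueError("character values are non-negative")
    if code > 127:
        return "non-ASCII"
    if code == 32:
        # Blank is neither printable nor a control character.
        return "blank"
    if 33 <= code <= 126:
        return "ASCII printable"
    # What remains is 0-31 and DEL (127).
    return "ASCII control"
```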
| 498 |
|
| 499 |
|
| 500 |
2.4. End Of Line |
| 501 |
|
| 502 |
How the end of a text line is represented depends on the |
| 503 |
context and the implementation. For Internet transmission |
| 504 |
via protocols such as SMTP [rrr], an end-of-line is a CR |
| 505 |
(ASCII 13) followed by an LF (ASCII 10). ISO C [rrr] and |
| 506 |
many modern operating systems indicate end-of-line with a |
| 507 |
single character, typically ASCII LF (aka "newline"), and |
| 508 |
this is the normal convention when news is transmitted via |
| 509 |
UUCP. A variety of other methods are in use, including out- |
| 510 |
of-band methods in which there is no specific character that |
| 511 |
means end-of-line. |
| 512 |
|
| 513 |
This Draft does not constrain how end-of-line is represented
in news, except that characters other than CR and LF MUST NOT
be usurped for use in end-of-line representations.
| 516 |
Also, obviously, all software dealing with a particular copy |
| 517 |
of an article must agree on the convention to be used. |
| 518 |
"EOL" is used to mean "whatever end-of-line representation |
| 519 |
is appropriate"; it is not necessarily a character or |
| 520 |
sequence of characters. |
| 521 |
|
| 535 |
NOTE: If faced with picking an EOL representation |
| 536 |
in the absence of other constraints, use of a sin- |
| 537 |
gle character simplifies processing, and the ASCII |
| 538 |
standard [rrr] specifies that if one character is |
| 539 |
to be used for this purpose, it should be LF |
| 540 |
(ASCII 10). |
| 541 |
|
| 542 |
NOTE: Inside MIME encodings, use of the Internet |
| 543 |
canonical EOL representation (CR followed by LF) |
| 544 |
is mandatory. See [rrr]. |
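As an illustration only, the following Python sketch converts between two of the EOL representations mentioned above: a single LF (assumed here to be the local convention) and CR followed by LF (as used for Internet transmission). The function names are hypothetical, and bare CR characters are deliberately left untouched.

```python
def to_wire(data: bytes) -> bytes:
    """Convert LF line endings to CR LF.

    Existing CR LF pairs are collapsed to LF first so that they are
    not doubled, which makes the conversion idempotent.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def to_local(data: bytes) -> bytes:
    """Convert CR LF line endings to a single LF."""
    return data.replace(b"\r\n", b"\n")
```

Whichever representation is chosen, every piece of software handling a given copy of an article has to agree on it; the sketch covers only the conversion at the boundary.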
| 545 |
|
| 546 |
|
| 547 |
2.5. Case-Sensitivity |
| 548 |
|
| 549 |
Text in newsgroup names, header parameters, etc. is case- |
| 550 |
sensitive unless stated otherwise. |
| 551 |
|
| 552 |
NOTE: This is at variance with MAIL, which is |
| 553 |
case-insensitive unless stated otherwise, but is |
| 554 |
consistent with news historical practice and |
| 555 |
existing news software. See the comments on back- |
| 556 |
ward compatibility in section 1. |
| 557 |
|
| 558 |
|
| 559 |
2.6. Language |
| 560 |
|
| 561 |
Various constant strings in this Draft, such as header names |
| 562 |
and month names, are derived from English words. Despite |
| 563 |
their derivation, these words do NOT change when the poster |
| 564 |
or reader employing them is interacting in a language other |
| 565 |
than English. Posting and reading agents SHOULD translate |
| 566 |
as appropriate in their interaction with the poster or |
| 567 |
reader, but the forms that actually appear in articles are |
| 568 |
always the English-derived ones defined in this Draft. |
| 569 |
|
| 570 |
|
| 571 |
3. Relation To MAIL (RFC 822 etc.) |
| 572 |
|
| 573 |
The primary intent of this Draft is to completely describe |
| 574 |
the news article format as a subset of MAIL's message format |
| 575 |
augmented by some new headers. Unless explicitly noted oth- |
| 576 |
erwise, the intent throughout is that an article MUST also |
| 577 |
be a valid MAIL message. |
| 578 |
|
| 579 |
NOTE: Despite obvious similarities between news |
| 580 |
and mail, opinions vary on whether it is possible |
| 581 |
or desirable to unify them into a single service. |
| 582 |
However, it is unquestionably both possible and |
| 583 |
useful to employ some of the same tools for manip- |
| 584 |
ulating both mail messages and news articles, so |
| 585 |
there is specific advantage to be had in defining |
| 586 |
them compatibly. Furthermore, there is no appar- |
| 587 |
ent need to re-invent the wheel when slight exten- |
| 588 |
sions to an existing definition will suffice. |
| 589 |
|
| 601 |
Given that this Draft attempts to be self-contained, it |
| 602 |
inevitably contains considerable repetition of information |
| 603 |
found in MAIL. This raises the possibility of unintentional |
| 604 |
conflicts. Unless specifically noted otherwise, any wording |
| 605 |
in this Draft which permits behavior that is not MAIL- |
| 606 |
compliant is erroneous and should be followed only to the |
| 607 |
extent that the result remains compliant with MAIL. |
| 608 |
|
| 609 |
NOTE: RFC 1036 said "where this standard conflicts |
| 610 |
with [RFC 822], RFC-822 should be considered cor- |
| 611 |
rect and this standard in error". Taken liter- |
| 612 |
ally, this was obviously incorrect, since RFC 1036 |
| 613 |
imposed a number of restrictions not found in RFC |
| 614 |
822. The intent, however, was reasonable: to |
| 615 |
indicate that UNINTENTIONAL differences were |
| 616 |
errors in RFC 1036. |
| 617 |
|
| 618 |
Implementors and users should note that MAIL is deliberately |
| 619 |
an extensible standard, and most extensions devised for mail |
| 620 |
are also relevant to (and compatible with) news. Note par- |
| 621 |
ticularly MIME [rrr], summarized briefly in appendix B, |
| 622 |
which extends MAIL in a number of useful ways that are defi- |
| 623 |
nitely relevant to news. Also of note is the work in |
| 624 |
progress on reconciling PEM (Privacy Enhanced Mail, which |
| 625 |
defines extensions for authentication and security) with |
| 626 |
MIME, after which this may also be relevant to news. |
| 627 |
|
| 628 |
UNRESOLVED ISSUE: Update the MIME/PEM information. |
| 629 |
|
| 630 |
Similarly, descriptions here of MIME facilities should be |
| 631 |
considered correct only to the extent that they do not |
| 632 |
require or legitimize practices that would violate those |
| 633 |
RFCs. (Note that this Draft does extend the application of |
| 634 |
some MIME facilities, but this is an extension rather than |
| 635 |
an alteration.) |
| 636 |
|
| 637 |
|
| 638 |
4. Basic Format |
| 639 |
|
| 640 |
|
| 641 |
4.1. Overall Syntax |
| 642 |
|
| 643 |
The overall syntax of a news article is: |
| 644 |
|
| 667 |
     article         = 1*header separator body
     header          = start-line *continuation
     start-line      = header-name ":" space [ nonblank-text ] eol
     continuation    = space nonblank-text eol
     header-name     = 1*name-character *( "-" 1*name-character )
     name-character  = letter / digit
     letter          = <ASCII letter A-Z or a-z>
     digit           = <ASCII digit 0-9>
     separator       = eol
     body            = *( [ nonblank-text / space ] eol )
     eol             = <EOL>
     nonblank-text   = [ space ] text-character *( space-or-text )
     text-character  = <any ASCII character except NUL (ASCII 0),
                         HT (ASCII 9), LF (ASCII 10), CR (ASCII 13),
                         or blank (ASCII 32)>
     space           = 1*( <HT (ASCII 9)> / <blank (ASCII 32)> )
     space-or-text   = space / text-character
| 684 |
|
| 685 |
An article consists of some headers followed by a body. An |
| 686 |
empty line separates the two. The headers contain struc- |
| 687 |
tured information about the article and its transmission. A |
| 688 |
header begins with a header name identifying it, and can be |
| 689 |
continued onto subsequent lines by beginning the continua- |
| 690 |
tion line(s) with white space. (Note that section 4.2.3 |
| 691 |
adds some restrictions to the header syntax indicated here.) |
| 692 |
The body is largely-unstructured text significant only to |
| 693 |
the poster and the readers. |
| 694 |
|
| 695 |
NOTE: Terminology here follows the current custom |
| 696 |
in the news community, rather than the MAIL con- |
| 697 |
vention of (sometimes) referring to what is here |
| 698 |
called a "header" as a "header field" or "field". |
| 699 |
|
| 700 |
Note that the separator line must be truly empty, not just a |
| 701 |
line containing white space. Further empty lines following |
| 702 |
it are part of the body, as are empty lines at the end of |
| 703 |
the article. |
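The overall structure described above can be sketched in Python as follows. This is an informal illustration, not a complete validator: it assumes EOL is a single LF, it does not check text-character restrictions or the additional restrictions of section 4.2.3, and the names parse_article and HEADER_START are hypothetical.

```python
import re

# header-name ":" space [ nonblank-text ], per the grammar in section 4.1
HEADER_START = re.compile(r"^([A-Za-z0-9]+(?:-[A-Za-z0-9]+)*):[ \t]+(.*)$")


def parse_article(text: str):
    """Split an article into ([(name, contents), ...], body).

    Assumption: EOL is represented by a single LF.
    """
    lines = text.split("\n")
    headers = []
    i = 0
    # Headers run up to the first truly empty line (the separator).
    while i < len(lines) and lines[i] != "":
        line = lines[i]
        if line[0] in " \t":
            # A continuation line must follow a header and must
            # contain something other than white space; a line of
            # white space alone is NOT a separator.
            if not headers or line.strip(" \t") == "":
                raise ValueError(f"bad continuation at line {i + 1}")
            name, contents = headers[-1]
            headers[-1] = (name, contents + "\n" + line)
        else:
            m = HEADER_START.match(line)
            if m is None:
                raise ValueError(f"malformed header at line {i + 1}")
            headers.append((m.group(1), m.group(2)))
        i += 1
    if not headers:
        raise ValueError("an article needs at least one header")
    if i == len(lines):
        raise ValueError("no separator line found")
    # Everything after the separator, including empty lines, is body.
    body = "\n".join(lines[i + 1:])
    return headers, body
```

Note that the sketch keeps leading and trailing empty lines in the body and rejects a white-space-only line where the separator is expected, matching the two points made in the preceding paragraph.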
| 704 |
|
| 705 |
NOTE: Some systems make no distinction between |
| 706 |
empty lines and lines consisting entirely of white |
| 707 |
space; indeed, some systems cannot represent |
| 708 |
entirely empty lines. The grammar's requirement |
| 709 |
that header continuation lines contain some print- |
| 710 |
able text is meant to ensure that the empty/space |
| 711 |
distinction cannot confuse identification of the |
| 712 |
separator line. |
| 713 |
|
| 714 |
NOTE: It is tempting to authorize posting agents |
| 715 |
to strip empty lines at the beginning and end of |
| 716 |
the body, but such empty lines could possibly be |
| 717 |
part of a preformatted document. |
| 718 |
|
| 719 |
Implementors are warned that trailing white space, whether |
| 720 |
alone on the line or not, MAY be significant in the body, |
| 733 |
notably in early versions of the "uuencode" encoding for |
| 734 |
binary data. Trailing white space MUST be preserved unless |
| 735 |
the article is known to have originated within a cooperating |
| 736 |
subnet that avoids using significant trailing white space, |
| 737 |
and SHOULD be preserved regardless. Posters SHOULD avoid |
| 738 |
using conventions or encodings which make trailing white |
| 739 |
space significant; for encoding of binary data, MIME's |
| 740 |
"base64" encoding is recommended. Implementors are warned |
| 741 |
that ISO C implementations are not required to preserve |
| 742 |
trailing white space, and special precautions may be neces- |
| 743 |
sary in implementations which do not. |
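
As an illustration of the base64 recommendation, the sketch below encodes binary data using Python's standard base64 module. The function name encode_binary is hypothetical. The point is that no line of the output ends in white space.

```python
import base64


def encode_binary(data: bytes) -> list[str]:
    """Return the base64 text lines for a block of binary data."""
    # encodebytes() emits lines of at most 76 characters, each ending in "\n".
    return base64.encodebytes(data).decode("ascii").splitlines()
```

The lines can be placed in an article body as they are. Stripping trailing white space from them, even accidentally, does not change the decoded data.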
| 744 |
|
| 745 |
NOTE: Unfortunately, the signature-delimiter con- |
| 746 |
vention (described in section 4.3.2) does use sig- |
| 747 |
nificant trailing white space. It's too late to |
| 748 |
fix this; there is work underway on defining an |
| 749 |
organized signature convention as part of MIME, |
| 750 |
which is a preferable solution in the long run. |
| 751 |
|
| 752 |
Posters are warned that some very old relayer software mis- |
| 753 |
behaves when the first non-empty line of an article body |
| 754 |
begins with white space. |
| 755 |
|
| 756 |
|
| 757 |
4.2. Headers |
| 758 |
|
| 759 |
|
| 760 |
4.2.1. Names and Contents |
| 761 |
|
| 762 |
Despite the restrictions on header-name syntax imposed by |
| 763 |
the grammar, relayers and reading agents SHOULD tolerate |
| 764 |
header names containing any ASCII printable character other |
| 765 |
than colon (":", ASCII 58). |
| 766 |
|
| 767 |
NOTE: MAIL header names can contain any ASCII |
| 768 |
printable character (other than colon) in theory, |
| 769 |
but in practice, arbitrary header names are known |
| 770 |
to cause trouble for some news software. Section |
| 771 |
4.1's restriction to alphanumeric sequences sepa- |
| 772 |
rated by hyphens is believed to permit all widely- |
| 773 |
used header names without causing problems for any |
| 774 |
widely-used software. Software is nevertheless |
| 775 |
encouraged to cope correctly with the full range |
| 776 |
of possibilities, since aberrations are known to |
| 777 |
occur. |
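
A sketch of the two levels of checking described above: the strict form a posting agent might generate, and the tolerant form a relayer or reading agent might accept. The function names are hypothetical, and the sketch assumes that "printable" means ASCII 33 through 126, that is, excluding blank.

```python
import re

# Strict form: alphanumeric sequences separated by hyphens.
STRICT_NAME = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def is_strict_name(name: str) -> bool:
    """True if the name fits the restricted syntax."""
    return STRICT_NAME.fullmatch(name) is not None


def is_tolerable_name(name: str) -> bool:
    """True if the name uses only printable ASCII other than colon."""
    # Assumption: "printable" covers ASCII 33..126, so blank is excluded.
    return bool(name) and all(33 <= ord(c) <= 126 and c != ":" for c in name)
```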
| 778 |
|
| 779 |
Relayers MUST disregard headers not described in this Draft |
| 780 |
(that is, with header names not mentioned in this Draft), |
| 781 |
and pass them on unaltered. |
| 782 |
|
| 783 |
Posters wishing to convey non-standard information in head- |
| 784 |
ers SHOULD use header names beginning with "X-". No stan- |
| 785 |
dard header name will ever be of this form. Reading agents |
| 786 |
SHOULD ignore "X-" headers, or at least treat them with |
| 799 |
great care. |
| 800 |
|
| 801 |
The order of headers in an article is not significant. How- |
| 802 |
ever, posting agents are encouraged to put mandatory headers |
| 803 |
(see section 5) first, followed by optional headers (see |
| 804 |
section 6), followed by headers not defined in this Draft. |
| 805 |
|
| 806 |
NOTE: While relayers and reading agents must be |
| 807 |
prepared to handle any order, having the signifi- |
| 808 |
cant headers (the precise definition of "signifi- |
| 809 |
cant" depends on context) first can noticeably |
| 810 |
improve efficiency, especially in memory-limited |
| 811 |
environments where it is difficult to buffer up an |
| 812 |
arbitrary quantity of headers while searching for |
| 813 |
the few that matter. |
| 814 |
|
| 815 |
Header names are case-insensitive. There is a preferred |
| 816 |
case convention, which posters and posting agents SHOULD |
| 817 |
use: each hyphen-separated "word" has its initial letter (if |
| 818 |
any) in uppercase and the rest in lowercase, except that |
| 819 |
some abbreviations have all letters uppercase (e.g. "Mes- |
| 820 |
sage-ID" and "MIME-Version"). The forms used in this Draft |
| 821 |
are the preferred forms for the headers described herein. |
| 822 |
Relayers and reading agents are warned that articles might |
| 823 |
not obey this convention. |
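
The case convention might be applied by a posting agent along the following lines. The names preferred_case, same_header, and UPPERCASE_WORDS are hypothetical, and the table of all-uppercase abbreviations holds only the two examples given above. A real agent would need a fuller list.

```python
# Assumption: only the two abbreviations mentioned in the text are listed.
UPPERCASE_WORDS = {"id": "ID", "mime": "MIME"}


def preferred_case(name: str) -> str:
    """Rewrite a header name in the preferred case convention."""
    words = []
    for word in name.split("-"):
        lower = word.lower()
        # Known abbreviations go all-uppercase; other words get an
        # uppercase initial letter and lowercase remainder.
        words.append(UPPERCASE_WORDS.get(lower, lower.capitalize()))
    return "-".join(words)


def same_header(a: str, b: str) -> bool:
    """Compare header names without regard to case."""
    return a.lower() == b.lower()
```

Relayers and reading agents would compare names with something like same_header rather than relying on the preferred form being present.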
| 824 |
|
| 825 |
NOTE: Although software must be prepared for the |
| 826 |
possibility of random use of case in header names |
| 827 |
(and other case-independent text), establishing a |
| 828 |
preferred convention reduces pointless diversity, |
| 829 |
and may permit optimized software that looks for |
| 830 |
the preferred forms before resorting to less- |
| 831 |
efficient case-insensitive searches. |
| 832 |
|
| 833 |
In general, a header can consist of several lines, with each |
| 834 |
continuation line beginning with white space. The EOLs pre- |
| 835 |
ceding continuation lines are ignored when processing such a |
| 836 |
header, effectively combining the start-line and the contin- |
| 837 |
uations into a single logical line. The logical line, less |
| 838 |
the header name, colon, and any white space following the |
| 839 |
colon, is the "header content". |
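
The combining of start-line and continuations can be sketched as below. The function name unfold is hypothetical, and the sketch assumes that the input is a list of header lines with EOLs already removed and that every start-line contains a colon.

```python
def unfold(header_lines: list[str]) -> list[tuple[str, str]]:
    """Return (header name, header content) pairs from raw header lines."""
    logical: list[str] = []
    for line in header_lines:
        if line[:1] in (" ", "\t") and logical:
            # Continuation: the preceding EOL is ignored, but the leading
            # white space remains part of the logical line.
            logical[-1] += line
        else:
            logical.append(line)
    result = []
    for entry in logical:
        name, _, rest = entry.partition(":")
        # Content is the logical line less the name, the colon, and any
        # white space following the colon.
        result.append((name, rest.lstrip(" \t")))
    return result
```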
| 840 |
|
| 841 |
|
| 842 |
4.2.2. Undesirable Headers |
| 843 |
|
| 844 |
A header whose content is empty is said to be an empty |
| 845 |
header. Relayers and reading agents SHOULD not consider |
| 846 |
presence or absence of an empty header to alter the seman- |
| 847 |
tics of an article (although syntactic rules, such as |
| 848 |
requirements that certain header names appear at most once |
| 849 |
in an article, MUST still be satisfied). Posting agents |
| 850 |
SHOULD delete empty headers from articles before posting |
| 851 |
them. |
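
A posting agent's deletion of empty headers might look like this sketch, which assumes headers are held as (name, content) pairs. The function name drop_empty_headers is hypothetical.

```python
def drop_empty_headers(headers: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Remove headers whose content is empty."""
    # Content made up solely of blanks and tabs is treated as empty too.
    return [(name, content) for name, content in headers
            if content.strip(" \t") != ""]
```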
| 852 |
|
| 865 |
Headers that merely state defaults explicitly (e.g., a Fol- |
| 866 |
lowup-To header with the same content as the Newsgroups |
| 867 |
header, or a MIME Content-Type header with contents |
| 868 |
"text/plain; charset=us-ascii") or state information that |
| 869 |
reading agents can typically determine easily themselves |
| 870 |
(e.g. the length of the body in octets) are redundant, con- |
| 871 |
veying no information whatsoever. Headers that state infor- |
| 872 |
mation which cannot possibly be of use to a significant num- |
| 873 |
ber of relayers, reading agents, or readers (e.g., the name |
| 874 |
of the software package used as the posting agent) are use- |
| 875 |
less and pointless. Posters and posting agents SHOULD avoid |
| 876 |
including redundant or useless headers in articles. |
| 877 |
|
| 878 |
NOTE: Information that someone, somewhere, might |
| 879 |
someday find useful is best omitted from headers. |
| 880 |
(There's quite enough of it in article bodies.) |
| 881 |
Headers should contain information of known util- |
| 882 |
ity only. This is not meant to preclude inclusion |
| 883 |
of information primarily meant for news-software |
| 884 |
debugging, but such information should be included |
| 885 |
only if there is real reason, preferably based on |
| 886 |
experience, to suspect that it may be genuinely |
| 887 |
useful. Articles passing through gateways are the |
| 888 |
only obvious case where inclusion of debugging |
| 889 |
information appears clearly legitimate. (See sec- |
| 890 |
tion 10.1.) |
| 891 |
|
| 892 |
NOTE: A useful rule of thumb for software imple- |
| 893 |
mentors is: "if I had to pay a dollar a day for |
| 894 |
the transmission of this header, would I still |
| 895 |
think it worthwhile?". |
| 896 |
|
| 897 |
|
| 898 |
4.2.3. White Space and Continuations |
| 899 |
|
| 900 |
The colon following the header name on the start-line MUST |
| 901 |
be followed by white space, even if the header is empty. If |
| 902 |
the header is not empty, at least some of the content MUST |
| 903 |
appear on the start-line. Posting agents MUST enforce these |
| 904 |
restrictions, but relayers (etc.) SHOULD accept even arti- |
| 905 |
cles that violate them. |
| 906 |
|
| 907 |
NOTE: MAIL does not require white space after the |
| 908 |
colon, but it is usual. RFC 1036 required the |
| 909 |
white space, even in empty headers, and some |
| 910 |
existing software demands it. In MAIL, and |
| 911 |
arguably in RFC 1036 (although the wording is |
| 912 |
vague), it is technically legitimate for the white |
| 913 |
space to be part of a continuation line rather |
| 914 |
than the start-line, but not all existing software |
| 915 |
will accept this. Deleting empty headers and |
| 916 |
placing some content on the start-line avoids this |
| 917 |
issue... which is desirable because trailing |
| 918 |
blanks, easily deleted by accident, are best not |
| 931 |
made significant in headers. |
| 932 |
|
| 933 |
In general, posters and posting agents SHOULD use blank |
| 934 |
(ASCII 32), not tab (ASCII 9), where white space is desired |
| 935 |
in headers. Existing software does not consistently accept |
| 936 |
tab as synonymous with blank in all contexts. In particu- |
| 937 |
lar, RFC 1036 appeared to specify that the character immedi- |
| 938 |
ately following the colon after a header name was required |
| 939 |
to be a blank, and some news software insists on that, so |
| 940 |
this character MUST be a blank. Again, posting agents MUST |
| 941 |
enforce these restrictions but relayers SHOULD be more tol- |
| 942 |
erant. |
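
The start-line restrictions that a posting agent enforces can be sketched as a simple check. The function name start_line_problems and the wording of the messages are hypothetical. The sketch takes the start-line and any continuation lines, all without EOLs.

```python
def start_line_problems(start_line: str, continuations: list[str]) -> list[str]:
    """List violations of the start-line rules; empty list means none."""
    name, colon, rest = start_line.partition(":")
    if not colon:
        return ["no colon after header name"]
    problems = []
    # The character immediately after the colon must be a blank (ASCII 32),
    # even if the header is empty; a tab is not acceptable here.
    if rest[:1] != " ":
        problems.append("colon not followed by a blank")
    # If the header has content, some of it must be on the start-line.
    has_more = any(line.strip(" \t") for line in continuations)
    if rest.strip(" \t") == "" and has_more:
        problems.append("no content on start-line")
    return problems
```

A relayer would more likely log such problems than reject the article, in keeping with the tolerance asked of it above.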
| 943 |
|
| 944 |
Since the white space beginning a continuation line remains |
| 945 |
a part of the logical line, headers can be "broken" into |
| 946 |
multiple lines only at white space. Posting agents SHOULD |
| 947 |
not break headers unnecessarily. Relayers SHOULD preserve |
| 948 |
existing header breaks, and SHOULD not introduce new breaks. |
| 949 |
Breaking headers SHOULD be a last resort; relayers and read- |
| 950 |
ing agents SHOULD handle long header lines gracefully. (See |
| 951 |
the discussion of size limits in section 4.6.) |
| 952 |
|
| 953 |
|
| 954 |
4.3. Body |
| 955 |
|
| 956 |
Although the article body is unstructured for most of the |
| 957 |
purposes of this Draft, structure MAY be imposed on it by |
| 958 |
other means, notably MIME headers (see appendix B). |
| 959 |
|
| 960 |
|
| 961 |
4.3.1. Body Format Issues |
| 962 |
|
| 963 |
The body of an article MAY be empty, although posting agents |
| 964 |
SHOULD consider this an error condition (meriting returning |
| 965 |
the article to the poster for revision). A posting agent |
| 966 |
which does not reject such an article SHOULD issue a warning |
| 967 |
message to the poster and supply a non-empty body. Note |
| 968 |
that the separator line MUST be present even if the body is |
| 969 |
empty. |
| 970 |
|
| 971 |
NOTE: An empty body is probably a poster error |
| 972 |
except, arguably, for some control messages... and |
| 973 |
even they really ought to have a body explaining |
| 974 |
the reason for the control message. Some old |
| 975 |
reading agents are known to generate empty bodies |
| 976 |
for "cancel" control messages, so posting agents |
| 977 |
might opt not to reject body-less articles in such |
| 978 |
cases (although it would be better to fix the |
| 979 |
reading agents to request a body). However, some |
| 980 |
existing news software is known to react badly to |
| 981 |
body-less articles, hence the request for posting |
| 982 |
agents to insert a body in such cases. |
| 983 |
|
| 997 |
NOTE: A possible posting-agent-supplied body text |
| 998 |
(already used by one widespread posting agent) is |
| 999 |
"This article was probably generated by a buggy |
| 1000 |
news reader.". (The use of "reader" to refer to |
| 1001 |
the reading agent is traditional, although this |
| 1002 |
Draft uses more precise terminology.) |
| 1003 |
|
| 1004 |
NOTE: The requirement for the separator line even |
| 1005 |
in a bodyless article is inherited from MAIL, and |
| 1006 |
also distinguishes legitimately-bodyless articles |
| 1007 |
from articles accidentally truncated in the middle |
| 1008 |
of the headers. |
| 1009 |
|
| 1010 |
Note that an article body is a sequence of lines terminated |
| 1011 |
by EOLs, not arbitrary binary data, and in particular it |
| 1012 |
MUST end with an EOL. However, relayers SHOULD treat the |
| 1013 |
body of an article as an uninterpreted sequence of octets |
| 1014 |
(except as mandated by changes of EOL representation and by |
| 1015 |
control-message processing) and SHOULD avoid imposing con- |
| 1016 |
straints on it. See also section 4.6. |
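
A posting agent that chooses not to reject such articles might handle the body as in the sketch below. The function name prepare_body is hypothetical. The placeholder is the body text mentioned in the note above, and silently adding a missing final EOL is an assumption of this sketch, not a requirement stated here.

```python
PLACEHOLDER = "This article was probably generated by a buggy news reader.\n"


def prepare_body(body: str) -> tuple[str, list[str]]:
    """Return a postable body plus warnings to show the poster."""
    warnings = []
    if body == "":
        # Empty body: warn the poster and supply a non-empty body.
        warnings.append("empty body; placeholder text supplied")
        body = PLACEHOLDER
    elif not body.endswith("\n"):
        # Assumption: the agent repairs a missing final EOL itself.
        warnings.append("body did not end with an EOL; one was added")
        body += "\n"
    return body, warnings
```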
| 1017 |
|
| 1018 |
|
| 1019 |
4.3.2. Body Conventions |
| 1020 |
|
| 1021 |
Although body lines can in principle be very long (see sec- |
| 1022 |
tion 4.6 for some discussion of length limits), posters |
| 1023 |
SHOULD restrict body line lengths to circa 70-75 characters. |
| 1024 |
On systems where text is conventionally stored with EOLs |
| 1025 |
only at paragraph breaks and other "hard return" points, |
| 1026 |
with software breaking lines as appropriate for display or |
| 1027 |
manipulation, posting agents SHOULD insert EOLs as necessary |
| 1028 |
so that posted articles comply with this restriction. |
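
For systems that store one EOL per paragraph, inserting EOLs might be done as in this sketch, which uses Python's standard textwrap module. The function name wrap_paragraphs and the default width of 72 are assumptions. The sketch suits ordinary prose only, since preformatted text would need to be left alone.

```python
import textwrap


def wrap_paragraphs(text: str, width: int = 72) -> str:
    """Break each paragraph (one input line) into lines of limited width."""
    wrapped = []
    for paragraph in text.split("\n"):
        if paragraph.strip() == "":
            wrapped.append("")  # keep empty lines between paragraphs
        else:
            wrapped.append(textwrap.fill(
                paragraph,
                width=width,
                break_long_words=False,   # never split a word or URL
                break_on_hyphens=False,
            ))
    return "\n".join(wrapped)
```

Because long words are never split, a single word longer than the width still yields an overlong line.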
| 1029 |
|
| 1030 |
NOTE: News originated in environments where line |
| 1031 |
breaks in plain text files were supplied by the |
| 1032 |
user, not the software. Be this good or bad, much |
| 1033 |
reading-agent and posting-agent software assumes |
| 1034 |
that news articles follow this convention, so it |
| 1035 |
is often inconvenient to read or respond to arti- |
| 1036 |
cles which violate it. The "70-75" number comes |
| 1037 |
from the widespread use of display devices which |
| 1038 |
are 80 columns wide, and the desire to leave a bit |
| 1039 |
of margin for quoting etc. (see below). |
| 1040 |
|
| 1041 |
Reading agents confronted with body lines much longer than |
| 1042 |
the available output-device width SHOULD break lines as |
| 1043 |
appropriate. Posters are warned that such breaks may not |
| 1044 |
occur exactly where the poster intends. |
| 1045 |
|
| 1046 |
NOTE: "As appropriate" would typically include |
| 1047 |
breaking lines when supplying the text of an arti- |
| 1048 |
cle to be quoted in a reply or followup, something |
| 1049 |
that line-breaking reading agents often neglect to |
| 1050 |
do now. |
| 1051 |
|
| 1063 |
Although styles vary widely, for plain text it is usual to |
| 1064 |
use no left margin, leave the right edge ragged, use a sin- |
| 1065 |
gle empty line to separate paragraphs, and employ normal |
| 1066 |
natural-language usage on matters such as upper/lowercase. |
| 1067 |
(In particular, articles SHOULD not be written entirely in |
| 1068 |
uppercase. In environments where posters have access only |
| 1069 |
to uppercase, posting agents SHOULD translate it to lower- |
| 1070 |
case.) |
| 1071 |
|
| 1072 |
NOTE: Most people find substantial bodies of text |
| 1073 |
entirely in uppercase relatively hard to read, |
| 1074 |
while all-lowercase text merely looks slightly |
| 1075 |
odd. The common association of uppercase with |
| 1076 |
strong emphasis adds to this. |
| 1077 |
|
| 1078 |
Tone of voice does not carry well in written text, and mis- |
| 1079 |
understandings are common when sarcasm, parody, or exaggera- |
| 1080 |
tion for humorous effect is attempted without explicit warn- |
| 1081 |
ing. It has become conventional to use the sequence ":-)", |
| 1082 |
which (on most output devices) resembles a rotated "smiley |
| 1083 |
face" symbol, as a marker for text not meant to be taken |
| 1084 |
literally, especially when humor is intended. This practice |
| 1085 |
aids communication and averts unintended ill-will; posters |
| 1086 |
are urged to use it. A variety of analogous sequences are |
| 1087 |
used with less-standardized meanings [Sanderson]. |
| 1088 |
|
| 1089 |
The order of arrival of news articles at a particular host |
| 1090 |
depends somewhat on transmission paths, and occasionally |
| 1091 |
articles are lost for various reasons. When responding to a |
| 1092 |
previous article, posters SHOULD not assume that all readers |
| 1093 |
understand the exact context. It is common to quote some of |
| 1094 |
the previous article to establish context. This SHOULD be |
| 1095 |
done by prefacing each quoted line (even if it is empty) |
| 1096 |
with the character ">". This will result in multiple levels |
| 1097 |
of ">" when quoted context itself contains quoted context. |
| 1098 |
|
| 1099 |
NOTE: It may seem superfluous to put a prefix on |
| 1100 |
empty lines, but it simplifies implementation of |
| 1101 |
functions such as "skip all quoted text" in read- |
| 1102 |
ing agents. |
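
The quoting convention is simple enough to sketch directly. The function names quote and is_quoted are hypothetical, and the EOL is assumed to be "\n".

```python
def quote(body: str) -> str:
    """Preface every line of the text, empty or not, with ">"."""
    lines = body.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the artifact of the final EOL
    return "".join(">" + line + "\n" for line in lines)


def is_quoted(line: str) -> bool:
    """True for a line of quoted context, at any level of quoting."""
    return line.startswith(">")
```

Since empty lines carry the prefix too, a "skip all quoted text" function need only skip lines for which is_quoted is true.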
| 1103 |
|
| 1104 |
Readability is enhanced if quoted text and new text are sep- |
| 1105 |
arated by an empty line. |
| 1106 |
|
| 1107 |
Posters SHOULD edit quoted context to trim it down to the |
| 1108 |
minimum necessary. However, posting agents SHOULD not |
| 1109 |
attempt to enforce this by imposing overly-simplistic rules |
| 1110 |
like "no more than 50% of the lines should be quotes". |
| 1111 |
|
| 1112 |
NOTE: While encouraging trimming is desirable, the |
| 1113 |
50% rule imposed by some old posting agents is |
| 1114 |
both inadequate and counterproductive. Posters do |
| 1115 |
not respond to it by being more selective about |
| 1116 |
quoting; they respond by padding short responses, |
| 1129 |
or by using different quoting styles to defeat |
| 1130 |
automatic analysis. The former adds unnecessary |
| 1131 |
noise and volume, while the latter also defeats |
| 1132 |
more useful forms of automatic analysis that read- |
| 1133 |
ing agents might wish to do. |
| 1134 |
|
| 1135 |
NOTE: At the very least, if a minimum-unquoted |
| 1136 |
quota is being set, article bodies shorter than |
| 1137 |
(say) 20 lines, or perhaps articles which exceed |
| 1138 |
the quota by only a few lines, should be exempt. |
| 1139 |
This avoids the ridiculous situation of complain- |
| 1140 |
ing about a 5-line response to a 6-line quote. |
| 1141 |
|
| 1142 |
NOTE: A more subtle posting-agent rule, suggested |
| 1143 |
for experimental use, is to reject articles that |
| 1144 |
appear to contain quoted signatures (see below). |
| 1145 |
This is almost certainly the result of a careless |
| 1146 |
poster not bothering to trim down quoted context. |
| 1147 |
Also, if a posting agent or followup agent pre- |
| 1148 |
sents an article template to the poster for edit- |
| 1149 |
ing, it really should take note of whether the |
| 1150 |
poster actually made any changes, and refrain from |
| 1151 |
posting an unmodified template. |
| 1152 |
|
| 1153 |
Some followup agents supply "attribution" lines for quoted |
| 1154 |
context, indicating where it first appeared and under whose |
| 1155 |
name. When multiple levels of quoting are present and |
| 1156 |
quoted context is edited for brevity, "inner" attribution |
| 1157 |
lines are not always retained. The editing process is also |
| 1158 |
somewhat error-prone. Reading agents (and readers) are |
| 1159 |
warned not to assume that attributions are accurate. |
| 1160 |
|
| 1161 |
UNRESOLVED ISSUE: Should a standard format for |
| 1162 |
attribution lines be defined? There is already |
| 1163 |
considerable diversity... but automatic news anal- |
| 1164 |
ysis would be substantially aided by a standard |
| 1165 |
convention. |
| 1166 |
|
| 1167 |
Early difficulties in inferring return addresses from arti- |
| 1168 |
cle headers led to "signatures": short closing texts, auto- |
| 1169 |
matically added to the end of articles by posting agents, |
| 1170 |
identifying the poster and giving network addresses etc. |
| 1171 |
If a poster or posting agent does append a signature to an |
| 1172 |
article, the signature SHOULD be preceded with a delimiter |
| 1173 |
line containing (only) two hyphens (ASCII 45) followed by |
| 1174 |
one blank (ASCII 32). Posting agents SHOULD limit the |
| 1175 |
length of signatures, since verbose excess bordering on |
| 1176 |
abuse is common if no restraint is imposed; 4 lines is a |
| 1177 |
common limit. |
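
A reading or posting agent might locate the signature as in the sketch below. The function names are hypothetical. Treating the last delimiter line in the body as the start of the signature is an assumption of this sketch, since this Draft does not say which one counts when several are present.

```python
SIG_DELIMITER = "-- "  # two hyphens followed by one blank


def split_signature(body_lines: list[str]) -> tuple[list[str], list[str]]:
    """Split body lines into (text, signature) at the delimiter line."""
    # Assumption: the last delimiter line marks the signature.
    for i in range(len(body_lines) - 1, -1, -1):
        if body_lines[i] == SIG_DELIMITER:
            return body_lines[:i], body_lines[i + 1:]
    return body_lines, []


def signature_too_long(signature: list[str], limit: int = 4) -> bool:
    """True if the signature exceeds the common 4-line limit."""
    return len(signature) > limit
```

Note that a line of two hyphens with no trailing blank is not the delimiter, which is why trailing white space has to be preserved for this to work.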
| 1178 |
|
| 1179 |
NOTE: While signatures are arguably a blemish, |
| 1180 |
they are a well-understood convention, and convey- |
| 1181 |
ing the same information in headers exposes it to |
| 1182 |
mangling and makes it rather less conspicuous. A |
| 1195 |
standard delimiter line makes it possible for |
| 1196 |
reading agents to handle signatures specially if |
| 1197 |
desired. (This is unfortunately hampered by |
| 1198 |
extensive misunderstanding of, and misuse of, the |
| 1199 |
delimiter.) |
| 1200 |
|
| 1201 |
NOTE: The choice of delimiter is somewhat unfortu- |
| 1202 |
nate, since it relies on preservation of trailing |
| 1203 |
white space, but it is too well-established to |
| 1204 |
change. There is work underway to define a more |
| 1205 |
sophisticated signature scheme as part of MIME, |
| 1206 |
and this will presumably supersede the current |
| 1207 |
convention in due time. |
| 1208 |
|
| 1209 |
NOTE: Four 75-column lines of signature text is |
| 1210 |
300 characters, which is ample to convey name and |
| 1211 |
mail-address information in all but the most |
| 1212 |
bizarre situations. |
| 1213 |
|
| 1214 |
|
| 1215 |
4.4. Characters And Character Sets |
| 1216 |
|
| 1217 |
Header and body lines MAY contain any ASCII characters other |
| 1218 |
than CR (ASCII 13), LF (ASCII 10), and NUL (ASCII 0). |
| 1219 |
|
| 1220 |
NOTE: CR and LF are excluded because they clash |
| 1221 |
with common EOL conventions. NUL is excluded |
| 1222 |
because it clashes with the C end-of-string con- |
| 1223 |
vention, which is significant to most existing |
| 1224 |
news software. These three characters are |
| 1225 |
unlikely to be transmitted successfully. |
| 1226 |
|
| 1227 |
However, posters SHOULD avoid using ASCII control characters |
| 1228 |
except for tab (ASCII 9), formfeed (ASCII 12), and backspace |
| 1229 |
(ASCII 8). Tab signifies sufficient horizontal white space |
| 1230 |
to reach the next of a set of fixed positions; posters are |
| 1231 |
warned that there is no standard set of positions, so tabs |
| 1232 |
should be avoided if precise spacing is essential. Formfeed |
| 1233 |
signifies a point at which a reading agent SHOULD pause and |
| 1234 |
await reader interaction before displaying further text. |
| 1235 |
Backspace SHOULD be used only for underlining, done by a |
| 1236 |
sequence of underscores (ASCII 95) followed by an equal num- |
| 1237 |
ber of backspaces, signifying that the same number of text |
| 1238 |
characters following are to be underlined. Posters are |
| 1239 |
warned that underlining is not available on all output |
| 1240 |
devices and is best not relied on for essential meaning. |
| 1241 |
Reading agents SHOULD recognize underlining and translate it |
| 1242 |
to the appropriate commands for devices that support it. |
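
The underlining convention can be sketched from both sides: generating the sequence, and recognizing it in a reading agent. The function names underline and find_underlined are hypothetical. Runs in which the numbers of underscores and backspaces differ are left as literal text, which is an assumption of this sketch.

```python
import re

_UNDERLINE = re.compile(r"(_+)(\x08+)")


def underline(text: str) -> str:
    """Underscores, then an equal number of backspaces, then the text."""
    n = len(text)
    return "_" * n + "\b" * n + text


def find_underlined(line: str) -> tuple[str, list[tuple[int, int]]]:
    """Return the plain text and the (start, end) spans to underline."""
    plain = ""
    spans = []
    i = 0
    while i < len(line):
        m = _UNDERLINE.match(line, i)
        if m and len(m.group(1)) == len(m.group(2)):
            n = len(m.group(1))
            text = line[m.end():m.end() + n]  # the characters underlined
            spans.append((len(plain), len(plain) + len(text)))
            plain += text
            i = m.end() + n
        else:
            plain += line[i]
            i += 1
    return plain, spans
```

A reading agent would then render each span with whatever underlining command its output device supports, or ignore the spans on devices that have none.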
| 1243 |
|
| 1244 |
NOTE: Interpretation of almost all control charac- |
| 1245 |
ters is device-specific to some degree, and |
| 1246 |
devices differ. Tabs and underlining are sup- |
| 1247 |
ported, to some extent, by most modern devices and |
| 1248 |
reading agents, hence the cautious exemptions for |
| 1261 |
them. The underlining method is specified because |
| 1262 |
the inverse method, text and then underscores, is |
| 1263 |
tempting to the naive... but if sent unaltered to |
| 1264 |
a device that shows only the most recent of sev- |
| 1265 |
eral overstruck characters rather than a compos- |
| 1266 |
ite, the result can be utterly unreadable. |
| 1267 |
|
| 1268 |
NOTE: A common interpretation of tab is that it is |
| 1269 |
a request to space forward to the next position |
| 1270 |
whose number is one more than a multiple of 8, |
| 1271 |
with positions numbered sequentially starting at |
| 1272 |
1. (So tab positions are 9, 17, 25, ...) Reading |
| 1273 |
agents not constrained by existing system conven- |
| 1274 |
tions might wish to use this interpretation. |
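
The common tab interpretation works out as follows. The function names are hypothetical. Python's built-in str.expandtabs with a tab size of 8 happens to implement the same rule, counting columns from 0 where the text above counts positions from 1.

```python
def next_tab_position(position: int) -> int:
    """Next tab position after the given position (numbered from 1)."""
    # Tab positions are one more than a multiple of 8: 9, 17, 25, ...
    return ((position - 1) // 8 + 1) * 8 + 1


def expand_tabs(line: str) -> str:
    """Replace tabs with blanks under the every-8-columns interpretation."""
    return line.expandtabs(8)
```

For example, a tab following two characters moves the next character to position 9.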
| 1275 |
|
| 1276 |
NOTE: It will typically be necessary for a reading |
| 1277 |
agent to catch and interpret formfeed, not just |
| 1278 |
send it to the output device. The actions per- |
| 1279 |
formed by typical output devices on receiving a |
| 1280 |
formfeed are neither adequate for nor appropriate |
| 1281 |
to the pause-for-interaction meaning. |
| 1282 |
|
| 1283 |
Cooperating subnets which wish to employ non-ASCII character |
| 1284 |
sets by using escape sequences (employing, e.g., ESC (ASCII |
| 1285 |
27), SO (ASCII 14), and SI (ASCII 15)) to alter the meaning |
| 1286 |
of superficially-ASCII characters MAY do so, but MUST use |
| 1287 |
MIME headers to alert reading agents to the particular char- |
| 1288 |
acter set(s) and escape sequences in use. A reading agent |
| 1289 |
SHOULD not pass such an escape sequence through, unaltered, |
| 1290 |
to the output device unless the agent confirms that the |
| 1291 |
sequence is one used to affect character sets and has reason |
| 1292 |
to believe that the device is capable of interpreting that |
| 1293 |
particular sequence properly. |
| 1294 |
|
| 1295 |
NOTE: Cooperating-subnet organizers are warned |
| 1296 |
that some very old relayers strip certain control |
| 1297 |
characters out of articles they pass along. ESC |
| 1298 |
is known to be among the affected characters. |
| 1299 |
|
| 1300 |
NOTE: There are now standard Internet encodings |
| 1301 |
for Japanese [rrr] and Vietnamese [rrr] in partic- |
| 1302 |
ular. |
| 1303 |
|
| 1304 |
Articles MUST not contain any octet with value exceeding |
| 1305 |
127, i.e. any octet that is not an ASCII character. |
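
A check for the octet restrictions of this section might look like the sketch below. The function name bad_octets is hypothetical. It expects one header or body line as raw octets with its EOL already removed.

```python
FORBIDDEN = {0, 10, 13}  # NUL, LF, CR


def bad_octets(line: bytes) -> list[tuple[int, int]]:
    """Return (offset, value) for each octet not permitted in a line."""
    return [(i, b) for i, b in enumerate(line)
            if b > 127 or b in FORBIDDEN]
```

Within a cooperating subnet that has agreed to relax the rule, the test for values above 127 would be dropped.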
| 1306 |
|
| 1307 |
NOTE: This rule, like others, may be relaxed by |
| 1308 |
unanimous consent of the members of a cooperating |
| 1309 |
subnet, provided suitable precautions are taken to |
| 1310 |
ensure that rule-violating articles do not leak |
| 1311 |
out of the subnet. (This has already been done in |
| 1312 |
many areas where ASCII is not adequate for the |
| 1313 |
local language(s).) Beware that articles contain- |
| 1314 |
ing non-ASCII octets in headers are a violation of |
| 1327 |
the MAIL specifications and are not valid MAIL |
| 1328 |
messages. MIME offers a way to encode non-ASCII |
| 1329 |
characters in ASCII for use in headers; see sec- |
| 1330 |
tion 4.5. |
| 1331 |
|
| 1332 |
NOTE: While there is great interest in using 8-bit |
| 1333 |
character sets, not all software can yet handle |
| 1334 |
them correctly. Hence the restriction to cooper- |
| 1335 |
ating subnets. MIME encodings can be used to |
| 1336 |
transmit such characters while remaining within |
| 1337 |
the octet restriction. |
| 1338 |
|
| 1339 |
In anticipation of the day when it is possible to use non- |
| 1340 |
ASCII characters safely anywhere, and to provide for the |
| 1341 |
(substantial) cooperating subnets that are already using |
| 1342 |
them, transmission paths SHOULD treat news articles as unin- |
| 1343 |
terpreted sequences of octets (except perhaps for transfor- |
| 1344 |
mations between EOL representations) and relayers SHOULD |
| 1345 |
treat non-ASCII characters in articles as ordinary charac- |
| 1346 |
ters. |
| 1347 |
|
| 1348 |
NOTE: 8-bit enthusiasts are warned that not all |
| 1349 |
software conforms to these recommendations yet. |
| 1350 |
In particular, standard NNTP [rrr] is a 7-bit pro- |
| 1351 |
tocol, and there may be implementations which |
| 1352 |
enforce this rule. Be warned, also, that it will |
| 1353 |
never be safe to send raw binary data in the body |
| 1354 |
of news articles, because changes of EOL represen- |
| 1355 |
tation may (will!) corrupt it. |
| 1356 |
|
| 1357 |
Except where cooperating subnets permit more direct |
| 1358 |
approaches, MIME [rrr] headers and encodings SHOULD be used |
| 1359 |
to transmit non-ASCII content using ASCII characters; see |
| 1360 |
section 4.5, appendix B, and the MIME RFCs for details. If |
| 1361 |
article content can be expressed in ASCII, it SHOULD be. |
| 1362 |
Failing that, the order of preference for character sets is |
| 1363 |
that described in MIME [rrr]. |
| 1364 |
|
| 1365 |
NOTE: Using the MIME facilities, it is possible to |
| 1366 |
transmit ANY character set, and ANY form of binary |
| 1367 |
data, using only ASCII characters. Equally impor- |
| 1368 |
tant, such articles are self-describing and the |
| 1369 |
reading agent can tell which octet-to-symbol map- |
| 1370 |
ping is intended! Designation of some preferred |
| 1371 |
character sets is intended to minimize the number |
| 1372 |
of character sets that a reading agent must under- |
| 1373 |
stand in order to display most articles properly. |
| 1374 |
|
| 1375 |
Articles containing non-ASCII characters, articles using |
| 1376 |
ASCII characters (values 0 through 127) to refer to non- |
| 1377 |
ASCII symbols, and articles using escape sequences to shift |
| 1378 |
character sets SHOULD include MIME headers indicating which |
| 1379 |
character set(s) and conventions are being used, and MUST do |
| 1380 |
so unless such articles are strictly confined to a |
| 1393 |
cooperating subnet which has its own pre-agreed conventions. |
| 1394 |
MIME encodings are preferred over all these techniques. If |
| 1395 |
it comes to a relayer's attention that it is being asked to |
| 1396 |
pass an article using such techniques outward across what it |
| 1397 |
knows to be the boundary of such a cooperating subnet, it |
| 1398 |
MUST report this error to its administrator, and MAY refuse |
| 1399 |
to pass the article beyond the subnet boundary. If it does |
| 1400 |
pass the article, it MUST re-encode it with MIME encodings |
| 1401 |
to make it conform to this Draft. |
| 1402 |
|
| 1403 |
NOTE: Such re-encoding is a non-trivial task, due |
| 1404 |
to MIME rules such as the prohibition of nested |
| 1405 |
encodings. It's not just a matter of pouring the |
| 1406 |
body through a simple filter. |
| 1407 |
|
| 1408 |
Reading agents SHOULD note MIME headers and attempt to show |
| 1409 |
the reader the closest possible approximation to the |
| 1410 |
intended content. They SHOULD not just send the octets of |
| 1411 |
the article to the output device unaltered, unless there is |
| 1412 |
reason to believe that the output device will indeed inter- |
| 1413 |
pret them correctly. Reading agents MUST not pass ASCII |
| 1414 |
control characters or escape sequences, other than as dis- |
| 1415 |
cussed above, unaltered to the output device; only by chance |
| 1416 |
would the result be the desired one, and there is serious |
| 1417 |
potential for harmful side effects, either accidental or |
| 1418 |
malicious. |
| 1419 |
|
| 1420 |
NOTE: Exactly what to do with unwanted control |
| 1421 |
characters/sequences depends on the philosophy of |
| 1422 |
the reading agent, but passing them straight to |
| 1423 |
the output device is almost always wrong. If the |
| 1424 |
reading agent wants to mark the presence of such a |
| 1425 |
character/sequence in circumstances where only |
| 1426 |
ASCII printable characters are available, trans- |
| 1427 |
lating it to "#" might be a suitable method; "#" |
| 1428 |
is a conspicuous character seldom used in normal |
| 1429 |
text. |
| 1430 |
|
| 1431 |
NOTE: Reading agents should be aware that many old |
| 1432 |
output devices (or the transmission paths to them) |
| 1433 |
zero out the top bit of octets sent to them. This |
| 1434 |
can transform non-ASCII characters into ASCII con- |
| 1435 |
trol characters. |
| 1436 |
|
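NOTE: As a purely illustrative sketch (Python, not part of this
Draft), a reading agent limited to ASCII printable output might
mark unwanted control characters as shown here. The function
names are invented for the example, and the choice of tab and
newline as the only control characters passed through is an
assumption; the set actually permitted is whatever the earlier
discussion in this section allows.

```python
# Hypothetical helpers; the names and the ALLOWED_CONTROLS set are assumptions.
ALLOWED_CONTROLS = {"\t", "\n"}


def is_ascii_control(code: int) -> bool:
    """True for ASCII control codes: 0 through 31, and 127 (DEL)."""
    return code < 32 or code == 127


def sanitize_for_display(text: str, marker: str = "#") -> str:
    """Replace each unwanted ASCII control character with a visible marker."""
    return "".join(
        marker if is_ascii_control(ord(ch)) and ch not in ALLOWED_CONTROLS else ch
        for ch in text
    )


def becomes_control_if_top_bit_stripped(octet: int) -> bool:
    """True if a non-ASCII octet turns into a control code once bit 7 is zeroed."""
    return octet >= 128 and is_ascii_control(octet & 0x7F)
```

For instance, the ESC in "a\x1b[2Jb" is shown as "#", and octet
0x9B would reach a top-bit-stripping device as ESC (0x1B).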
| 1437 |
Followup agents MUST be careful to apply appropriate trans- |
| 1438 |
formations of representation to the outbound followup as |
| 1439 |
well as the inbound precursor. A followup to an article |
| 1440 |
containing non-ASCII material is very likely to contain non- |
| 1441 |
ASCII material itself. |
| 1442 |
|
|
| 1459 |
4.5. Non-ASCII Characters In Headers |
| 1460 |
|
| 1461 |
All octets found in headers MUST be ASCII characters. However,
it is desirable to have a way of encoding non-ASCII characters,
especially in "human-readable" headers such as Subject. MIME
[rrr] provides a way to do this. Full details may be found in
the MIME specifications; what follows is a quick summary to
alert software authors to the issues:
| 1467 |
|
| 1468 |
    encoded-word = "=?" charset "?" encoding "?" codes "?="
    charset      = 1*tag-char
    encoding     = 1*tag-char
    tag-char     = <ASCII printable character except !()<>@,;:\"[]/?=>
    codes        = 1*code-char
    code-char    = <ASCII printable character except ?>
| 1474 |
|
| 1475 |
An encoded word is a sequence of ASCII printable characters |
| 1476 |
that specifies the character set, encoding method, and bits |
| 1477 |
of (potentially) non-ASCII characters. Encoded words are |
| 1478 |
allowed only in certain positions in certain headers. Spe- |
| 1479 |
cific headers impose restrictions on the content of encoded |
| 1480 |
words beyond that specified in this section. Posting agents |
| 1481 |
MUST ensure that any material resembling an encoded word |
| 1482 |
(complete with all delimiters), in a context where encoded |
| 1483 |
words may appear, really is an encoded word. |
| 1484 |
|
| 1485 |
NOTE: The syntax is a bit ugly, but it was |
| 1486 |
designed to minimize chances of confusion with |
| 1487 |
legitimate header contents, and to satisfy diffi- |
| 1488 |
cult constraints on use within existing headers. |
| 1489 |
|
| 1490 |
An encoded word MUST not be more than 75 octets long. Each |
| 1491 |
line of a header containing encoded word(s) MUST be at most |
| 1492 |
76 octets long, not counting the EOL. |
| 1493 |
|
| 1494 |
NOTE: These limits are meant to bound the looka- |
| 1495 |
head needed to determine whether text that begins |
| 1496 |
"=?" is really an encoded word. |
| 1497 |
|
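NOTE: The grammar and the 75-octet bound translate fairly
directly into a recognizer. The Python sketch below is an
illustration only, not part of this Draft. It assumes that
"ASCII printable character" means codes 33 through 126 (space
excluded); the names ENCODED_WORD and parse_encoded_word are
invented for the example.

```python
import re

# Assumption: "ASCII printable" is taken as codes 33..126, excluding space.
_PRINTABLE = [chr(c) for c in range(33, 127)]
_TAG_CLASS = "[" + re.escape(
    "".join(c for c in _PRINTABLE if c not in '!()<>@,;:\\"[]/?=')
) + "]"
_CODE_CLASS = "[" + re.escape("".join(c for c in _PRINTABLE if c != "?")) + "]"

# encoded-word = "=?" charset "?" encoding "?" codes "?="
ENCODED_WORD = re.compile(
    r"=\?(%s+)\?(%s+)\?(%s+)\?=" % (_TAG_CLASS, _TAG_CLASS, _CODE_CLASS)
)

MAX_ENCODED_WORD = 75  # octets, from the text above


def parse_encoded_word(word: str):
    """Return (charset, encoding, codes), or None if word is not an encoded word."""
    if len(word) > MAX_ENCODED_WORD:
        return None
    match = ENCODED_WORD.fullmatch(word)
    return match.groups() if match else None
```

Because neither <tag-char> nor <code-char> admits "?", the three
fields can be split unambiguously, and the length bound caps the
lookahead needed before giving up on text that begins "=?".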
| 1498 |
The details of charsets and encodings are defined by MIME |
| 1499 |
[rrr]; the sequence of preferred character sets is the same |
| 1500 |
as MIME's. Encoded words SHOULD not be used for content |
| 1501 |
expressible in ASCII. |
| 1502 |
|
| 1503 |
When an encoded word is used, other than in a newsgroup name |
| 1504 |
(see section 5.5), it MUST be separated from any adjacent |
| 1505 |
non-space characters (including other encoded words) by |
| 1506 |
white space. Reading agents displaying the contents of |
| 1507 |
encoded words (as opposed to their encoded form) should |
| 1508 |
ignore white space adjacent to encoded words. |
| 1509 |
|
| 1510 |
UNRESOLVED ISSUE: Should this section be deleted |
| 1511 |
entirely, or made much more terse? The material |
| 1512 |
is relevant, but too complex to discuss fully. |
| 1513 |
|
| 1525 |
NOTE: The deletion of intervening white space per- |
| 1526 |
mits using multiple encoded words, implicitly con- |
| 1527 |
catenated by the deletion, to encode text that |
| 1528 |
will not fit within a single 75-character encoded |
| 1529 |
word. |
| 1530 |
|
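NOTE: The implicit concatenation can be illustrated with a short
Python sketch (an illustration only, not part of this Draft).
It assumes the MIME base64 ("B") encoding and the ISO-8859-1
character set, and it splits on character boundaries so that
each encoded word can be decoded on its own; encode_words and
decode_words are names invented for the example.

```python
import base64


def encode_words(text: str, charset: str = "ISO-8859-1", limit: int = 75) -> str:
    """Encode text as space-separated "B" encoded words of at most limit octets."""
    overhead = len("=?%s?B??=" % charset)
    chunks, current = [], ""
    for ch in text:
        candidate = current + ch
        # base64 turns every 3 octets (or part thereof) into 4 characters.
        encoded_len = 4 * ((len(candidate.encode(charset)) + 2) // 3)
        if current and overhead + encoded_len > limit:
            chunks.append(current)
            current = ch
        else:
            current = candidate
    if current:
        chunks.append(current)
    return " ".join(
        "=?%s?B?%s?=" % (charset, base64.b64encode(c.encode(charset)).decode("ascii"))
        for c in chunks
    )


def decode_words(words: str) -> str:
    """Decode the output of encode_words, dropping the white space between words."""
    parts = []
    for word in words.split():
        _, charset, _encoding, codes, _ = word.split("?")
        parts.append(base64.b64decode(codes).decode(charset))
    return "".join(parts)
```

A real posting agent would also have to fold the header so that
each line containing encoded words stays within 76 octets; the
sketch leaves that out.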
| 1531 |
Reading-agent implementors are warned that although this |
| 1532 |
Draft completely specifies where encoded words may appear in |
| 1533 |
the headers it defines, there are other headers (e.g. the |
| 1534 |
MIME Content-Description header) that MAY contain them. |
| 1535 |
|
| 1536 |
|
| 1537 |
4.6. Size Limits |
| 1538 |
|
| 1539 |
Implementations SHOULD avoid fixed constraints on the sizes |
| 1540 |
of lines within an article and on the size of the entire |
| 1541 |
article. |
| 1542 |
|
| 1543 |
Relayers SHOULD treat the body of an article as an uninter- |
| 1544 |
preted sequence of octets (except as mandated by changes of |
| 1545 |
EOL representation and processing of control messages), not |
| 1546 |
to be altered or constrained in any way. |
| 1547 |
|
| 1548 |
If it is absolutely necessary for an implementation to |
| 1549 |
impose a limit on the length of header lines, body lines, or |
| 1550 |
header logical lines, that limit shall be at least 1000 |
| 1551 |
octets, including EOL representations. Relayers and trans- |
| 1552 |
mission paths confronted with lines beyond their internal |
| 1553 |
limits (if any) MUST not simply inject EOLs at random |
| 1554 |
places; they MAY break headers (as described in 4.2.3) as a |
| 1555 |
last resort, and otherwise they MUST either pass the long |
| 1556 |
lines through unaltered, or refuse to pass the article at |
| 1557 |
all (see section 9.1 for further discussion). |
| 1558 |
|
| 1559 |
NOTE: The limit here is essentially the same mini- |
| 1560 |
mum as that specified for SMTP mail in RFC 821 |
| 1561 |
[rrr]. Implementors are warned that Path (see |
| 1562 |
section 5.6) and References (see section 6.5) |
| 1563 |
headers, in particular, often become several hun- |
| 1564 |
dred characters long, so 1000 is not an overly |
| 1565 |
generous limit. |
| 1566 |
|
| 1567 |
All implementations MUST be able to handle an article |
| 1568 |
totalling at least 65,000 octets, including headers and EOL |
| 1569 |
representations, gracefully and efficiently. All implemen- |
| 1570 |
tations SHOULD be able to handle an article totalling at |
| 1571 |
least 1,000,000 (one million) octets, including headers and |
| 1572 |
EOL representations, gracefully and efficiently. "Grace- |
| 1573 |
fully and efficiently" is intended to preclude not only |
| 1574 |
failures, but also major loss of performance, serious prob- |
| 1575 |
lems in error recovery, or resource consumption beyond what |
| 1576 |
is reasonably necessary. |
| 1577 |
|
| 1591 |
NOTE: The intent here is to prohibit lowering the |
| 1592 |
existing de-facto limit any further, while |
| 1593 |
strongly encouraging movement towards a higher |
| 1594 |
one. Actually, although improvements are desir- |
| 1595 |
able in some cases, much news software copes rea- |
| 1596 |
sonably well with very large articles. The same |
| 1597 |
cannot be said of the communications software and |
| 1598 |
protocols used to transmit news from one host to |
| 1599 |
another, especially when slow communications links |
| 1600 |
are involved. Occasional huge articles that |
| 1601 |
appear now (by accident or through ignorance) typ- |
| 1602 |
ically leave trails of failing software, system |
| 1603 |
problems, and irate administrators in their wake. |
| 1604 |
|
| 1605 |
NOTE: It is intended that the successor to this |
| 1606 |
Draft will raise the "MUST" limit to 1,000,000 and |
| 1607 |
the "SHOULD" limit still further. |
| 1608 |
|
| 1609 |
Posters SHOULD limit posted articles to at most 60,000 |
| 1610 |
octets, including headers and EOL representations, unless |
| 1611 |
the articles are being posted only within a cooperating sub- |
| 1612 |
net which is known to be capable of handling larger articles |
| 1613 |
gracefully. Posting agents presented with a large article |
| 1614 |
SHOULD warn the poster and request confirmation. |
| 1615 |
|
| 1616 |
NOTE: The difference between this and the earlier |
| 1617 |
"MUST" limit is margin for header growth, differ- |
| 1618 |
ing EOL representations, and transmission over- |
| 1619 |
heads. |
| 1620 |
|
| 1621 |
NOTE: Disagreeable though these limits are, it is |
| 1622 |
a fact that in current networks, an article larger |
| 1623 |
than 64K (after header growth etc.) simply is not |
| 1624 |
transmitted reliably. Note also the comments |
| 1625 |
above on the trauma caused by single extremely- |
| 1626 |
large articles now; the problems are real and cur- |
| 1627 |
rent. These problems arguably should be fixed, |
| 1628 |
but this will not happen network-wide in the imme- |
| 1629 |
diate future. Hence the restriction of larger |
| 1630 |
articles to cooperating subnets, for now. |
| 1631 |
|
| 1632 |
Posters using non-ASCII characters in their text MUST take |
| 1633 |
into account the overhead involved in MIME encoding, unless |
| 1634 |
the article's propagation will be entirely limited to a |
| 1635 |
cooperating subnet which does not use MIME encodings for |
| 1636 |
non-ASCII characters. For example, MIME base64 encoding |
| 1637 |
involves growth by a factor of approximately 4/3, so an |
| 1638 |
article which would likely have to use this encoding should |
| 1639 |
be at most about 45,000 octets before encoding. |
| 1640 |
|
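NOTE: The arithmetic can be checked with a few lines of Python
(an illustration only, not part of this Draft). The sketch uses
the standard base64 module and assumes 76-character encoded
lines, each ended by a one-octet EOL; the constant and function
names are invented for the example.

```python
import base64

POSTER_LIMIT = 60_000  # octets; the SHOULD limit for posted articles given above


def base64_body_size(raw_body: bytes) -> int:
    """Size of the body after base64 encoding, including line breaks."""
    # encodebytes() emits lines of up to 76 characters, each ending in "\n".
    return len(base64.encodebytes(raw_body))


def within_poster_limit(raw_body: bytes, header_octets: int) -> bool:
    """True if the headers plus the encoded body stay within POSTER_LIMIT."""
    return header_octets + base64_body_size(raw_body) <= POSTER_LIMIT
```

Under these assumptions, 45,000 raw octets become exactly 60,000
base64 characters, or 60,790 octets once the 790 line breaks are
counted, which is why the figure is only approximate and leaves
no room for headers.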
| 1641 |
Posters SHOULD use MIME "message/partial" conventions to |
| 1642 |
facilitate automatic reassembly of a large document split |
| 1643 |
into smaller pieces for posting. It is recommended that the |
| 1644 |
content identifier used should be a message ID, generated by |
the same means as article message IDs (see section 5.3), and |
| 1658 |
that all parts should have a See-Also header (see section |
| 1659 |
6.16) giving the message IDs of at least the previous parts |
| 1660 |
and preferably all the parts. |
| 1661 |
|
| 1662 |
NOTE: See-Also is more correct for this purpose |
| 1663 |
than References, although References is in common |
| 1664 |
use today (with less-formal reassembly arrange- |
| 1665 |
ments). MIME reassemblers should probably examine |
| 1666 |
articles suggested by References headers if See- |
| 1667 |
Also headers are not present to indicate the |
| 1668 |
whereabouts of the other parts of "mes- |
| 1669 |
sage/partial" articles. |
| 1670 |
|
| 1671 |
To repeat: implementations SHOULD avoid fixed constraints on |
| 1672 |
the sizes of lines within an article and on the size of the |
| 1673 |
entire article. |
| 1674 |
|
| 1675 |
|
| 1676 |
4.7. Example |
| 1677 |
|
| 1678 |
Here is a sample article: |
| 1679 |
|
| 1680 |
    From: jerry@eagle.ATT.COM (Jerry Schwarz)
    Path: cbosgd!mhuxj!mhuxt!eagle!jerry
    Newsgroups: news.announce
    Subject: Usenet Etiquette -- Please Read
    Message-ID: <642@eagle.ATT.COM>
    Date: Mon, 17 Jan 1994 11:14:55 -0500 (EST)
    Followup-To: news.misc
    Expires: Wed, 19 Jan 1994 00:00:00 -0500
    Organization: AT&T Bell Laboratories, Murray Hill

    body
    body
    body
| 1693 |
|
| 1694 |
|
| 1695 |
|
| 1696 |
5. Mandatory Headers |
| 1697 |
|
| 1698 |
An article MUST have one, and only one, of each of the fol- |
| 1699 |
lowing headers: Date, From, Message-ID, Subject, Newsgroups, |
| 1700 |
Path. |
| 1701 |
|
| 1702 |
NOTE: MAIL specifies (if read most carefully) that |
| 1703 |
there must be exactly one Date header and exactly |
| 1704 |
one From header, but otherwise does not restrict |
| 1705 |
multiple appearances of headers. (Notably, it |
| 1706 |
permits multiple Message-ID headers!) This |
| 1707 |
appears singularly useless, or even harmful, in |
| 1708 |
the context of news, and much current news soft- |
| 1709 |
ware will not tolerate multiple appearances of |
| 1710 |
mandatory headers. |
| 1711 |
|
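NOTE: The one-and-only-one rule lends itself to a mechanical
check. The following Python sketch is illustrative only and not
part of this Draft. The function names are invented for the
example, and it assumes that header names compare
case-insensitively, that continuation lines begin with a blank
or tab, and that a bare newline stands for the EOL.

```python
from collections import Counter

MANDATORY = ("Date", "From", "Message-ID", "Subject", "Newsgroups", "Path")


def header_counts(article: str) -> Counter:
    """Count header names, stopping at the empty line that ends the headers."""
    counts = Counter()
    for line in article.split("\n"):
        if line == "":
            break  # end of headers; the rest is body
        if line[0] in " \t":
            continue  # continuation of the previous header
        name, colon, _ = line.partition(":")
        if colon:
            counts[name.lower()] += 1
    return counts


def mandatory_header_problems(article: str) -> list:
    """Return the mandatory headers that do not appear exactly once."""
    counts = header_counts(article)
    return [name for name in MANDATORY if counts[name.lower()] != 1]
```

An empty result means every mandatory header is present exactly
once; anything listed is either missing or duplicated.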
| 1723 |
Note also that there are situations, discussed in the rele- |
| 1724 |
vant parts of section 6, where References, Sender, or |
| 1725 |
Approved headers are mandatory. |
| 1726 |
|
| 1727 |
In the discussions of the individual headers, the content of |
| 1728 |
each is specified using the syntax notation. The convention |
| 1729 |
used is that the content of, for example, the Subject header |
| 1730 |
is defined as <Subject-content>. |
| 1731 |
|
| 1732 |
|
| 1733 |
5.1. Date |
| 1734 |
|
| 1735 |
The Date header contains the date and time when the article |
| 1736 |
was submitted for transmission: |
| 1737 |
|
| 1738 |
    Date-content = [ weekday "," space ] date space time
    weekday      = "Mon" / "Tue" / "Wed" / "Thu"
                 / "Fri" / "Sat" / "Sun"
    date         = day space month space year
    day          = 1*2digit
    month        = "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun"
                 / "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec"
    year         = 4digit / 2digit
    time         = hh ":" mm [ ":" ss ] space timezone
    timezone     = "UT" / "GMT"
                 / ( "+" / "-" ) hh mm [ space "(" zone-name ")" ]
    hh           = 2digit
    mm           = 2digit
    ss           = 2digit
    zone-name    = 1*( <ASCII printable character except ()\> / space )
| 1753 |
|
| 1754 |
This is a restricted subset of the MAIL date format. |
| 1755 |
|
| 1756 |
If a weekday is given, it MUST be consistent with the date. |
| 1757 |
The modern Gregorian calendar is used, and dates MUST be |
| 1758 |
consistent with its usual conventions; for example, if the |
| 1759 |
month is May, the day must be between 1 and 31 inclusive. |
| 1760 |
The year SHOULD be given as four digits, and posting agents |
| 1761 |
SHOULD enforce this; however, relayers MUST accept the two- |
| 1762 |
digit form, and MUST interpret it as having the implicit |
| 1763 |
prefix "19". |
| 1764 |
|
| 1765 |
NOTE: Two-digit year numbers can, should, and must |
| 1766 |
be phased out by 1999. |
| 1767 |
|
| 1768 |
The time is given on the 24-hour clock, e.g. two hours |
| 1769 |
before midnight is "22:00" or "22:00:00". The hh must be |
| 1770 |
between 00 and 23 inclusive, the mm between 0 and 59 inclu- |
| 1771 |
sive, and the ss between 0 and 61 inclusive. |
| 1772 |
|
| 1773 |
NOTE: Leap seconds very occasionally result in |
| 1774 |
minutes that are 61 or 62 seconds long. |
| 1775 |
|
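NOTE: The syntax and the consistency rules above can be combined
into one check. The Python sketch below is an illustration only,
not part of this Draft. It assumes that <space> is exactly one
blank, and it leans on the standard datetime module for the
Gregorian calendar; DATE_RE and is_valid_date_content are names
invented for the example.

```python
import re
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]  # date.weekday() order

# Assumption: <space> is read as exactly one blank.
DATE_RE = re.compile(
    r"(?:(?P<weekday>%s), )?"
    r"(?P<day>\d{1,2}) (?P<month>%s) (?P<year>\d{4}|\d{2}) "
    r"(?P<hh>\d{2}):(?P<mm>\d{2})(?::(?P<ss>\d{2}))? "
    r"(?:UT|GMT|[+-]\d{4}(?: \([ -'*-\[\]-~]+\))?)"
    % ("|".join(WEEKDAYS), "|".join(MONTHS)),
    re.ASCII,
)


def is_valid_date_content(content: str) -> bool:
    """Check a Date-content against the grammar and the rules in the text."""
    m = DATE_RE.fullmatch(content)
    if m is None:
        return False
    year = int(m["year"])
    if len(m["year"]) == 2:
        year += 1900  # two-digit years carry the implicit prefix "19"
    try:
        day = date(year, MONTHS.index(m["month"]) + 1, int(m["day"]))
    except ValueError:  # e.g. 31 Feb, or day 0
        return False
    if m["weekday"] is not None and WEEKDAYS[day.weekday()] != m["weekday"]:
        return False
    ss = int(m["ss"]) if m["ss"] is not None else 0
    return int(m["hh"]) <= 23 and int(m["mm"]) <= 59 and ss <= 61
```

The Date header of the sample article in section 4.7 passes this
check; the same date labelled "Tue" does not, because the weekday
disagrees with the date.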
| 1789 |
The date and time SHOULD be given in the poster's local |
| 1790 |
timezone, including a specification of that timezone as a |
| 1791 |
numeric offset (which SHOULD include the timezone name, e.g. |
| 1792 |
"EST", supplied in parentheses like a MAIL comment). If |
| 1793 |
not, they MUST be given in Universal Time (abbreviated "UT"; |
| 1794 |
"GMT" is a historical synonym for "UT"). The timezone name |
| 1795 |
in parentheses, if present, is a comment; software MUST |
| 1796 |
ignore it, except that reading agents might wish to display |
| 1797 |
it to the reader. Timezone names other than "UT" and "GMT" |
| 1798 |
MUST appear only in the comment. |
| 1799 |
|
| 1800 |
NOTE: Attempts to deal with a full set of timezone |
| 1801 |
names have all foundered on the vast number of |
| 1802 |
such names in use and the duplications (for exam- |
| 1803 |
ple, there are at least FIVE different timezones |
| 1804 |
called "EST" by somebody). Even the limited set |
| 1805 |
of North American zone names authorized by MAIL is |
| 1806 |
subject to confusion and misinterpretation. Hence |
| 1807 |
the flat ban on non-UT timezone names except as |
| 1808 |
comments. |
| 1809 |
|
| 1810 |
NOTE: RFC 1036 specified that use of GMT (aka UT, |
| 1811 |
UTC) was preferred. However, the local time (in |
| 1812 |
the poster's timezone) is arguably information of |
| 1813 |
possible interest to the reader, and this requires |
| 1814 |
some indication of the poster's timezone. Numeric |
| 1815 |
offsets are an unambiguous way of doing this, and |
| 1816 |
their use was indeed sanctioned by RFC 1036 (that |
| 1817 |
is, this is a change of preference only). |
| 1818 |
|
| 1819 |
NOTE: There is frequent confusion, including |
| 1820 |
errors in some news software, regarding the sign |
| 1821 |
of numeric timezones. Zones west of Greenwich |
| 1822 |
have negative offsets. For example, North Ameri- |
| 1823 |
can Eastern Standard Time is zone -0500 and North |
| 1824 |
American Eastern Daylight Time is zone -0400. |
| 1825 |
|
| 1826 |
NOTE: Implementors are warned that the hh in a |
| 1827 |
timezone can go up to about 14; it is not limited |
| 1828 |
to 12. This is because the International Date |
| 1829 |
Line does not run exactly along the boundary |
| 1830 |
between zone -1200 and zone +1200. |
| 1831 |
|
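NOTE: Since the sign convention is a frequent source of errors,
a small worked example may help. The Python sketch below is an
illustration only, not part of this Draft; parse_timezone is a
name invented for the example, and its input is assumed to have
already been checked against the <timezone> syntax.

```python
from datetime import datetime, timedelta, timezone


def parse_timezone(tz: str) -> timezone:
    """Convert a <timezone> such as "UT", "-0500" or "-0500 (EST)" to a tzinfo."""
    zone = tz.split(" ", 1)[0]  # the parenthesized name is a comment; ignore it
    if zone in ("UT", "GMT"):
        return timezone.utc
    sign = -1 if zone[0] == "-" else 1  # west of Greenwich is negative
    offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[3:5]))
    return timezone(sign * offset)


# 11:14:55 at -0500 (west of Greenwich) is 16:14:55 Universal Time.
posted = datetime(1994, 1, 17, 11, 14, 55, tzinfo=parse_timezone("-0500 (EST)"))
as_ut = posted.astimezone(timezone.utc)
```

Offsets beyond 12 hours, such as "+1400", are handled the same
way as any other offset.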
| 1832 |
NOTE: The comments in section 2.6 regarding trans- |
| 1833 |
lation to other languages are relevant here. The |
| 1834 |
Date-content format, and the spellings of its com- |
| 1835 |
ponents, as found in articles themselves, are |
| 1836 |
always as defined in this Draft, regardless of the |
| 1837 |
language used to interact with readers and |
| 1838 |
posters. Reading and posting agents should trans- |
| 1839 |
late as appropriate. Actually, even English- |
| 1840 |
language reading and posting agents will probably |
| 1841 |
want to do some degree of translation on dates, if |
| 1842 |
only to abbreviate the lengthy format and |
(perhaps) translate to and from the reader's time- |
| 1856 |
zone. |
| 1857 |
|
| 1858 |
|
| 1859 |
5.2. From |
| 1860 |
|
| 1861 |
The From header contains the electronic address, and possi- |
| 1862 |
bly the full name, of the article's author: |
| 1863 |
|
| 1864 |
    From-content  = address [ space "(" paren-phrase ")" ]
                  / [ plain-phrase space ] "<" address ">"
    paren-phrase  = 1*( paren-char / space / encoded-word )
    paren-char    = <ASCII printable character except ()<>\>
    plain-phrase  = plain-word *( space plain-word )
    plain-word    = unquoted-word / quoted-word / encoded-word
    unquoted-word = 1*unquoted-char
    unquoted-char = <ASCII printable character except !()<>@,;:\".[]>
    quoted-word   = quote 1*( quoted-char / space ) quote
    quote         = <" (ASCII 34)>
    quoted-char   = <ASCII printable character except "()<>\>
    address       = local-part "@" domain
    local-part    = unquoted-word *( "." unquoted-word )
    domain        = unquoted-word *( "." unquoted-word )
| 1878 |
|
| 1879 |
(Encoded words are described in section 4.5.) The full name |
| 1880 |
is distinguished from the electronic address either by |
| 1881 |
enclosing the former in parentheses (making it resemble a |
| 1882 |
MAIL comment, after the address) or by enclosing the latter |
| 1883 |
in angle brackets. The second form is preferred. In the |
| 1884 |
first form, encoded words inside the full name MUST be com- |
| 1885 |
posed entirely of <paren-char>s. In the second form, |
| 1886 |
encoded words inside the full name may not contain charac- |
| 1887 |
ters other than letters (of either case), digits, and the |
| 1888 |
characters "!", "*", "+", "-", "/", "=", and "_". The local |
| 1889 |
part is case-sensitive (except that all case counterparts of |
| 1890 |
"postmaster" are deemed equivalent), the domain is case- |
| 1891 |
insensitive, and all other parts of the From content are |
| 1892 |
comments which MUST be ignored by news software (except |
| 1893 |
insofar as reading agents may wish to display them to the |
| 1894 |
reader). Posters and posting agents MUST restrict them- |
| 1895 |
selves to this subset of the MAIL From syntax; relayers MAY |
| 1896 |
accept a broader subset, but see the discussion in section |
| 1897 |
9.1. |
| 1898 |
|
| 1899 |
NOTE: The syntax here is a restricted subset of |
| 1900 |
the MAIL From syntax, with quoting particularly |
| 1901 |
restricted, for simple parsing. In particular, |
| 1902 |
the presence of "<" in the From content indicates |
| 1903 |
that the second form is being used, otherwise the |
| 1904 |
first form is being used. The major restrictions |
| 1905 |
here are those already de-facto imposed by exist- |
| 1906 |
ing software. |
| 1907 |
|
| 1921 |
NOTE: Overly-lenient posting agents sometimes per- |
| 1922 |
mit the second form with a full name containing |
| 1923 |
"(" or ")", but it is extremely rare for a full |
| 1924 |
name to contain "<" or ">" even in mail. Accord- |
| 1925 |
ingly, reading agents wishing to robustly deter- |
| 1926 |
mine which form is in use in a particular article |
| 1927 |
should key on the presence or absence of "<", not |
| 1928 |
the presence or absence of "(". |
| 1929 |
|
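NOTE: A reading agent following this advice might separate the
address from the full name, and compare addresses, roughly as in
the Python sketch below. It is an illustration only, not part of
this Draft: the function names are invented, the input is
assumed to be a well-formed From-content, and the handling of
quoted words is deliberately simplified.

```python
def split_from(content: str):
    """Split a From-content into (address, full name or None)."""
    if "<" in content:
        # Second form: [ plain-phrase space ] "<" address ">"
        phrase, _, rest = content.partition("<")
        address = rest.partition(">")[0]
        name = phrase.strip().strip('"')  # simplification: strips outer quotes only
        return address, name or None
    # First form: address [ space "(" paren-phrase ")" ]
    address, _, rest = content.partition(" (")
    name = rest[:-1] if rest.endswith(")") else None
    return address, name


def same_address(a: str, b: str) -> bool:
    """Compare addresses: local part case-sensitive, domain case-insensitive."""
    local_a, _, domain_a = a.rpartition("@")
    local_b, _, domain_b = b.rpartition("@")
    if local_a.lower() == "postmaster" and local_b.lower() == "postmaster":
        locals_match = True  # all case counterparts of "postmaster" are equivalent
    else:
        locals_match = local_a == local_b
    return locals_match and domain_a.lower() == domain_b.lower()
```

Keying on "<" means that a second-form header whose full name
wrongly contains "(" is still split correctly.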
| 1930 |
The address SHOULD be a valid and complete Internet domain |
| 1931 |
address, capable of being successfully mailed to by an |
| 1932 |
Internet host (possibly via an MX record and a forwarder). |
| 1933 |
The pseudo-domain ".uucp" MAY be used for hosts registered |
| 1934 |
in the UUCP maps (e.g. name "xyz.uucp" for registered site |
| 1935 |
"xyz"), but such hosts SHOULD discontinue this usage (either |
| 1936 |
by arranging a proper Internet address and forwarder, or by |
| 1937 |
using the "% hack" (see below)), as soon as possible. Bit- |
| 1938 |
net hosts SHOULD use Internet addresses, avoiding the obso- |
| 1939 |
lescent ".bitnet" pseudo-domain. Other forms of address |
| 1940 |
MUST not be used. |
| 1941 |
|
| 1942 |
NOTE: "Other forms" specifically include UK-style |
| 1943 |
"backward" domains ("uk.oxbridge.cs" is in the |
| 1944 |
Czech Republic, not the UK), pure-UUCP addressing |
| 1945 |
("knee!shin!foot" instead of |
| 1946 |
"foot%shin@knee.uucp"), and abbreviated domains |
| 1947 |
("zebra.zoo" instead of "zebra.zoo.toronto.edu"). |
| 1948 |
|
| 1949 |
If it is necessary to use the local part to specify a rout- |
| 1950 |
ing relative to the nearest Internet host, this MUST be done |
| 1951 |
using the "% hack", using "%" as a secondary "@". For exam- |
| 1952 |
ple, to specify that mail to the address should go to Inter- |
| 1953 |
net host "foo.bar.edu", then to non-Internet host "ein", |
| 1954 |
then to non-Internet host "deux", for delivery there to |
| 1955 |
mailbox "fred", a suitable address would be: |
| 1956 |
|
| 1957 |
    fred%deux%ein@foo.bar.edu
| 1958 |
|
| 1959 |
Analogous forms using "!" in the local part MUST not be |
| 1960 |
used, as they are ambiguous; they should be expressed in the |
| 1961 |
"%" form. |
| 1962 |
|
| 1963 |
NOTE: "a!b@c" can be interpreted as either "b%c@a" |
| 1964 |
or "b%a@c", and there is no consistency in which |
| 1965 |
choice is made. Such addresses consequently are |
| 1966 |
unreliable. The "%" form does not suffer from |
| 1967 |
this problem, and although its use is officially |
| 1968 |
discouraged, it is a de-facto standard, to the |
| 1969 |
point that MAIL recognizes it. |
| 1970 |
|
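NOTE: The way a "%"-hack address unwinds can be shown in a few
lines of Python. This is an illustration only, not part of this
Draft; expand_route is a name invented for the example, and it
models the convention that each relay promotes the rightmost "%"
of the local part to "@".

```python
def expand_route(address: str):
    """Return (hosts in delivery order, final mailbox) for a "%"-hack address."""
    local, _, first_host = address.rpartition("@")
    hosts = [first_host]
    while "%" in local:
        # Each relay promotes the rightmost "%" to "@" and forwards there.
        local, _, next_host = local.rpartition("%")
        hosts.append(next_host)
    return hosts, local


hosts, mailbox = expand_route("fred%deux%ein@foo.bar.edu")
```

For the example address above, hosts is foo.bar.edu, ein, deux
(in that order) and mailbox is fred. Unlike the "!" forms, there
is only one way to read it.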
| 1971 |
Relayers MUST not, repeat MUST not, repeat MUST not, rewrite |
| 1972 |
From lines, in any way, however minor or innocent-seeming. |
| 1973 |
Trying to "fix" a non-conforming address has a very high |
| 1974 |
probability of making things worse. Either pass it along |
unchanged, or reject the article. |
| 1988 |
|
| 1989 |
NOTE: An additional reason for banning the use of |
| 1990 |
"!" addressing is that it has a much higher proba- |
| 1991 |
bility of being rewritten into mangled unrecogniz- |
| 1992 |
ability by old relayers. |
| 1993 |
|
| 1994 |
Posters and posting agents SHOULD avoid use of the charac- |
| 1995 |
ters "!" and "@" in full names, as they may trigger unwanted |
| 1996 |
header rewriting by old, simple-minded news software. |
| 1997 |
|
| 1998 |
NOTE: Also, the characters "." and ",", not infre- |
| 1999 |
quently found in names (e.g., "John W. Campbell, |
| 2000 |
Jr."), are NOT, repeat NOT, allowed in an unquoted |
| 2001 |
word. A From header like the following MUST not |
| 2002 |
be written without the quotation marks: |
| 2003 |
|
| 2004 |
    From: "John W. Campbell, Jr." <editor@analog.com>
| 2005 |
|
| 2006 |
|
| 2007 |
|
| 2008 |
5.3. Message-ID |
| 2009 |
|
| 2010 |
The Message-ID header contains the article's message ID, a |
| 2011 |
unique identifier distinguishing the article from every |
| 2012 |
other article: |
| 2013 |
|
| 2014 |
    Message-ID-content = message-id
    message-id         = "<" local-part "@" domain ">"
| 2016 |
|
| 2017 |
As with From addresses, a message ID's local part is case- |
| 2018 |
sensitive and its domain is case-insensitive. The "<" and |
| 2019 |
">" are parts of the message ID, not peculiarities of the |
| 2020 |
Message-ID header. |
| 2021 |
|
| 2022 |
NOTE: News message IDs are a restricted subset of |
| 2023 |
MAIL message IDs. In particular, no existing news |
| 2024 |
software copes properly with MAIL quoting conven- |
| 2025 |
tions within the local part, so they are forbid- |
| 2026 |
den. This is unfortunate, particularly for X.400 |
| 2027 |
gateways that often wish to include characters |
| 2028 |
which are not legal in unquoted message IDs, but |
| 2029 |
it is impossible to fix net-wide. See the notes |
| 2030 |
on gatewaying in section 10. |
| 2031 |
|
| 2032 |
The domain in the message ID SHOULD be the full Internet |
| 2033 |
domain name of the posting agent's host. Use of the ".uucp" |
| 2034 |
pseudo-domain (for hosts registered in the UUCP maps) or the |
| 2035 |
".bitnet" pseudo-domain (for Bitnet hosts) is permissible, |
| 2036 |
but SHOULD be avoided. |
| 2037 |
|
| 2038 |
Posters and posting agents MUST generate the local part of a |
| 2039 |
message ID using an algorithm which obeys the specified syn- |
| 2040 |
tax (words separated by ".", with certain characters not |
permitted) (see section 5.2 for details), and will not |
| 2054 |
repeat itself (ever). The algorithm SHOULD not generate |
| 2055 |
message IDs which differ only in case of letters. Note the |
| 2056 |
specification in section 6.5 of a recommended convention for |
| 2057 |
indicating subject changes. Otherwise the algorithm is up |
| 2058 |
to the implementor. |
| 2059 |
|
| 2060 |
NOTE: The crucial use of message IDs is to distin- |
| 2061 |
guish circulating articles from each other and |
| 2062 |
from articles circulated recently. They are also |
| 2063 |
potentially useful as permanent indexing keys, |
| 2064 |
hence the requirement for permanent uniqueness... |
| 2065 |
but indexers cannot absolutely rely on this |
| 2066 |
because the earlier RFCs urged it but did not |
| 2067 |
demand it. All major implementations have always |
| 2068 |
generated permanently-unique message IDs by |
| 2069 |
design, but in some cases this is sensitive to |
| 2070 |
proper administration, and duplicates may have |
| 2071 |
occurred by accident. |
| 2072 |
|
| 2073 |
NOTE: The most popular method of generating local |
| 2074 |
parts is to use the date and time, plus some way |
| 2075 |
of distinguishing between simultaneous postings on |
| 2076 |
the same host (e.g. a process number), and encode |
| 2077 |
them in a suitably-restricted alphabet. An older |
| 2078 |
but now less-popular alternative is to use a |
| 2079 |
sequence number, incremented each time the host |
| 2080 |
generates a new message ID; this is workable, but |
| 2081 |
requires careful design to cope properly with |
| 2082 |
simultaneous posting attempts, and is not as |
| 2083 |
robust in the presence of crashes and other mal- |
| 2084 |
functions. |
| 2085 |
|
| 2086 |
NOTE: Some buggy news software considers message |
| 2087 |
IDs completely case-insensitive, hence the advice |
| 2088 |
to avoid relying on case distinctions. The |
| 2089 |
restrictions placed on the "alphabet" of local |
| 2090 |
parts and domains in section 5.2 have the useful |
| 2091 |
side effect of making it unnecessary to parse mes- |
| 2092 |
sage IDs in complex ways to break them into case- |
| 2093 |
sensitive and case-insensitive portions. |
| 2094 |
|
| 2095 |
The local part of a message ID MUST not be "postmaster" or |
| 2096 |
any other string that would compare equal to "postmaster" in |
| 2097 |
a case-insensitive comparison. Message IDs MUST be no |
| 2098 |
longer than 250 octets, including the "<" and ">". |
| 2099 |
|
| 2100 |
NOTE: "Postmaster" is an irksome exception to |
| 2101 |
case-sensitivity in local parts, inherited from |
| 2102 |
MAIL, and simply avoiding it is the best way to |
| 2103 |
deal with it (not that it's likely, but the issue |
| 2104 |
needs to be dealt with). The length limit is |
| 2105 |
undesirable, but is present in widely-used exist- |
| 2106 |
ing software. The limit is actually 255, but a |
small safety margin is wise. |
| 2120 |
|
| 2121 |
|
| 2122 |
5.4. Subject |
| 2123 |
|
| 2124 |
The Subject header's content (the "subject" of the article) |
| 2125 |
is a short phrase describing the topic of the article: |
| 2126 |
|
| 2127 |
    Subject-content = [ "Re: " ] nonblank-text
| 2128 |
|
| 2129 |
Encoded words MAY appear in this header. |
| 2130 |
|
| 2131 |
If the article is a followup, the subject SHOULD begin with |
| 2132 |
"Re: " (a "back reference"). If the article is not a fol- |
| 2133 |
lowup, the subject MUST not begin with a back reference. |
| 2134 |
Back references are case-insensitive, although "Re: " is the |
| 2135 |
preferred form. A followup agent assisting a poster in |
| 2136 |
preparing a followup SHOULD prepend a back reference, UNLESS |
| 2137 |
the subject already begins with one. If the poster deter- |
| 2138 |
mines that the topic of the followup differs significantly |
| 2139 |
from what is described in the subject, a new, more descrip- |
| 2140 |
tive, subject SHOULD be substituted (with no back refer- |
| 2141 |
ence). An article whose subject begins with a back refer- |
| 2142 |
ence MUST have a References header referencing the precur- |
| 2143 |
sor. |
| 2144 |
|
| 2145 |
NOTE: A back reference is FOUR characters, the |
| 2146 |
fourth being a blank. RFC 1036 was confused about |
| 2147 |
this. Observe also that only ONE back reference |
| 2148 |
should be present. |
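
NOTE: The followup-agent behavior described above might be
sketched in Python as follows. The function names are
inventions of this example, not part of any existing
software. Observe that "Re:" without the trailing blank is
not treated as a back reference.

```python
BACK_REFERENCE = "Re: "  # four characters; the fourth is a blank


def has_back_reference(subject: str) -> bool:
    # Back references are case-insensitive.
    return subject[:4].lower() == BACK_REFERENCE.lower()


def followup_subject(precursor_subject: str) -> str:
    """Prepend a back reference unless the subject already begins with one."""
    if has_back_reference(precursor_subject):
        return precursor_subject
    return BACK_REFERENCE + precursor_subject
```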
| 2149 |
|
| 2150 |
NOTE: There is a semi-standard convention, often |
| 2151 |
used, in which a subject change is flagged by mak- |
| 2152 |
ing the new Subject-content of the form: |
| 2153 |
|
| 2154 |
new topic (was: old topic) |
| 2155 |
|
| 2156 |
possibly with "old topic" somewhat truncated. |
| 2157 |
Posters wishing to do something like this are |
| 2158 |
urged to use this exact form, to simplify auto- |
| 2159 |
mated analysis. |
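
NOTE: One possible form of the "automated analysis" that the
exact form is meant to simplify is sketched below in Python.
The regular expression and function name are assumptions of
this example; subjects not in the exact form are simply
reported as unrecognized.

```python
import re
from typing import Optional, Tuple

# Matches exactly: new topic (was: old topic)
_SUBJECT_CHANGE = re.compile(r"^(?P<new>.*\S) \(was: (?P<old>.*)\)$")


def split_subject_change(subject: str) -> Optional[Tuple[str, str]]:
    """Return (new topic, old topic), or None if the form is not used."""
    match = _SUBJECT_CHANGE.match(subject)
    if match is None:
        return None
    return match.group("new"), match.group("old")
```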
| 2160 |
|
| 2161 |
For historical reasons, the subject MUST not begin with |
| 2162 |
"cmsg " (note that this sequence ends with a blank). |
| 2163 |
|
| 2164 |
NOTE: Some old news software takes a subject |
| 2165 |
beginning with "cmsg " as an indication that the |
| 2166 |
article is a control message (see sections 6.6 and |
| 2167 |
7). This mechanism is obsolete and undesirable, |
| 2168 |
but accidental triggering of it is still possible. |
| 2169 |
|
| 2170 |
The subject SHOULD be terse. Posters SHOULD avoid trying to |
| 2171 |
cram their entire article into the headers; even the sim- |
| 2172 |
plest query usually benefits from a sentence or two of |
elaboration and context, and the details of header display |
| 2186 |
vary widely among reading agents. |
| 2187 |
|
| 2188 |
NOTE: All-in-the-subject articles are sometimes |
| 2189 |
the result of misunderstandings over the interac- |
| 2190 |
tion protocol of a posting agent. Posting agents |
| 2191 |
might wish to give special attention to the possi- |
| 2192 |
bility that a poster specifying a very long sub- |
| 2193 |
ject might have thought he was typing the body of |
| 2194 |
the article. |
| 2195 |
|
| 2196 |
|
| 2197 |
5.5. Newsgroups |
| 2198 |
|
| 2199 |
The Newsgroups header's content specifies which newsgroup(s) |
| 2200 |
the article is posted to: |
| 2201 |
|
| 2202 |
     Newsgroups-content  = newsgroup-name *( ng-delim newsgroup-name )
     newsgroup-name      = plain-component *( "." component )
     component           = plain-component / encoded-word
     plain-component     = component-start *13component-rest
     component-start     = lowercase / digit
     lowercase           = <letter a-z>
     component-rest      = component-start / "+" / "-" / "_"
     ng-delim            = ","
| 2210 |
|
| 2211 |
Encoded words used in newsgroup names MUST not contain char- |
| 2212 |
acters other than letters, digits, "+", "-", "/", "_", "=", |
| 2213 |
and "?" (although they may encode them). |
| 2214 |
|
| 2215 |
A newsgroup name consists of one or more components, which |
| 2216 |
may be plain components or (except for the first) encoded |
| 2217 |
words. A plain component MUST contain at least one letter, |
| 2218 |
MUST begin with a letter or digit, and MUST not be longer |
| 2219 |
than 14 characters. The first component MUST begin with a |
| 2220 |
letter; subsequent components SHOULD begin with a letter. |
| 2221 |
Newsgroup names MUST not contain uppercase letters, except |
| 2222 |
where required by encodings in encoded words. The sequences |
| 2223 |
"all" and "ctl" MUST not be used as components. |
| 2224 |
|
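NOTE: As a non-normative illustration, the rules above for
names built solely from plain components can be checked as
in the following Python sketch. Encoded-word components are
deliberately NOT handled, and all function names are
assumptions of this example.

```python
import re

# component-start followed by at most 13 component-rest characters.
_PLAIN_COMPONENT = re.compile(r"[a-z0-9][a-z0-9+_-]{0,13}")
_FORBIDDEN_COMPONENTS = {"all", "ctl"}


def plain_component_ok(component: str) -> bool:
    if _PLAIN_COMPONENT.fullmatch(component) is None:
        return False
    # A plain component must contain at least one letter.
    if not any("a" <= ch <= "z" for ch in component):
        return False
    return component not in _FORBIDDEN_COMPONENTS


def newsgroup_name_ok(name: str) -> bool:
    components = name.split(".")
    if not all(plain_component_ok(c) for c in components):
        return False
    # The first component must begin with a letter.
    return "a" <= components[0][0] <= "z"


def newsgroups_content_ok(content: str) -> bool:
    # ng-delim is a bare comma; no white space is permitted.
    return all(newsgroup_name_ok(n) for n in content.split(","))
```

The SHOULD-level preference that later components begin with
a letter is not enforced by this sketch.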
| 2225 |
NOTE: The alphabet and syntax specified encom- |
| 2226 |
passes all existing names of widespread news- |
| 2227 |
groups, while avoiding various forms that are |
| 2228 |
known to cause problems. Important existing soft- |
| 2229 |
ware uses various non-alphanumeric characters as |
| 2230 |
punctuation adjacent to newsgroup names. (It |
| 2231 |
would, in fact, be preferable to ban "+" from |
| 2232 |
newsgroup names, were it not that several |
| 2233 |
widespread newsgroups related to the C++ program- |
| 2234 |
ming language already use it.) |
| 2235 |
|
| 2236 |
NOTE: Much existing software converts the news- |
| 2237 |
group name into a directory path and stores the |
| 2238 |
articles themselves using numeric filenames, so |
all-digit name components can be troublesome; the |
| 2252 |
"Great Renaming" early in the history of Usenet |
| 2253 |
included revisions of several newsgroup names to |
| 2254 |
eliminate such components. |
| 2255 |
|
| 2256 |
NOTE: The same storage technique is the reason for |
| 2257 |
the 14-character limit. The limit is now largely |
| 2258 |
historical, since most modern systems have much |
| 2259 |
larger limits on the length of a directory entry's |
| 2260 |
name, but many old systems are still in use. Sys- |
| 2261 |
tems with shorter limits also exist, but news |
| 2262 |
software on such systems has had to deal with the |
| 2263 |
problem already, since there are several |
| 2264 |
widespread newsgroups with 14-character components |
| 2265 |
in their names. Implementors are warned that it |
| 2266 |
is intended that the successor to this Draft will |
| 2267 |
increase the 14-character limit, and are urged to |
| 2268 |
fix their software to handle longer names grace- |
| 2269 |
fully (if such fixes are necessary, given the |
| 2270 |
intended domain of application of the particular |
| 2271 |
software). |
| 2272 |
|
| 2273 |
NOTE: The requirement that the first character of |
| 2274 |
a name be a letter accommodates existing software |
| 2275 |
which assumes it can tell the difference between a |
| 2276 |
newsgroup name and other possible syntactic enti- |
| 2277 |
ties by inspecting the first character. Similar |
| 2278 |
considerations motivate excluding "+", "-", and |
| 2279 |
"_" from coming first in a component, and the |
| 2280 |
preference for components that do not begin with |
| 2281 |
digits. The "all" sequence is used as a wildcard |
| 2282 |
symbol in much existing software, and the "ctl" |
| 2283 |
sequence was involved in an obsolete historical |
| 2284 |
mechanism for marking control messages, so they |
| 2285 |
are best avoided. |
| 2286 |
|
| 2287 |
NOTE: Possibly newsgroup names should have been |
| 2288 |
case-insensitive, but all existing software treats |
| 2289 |
them as case-sensitive. (RFC 977 [rrr] claims |
| 2290 |
that they are case-insensitive in NNTP, but exist- |
| 2291 |
ing implementations are believed to ignore this.) |
| 2292 |
The simplest solution is just to ban use of upper- |
| 2293 |
case letters, since no widespread newsgroup name |
| 2294 |
uses them anyway; this avoids any possibility of |
| 2295 |
confusion. |
| 2296 |
|
| 2297 |
NOTE: The syntax has the disadvantage of contain- |
| 2298 |
ing no white space, making it impossible to con- |
| 2299 |
tinue a Newsgroups header across several lines. |
| 2300 |
Implementors of relayers and reading agents are |
| 2301 |
warned that it is intended that the successor to |
| 2302 |
this Draft will change the definition of ng-delim |
| 2303 |
to: |
| 2304 |
|
ng-delim = "," [ space ] |
| 2318 |
|
| 2319 |
and are urged to fix their software to handle |
| 2320 |
(i.e., ignore) white space following the commas. |
| 2321 |
Meanwhile, posters must avoid inserting such space |
| 2322 |
(despite the natural-language convention which |
| 2323 |
permits it) and posting agents should strip it |
| 2324 |
out. |
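
NOTE: A tolerant reader and a strict writer, as urged above,
might look like the following Python sketch. The function
names are assumptions of this example. The reader ignores
any blanks or tabs after a comma; the writer emits bare
commas only.

```python
from typing import List


def split_newsgroups(content: str) -> List[str]:
    """Split on commas, ignoring white space that follows a comma."""
    return [name.lstrip(" \t") for name in content.split(",")]


def normalize_newsgroups(content: str) -> str:
    """Form a posting agent could emit: commas with no following space."""
    return ",".join(split_newsgroups(content))
```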
| 2325 |
|
| 2326 |
NOTE: Encoded words as components are somewhat |
| 2327 |
problematic, but are clearly desirable for use in |
| 2328 |
non-English-speaking nations. They are not sub- |
| 2329 |
ject to the 14-character limit, and this (plus the |
| 2330 |
possibility of "/" within them) may require spe- |
| 2331 |
cial handling in news software. |
| 2332 |
|
| 2333 |
Encoded words are allowed in newsgroup names ONLY where non- |
| 2334 |
ASCII characters are necessary to the name, and must use the |
| 2335 |
"b" encoding [rrr] and the first suitable character set in |
| 2336 |
the MIME order of preferred character sets [rrr]. |
| 2337 |
|
| 2338 |
NOTE: Since the newsgroup name is the encoded |
| 2339 |
form, NOT the underlying non-ASCII form, there is |
| 2340 |
room for terrible confusion here if the choice of |
| 2341 |
encoding for a particular name is not fully stan- |
| 2342 |
dardized. |
| 2343 |
|
| 2344 |
Posters SHOULD use only the names of existing newsgroups in |
| 2345 |
the Newsgroups header, because newsgroups are NOT created |
| 2346 |
simply by being posted to. However, it is legitimate to |
| 2347 |
cross-post to newsgroup(s) which do not exist on the posting |
| 2348 |
agent's host, provided that at least one of the newsgroups |
| 2349 |
DOES exist there, and followup agents MUST accept this |
| 2350 |
(posting agents MAY accept it, but SHOULD at least alert the |
| 2351 |
poster to the situation and request confirmation). Relayers |
| 2352 |
MUST not rewrite Newsgroups headers in any way, even if some |
| 2353 |
or all of the newsgroups do not exist on the relayer's host. |
| 2354 |
|
| 2355 |
NOTE: Early experience with news software that |
| 2356 |
created newsgroups when they were mentioned in a |
| 2357 |
Newsgroups header was thoroughly negative: posters |
| 2358 |
frequently mistype newsgroup names. |
| 2359 |
|
| 2360 |
NOTE: While it is legitimate for some of an arti- |
| 2361 |
cle's newsgroups not to exist on the host where it |
| 2362 |
is posted, this IS a rather unusual situation |
| 2363 |
except in followups (which should go to all news- |
| 2364 |
groups the precursor was posted to, even if not |
| 2365 |
all of them reach the site where the followup is |
| 2366 |
being posted). |
| 2367 |
|
| 2368 |
NOTE: Rewriting Newsgroups headers to strip |
| 2369 |
locally-unknown newsgroups is superficially |
| 2370 |
attractive. However, early experience with |
exactly that policy was thoroughly negative: news |
| 2384 |
propagation is more redundant and much less |
| 2385 |
orderly than many people imagine, and in particu- |
| 2386 |
lar it is not unheard-of for the (sometimes) |
| 2387 |
fastest path between two (say) U of Toronto sites |
| 2388 |
to pass outside U of Toronto... in which case |
| 2389 |
newsgroup stripping can cause incomplete propaga- |
| 2390 |
tion. Having an article's set of newsgroups |
| 2391 |
change as it propagates can also result in fol- |
| 2392 |
lowups not achieving the same propagation as the |
| 2393 |
original. It's been tried; it's more trouble than |
| 2394 |
it's worth; don't do it. |
| 2395 |
|
| 2396 |
NOTE: In particular, newsgroup stripping superfi- |
| 2397 |
cially looks like a solution to the problem of |
| 2398 |
duplicate regional newsgroup names. For example, |
| 2399 |
both University of Toronto and University of Texas |
| 2400 |
have "ut.general" newsgroups, and material cross- |
| 2401 |
posted to that name and a global newsgroup appears |
| 2402 |
in both universities' local newsgroups. However, |
| 2403 |
the side effects of stripping are sufficiently |
| 2404 |
unacceptable to disqualify it for this purpose. |
| 2405 |
Don't do it. |
| 2406 |
|
| 2407 |
Cross-posting an article to several relevant newsgroups is |
| 2408 |
far superior to posting separate articles with duplicated |
| 2409 |
content to each newsgroup, because reading agents can detect |
| 2410 |
the situation and show the article to a reader only once. |
| 2411 |
Posters SHOULD cross-post rather than duplicate-post. |
| 2412 |
|
| 2413 |
NOTE: On the other hand, cross-posting to a large |
| 2414 |
number of newsgroups usually indicates that the |
| 2415 |
poster has not thought about his audience; arti- |
| 2416 |
cles are rarely pertinent to more than (say) half |
| 2417 |
a dozen newsgroups. Posting agents might wish to |
| 2418 |
request confirmation when the number of newsgroups |
| 2419 |
exceeds (say) five in the presence of a Followup- |
| 2420 |
To header, or (say) two in the absence of such a |
| 2421 |
header. |
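
NOTE: The confirmation heuristic just described could be
sketched in Python as follows. The thresholds are the "(say)"
figures from the note above, exposed as parameters because
they are suggestions rather than requirements; the function
name is an assumption of this example.

```python
from typing import Sequence


def needs_crosspost_confirmation(
    newsgroups: Sequence[str],
    has_followup_to: bool,
    limit_with_followup_to: int = 5,
    limit_without_followup_to: int = 2,
) -> bool:
    """True if the posting agent might wish to ask the poster to confirm."""
    if has_followup_to:
        limit = limit_with_followup_to
    else:
        limit = limit_without_followup_to
    return len(newsgroups) > limit
```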
| 2422 |
|
| 2423 |
NOTE: One problem with cross-postings is what to |
| 2424 |
do with an article cross-posted to a set of news- |
| 2425 |
groups including both moderated and unmoderated |
| 2426 |
ones. Posters tend to expect such an article to |
| 2427 |
show up immediately in the unmoderated newsgroups, |
| 2428 |
especially if they do not realize that one or more |
| 2429 |
of the newsgroups is moderated. However, since it |
| 2430 |
is not possible for a moderator to retroactively |
| 2431 |
add an already-posted article to a moderated news- |
| 2432 |
group, the only correct action is to mail such an |
| 2433 |
article to one (and only one) of the moderators |
| 2434 |
for action. It is probably best for the posting |
| 2435 |
agent to detect this situation and ask the poster |
| 2436 |
what action is preferred. The acceptable choices |
are to alter the newsgroup list or to mail to a |
| 2450 |
moderator of the poster's choice; the posting |
| 2451 |
agent should NOT offer duplicate-posting as an |
| 2452 |
easy-to-request option (if only because many mod- |
| 2453 |
erators will reject a submission that has already |
| 2454 |
been posted to unmoderated newsgroups). |
| 2455 |
|
| 2456 |
NOTE: An article cross-posted to multiple moder- |
| 2457 |
ated newsgroups really should have approval from |
| 2458 |
all the moderators involved. In practice, the |
| 2459 |
only straightforward way to do this is to send the |
| 2460 |
article to one of them and have him consult the |
| 2461 |
others. |
| 2462 |
|
| 2463 |
A newsgroup SHOULD not appear more than once in the News- |
| 2464 |
groups header. |
| 2465 |
|
| 2466 |
Newsgroup names having only one component are reserved for |
| 2467 |
newsgroups whose propagation is restricted to a single host |
| 2468 |
(or the administrative equivalent). It is inadvisable to |
| 2469 |
name a newsgroup "poster" because that word has special |
| 2470 |
meaning in the Followup-To header (see section 6.1). The |
| 2471 |
names "control" and "junk" are frequently used for pseudo- |
| 2472 |
newsgroups internal to relayer implementations, and hence |
| 2473 |
are also best avoided. |
| 2474 |
|
| 2475 |
NOTE: Beware of the duplicate-regional-newsgroup- |
| 2476 |
names problem mentioned above. In particular, |
| 2477 |
there are many, many hosts with a newsgroup named |
| 2478 |
"general", and some surprising things show up in |
| 2479 |
such newsgroups when people cross-post. It is |
| 2480 |
probably better to use multi-component names, |
| 2481 |
which are less likely to be duplicated. Fred's |
| 2482 |
Widget House should use "fwh.general" rather than |
| 2483 |
just "general" as its in-house general-topics |
| 2484 |
newsgroup. |
| 2485 |
|
| 2486 |
It is conventional to reserve newsgroup names beginning with |
| 2487 |
"to." for test messages sent on an essentially point-to- |
| 2488 |
point basis (see also the ihave/sendme protocol described in |
| 2489 |
section 7.2); newsgroup names beginning with "to." SHOULD |
| 2490 |
not be used for any other purpose. The second (and possibly |
| 2491 |
later) components of such a name should, together, comprise |
| 2492 |
the relayer name (see section 5.6) of a relayer. The news- |
| 2493 |
group exists only at the named relayer and its neighbors. |
| 2494 |
The neighbors all pass that newsgroup to the named relayer, |
| 2495 |
while the named relayer does not pass it to anyone. |
| 2496 |
|
| 2497 |
The order of newsgroup names in the Newsgroups header is not |
| 2498 |
significant. |
| 2499 |
|
| 2500 |
|
5.6. Path |
| 2516 |
|
| 2517 |
The Path header's content indicates which relayers the arti- |
| 2518 |
cle has already visited, so that unnecessary redundant |
| 2519 |
transmission can be avoided: |
| 2520 |
|
| 2521 |
     Path-content    = [ path-list path-delimiter ] local-part
     path-list       = relayer-name *( path-delimiter relayer-name )
     relayer-name    = 1*rn-char
     rn-char         = letter / digit / "." / "-" / "_"
     path-delimiter  = "!"
| 2526 |
|
| 2527 |
The Path content is a list of relayer names, separated by |
| 2528 |
path delimiters, followed (after a final delimiter) by the |
| 2529 |
local part of a mailing address. Each relayer MUST prepend |
| 2530 |
its name, and a delimiter, to the Path content in all arti- |
| 2531 |
cles it processes. A relayer MUST not pass an article to a |
| 2532 |
neighboring relayer whose name is already mentioned in an |
| 2533 |
article's path list, unless this is explicitly requested by |
| 2534 |
the neighbor in some way. The Path content is case- |
| 2535 |
sensitive. |
| 2536 |
|
| 2537 |
NOTE: The Path header supplied by a posting agent |
| 2538 |
should normally contain only the local part. The |
| 2539 |
relayer that the posting agent passes the article |
| 2540 |
to for posting will prepend its relayer name to |
| 2541 |
get the path list started. |
| 2542 |
|
| 2543 |
NOTE: Observe that the trailing local part is NOT |
| 2544 |
part of the path list. This Path header: |
| 2545 |
|
| 2546 |
Path: fee!fie!foe!fum |
| 2547 |
|
| 2548 |
contains three relayer names: "fee", "fie", and |
| 2549 |
"foe". A relayer named "fum" is still eligible to |
| 2550 |
be sent this article. |
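
NOTE: The relayer behavior described above might be sketched
in Python as follows. The function names are assumptions of
this example, and the "explicitly requested by the neighbor"
exception is not modelled. Comparison is case-sensitive, as
the Path content is.

```python
from typing import List, Tuple

PATH_DELIMITER = "!"


def split_path(path_content: str) -> Tuple[List[str], str]:
    """Return (path list, trailing local part)."""
    *path_list, local_part = path_content.split(PATH_DELIMITER)
    return path_list, local_part


def prepend_relayer(path_content: str, relayer_name: str) -> str:
    """What a relayer does to every article it processes."""
    return relayer_name + PATH_DELIMITER + path_content


def may_pass_to(path_content: str, neighbor_name: str) -> bool:
    """False if the neighbor is already mentioned in the path list."""
    path_list, _local_part = split_path(path_content)
    return neighbor_name not in path_list
```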
| 2551 |
|
| 2552 |
NOTE: This syntax has the disadvantage of contain- |
| 2553 |
ing no white space, making it impossible to con- |
| 2554 |
tinue a Path header across several lines. Imple- |
| 2555 |
mentors of relayers and reading agents are warned |
| 2556 |
that it is intended that the successor to this |
| 2557 |
Draft will change the definition of path delimiter |
| 2558 |
to: |
| 2559 |
|
| 2560 |
path-delimiter = "!" [ space ] |
| 2561 |
|
| 2562 |
and are urged to fix their software to handle |
| 2563 |
(i.e., ignore) white space following the exclama- |
| 2564 |
tion points. They are urged to hurry; some ill- |
| 2565 |
behaved systems reportedly already feel free to |
| 2566 |
add such white space. |
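
NOTE: A reader of Path headers that already tolerates such
white space might be sketched in Python as follows. The
function name is an assumption of this example; it treats
any run of white space after "!" (including a folded line
break) as ignorable.

```python
import re
from typing import List

# "!" followed by any amount of white space, which is ignored.
_TOLERANT_DELIMITER = re.compile(r"!\s*")


def split_path_tolerant(path_content: str) -> List[str]:
    """Split a Path content into its names, ignoring space after "!"."""
    return _TOLERANT_DELIMITER.split(path_content)
```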
| 2567 |
|
NOTE: RFC 1036 allows considerably more flexibil- |
| 2582 |
ity in choice of delimiter, in theory, but this |
| 2583 |
flexibility has never been used and most news |
| 2584 |
software does not implement it properly. The |
| 2585 |
grammar reflects the current reality. Note, in |
| 2586 |
particular, that RFC 1036 treats "_" as a delim- |
| 2587 |
iter, but in fact it is known to appear in relayer |
| 2588 |
names occasionally. |
| 2589 |
|
| 2590 |
Because an article will not propagate to a relayer already |
| 2591 |
mentioned in its path list, the path list MUST not contain |
| 2592 |
any names other than those of relayers the article has |
| 2593 |
passed through AS NEWS. This is trivially obvious for nor- |
| 2594 |
mal news articles, but requires attention from the modera- |
| 2595 |
tors of moderated newsgroups and the implementors and main- |
| 2596 |
tainers of gateways. |
| 2597 |
|
| 2598 |
NOTE: For the same reason, a relayer and its |
| 2599 |
neighbors need to agree on the choice of relayer |
| 2600 |
name, and names should not be changed without |
| 2601 |
notifying neighbors. |
| 2602 |
|
| 2603 |
Relayer names need to be unique among all relayers which |
| 2604 |
will ever see the articles using them. A relayer name is |
| 2605 |
normally either an "official" name for the host the relayer |
| 2606 |
runs on, or some other "official" name controlled by the |
| 2607 |
same organization. Except in cooperating subnets that agree |
| 2608 |
to some other convention, and don't let articles using it |
| 2609 |
escape beyond the subnet, a relayer name MUST be either a |
| 2610 |
UUCP name registered in the UUCP maps (without any domain |
| 2611 |
suffix such as ".UUCP"), or a complete Internet domain name. |
| 2612 |
Use of a (registered) UUCP name is recommended, where prac- |
| 2613 |
tical, to keep the length of the path list down. |
| 2614 |
|
| 2615 |
The use of Internet domain names in the path list presents |
| 2616 |
one problem: domain names are case-insensitive, but the path |
| 2617 |
list is case-sensitive. Relayers using domain names as |
| 2618 |
their relayer names MUST pick a standard form for the name, |
| 2619 |
and use that form consistently to the exclusion of all oth- |
| 2620 |
ers. The preferred form for this purpose, which relayers |
| 2621 |
SHOULD use, is the all-lowercase form. |
| 2622 |
|
| 2623 |
NOTE: It is arguably unfortunate that the path |
| 2624 |
list is case-sensitive, but it is much too late to |
| 2625 |
change this. Most Internet sites do, in any |
| 2626 |
event, use one standardized form of their name |
| 2627 |
almost everywhere. |
| 2628 |
|
| 2629 |
In the ordinary case, where the poster is the author of the |
| 2630 |
article, the local part following the path list SHOULD be |
| 2631 |
the local part of the poster's full Internet domain mailing |
| 2632 |
address. |
| 2633 |
|
NOTE: It should be just the local part, not the |
| 2648 |
full address. The character "@" does not appear |
| 2649 |
in a Path header. |
| 2650 |
|
| 2651 |
The Path content somewhat resembles a mailing address, par- |
| 2652 |
ticularly in the UUCP world with its manual routing and "!" |
| 2653 |
address syntax. Historically, this resemblance was impor- |
| 2654 |
tant, and the Path content was often used as a reply |
| 2655 |
address. This practice has always been somewhat unreliable, |
| 2656 |
since news paths are not always mail paths and news relayer |
| 2657 |
names are not always recognized by mail handlers, and its |
| 2658 |
reliability has generally worsened in recent times. The
widespread use and recognition of Internet domain
addresses, even outside the actual Internet, has largely
eliminated the need for it. Readers SHOULD not use the Path
| 2662 |
content as a reply address. On the other hand, relayer |
| 2663 |
administrators are urged not to break this usage without |
| 2664 |
good reason; where practical, paths followed by news SHOULD |
| 2665 |
be traversable by mail, and mail handlers SHOULD recognize |
| 2666 |
relayer names as host names. |
| 2667 |
|
| 2668 |
It will typically be difficult or impractical for gateways |
| 2669 |
and moderators to supply a Path content that is useful as a |
| 2670 |
reply address for the author, bearing in mind that the path |
| 2671 |
list they supply will normally be empty. (To reiterate: the |
| 2672 |
path list MUST not contain any names other than those of |
| 2673 |
relayers the article has passed through AS NEWS.) They |
| 2674 |
SHOULD supply a local part that will result in replies to a |
| 2675 |
Path-derived address being returned to the sender with a |
| 2676 |
brief explanation. Software permitting, the local part |
| 2677 |
"not-for-mail" is recommended. |
| 2678 |
|
| 2679 |
NOTE: A moderator or gateway administrator who |
| 2680 |
supplies a local part that delivers such mail to |
| 2681 |
an administrative mailbox will quickly discover |
| 2682 |
why it should be bounced automatically! It is |
| 2683 |
best, however, for the returned message to include |
| 2684 |
an explanation of what has probably happened, |
| 2685 |
rather than just a mysterious "undeliverable mail" |
| 2686 |
complaint, since the sender may not be aware that |
| 2687 |
his/her software is unwisely using the Path con- |
| 2688 |
tent as a reply address. Reply software might |
| 2689 |
wish to question attempts to reply to a Path- |
| 2690 |
derived address ending in "not-for-mail" (which is |
| 2691 |
why a specific name is being recommended here). |
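
NOTE: The check that reply software might make is small
enough to sketch in Python. The function name is an
assumption of this example, and the comparison is exact
because the Path content is case-sensitive.

```python
NOT_FOR_MAIL = "not-for-mail"  # local part recommended above


def path_is_not_for_mail(path_content: str) -> bool:
    """True if a Path-derived reply address would end in "not-for-mail"."""
    local_part = path_content.split("!")[-1]
    return local_part == NOT_FOR_MAIL
```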
| 2692 |
|
| 2693 |
|
| 2694 |
6. Optional Headers |
| 2695 |
|
| 2696 |
Many MAIL headers, and many of those specified in present |
| 2697 |
and future MAIL extensions, are potentially applicable to |
| 2698 |
news. Headers specific to MAIL's point-to-point transmis- |
| 2699 |
sion paradigm, e.g. To and Cc, SHOULD not appear in news |
| 2700 |
articles. (Gateways wishing to preserve such information |
for debugging probably SHOULD hide it under different names; |
| 2714 |
prefixing "X-" to the original headers, resulting in e.g. |
| 2715 |
"X-To", is suggested.) |
| 2716 |
|
| 2717 |
The following optional headers are either specific to news |
| 2718 |
or of particular note in news articles; an article MAY con- |
| 2719 |
tain some or all of them. (Note that there are some circum- |
| 2720 |
stances in which some of them are mandatory; these are |
| 2721 |
explained under the individual headers.) An article MUST |
| 2722 |
not contain two or more headers with any one of these header |
| 2723 |
names. |
| 2724 |
|
| 2725 |
NOTE: The ban on duplicate header names does not |
| 2726 |
apply to headers not specified in this Draft at |
| 2727 |
all, such as "X-" headers. Software should not |
| 2728 |
assume that all header names in a given article |
| 2729 |
are unique. |
| 2730 |
|
| 2731 |
|
| 2732 |
6.1. Followup-To |
| 2733 |
|
| 2734 |
The Followup-To header contents specify which newsgroup(s) |
| 2735 |
followups should be posted to: |
| 2736 |
|
| 2737 |
Followup-To-content = Newsgroups-content / "poster" |
| 2738 |
|
| 2739 |
The syntax is the same as that of the Newsgroups content, |
| 2740 |
with the exception that the magic word "poster" means that |
| 2741 |
followups should be mailed to the article's reply address |
| 2742 |
rather than posted. In the absence of Followup-To, the |
| 2743 |
default newsgroup(s) for a followup are those in the News- |
| 2744 |
groups header. |
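
NOTE: How a followup agent might choose the destination of a
followup is sketched below in Python. The dictionary of
headers keyed by exact header name, the function name, and
the exact-match test for "poster" are all assumptions of
this example. The reply address falls back from Reply-To to
From as described in section 6.3.

```python
from typing import Dict, List, Tuple


def followup_destination(headers: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return ("mail", [reply address]) or ("post", [newsgroup names])."""
    followup_to = headers.get("Followup-To")
    if followup_to == "poster":
        # Mail to the article's reply address instead of posting.
        reply_address = headers.get("Reply-To", headers["From"])
        return "mail", [reply_address]
    if followup_to is None:
        # Default: the newsgroups the precursor was posted to.
        followup_to = headers["Newsgroups"]
    return "post", followup_to.split(",")
```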
| 2745 |
|
| 2746 |
NOTE: The way to request that followups be mailed |
| 2747 |
to a specific address other than that in the From |
| 2748 |
line is to supply "Followup-To: poster" and a |
| 2749 |
Reply-To header. Putting a mailing address in the |
| 2750 |
Followup-To line is incorrect; posting agents |
| 2751 |
should reject or rewrite such headers. |
| 2752 |
|
| 2753 |
NOTE: There is no syntax for "no followups |
| 2754 |
allowed" because "Followup-To: poster" accom- |
| 2755 |
plishes this effect without extra machinery. |
| 2756 |
|
| 2757 |
Although it is generally desirable to limit followups to the |
| 2758 |
smallest reasonable set of newsgroups, especially when the |
| 2759 |
precursor was cross-posted widely, posting agents SHOULD not |
| 2760 |
supply a Followup-To header except at the poster's explicit |
| 2761 |
request. |
| 2762 |
|
| 2763 |
NOTE: In particular, it is incorrect for the post- |
| 2764 |
ing agent to assume that followups to a cross- |
| 2765 |
posted article should be directed to the first |
| 2766 |
newsgroup only. Trimming the list of newsgroups |
| 2767 |
|
| 2768 |
|
| 2769 |
|
| 2770 |
2 June 1994 - 42 - expires 15 July 1994 |
| 2771 |
|
| 2772 |
|
| 2773 |
|
| 2774 |
|
| 2775 |
|
| 2776 |
INTERNET DRAFT to be NEWS sec. 6.1 |
| 2777 |
|
| 2778 |
|
| 2779 |
should be the poster's decision, not the posting |
| 2780 |
agent's. However, when an article is to be cross- |
| 2781 |
posted to a considerable number of newsgroups, a |
| 2782 |
posting agent might wish to SUGGEST to the poster |
| 2783 |
that followups go to a shorter list. |
| 2784 |
|
| 2785 |
|
| 2786 |
6.2. Expires |
| 2787 |
|
| 2788 |
The Expires header content specifies a date and time when |
| 2789 |
the article is deemed to be no longer useful and should be |
| 2790 |
removed ("expired"): |
| 2791 |
|
| 2792 |
Expires-content = Date-content |
| 2793 |
|
| 2794 |
The content syntax is the same as that of the Date content. |
| 2795 |
In the absence of Expires, the default is decided by the |
| 2796 |
administrators of each host the article reaches, who MAY |
| 2797 |
also restrict the extent to which the Expires header is hon- |
| 2798 |
ored. |
| 2799 |
|
| 2800 |
The Expires header has two main applications: removing arti- |
| 2801 |
cles whose utility ends on a specific date (e.g., event |
| 2802 |
announcements which can be removed once the day of the event |
| 2803 |
is past) and preserving articles expected to be of prolonged |
| 2804 |
usefulness (e.g., information aimed at new readers of a |
| 2805 |
newsgroup). The latter application is sometimes abused. |
| 2806 |
Since individual hosts have local policies for expiration of |
| 2807 |
news (depending on available disk space, for instance), |
| 2808 |
posters SHOULD not provide Expires headers for articles |
| 2809 |
unless there is a natural expiration date associated with |
| 2810 |
the topic. Posting agents MUST not provide a default |
| 2811 |
Expires header. Leave it out and allow local policies to be |
| 2812 |
used unless there is a good reason not to. Expiry dates are |
| 2813 |
properly the decision of individual host administrators; |
| 2814 |
posters and moderators SHOULD set only expiry dates that |
| 2815 |
most administrators would agree with. |
| 2816 |
|
| 2817 |
NOTE: A poster preparing an Expires header for an |
| 2818 |
article whose utility ends on a specific day |
| 2819 |
should typically specify the NEXT day as the |
| 2820 |
expiry date. A meeting on July 7th remains of |
| 2821 |
interest on the 7th. |
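
NOTE: The next-day rule might be sketched in Python as
follows. This assumes that the RFC 822-style string produced
by the standard library's email.utils.format_datetime is
acceptable as a Date-content (the syntax is defined elsewhere
in this Draft and is not re-checked here), and it arbitrarily
picks midnight GMT; a poster may well prefer midnight in the
time zone of the event. The function name is an assumption
of this example.

```python
from datetime import date, datetime, time, timedelta, timezone
from email.utils import format_datetime


def expires_for_event(event_day: date) -> str:
    """Expires content for an article that stays useful through event_day."""
    # Expire at the start of the NEXT day, so the article survives the event.
    next_day = event_day + timedelta(days=1)
    expiry = datetime.combine(next_day, time(0, 0), tzinfo=timezone.utc)
    return format_datetime(expiry, usegmt=True)
```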
| 2822 |
|
| 2823 |
|
| 2824 |
6.3. Reply-To |
| 2825 |
|
| 2826 |
The Reply-To header content specifies a reply address dif- |
| 2827 |
ferent from the author's address given in the From header: |
| 2828 |
|
| 2829 |
Reply-To-content = From-content |
| 2830 |
|
| 2831 |
In the absence of Reply-To, the reply address is the address |
| 2832 |
in the From header. |
| 2833 |
|
Use of a Reply-To header is preferable to including a simi- |
| 2846 |
lar request in the article body, because reply-preparation |
| 2847 |
software can take account of Reply-To automatically. |
| 2848 |
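
A minimal sketch of how reply-preparation software might pick the reply address. It assumes the headers are already parsed into a dictionary keyed by canonically capitalized header names; the function name is hypothetical.

```python
from typing import Dict


def reply_address(headers: Dict[str, str]) -> str:
    """Return the Reply-To content if present, otherwise the From content."""
    return headers.get("Reply-To") or headers["From"]
```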
|
| 2849 |
|
| 2850 |
6.4. Sender |
| 2851 |
|
| 2852 |
The Sender header identifies the poster, in the event that |
| 2853 |
this differs from the author identified in the From header: |
| 2854 |
|
| 2855 |
Sender-content = From-content |
| 2856 |
|
| 2857 |
In the absence of Sender, the default poster is the author |
| 2858 |
(named in the From header). |
| 2859 |
|
| 2860 |
NOTE: The intent is that the Sender header have a |
| 2861 |
fairly high probability of identifying the person |
| 2862 |
who really posted the article. The ability to |
| 2863 |
specify a From header naming someone other than |
| 2864 |
the poster is useful but can be abused. |
| 2865 |
|
| 2866 |
If the poster supplies a From header, the posting agent MUST |
| 2867 |
ensure that a Sender header is present, unless it can verify |
| 2868 |
that the mailing address in the From header is a valid mail- |
| 2869 |
ing address for the poster. A poster-supplied Sender header |
| 2870 |
MAY be used, if its mailing address is verifiably a valid |
| 2871 |
mailing address for the poster; otherwise the posting agent |
| 2872 |
MUST supply a Sender header and delete (or rename, e.g. to |
| 2873 |
X-Unverifiable-Sender) any poster-supplied Sender header. |
| 2874 |
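
The Sender rules above might be applied by a posting agent roughly as follows. This is only a sketch: ensure_sender, generated_sender, and the is_posters_address callback are hypothetical names, and how an address is actually verified is left entirely to the implementation.

```python
from typing import Callable, Dict


def ensure_sender(headers: Dict[str, str],
                  generated_sender: str,
                  is_posters_address: Callable[[str], bool]) -> Dict[str, str]:
    """Return a copy of headers with the Sender rules applied.

    generated_sender: Sender content built by the posting agent itself.
    is_posters_address: hypothetical check that the mailing address in a
    From-content verifiably belongs to the poster.
    """
    out = dict(headers)
    supplied = out.pop("Sender", None)
    if supplied is not None and is_posters_address(supplied):
        out["Sender"] = supplied  # verified, so it MAY be used as is
        return out
    if supplied is not None:
        # Unverifiable: rename it rather than pass it on as Sender.
        out["X-Unverifiable-Sender"] = supplied
    from_content = out.get("From")
    from_unverified = (from_content is not None
                       and not is_posters_address(from_content))
    if supplied is not None or from_unverified:
        out["Sender"] = generated_sender
    return out
```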
|
| 2875 |
NOTE: It might be useful to preserve a poster- |
| 2876 |
supplied Sender header so that the poster can sup- |
| 2877 |
ply the full-name part of the content. The mail- |
| 2878 |
ing address, however, must be right. Hence, the |
| 2879 |
posting agent must generate the Sender header if |
| 2880 |
it is unable to verify the mailing address of a |
| 2881 |
poster-supplied one. |
| 2882 |
|
| 2883 |
NOTE: NNTP implementors, in particular, are urged |
| 2884 |
to note this requirement (which would eliminate |
| 2885 |
the need for ad hoc headers like NNTP-Posting- |
| 2886 |
Host), although there are admittedly some imple- |
| 2887 |
mentation difficulties. A user name from an RFC |
| 2888 |
1413 server and a host name from an inverse map- |
| 2889 |
ping of the address, perhaps with a "full name" |
| 2890 |
comment noting the origin of the information, |
| 2891 |
would be at least a first approximation: |
| 2892 |
|
| 2893 |
Sender: fred@zoo.toronto.edu (RFC-1413@reverse-lookup; not verified) |
| 2894 |
|
| 2895 |
While this does not completely meet the specs, it |
| 2896 |
comes a lot closer than not having a Sender header |
| 2897 |
at all. Even just supplying a placeholder for the |
| 2898 |
user name: |
| 2899 |
|
| 2900 |
Sender: somebody@zoo.toronto.edu (user name unknown) |
| 2912 |
|
| 2913 |
would be better than nothing. |
| 2914 |
|
| 2915 |
|
| 2916 |
6.5. References |
| 2917 |
|
| 2918 |
The References header content lists message IDs of precur- |
| 2919 |
sors: |
| 2920 |
|
| 2921 |
References-content = message-id *( space message-id ) |
| 2922 |
|
| 2923 |
A followup MUST have a References header, and an article
which is not a followup MUST not have a References header.
A followup to an article which had no References header MUST
have a References header containing the precursor's message
ID. A followup to an article which had a References header
MUST have a References header containing the precursor's
References content, plus the precursor's message ID appended
to the end of the list.
| 2933 |
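
As an illustration, a followup agent might build the new References content along these lines. The function and parameter names are hypothetical.

```python
from typing import Optional


def followup_references(precursor_id: str,
                        precursor_refs: Optional[str]) -> str:
    """Build References content for a followup.

    precursor_id: the precursor's message ID, e.g. "<b@example.com>".
    precursor_refs: the precursor's References content, or None if absent.
    """
    if not precursor_refs:
        return precursor_id
    # Keep the older content as it stands (including any runs of blanks)
    # and append the precursor's own ID after a single space.
    return precursor_refs.rstrip() + " " + precursor_id
```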
|
| 2934 |
NOTE: Use the See-Also header (section 6.16) for |
| 2935 |
interconnection of articles which are not in a |
| 2936 |
followup relationship to each other. |
| 2937 |
|
| 2938 |
NOTE: In retrospect, RFCs 850 and 1036, and the |
| 2939 |
implementations whose practice they represented, |
| 2940 |
erred here. The proper MAIL header to use for |
| 2941 |
references to precursors is In-Reply-To, and the |
| 2942 |
References header is meant to be used for the pur- |
| 2943 |
poses here ascribed to See-Also. This incompati- |
| 2944 |
bility is far too solidly established to be fixed, |
| 2945 |
unfortunately. The best that can be done is to |
| 2946 |
provide a clear mapping between the two, and urge |
| 2947 |
gateways to do the transformation. The news usage |
| 2948 |
is (now) a deliberate violation of the MAIL speci- |
| 2949 |
fications; articles containing news References |
| 2950 |
headers are technically not valid MAIL messages, |
| 2951 |
although it is unlikely that much MAIL software |
| 2952 |
will notice because the incompatibility is at a |
| 2953 |
subtle semantic level that does not affect the |
| 2954 |
syntax. |
| 2955 |
|
| 2956 |
UNRESOLVED ISSUE: Would it be better to just give |
| 2957 |
up and admit that news uses References for both |
| 2958 |
purposes? |
| 2959 |
|
| 2960 |
UNRESOLVED ISSUE: Should the syntax be generalized |
| 2961 |
to include URLs as alternatives to message IDs? |
| 2962 |
Perhaps not; too many things know about References |
| 2963 |
already. And non-articles can't be precursors of |
| 2964 |
articles, not really. |
| 2965 |
|
| 2966 |
Followup agents SHOULD not shorten References headers. If |
| 2978 |
it is absolutely necessary to shorten the header, as a des- |
| 2979 |
perate last resort, a followup agent MAY do this by deleting |
| 2980 |
some of the message IDs. However, it MUST not delete the |
| 2981 |
first message ID, the last three message IDs (including that |
| 2982 |
of the immediate precursor), or any message ID mentioned in |
| 2983 |
the body of the followup. If it is possible for the fol- |
| 2984 |
lowup agent to determine the Subject content of the articles |
| 2985 |
identified in the References header, it MUST not delete the |
| 2986 |
message ID of any article where the Subject content changed |
| 2987 |
(other than by prepending of a back reference). The fol- |
| 2988 |
lowup agent MUST not delete any message ID whose local part |
| 2989 |
ends with "_-_" (underscore (ASCII 95), hyphen (ASCII 45), |
| 2990 |
underscore); followup agents are urged to use this form to |
| 2991 |
mark subject changes, and to avoid using it otherwise. |
| 2992 |
|
| 2993 |
NOTE: As software capable of exploiting References |
| 2994 |
chains has grown more common, the random shorten- |
| 2995 |
ing permitted by RFC 1036 has become increasingly |
| 2996 |
troublesome. ANY shortening is undesirable, and |
| 2997 |
software should do it only in cases of dire neces- |
| 2998 |
sity. In such cases, these rules attempt to limit |
| 2999 |
the damage. |
| 3000 |
|
| 3001 |
NOTE: The first message ID is very important as |
| 3002 |
the starting point of the "thread" of discussion, |
| 3003 |
and absolutely should not be deleted. Keeping the |
| 3004 |
last three message IDs gives thread-following |
| 3005 |
software a fighting chance to reconstruct a full |
| 3006 |
thread even if an article or two is missing. |
| 3007 |
Keeping message IDs mentioned in the body is obvi- |
| 3008 |
ously desirable. |
| 3009 |
|
| 3010 |
NOTE: Subject changes are difficult to determine, |
| 3011 |
but they are significant as possible beginnings of |
| 3012 |
new threads. The "_-_" convention is provided so |
| 3013 |
that posting agents (which have more information |
| 3014 |
about subjects) can flag articles containing a |
| 3015 |
subject change in a way that followup agents can |
| 3016 |
detect without access to the articles themselves. |
| 3017 |
The sequence is chosen as one that is fairly |
| 3018 |
unlikely to occur by accident. |
| 3019 |
|
| 3020 |
NOTE: Is "_-_" really worth having? |
| 3021 |
|
| 3022 |
When a References header is shortened, at least three blanks |
| 3023 |
SHOULD be left between adjacent message IDs at each point |
| 3024 |
where deletions were made. Software preparing new Refer- |
| 3025 |
ences headers SHOULD preserve multiple blanks in older Ref- |
| 3026 |
erences content. |
| 3027 |
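
For implementations that truly cannot avoid shortening, the rules above might be applied as in the following sketch. The names are hypothetical, and the policy of dropping the oldest unprotected IDs first is an assumption; the rules only say which IDs must survive. Subject-change detection is limited here to the "_-_" marker, since the sketch has no access to the referenced articles. The result can still exceed max_ids if too many IDs are protected.

```python
import re

MSG_ID = re.compile(r"<[^<>\s]+>")
GAP = "   "  # three blanks mark a point where IDs were deleted


def is_subject_change(msg_id: str) -> bool:
    """True if the local part (the text before "@") ends with "_-_"."""
    local_part = msg_id[1:-1].rpartition("@")[0]
    return local_part.endswith("_-_")


def shorten_references(content: str, body: str, max_ids: int) -> str:
    """Shorten References content to about max_ids IDs, if it must be done."""
    matches = list(MSG_ID.finditer(content))
    ids = [m.group() for m in matches]
    if len(ids) <= max_ids:
        return content  # shortening is a last resort
    # Original separator in front of each ID (empty for the first one).
    seps = [""] + [content[a.end():b.start()]
                   for a, b in zip(matches, matches[1:])]
    # Never delete the first ID or the last three.
    protected = {0} | set(range(max(len(ids) - 3, 0), len(ids)))
    # Never delete flagged subject changes or IDs mentioned in the body.
    protected |= {i for i, mid in enumerate(ids)
                  if is_subject_change(mid) or mid in body}
    excess = len(ids) - max_ids
    unprotected = [i for i in range(len(ids)) if i not in protected]
    dropped = set(unprotected[:excess])
    out, previous = [], None
    for i, mid in enumerate(ids):
        if i in dropped:
            continue
        if previous is not None:
            # Preserve old separators; mark each new deletion point.
            out.append(seps[i] if previous == i - 1 else GAP)
        out.append(mid)
        previous = i
    return "".join(out)
```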
|
| 3028 |
NOTE: It's desirable to have some marker of where |
| 3029 |
deletions occurred, but the restricted syntax of |
| 3030 |
the header makes this difficult. Extra white |
| 3031 |
space is not a very good marker, since it may be |
| 3044 |
deleted by software that ill-advisedly rewrites |
| 3045 |
headers, but at least it doesn't break existing |
| 3046 |
software. |
| 3047 |
|
| 3048 |
To repeat: followup agents SHOULD not shorten References |
| 3049 |
headers. |
| 3050 |
|
| 3051 |
NOTE: Unfortunately, reading agents and other |
| 3052 |
software analyzing References patterns have to be |
| 3053 |
prepared for the worst anyway. The worst includes |
| 3054 |
random deletions and the possibility of circular |
| 3055 |
References chains (when References is misused in |
| 3056 |
place of See-Also, section 6.16). |
| 3057 |
|
| 3058 |
|
| 3059 |
6.6. Control |
| 3060 |
|
| 3061 |
The Control header content marks the article as a control |
| 3062 |
message, and specifies the desired actions (other than the |
| 3063 |
usual ones of filing and passing on the article): |
| 3064 |
|
| 3065 |
Control-content = verb *( space argument ) |
| 3066 |
verb = 1*( letter / digit ) |
| 3067 |
argument = 1*<ASCII printable character> |
| 3068 |
|
| 3069 |
The verb indicates what action should be taken, and the |
| 3070 |
argument(s) (if any) supply details. In some cases, the |
| 3071 |
body of the article may also contain details. Section 7 |
| 3072 |
describes the standard verbs. See also the Also-Control |
| 3073 |
header (section 6.15). |
| 3074 |
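
The Control syntax is simple enough to check mechanically. The following sketch (hypothetical names) splits the content on any run of white space, which is an assumption about how strictly "space" should be read, and rejects verbs containing anything but ASCII letters and digits.

```python
import re
from typing import List, Tuple

# verb = 1*( letter / digit ), ASCII only
VERB = re.compile(r"[A-Za-z0-9]+\Z")


def parse_control(content: str) -> Tuple[str, List[str]]:
    """Split Control content into (verb, arguments); ValueError if malformed."""
    words = content.split()
    if not words or not VERB.match(words[0]):
        raise ValueError("malformed Control content: %r" % content)
    for argument in words[1:]:
        # Arguments may be any ASCII printable characters, so they need
        # care before being handed to a command interpreter.
        if not all(33 <= ord(ch) <= 126 for ch in argument):
            raise ValueError("non-printable character in %r" % argument)
    return words[0], words[1:]
```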
|
| 3075 |
NOTE: Control messages are often processed and |
| 3076 |
filed rather differently than normal articles. |
| 3077 |
|
| 3078 |
NOTE: The restriction of verbs to letters and dig- |
| 3079 |
its is new, but is consistent with existing prac- |
| 3080 |
tice and potentially simplifies implementation by |
| 3081 |
avoiding characters significant to command inter- |
| 3082 |
preters. Beware that the arguments are under no |
| 3083 |
such restriction in general. |
| 3084 |
|
| 3085 |
NOTE: Two other conventions for distinguishing |
| 3086 |
control messages from normal articles were for- |
| 3087 |
merly in use: a three-component newsgroup name |
| 3088 |
ending in ".ctl" or a subject beginning with |
| 3089 |
"cmsg " was considered to imply that the article |
| 3090 |
was a control message. These conventions are |
| 3091 |
obsolete. Do not use them. |
| 3092 |
|
| 3093 |
An article with a Control header MUST not have an Also- |
| 3094 |
Control or Supersedes header. |
| 3095 |
|
| 3096 |
|
| 3097 |
6.7. Distribution |
| 3110 |
|
| 3111 |
The Distribution header content specifies geographic or |
| 3112 |
organizational limits on an article's propagation: |
| 3113 |
|
| 3114 |
Distribution-content = distribution *( dist-delim distribution ) |
| 3115 |
dist-delim = "," |
| 3116 |
distribution = plain-component |
| 3117 |
|
| 3118 |
A distribution is syntactically identical to a one-component |
| 3119 |
newsgroup name, and must satisfy the same rules and restric- |
| 3120 |
tions. In the absence of Distribution, the default distri- |
| 3121 |
bution is "world". |
| 3122 |
|
| 3123 |
NOTE: This syntax has the disadvantage of contain- |
| 3124 |
ing no white space, making it impossible to con- |
| 3125 |
tinue a Distribution header across several lines. |
| 3126 |
Implementors of relayers and reading agents are |
| 3127 |
warned that the successor to this Draft is intended to
change the definition of dist-delim to:
| 3130 |
|
| 3131 |
dist-delim = "," [ space ] |
| 3132 |
|
| 3133 |
and are urged to fix their software to handle |
| 3134 |
(i.e., ignore) white space following the commas. |
| 3135 |
|
| 3136 |
A relayer MUST not pass an article to another relayer unless |
| 3137 |
configuration information specifies transmission to that |
| 3138 |
other relayer of BOTH (a) at least one of the article's |
| 3139 |
newsgroup(s), and (b) at least one of the article's distri- |
| 3140 |
bution(s). In effect, the only role of distributions is to |
| 3141 |
limit propagation, by preventing transmission of articles |
| 3142 |
that would have been transmitted had the decision been based |
| 3143 |
solely on newsgroups. |
| 3144 |
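
The relaying test can be sketched as below. Representing a neighbour's configuration as plain sets of names is a simplifying assumption (real configurations usually involve patterns), and the function names are hypothetical.

```python
from typing import Optional, Set


def parse_distributions(content: Optional[str]) -> Set[str]:
    """Return the set of distributions; an absent header means "world"."""
    if content is None:
        return {"world"}
    # Tolerate white space after the commas.
    return {d.strip() for d in content.split(",") if d.strip()}


def may_relay(newsgroups: Set[str], distributions: Set[str],
              peer_newsgroups: Set[str], peer_distributions: Set[str]) -> bool:
    """True only if the peer takes BOTH a newsgroup AND a distribution."""
    takes_group = bool(newsgroups & peer_newsgroups)
    takes_dist = bool(distributions & peer_distributions)
    return takes_group and takes_dist
```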
|
| 3145 |
A posting agent might wish to present a menu of possible |
| 3146 |
distributions, or suggest a default, but normally SHOULD not |
| 3147 |
supply a default without giving the poster a chance to over- |
| 3148 |
ride it. A followup agent SHOULD initially supply the same |
| 3149 |
Distribution header as found in the precursor, although the |
| 3150 |
poster MAY alter this if appropriate. |
| 3151 |
|
| 3152 |
Despite the syntactic similarity and some historical confu- |
| 3153 |
sion, distributions are NOT newsgroup names. The whole |
| 3154 |
point of putting a distribution on an article is that it is |
| 3155 |
DIFFERENT from the newsgroup(s). In general, a meaningful |
| 3156 |
distribution corresponds to some sort of region of propaga- |
| 3157 |
tion: a geographical area, an organization, or a cooperating |
| 3158 |
subnet. |
| 3159 |
|
| 3160 |
NOTE: Distributions have historically suffered |
| 3161 |
from the completely uncontrolled nature of their |
| 3162 |
name space, the lack of feedback to posters on |
| 3163 |
incomplete propagation resulting from use of ran- |
| 3176 |
dom trash in Distribution headers, and confusion |
| 3177 |
with newsgroups (arising partly because many |
| 3178 |
regions and organizations DO have internal news- |
| 3179 |
groups with names resembling their internal dis- |
| 3180 |
tributions). This has resulted in much garbage in |
| 3181 |
Distribution headers, notably the pointless prac- |
| 3182 |
tice of automatically supplying the first compo- |
| 3183 |
nent of the newsgroup name as a distribution |
| 3184 |
(which is MOST unlikely to restrict propagation!). |
| 3185 |
Many sites have opted to maximize propagation of |
| 3186 |
such ill-formed articles by essentially ignoring |
| 3187 |
distributions. This unfortunately interferes with |
| 3188 |
legitimate uses. The situation is bad enough that |
| 3189 |
distributions must be considered largely useless |
| 3190 |
except within cooperating subnets that make an |
| 3191 |
organized effort to restrain propagation of their |
| 3192 |
internal distributions. |
| 3193 |
|
| 3194 |
NOTE: The distributions "world" and "local" have |
| 3195 |
no standard magic meaning (except that the former |
| 3196 |
is the default distribution if none is given). |
| 3197 |
Some pieces of software do assign such meanings to |
| 3198 |
them. |
| 3199 |
|
| 3200 |
|
| 3201 |
6.8. Keywords |
| 3202 |
|
| 3203 |
The Keywords header content is one or more phrases intended |
| 3204 |
to describe some aspect of the content of the article: |
| 3205 |
|
| 3206 |
Keywords-content = plain-phrase *( "," [ space ] plain-phrase ) |
| 3207 |
|
| 3208 |
Keywords, separated by commas, each follow the <plain- |
| 3209 |
phrase> syntax defined in section 5.2. Encoded words in |
| 3210 |
keywords MUST not contain characters other than letters (of |
| 3211 |
either case), digits, and the characters "!", "*", "+", "-", |
| 3212 |
"/", "=", and "_". |
| 3213 |
|
| 3214 |
NOTE: Posters and posting agents are asked to take |
| 3215 |
note that keywords are separated by commas, not by |
| 3216 |
white space. The following Keywords header contains
only one keyword (a rather improbable one):
| 3219 |
|
| 3220 |
Keywords: Thompson Ritchie Multics Linux |
| 3221 |
|
| 3222 |
and should probably have been written: |
| 3223 |
|
| 3224 |
Keywords: Thompson, Ritchie, Multics, Linux |
| 3225 |
|
| 3226 |
This particular error is unfortunately rather |
| 3227 |
widespread. |
| 3228 |
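
A reading agent that splits keywords correctly might do something like this sketch (hypothetical function name; encoded words are not decoded here):

```python
from typing import List


def split_keywords(content: str) -> List[str]:
    """Split Keywords content on commas; white space never separates keywords."""
    return [k.strip() for k in content.split(",") if k.strip()]
```

Applied to the two headers in the note above, this yields one keyword for the first and four for the second.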
|
| 3229 |
NOTE: Reading agents and archivers preparing |
| 3242 |
indexes of articles should bear in mind that user- |
| 3243 |
chosen keywords are notoriously poor for indexing |
| 3244 |
purposes unless the keywords are picked from a |
| 3245 |
predefined set (which they are not in this case). |
| 3246 |
Also, some followup agents unwisely propagate the |
| 3247 |
Keywords header from the precursor into the fol- |
| 3248 |
lowup by default. At least one news-based experi- |
| 3249 |
ment has found the contents of Keywords headers to |
| 3250 |
be completely valueless for indexing. |
| 3251 |
|
| 3252 |
|
| 3253 |
6.9. Summary |
| 3254 |
|
| 3255 |
The Summary header content is a short phrase summarizing the |
| 3256 |
article's content: |
| 3257 |
|
| 3258 |
Summary-content = nonblank-text |
| 3259 |
|
| 3260 |
As with the subject, no restriction is placed on the content |
| 3261 |
since it is intended solely for display to humans. |
| 3262 |
|
| 3263 |
NOTE: Reading agents should be aware that the Sum- |
| 3264 |
mary header is often used as a sort of secondary |
| 3265 |
Subject header, and (if present) its contents |
| 3266 |
should perhaps be displayed when the subject is |
| 3267 |
displayed. |
| 3268 |
|
| 3269 |
The summary SHOULD be terse. Posters SHOULD avoid trying to |
| 3270 |
cram their entire article into the headers; even the sim- |
| 3271 |
plest query usually benefits from a sentence or two of elab- |
| 3272 |
oration and context, and not all reading agents display all |
| 3273 |
headers. |
| 3274 |
|
| 3275 |
|
| 3276 |
6.10. Approved |
| 3277 |
|
| 3278 |
The Approved header content indicates the mailing addresses |
| 3279 |
(and possibly the full names) of the persons or entities |
| 3280 |
approving the article for posting: |
| 3281 |
|
| 3282 |
Approved-content = From-content *( "," [ space ] From-content ) |
| 3283 |
|
| 3284 |
An Approved header is required in all postings to moderated |
| 3285 |
newsgroups; the presence or absence of this header allows a |
| 3286 |
posting agent to distinguish between articles posted by the |
| 3287 |
moderator (which are normal articles to be posted normally) |
| 3288 |
and attempted contributions by others (which should be |
| 3289 |
mailed to the moderator for approval). An Approved header |
| 3290 |
is also required in certain control messages, to reduce the |
| 3291 |
probability of accidental posting of same; see the relevant |
| 3292 |
parts of section 7. |
| 3293 |
|
| 3294 |
NOTE: There is, at present, no way to authenticate |
| 3308 |
Approved headers to ensure that the claimed |
| 3309 |
approval really was bestowed. Nor is there an |
| 3310 |
established mechanism for even maintaining a list |
| 3311 |
of legitimate approvers (such a list would quickly |
| 3312 |
become out of date if it had to be maintained by |
| 3313 |
hand). Such mechanisms, presumably relying on |
| 3314 |
cryptographic authentication, would be a worth- |
| 3315 |
while extension to this Draft, and experimental |
| 3316 |
work in this area is encouraged. (The problem is |
| 3317 |
harder than it sounds because news is used on many |
| 3318 |
systems which do not have real-time access to key |
| 3319 |
servers.) |
| 3320 |
|
| 3321 |
NOTE: Relayer implementors, please note well: it |
| 3322 |
is the POSTING AGENT that is authorized to distin- |
| 3323 |
guish between moderator postings and attempted |
| 3324 |
contributions, and to mail the latter to the mod- |
| 3325 |
erator. As discussed in section 9.1, relayers |
| 3326 |
MUST not, repeat MUST not, send such mail; on |
| 3327 |
receipt of an unApproved article in a moderated |
| 3328 |
newsgroup, they should discard the article, NOT |
| 3329 |
transform it into a mail message (except perhaps |
| 3330 |
to a local administrator). |
| 3331 |
|
| 3332 |
NOTE: RFC 1036 restricted Approved to a single |
| 3333 |
From-content. However, multiple moderation is no |
| 3334 |
longer rare, and multi-moderator Approved headers |
| 3335 |
are already in use. |
| 3336 |
|
| 3337 |
|
| 3338 |
6.11. Lines |
| 3339 |
|
| 3340 |
The Lines header content indicates the number of lines in |
| 3341 |
the body of the article: |
| 3342 |
|
| 3343 |
Lines-content = 1*digit |
| 3344 |
|
| 3345 |
The line count includes all body lines, including the
signature (if any) and any empty lines at the beginning or
end of the body. (The single empty separator line between
the headers and the body is not part of the body.) The
"body" here is the body as found in the posted article,
AFTER all transformations such as MIME encodings.
| 3351 |
|
| 3352 |
Reading agents SHOULD not rely on the presence of this |
| 3353 |
header, since it is optional (and some posting agents do not |
| 3354 |
supply it). They MUST not rely on it being precise, since |
| 3355 |
it frequently is not. |
| 3356 |
|
| 3357 |
NOTE: The average line length in article bodies is |
| 3358 |
surprisingly consistent at about 40 characters, |
| 3359 |
and since the line count typically is used only |
| 3360 |
for approximate judgements ("is this too long to |
| 3361 |
read quickly?"), dividing the byte count of the |
| 3374 |
body by 40 gives an estimate of the body line |
| 3375 |
count that is adequate for normal use. This esti- |
| 3376 |
mate is NOT adequate if the body has been MIME |
| 3377 |
encoded... but neither is the Lines header, since |
| 3378 |
at least one major relayer will supply a Lines |
| 3379 |
header for an article that lacks one, and will not |
| 3380 |
consider the possibility of MIME encodings when |
| 3381 |
computing the line count. |
| 3382 |
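
Both the exact count and the rough estimate are easy to compute. In this sketch the function names are hypothetical, and lines are assumed to end with a newline character (a CR LF pair is counted the same way).

```python
def count_body_lines(body: bytes) -> int:
    """Exact count of body lines, including leading and trailing empty ones."""
    if not body:
        return 0
    # A final line lacking its terminator still counts as a line.
    return body.count(b"\n") + (0 if body.endswith(b"\n") else 1)


def estimate_body_lines(body_size: int) -> int:
    """Rough estimate from the byte count, at about 40 bytes per line."""
    return body_size // 40
```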
|
| 3383 |
NOTE: It would be better to have a Content-Size |
| 3384 |
header as part of MIME, so that body parts could |
| 3385 |
have their own sizes, and so that the units used |
| 3386 |
could be appropriate to the data type (line count |
| 3387 |
is not a useful measure of the size of an encoded |
| 3388 |
image, for example). Doing this is preferable to |
| 3389 |
trying to fix Lines. |
| 3390 |
|
| 3391 |
UNRESOLVED ISSUE: Update on Content-Size? |
| 3392 |
|
| 3393 |
Relayers SHOULD discard this header if they find it neces- |
| 3394 |
sary to re-encode the article in such a way that the origi- |
| 3395 |
nal Lines header would be rendered incorrect. |
| 3396 |
|
| 3397 |
|
| 3398 |
6.12. Xref |
| 3399 |
|
| 3400 |
The Xref header content indicates where an article was filed |
| 3401 |
by the last relayer to process it: |
| 3402 |
|
| 3403 |
Xref-content = relayer 1*( space location ) |
| 3404 |
relayer = relayer-name |
| 3405 |
location = newsgroup-name ":" article-locator |
| 3406 |
article-locator = 1*<ASCII printable character> |
| 3407 |
|
| 3408 |
The relayer's name is included so that software can deter- |
| 3409 |
mine which relayer generated the header (and specifically, |
| 3410 |
whether it really was the one that filed the copy being |
| 3411 |
examined). The locations specify what newsgroups the arti- |
| 3412 |
cle was filed under (which may differ from those in the |
| 3413 |
Newsgroups header) and where it was filed under them. The |
| 3414 |
exact form of an article locator is implementation-specific. |
| 3415 |
|
| 3416 |
NOTE: Reading agents can exploit this information |
| 3417 |
to avoid presenting the same article to a reader |
| 3418 |
several times. The information is sometimes |
| 3419 |
available in system databases, but having it in |
| 3420 |
the article is convenient. Relayers traditionally |
| 3421 |
generate an Xref header only if the article is |
| 3422 |
cross-posted, but this is not mandatory, and there |
| 3423 |
is at least one new application ("mirroring": |
| 3424 |
keeping news databases on two hosts identical) |
| 3425 |
where the header is useful in all articles. |
| 3426 |
|
| 3427 |
NOTE: The traditional form of an article locator |
| 3440 |
is a decimal number, with articles in each news- |
| 3441 |
group numbered consecutively starting from 1. |
| 3442 |
NNTP [rrr] demands that such a model be provided, |
| 3443 |
and there may be other software which expects it, |
| 3444 |
but it seems desirable to permit flexibility for |
| 3445 |
unorthodox implementations. |
| 3446 |
|
| 3447 |
A relayer inserting an Xref header into an article MUST |
| 3448 |
delete any previous Xref header. A relayer which is not |
| 3449 |
inserting its own Xref header SHOULD delete any previous |
| 3450 |
Xref header. A relayer MAY delete the Xref header when |
| 3451 |
passing an article on to another relayer. |
| 3452 |
|
| 3453 |
NOTE: RFC 1036 specified that the Xref header was |
| 3454 |
not transmitted when an article was passed to |
| 3455 |
another relayer, but the major news implementa- |
| 3456 |
tions have never obeyed this rule, and applica- |
| 3457 |
tions like mirroring depend on this disobedience. |
| 3458 |
|
| 3459 |
A relayer MUST use the same name in Xref headers as it uses |
| 3460 |
in Path headers. Reading agents MUST ignore an Xref header |
| 3461 |
containing a relayer name that differs from the one that |
| 3462 |
begins the path list. |
| 3463 |
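
A reading agent's check might look like the following sketch. It assumes that the entries of the path list are separated by "!" with the most recent relayer first and that relayer names are compared exactly; the function name and the host names in the usage line are hypothetical.

```python
from typing import Dict, Optional


def usable_xref(xref: str, path: str) -> Optional[Dict[str, str]]:
    """Return {newsgroup: article-locator}, or None if Xref must be ignored."""
    fields = xref.split()
    if len(fields) < 2:
        return None  # the syntax needs a relayer and at least one location
    relayer, locations = fields[0], fields[1:]
    # Assumption: Path entries are "!"-separated, newest relayer first.
    if relayer != path.split("!", 1)[0].strip():
        return None
    filed = {}
    for location in locations:
        newsgroup, _, locator = location.partition(":")
        filed[newsgroup] = locator
    return filed


print(usable_xref("zoo.toronto.edu news.misc:1234 news.admin:567",
                  "zoo.toronto.edu!utzoo!fred"))
```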
|
| 3464 |
|
| 3465 |
6.13. Organization |
| 3466 |
|
| 3467 |
The Organization header content is a short phrase identify- |
| 3468 |
ing the poster's organization: |
| 3469 |
|
| 3470 |
Organization-content = nonblank-text |
| 3471 |
|
| 3472 |
This header is typically supplied by the posting agent. The |
| 3473 |
Organization content SHOULD mention geographical location |
| 3474 |
(e.g. city and country) when it is not obvious from the |
| 3475 |
organization's name. |
| 3476 |
|
| 3477 |
NOTE: The motive here is that the organization is |
| 3478 |
often difficult to guess from the mailing address, |
| 3479 |
is not always supplied in a signature, and can |
| 3480 |
help identify the poster to the reader. |
| 3481 |
|
| 3482 |
NOTE: There is no "s" in "Organization". |
| 3483 |
|
| 3484 |
The Organization content is provided for identification |
| 3485 |
only, and does not imply that the poster speaks for the |
| 3486 |
organization or that the article represents organization |
| 3487 |
policy. Posting agents SHOULD permit the poster to override |
| 3488 |
a local default Organization header. |
| 3489 |
|
| 3490 |
|
| 3491 |
6.14. Supersedes |
| 3506 |
|
| 3507 |
The Supersedes header content specifies articles to be can- |
| 3508 |
celled on arrival of this one: |
| 3509 |
|
| 3510 |
Supersedes-content = message-id *( space message-id ) |
| 3511 |
|
| 3512 |
Supersedes is equivalent to Also-Control (section 6.15) with |
| 3513 |
an implicit verb of "cancel" (section 7.1). |
| 3514 |
|
| 3515 |
NOTE: Supersedes is normally used where the arti- |
| 3516 |
cle is an updated version of the one(s) being can- |
| 3517 |
celled. |
| 3518 |
|
| 3519 |
NOTE: Although the ability to use multiple message |
| 3520 |
IDs in Supersedes is highly desirable (see section |
| 3521 |
7.1), posters are warned that existing implementa- |
| 3522 |
tions often do not correctly handle more than one. |
| 3523 |
|
| 3524 |
NOTE: There is no "c" in "Supersedes". |
| 3525 |
|
| 3526 |
An article with a Supersedes header MUST not have an Also- |
| 3527 |
Control or Control header. |
| 3528 |
|
| 3529 |
|
| 3530 |
6.15. Also-Control |
| 3531 |
|
| 3532 |
The Also-Control header content marks the article as being a |
| 3533 |
control message IN ADDITION to being a normal news article, |
| 3534 |
and specifies the desired actions: |
| 3535 |
|
| 3536 |
Also-Control-content = Control-content |
| 3537 |
|
| 3538 |
An article with an Also-Control header is filed and passed |
| 3539 |
on normally, but the content of the Also-Control header is |
| 3540 |
processed as if it were found in a Control header. |
| 3541 |
|
| 3542 |
NOTE: It is sometimes desirable to piggyback con- |
| 3543 |
trol actions on a normal article, so that the |
| 3544 |
article will be filed normally but will also be |
| 3545 |
acted on as a control message. This header is |
| 3546 |
essentially a generalization of Supersedes. |
| 3547 |
|
| 3548 |
NOTE: Be warned that some old relayers do not |
| 3549 |
implement Also-Control. |
| 3550 |
|
| 3551 |
An article with an Also-Control header MUST not have a Con- |
| 3552 |
trol or Supersedes header. |
| 3553 |
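
Taken together, the rules in sections 6.6, 6.14, and 6.15 mean that at most one of the three headers may appear in an article. A sketch of the check (hypothetical names, assuming header names have already been put into their canonical capitalization):

```python
from typing import Iterable

CONTROL_HEADERS = ("Control", "Also-Control", "Supersedes")


def control_headers_ok(header_names: Iterable[str]) -> bool:
    """True if at most one of Control, Also-Control, Supersedes is present."""
    present = set(header_names)
    return sum(name in present for name in CONTROL_HEADERS) <= 1
```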
|
| 3554 |
|
| 3555 |
6.16. See-Also |
| 3556 |
|
| 3557 |
The See-Also header content lists message IDs of articles |
| 3558 |
that are related to this one but are not its precursors: |
| 3559 |
|
| 3560 |
See-Also-content = message-id *( space message-id ) |
| 3572 |
|
| 3573 |
See-Also resembles References, but without the restrictions |
| 3574 |
imposed on References by the followup rules. |
| 3575 |
|
| 3576 |
NOTE: See-Also provides a way to group related |
| 3577 |
articles, such as the parts of a single document |
| 3578 |
that had to be split across multiple articles due |
| 3579 |
to its size, or to cross-reference between paral- |
| 3580 |
lel threads. |
| 3581 |
|
| 3582 |
NOTE: See the discussion (in section 6.5) on MAIL |
| 3583 |
compatibility issues of References and See-Also. |
| 3584 |
|
| 3585 |
NOTE: In the specific case where it is desired to
essentially make another article PART of the current
one, e.g. for annotation of the other article, MIME's
"message/external-body" convention can be used to do so
without actual inclusion. IANA registered
"news-message-ID" as a standard external-body access
method on 22 June 1993. It takes a mandatory NAME
parameter giving the message ID and an optional SITE
parameter suggesting an NNTP site that might have the
article available (if it is not available locally).
| 3596 |
|
| 3597 |
UNRESOLVED ISSUE: Could the syntax be generalized |
| 3598 |
to include URLs as alternatives to message IDs? |
| 3599 |
Here it makes much more sense than in References. |
| 3600 |
|
| 3601 |
|
| 3602 |
6.17. Article-Names |
| 3603 |
|
| 3604 |
The Article-Names header content indicates any special sig- |
| 3605 |
nificance the article may have in particular newsgroups: |
| 3606 |
|
| 3607 |
Article-Names-content = 1*( name-clause space ) |
| 3608 |
name-clause = newsgroup-name ":" article-name |
| 3609 |
article-name = letter 1*( letter / digit / "-" ) |
| 3610 |
|
| 3611 |
Each name clause specifies a newsgroup (which SHOULD be |
| 3612 |
among those in the Newsgroups header) and an article name |
| 3613 |
local to that newsgroup. Article names MAY be used by |
| 3614 |
relayers to file the article in special ways, or they MAY |
| 3615 |
just be noted for possible special attention by reading |
| 3616 |
agents. Article names are case-sensitive. |
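The grammar above might be checked roughly as follows. This is only a sketch: the function name is hypothetical, "letter" is assumed to mean an ASCII letter, newsgroup names are not validated beyond being non-empty, and the 14-character limit applied here is the one stated later in this section.

```python
import re

# article-name = letter 1*( letter / digit / "-" )
# "letter" is taken to mean an ASCII letter (assumption).
ARTICLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]+")


def parse_article_names(content: str) -> list[tuple[str, str]]:
    """Return (newsgroup, article-name) pairs from Article-Names content."""
    clauses = []
    for clause in content.split():
        newsgroup, sep, name = clause.partition(":")
        valid = (sep and newsgroup
                 and ARTICLE_NAME.fullmatch(name)
                 and len(name) <= 14)
        if not valid:
            raise ValueError(f"bad name clause: {clause!r}")
        # Names are case-sensitive, so they are kept exactly as written.
        clauses.append((newsgroup, name))
    return clauses
```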
| 3617 |
|
| 3618 |
NOTE: This header provides a way to mark special |
| 3619 |
postings, such as introductions, frequently-asked- |
| 3620 |
question lists, etc., so that reading agents have |
| 3621 |
a way of finding them automatically. The news- |
| 3622 |
group name is specified for each article name |
| 3623 |
because the names may be newsgroup-specific; for |
| 3624 |
example, many frequently-asked-question lists are |
| 3637 |
posted to "news.answers" in addition to their |
| 3638 |
"home" newsgroup, and they would not be known by |
| 3639 |
the same name(s) in both newsgroups. |
| 3640 |
|
| 3641 |
The Article-Names header SHOULD be ignored unless the arti- |
| 3642 |
cle also contains an Approved header. |
| 3643 |
|
| 3644 |
NOTE: This stipulation is made in anticipation of |
| 3645 |
the possibility that Approved headers will be |
| 3646 |
involved in cryptographic authentication. |
| 3647 |
|
| 3648 |
The presence of an Article-Names header does not necessarily |
| 3649 |
imply that the article will be retained unusually long |
| 3650 |
before expiration, or that previous article(s) with similar |
| 3651 |
Article-Names headers will be cancelled by its arrival. |
| 3652 |
Posters preparing special postings SHOULD include appropri- |
| 3653 |
ate other headers, such as Expires and Supersedes, to |
| 3654 |
request such actions. |
| 3655 |
|
| 3656 |
Different networks MAY establish different sets of article
names for the special postings they deem significant; it is
preferable for usage to be standardized within networks,
although it might be desirable for individual newsgroups to
have different naming conventions in some situations.
Article names MUST be 14 characters or fewer. The following
names are suggested but are not mandatory:
| 3663 |
|
| 3664 |
intro Introduction to the newsgroup for newcomers. |
| 3665 |
|
| 3666 |
charter Charter, rules, organization, moderation poli- |
| 3667 |
cies, etc. |
| 3668 |
|
| 3669 |
background Biographies of special participants, history of |
| 3670 |
the newsgroup, notes on related newsgroups, etc. |
| 3671 |
|
| 3672 |
subgroups Descriptions of sub-newsgroups under this news- |
| 3673 |
group, e.g. "sci.space.news" under "sci.space". |
| 3674 |
|
| 3675 |
facts Information relating to the purpose of the news- |
| 3676 |
group, e.g. an acronym glossary in "sci.space". |
| 3677 |
|
| 3678 |
references Where to get more information: books, journals, |
| 3679 |
FTP repositories, etc. |
| 3680 |
|
| 3681 |
faq Answers to frequently-asked questions. |
| 3682 |
|
| 3683 |
menu If present, a list of all the other article |
| 3684 |
names local to this newsgroup, with brief |
| 3685 |
descriptions of their contents. |
| 3686 |
|
| 3687 |
Such articles may be divided into subsections using the MIME |
| 3688 |
"multipart/mixed" conventions. If size considerations make |
| 3689 |
it necessary to split such articles, names ending in a |
| 3690 |
hyphen and a part number are suggested; for example, a |
| 3703 |
three-part frequently-asked-questions list could have arti- |
| 3704 |
cle names "faq-1", "faq-2", and "faq-3". |
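A posting agent could generate such part names mechanically. The helper below is a hypothetical sketch; it assumes the 14-character limit applies to the complete name, including the hyphen and part number.

```python
def part_names(base: str, parts: int) -> list[str]:
    """Return names such as "faq-1", "faq-2", ... for a split posting."""
    names = [f"{base}-{n}" for n in range(1, parts + 1)]
    too_long = [name for name in names if len(name) > 14]
    if too_long:
        raise ValueError(f"exceeds 14 characters: {too_long[0]}")
    return names
```

For instance, part_names("faq", 3) yields the three names used in the example above.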
| 3705 |
|
| 3706 |
NOTE: It is somewhat premature to attempt to stan- |
| 3707 |
dardize article names, since this is essentially a |
| 3708 |
new feature with no experience behind it. How- |
| 3709 |
ever, if reading agents are to attach special sig- |
| 3710 |
nificance to these names, some attempt at standard |
| 3711 |
conventions is imperative. This is a first |
| 3712 |
attempt at providing some. |
| 3713 |
|
| 3714 |
|
| 3715 |
6.18. Article-Updates |
| 3716 |
|
| 3717 |
The Article-Updates header content indicates what previous |
| 3718 |
articles this one is deemed (by the poster) to update (i.e., |
| 3719 |
replace): |
| 3720 |
|
| 3721 |
Article-Updates-content = message-id *( space message-id ) |
| 3722 |
|
| 3723 |
Each message ID identifies a previous article that this one
is deemed to update. This MUST NOT cause the previous
article(s) to be cancelled or otherwise altered, unless this is
implied by other headers (e.g. Supersedes); Article-Updates
is merely an advisory which MAY be noted for special
attention by reading agents.
| 3729 |
|
| 3730 |
NOTE: This header provides a way to mark articles |
| 3731 |
which are only minor updates of previous ones, |
| 3732 |
containing no significant new information and not |
| 3733 |
worth reading if the previous ones have been read. |
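One way a reading agent might act on this advisory is sketched below. The function name and the policy (treat the article as skippable only when every article it updates has already been read) are assumptions for illustration; this Draft does not prescribe any particular behavior.

```python
def is_minor_update(updates_content: str, already_read: set[str]) -> bool:
    """True if every article named in Article-Updates has been read."""
    updated = updates_content.split()
    # An empty list gives no grounds for skipping the article.
    return bool(updated) and all(mid in already_read for mid in updated)
```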
| 3734 |
|
| 3735 |
NOTE: If suitable conventions using MIME multipart |
| 3736 |
bodies and the "message/external-body" body-part |
| 3737 |
type can be developed, a replacing article might |
| 3738 |
contain only differences between the old text and |
| 3739 |
the new text, rather than a complete new copy. |
| 3740 |
This is the motivation for not making Article- |
| 3741 |
Updates also function as Supersedes does: the |
| 3742 |
replacing article might depend on the continued |
| 3743 |
presence of the replaced article. |
| 3744 |
|
| 3745 |
|
| 3746 |
7. Control Messages |
| 3747 |
|
| 3748 |
The following sections document the currently-defined con- |
| 3749 |
trol messages. "Message" is used herein as a synonym for |
| 3750 |
"article" unless context indicates otherwise. |
| 3751 |
|
| 3752 |
Posting agents are warned that since certain control
messages require article bodies in quite specific formats,
signatures SHOULD NOT be appended to such articles, and it may
be wise to take greater care than usual to avoid unintended
(although perhaps well-meaning) alterations to text supplied
by the poster. Relayers MUST assume that control messages
mean what they say; they MAY be obeyed as is or rejected,
but MUST NOT be reinterpreted.
| 3772 |
|
| 3773 |
The execution of the actions requested by control messages |
| 3774 |
is subject to local administrative restrictions, which MAY |
| 3775 |
deny requests or refer them to an administrator for |
| 3776 |
approval. The descriptions below are generally phrased in |
| 3777 |
terms suggesting mandatory actions, but any or all of these |
| 3778 |
MAY be subject to local administrative approval (either as a |
| 3779 |
class or case-by-case). Analogously, where the description |
| 3780 |
below specifies that a message or portion thereof is to be |
| 3781 |
ignored, this action MAY include reporting it to an adminis- |
| 3782 |
trator. |
| 3783 |
|
| 3784 |
NOTE: The exact choice of local action might |
| 3785 |
depend on what action the control message |
| 3786 |
requests, who it claims to come from, etc. |
| 3787 |
|
| 3788 |
Relayers MUST propagate even control messages they do not |
| 3789 |
understand. |
| 3790 |
|
| 3791 |
In the following sections, each type of control message is |
| 3792 |
defined syntactically by defining its arguments and its |
| 3793 |
body. For example, "cancel" is defined by defining cancel- |
| 3794 |
arguments and cancel-body. |
| 3795 |
|
| 3796 |
|
| 3797 |
7.1. cancel |
| 3798 |
|
| 3799 |
The cancel message requests that one or more previous arti- |
| 3800 |
cles be "cancelled": |
| 3801 |
|
| 3802 |
cancel-arguments = message-id *( space message-id ) |
| 3803 |
cancel-body = body |
| 3804 |
|
| 3805 |
The argument(s) identify the articles to be cancelled, by |
| 3806 |
message ID. The body is a comment, which software MUST |
| 3807 |
ignore, and SHOULD contain an indication of why the cancel- |
| 3808 |
lation was requested. The cancel message SHOULD be posted |
| 3809 |
to the same newsgroup(s), with the same distribution(s), as |
| 3810 |
the article(s) it is attempting to cancel. |
| 3811 |
|
| 3812 |
NOTE: Using the same newsgroups and distributions |
| 3813 |
maximizes the chances of the cancel message propa- |
| 3814 |
gating everywhere the target articles went. |
| 3815 |
|
| 3816 |
NOTE: RFC 1036 permitted only a single message-id |
| 3817 |
in a cancel message. Support for cancelling mul- |
| 3818 |
tiple articles is highly desirable, especially for |
| 3819 |
use with Supersedes (see section 6.14). If sev- |
| 3820 |
eral revisions of an article appear in fast suc- |
| 3821 |
cession, each using Supersedes to cancel the pre- |
| 3822 |
vious one, it is possible for a middle revision to |
| 3835 |
be destroyed by cancellation before it is propa- |
| 3836 |
gated onward to cancel its predecessor. Allowing |
| 3837 |
each article to cancel several predecessors |
| 3838 |
greatly alleviates this problem. (Posting agents |
| 3839 |
preparing a cancel of an article which itself can- |
| 3840 |
cels other articles might wish to add those arti- |
| 3841 |
cles to the cancel-arguments.) However, posters |
| 3842 |
should be aware that much old software does not |
| 3843 |
implement multiple cancellation properly, and |
| 3844 |
should avoid using it when reliable cancellation |
| 3845 |
is vitally important. |
| 3846 |
|
| 3847 |
When an article (the "target article") is to be cancelled, |
| 3848 |
there are four cases of interest: the article hasn't arrived |
| 3849 |
yet, it has arrived and been filed and is available for |
| 3850 |
reading, it has expired and been archived on some less- |
| 3851 |
accessible storage medium, or it has expired and been |
| 3852 |
deleted. The next few paragraphs discuss each case in turn |
| 3853 |
(in reverse order, which is convenient for the explanation). |
| 3854 |
|
| 3855 |
EXPIRED AND DELETED. Take no action. |
| 3856 |
|
| 3857 |
EXPIRED AND ARCHIVED. If the article is readily accessible |
| 3858 |
and can be deleted or made unreadable easily, treat as under |
| 3859 |
AVAILABLE below. Otherwise treat as under EXPIRED AND |
| 3860 |
DELETED. |
| 3861 |
|
| 3862 |
NOTE: While it is desirable for archived articles |
| 3863 |
to be cancellable, this can easily involve rewrit- |
| 3864 |
ing an entire archive volume just to get rid of |
| 3865 |
one article, perhaps with manual actions required |
| 3866 |
to arrange it. It is difficult to envision a sit- |
| 3867 |
uation so dire as to require such measures from |
| 3868 |
hundreds or thousands of administrators, or for |
| 3869 |
that matter one in which widespread compliance |
| 3870 |
with such a request is likely. |
| 3871 |
|
| 3872 |
AVAILABLE. Compare the mailing addresses from the From |
| 3873 |
lines of the cancel message and the target article, bearing |
| 3874 |
in mind that local parts (except for "postmaster") are case- |
| 3875 |
sensitive and domains are case-insensitive. If they do not |
| 3876 |
match, either refer the issue to an administrator for a |
| 3877 |
case-by-case decision, or treat as if they matched. |
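The comparison rule can be sketched as follows. The sketch assumes that the mailing addresses have already been extracted from the From headers as bare "local-part@domain" strings; the function names are hypothetical.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split "local-part@domain", lower-casing only the domain."""
    local, sep, domain = address.rpartition("@")
    if not (sep and local and domain):
        raise ValueError(f"not a local-part@domain address: {address!r}")
    return local, domain.lower()


def same_mailbox(a: str, b: str) -> bool:
    """Compare two addresses: domains ignore case, local parts do not."""
    local_a, domain_a = split_address(a)
    local_b, domain_b = split_address(b)
    if domain_a != domain_b:
        return False
    # "postmaster" is the one local part compared without regard to case.
    if local_a.lower() == "postmaster" and local_b.lower() == "postmaster":
        return True
    return local_a == local_b
```

A False result here does not mean the cancel is dropped; per the rule above, the relayer either refers it to an administrator or proceeds as if the addresses matched.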
| 3878 |
|
| 3879 |
NOTE: It is generally trivial to forge articles, |
| 3880 |
so nothing short of cryptographic authentication |
| 3881 |
is really adequate to ensure that a cancel came |
| 3882 |
from the original article's author. Moreover, it |
| 3883 |
is highly desirable to permit authorities other |
| 3884 |
than the author to cancel articles, to allow for |
| 3885 |
cases in which the author is unavailable, uncoop- |
| 3886 |
erative, or malicious, and in which damage and/or |
| 3887 |
legal problems may be minimized by prompt cancel- |
| 3888 |
lation. Reliable authentication that would permit |
| 3901 |
such administrative cancels would be a worthwhile |
| 3902 |
extension to this Draft, and experimental work in |
| 3903 |
this area is encouraged. |
| 3904 |
|
| 3905 |
NOTE: Meanwhile, a simple check of addresses is |
| 3906 |
useful accident prevention and catches at least |
| 3907 |
the most simple-minded forgers. Since the intent |
| 3908 |
is accident prevention rather than ironclad secu- |
| 3909 |
rity, use of the From address is appropriate, all |
| 3910 |
the more so because in the presence of gateways |
| 3911 |
(especially redundant multiple gateways), the |
| 3912 |
author may not have full control over Sender head- |
| 3913 |
ers. |
| 3914 |
|
| 3915 |
NOTE: The "refer... or treat as if they matched" |
| 3916 |
rule is intended to specifically forbid quietly |
| 3917 |
ignoring cancels with mismatched addresses. |
| 3918 |
|
| 3919 |
If the addresses match, then if technically possible, the |
| 3920 |
relayer MUST delete the target article completely and imme- |
| 3921 |
diately. Failing that, it MUST make the target article |
| 3922 |
unreadable (preferably to everyone, minimally to everyone |
| 3923 |
but the administrator) and either arrange for it to be |
| 3924 |
deleted as soon as possible or notify an administrator at |
| 3925 |
once. |
| 3926 |
|
| 3927 |
NOTE: To allow for events such as criminal |
| 3928 |
actions, malicious forgeries, and copyright |
| 3929 |
infringements, where damage and/or legal problems |
| 3930 |
may be minimized by prompt cancellation, complete |
| 3931 |
removal is strongly preferred over merely making |
| 3932 |
the target article unreadable. The potential for |
| 3933 |
malice is outweighed by the importance of really |
| 3934 |
getting rid of the target article in some legiti- |
| 3935 |
mate cases. (In cases of inadvertent copyright |
| 3936 |
violation in particular, the ability to quickly |
| 3937 |
remedy the violation is of considerable legal |
| 3938 |
importance.) Failing that, making it unreadable |
| 3939 |
is better than nothing. |
| 3940 |
|
| 3941 |
NOTE: Merely annotating the article so that read- |
| 3942 |
ers see an indication that the author wanted it |
| 3943 |
cancelled is not acceptable. Making the article |
| 3944 |
unreadable is the minimum action. |
| 3945 |
|
| 3946 |
NOTE: There have been experiments with making can- |
| 3947 |
celled articles unreadable, so that local news |
| 3948 |
administrators could reverse cancellations. In |
| 3949 |
practice, administrators almost never find cause |
| 3950 |
to do so. Removal appears to be clearly prefer- |
| 3951 |
able where technically feasible. |
| 3952 |
|
| 3953 |
NOT ARRIVED YET. If practical, retain the cancel message |
| 3954 |
until the target article does arrive, or until there is no |
| 3967 |
further possibility of it arriving and being accepted (see |
| 3968 |
section 9.2), and then treat as under AVAILABLE. Failing |
| 3969 |
that, arrange for the target article to be rejected and dis- |
| 3970 |
carded if it does arrive. |
| 3971 |
|
| 3972 |
NOTE: It may well be impractical to retain the |
| 3973 |
control message, given uncertainty about whether |
| 3974 |
the target article will ever arrive. Existing |
| 3975 |
practice in such cases is to assume that addresses |
| 3976 |
would match and arrange the equivalent of dele- |
| 3977 |
tion. This is often done by making a spurious |
| 3978 |
entry in a database of already-seen message IDs |
| 3979 |
(see section 9.3), so that if the article does |
| 3980 |
arrive, it will be rejected as a duplicate. |
| 3981 |
|
| 3982 |
The cancel message MUST be propagated onward in the usual |
| 3983 |
fashion, regardless of which of the four cases applied, so |
| 3984 |
that the target article will be cancelled everywhere even if |
| 3985 |
cancellation and target article follow different routes. |
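The four cases and the unconditional propagation rule might be organized as in the sketch below. The enumeration, the flag names, and the action labels are all hypothetical; in particular, the handling of mismatched addresses in the AVAILABLE case is reduced to a single label here.

```python
from enum import Enum, auto


class TargetState(Enum):
    EXPIRED_AND_DELETED = auto()
    EXPIRED_AND_ARCHIVED = auto()
    AVAILABLE = auto()
    NOT_ARRIVED_YET = auto()


def cancel_actions(state, *, archive_easy=False, can_retain=False):
    """Return the ordered action labels for one target article."""
    if state is TargetState.EXPIRED_AND_ARCHIVED:
        # Readily accessible archives are handled like AVAILABLE articles.
        state = (TargetState.AVAILABLE if archive_easy
                 else TargetState.EXPIRED_AND_DELETED)
    if state is TargetState.EXPIRED_AND_DELETED:
        actions = []
    elif state is TargetState.AVAILABLE:
        actions = ["compare-from-addresses", "delete-or-make-unreadable"]
    elif can_retain:
        actions = ["retain-cancel-until-arrival"]
    else:
        actions = ["reject-target-on-arrival"]
    # The cancel message is propagated onward in every case.
    return actions + ["propagate-cancel"]
```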
| 3986 |
|
| 3987 |
NOTE: RFC 1036 appeared to require stopping cancel |
| 3988 |
propagation in the NOT ARRIVED YET case, although |
| 3989 |
the wording was somewhat unclear. This appears to |
| 3990 |
have been an unwise decision; there are known |
| 3991 |
cases of important cancellations (in situations |
| 3992 |
of, e.g., inadvertent copyright violation) achiev- |
| 3993 |
ing rather poorer propagation than the target |
| 3994 |
article. News propagation is often a much less |
| 3995 |
orderly process than the authors of RFC 1036 |
| 3996 |
apparently envisioned. Modern implementations |
| 3997 |
generally propagate the cancellation regardless. |
| 3998 |
|
| 3999 |
Posting agents meant for use by ordinary posters SHOULD |
| 4000 |
reject an attempt to post a cancel message if the target |
| 4001 |
article is available and the mailing address in its From |
| 4002 |
header does not match the one in the cancel message's From |
| 4003 |
header. |
| 4004 |
|
| 4005 |
NOTE: This, again, is primarily accident preven- |
| 4006 |
tion. |
| 4007 |
|
| 4008 |
|
| 4009 |
7.2. ihave, sendme |
| 4010 |
|
| 4011 |
The ihave and sendme control messages implement a crude |
| 4012 |
batched predecessor of the NNTP [rrr] protocol. They are |
| 4013 |
largely obsolete in the Internet, but still see use in the |
| 4014 |
UUCP environment, especially for backup feeds that normally |
| 4015 |
are active only when a primary feed path has failed. |
| 4016 |
|
| 4017 |
NOTE: The ihave and sendme messages defined here |
| 4018 |
have ABSOLUTELY NOTHING TO DO WITH NNTP, despite |
| 4019 |
similarities of terminology. |
| 4020 |
|
| 4033 |
The two messages share the same syntax: |
| 4034 |
|
| 4035 |
ihave-arguments = *( message-id space ) relayer-name |
| 4036 |
sendme-arguments = ihave-arguments |
| 4037 |
ihave-body = *( message-id eol ) |
| 4038 |
sendme-body = ihave-body |
| 4039 |
|
| 4040 |
Message IDs MUST appear in either the arguments or the body, |
| 4041 |
but not both. Relayers SHOULD generate the form putting |
| 4042 |
message IDs in the body, but the other form MUST be sup- |
| 4043 |
ported for backward compatibility. |
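A relayer accepting both forms might parse them as sketched below. The function name is hypothetical, and the sketch does not validate the message IDs or the relayer name; it only enforces the "not both" rule.

```python
def parse_ihave(arguments: str, body: str) -> tuple[str, list[str]]:
    """Return (relayer_name, message_ids) for an ihave or sendme message."""
    # The relayer name is the last argument; unpacking raises ValueError
    # if the arguments are empty.
    *arg_ids, relayer = arguments.split()
    body_ids = body.split()
    if arg_ids and body_ids:
        raise ValueError("message IDs in both arguments and body")
    return relayer, arg_ids or body_ids
```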
| 4044 |
|
| 4045 |
NOTE: RFC 1036 made the relayer name optional, but |
| 4046 |
difficulties could easily ensue in determining the |
| 4047 |
origin of the message, and this option is believed |
| 4048 |
to be unused nowadays. Putting the message IDs in |
| 4049 |
the body is strongly preferred over putting them |
| 4050 |
in the arguments because it lends itself much bet- |
| 4051 |
ter to large numbers of message IDs and avoids the |
| 4052 |
empty-body problem mentioned in section 4.3.1. |
| 4053 |
|
| 4054 |
The ihave message states that the named relayer has filed |
| 4055 |
articles with the specified message IDs, which may be of |
| 4056 |
interest to the relayer(s) receiving the ihave message. The |
| 4057 |
sendme message requests that the relayer receiving it send |
| 4058 |
the articles having the specified message IDs to the named |
| 4059 |
relayer. |
| 4060 |
|
| 4061 |
These control messages are normally sent essentially as |
| 4062 |
point-to-point messages, by using "to." newsgroups (see sec- |
| 4063 |
tion 5.5) that are sent only to the relayer the messages are |
| 4064 |
intended for. The two relayers MUST be neighbors, exchang- |
| 4065 |
ing news directly with each other. Each relayer advertises |
| 4066 |
its new arrivals to the other using ihave messages, and each |
| 4067 |
uses sendme messages to request the articles it lacks. |
| 4068 |
|
| 4069 |
NOTE: Arguably these point-to-point control mes- |
| 4070 |
sages should flow by some other protocol, e.g. |
| 4071 |
mail, but administrative and interfacing issues |
| 4072 |
are simplified if the news system doesn't need to |
| 4073 |
talk to the mail system. |
| 4074 |
|
| 4075 |
To reduce overhead, ihave and sendme messages SHOULD be sent |
| 4076 |
relatively infrequently and SHOULD contain substantial num- |
| 4077 |
bers of message IDs. If ihave and sendme are being used to |
| 4078 |
implement a backup feed, it may be desirable to insert a |
| 4079 |
delay between reception of an ihave and generation of a |
| 4080 |
sendme, so that a slightly slow primary feed will not cause |
| 4081 |
large numbers of articles to be requested unnecessarily via |
| 4082 |
sendme. |
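The delay suggested for backup feeds could be implemented roughly as follows. The class and method names are hypothetical, times are plain numbers of seconds supplied by the caller, and the choice of delay is left to the administrator.

```python
import heapq


class PendingIhave:
    """Hold advertised message IDs for a delay before requesting them."""

    def __init__(self, delay_seconds):
        self.delay = delay_seconds
        self._queue = []  # heap of (due_time, message_id)

    def advertise(self, message_ids, now):
        """Record IDs from an ihave message received at time `now`."""
        for mid in message_ids:
            heapq.heappush(self._queue, (now + self.delay, mid))

    def due_requests(self, have, now):
        """Pop IDs whose delay has elapsed; return those still missing."""
        wanted = []
        while self._queue and self._queue[0][0] <= now:
            _, mid = heapq.heappop(self._queue)
            if mid not in have:
                wanted.append(mid)
        return wanted
```

Articles that arrive by the primary feed during the delay are in `have` by the time their entries come due, so they are never requested via sendme.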
| 4083 |
|
| 4084 |
|
| 4099 |
7.3. newgroup |
| 4100 |
|
| 4101 |
The newgroup control message requests that a new newsgroup |
| 4102 |
be created: |
| 4103 |
|
| 4104 |
newgroup-arguments = newsgroup-name [ space moderation ] |
| 4105 |
moderation = "moderated" / "unmoderated" |
| 4106 |
newgroup-body = body |
| 4107 |
/ [ body ] descriptor [ body ] |
| 4108 |
descriptor = descriptor-tag eol description-line eol |
| 4109 |
descriptor-tag = "For your newsgroups file:" |
| 4110 |
description-line = newsgroup-name space description |
| 4111 |
description = nonblank-text [ " (Moderated)" ] |
| 4112 |
|
| 4113 |
The first argument names the newsgroup to be created, and |
| 4114 |
the second one (if present) indicates whether it is moder- |
| 4115 |
ated. If there is no second argument, the default is |
| 4116 |
"unmoderated". |
| 4117 |
|
| 4118 |
NOTE: Implementors are warned that there is occa- |
| 4119 |
sional use of other forms in the second argument. |
| 4120 |
It is suggested that such violations of this |
| 4121 |
Draft, which are also violations of RFC 1036, |
| 4122 |
cause the newgroup message to be ignored. RFC |
| 4123 |
1036 was slightly vague about how second arguments |
| 4124 |
other than "moderated" were to be treated (specif- |
| 4125 |
ically, whether they were illegal or just |
| 4126 |
ignored), but it is thought that all existing |
| 4127 |
major implementations will handle "unmoderated" |
| 4128 |
correctly, and it appears desirable to tighten up |
| 4129 |
the specs to make it possible for other forms to |
| 4130 |
be used in future. |
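A relayer following the suggestion in the note above might handle the arguments as sketched here. The function name is hypothetical, and exact, case-sensitive matching of the moderation keyword is an assumption of the sketch.

```python
def parse_newgroup_arguments(arguments: str):
    """Return (newsgroup_name, is_moderated), or None to ignore the message."""
    fields = arguments.split()
    if len(fields) == 1:
        return fields[0], False  # no second argument: "unmoderated"
    if len(fields) == 2 and fields[1] in ("moderated", "unmoderated"):
        return fields[0], fields[1] == "moderated"
    # Any other form violates the syntax; the message is ignored.
    return None
```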
| 4131 |
|
| 4132 |
The body is a comment, which software MUST ignore, except |
| 4133 |
that if it contains a descriptor, the description line is |
| 4134 |
intended to be suitable for addition to a list of newsgroup |
| 4135 |
descriptions. The description cannot be continued onto |
| 4136 |
later lines, but is not constrained to any particular |
| 4137 |
length. Moderated newsgroups have descriptions that end |
| 4138 |
with the string " (Moderated)" (note that this string begins |
| 4139 |
with a blank). |
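Extraction of the description line might look roughly like this. The function name is hypothetical; the sketch returns the first descriptor found and assumes that the newsgroup name and description are separated by blanks or tabs.

```python
DESCRIPTOR_TAG = "For your newsgroups file:"


def find_description(body: str):
    """Return (newsgroup_name, description, is_moderated), or None."""
    lines = body.splitlines()
    for i, line in enumerate(lines[:-1]):
        # The tag must match exactly (case-sensitive), alone on its line.
        if line != DESCRIPTOR_TAG:
            continue
        fields = lines[i + 1].split(None, 1)
        if len(fields) != 2:
            return None
        name, description = fields
        return name, description, description.endswith(" (Moderated)")
    return None
```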
| 4140 |
|
| 4141 |
NOTE: It is unfortunate that the description line |
| 4142 |
is part of the body, rather than being supplied in |
| 4143 |
a header, but this is established practice. News- |
| 4144 |
group creators are cautioned that the descriptor |
| 4145 |
tag must be reproduced exactly as given above, |
| 4146 |
alone on a line, and is case-sensitive. (To |
| 4147 |
reduce errors in this regard, posting agents might |
| 4148 |
wish to question or reject newgroup messages which |
| 4149 |
do not contain a descriptor.) Given the desire |
| 4150 |
for short lines, description writers should avoid |
| 4151 |
content-free phrases like "discussion of" and |
| 4152 |
"news about", and stick to defining what the |
| 4165 |
newsgroup is about. |
| 4166 |
|
| 4167 |
The remainder of the body SHOULD contain an explanation of |
| 4168 |
the purpose of the newsgroup and the decision to create it. |
| 4169 |
|
| 4170 |
NOTE: Criteria for newsgroup creation vary widely |
| 4171 |
and are outside the scope of this Draft, but if |
| 4172 |
formal procedures of one kind or another were fol- |
| 4173 |
lowed in the decision, the body should mention |
| 4174 |
this. Administrators often look for such informa- |
| 4175 |
tion when deciding whether to comply with cre- |
| 4176 |
ation/deletion requests. |
| 4177 |
|
| 4178 |
A newgroup message which lacks an Approved header MUST be |
| 4179 |
ignored. |
| 4180 |
|
| 4181 |
NOTE: It would also be desirable to ignore a new- |
| 4182 |
group message unless its Approved header names a |
| 4183 |
person who is authorized (in some sense) to create |
| 4184 |
such a newsgroup. A cooperating subnet with suf- |
| 4185 |
ficiently strong coordination to maintain a cor- |
| 4186 |
rect and current list of authorized creators might |
| 4187 |
wish to do so for its internal newsgroups. It |
| 4188 |
also (or alternatively) might wish to ignore a |
| 4189 |
newgroup message for an internal newsgroup that |
| 4190 |
was posted (or cross-posted) to a non-internal |
| 4191 |
newsgroup. |
| 4192 |
|
| 4193 |
NOTE: As mentioned in section 6.10, some form of |
| 4194 |
(cryptographic?) authentication of Approved head- |
| 4195 |
ers would be highly desirable, especially for con- |
| 4196 |
trol messages. |
| 4197 |
|
| 4198 |
It would be desirable to provide some way of supplying a |
| 4199 |
moderator's address in a newgroup message for a moderated |
| 4200 |
newsgroup, but this will cause problems unless effective |
| 4201 |
authentication is available, so it is left for future work. |
| 4202 |
|
| 4203 |
NOTE: This leaves news administrators stuck with |
| 4204 |
the annoying chore of arranging proper mailing of |
| 4205 |
moderated-newsgroup submissions. On Usenet, this |
| 4206 |
can be simplified by exploiting a forwarding |
| 4207 |
facility that some major sites provide: they main- |
| 4208 |
tain forwarding addresses, each the name of a mod- |
| 4209 |
erated newsgroup with all periods (".", ASCII 46) |
| 4210 |
replaced by hyphens ("-", ASCII 45), which forward |
| 4211 |
mail to the current newsgroup moderators. More |
| 4212 |
advice on the subject of forwarding to moderators |
| 4213 |
can be found in the document titled "How to Con- |
| 4214 |
struct the Mailpaths File", posted regularly to |
| 4215 |
the Usenet newsgroups news.lists, news.admin.misc, |
| 4216 |
and news.answers. |
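The forwarding-address convention described in the note above amounts to a simple substitution. In this sketch the function name is hypothetical and the forwarding site's domain is supplied by the caller; no particular site is implied.

```python
def moderator_forwarding_address(newsgroup: str, forwarder: str) -> str:
    """Build the submission address for a moderated newsgroup."""
    # Periods (ASCII 46) become hyphens (ASCII 45).
    return newsgroup.replace(".", "-") + "@" + forwarder
```

For example, with a forwarding site called "forwarder.example", submissions to sci.space.news would be mailed to "sci-space-news@forwarder.example".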
| 4217 |
|
| 4231 |
A newgroup message naming a newsgroup that already exists is |
| 4232 |
requesting a change in the moderation status or description |
| 4233 |
of the newsgroup. The same rules apply. |
| 4234 |
|
| 4235 |
|
| 4236 |
7.4. rmgroup |
| 4237 |
|
| 4238 |
The rmgroup message requests that a newsgroup be deleted: |
| 4239 |
|
| 4240 |
rmgroup-arguments = newsgroup-name |
| 4241 |
rmgroup-body = body |
| 4242 |
|
| 4243 |
The sole argument is the newsgroup name. The body is a com- |
| 4244 |
ment, which software MUST ignore; it SHOULD contain an |
| 4245 |
explanation of the decision to delete the newsgroup. |
| 4246 |
|
| 4247 |
NOTE: Criteria for newsgroup deletion vary widely |
| 4248 |
and are outside the scope of this Draft, but if |
| 4249 |
formal procedures of one kind or another were fol- |
| 4250 |
lowed in the decision, the body should mention |
| 4251 |
this. Administrators often look for such informa- |
| 4252 |
tion when deciding whether to comply with cre- |
| 4253 |
ation/deletion requests. |
| 4254 |
|
| 4255 |
A rmgroup message which lacks an Approved header MUST be |
| 4256 |
ignored. |
| 4257 |
|
| 4258 |
NOTE: It would also be desirable to ignore a |
| 4259 |
rmgroup message unless its Approved header names a |
| 4260 |
person who is authorized (in some sense) to delete |
| 4261 |
such a newsgroup. A cooperating subnet with suf- |
| 4262 |
ficiently strong coordination to maintain a cor- |
| 4263 |
rect and current list of authorized deleters might |
| 4264 |
wish to do so for its internal newsgroups. It |
| 4265 |
also (or alternatively) might wish to ignore a |
| 4266 |
rmgroup message for an internal newsgroup that was |
| 4267 |
posted (or cross-posted) to a non-internal news- |
| 4268 |
group. |
| 4269 |
|
| 4270 |
Because unexpected deletion of a newsgroup is a disruptive
action, implementations are strongly advised to refer
rmgroup messages to an administrator by default, unless perhaps
the message can be determined to have originated within
a cooperating subnet whose members are considered
trustworthy. Abuses have occurred.
| 4276 |
|
| 4277 |
|
| 4278 |
7.5. sendsys, version, whogets |
| 4279 |
|
| 4280 |
The sendsys message requests that a description of the |
| 4281 |
relayer's news feeds to other relayers be mailed to the |
| 4282 |
article's reply address: |
| 4283 |
|
| 4297 |
sendsys-arguments = [ relayer-name ] |
| 4298 |
sendsys-body = body |
| 4299 |
|
| 4300 |
If there is an argument, relayers other than the one named
by the argument MUST NOT respond. The body is a comment,
which software MUST ignore; it SHOULD contain an explanation
of the reason for the request.
| 4304 |
|
| 4305 |
The version message requests that the name and version of |
| 4306 |
the relayer software be mailed to the reply address: |
| 4307 |
|
| 4308 |
version-arguments = |
| 4309 |
version-body = body |
| 4310 |
|
| 4311 |
There are no arguments. The body is a comment, which soft- |
| 4312 |
ware MUST ignore; it SHOULD contain an explanation of the |
| 4313 |
reason for the request. |
| 4314 |
|
| 4315 |
The whogets message requests that a description of the |
| 4316 |
relayer and its news feeds to other relayers be mailed to |
| 4317 |
the article's reply address: |
| 4318 |
|
| 4319 |
whogets-arguments = newsgroup-name [ space relayer-name ] |
| 4320 |
whogets-body = body |
| 4321 |
|
| 4322 |
The first argument is the name of the "target newsgroup", |
| 4323 |
specifying the newsgroup for which propagation information |
| 4324 |
is desired. This MUST be a complete newsgroup name, not the |
| 4325 |
name of a hierarchy or a portion of a newsgroup name that is |
| 4326 |
not itself the name of a newsgroup. If there is a second |
| 4327 |
argument, only the relayer named by that argument should |
| 4328 |
respond. The body is a comment, which software MUST ignore; |
| 4329 |
it SHOULD contain an explanation of the reason for the |
| 4330 |
request. |
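
As an illustration only, the three argument syntaxes above can be checked with a few lines of Python. This is a minimal sketch, not part of the specification: it assumes the verb and its arguments have already been taken from the Control header as one string, it is liberal about the amount of white space between them, and the names parse_control and addressed_to_me are invented for the example.

```python
def parse_control(value):
    """Split a control-message string into (verb, arguments dict)."""
    words = value.split()
    if not words:
        raise ValueError("empty control message")
    verb, args = words[0], words[1:]
    if verb == "sendsys" and len(args) <= 1:
        # sendsys-arguments = [ relayer-name ]
        return verb, {"relayer": args[0] if args else None}
    if verb == "version" and not args:
        # version takes no arguments
        return verb, {}
    if verb == "whogets" and 1 <= len(args) <= 2:
        # whogets-arguments = newsgroup-name [ space relayer-name ]
        return verb, {"newsgroup": args[0],
                      "relayer": args[1] if len(args) == 2 else None}
    raise ValueError("malformed %s message" % verb)


def addressed_to_me(args, my_name):
    """True if no relayer was named, or if the named relayer is this one."""
    target = args.get("relayer")
    return target is None or target == my_name
```

A relayer would call addressed_to_me with its own relayer name before considering any reply; whether the target newsgroup of a whogets message is a complete, locally known newsgroup name has to be checked separately against the local newsgroup list.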
| 4331 |
|
| 4332 |
NOTE: Whogets is intended as a replacement for |
| 4333 |
sendsys (and version) with a precisely-specified |
| 4334 |
reply format. Since the syntax for specifying |
| 4335 |
what newsgroups get sent to what other relayers |
| 4336 |
varies widely between different forms of relayer |
| 4337 |
software, the only practical way to standardize |
| 4338 |
the reply format is to indicate a specific news- |
| 4339 |
group and ask where THAT newsgroup propagates. |
| 4340 |
The requirement that it be a complete newsgroup |
| 4341 |
name is intended to (largely) avoid the problem of |
| 4342 |
having to answer "yes and no" in cases where not |
| 4343 |
all newsgroups in a hierarchy are sent. |
| 4344 |
|
| 4345 |
Any of these messages lacking an Approved header MUST be |
| 4346 |
ignored. Response to any of these messages SHOULD be |
| 4347 |
delayed for at least 24 hours, and no response should be |
| 4348 |
attempted if the message has been cancelled in that time. |
| 4349 |
Also, no response SHOULD be attempted unless the local part |
| 4350 |
of the destination address is "newsmap". News |
| 4363 |
administrators SHOULD arrange for mail to "newsmap" on their |
| 4364 |
systems to be discarded (without reply) unless legitimate |
| 4365 |
use is in progress. |
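
The response rules above can be summarized as a single predicate. The following Python sketch is illustrative only and makes several assumptions: the reply address is a bare "local-part@domain" string, the caller records when the message arrived and whether it has since been cancelled, and the SHOULD-level rules (the delay and the "newsmap" restriction) are treated as if they were mandatory. The name may_respond is invented for the example.

```python
from datetime import datetime, timedelta

# Minimum delay before answering, per the text above.
RESPONSE_DELAY = timedelta(hours=24)


def may_respond(header_names, reply_address, received_at, now, cancelled):
    """Decide whether a sendsys/version/whogets message may be answered."""
    # Messages lacking an Approved header are ignored outright.
    if "approved" not in {name.lower() for name in header_names}:
        return False
    # No reply if the message was cancelled or the delay has not elapsed.
    if cancelled or now - received_at < RESPONSE_DELAY:
        return False
    # Reply only to an address whose local part is "newsmap".
    local_part = reply_address.rsplit("@", 1)[0]
    return local_part == "newsmap"
```

A relayer using something like this would queue the message on arrival and evaluate the predicate again once the delay has passed, so that a cancellation arriving in the meantime is noticed.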
| 4366 |
|
| 4367 |
NOTE: Because these messages can cause many, many |
| 4368 |
relayers to send mail to one person, such mes- |
| 4369 |
sages, specifying mailing to an innocent person's |
| 4370 |
mailbox, have been forged as a half-witted practi- |
| 4371 |
cal joke. A delay gives administrators time to |
| 4372 |
notice a fraudulent message and act (by cancelling |
| 4373 |
the message, preparing to divert the flood of mail |
| 4374 |
into the bit bucket, or both). Restriction of the |
| 4375 |
destination address to "newsmap" reduces the |
| 4376 |
appeal of fraud by making it impossible to use it |
| 4377 |
to harass a normal user. (A site which does NOT |
| 4378 |
discard mail to "newsmap", but rather bounces it |
| 4379 |
back, may incur higher communications costs than |
| 4380 |
if the mail had been accepted into a user's mail- |
| 4381 |
box... but a malicious forger could accomplish |
| 4382 |
this anyway, by using an address whose local part |
| 4383 |
is very unlikely to be a legitimate mailbox name.) |
| 4384 |
|
| 4385 |
NOTE: RFC 1036 did not require the Approved header |
| 4386 |
for these control messages. This has been added |
| 4387 |
because of the possibility that cryptographic |
| 4388 |
authentication of Approved headers will become |
| 4389 |
available. |
| 4390 |
|
| 4391 |
The body of the reply to a sendsys message SHOULD be of the |
| 4392 |
form: |
| 4393 |
|
| 4394 |
sendsys-reply = responder 1*sys-line |
| 4395 |
responder = "Responding-System:" space domain eol |
| 4396 |
sys-line = relayer-name ":" newsgroup-patterns [ ":" text ] eol |
| 4397 |
newsgroup-patterns = newsgroup-name *( "," newsgroup-name ) |
| 4398 |
|
| 4399 |
The first line identifies the responding system, using a |
| 4400 |
syntax resembling a header (but note that it is part of the |
| 4401 |
BODY). Remaining lines indicate what newsgroups are sent to |
| 4402 |
what other systems. The syntax of newsgroup patterns is not |
| 4403 |
well standardized; the form described is common (often with |
| 4404 |
newsgroup names only partially given, denoting all names |
| 4405 |
starting with a particular set of components) but not uni- |
| 4406 |
versal. The whogets message provides a better-defined |
| 4407 |
alternative. |
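
For concreteness, a reply body in the sendsys-reply form might be produced as in the Python sketch below. It is illustrative only: the function name and the shape of the feeds argument (a list of relayer name, pattern list, optional text) are assumptions made for the example, and EOL is written as a newline character.

```python
def format_sendsys_reply(domain, feeds):
    """Build a sendsys reply body: one responder line, then one sys-line per feed."""
    if not feeds:
        raise ValueError("sendsys-reply needs at least one sys-line")
    lines = ["Responding-System: " + domain]
    for relayer, patterns, text in feeds:
        # sys-line = relayer-name ":" newsgroup-patterns [ ":" text ]
        line = relayer + ":" + ",".join(patterns)
        if text:
            line += ":" + text
        lines.append(line)
    return "\n".join(lines) + "\n"
```

Because the pattern syntax is not well standardized, a receiver of such a reply cannot rely on much more than the colon-separated layout.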
| 4408 |
|
| 4409 |
The reply to a version message is of somewhat ill-defined |
| 4410 |
form, with a body normally consisting of a single line of |
| 4411 |
text that somehow describes the version of the relayer soft- |
| 4412 |
ware. The whogets message provides a better-defined alter- |
| 4413 |
native. |
| 4414 |
|
| 4415 |
The body of the reply to a whogets message MUST be of the |
| 4416 |
form: |
| 4417 |
|
| 4429 |
whogets-reply = responder-domain responder-relayer response-date |
| 4430 |
responding-to arrived-via responder-version |
| 4431 |
whogets-delimiter *pass-line |
| 4432 |
responder-domain = "Responding-System:" space domain eol |
| 4433 |
responder-relayer = "Responding-Relayer:" space relayer-name eol |
| 4434 |
response-date = "Response-Date:" space date eol |
| 4435 |
responding-to = "Responding-To:" space message-id eol |
| 4436 |
arrived-via = "Arrived-Via:" path-list eol |
| 4437 |
responder-version = "Responding-Version:" space nonblank-text eol |
| 4438 |
whogets-delimiter = eol |
| 4439 |
pass-line = relayer-name [ space domain ] eol |
| 4440 |
|
| 4441 |
The first six lines identify the responding relayer by its |
| 4442 |
Internet domain name (use of the ".uucp" and ".bitnet" |
| 4443 |
pseudo-domains is permissible, for registered hosts in them, |
| 4444 |
but discouraged) and its relayer name, specify the date when |
| 4445 |
the reply was generated and the message ID of the whogets |
| 4446 |
message being replied to, give the path list (from the Path |
| 4447 |
header) of the whogets message (which MAY, if absolutely |
| 4448 |
necessary, be truncated to a convenient length, but MUST |
| 4449 |
contain at least the leading three relayer names), and indi- |
| 4450 |
cate the version of relayer software responding. Note that |
| 4451 |
these lines are part of the BODY even though their format |
| 4452 |
resembles that of headers. Despite the apparently-fixed |
| 4453 |
order specified by the syntax above, they can appear in any |
| 4454 |
order, but there must be exactly one of each. |
| 4455 |
|
| 4456 |
After those preliminaries, and an empty line to unambigu- |
| 4457 |
ously define their end, the remaining lines are the relayer |
| 4458 |
names (which MAY be accompanied by the corresponding domain |
| 4459 |
names, if known) of systems which the responding system |
| 4460 |
passes the target newsgroup to. Only the names of news |
| 4461 |
relayers are to be included. |
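
A mapping program receiving whogets replies has to accept the six preliminary lines in any order, insist on exactly one of each, and then read the pass lines after the empty line. The Python sketch below shows one way this might be done. It is illustrative only: it assumes EOL is a newline character, it does not validate the individual field values, and the name parse_whogets_reply is invented for the example.

```python
REQUIRED_FIELDS = ("Responding-System", "Responding-Relayer", "Response-Date",
                   "Responding-To", "Arrived-Via", "Responding-Version")


def parse_whogets_reply(body):
    """Return (fields dict, list of (relayer-name, domain-or-None))."""
    head, delimiter, tail = body.partition("\n\n")
    if not delimiter:
        raise ValueError("missing empty line after preliminaries")
    fields = {}
    for line in head.split("\n"):
        # Split at the first colon only; dates contain further colons.
        name, colon, value = line.partition(":")
        if not colon or name not in REQUIRED_FIELDS:
            raise ValueError("unexpected line: %r" % line)
        if name in fields:
            raise ValueError("duplicate field: " + name)
        fields[name] = value.strip()
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    passes = []
    for line in tail.splitlines():
        words = line.split()
        if words:
            # pass-line = relayer-name [ space domain ]
            passes.append((words[0], words[1] if len(words) > 1 else None))
    return fields, passes
```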
| 4462 |
|
| 4463 |
NOTE: It is desirable for a reply to identify its |
| 4464 |
source by both domain name and relayer name |
| 4465 |
because news propagation is governed by the latter |
| 4466 |
but location in a broader context is best deter- |
| 4467 |
mined by the former. The date and whogets message |
| 4468 |
ID should, in principle, be present in the MAIL |
| 4469 |
headers, but are included in the body for robust- |
| 4470 |
ness in the presence of uncooperative mail sys- |
| 4471 |
tems. The reason for the path list is discussed |
| 4472 |
below. Adding version information eliminates the |
| 4473 |
need for a separate message to gather it. |
| 4474 |
|
| 4475 |
NOTE: The limitation of pass lines to contain only |
| 4476 |
names of news relayers is meant to exclude names |
| 4477 |
used within a single host (as identifiers for mail |
| 4478 |
gateways, portions of ihave/sendme implementa- |
| 4479 |
tions, etc.), which do not actually refer to other |
| 4480 |
hosts. |
| 4481 |
|
| 4495 |
A relayer which is unaware of the existence of the target |
| 4496 |
newsgroup MUST NOT reply to a whogets message at all, |
| 4497 |
although this MUST NOT influence decisions on whether to |
| 4498 |
pass the article on to other relayers. |
| 4499 |
|
| 4500 |
NOTE: While this may result in discontinuous maps |
| 4501 |
in cases where some hosts have not honored |
| 4502 |
requests for creation of a newsgroup, it will also |
| 4503 |
prevent a flood of useless responses in the event |
| 4504 |
that a whogets message intended to map a small |
| 4505 |
region "leaks" out to a larger one. The possibil- |
| 4506 |
ity of discontinuous recognition of a newsgroup |
| 4507 |
does make it important that the whogets message |
| 4508 |
itself continue to propagate (if other criteria |
| 4509 |
permit). This is also the reason for the inclu- |
| 4510 |
sion of the whogets message's path list, or at |
| 4511 |
least the leading portion of it, in the reply: to |
| 4512 |
permit reconstruction of at least small gaps in |
| 4513 |
maps. |
| 4514 |
|
| 4515 |
Different networks set different rules for the legitimacy of |
| 4516 |
these messages, given that they may reveal details of orga- |
| 4517 |
nization-internal topology that are sometimes considered |
| 4518 |
proprietary. |
| 4519 |
|
| 4520 |
NOTE: On Usenet, in particular, willingness to |
| 4521 |
respond to these messages is held to be a condi- |
| 4522 |
tion of network membership: the topology of Usenet |
| 4523 |
is public information. Organizations wishing to |
| 4524 |
belong to such networks while keeping their inter- |
| 4525 |
nal topology confidential might wish to organize |
| 4526 |
their internal news software so that all articles |
| 4527 |
reaching outsiders appear to be from a single |
| 4528 |
"gatekeeper" system, with the details of internal |
| 4529 |
topology hidden behind that system. |
| 4530 |
|
| 4531 |
UNRESOLVED ISSUE: It might be useful to have a way |
| 4532 |
to set some sort of hop limit for these. |
| 4533 |
|
| 4534 |
|
| 4535 |
7.6. checkgroups |
| 4536 |
|
| 4537 |
The checkgroups control message contains a supposedly |
| 4538 |
authoritative list of the valid newsgroups within some sub- |
| 4539 |
set of the newsgroup name space: |
| 4540 |
|
| 4541 |
checkgroups-arguments = |
| 4542 |
checkgroups-body = [ invalidation ] valid-groups |
| 4543 |
/ invalidation |
| 4544 |
invalidation = "!" plain-component *( "," plain-component ) eol |
| 4545 |
valid-groups = 1*( description-line eol ) |
| 4546 |
|
| 4547 |
There are no arguments. The body lines (except possibly for |
| 4548 |
an initial invalidation) each contain a description line for |
| 4561 |
a newsgroup, as defined under the newgroup message (section |
| 4562 |
7.3). |
| 4563 |
|
| 4564 |
NOTE: Some other, ill-defined, forms of the check- |
| 4565 |
groups body were formerly used. See appendix A. |
| 4566 |
|
| 4567 |
The checkgroups message applies to all hierarchies contain- |
| 4568 |
ing any of the newsgroups listed in the body. The check- |
| 4569 |
groups message asserts that the newsgroups it lists are the |
| 4570 |
only newsgroups in those hierarchies. If there is an inval- |
| 4571 |
idation, it asserts that the hierarchies it names no longer |
| 4572 |
contain any newsgroups. |
| 4573 |
|
| 4574 |
Processing a checkgroups message MAY cause a local list of |
| 4575 |
newsgroup descriptions to be updated. It SHOULD also cause |
| 4576 |
the local lists of newsgroups (and their moderation sta- |
| 4577 |
tuses) in the mentioned hierarchies to be checked against |
| 4578 |
the message. The results of the check MAY be used for auto- |
| 4579 |
matic corrective action, or MAY be reported to the news |
| 4580 |
administrator in some way. |
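
The check described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: a "hierarchy" is taken to be named by the first component of a newsgroup name, the newsgroup name is taken to be the first white-space-delimited word of each description line, moderation statuses are not compared, and the result is only a report for the administrator. The function names are invented for the example.

```python
def parse_checkgroups(body):
    """Return (invalidated hierarchy names, set of listed newsgroup names)."""
    lines = [line for line in body.splitlines() if line.strip()]
    invalidated = []
    if lines and lines[0].startswith("!"):
        # invalidation = "!" plain-component *( "," plain-component )
        invalidated = lines[0][1:].split(",")
        lines = lines[1:]
    listed = {line.split(None, 1)[0] for line in lines}
    return invalidated, listed


def check_against_local(body, local_groups):
    """Report differences between a checkgroups body and the local list."""
    invalidated, listed = parse_checkgroups(body)
    # Hierarchies the message speaks for: those it lists groups in,
    # plus those it declares empty.
    covered = {name.split(".", 1)[0] for name in listed} | set(invalidated)
    missing = sorted(listed - set(local_groups))
    extra = sorted(group for group in local_groups
                   if group not in listed
                   and group.split(".", 1)[0] in covered)
    return {"missing": missing, "extra": extra}
```

Here "missing" lists newsgroups the message names that are not present locally, and "extra" lists local newsgroups in the affected hierarchies that the message does not name.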
| 4581 |
|
| 4582 |
NOTE: Automatically updating descriptions of |
| 4583 |
existing newsgroups is relatively safe. In the |
| 4584 |
case of newsgroup additions or deletions, simply |
| 4585 |
notifying the administrator is generally the wis- |
| 4586 |
est action, unless perhaps the message can be |
| 4587 |
determined to have originated within a cooperating |
| 4588 |
subnet whose members are considered trustworthy. |
| 4589 |
|
| 4590 |
NOTE: There is a problem with the checkgroups con- |
| 4591 |
cept: not all newsgroups in a hierarchy necessar- |
| 4592 |
ily propagate to the same set of machines. |
| 4593 |
(Notably, there is a set of newsgroups known as |
| 4594 |
the "inet" newsgroups, which have relatively lim- |
| 4595 |
ited distribution but coexist in several hierar- |
| 4596 |
chies with more widely-distributed newsgroups.) |
| 4597 |
The advice of checkgroups should always be taken |
| 4598 |
with a grain of salt, and should never be followed |
| 4599 |
blindly. |
| 4600 |
|
| 4601 |
|
| 4602 |
8. Transmission Formats |
| 4603 |
|
| 4604 |
While this Draft does not specify transmission methods |
| 4605 |
except to place a few constraints on them, there are some |
| 4606 |
data formats used only for transmission that are unique to |
| 4607 |
news. |
| 4608 |
|
| 4609 |
|
| 4610 |
8.1. Batches |
| 4611 |
|
| 4612 |
For efficient bulk transmission and processing of news arti- |
| 4613 |
cles, it is often desirable to transmit a number of them as |
| 4614 |
a single block of data, a "batch". The format of a batch |
| 4627 |
is: |
| 4628 |
|
| 4629 |
batch = 1*( batch-header article ) |
| 4630 |
batch-header = "#! rnews " article-size eol |
| 4631 |
article-size = 1*digit |
| 4632 |
|
| 4633 |
A batch is a sequence of articles, each prefixed by a header |
| 4634 |
line that includes its size. The article size is a decimal |
| 4635 |
count of the octets in the article, counting each EOL as one |
| 4636 |
octet regardless of how it is actually represented. |
| 4637 |
|
| 4638 |
NOTE: A relayer might wish to accept either a sin- |
| 4639 |
gle article or a batch as input. Since "#" cannot |
| 4640 |
appear in a header name, examination of the first |
| 4641 |
octet of the input will reveal its nature. |
| 4642 |
|
| 4643 |
NOTE: In the header line, there is exactly one |
| 4644 |
blank before "rnews", there is exactly one blank |
| 4645 |
after "rnews", and the EOL immediately follows the |
| 4646 |
article size. Beware that some software inserts |
| 4647 |
non-standard trash after the size. |
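
The batch format is simple enough that a short Python sketch can show both directions. The sketch is illustrative only and assumes that each EOL is stored as a single newline octet, so that the length of the byte string equals the article size; the function names are invented for the example. It is deliberately strict and rejects a header line with anything after the size, which an implementation wishing to interoperate with the non-standard software mentioned above might choose to relax.

```python
BATCH_PREFIX = b"#! rnews "


def make_batch(articles):
    """Concatenate articles, each preceded by its "#! rnews size" line."""
    return b"".join(BATCH_PREFIX + b"%d\n" % len(article) + article
                    for article in articles)


def looks_like_batch(data):
    """"#" cannot begin a header name, so the first octet tells the two apart."""
    return data[:1] == b"#"


def split_batch(data):
    """Return the list of articles contained in a batch."""
    articles = []
    pos = 0
    while pos < len(data):
        eol = data.find(b"\n", pos)
        if eol < 0 or not data.startswith(BATCH_PREFIX, pos):
            raise ValueError("bad batch header at offset %d" % pos)
        size_text = data[pos + len(BATCH_PREFIX):eol]
        if not size_text.isdigit():
            raise ValueError("bad article size at offset %d" % pos)
        start = eol + 1
        end = start + int(size_text)
        if end > len(data):
            raise ValueError("batch truncated")
        articles.append(data[start:end])
        pos = end
    if not articles:
        raise ValueError("empty batch")
    return articles
```

Note that the batch is treated purely as data; nothing in it is ever handed to a command interpreter.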
| 4648 |
|
| 4649 |
NOTE: Despite the similarity of this format to the |
| 4650 |
executable-script format used by some operating |
| 4651 |
systems, it is EXTREMELY unwise to just feed |
| 4652 |
incoming batches to a command interpreter in the |
| 4653 |
anticipation that it will run a command named |
| 4654 |
"rnews" to process the batch. Unless arrangements |
| 4655 |
are made to very tightly restrict the range of |
| 4656 |
commands that can be executed by this means, the |
| 4657 |
security implications are disastrous. |
| 4658 |
|
| 4659 |
|
| 4660 |
8.2. Encoded Batches |
| 4661 |
|
| 4662 |
When transmitting news, especially over communications links |
| 4663 |
that are slow or are billed by the bit, it is often desir- |
| 4664 |
able to batch news and apply data compression to the |
| 4665 |
batches. Transmission links sending compressed batches |
| 4666 |
SHOULD use out-of-band means of communication to specify the |
| 4667 |
compression algorithm being used. If there is no way to |
| 4668 |
send out-of-band information along with a batch, the follow- |
| 4669 |
ing encapsulation for a compressed batch MAY be used: |
| 4670 |
|
| 4671 |
ec-batch = "#! " compression-keyword eol compressed-batch |
| 4672 |
compression-keyword = "cunbatch" |
| 4673 |
|
| 4674 |
A line containing a keyword indicating the type of compres- |
| 4675 |
sion is followed by the compressed batch. The only truly |
| 4676 |
widespread compression keyword at present is "cunbatch", |
| 4677 |
indicating compression using the widely-distributed "com- |
| 4678 |
press" program. Other compression keywords MAY be used by |
| 4679 |
mutual agreement between the hosts involved. |
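
A receiver of an encapsulated compressed batch has to separate the keyword line from the binary data without treating the whole input as text. The Python sketch below illustrates this. It is a sketch under stated assumptions: the keyword "x-gzip-batch" is purely hypothetical, standing in for a keyword chosen by mutual agreement, and no decoder for "cunbatch" is shown because the Python standard library has no decoder for the "compress" format. The function and table names are invented for the example.

```python
import gzip

# Decoders by compression keyword; "x-gzip-batch" is a made-up keyword
# used only for this illustration.
DECODERS = {"x-gzip-batch": gzip.decompress}


def split_encoded_batch(data):
    """Return (compression keyword, compressed octets)."""
    eol = data.find(b"\n")
    if not data.startswith(b"#! ") or eol < 0:
        raise ValueError("not an encapsulated compressed batch")
    keyword = data[3:eol]
    # A blank inside the line means this is an ordinary "#! rnews size" header.
    if not keyword or b" " in keyword:
        raise ValueError("not an encapsulated compressed batch")
    return keyword.decode("ascii"), data[eol + 1:]


def decode_batch(data):
    """Strip the keyword line and decompress the rest."""
    keyword, payload = split_encoded_batch(data)
    if keyword not in DECODERS:
        raise ValueError("no decoder for keyword %r" % keyword)
    return DECODERS[keyword](payload)
```

Working on octets throughout avoids the problem of text-handling tools that do not cope well with non-text data.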
| 4680 |
|
| 4693 |
NOTE: An encapsulated compressed batch is NOT, in |
| 4694 |
general, a text file, despite having an initial |
| 4695 |
text line. This combination of text and non-text |
| 4696 |
data is often awkward to handle; for example, |
| 4697 |
standard decompression programs cannot be used |
| 4698 |
without first stripping off the initial line, and |
| 4699 |
that in turn is painful to do because many text- |
| 4700 |
handling tools that are superficially suited to |
| 4701 |
the job do not cope well with non-text data. |
| 4702 |
Hence the recommendation that out-of-band communi- |
| 4703 |
cation be used instead when possible. |
| 4704 |
|
| 4705 |
NOTE: For UUCP transmission, where a batch is typ- |
| 4706 |
ically transmitted by invoking the remote command |
| 4707 |
"rnews" with the batch as its input stream, a |
| 4708 |
plausible out-of-band method for indicating a com- |
| 4709 |
pression type would be to give a compression key- |
| 4710 |
word in an option to "rnews", perhaps in the form: |
| 4711 |
|
| 4712 |
rnews -d decompressor |
| 4713 |
|
| 4714 |
where "decompressor" is the name of a decompres- |
| 4715 |
sion program (e.g. "uncompress" for a batch com- |
| 4716 |
pressed with "compress" or "gunzip" for a batch |
| 4717 |
compressed with "gzip"). How this decompression |
| 4718 |
program is located and invoked by the receiving |
| 4719 |
relayer is implementation-specific. |
| 4720 |
|
| 4721 |
NOTE: See the notes in section 8.1 on the inadvis- |
| 4722 |
ability of feeding batches directly to command |
| 4723 |
interpreters. |
| 4724 |
|
| 4725 |
NOTE: There is exactly one blank between "#!" and |
| 4726 |
the compression keyword, and the EOL immediately |
| 4727 |
follows the keyword. |
| 4728 |
|
| 4729 |
|
| 4730 |
8.3. News Within Mail |
| 4731 |
|
| 4732 |
It is often desirable to transmit news as mail, either for |
| 4733 |
the convenience of a human recipient or because that is the |
| 4734 |
only type of transmission available on a restrictive commu- |
| 4735 |
nication path. |
| 4736 |
|
| 4737 |
Given the similarity between the news format and the MAIL |
| 4738 |
format, it is superficially attractive to just send the news |
| 4739 |
article as a mail message. This is typically a mistake: |
| 4740 |
mail-handling software often feels free to manipulate vari- |
| 4741 |
ous headers in undesirable ways (in some cases, such as |
| 4742 |
Sender, such manipulation is actually mandatory), and mail |
| 4743 |
transmission problems etc. MUST be reported to the adminis- |
| 4744 |
trators responsible for the mail transmission rather than to |
| 4745 |
the article's author. In general, news sent as mail should |
| 4746 |
be encapsulated to separate the mail headers and the news |
| 4759 |
headers. |
| 4760 |
|
| 4761 |
When the intended recipient is a human, any convenient form |
| 4762 |
of encapsulation may be used. Recommended practice is to |
| 4763 |
use MIME encapsulation with a content type of "mes- |
| 4764 |
sage/news", given that news articles have additional seman- |
| 4765 |
tics beyond what "message/rfc822" implies. |
| 4766 |
|
| 4767 |
NOTE: "message/news" was registered as a standard |
| 4768 |
subtype by IANA 22 June 1993. |
| 4769 |
|
| 4770 |
When mail is being used as a transmission path between two |
| 4771 |
relayers, however, a standard method is desirable. Cur- |
| 4772 |
rently the standard method is to send the mail to an address |
| 4773 |
whose local part is "rnews", with whatever mail headers are |
| 4774 |
necessary for successful transmission. The news article |
| 4775 |
(including its headers) is sent as the body of the mail mes- |
| 4776 |
sage, with an "N" prepended to each line. |
| 4777 |
|
| 4778 |
NOTE: The "N" reduces the probability of an inno- |
| 4779 |
cent line in a news article being taken as a magic |
| 4780 |
command to mail software, and makes it easy for |
| 4781 |
receiving software to strip off any lines added by |
| 4782 |
mail software (e.g. the trailing empty line added |
| 4783 |
by some UUCP mail software). |
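
The "N" convention can be illustrated with a pair of small Python functions. This is a minimal sketch, assuming that the article is held as a string with newline as EOL and that it ends with an EOL; the function names are invented for the example.

```python
def n_encode(article):
    """Prepend "N" to every line of an article (headers and body alike)."""
    lines = article.split("\n")
    if lines[-1] == "":
        lines.pop()  # the article ends with EOL; no extra empty line
    return "".join("N" + line + "\n" for line in lines)


def n_decode(mail_body):
    """Recover the article, dropping any line that does not start with "N"."""
    return "".join(line[1:] + "\n"
                   for line in mail_body.split("\n")
                   if line.startswith("N"))
```

Dropping lines that lack the "N" is what removes material added in transit, such as a trailing empty line.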
| 4784 |
|
| 4785 |
This method has its weaknesses. In particular, it assumes |
| 4786 |
that the mail transmission channel can transmit nearly- |
| 4787 |
arbitrary body text undamaged. When mail is being used as a |
| 4788 |
transmission path of last resort, however, the mail system |
| 4789 |
often has inconvenient preconceived notions about the format |
| 4790 |
of message bodies. Various ad-hoc encoding schemes have |
| 4791 |
been used to avoid such problems. The recommended method is |
| 4792 |
to send a news article or batch as the body of a MIME mail |
| 4793 |
message, using content type "application/news-transmission" |
| 4794 |
and MIME's "base64" encoding (which is specifically designed |
| 4795 |
to survive all known major mail systems). |
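
As an illustration of the recommended method, the Python sketch below wraps an article or batch in a MIME message using the standard library's email package. It is a sketch only: the sender, recipient and Subject values are placeholders chosen by the caller, and the function names are invented for the example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def wrap_for_mail(payload, sender, recipient):
    """Build a mail message carrying payload as application/news-transmission."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "news transmission"
    # base64 protects the whole article, headers included, in transit.
    message.set_content(payload, maintype="application",
                        subtype="news-transmission", cte="base64")
    return message


def unwrap_from_mail(raw_message):
    """Extract the original octets from a received message."""
    message = message_from_bytes(raw_message, policy=policy.default)
    if message.get_content_type() != "application/news-transmission":
        raise ValueError("not a news transmission")
    return message.get_content()
```

The payload passed in may be a single article or a batch; the receiving relayer can tell which by examining the first octet after decoding.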
| 4796 |
|
| 4797 |
NOTE: In the process, MIME conventions could be |
| 4798 |
used to fragment and reassemble an article which |
| 4799 |
is too large to be sent as a single mail message |
| 4800 |
over a transmission path that restricts message |
| 4801 |
length. In addition, the "conversions" parameter |
| 4802 |
to the content type could be used to indicate what |
| 4803 |
(if any) compression method has been used. And |
| 4804 |
the Content-MD5 header [RFC 1544] can be used as a |
| 4805 |
"checksum" to provide high confidence of detecting |
| 4806 |
accidental damage to the contents. |
| 4807 |
|
| 4808 |
UNRESOLVED ISSUE: The "conversions" parameter no |
| 4809 |
longer exists. What should be done about this, if |
| 4810 |
anything? |
| 4811 |
|
| 4825 |
NOTE: It might look tempting to use a content type |
| 4826 |
such as "message/X-netnews", but MIME bans non- |
| 4827 |
trivial encodings of the entire body of messages |
| 4828 |
with content type "message". The intent is to |
| 4829 |
avoid obscuring nested structure underneath encod- |
| 4830 |
ings. For inter-relayer news transmission, there |
| 4831 |
is no nested structure of interest, and it is |
| 4832 |
important that the entire article (including its |
| 4833 |
headers, not just its body) be protected against |
| 4834 |
the vagaries of intervening mail software. This |
| 4835 |
situation appears to fit the MIME description of |
| 4836 |
circumstances in which "application" is the proper |
| 4837 |
content type. |
| 4838 |
|
| 4839 |
NOTE: "application/news-transmission", with a |
| 4840 |
"conversions" parameter, was registered as a stan- |
| 4841 |
dard subtype by IANA 22 June 1993. |
| 4842 |
|
| 4843 |
UNRESOLVED ISSUE: The "conversions" parameter no |
| 4844 |
longer exists in MIME. What should we do about |
| 4845 |
this? |
| 4846 |
|
| 4847 |
|
| 4848 |
8.4. Partial Batches |
| 4849 |
|
| 4850 |
UNRESOLVED ISSUE: The existing batch conventions |
| 4851 |
assemble (potentially) many articles into one |
| 4852 |
batch. Handling very large articles would be sub- |
| 4853 |
stantially less troublesome if there was also a |
| 4854 |
fragmentation convention for splitting a large |
| 4855 |
article into several batches. Is this worth |
| 4856 |
defining at this time? |
| 4857 |
|
| 4858 |
|
| 4859 |
9. Propagation and Processing |
| 4860 |
|
| 4861 |
Most aspects of news propagation and processing are imple- |
| 4862 |
mentation-specific. The basic propagation algorithms, and |
| 4863 |
certain details of how they are implemented, nevertheless |
| 4864 |
need to be standard. |
| 4865 |
|
| 4866 |
There are two important principles that news implementors |
| 4867 |
(and administrators) need to keep in mind. The first is the |
| 4868 |
well-known Internet Robustness Principle: |
| 4869 |
|
| 4870 |
Be liberal in what you accept, and conservative in what you send. |
| 4871 |
|
| 4872 |
However, in the case of news there is an even more important |
| 4873 |
principle, derived from a much older code of practice, the |
| 4874 |
Hippocratic Oath (we will thus call this the Hippocratic |
| 4875 |
Principle): |
| 4876 |
|
| 4877 |
First, do no harm. |
| 4878 |
|
| 4891 |
It is VITAL to realize that decisions which might be merely |
| 4892 |
suboptimal in a smaller context can become devastating mis- |
| 4893 |
takes when amplified by the actions of thousands of hosts |
| 4894 |
within a few hours. |
| 4895 |
|
| 4896 |
|
| 4897 |
9.1. Relayer General Issues |
| 4898 |
|
| 4899 |
Relayers MUST NOT alter the content of articles |
| 4900 |
unnecessarily. Well-intentioned attempts to "improve" headers, in |
| 4901 |
particular, typically do more harm than good. It is neces- |
| 4902 |
sary for a relayer to prepend its own name to the Path con- |
| 4903 |
tent (see section 5.6) and permissible for it to rewrite or |
| 4904 |
delete the Xref header (see section 6.12). Relayers MAY |
| 4905 |
delete the thoroughly-obsolete headers described in appendix |
| 4906 |
A.3, although this behavior no longer seems useful enough to |
| 4907 |
encourage. Other alterations SHOULD be avoided at all |
| 4908 |
costs, as per the Hippocratic Principle. |
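
The one alteration that is required, prepending the relayer's name to the Path content, can be done without disturbing anything else in the article. The Python sketch below illustrates the idea. It rests on assumptions not stated in this section: that relayer names in the Path content are separated by "!", that the Path header is not continued onto a second line, and that EOL is a newline character. The function name is invented for the example.

```python
def prepend_path(article, relayer_name):
    """Return the article with relayer_name prepended to its Path content."""
    head, separator, body = article.partition("\n\n")
    lines = head.split("\n")
    for index, line in enumerate(lines):
        name, colon, value = line.partition(":")
        if colon and name.lower() == "path":
            lines[index] = name + ": " + relayer_name + "!" + value.strip()
            break
    else:
        raise ValueError("article has no Path header")
    # Every other header, and the body, is passed through untouched.
    return "\n".join(lines) + separator + body
```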
| 4909 |
|
| 4910 |
NOTE: As discussed in section 2.3, tidying up the |
| 4911 |
headers of a user-prepared article is the job of |
| 4912 |
the posting agent, not the relayer. The relayer's |
| 4913 |
purpose is to move already-compliant articles |
| 4914 |
around efficiently without damaging them. Note |
| 4915 |
that in existing implementations, specific pro- |
| 4916 |
grams may contain both posting-agent functions and |
| 4917 |
relayer functions. The distinction is that post- |
| 4918 |
ing-agent functions are invoked only on articles |
| 4919 |
posted by local posters, never on articles |
| 4920 |
received from other relayers. |
| 4921 |
|
| 4922 |
NOTE: A particular corollary of this rule is that |
| 4923 |
relayers should not add headers unless truly nec- |
| 4924 |
essary. In particular, this is not SMTP; do not |
| 4925 |
add Received headers. |
| 4926 |
|
| 4927 |
Relayers MUST NOT pass non-conforming articles on to other |
| 4928 |
relayers, except perhaps in a cooperating subnet that has |
| 4929 |
agreed to permit certain kinds of non-conforming behavior. |
| 4930 |
This is a direct consequence of the Internet Robustness |
| 4931 |
Principle. |
| 4932 |
|
| 4933 |
The two preceding paragraphs may appear to be in conflict. |
| 4934 |
What is to be done when a non-conforming article is |
| 4935 |
received? The Robustness Principle argues that it should be |
| 4936 |
accepted but must not be passed on to other relayers while |
| 4937 |
still non-conforming, and the Hippocratic Principle strongly |
| 4938 |
discourages attempts at repair. The conclusion that this |
| 4939 |
appears to lead to is correct: a non-conforming article MAY |
| 4940 |
be accepted for local filing and processing, or it MAY be |
| 4941 |
discarded entirely, but it MUST NOT be passed on to other |
| 4942 |
relayers. |
| 4943 |
|
| 4957 |
A relayer MUST NOT respond to the arrival of an article by |
| 4958 |
sending mail to any destination, other than a local adminis- |
| 4959 |
trator, except by explicit prearrangement with the recipi- |
| 4960 |
ent. Neither posting an article (other than certain types |
| 4961 |
of control message, see section 7.5) nor being the moderator |
| 4962 |
of a moderated newsgroup constitutes such prearrangement. |
| 4963 |
UNDER NO CIRCUMSTANCES WHATSOEVER may a relayer attempt to |
| 4964 |
send mail to either an article's originator or a moderator. |
| 4965 |
|
| 4966 |
NOTE: Reporting apparent errors in message compo- |
| 4967 |
sition is the job of a posting agent, not a |
| 4968 |
relayer. The same is true of mailing moderated- |
| 4969 |
newsgroup postings to moderators. In networks of |
| 4970 |
thousands of cooperating relayers, it is simply |
| 4971 |
unacceptable for there to be any circumstance |
| 4972 |
whatsoever that causes any significant fraction of |
| 4973 |
them to simultaneously send mail to the same des- |
| 4974 |
tination. (Some control messages are exceptions, |
| 4975 |
although perhaps ill-advised ones.) What might, |
| 4976 |
in a smaller network, be a useful notification or |
| 4977 |
forwarding becomes a deluge of near-identical mes- |
| 4978 |
sages that can bring mail software to its knees |
| 4979 |
and severely inconvenience recipients. Modera- |
| 4980 |
tors, in particular, historically have suffered |
| 4981 |
grievously from this. |
| 4982 |
|
| 4983 |
Notification of problems in incoming articles MAY go to |
| 4984 |
local administrators, or at most (by prearrangement!) to |
| 4985 |
the administrators of the neighboring relayer(s) that passed |
| 4986 |
on the problematic articles. |
| 4987 |
|
| 4988 |
NOTE: It would be desirable to notify the author |
| 4989 |
that his posting is not propagating as he expects. |
| 4990 |
However, there is no known method for doing this |
| 4991 |
that will scale up gracefully. (In particular, |
| 4992 |
"notify only if within N relayers of the origina- |
| 4993 |
tor" falls down in the presence of commercial news |
| 4994 |
services like UUNET: there may be hundreds or |
| 4995 |
thousands of relayers within a couple of hops of |
| 4996 |
the originator.) The best that can be done right |
| 4997 |
now is to notify neighbors, in hopes that the word |
| 4998 |
will eventually propagate up the line, or organize |
| 4999 |
regional monitoring at major hubs. |
| 5000 |
|
| 5001 |
If it is necessary to alter an article, e.g. translate it to |
| 5002 |
another character set or alter its EOL representation, |
| 5003 |
strenuous efforts should be made to ensure that such trans- |
| 5004 |
formations are reversible, and that relayers or other soft- |
| 5005 |
ware that might wish to reverse them know exactly how to do |
| 5006 |
so. |
| 5007 |
|
| 5008 |
NOTE: For example, a cooperating subnet that |
| 5009 |
exchanges articles using a non-ASCII character set |
| 5010 |
like EBCDIC should define a standard, reversible |
| 5023 |
ASCII-EBCDIC mapping and take pains to see that it |
| 5024 |
is used at all points where the subnet meets the |
| 5025 |
outside. If the only reason for using EBCDIC is |
| 5026 |
that the readers typically employ EBCDIC devices, |
| 5027 |
it would be more robust to employ ASCII as the |
| 5028 |
interchange format and do the transformation in |
| 5029 |
the reading and posting agents. |
| 5030 |
|
| 5031 |
|
| 5032 |
9.2. Article Acceptance And Propagation |
| 5033 |
|
| 5034 |
When a relayer first receives an article, it must decide |
| 5035 |
whether to accept it. (This applies regardless of whether |
| 5036 |
the article arrived by itself or as part of a batch, and in |
| 5037 |
principle regardless of whether it originated as a local |
| 5038 |
posting or as traffic from another relayer.) In a cooperat- |
| 5039 |
ing subnet with well-controlled propagation paths, some of |
| 5040 |
the tests specified here MAY be delegated to centrally- |
| 5041 |
located relayers; that is, relayers that can receive news |
| 5042 |
ONLY via one of the central relayers might simplify accep- |
| 5043 |
tance testing based on the assumption that incoming traffic |
| 5044 |
has already passed the full set of tests at a central |
| 5045 |
relayer. |
| 5046 |
|
| 5047 |
The wording that follows is based on a model in which arti- |
| 5048 |
cles arrive on a relayer's host before acceptance tests are |
| 5049 |
done. However, depending on the degree of integration of |
| 5050 |
the transport mechanisms and the relayer, some or all of |
| 5051 |
these tests MAY be done before the article is actually |
| 5052 |
transmitted, so that articles which definitely will not be |
| 5053 |
accepted need not be transmitted at all. |
| 5054 |
|
| 5055 |
The wording that follows also specifies a particular order |
| 5056 |
for the acceptance tests. While this order is the obvious |
| 5057 |
one, the tests MAY be done in any order. |
| 5058 |
|
| 5059 |
First, the relayer MUST verify that the article is a legal |
| 5060 |
news article, with all mandatory headers present with legal |
| 5061 |
contents. |
| 5062 |
|
| 5063 |
NOTE: This check in principle is done by the first |
| 5064 |
relayer to see an article, so an article received |
| 5065 |
from another relayer should always be legal, but |
| 5066 |
there is enough old software still operational |
| 5067 |
that this cannot be taken for granted; see the |
| 5068 |
discussion of the Internet Robustness Principle in |
| 5069 |
section 9.1. |
| 5070 |
|
| 5071 |
Second, the relayer MUST determine whether it has already |
| 5072 |
seen this article (identified by its message ID). This is |
| 5073 |
normally done by retaining a history of all article message |
| 5074 |
IDs seen in the last N days, where the value of N is decided |
| 5075 |
by the relayer's administrator but SHOULD be at least 7. |
| 5076 |
Since N cannot practically be infinite, articles whose Date |
| 5089 |
content indicates that they are older than N days are |
| 5090 |
declared "stale" and are deemed to have been seen already. |
| 5091 |
|
| 5092 |
NOTE: This check is important because news propa- |
| 5093 |
gation topology is typically redundant, often |
| 5094 |
highly so, and it is not at all uncommon for a |
| 5095 |
relayer to receive the same article from several |
| 5096 |
neighbors. The history of already-seen message |
| 5097 |
IDs can get quite large, hence the desire to limit |
| 5098 |
its length... but it is important that it be long |
| 5099 |
enough that slowly-propagating articles are not |
| 5100 |
classed as stale. News propagation within the |
| 5101 |
Internet is normally very rapid, but when UUCP |
| 5102 |
links are involved, end-to-end delays of several |
| 5103 |
days are not rare, so a week is not a particularly |
| 5104 |
generous minimum. |
| 5105 |
|
| 5106 |
NOTE: Despite generally more rapid propagation in |
| 5107 |
recent times, it is still not unheard-of for some |
| 5108 |
propagation paths to be very slow. This can |
| 5109 |
introduce the possibility of old articles arriving |
| 5110 |
again after they are gone from the history. Hence |
| 5111 |
the "stale" rule. |
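
The duplicate test and the "stale" rule can be sketched together as
follows. This is a minimal in-memory illustration, not a description
of any existing relayer: the class and method names are hypothetical,
the article's Date is assumed to have been parsed already, and a real
history would live on disk.

```python
from datetime import datetime, timedelta, timezone


class History:
    """Message IDs seen in the last N days (N is keep_days)."""

    def __init__(self, keep_days: int = 7):
        self.keep = timedelta(days=keep_days)
        self.seen = {}  # message ID -> time it was first seen

    def already_seen(self, message_id: str, article_date: datetime,
                     now: datetime) -> bool:
        """Return True if the article must be treated as a duplicate."""
        if now - article_date > self.keep:
            # Older than the history reaches: "stale", deemed seen.
            return True
        if message_id in self.seen:
            return True
        self.seen[message_id] = now
        return False

    def expire(self, now: datetime) -> None:
        """Forget entries older than N days."""
        cutoff = now - self.keep
        self.seen = {m: t for m, t in self.seen.items() if t >= cutoff}
```

Note that the stale test uses the article's own Date while expiry
uses the time of arrival, so an article that arrives late but is
still younger than N days is accepted exactly once.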
| 5112 |
|
| 5113 |
Third, the relayer MUST determine whether any of the arti- |
| 5114 |
cle's newsgroups are "subscribed to" by the host, i.e. fit a |
| 5115 |
description of what hierarchies or newsgroups the site wants |
| 5116 |
to receive. |
| 5117 |
|
| 5118 |
NOTE: This check is significant because informa- |
| 5119 |
tion on what newsgroups a relayer wishes to |
| 5120 |
receive is often stored at its neighbors, who may |
| 5121 |
not have up-to-date information or may simplify |
| 5122 |
the rules for implementation reasons. As a hedge |
| 5123 |
against the possibility of missed or delayed new- |
| 5124 |
group control messages, relayers may wish to |
| 5125 |
observe a notion of a newsgroup subscription that |
| 5126 |
is independent of the list of newsgroups actually |
| 5127 |
known to the relayer. This would permit reception |
| 5128 |
and relaying of articles in newsgroups that the |
| 5129 |
relayer is not (yet) aware of, subject to more |
| 5130 |
general criteria indicating that they are likely |
| 5131 |
to be of interest. |
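
A subscription that is independent of the list of known newsgroups
can be expressed as a list of patterns. The sketch below is one
possible form, under assumptions that are not part of this Draft:
shell-style wildcards, a leading "!" for exclusion, and "last
matching pattern wins". The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def wanted(newsgroups_header: str, patterns: list[str]) -> bool:
    """True if any newsgroup of the article fits the subscription.

    Patterns are tried in order and the last one that matches decides;
    a leading "!" marks an exclusion (hypothetical syntax).
    """
    for group in newsgroups_header.split(","):
        group = group.strip()
        verdict = False
        for pattern in patterns:
            negated = pattern.startswith("!")
            if fnmatchcase(group, pattern.lstrip("!")):
                verdict = not negated
        if verdict:
            return True
    return False
```

Because the test is on names rather than on a list of known
newsgroups, an article in a newly created comp.* newsgroup is
accepted even if the corresponding newgroup message has not yet
arrived.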
| 5132 |
|
| 5133 |
Once an article has been accepted, it may be passed on to |
| 5134 |
other relayers. The fundamental news propagation rule is a |
| 5135 |
flooding algorithm: on receiving and accepting an article, |
| 5136 |
send it to all neighboring relayers not already in its path |
| 5137 |
list that are sent its newsgroup(s) and distribution(s). |
| 5138 |
|
| 5139 |
NOTE: The path list's role in loop prevention may |
| 5140 |
appear relatively unimportant, given that looping |
| 5141 |
articles would typically be rejected as duplicates |
| 5142 |
anyway. However, the path list's role in |
| 5155 |
preventing superfluous transmissions is not triv- |
| 5156 |
ial. In particular, the path list is the only |
| 5157 |
thing that prevents relayer X, on receiving an |
| 5158 |
article from relayer Y, from sending it back to Y |
| 5159 |
again. (Indeed, the usual symptom of confusion |
| 5160 |
about relayer names is that incoming news loops |
| 5161 |
back in this manner.) The looping articles would |
| 5162 |
be rejected as duplicates, but doubling the commu- |
| 5163 |
nications load on every news transmission path is |
| 5164 |
not to be taken lightly! |
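
The flooding rule can be sketched as a selection of neighbors. The
example below makes several simplifying assumptions: path list
entries are separated by "!", a neighbor's feed is described by a
tuple of hierarchy prefixes, and distributions are ignored for
brevity. The Neighbor type and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    name: str      # relayer name as it appears in path lists
    groups: tuple  # hierarchy prefixes sent to this neighbor (assumed form)


def flood_targets(path: str, newsgroups: str, neighbors) -> list:
    """Names of neighbors that should be sent an accepted article."""
    already_visited = set(path.split("!"))
    groups = newsgroups.split(",")
    targets = []
    for neighbor in neighbors:
        if neighbor.name in already_visited:
            continue  # it has seen the article; do not send it back
        if any(g == p or g.startswith(p + ".")
               for g in groups for p in neighbor.groups):
            targets.append(neighbor.name)
    return targets
```

Note that the only use made of the path list is to skip relayers
that appear in it; no attempt is made to guess which other
neighbors may get the article by some other route.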
| 5165 |
|
| 5166 |
In general, relayers SHOULD NOT make propagation decisions |
| 5167 |
by "anticipation": relayer X, noting that the article's path |
| 5168 |
list already contains relayer Y, decides not to send it to |
| 5169 |
relayer Z because X anticipates that Z will get the article |
| 5170 |
by a better path. If that is generally true, then why is |
| 5171 |
there a news feed from X to Z at all? In fact, the "better |
| 5172 |
path" may be running slowly or may be down. News propaga- |
| 5173 |
tion is very robust precisely because some redundant trans- |
| 5174 |
mission is done "just in case". If it is imperative to |
| 5175 |
limit unnecessary traffic on a path, use of NNTP [rrr] or |
| 5176 |
ihave/sendme (see section 7.2) to pass articles only when |
| 5177 |
necessary is better than arbitrary decisions not to pass |
| 5178 |
articles at all. |
| 5179 |
|
| 5180 |
Anticipation is occasionally justified in special cases. |
| 5181 |
Such cases should involve both (1) a cooperating subnet |
| 5182 |
whose propagation paths are well-understood and well- |
| 5183 |
monitored, with failures and slowdowns noticed and dealt |
| 5184 |
with promptly, and (2) a persistent pattern of heavy unnec- |
| 5185 |
essary traffic on a path that is either slow or costly. In |
| 5186 |
addition, there should be some reason why neither NNTP nor |
| 5187 |
ihave/sendme is suitable as a solution to the problem. |
| 5188 |
|
| 5189 |
|
| 5190 |
9.3. Administrator Contact |
| 5191 |
|
| 5192 |
It is desirable to have a standardized contact address for a |
| 5193 |
relayer's administrators, in the spirit of the "postmaster" |
| 5194 |
address for mail administrators. Mail addressed to "news- |
| 5195 |
master" on a relayer's host MUST go to the administrator(s) |
| 5196 |
of that relayer. Mail addressed to "usenet" on the |
| 5197 |
relayer's host SHOULD be handled likewise. Mail addressed |
| 5198 |
to either address on other hosts using the same news |
| 5199 |
database SHOULD be handled likewise. |
| 5200 |
|
| 5201 |
NOTE: These addresses are case-sensitive, although |
| 5202 |
it would be desirable for sequences equivalent to |
| 5203 |
them using case-insensitive comparison to be han- |
| 5204 |
dled likewise. While "newsmaster" seems the pre- |
| 5205 |
ferred network-independent address, by analogy to |
| 5206 |
"postmaster", there is an existing practice of |
| 5207 |
using "usenet" for this purpose, and so "usenet" |
| 5208 |
should be supported if at all possible (especially |
| 5221 |
on hosts belonging to Usenet!). The address |
| 5222 |
"news" is also sometimes used for purposes like |
| 5223 |
this, but less consistently. |
| 5224 |
|
| 5225 |
|
| 5226 |
10. Gatewaying |
| 5227 |
|
| 5228 |
Gatewaying of traffic between news networks using this Draft |
| 5229 |
and those using other exchange mechanisms can be useful, but |
| 5230 |
must be done cautiously. Gateway administrators are taking |
| 5231 |
on significant responsibilities, and must recognize that the |
| 5232 |
consequences of error can be quite serious. |
| 5233 |
|
| 5234 |
|
| 5235 |
10.1. General Gatewaying Issues |
| 5236 |
|
| 5237 |
This section will primarily address the problems of gateway- |
| 5238 |
ing traffic INTO news networks. Little can be said about |
| 5239 |
the other direction without some specific knowledge of the |
| 5240 |
network(s) involved. However, the two issues are not |
| 5241 |
entirely independent: if a non-news network is gatewayed |
| 5242 |
into a news network at more than one point, traffic injected |
| 5243 |
into the non-news network by one gateway may appear at |
| 5244 |
another as a candidate for injection back into the news net- |
| 5245 |
work. |
| 5246 |
|
| 5247 |
This raises a more general principle, the single most impor- |
| 5248 |
tant issue for gatewaying: |
| 5249 |
|
| 5250 |
Above all, prevent loops. |
| 5251 |
|
| 5252 |
The normal loop prevention of news transmission is vitally |
| 5253 |
dependent on the Message-ID header. Any gateway which finds |
| 5254 |
it necessary to remove this header, alter it, or supersede |
| 5255 |
it (by moving it into the body), MUST take equally effective |
| 5256 |
precautions against looping. |
| 5257 |
|
| 5258 |
NOTE: There are few things more effective at turn- |
| 5259 |
ing news readers into a lynch mob than a malfunc- |
| 5260 |
tioning gateway, or pair of gateways, that takes |
| 5261 |
in news articles, mangles them just enough to pre- |
| 5262 |
vent news relayers from recognizing them as dupli- |
| 5263 |
cates, and regurgitates them back into the news |
| 5264 |
stream. This happens rather too often. |
| 5265 |
|
| 5266 |
Gateway implementors should realize that gateways have all |
| 5267 |
the responsibilities of relayers, plus the added complica- |
| 5268 |
tions introduced by transformations between different infor- |
| 5269 |
mation formats. Much of section 9's discussion of relayer |
| 5270 |
issues is relevant to gateways as well. In particular, |
| 5271 |
gateways SHOULD keep a history of recently-seen articles, as |
| 5272 |
described in section 9.2, and not assume that articles will |
| 5273 |
never reappear. This is particularly important for networks |
| 5274 |
that have their own concept analogous to message IDs: a |
| 5287 |
gateway should keep a history of traffic seen from BOTH |
| 5288 |
directions. |
| 5289 |
|
| 5290 |
If at all possible, articles entering the non-news network |
| 5291 |
SHOULD be marked in some way so that they will NOT be re- |
| 5292 |
gatewayed back into news. Multiple gateways obviously must |
| 5293 |
agree on the marking method used; if it is done by having |
| 5294 |
them know each others' names, name changes MUST be coordi- |
| 5295 |
nated with great care. If marking cannot be done, all |
| 5296 |
transformations MUST be reversible so that a re-gatewayed |
| 5297 |
article is identical to the original (except perhaps for a |
| 5298 |
longer Path header). |
| 5299 |
|
| 5300 |
Gateways MUST NOT pass control messages (articles containing |
| 5301 |
Control, Also-Control, or Supersedes headers) without remov- |
| 5302 |
ing the headers that make them control messages, unless |
| 5303 |
there are compelling reasons to believe that they are rele- |
| 5304 |
vant to both sides and that conventions are compatible. If |
| 5305 |
it is truly desirable to pass them unaltered, suitable pre- |
| 5306 |
cautions MUST be taken to ensure that there is NO POSSIBIL- |
| 5307 |
ITY of a looping control message. |
| 5308 |
|
| 5309 |
NOTE: The damage done by looping articles is mul- |
| 5310 |
tiplied a thousandfold if one of the affected |
| 5311 |
articles is something like a sendsys message (see |
| 5312 |
section 7.3) that requests multiple automatic |
| 5313 |
replies. Most gateways simply should not pass |
| 5314 |
control messages at all. If some unusual reason |
| 5315 |
dictates doing so, gateway implementors and admin- |
| 5316 |
istrators are urged to consider bulletproof rate- |
| 5317 |
limiting measures for the more destructive ones |
| 5318 |
like sendsys, e.g. passing only one per hour no |
| 5319 |
matter how many are offered. |
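
Removing the headers that make an article a control message might
look like the sketch below. It assumes headers are held as a list of
(name, value) pairs and compares names without regard to case as a
conservative choice; the function name is hypothetical.

```python
# The three headers named above as making an article a control message.
CONTROL_HEADERS = {"control", "also-control", "supersedes"}


def defuse_control(headers):
    """Return the headers with the control-message headers removed."""
    return [(name, value) for name, value in headers
            if name.lower() not in CONTROL_HEADERS]
```

A gateway that wishes to retain the information could instead move
the removed contents into "X-" headers of its own choosing.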
| 5320 |
|
| 5321 |
Gateways, like relayers, SHOULD make determined efforts to |
| 5322 |
avoid mangling articles unnecessarily. In the case of gate- |
| 5323 |
ways, some transformations may be inevitable, but keeping |
| 5324 |
them to a minimum and ensuring that they are reversible is |
| 5325 |
still highly desirable. |
| 5326 |
|
| 5327 |
Gateways MUST avoid destroying information. In particular, |
| 5328 |
the restrictions of section 4.2.2 are best taken with a |
| 5329 |
grain of salt in the context of gateways. Information that |
| 5330 |
does not translate directly into news headers SHOULD be |
| 5331 |
retained, perhaps in "X-" headers, both because it may be of |
| 5332 |
interest to sophisticated readers and because it may be cru- |
| 5333 |
cial to tracing propagation problems. |
| 5334 |
|
| 5335 |
Gateway implementors should take particular note of the dis- |
| 5336 |
cussion of mailed replies, or more precisely the ban on |
| 5337 |
same, in section 9.1. Gateway problems MUST be reported to |
| 5338 |
the local administration, not to the innocent originator of |
| 5339 |
traffic. "Gateway problems" here includes all forms of |
| 5340 |
propagation anomaly on the non-news side of the gateway, |
| 5353 |
e.g. unreachable addresses on a mailing list. Note that |
| 5354 |
this requires consideration of possible misbehavior of |
| 5355 |
"downstream" hosts, not just the gateway host. |
| 5356 |
|
| 5357 |
|
| 5358 |
10.2. Header Synthesis |
| 5359 |
|
| 5360 |
News articles prepared by gateways MUST be legal news arti- |
| 5361 |
cles. In particular, they MUST include all of the mandatory |
| 5362 |
headers (see section 5) and MUST fully conform to the |
| 5363 |
restrictions on said headers. This often requires that a |
| 5364 |
gateway function not only as a relayer, but also partly as a |
| 5365 |
posting agent, aiding in the synthesis of a conforming arti- |
| 5366 |
cle from non-conforming input. |
| 5367 |
|
| 5368 |
NOTE: The full-conformance requirement needs par- |
| 5369 |
ticularly careful attention when gatewaying mail- |
| 5370 |
ing lists to news, because a number of constructs |
| 5371 |
that are legal in MAIL headers are NOT permissible |
| 5372 |
in news headers. (Note also that not all mail |
| 5373 |
traffic fully conforms to even the MAIL specifica- |
| 5374 |
tion.) The rest of this section will be phrased |
| 5375 |
in terms of mail-to-news gatewaying, but most of |
| 5376 |
it is more generally applicable. |
| 5377 |
|
| 5378 |
The mandatory headers generally present few problems. |
| 5379 |
|
| 5380 |
If no date information is available, the gateway should sup- |
| 5381 |
ply a Date header with the gateway's current date. If only |
| 5382 |
partial information is available (e.g. date but not time), |
| 5383 |
this should be fleshed out to a full Date header by adding |
| 5384 |
default values, not by mixing in parts of the gateway's cur- |
| 5385 |
rent date. (Defaults should be chosen so that fleshed-out |
| 5386 |
dates will not be in the future!) It may be necessary to |
| 5387 |
map timezone information to the restricted forms permitted |
| 5388 |
in the news Date header. See section 5.1. |
| 5389 |
|
| 5390 |
NOTE: The prohibition of mixing dates is on the |
| 5391 |
theory that it is better to admit ignorance than |
| 5392 |
to lie. |
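
The date rules can be sketched as follows. The choices made here are
assumptions, not requirements of this Draft: a missing time defaults
to 00:00:00, a time without zone information is treated as GMT, and
the output uses the form produced by Python's email.utils, which must
still be checked against the syntax of section 5.1. The function name
is hypothetical.

```python
from datetime import date, datetime, time, timezone
from email.utils import format_datetime


def synthesize_date(known_date=None, known_time=None, now=None) -> str:
    """Build the contents of a Date header from whatever is known."""
    if known_date is None:
        # No date information at all: use the gateway's current date.
        if now is None:
            now = datetime.now(timezone.utc)
        return format_datetime(now, usegmt=True)
    # Partial information: fill in a fixed default rather than
    # borrowing parts of the gateway's current time.
    t = known_time if known_time is not None else time(0, 0, 0)
    stamp = datetime.combine(known_date, t, tzinfo=timezone.utc)
    return format_datetime(stamp, usegmt=True)
```

Midnight is used as the default time because it is the earliest
moment of the day, which keeps a fleshed-out date from landing later
in the day than the present moment.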
| 5393 |
|
| 5394 |
If the author's address as supplied in the original message |
| 5395 |
is not suitable for inclusion in a From header, the gateway |
| 5396 |
MUST transform it so it is, e.g. by use of the "% hack" and |
| 5397 |
the domain address of the gateway. The desire to preserve |
| 5398 |
information is NOT an excuse for violating the rules. If |
| 5399 |
the transformation is drastic enough that there is reason to |
| 5400 |
suspect loss of information, it may be desirable to include |
| 5401 |
the original form in an X- header, but the From header's |
| 5402 |
contents MUST be as specified in section 5.2. |
| 5403 |
|
| 5404 |
If the message contains a Message-ID header, the contents |
| 5405 |
should be dealt with as discussed in section 10.3. If there |
| 5406 |
is no message ID present, it will be necessary to synthesize |
| 5419 |
one, following the news rules (see section 5.3). |
| 5420 |
|
| 5421 |
Every effort should be made to produce a meaningful Subject |
| 5422 |
header; see section 5.4. Many news readers select articles |
| 5423 |
to read based on Subject headers, and inserting a place- |
| 5424 |
holder like "<no subject available>" is considered highly |
| 5425 |
objectionable. Even synthesizing a Subject header by pick- |
| 5426 |
ing out the first half-dozen nouns and adjectives in the |
| 5427 |
article body is better than using a placeholder, since it |
| 5428 |
offers SOME indication of what the article might contain. |
| 5429 |
|
| 5430 |
The contents of the Newsgroups header (section 5.5) are usu- |
| 5431 |
ally predetermined by gateway configuration, but a gateway |
| 5432 |
to a network that has its own concept of newsgroups or dis- |
| 5433 |
cussions might have to make transformations. Such transfor- |
| 5434 |
mations should be reversible; otherwise confusion is likely |
| 5435 |
on both sides. |
| 5436 |
|
| 5437 |
It will rarely be possible for gateways to provide a Path |
| 5438 |
header that is both an accurate history of the relayers the |
| 5439 |
article has passed through AS NEWS and a usable reply |
| 5440 |
address. The history function MUST be given priority; see |
| 5441 |
the discussion in section 5.6. It will usually be necessary |
| 5442 |
for a gateway to supply an empty path list, abandoning the |
| 5443 |
reply function. |
| 5444 |
|
| 5445 |
It is desirable for gatewayed articles to convey as much |
| 5446 |
useful information as possible, e.g. by use of optional news |
| 5447 |
headers (see section 6) when the relevant information is |
| 5448 |
available. Synthesis of optional headers can generally fol- |
| 5449 |
low similar rules. |
| 5450 |
|
| 5451 |
Software synthesizing References headers should note the |
| 5452 |
discussion in section 6.5 concerning the incompatibility |
| 5453 |
between MAIL and news. Also of interest is the possibility |
| 5454 |
of incorporating information from In-Reply-To headers and |
| 5455 |
from attribution lines in the body; an incomplete or some- |
| 5456 |
what conjectural References header is much better than none |
| 5457 |
at all, and reading agents already have to cope with incom- |
| 5458 |
plete or slightly erroneous References lists. |
| 5459 |
|
| 5460 |
|
| 5461 |
10.3. Message ID Mapping |
| 5462 |
|
| 5463 |
This section, like the previous one, is phrased in terms of |
| 5464 |
mail being gatewayed into news, but most of the discussion |
| 5465 |
should be more generally applicable. |
| 5466 |
|
| 5467 |
A particularly sticky problem of gatewaying mail into news |
| 5468 |
is supplying legal news message IDs. Note, in particular, |
| 5469 |
that not all MAIL message IDs are legal in news; the news |
| 5470 |
syntax (specified in section 5.3, with related material in |
| 5471 |
5.2) is more restrictive. Generating a fully-conforming |
| 5472 |
news article from a mail message may require transforming |
| 5485 |
the message ID somewhat. |
| 5486 |
|
| 5487 |
Generation and transformation of message IDs assumes partic- |
| 5488 |
ular importance if a given mailing list (or whatever) is |
| 5489 |
being handled by more than one gateway. It is highly desir- |
| 5490 |
able that the same article contents not appear twice in the |
| 5491 |
same newsgroup, which requires that they receive the same |
| 5492 |
message ID from all gateways. Gateways SHOULD use the fol- |
| 5493 |
lowing algorithm (possibly modified by the later discussion |
| 5494 |
of gatewaying into more than one newsgroup) unless local |
| 5495 |
considerations dictate another: |
| 5496 |
|
| 5497 |
1. Separate message ID from surroundings, if necessary. |
| 5498 |
A plausible method for this is to start at the first |
| 5499 |
"<", end at the next ">", and reject the message if |
| 5500 |
no ">" is found or a second "<" is seen before the |
| 5501 |
">". Also reject the message if the message ID con- |
| 5502 |
tains no "@" or more than one "@", or if it contains |
| 5503 |
no ".". Also reject the message if the message ID |
| 5504 |
contains non-ASCII characters, ASCII control charac- |
| 5505 |
ters, or white space. |
| 5506 |
|
| 5507 |
NOTE: Any legitimate domain will include at |
| 5508 |
least one ".". RFC 822 section 6.2.2 forbids |
| 5509 |
white space in this context when passing mail |
| 5510 |
on to non-MAIL software. |
| 5511 |
|
| 5512 |
2. Delete the leading "<" and trailing ">". Separate |
| 5513 |
message ID into local part and domain at the "@". |
| 5514 |
|
| 5515 |
3. In both components, transliterate leading dots |
| 5516 |
(".", ASCII 46), trailing dots, and dots after the |
| 5517 |
first in sequences of two or more consecutive |
| 5518 |
dots, into underscores (ASCII 95). |
| 5519 |
|
| 5520 |
4. In both components, transliterate disallowed char- |
| 5521 |
acters other than dots (see the definition of |
| 5522 |
<unquoted-char> in section 5.2) to underscores |
| 5523 |
(ASCII 95). |
| 5524 |
|
| 5525 |
5. Form the message ID as |
| 5526 |
|
| 5527 |
"<" local-part "@" domain ">" |
| 5528 |
|
| 5529 |
|
| 5530 |
NOTE: This algorithm is approximately that of Rich |
| 5531 |
Salz's successful gatewaying package. |
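
The five steps above can be sketched in Python as shown below. Two
points are assumptions rather than part of the algorithm as stated.
First, the DISALLOWED set is a stand-in for the complement of
<unquoted-char>; the authoritative definition is in section 5.2 and
the set here must be checked against it. Second, a run of leading or
trailing dots is treated as a whole, so that "a.." becomes "a__".
The function names are hypothetical.

```python
import re

# Assumption: stand-in for the characters excluded by <unquoted-char>
# in section 5.2 (dots are handled separately in step 3).
DISALLOWED = set('!()<>@,;:\\"[]')


def _fix_dots(part: str) -> str:
    """Step 3: leading, trailing, and repeated dots become underscores."""
    part = re.sub(r"^\.+", lambda m: "_" * len(m.group()), part)
    part = re.sub(r"\.+$", lambda m: "_" * len(m.group()), part)
    # In an interior run of dots, keep the first and replace the rest.
    return re.sub(r"\.{2,}",
                  lambda m: "." + "_" * (len(m.group()) - 1), part)


def _fix_chars(part: str) -> str:
    """Step 4: disallowed characters other than dots become underscores."""
    return "".join("_" if c in DISALLOWED else c for c in part)


def map_message_id(field: str):
    """Return a news message ID for a mail Message-ID field, or None
    if the message should be rejected."""
    # Step 1: separate the message ID from its surroundings.
    start = field.find("<")
    if start < 0:
        return None
    end = field.find(">", start)
    if end < 0:
        return None
    inner = field[start + 1:end]  # step 2: "<" and ">" removed
    if "<" in inner:
        return None
    if inner.count("@") != 1 or "." not in inner:
        return None
    # Reject white space, ASCII control characters, and non-ASCII.
    if any(ord(c) <= 32 or ord(c) >= 127 for c in inner):
        return None
    # Step 2: split at the "@"; steps 3 and 4 on both components.
    local, domain = inner.split("@")
    local, domain = (_fix_chars(_fix_dots(p)) for p in (local, domain))
    # Step 5: reassemble.
    return "<" + local + "@" + domain + ">"
```

Since the mapping depends only on the contents of the original
message ID, two gateways running the same code assign the same news
message ID to the same mail message, which is the property wanted.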
| 5532 |
|
| 5533 |
Despite the desire to keep message IDs consistent across |
| 5534 |
multiple gateways, there is also a more subtle issue that |
| 5535 |
can require a different approach. If the same articles are |
| 5536 |
being gatewayed into more than one newsgroup, and it is not |
| 5537 |
possible to arrange that all gateways gateway them to the |
| 5538 |
same cross-posted set of newsgroups, then the message IDs in |
| 5551 |
the different newsgroups MUST be DIFFERENT. |
| 5552 |
|
| 5553 |
NOTE: Otherwise, arrival of an article in one |
| 5554 |
newsgroup will prevent it from appearing in |
| 5555 |
another, and which newsgroup a particular article |
| 5556 |
appears in will be an accident of which direction |
| 5557 |
it arrives from first. It is very difficult to |
| 5558 |
maintain a coherent discussion when each partici- |
| 5559 |
pant sees a randomly-selected 50% of the traffic. |
| 5560 |
The fundamental problem here is that the basic |
| 5561 |
assumption behind message IDs is being violated: |
| 5562 |
the gateways are assigning the same message ID to |
| 5563 |
articles that differ in an important respect |
| 5564 |
(Newsgroups header). |
| 5565 |
|
| 5566 |
In such cases, it is suggested that the newsgroup name, or |
| 5567 |
an agreed-on abbreviation thereof, be prepended to the local |
| 5568 |
part of the message ID (with a separating ".") by the gate- |
| 5569 |
way. This will ensure that multiple gateways generate the |
| 5570 |
same message ID, while also ensuring that different news- |
| 5571 |
groups can be read independently. |
| 5572 |
|
| 5573 |
NOTE: It is preferable to have the gateway(s) |
| 5574 |
cross-post the article, avoiding the issue alto- |
| 5575 |
gether, but this may not be feasible, especially |
| 5576 |
if one newsgroup is widespread and the other is |
| 5577 |
purely local. |
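
The suggested prefixing can be sketched as below. The function name
is hypothetical, and the tag is whatever newsgroup name or
abbreviation the gateways concerned have agreed on.

```python
def group_specific_id(message_id: str, tag: str) -> str:
    """Prepend tag and "." to the local part of a <local@domain> ID."""
    if not (message_id.startswith("<") and message_id.endswith(">")):
        raise ValueError("not a bracketed message ID")
    local, domain = message_id[1:-1].split("@", 1)
    return "<" + tag + "." + local + "@" + domain + ">"
```

For example, the same mail message gatewayed into two newsgroups
under the tags "comp.lang.c" and "local.c" receives two distinct
message IDs, while every gateway feeding "comp.lang.c" still
produces the same one.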
| 5578 |
|
| 5579 |
|
| 5580 |
10.4. Mail to and from News |
| 5581 |
|
| 5582 |
Gatewaying mail to news, and vice-versa, is the most obvious |
| 5583 |
form of news gatewaying. It is common to set up gateways |
| 5584 |
between news and mail rather too casually. |
| 5585 |
|
| 5586 |
It is hard to go very wrong in gatewaying news into a mail- |
| 5587 |
ing list, except for the non-trivial matter of making sure |
| 5588 |
that error reports go to the local administration rather |
| 5589 |
than to the authors of news articles. (This requires atten- |
| 5590 |
tion to the "envelope address" as well as to the message |
| 5591 |
headers.) Doing the reverse connection correctly is much |
| 5592 |
harder than it looks. |
| 5593 |
|
| 5594 |
NOTE: In particular, just feeding the mail message |
| 5595 |
to "inews -h" or the equivalent is NOT, repeat |
| 5596 |
NOT, adequate to gateway mail to news. Signifi- |
| 5597 |
cant gatewaying software is necessary to do it |
| 5598 |
right. Not all headers of mail messages conform |
| 5599 |
to even the MAIL specifications, never mind the |
| 5600 |
stricter rules for news. |
| 5601 |
|
| 5602 |
It is useful to distinguish between two different forms of |
| 5603 |
mail-to-news gatewaying: gatewaying a mailing list into a |
| 5604 |
newsgroup, and operating a "post-by-mail" service in which |
| 5617 |
individual articles can be posted to a newsgroup by mailing |
| 5618 |
them to a specific address. In the first case, the message |
| 5619 |
is already being "broadcast", and the situation can be |
| 5620 |
viewed as gatewaying one form of news into another. The |
| 5621 |
second case is closer to that of a moderator posting submis- |
| 5622 |
sions to a moderated newsgroup. |
| 5623 |
|
| 5624 |
In either case, the discussions in the preceding two sec- |
| 5625 |
tions are relevant, as is the Hippocratic Principle of sec- |
| 5626 |
tion 9. However, some additional considerations are spe- |
| 5627 |
cific to mail-to-news gatewaying. |
| 5628 |
|
| 5629 |
As mentioned in section 6, point-to-point headers like To |
| 5630 |
and Cc SHOULD NOT appear as such in news, although it is |
| 5631 |
suggested that they be transformed to "X-" headers, e.g. X- |
| 5632 |
To and X-Cc, to preserve their information content for pos- |
| 5633 |
sible use by readers or troubleshooters. The Received |
| 5634 |
header is entirely specific to MAIL and SHOULD be deleted |
| 5635 |
completely during gatewaying, except perhaps for the |
| 5636 |
Received header supplied by the gateway host itself. |
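
The treatment of these headers can be sketched as follows. Headers
are assumed to be held as (name, value) pairs in their original
order, and the first Received header is assumed to be the one added
by the gateway host itself, on the grounds that mail software
normally adds each new Received header at the top. The function name
is hypothetical.

```python
RENAME = {"to": "X-To", "cc": "X-Cc"}


def mail_headers_to_news(headers, keep_own_received=True):
    """Rename point-to-point headers and drop Received headers."""
    out = []
    own_received_kept = False
    for name, value in headers:
        key = name.lower()
        if key in RENAME:
            out.append((RENAME[key], value))
        elif key == "received":
            # Keep at most the first (most recent) Received header.
            if keep_own_received and not own_received_kept:
                out.append((name, value))
                own_received_kept = True
        else:
            out.append((name, value))
    return out
```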
| 5637 |
|
| 5638 |
The Sender header is a tricky case, one where mailing-list |
| 5639 |
and post-by-mail practice should differ. For gatewaying |
| 5640 |
mailing lists, the mailing-list host should be considered a |
| 5641 |
relayer, and the From and Sender headers supplied in its |
| 5642 |
transmissions left strictly untouched. For post-by-mail, as |
| 5643 |
for a moderator posting a mailed submission, the Sender |
| 5644 |
header should reflect the poster rather than the author. If |
| 5645 |
a post-by-mail gateway receives a message with its own |
| 5646 |
Sender header, it might wish to preserve the content in an |
| 5647 |
X-Sender header. |
| 5648 |
|
| 5649 |
It will generally be necessary to transform between mail's |
| 5650 |
In-Reply-To/References convention and news's References/See- |
| 5651 |
Also convention, to preserve correct semantics of cross ref- |
| 5652 |
erences. This also requires attention when going the other |
| 5653 |
way, from news to mail. See the discussion of the differ- |
| 5654 |
ence in section 6.5. |
| 5655 |
|
| 5656 |
|
| 5657 |
10.5. Gateway Administration |
| 5658 |
|
| 5659 |
Any news system will benefit from an attentive administra- |
| 5660 |
tor, preferably assisted by automated monitoring for anoma- |
| 5661 |
lies. This is particularly true of gateways. Gateway soft- |
| 5662 |
ware SHOULD be instrumented so that unusual occurrences, |
| 5663 |
such as sudden massive surges in traffic, are reported |
| 5664 |
promptly. It is desirable, in fact, to go further: gateway |
| 5665 |
software SHOULD endeavour to limit damage in the event that |
| 5666 |
the administrator does not respond promptly. |
| 5667 |
|
| 5668 |
NOTE: For example, software might limit the gate- |
| 5669 |
waying rate by queueing incoming traffic and emp- |
| 5670 |
tying the queue at a finite maximum rate (well |
| 5683 |
below the maximum that the host is capable of!) |
| 5684 |
which is set by the administrator and is not |
| 5685 |
raised automatically. |
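
A queue of the kind described in the note can be sketched as below.
The class and its names are hypothetical; time is passed in as a
number of seconds so that the caller decides which clock to use.

```python
from collections import deque


class ThrottledQueue:
    """Release queued articles no faster than a fixed hourly maximum."""

    def __init__(self, max_per_hour: int):
        # Set by the administrator; never raised by the software.
        self.interval = 3600.0 / max_per_hour
        self.pending = deque()
        self.next_release = 0.0

    def offer(self, article) -> None:
        """Queue an incoming article for gatewaying."""
        self.pending.append(article)

    def take(self, now: float):
        """Return one article if the rate limit allows, else None."""
        if self.pending and now >= self.next_release:
            self.next_release = now + self.interval
            return self.pending.popleft()
        return None
```

A sudden surge then shows up as a growing queue, which is itself a
useful thing to report to the administrator, rather than as a flood
of articles into the news network.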
| 5686 |
|
| 5687 |
Traffic gatewayed into a news network SHOULD include a suit- |
| 5688 |
able header, perhaps X-Gateway-Administrator, giving an |
| 5689 |
electronic address that can be used to report problems. |
| 5690 |
This SHOULD be an address that goes direct to a human, not |
| 5691 |
to a "routine administrative issues" mailbox that is exam- |
| 5692 |
ined only occasionally, since the point is to be able to |
| 5693 |
reach the administrator quickly in an emergency. Gateway |
| 5694 |
administrators SHOULD arrange substitutes to cover gateway |
| 5695 |
operation (with suitable redirection of mail) when they are |
| 5696 |
on vacation etc. |
| 5697 |
|
| 5698 |
|
| 5699 |
11. Security And Related Issues |
| 5700 |
|
| 5701 |
Although the interchange format itself raises no significant |
| 5702 |
security issues, the wider context does. |
| 5703 |
|
| 5704 |
|
| 5705 |
11.1. Leakage |
| 5706 |
|
| 5707 |
The most obvious form of security problem with news is |
| 5708 |
"leakage" of articles which are intended to have only |
| 5709 |
restricted circulation. The flooding algorithm is EXTREMELY |
| 5710 |
good at finding any path by which articles can leave a sub- |
| 5711 |
net with supposedly-restrictive boundaries. Substantial |
| 5712 |
administrative effort is required to ensure that local news- |
| 5713 |
groups remain local, unless connections to the outside world |
| 5714 |
are tightly restricted. |
| 5715 |
|
| 5716 |
A related problem is that the sendme control message can be |
| 5717 |
used to ask for any article by its message ID. The useful- |
| 5718 |
ness of this has declined as message-ID generation algo- |
| 5719 |
rithms have become less predictable, but it remains a poten- |
| 5720 |
tial problem for "secure" newsgroups. Hosts with such news- |
| 5721 |
groups may wish to disable the sendme control message |
| 5722 |
entirely. |
| 5723 |
|
| 5724 |
The sendsys, version, and whogets control messages also |
| 5725 |
allow "outsiders" to request information from "inside", |
| 5726 |
which may reveal details of internal topology (etc.) that |
| 5727 |
are considered confidential. (Note that at least limited |
| 5728 |
openness about such matters may be a condition of membership |
| 5729 |
in such networks, e.g. Usenet.) |
| 5730 |
|
| 5731 |
Organizations wishing to control these forms of leakage are |
| 5732 |
strongly advised to designate a small number of "official |
| 5733 |
gateway" hosts to handle all news exchange with the outside |
| 5734 |
world, so that a bounded amount of administrative effort is |
| 5735 |
needed to control propagation and eliminate problems. |
| 5736 |
Attempts to keep news out entirely, by refusing to support |
| 5749 |
an official gateway, typically result in large numbers of |
| 5750 |
unofficial partial gateways appearing over time. Such a |
| 5751 |
configuration is much more difficult to troubleshoot. |
| 5752 |
|
| 5753 |
A somewhat-related problem is the possibility of proprietary |
| 5754 |
material being disclosed unintentionally by a poster who |
| 5755 |
does not realize how far his words will propagate, either |
| 5756 |
from sheer misunderstanding or because of errors made (by |
| 5757 |
human or software) in followup preparation. There is little |
| 5758 |
that can be done about this except education. |
| 5759 |
|
| 5760 |
|
| 5761 |
11.2. Attacks |
| 5762 |
|
| 5763 |
Although the limitations of the medium restrict what can be |
| 5764 |
done to attack a host via news, some possibilities exist, |
| 5765 |
most of them problems news shares with mail. |
| 5766 |
|
| 5767 |
If reading agents are careless about transmitting non- |
| 5768 |
printable characters to output devices, malicious posters |
| 5769 |
may post articles containing control sequences ("letter- |
| 5770 |
bombs") meant to have various destructive effects on output |
| 5771 |
devices. Possible effects depend on the device, but they |
| 5772 |
can include hardware damage (e.g. by repeated writing of |
| 5773 |
values into configuration memories that can tolerate only a |
| 5774 |
limited number of write cycles) and security violation (e.g. |
| 5775 |
by reprogramming function keys potentially used by privi- |
| 5776 |
leged readers). |
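
One defensive approach for a reading agent is to make control characters visible instead of passing them to the output device. The Python sketch below shows the idea; the function name and the choice of caret notation are assumptions for this example, and a real reading agent would also have to take the article's character set into account.

```python
def make_display_safe(text):
    """Replace control characters with harmless visible notation."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in "\n\t":
            # Ordinary layout characters pass through untouched.
            out.append(ch)
        elif code < 0x20 or code == 0x7F:
            # ASCII control characters, e.g. ESC becomes "^[".
            out.append("^" + chr(code ^ 0x40))
        elif 0x80 <= code <= 0x9F:
            # Eight-bit control codes, which some terminals also act on.
            out.append("<%02X>" % code)
        else:
            out.append(ch)
    return "".join(out)
```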
| 5777 |
|
| 5778 |
A more sophisticated variation on the letterbomb is inclu- |
| 5779 |
sion of "Trojan horses" in programs. Obviously, readers |
| 5780 |
must be cautious about using software found in news, but |
| 5781 |
more subtly, reading agents must also exercise care. MIME |
| 5782 |
messages can include material that is executable in some |
| 5783 |
sense, such as PostScript documents (which are programs!), |
| 5784 |
and letterbombs may be introduced into such material. |
| 5785 |
|
| 5786 |
Given the presence of finite resources and other software |
| 5787 |
limitations, some degree of system disruption can be |
| 5788 |
achieved by posting otherwise-innocent material in great |
| 5789 |
volume, either in single huge articles (see section 4.6) or |
| 5790 |
in a stream of modest-sized articles. (Some would say that |
| 5791 |
the steady growth of Usenet volume constitutes a subtle and |
| 5792 |
unintentional attack of the latter type; certainly it can |
| 5793 |
have disruptive effects if administrators are inattentive.) |
| 5794 |
Systems need some ability to cope with surges, because sin- |
| 5795 |
gle huge articles occur occasionally as the result of soft- |
| 5796 |
ware error, innocent misunderstanding, or deliberate malice, |
| 5797 |
and downtime at upstream hosts can cause droughts, followed |
| 5798 |
by floods, of legitimate articles. (There is also a certain |
| 5799 |
amount of normal variation; for example, Usenet traffic is |
| 5800 |
noticeably lighter on weekends and during Christmas holi- |
| 5801 |
days, and rises noticeably at the start of the school term |
| 5802 |
of North American universities.) However, a site that |
| 5815 |
normally receives little traffic may be quite vulnerable to |
| 5816 |
"swamping" attack if its software is insufficiently careful. |
| 5817 |
|
| 5818 |
In general, careless implementation may open doors that are |
| 5819 |
not intrinsic to news. In particular, implementation of |
| 5820 |
control messages (see sections 6.6 and 7) and unbatchers |
| 5821 |
(see sections 8.1 and 8.2) via a command interpreter requires |
| 5822 |
substantial precautions to ensure that only the intended |
| 5823 |
capabilities are available. Care must also be taken that |
| 5824 |
article-supplied text is not fed to programs that have |
| 5825 |
escapes to command interpreters. |
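
One way to ensure that only the intended capabilities are available is to avoid the command interpreter altogether and dispatch through a fixed table of handlers. The Python sketch below assumes a hypothetical handlers mapping supplied by the implementation; it is an illustration of the precaution, not a description of any existing relayer.

```python
def dispatch_control(control_content, handlers):
    """Run the handler for a control message, refusing anything not in the table."""
    words = control_content.split()
    if not words:
        raise ValueError("empty control message")
    name, args = words[0], words[1:]
    handler = handlers.get(name)
    if handler is None:
        # Unknown names are rejected; article-supplied text is never
        # handed to a shell or other command interpreter.
        raise ValueError("unsupported control message: %r" % name)
    return handler(args)
```

The arguments reach the handler as a list of words, so the handler remains responsible for validating them (for example, checking that a supposed message ID really has message-ID syntax).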
| 5826 |
|
| 5827 |
Finally, there is considerable potential for malice in the |
| 5828 |
sendsys, version, and whogets control messages. They are |
| 5829 |
not harmful to the hosts receiving them as news, but they |
| 5830 |
can be used to enlist those hosts (by the thousands) as |
| 5831 |
unwitting allies in a mail-swamping attack on a victim who |
| 5832 |
may not even receive news. The precautions discussed in |
| 5833 |
section 7.5 can reduce the potential for such attacks con- |
| 5834 |
siderably, but the hazard cannot be eliminated as long as |
| 5835 |
these control messages exist. |
| 5836 |
|
| 5837 |
|
| 5838 |
11.3. Anarchy |
| 5839 |
|
| 5840 |
The highly distributed nature of news propagation, and the |
| 5841 |
lack of adequate authentication protocols (especially for |
| 5842 |
use over the less-interactive transport mechanisms such as |
| 5843 |
UUCP), make article forgery relatively straightforward. It |
| 5844 |
may be possible to at least track a forgery to its source, |
| 5845 |
once it is recognized as such, but clever forgers can make |
| 5846 |
even that relatively difficult. The assumption that forg- |
| 5847 |
eries will be recognized as such is also not to be taken for |
| 5848 |
granted; readers are notoriously prone to blindly assuming |
| 5849 |
authenticity. If a forged article's initial path list |
| 5850 |
includes the relayer name of the supposed poster's host, the |
| 5851 |
article will never be sent to that host, and the alleged |
| 5852 |
author may learn about the forgery secondhand or not at all. |
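
The reason the supposed author's host never sees such a forgery can be sketched as follows, assuming (for the purposes of the example) that a relayer declines to pass an article to any neighbour whose relayer name already appears in the path list. The function name and the sample host names are hypothetical.

```python
def neighbours_to_feed(path_content, neighbours):
    """Return the neighbours whose names do not already appear in the path."""
    already_seen = set(path_content.split("!"))
    return [name for name in neighbours if name not in already_seen]
```

With a forged path of "relay1!victim!forger", a relayer whose neighbours are "victim" and "other" would feed the article only to "other".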
| 5853 |
|
| 5854 |
A particularly noxious form of forgery is the forged "can- |
| 5855 |
cel" control message. Notably, it is relatively straight- |
| 5856 |
forward to write software that will automatically send out a |
| 5857 |
(forged) cancel message for any article meeting some crite- |
| 5858 |
rion, e.g. written by a specific author. The authentication |
| 5859 |
problems discussed in section 7.1 make it difficult to solve |
| 5860 |
this without crippling cancel's important functionality. |
| 5861 |
|
| 5862 |
A related problem is the possibility of disagreements over |
| 5863 |
newsgroup creation, on networks where such things are not |
| 5864 |
decided by central authorities. There have been cases of |
| 5865 |
"rmgroup wars", where one poster persistently sends out new- |
| 5866 |
group messages to create a newsgroup and another, equally |
| 5867 |
persistently, sends out rmgroup messages asking that it be |
| 5868 |
removed. This is not particularly damaging, if relayers are |
| 5881 |
configured to be cautious, but can cause serious confusion |
| 5882 |
among innocent third parties who just want to know whether |
| 5883 |
they can use the newsgroup for communication or not. |
| 5884 |
|
| 5885 |
|
| 5886 |
11.4. Liability |
| 5887 |
|
| 5888 |
News shares the legal uncertainty surrounding other forms of |
| 5889 |
electronic communication: what rules apply to this new |
| 5890 |
medium of information exchange? News is a particularly |
| 5891 |
problematic case because it is a broadcast medium rather |
| 5892 |
than a point-to-point one like mail, and analogies to older |
| 5893 |
forms of communication are particularly weak. |
| 5894 |
|
| 5895 |
Are news-carrying hosts common carriers, like the phone com- |
| 5896 |
panies, providing communications paths without having either |
| 5897 |
authority over or responsibility for content? Or are they |
| 5898 |
publishers, responsible for the content regardless of |
| 5899 |
whether they are aware of it or not? Or something in |
| 5900 |
between? Such questions are particularly significant when |
| 5901 |
the content is technically criminal, e.g. some types of sex- |
| 5902 |
ually-oriented material in some jurisdictions, in which case |
| 5903 |
ignorance of its presence may not be an adequate defence. |
| 5904 |
|
| 5905 |
Even in milder situations such as libel or copyright viola- |
| 5906 |
tion, the responsibilities of the poster, his host, and |
| 5907 |
other hosts carrying the traffic are unclear. Note, in par- |
| 5908 |
ticular, the problems arising when the article is a forgery, |
| 5909 |
or when the alleged author claims it is a forgery but cannot |
| 5910 |
prove this. |
| 5911 |
|
| 5912 |
|
| 5913 |
A. Archeological Notes |
| 5914 |
|
| 5915 |
|
| 5916 |
A.1. A-News Article Format |
| 5917 |
|
| 5918 |
The obsolete "A News" article format consisted of exactly |
| 5919 |
five lines of header information, followed by the body. For |
| 5920 |
example: |
| 5921 |
|
| 5922 |
Aeagle.642 |
| 5923 |
news.misc |
| 5924 |
cbosgd!mhuxj!mhuxt!eagle!jerry |
| 5925 |
Fri Nov 19 16:14:55 1982 |
| 5926 |
Usenet Etiquette - Please Read |
| 5927 |
body |
| 5928 |
body |
| 5929 |
body |
| 5930 |
|
| 5931 |
The first line consisted of an "A" followed by an article ID |
| 5932 |
(analogous to a message ID and used for similar purposes). |
| 5933 |
The second line was the list of newsgroups. The third line |
| 5934 |
was the path. The fourth was the date, in the format above |
| 5947 |
(all fields fixed width), resembling an Internet date but |
| 5948 |
not quite the same. The fifth was the subject. |
| 5949 |
|
| 5950 |
This format is documented for archeological purposes only. |
| 5951 |
Do not generate articles in this format. |
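
Software that must read old archives might take such an article apart as in the Python sketch below. It is for reading only; the function name and the dictionary keys are assumptions for this example, and the date format string is inferred from the sample above.

```python
from datetime import datetime

# Inferred from the example date "Fri Nov 19 16:14:55 1982".
A_NEWS_DATE_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_a_news(text):
    """Split an A News article into its five header lines and its body."""
    lines = text.split("\n")
    if len(lines) < 5 or not lines[0].startswith("A"):
        raise ValueError("not an A News article")
    return {
        "article_id": lines[0][1:],   # text after the leading "A"
        "newsgroups": lines[1],
        "path": lines[2],
        "date": datetime.strptime(lines[3], A_NEWS_DATE_FORMAT),
        "subject": lines[4],
        "body": "\n".join(lines[5:]),
    }
```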
| 5952 |
|
| 5953 |
|
| 5954 |
A.2. Early B-News Article Format |
| 5955 |
|
| 5956 |
The obsolete pseudo-Internet article format, used briefly |
| 5957 |
during the transition between the A News format and the mod- |
| 5958 |
ern format, followed the general outline of a MAIL message |
| 5959 |
but with some non-standard headers. For example: |
| 5960 |
|
| 5961 |
From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz) |
| 5962 |
Newsgroups: news.misc |
| 5963 |
Title: Usenet Etiquette -- Please Read |
| 5964 |
Article-I.D.: eagle.642 |
| 5965 |
Posted: Fri Nov 19 16:14:55 1982 |
| 5966 |
Received: Fri Nov 19 16:59:30 1982 |
| 5967 |
Expires: Mon Jan 1 00:00:00 1990 |
| 5968 |
|
| 5969 |
body |
| 5970 |
body |
| 5971 |
body |
| 5972 |
|
| 5973 |
The From header contained the information now found in the |
| 5974 |
Path header, plus possibly the full name now typically found |
| 5975 |
in the From header. The Title header contained what is now |
| 5976 |
the Subject content. The Posted header contained what is |
| 5977 |
now the Date content. The Article-I.D. header contained an |
| 5978 |
article ID, analogous to a message ID and used for similar |
| 5979 |
purposes. The Newsgroups and Expires headers were approxi- |
| 5980 |
mately as now. The Received header contained the date when |
| 5981 |
the latest relayer to process the article first saw it. All |
| 5982 |
dates were in the above format, with all fields fixed width, |
| 5983 |
resembling an Internet date but not quite the same. |
| 5984 |
|
| 5985 |
This format is documented for archeological purposes only. |
| 5986 |
Do not generate articles in this format. |
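
For reading old archives, the correspondence described above can be sketched in Python as follows. The function names and the "Full-Name" key are hypothetical, and the sketch assumes that the full name, when present, follows the path in parentheses as in the example.

```python
def parse_early_b_news(text):
    """Split an early B News article into a header dictionary and a body."""
    head, _, body = text.partition("\n\n")
    headers = {}
    for line in head.split("\n"):
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, body


def modern_view(headers):
    """Map the obsolete header names onto their approximate modern roles."""
    path, _, rest = headers.get("From", "").partition(" (")
    return {
        "Path": path,                       # From carried the path
        "Full-Name": rest.rstrip(")"),      # hypothetical key
        "Subject": headers.get("Title", ""),
        "Date": headers.get("Posted", ""),  # still in the old date format
        "Newsgroups": headers.get("Newsgroups", ""),
    }
```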
| 5987 |
|
| 5988 |
|
| 5989 |
A.3. Obsolete Headers |
| 5990 |
|
| 5991 |
Early versions of news software following the modern format |
| 5992 |
sometimes generated headers like the following: |
| 5993 |
|
| 5994 |
Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP |
| 5995 |
Posting-Version: version B 2.10 2/13/83; site eagle.UUCP |
| 5996 |
Date-Received: Friday, 19-Nov-82 16:59:30 EST |
| 5997 |
|
| 5998 |
Relay-Version contained version information about the |
| 5999 |
relayer that last processed the article. Posting-Version |
| 6000 |
contained version information about the posting agent that |
| 6013 |
posted the article. Date-Received contained the date when |
| 6014 |
the last relayer to process the article first saw it (in a |
| 6015 |
slightly nonstandard format). |
| 6016 |
|
| 6017 |
These headers are documented for archeological purposes |
| 6018 |
only. Do not generate articles using them. |
| 6019 |
|
| 6020 |
|
| 6021 |
A.4. Obsolete Control Messages |
| 6022 |
|
| 6023 |
There once was a senduuname control message, resembling |
| 6024 |
sendsys but requesting transmission of the list of hosts |
| 6025 |
that the receiving host had UUCP connections to. This |
| 6026 |
rapidly ceased to be of much use, and many organizations |
| 6027 |
consider information about their internal connectivity to be |
| 6028 |
confidential. |
| 6029 |
|
| 6030 |
Historically, a checkgroups body consisting of one or two |
| 6031 |
lines, the first of the form "-n newsgroup", caused check- |
| 6032 |
groups to apply to only that single newsgroup. This form is |
| 6033 |
documented for archeological purposes only; do not use it. |
| 6034 |
|
| 6035 |
Historically, an article posted to a newsgroup whose name |
| 6036 |
had exactly three components of which the third was "ctl" |
| 6037 |
signified that the article was to be taken as a control message. |
| 6038 |
The Subject header specified the actions, in the same way |
| 6039 |
the Control header does now. This form is documented for |
| 6040 |
archeological purposes only; do not use it; do not implement |
| 6041 |
it. |
| 6042 |
|
| 6043 |
|
| 6044 |
B. A Quick Tour Of MIME |
| 6045 |
|
| 6046 |
(The editor wishes to thank Luc Rooijakkers; most of this |
| 6047 |
appendix is a lightly-edited version of a summary he kindly |
| 6048 |
supplied.) |
| 6049 |
|
| 6050 |
MIME (Multipurpose Internet Mail Extensions) is an upward- |
| 6051 |
compatible set of extensions to RFC 822, currently docu- |
| 6052 |
mented in RFCs 1341 and 1342. This appendix summarizes |
| 6053 |
these documents. See the MIME RFCs for more information; |
| 6054 |
they are very readable. |
| 6055 |
|
| 6056 |
UNRESOLVED ISSUE: These RFC numbers (here and |
| 6057 |
elsewhere in this Draft) need updating when the |
| 6058 |
new MIME RFCs come out. |
| 6059 |
|
| 6060 |
MIME defines the following new headers: |
| 6061 |
|
| 6079 |
MIME-Version |
| 6080 |
Content-Type |
| 6081 |
Content-Transfer-Encoding |
| 6082 |
Content-ID |
| 6083 |
Content-Description |
| 6084 |
|
| 6085 |
|
| 6086 |
The MIME-Version header is mandatory for all messages con- |
| 6087 |
forming to the MIME specification and carries the version |
| 6088 |
number of the MIME specification. Example: |
| 6089 |
|
| 6090 |
MIME-Version: 1.0 |
| 6091 |
|
| 6092 |
|
| 6093 |
The Content-Type header indicates the content type of the |
| 6094 |
message. Content types are split into a top-level type and |
| 6095 |
a subtype, separated by a slash. Auxiliary information can |
| 6096 |
also be supplied, using an attribute-value notation. Exam- |
| 6097 |
ple: |
| 6098 |
|
| 6099 |
Content-Type: text/plain; charset=us-ascii |
| 6100 |
|
| 6101 |
(In the absence of a Content-Type header this is in fact the |
| 6102 |
default content type.) |
| 6103 |
|
| 6104 |
Important type/subtype combinations are |
| 6105 |
|
| 6106 |
text/plain Plain text, possibly in a non- |
| 6107 |
ASCII character set. |
| 6108 |
|
| 6109 |
text/enriched A very simple wordprocessor-like |
| 6110 |
language supporting character |
| 6111 |
attributes (e.g., underlining), |
| 6112 |
justification control, and multi- |
| 6113 |
ple character sets. (This pro- |
| 6114 |
posal has gone through several |
| 6115 |
iterations and has recently split |
| 6116 |
off from the main MIME RFCs into a |
| 6117 |
separate document.) |
| 6118 |
|
| 6119 |
message/rfc822 A mail message conforming to a |
| 6120 |
slightly-relaxed version of RFC |
| 6121 |
822. |
| 6122 |
|
| 6123 |
message/partial Part of a message (supporting the |
| 6124 |
transparent splitting and joining |
| 6125 |
of messages when they are too |
| 6126 |
large to be handled by some trans- |
| 6127 |
port agent). |
| 6128 |
|
| 6129 |
message/external-body A message whose body is external. |
| 6130 |
Possible access methods include |
| 6131 |
via mail, FTP, local file, etc. |
| 6132 |
|
| 6145 |
multipart/mixed A message whose body consists of |
| 6146 |
multiple parts, possibly of dif- |
| 6147 |
ferent types, intended to be |
| 6148 |
viewed in serial order. Each part |
| 6149 |
looks like an RFC 822 message, |
| 6150 |
consisting of headers and a body. |
| 6151 |
Most of the RFC 822 headers have |
| 6152 |
no defined semantics for body |
| 6153 |
parts. |
| 6154 |
|
| 6155 |
multipart/parallel Likewise, except that the parts |
| 6156 |
are intended to be viewed in par- |
| 6157 |
allel (on user agents that support |
| 6158 |
it). |
| 6159 |
|
| 6160 |
multipart/alternative Likewise, except that the parts |
| 6161 |
are intended to be semantically |
| 6162 |
equivalent such that the part that |
| 6163 |
best matches the capabilities of |
| 6164 |
the environment should be dis- |
| 6165 |
played. For example, a message |
| 6166 |
may include plain-text, enriched- |
| 6166 |
text, and PostScript versions of |
| 6168 |
some document. |
| 6169 |
|
| 6170 |
multipart/digest A variant of multipart/mixed espe- |
| 6171 |
cially intended for message |
| 6172 |
digests (the default type of the |
| 6173 |
parts is message/rfc822 instead of |
| 6174 |
text/plain, saving on the number |
| 6175 |
of headers for the parts). |
| 6176 |
|
| 6177 |
application/postscript A PostScript document. |
| 6178 |
(PostScript is a trademark of |
| 6179 |
Adobe.) |
| 6180 |
|
| 6181 |
Other top-level types exist for still images, audio, and |
| 6182 |
video samples. |
| 6183 |
|
| 6184 |
Some of the above types require the ability to transport |
| 6185 |
binary data. Since the existing message systems usually do |
| 6186 |
not support this, MIME provides a Content-Transfer-Encoding |
| 6187 |
header to indicate the kind of encoding used. The possible |
| 6188 |
encodings are: |
| 6189 |
|
| 6190 |
7bit No encoding; the data consists of short |
| 6191 |
(less than 1000 characters) lines of |
| 6192 |
7-bit ASCII data, delimited by EOL |
| 6193 |
sequences. This is the default encod- |
| 6194 |
ing. |
| 6195 |
|
| 6196 |
8bit Like 7bit, except that bytes with the |
| 6197 |
high-order bit set may be present. |
| 6198 |
Many transmission paths are incapable |
| 6211 |
of carrying messages which use this |
| 6212 |
encoding. |
| 6213 |
|
| 6214 |
binary No encoding; any sequence of bytes may |
| 6215 |
be present. Many transmission paths |
| 6216 |
are incapable of carrying messages |
| 6217 |
which use this encoding. |
| 6218 |
|
| 6219 |
base64 The data is encoded by representing |
| 6220 |
every group of 3 bytes as 4 characters |
| 6221 |
from the alphabet "A-Za-z0-9+/", which |
| 6222 |
was chosen for its high robustness |
| 6223 |
through mail gateways (the alphabet |
| 6224 |
used by uuencode does not survive |
| 6225 |
ASCII-EBCDIC-ASCII translations). In |
| 6226 |
the final group of 4 characters, "=" is |
| 6227 |
used for those characters not repre- |
| 6228 |
senting data bytes. Line length is |
| 6229 |
limited and EOLs in the encoded form |
| 6230 |
are ignored. |
| 6231 |
|
| 6232 |
quoted-printable Any byte can be represented by a three |
| 6233 |
character "=XX" sequence where the X's |
| 6234 |
are upper case hexadecimal digits. |
| 6235 |
Bytes representing printable 7-bit US- |
| 6236 |
ASCII characters except "=" may be rep- |
| 6237 |
resented literally. Tabs and blanks |
| 6238 |
may be represented literally if not at |
| 6239 |
the end of a line. Line length is lim- |
| 6240 |
ited, and an EOL preceded by "=" was |
| 6241 |
inserted for this purpose and is not |
| 6242 |
present in the original. |
| 6243 |
|
| 6244 |
The base64 and quoted-printable encodings are applied to |
| 6245 |
data in Internet canonical form, which means that any EOL |
| 6246 |
encoded as anything but EOL must be an Internet canonical |
| 6247 |
EOL: CR followed by LF. |
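
The two encodings, and the conversion to canonical form that precedes them, can be tried out with the base64 and quopri modules of the Python standard library. In the sketch below the helper name to_canonical is an assumption for this example, and the text is assumed to be plain ASCII.

```python
import base64
import quopri


def to_canonical(text):
    """Convert local line endings to the Internet canonical EOL (CR LF)."""
    return text.replace("\r\n", "\n").replace("\n", "\r\n").encode("ascii")


def encode_base64(text):
    """Encode text in canonical form as base64."""
    return base64.b64encode(to_canonical(text))


def encode_quoted_printable(data):
    """Encode bytes as quoted-printable."""
    return quopri.encodestring(data)


# Four data bytes make one full group of three plus one leftover byte,
# so the final group of four characters is padded with "=".
print(base64.b64encode(b"news"))            # b'bmV3cw=='
print(encode_quoted_printable(b"Andr\xe9=1"))
```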
| 6248 |
|
| 6249 |
The Content-Description header allows further description of |
| 6250 |
a body part, analogous to the use of Subject for messages. |
| 6251 |
|
| 6252 |
Finally, the Content-ID header can be used to assign an |
| 6253 |
identification to body parts, analogous to the assignment of |
| 6254 |
identifications to messages by Message-ID. |
| 6255 |
|
| 6256 |
Note that most of these headers are structured header |
| 6257 |
fields, as defined in RFC 822. Consequently, comments are |
| 6258 |
allowed in their values. The following is a legal MIME |
| 6259 |
header: |
| 6260 |
|
| 6261 |
Content-Type: (a comment) text (yeah) / |
| 6262 |
plain (and now some params:) ; charset= (guess what) |
| 6263 |
iso-8859-1 (we don't have iso-10646 yet, pity) |
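
To show what accepting such a header involves, here is a deliberately simplified Python sketch that strips comments (including nested ones) and then picks out the type, subtype, and parameters. The function names are hypothetical, and the sketch ignores some details a full RFC 822 parser must handle, such as semicolons inside quoted parameter values.

```python
def strip_comments(value):
    """Remove parenthesized comments, which may nest, from a header value."""
    out = []
    depth = 0
    in_quotes = False
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # A quoted pair is kept outside comments and skipped inside them.
            if depth == 0:
                out.append(value[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and ch == "(":
            depth += 1
            i += 1
            continue
        elif not in_quotes and ch == ")" and depth > 0:
            depth -= 1
            i += 1
            continue
        if depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_content_type(value):
    """Return (type, subtype, parameters) from a Content-Type value."""
    fields = strip_comments(value).split(";")
    main_type, _, subtype = fields[0].partition("/")
    params = {}
    for field in fields[1:]:
        name, _, param_value = field.partition("=")
        if name.strip():
            params[name.strip().lower()] = param_value.strip().strip('"')
    return main_type.strip().lower(), subtype.strip().lower(), params
```

Applied to the header above, this yields the type "text", the subtype "plain", and a single charset parameter with the value "iso-8859-1".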
| 6264 |
|
| 6277 |
NOTE: Although the MIME specification was devel- |
| 6278 |
oped for mail, there is nothing precluding its use |
| 6279 |
for news as well. While it might simplify imple- |
| 6280 |
mentation to restrict the MIME headers somewhat, |
| 6281 |
in the same way that other news headers (e.g. |
| 6282 |
From) are restricted subsets of the RFC-822 origi- |
| 6283 |
nals, this would add yet another divergence |
| 6284 |
between two formats that ought to be as compatible |
| 6285 |
as possible. In the case of the MIME headers, |
| 6286 |
there is no body of existing code posing compati- |
| 6287 |
bility concerns. A full-featured MIME reading |
| 6288 |
agent needs a full RFC-822 parser anyway, to prop- |
| 6289 |
erly handle body parts of types like mes- |
| 6290 |
sage/rfc822, so there is little gain from |
| 6291 |
restricting MIME headers. Adopting the MIME spec- |
| 6292 |
ification unchanged seems best. However, article- |
| 6293 |
level MIME headers must still comply with the |
| 6294 |
overall news header syntax given in section 4, so |
| 6295 |
that news software which is NOT interested in MIME |
| 6296 |
need not contain a full RFC-822 parser. |
| 6297 |
|
| 6298 |
The second part of MIME, RFC 1342 (Representation of Non- |
| 6299 |
ASCII Text in Internet Message Headers), addresses the prob- |
| 6300 |
lem of non-ASCII characters in headers. An example of a |
| 6301 |
header using the RFC 1342 mechanism is |
| 6302 |
|
| 6303 |
From: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be> |
| 6304 |
|
| 6305 |
Such encodings are allowed in selected headers, subject to |
| 6306 |
the restrictions listed in RFC 1342. |
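
A reading agent can decode such a header with the decode_header function from the email.header module of the Python standard library, as in the sketch below. The wrapper name decode_header_value is an assumption for this example, and collapsing runs of white space in the result is a simplification made here for display purposes.

```python
from email.header import decode_header


def decode_header_value(raw):
    """Decode encoded words in a header value into a single string."""
    pieces = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Unencoded text is reported with no charset; treat it as ASCII.
            chunk = chunk.decode(charset or "ascii", errors="replace")
        pieces.append(chunk)
    # Collapse runs of white space for display.
    return " ".join("".join(pieces).split())
```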
| 6307 |
|
| 6308 |
The MIME effort has also produced an RFC defining a Content- |
| 6309 |
MD5 header [rrr 1544], containing an MD5-based "checksum" of |
| 6310 |
the contents of an article or body part, giving high confi- |
| 6311 |
dence of detecting accidental modifications to the contents. |
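
As a sketch of how such a value might be produced and checked, the Python fragment below computes the MD5 digest of some content and expresses it in base64, which is our understanding of the header's format. The function names are hypothetical, and the content is assumed to be supplied as bytes already in canonical form.

```python
import base64
import hashlib


def content_md5(content):
    """Return the base64 form of the MD5 digest of the content (bytes)."""
    digest = hashlib.md5(content).digest()
    return base64.b64encode(digest).decode("ascii")


def content_matches(content, header_value):
    """Check received content against the value of a Content-MD5 header."""
    return content_md5(content) == header_value.strip()
```

As the text says, this detects accidental modification only; anyone able to alter the content deliberately can also recompute the header.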
| 6312 |
|
| 6313 |
The "metamail" software package [rrr] helps provide MIME |
| 6314 |
support with minimal changes to mailers, and may also be |
| 6315 |
relevant to news reading agents. |
| 6316 |
|
| 6317 |
The PEM (Privacy Enhanced Mail) effort is pursuing analogous |
| 6318 |
facilities to offer stronger guarantees against malicious |
| 6319 |
modifications, unauthorized eavesdropping, and forgery. |
| 6320 |
This work too may be applicable to news, once it is recon- |
| 6321 |
ciled with MIME (by efforts now underway). |
| 6322 |
|
| 6323 |
|
| 6324 |
C. Summary of Changes Since RFC 1036 |
| 6325 |
|
| 6326 |
This Draft is much longer than RFC 1036, so there is obvi- |
| 6327 |
ously much change in content. Much of this is just |
| 6328 |
increased precision and rigor. Noteworthy changes and addi- |
| 6329 |
tions include: |
| 6330 |
|
| 6343 |
+ section 4.3's restrictions on article bodies |
| 6344 |
|
| 6345 |
+ all references to MIME facilities |
| 6346 |
|
| 6347 |
+ size limits on articles |
| 6348 |
|
| 6349 |
+ precise specification of Date-content syntax |
| 6350 |
|
| 6351 |
+ message IDs must never be re-used, ever |
| 6352 |
|
| 6353 |
+ "!" is the only Path delimiter |
| 6354 |
|
| 6355 |
+ multiple moderators in the Approved header |
| 6356 |
|
| 6357 |
+ rules on References trimming, and the _-_ mechanism |
| 6358 |
|
| 6359 |
+ generalization of the Xref rules |
| 6360 |
|
| 6361 |
+ multiple message IDs in Cancel and Supersedes |
| 6362 |
|
| 6363 |
+ Also-Control |
| 6364 |
|
| 6365 |
+ See-Also |
| 6366 |
|
| 6367 |
+ Article-Names |
| 6368 |
|
| 6369 |
+ Article-Replacing |
| 6370 |
|
| 6371 |
+ more precise rules for cancellation |
| 6372 |
|
| 6373 |
+ cancellation authorization based on From, not Sender |
| 6374 |
|
| 6375 |
+ "unmoderated" and descriptors in newgroup messages |
| 6376 |
|
| 6377 |
+ restrictive rules on handling of sendsys and version |
| 6378 |
messages |
| 6379 |
|
| 6380 |
+ the whogets control message |
| 6381 |
|
| 6382 |
+ precise specification of checkgroups messages |
| 6383 |
|
| 6384 |
+ compression type preferably specified out-of-band |
| 6385 |
|
| 6386 |
+ rules for encapsulating news in MIME mail |
| 6387 |
|
| 6388 |
+ tighter specification of relayer functioning (section |
| 6389 |
9.1) |
| 6390 |
|
| 6391 |
+ the "newsmaster" contact address |
| 6392 |
|
| 6393 |
+ rules for gatewaying (section 10) |
| 6394 |
|
| 6395 |
+ discussion of security issues (section 11) |
| 6396 |
|
| 6397 |
|
| 6409 |
D. Summary of Completely New Features |
| 6410 |
|
| 6411 |
Most of this Draft merely documents existing practice, but |
| 6412 |
there are a few attempts to extend it. These are: |
| 6413 |
|
| 6414 |
TBW |
| 6415 |
|
| 6416 |
|
| 6417 |
E. Summary of Differences From RFC 822+1123 |
| 6418 |
|
| 6419 |
The following are noteworthy differences between this |
| 6420 |
Draft's articles and MAIL messages: |
| 6421 |
|
| 6422 |
+ generally less-permissive header syntax |
| 6423 |
|
| 6424 |
+ notably, limited From syntax |
| 6425 |
|
| 6426 |
+ MAIL header comments allowed in only a few contexts |
| 6427 |
|
| 6428 |
+ slightly more restricted message-ID syntax |
| 6429 |
|
| 6430 |
+ several more mandatory headers |
| 6431 |
|
| 6432 |
+ duplicate headers forbidden |
| 6433 |
|
| 6434 |
+ References/See-Also versus In-Reply-To/References |
| 6435 |
(section 6.5) |
| 6436 |
|
| 6437 |
+ case sensitivity in some contexts |
| 6438 |
|
| 6439 |
+ point-to-point headers, e.g. To and Cc, forbidden |
| 6440 |
(section 6) |
| 6441 |
|
| 6442 |
+ several new headers |
| 6443 |
|
| 6444 |
|
| 6445 |
References |
| 6446 |
|
| 6447 |
[Sanderson] "Smileys", David Sanderson, O'Reilly & Associ- |
| 6448 |
ates Ltd., 1993. |
| 6449 |
|
| 6450 |
TBW |
| 6451 |
|
| 6452 |
|
| 6453 |
Security Considerations |
| 6454 |
|
| 6455 |
Section 11 discusses security considerations in detail. |
| 6456 |
|
| 6457 |
|
| 6458 |
Author's Address |
| 6459 |
|
| 6475 |
Henry Spencer |
| 6476 |
henry@zoo.toronto.edu |
| 6477 |
|
| 6478 |
SP Systems |
| 6479 |
Box 280 Stn. A |
| 6480 |
Toronto, Ont. M5W1B2 Canada |