INTERNET DRAFT to be NEWS sec. -
2 June 1994 - 1 - expires 15 July 1994

News Article Format and Transmission

Henry Spencer


Status of this Memo

This document is intended to become an Internet Draft.
Internet Drafts are working documents of the Internet
Engineering Task Force (IETF), its Areas, and its Working
Groups. Note that other groups may also distribute working
documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of
six months. Internet Drafts may be updated, replaced, or
obsoleted by other documents at any time. It is not
appropriate to use Internet Drafts as reference material or
to cite them other than as a "working draft" or "work in
progress".

Please check the I-D abstract listing contained in each
Internet Draft directory to learn the current status of this
or any other Internet Draft. (Actually, this draft is at
too early a stage to even be listed there yet.)

It is hoped that a later version of this Draft will obsolete
RFC 1036 and will become an Internet standard.

References to the "successor to this Draft" refer not to
later versions of this draft, but to a hypothetical future
rewrite of this Draft (in the same way that this Draft is a
rewrite of RFC 1036).

Distribution of this memo is unlimited.


Abstract

This Draft defines the format and procedures for interchange
of network news articles. It is hoped that a later version
of this Draft will obsolete RFC 1036, reflecting more recent
experience and accommodating future directions.

Network news articles resemble mail messages but are
broadcast to potentially-large audiences, using a flooding
algorithm that propagates one copy to each interested host
(or group thereof), typically stores only one copy per host,
and does not require any central administration or systematic
registration of interested users. Network news originated
as the medium of communication for Usenet, circa 1980.
Since then Usenet has grown explosively, and many Internet
sites participate in it. In addition, the news technology
is now in widespread use for other purposes, on the Internet
and elsewhere.

This Draft primarily codifies and organizes existing
practice. A few small extensions have been added in an
attempt to solve problems that are considered serious. Major
extensions (e.g. cryptographic authentication) that need
significant development effort are left to be undertaken as
independent efforts.


Table of Contents

TBW


1. Introduction

Network news articles resemble mail messages but are
broadcast to potentially-large audiences, using a flooding
algorithm that propagates one copy to each interested host
(or groups thereof), typically stores only one copy per host,
and does not require any central administration or
systematic registration of interested users. Network news
originated as the medium of communication for Usenet, circa
1980. Since then Usenet has grown explosively, and many
Internet sites participate in it. In addition, the news
technology is now in widespread use for other purposes, on
the Internet and elsewhere.

The earliest news interchange used the so-called "A News"
article format. Shortly thereafter, an article format
vaguely resembling Internet mail was devised and used
briefly. Both of those formats are completely obsolete;
they are documented in appendix A for historical reasons
only. With publication of RFC 850 [rrr] in 1983, news
articles came to closely resemble Internet mail messages,
with some restrictions and some additional headers. RFC 1036
[rrr] in 1987 updated RFC 850 without making major changes.

In the intervening five years, the RFC 1036 article format
has proven quite satisfactory, although minor extensions
appear desirable to match recent developments in areas such
as multi-media mail. RFC 1036 itself has not proven quite
so satisfactory. It is often rather vague and does not
address some issues at all; this has caused significant
interoperability problems at times, and implementations have
diverged somewhat. Worse, although it was intended
primarily to document existing practice, it did not precisely
match existing practice even at the time it was published,
and the deviations have grown since.

This Draft attempts to specify the format of articles, and
the procedures used to exchange them and process them, in
sufficient detail to allow full interoperability. In
addition, some tentative suggestions are made about
directions for future development, in an attempt to avert
unnecessary divergence and consequent loss of
interoperability. Major extensions (e.g. cryptographic
authentication) that need significant development effort are
left to be undertaken as independent efforts.

NOTE: One question this all may raise is: why is
there no News-Version header, analogous to
MIME-Version, specifying a version number corresponding
to this specification? The answer is: it doesn't
appear to be useful, given news's
backward-compatibility constraints. The major use of a
version number is indicating which of several
INCOMPATIBLE interpretations is relevant. The
impossibility of orchestrating any sort of
simultaneous change over news's installed base makes it
necessary to avoid such incompatible changes (as
opposed to extensions) entirely. MIME has a
version number mostly because it introduced
incompatible changes to the interpretation of several
"Content-" headers. This Draft attempts no
changes in interpretation and it appears doubtful
that future Drafts will find it feasible to
introduce any.

UNRESOLVED ISSUE: Should this be reconsidered?
Only if the header has SPECIFIC IDENTIFIABLE uses
today. Otherwise it's just useless added bulk.

As in this Draft's predecessors, the exact means used to
transmit articles from one host to another is not specified.
NNTP [rrr] is probably the most common transmission method
on the Internet, but a number of others are known to be in
use, including the UUCP protocol [rrr] extensively used in
the early days of Usenet and still much used on its fringes
today.

Several of the mechanisms described in this Draft may seem
somewhat strange or even bizarre at first reading. As with
Internet mail, there is no reasonable possibility of
updating the entire installed base of news software promptly,
so interoperability with old software is crucial and will
remain so. Compatibility with existing practice and
robustness in an imperfect world necessarily take priority
over elegance.


2. Definitions, Notations, and Conventions


2.1. Textual Notations

Throughout this Draft, "MAIL" is short for "RFC 822 [rrr] as
amended by RFC 1123 [rrr]". (RFC 1123's amendments are
mostly relatively small, but they are not insignificant.)
See also the discussion in section 3 about this Draft's
relationship to MAIL. "MIME" is short for "RFCs 1341 and
1342" (or their updated replacements).

UNRESOLVED ISSUE: Update these numbers.

"ASCII" is short for "the ANSI X3.4 character set" [rrr].
While "ASCII" is often misused to refer to various character
sets somewhat similar to X3.4, in this Draft, "ASCII" means
X3.4 and only X3.4.

NOTE: The name is traditional (to the point where
the ANSI standard sanctions it) even though it is
no longer an acronym for the name of the standard.

NOTE: ASCII, X3.4, contains 128 characters, not
all of them printable. Character sets with more
characters are not ASCII, although they may
include it as a subset.

Certain words used to define the significance of individual
requirements are capitalized. "MUST" means that the item is
an absolute requirement of the specification. "SHOULD"
means that the item is a strong recommendation: there may be
valid reasons to ignore it in unusual circumstances, but
this should be done only after careful study of the full
implications and a firm conclusion that it is necessary,
because there are serious disadvantages to doing so. "MAY"
means that the item is truly optional, and implementors and
users are warned that conformance is possible but not to be
relied on.

The term "compliant", applied to implementations etc.,
indicates satisfaction of all relevant "MUST" and "SHOULD"
requirements. The term "conditionally compliant" indicates
satisfaction of all relevant "MUST" requirements but
violation of at least one relevant "SHOULD" requirement.
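
The two compliance terms above can be read as a small decision
rule. The following Python sketch is illustrative only: the
function name, its arguments, and the label "non-compliant"
(for software that violates a "MUST") are assumptions of this
example, not terms defined by this Draft.

```python
def compliance_level(must_violations, should_violations):
    """Classify an implementation by the requirements it violates.

    Both arguments are collections of violated requirements; only
    whether they are empty matters here. The label "non-compliant"
    is this sketch's own name for the remaining case.
    """
    if must_violations:
        return "non-compliant"
    if should_violations:
        return "conditionally compliant"
    return "compliant"
```

For instance, an implementation that violates no "MUST" but
one "SHOULD" would be classified as conditionally compliant.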

This Draft contains explanatory notes using the following
format. These may be skipped by persons interested solely
in the content of the specification. The purpose of the
notes is to explain why choices were made, to place them in
context, or to suggest possible implementation techniques.

NOTE: While such explanatory notes may seem
superfluous in principle, they often help the
less-than-omniscient reader grasp the purpose of the
specification and the constraints involved. Given
the limitations of natural language for
descriptive purposes, this improves the probability that
implementors and users will understand the true
intent of the specification in cases where the
wording is not entirely clear.

All numeric values are given in decimal unless otherwise
indicated. Octets are assumed to be unsigned values for
this purpose. Large numbers are written using the North
American convention, in which "," separates groups of three
digits but otherwise has no significance.


2.2. Syntax Notation

Although the mechanisms specified in this Draft are all
described in prose, most are also described formally in the
modified BNF notation of RFC 822. Implementors will need to
be familiar with this notation to fully understand this
specification, and are referred to RFC 822 for a complete
explanation of the modified BNF notation. Here is a brief
illustrative example:

     sentence  = clause *( punct clause ) "."
     punct     = ":" / ";"
     clause    = 1*word [ "(" clause ")" / "," 1*word ]
     word      = <any English word>

This defines a sentence as some clauses separated by puncts
and ended by a period, a punct as a colon or semicolon, a
clause as at least one <word> optionally followed by either
a parenthesized clause or a comma and at least one more
<word>, and a <word> as (informally) any English word. <>
are used to enclose names when (and only when)
distinguishing them from surrounding text is useful. The
full form of the repetition notation is <m>"*"<n><thing>,
denoting <m> through <n> repetitions of <thing>; <m>
defaults to zero, <n> to infinity, and the "*" and <n> can
be omitted if <m> and <n> are equal, so 1*word is one or
more words, 1*5word is one through five words, and 2word is
exactly two words.
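
As a rough illustration of the repetition notation, the
Python sketch below splits a repeated element into its
minimum count, maximum count, and <thing>. The function name
and the use of None to stand for "infinity" are assumptions of
this example; it handles only the count prefix, not the rest
of the modified BNF.

```python
import re

# <m>"*"<n><thing>; the thing must not start with a digit or "*".
_REPETITION = re.compile(r"(\d*)(\*?)(\d*)([^\d*].*)")


def parse_repetition(element):
    """Return (minimum, maximum, thing); maximum None means no limit."""
    match = _REPETITION.fullmatch(element)
    if match is None:
        raise ValueError("not a repetition element: %r" % element)
    low, star, high, thing = match.groups()
    if star:
        # With "*": <m> defaults to zero, <n> to infinity.
        minimum = int(low) if low else 0
        maximum = int(high) if high else None
    else:
        # Without "*": a bare count means exactly that many,
        # and no count at all means exactly one.
        minimum = maximum = int(low) if low else 1
    return minimum, maximum, thing
```

Applied to the examples in the text, "1*word" yields a minimum
of one with no maximum, "1*5word" yields one through five, and
"2word" yields exactly two.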

The character "\" is not special in any way in this
notation.

This Draft is intended to be self-contained; all syntax
rules used in it are defined within it, and a rule with the
same name as one found in MAIL does not necessarily have the
same definition. The lexical layer of MAIL is NOT, repeat
NOT, used in this Draft, and its presence must not be
assumed; notably, this Draft spells out all places where
white space is permitted/required and all places where
constructs resembling MAIL comments can occur.

NOTE: News parsers historically have been much
less permissive than MAIL parsers.


2.3. Definitions

The term "character set", wherever it is used in this Draft,
refers to a coded character set, in the sense of ISO
character set standardization work, and must not be
misinterpreted as meaning merely "a set of characters".

In this Draft, ASCII character 32 is referred to as "blank";
the word "space" has a more generic meaning.

An "article" is the unit of news, analogous to a MAIL
"message".

A "poster" is a human being (or software equivalent)
submitting a possibly-compliant article to be "posted": made
available for reading on all relevant hosts. A "posting
agent" is software that assists posters to prepare articles,
including determining whether the final article is
compliant, passing it on to a relayer for posting if so, and
returning it to the poster with an explanation if not. A
"relayer" is software which receives allegedly-compliant
articles from posting agents and/or other relayers, files
copies in a "news database", and possibly passes copies on
to other relayers.

NOTE: While the same software may well function
both as a relayer and as part of a posting agent,
the two functions are distinct and should not be
confused. The posting agent's purpose is (in
part) to validate an article, supply header
information that can or should be supplied
automatically, and generally take reasonable actions in an
attempt to transform the poster's submission into
a compliant article. The relayer's purpose is to
move already-compliant articles around efficiently
without damaging them.

A "reader" is a human being reading news articles. A
"reading agent" is software which presents articles to a
reader.

NOTE: Informal usage often uses "reader" for both
these meanings, but this introduces considerable
potential for confusion and misunderstanding, so
this Draft takes care to make the distinction.

A "newsgroup" is a single news forum, a logical bulletin
board, having a name and nominally intended for articles on
a specific topic. An article is "posted to" a single
newsgroup or several newsgroups. When an article is posted
to more than one newsgroup, it is said to be "cross-posted";
note that this differs from posting the same text as part of
each of several articles, one per newsgroup. A "hierarchy"
is the set of all newsgroups whose names share a first
component (see the name syntax in section 5.5).
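
The notion of a hierarchy can be sketched in a few lines of
Python. This example assumes that newsgroup-name components
are separated by "." (the name syntax itself is given in
section 5.5, not here); the function names and the sample
newsgroup names are illustrative only.

```python
from collections import defaultdict


def hierarchy_of(newsgroup):
    """Return the first component of a newsgroup name.

    Assumes components are separated by "."; see section 5.5
    for the actual name syntax.
    """
    return newsgroup.split(".", 1)[0]


def group_by_hierarchy(newsgroups):
    """Map each hierarchy name to the newsgroups that belong to it."""
    grouped = defaultdict(list)
    for name in newsgroups:
        grouped[hierarchy_of(name)].append(name)
    return dict(grouped)
```

Under that assumption, newsgroups named "comp.lang.c" and
"comp.unix.questions" would fall in the same hierarchy.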

A newsgroup may be "moderated", in which case submissions
are not posted directly, but mailed to a "moderator" for
consideration and possible posting. Moderators are
typically human but may be implemented partially or entirely
in software.

A "followup" is an article containing a response to the
contents of an earlier article (the followup's "precursor").
A "followup agent" is a combination of reading agent and
posting agent that aids in the preparation and posting of a
followup.

Text comparisons are "case-sensitive" if they consider
uppercase letters (e.g. "A") different from lowercase
letters (e.g. "a"), and "case-insensitive" if letters
differing only in case (e.g. "A" and "a") are considered
identical. Categories of text are said to be
case-(in)sensitive if comparisons of such texts to others
are case-(in)sensitive.

A "cooperating subnet" is a set of news-exchanging hosts
which is sufficiently well-coordinated (typically via a
central administration of some sort) that stronger
assumptions can be made about hosts in the set than about
news hosts in general. This is typically used to relax
restrictions which are otherwise required for worst-case
interoperability; members of a cooperating subnet MAY
interchange articles that do not conform to this Draft's
specifications, provided all members have agreed to this and
provided the articles are not permitted to leak out of the
subnet. The word "subnet" is used to emphasize that a
cooperating subnet is typically not an isolated universe;
care must be taken that traffic leaving the subnet complies
with the restrictions of the larger net, not just those of
the cooperating subnet.

A "message ID" is a unique identifier for an article,
usually supplied by the posting agent which posted it. It
distinguishes the article from every other article ever
posted anywhere (in theory). Articles with the same message
ID are treated as identical copies of the same article even
if they are not in fact identical.
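
One consequence of this definition is that software can
recognize duplicates by message ID alone, without comparing
article contents. The Python sketch below shows the idea; the
class name, the in-memory set, and the exact-string comparison
of message IDs are all assumptions of this example rather than
requirements stated here.

```python
class MessageIdHistory:
    """Remember which message IDs have already been seen."""

    def __init__(self):
        self._seen = set()

    def is_new(self, message_id):
        """Record message_id; return False if it was seen before.

        Contents are never compared: two articles with the same
        message ID are treated as copies of the same article.
        """
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True
```

A relayer built along these lines would file an article only
when is_new() returns True for its message ID.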

A "gateway" is software which receives news articles and
converts them to messages of some other kind (e.g. mail to a
mailing list), or vice-versa; in essence it is a translating
relayer that straddles boundaries between different methods
of message exchange. The most common type of gateway
connects newsgroup(s) to mailing list(s), either
unidirectionally or bidirectionally, but there are also
gateways between news networks using this Draft's news
format and those using other formats.

A "control message" is an article which is marked as
containing control information; a relayer receiving such an
article will (subject to permissions etc.) take actions
beyond just filing and passing on the article.

NOTE: "Control article" would be more consistent
terminology, but "control message" is already well
established.

An article's "reply address" is the address to which mailed
replies should be sent. This is the address specified in
the article's From header (see section 5.2), unless it also
has a Reply-To header (see section 6.3).
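
The reply-address rule can be sketched as follows in Python.
The sketch assumes that headers have already been parsed into
a mapping from header name to content, and that a Reply-To
header, when present, supplies the reply address in place of
From; it returns the raw header content without parsing the
address syntax. The function name and sample addresses are
illustrative.

```python
def reply_address(headers):
    """Return the header content that gives the reply address.

    headers maps header names to their contents. Names are
    matched without regard to case, since header names are
    case-insensitive.
    """
    lowered = {name.lower(): content for name, content in headers.items()}
    if "reply-to" in lowered:
        return lowered["reply-to"]
    return lowered["from"]
```

An article lacking a From header raises KeyError here, which
seems a reasonable signal given that the rule presupposes one.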

The notation (e.g.) "(ASCII 17)" following a name means
"this name refers to the ASCII character having value 17".
An "ASCII printable character" is an ASCII character in the
range 33-126. An "ASCII control character" is an ASCII
character in the range 0-31, or the character DEL (ASCII
127). A "non-ASCII character" is a character having a value
exceeding 127.

NOTE: Blank is neither an "ASCII printable
character" nor an "ASCII control character".
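
These character classes translate directly into range checks.
The Python sketch below classifies a character by its numeric
value; the function name and the returned labels are this
example's own choices.

```python
def classify_character(value):
    """Classify a character value using the ranges defined above."""
    if value < 0:
        raise ValueError("character values are unsigned")
    if value > 127:
        return "non-ASCII"
    if value == 32:
        # Blank is neither printable nor a control character.
        return "blank"
    if 33 <= value <= 126:
        return "ASCII printable"
    # What remains is 0-31 and DEL (127).
    return "ASCII control"
```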


2.4. End Of Line

How the end of a text line is represented depends on the
context and the implementation. For Internet transmission
via protocols such as SMTP [rrr], an end-of-line is a CR
(ASCII 13) followed by an LF (ASCII 10). ISO C [rrr] and
many modern operating systems indicate end-of-line with a
single character, typically ASCII LF (aka "newline"), and
this is the normal convention when news is transmitted via
UUCP. A variety of other methods are in use, including
out-of-band methods in which there is no specific character
that means end-of-line.

This Draft does not constrain how end-of-line is represented
in news, except that characters other than CR and LF MUST
NOT be usurped for use in end-of-line representations.
Also, obviously, all software dealing with a particular copy
of an article must agree on the convention to be used.
"EOL" is used to mean "whatever end-of-line representation
is appropriate"; it is not necessarily a character or
sequence of characters.

NOTE: If faced with picking an EOL representation
in the absence of other constraints, use of a
single character simplifies processing, and the ASCII
standard [rrr] specifies that if one character is
to be used for this purpose, it should be LF
(ASCII 10).

NOTE: Inside MIME encodings, use of the Internet
canonical EOL representation (CR followed by LF)
is mandatory. See [rrr].
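
One way to keep EOL abstract in software is to hold an article
as a list of lines with no terminators and to attach a
concrete representation only when reading or writing. The
Python sketch below does this for the two in-band conventions
mentioned above; the names CRLF, LF, encode_lines, and
decode_lines are assumptions of this example.

```python
CRLF = b"\r\n"  # Internet canonical form (CR followed by LF)
LF = b"\n"      # single-character form, e.g. over UUCP


def encode_lines(lines, eol):
    """Join terminator-free lines, ending each with eol."""
    return b"".join(line + eol for line in lines)


def decode_lines(data, eol):
    """Split data into terminator-free lines.

    Every line, including the last, is expected to end with eol.
    """
    if not data:
        return []
    if not data.endswith(eol):
        raise ValueError("data does not end with the expected EOL")
    return data[: -len(eol)].split(eol)
```

Converting between conventions is then a decode with one
representation followed by an encode with the other. Note that
trailing white space on a line passes through unchanged.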


2.5. Case-Sensitivity

Text in newsgroup names, header parameters, etc. is
case-sensitive unless stated otherwise.

NOTE: This is at variance with MAIL, which is
case-insensitive unless stated otherwise, but is
consistent with news historical practice and
existing news software. See the comments on
backward compatibility in section 1.


2.6. Language

Various constant strings in this Draft, such as header names
and month names, are derived from English words. Despite
their derivation, these words do NOT change when the poster
or reader employing them is interacting in a language other
than English. Posting and reading agents SHOULD translate
as appropriate in their interaction with the poster or
reader, but the forms that actually appear in articles are
always the English-derived ones defined in this Draft.


3. Relation To MAIL (RFC 822 etc.)

The primary intent of this Draft is to completely describe
the news article format as a subset of MAIL's message format
augmented by some new headers. Unless explicitly noted
otherwise, the intent throughout is that an article MUST
also be a valid MAIL message.

NOTE: Despite obvious similarities between news
and mail, opinions vary on whether it is possible
or desirable to unify them into a single service.
However, it is unquestionably both possible and
useful to employ some of the same tools for
manipulating both mail messages and news articles, so
there is specific advantage to be had in defining
them compatibly. Furthermore, there is no
apparent need to re-invent the wheel when slight
extensions to an existing definition will suffice.

Given that this Draft attempts to be self-contained, it
inevitably contains considerable repetition of information
found in MAIL. This raises the possibility of unintentional
conflicts. Unless specifically noted otherwise, any wording
in this Draft which permits behavior that is not
MAIL-compliant is erroneous and should be followed only to
the extent that the result remains compliant with MAIL.

NOTE: RFC 1036 said "where this standard conflicts
with [RFC 822], RFC-822 should be considered
correct and this standard in error". Taken
literally, this was obviously incorrect, since RFC 1036
imposed a number of restrictions not found in RFC
822. The intent, however, was reasonable: to
indicate that UNINTENTIONAL differences were
errors in RFC 1036.

Implementors and users should note that MAIL is deliberately
an extensible standard, and most extensions devised for mail
are also relevant to (and compatible with) news. Note
particularly MIME [rrr], summarized briefly in appendix B,
which extends MAIL in a number of useful ways that are
definitely relevant to news. Also of note is the work in
progress on reconciling PEM (Privacy Enhanced Mail, which
defines extensions for authentication and security) with
MIME, after which this may also be relevant to news.

UNRESOLVED ISSUE: Update the MIME/PEM information.

Similarly, descriptions here of MIME facilities should be
considered correct only to the extent that they do not
require or legitimize practices that would violate those
RFCs. (Note that this Draft does extend the application of
some MIME facilities, but this is an extension rather than
an alteration.)


4. Basic Format


4.1. Overall Syntax

The overall syntax of a news article is:

     article        = 1*header separator body
     header         = start-line *continuation
     start-line     = header-name ":" space [ nonblank-text ] eol
     continuation   = space nonblank-text eol
     header-name    = 1*name-character *( "-" 1*name-character )
     name-character = letter / digit
     letter         = <ASCII letter A-Z or a-z>
     digit          = <ASCII digit 0-9>
     separator      = eol
     body           = *( [ nonblank-text / space ] eol )
     eol            = <EOL>
     nonblank-text  = [ space ] text-character *( space-or-text )
     text-character = <any ASCII character except NUL (ASCII 0),
                        HT (ASCII 9), LF (ASCII 10), CR (ASCII 13),
                        or blank (ASCII 32)>
     space          = 1*( <HT (ASCII 9)> / <blank (ASCII 32)> )
     space-or-text  = space / text-character

An article consists of some headers followed by a body. An
empty line separates the two. The headers contain
structured information about the article and its
transmission. A header begins with a header name
identifying it, and can be continued onto subsequent lines
by beginning the continuation line(s) with white space.
(Note that section 4.2.3 adds some restrictions to the
header syntax indicated here.) The body is
largely-unstructured text significant only to the poster and
the readers.

NOTE: Terminology here follows the current custom
in the news community, rather than the MAIL
convention of (sometimes) referring to what is here
called a "header" as a "header field" or "field".

Note that the separator line must be truly empty, not just a
line containing white space. Further empty lines following
it are part of the body, as are empty lines at the end of
the article.
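
The overall syntax lends itself to a simple line-oriented
reader. The Python sketch below takes an article as a list of
lines with EOL already removed and separates headers from
body. It is a minimal illustration, not a validating parser:
the function name and the choice to keep continuation lines
joined by "\n" are assumptions of this example, and it checks
neither the text-character restrictions nor the further
restrictions of section 4.2.3.

```python
import re

# start-line = header-name ":" space [ nonblank-text ]
_START_LINE = re.compile(r"([A-Za-z0-9]+(?:-[A-Za-z0-9]+)*):[ \t]+(.*)")


def split_article(lines):
    """Return (headers, body) for a list of EOL-free lines.

    headers is a list of (name, content) pairs in article order;
    body is the list of lines after the separator line.
    """
    headers = []
    for index, line in enumerate(lines):
        if line == "":
            # Only a truly empty line is the separator.
            if not headers:
                raise ValueError("article has no headers")
            return headers, lines[index + 1:]
        if line[0] in " \t":
            if not headers:
                raise ValueError("continuation line before any header")
            if line.strip(" \t") == "":
                raise ValueError("header line contains only white space")
            name, content = headers[-1]
            headers[-1] = (name, content + "\n" + line)
        else:
            match = _START_LINE.fullmatch(line)
            if match is None:
                raise ValueError("malformed header line: %r" % line)
            headers.append((match.group(1), match.group(2)))
    raise ValueError("no separator line found")
```

Empty lines after the separator, and any at the end of the
article, are returned as part of the body, as the text above
requires.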

NOTE: Some systems make no distinction between
empty lines and lines consisting entirely of white
space; indeed, some systems cannot represent
entirely empty lines. The grammar's requirement
that header continuation lines contain some
printable text is meant to ensure that the empty/space
distinction cannot confuse identification of the
separator line.

NOTE: It is tempting to authorize posting agents
to strip empty lines at the beginning and end of
the body, but such empty lines could possibly be
part of a preformatted document.

Implementors are warned that trailing white space, whether
alone on the line or not, MAY be significant in the body,
notably in early versions of the "uuencode" encoding for
binary data. Trailing white space MUST be preserved unless
the article is known to have originated within a cooperating
subnet that avoids using significant trailing white space,
and SHOULD be preserved regardless. Posters SHOULD avoid
using conventions or encodings which make trailing white
space significant; for encoding of binary data, MIME's
"base64" encoding is recommended. Implementors are warned
that ISO C implementations are not required to preserve
trailing white space, and special precautions may be
necessary in implementations which do not.

NOTE: Unfortunately, the signature-delimiter
convention (described in section 4.3.2) does use
significant trailing white space. It's too late to
fix this; there is work underway on defining an
organized signature convention as part of MIME,
which is a preferable solution in the long run.

Posters are warned that some very old relayer software
misbehaves when the first non-empty line of an article body
begins with white space.


4.2. Headers


4.2.1. Names and Contents

Despite the restrictions on header-name syntax imposed by
the grammar, relayers and reading agents SHOULD tolerate
header names containing any ASCII printable character other
than colon (":", ASCII 58).

NOTE: MAIL header names can contain any ASCII
printable character (other than colon) in theory,
but in practice, arbitrary header names are known
to cause trouble for some news software. Section
4.1's restriction to alphanumeric sequences
separated by hyphens is believed to permit all
widely-used header names without causing problems for any
widely-used software. Software is nevertheless
encouraged to cope correctly with the full range
of possibilities, since aberrations are known to
occur.
778 |
|
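NOTE: A minimal Python sketch of the two levels of
strictness described above. It assumes that "printable"
means ASCII 33 through 126 (blank excluded); the names
STRICT_NAME, TOLERANT_NAME, and classify_header_name are
hypothetical.

```python
import re

# Alphanumeric sequences separated by hyphens (what posters should emit).
STRICT_NAME = re.compile(r"[A-Za-z0-9]+(-[A-Za-z0-9]+)*")
# Any printable ASCII character other than colon (ASCII 58).
TOLERANT_NAME = re.compile(r"[\x21-\x39\x3b-\x7e]+")

def classify_header_name(name):
    if STRICT_NAME.fullmatch(name):
        return "conforming"
    if TOLERANT_NAME.fullmatch(name):
        return "tolerated"
    return "invalid"
```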
779 |
Relayers MUST disregard headers not described in this Draft |
780 |
(that is, with header names not mentioned in this Draft), |
781 |
and pass them on unaltered. |
782 |
|
783 |
Posters wishing to convey non-standard information in head- |
784 |
ers SHOULD use header names beginning with "X-". No stan- |
785 |
dard header name will ever be of this form. Reading agents |
786 |
SHOULD ignore "X-" headers, or at least treat them with |
799 |
great care. |
800 |
|
801 |
The order of headers in an article is not significant. How- |
802 |
ever, posting agents are encouraged to put mandatory headers |
803 |
(see section 5) first, followed by optional headers (see |
804 |
section 6), followed by headers not defined in this Draft. |
805 |
|
806 |
NOTE: While relayers and reading agents must be |
807 |
prepared to handle any order, having the signifi- |
808 |
cant headers (the precise definition of "signifi- |
809 |
cant" depends on context) first can noticeably |
810 |
improve efficiency, especially in memory-limited |
811 |
environments where it is difficult to buffer up an |
812 |
arbitrary quantity of headers while searching for |
813 |
the few that matter. |
814 |
|
815 |
Header names are case-insensitive. There is a preferred |
816 |
case convention, which posters and posting agents SHOULD |
817 |
use: each hyphen-separated "word" has its initial letter (if |
818 |
any) in uppercase and the rest in lowercase, except that |
819 |
some abbreviations have all letters uppercase (e.g. "Mes- |
820 |
sage-ID" and "MIME-Version"). The forms used in this Draft |
821 |
are the preferred forms for the headers described herein. |
822 |
Relayers and reading agents are warned that articles might |
823 |
not obey this convention. |
824 |
|
825 |
NOTE: Although software must be prepared for the |
826 |
possibility of random use of case in header names |
827 |
(and other case-independent text), establishing a |
828 |
preferred convention reduces pointless diversity, |
829 |
and may permit optimized software that looks for |
830 |
the preferred forms before resorting to less- |
831 |
efficient case-insensitive searches. |
832 |
|
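NOTE: The convention can be sketched in Python as follows.
The table UPPERCASE_WORDS lists only the two abbreviations
used as examples above; a real table would presumably need
more entries. All names here are hypothetical.

```python
# Words written entirely in uppercase (only the examples from the text).
UPPERCASE_WORDS = {"id": "ID", "mime": "MIME"}

def preferred_case(name):
    """Rewrite a header name in the preferred case convention."""
    words = name.split("-")
    return "-".join(UPPERCASE_WORDS.get(w.lower(), w.capitalize())
                    for w in words)

def same_header(a, b):
    # Try the cheap exact comparison first, then fall back to the
    # case-insensitive one.
    return a == b or a.lower() == b.lower()
```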
833 |
In general, a header can consist of several lines, with each |
834 |
continuation line beginning with white space. The EOLs pre- |
835 |
ceding continuation lines are ignored when processing such a |
836 |
header, effectively combining the start-line and the contin- |
837 |
uations into a single logical line. The logical line, less |
838 |
the header name, colon, and any white space following the |
839 |
colon, is the "header content". |
840 |
|
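NOTE: As a sketch only, a Python function that forms the
logical line and extracts the header content might look
like this. The name header_content is hypothetical, and the
input is assumed to be one start-line plus its continuation
lines.

```python
def header_content(lines):
    """Return (name, content) for one header given as physical lines."""
    # Ignoring the EOLs before continuation lines is plain
    # concatenation: each continuation keeps its leading white space.
    logical = "".join(line.rstrip("\r\n") for line in lines)
    name, _, rest = logical.partition(":")
    # The content excludes the name, the colon, and any white space
    # immediately following the colon.
    return name, rest.lstrip(" \t")
```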
841 |
|
842 |
4.2.2. Undesirable Headers |
843 |
|
844 |
A header whose content is empty is said to be an empty |
845 |
header. Relayers and reading agents SHOULD NOT consider |
846 |
presence or absence of an empty header to alter the seman- |
847 |
tics of an article (although syntactic rules, such as |
848 |
requirements that certain header names appear at most once |
849 |
in an article, MUST still be satisfied). Posting agents |
850 |
SHOULD delete empty headers from articles before posting |
851 |
them. |
852 |
|
865 |
Headers that merely state defaults explicitly (e.g., a Fol- |
866 |
lowup-To header with the same content as the Newsgroups |
867 |
header, or a MIME Content-Type header with contents |
868 |
"text/plain; charset=us-ascii") or state information that |
869 |
reading agents can typically determine easily themselves |
870 |
(e.g. the length of the body in octets) are redundant, con- |
871 |
veying no information whatsoever. Headers that state infor- |
872 |
mation which cannot possibly be of use to a significant num- |
873 |
ber of relayers, reading agents, or readers (e.g., the name |
874 |
of the software package used as the posting agent) are use- |
875 |
less and pointless. Posters and posting agents SHOULD avoid |
876 |
including redundant or useless headers in articles. |
877 |
|
878 |
NOTE: Information that someone, somewhere, might |
879 |
someday find useful is best omitted from headers. |
880 |
(There's quite enough of it in article bodies.) |
881 |
Headers should contain information of known util- |
882 |
ity only. This is not meant to preclude inclusion |
883 |
of information primarily meant for news-software |
884 |
debugging, but such information should be included |
885 |
only if there is real reason, preferably based on |
886 |
experience, to suspect that it may be genuinely |
887 |
useful. Articles passing through gateways are the |
888 |
only obvious case where inclusion of debugging |
889 |
information appears clearly legitimate. (See sec- |
890 |
tion 10.1.) |
891 |
|
892 |
NOTE: A useful rule of thumb for software imple- |
893 |
mentors is: "if I had to pay a dollar a day for |
894 |
the transmission of this header, would I still |
895 |
think it worthwhile?". |
896 |
|
897 |
|
898 |
4.2.3. White Space and Continuations |
899 |
|
900 |
The colon following the header name on the start-line MUST |
901 |
be followed by white space, even if the header is empty. If |
902 |
the header is not empty, at least some of the content MUST |
903 |
appear on the start-line. Posting agents MUST enforce these |
904 |
restrictions, but relayers (etc.) SHOULD accept even arti- |
905 |
cles that violate them. |
906 |
|
907 |
NOTE: MAIL does not require white space after the |
908 |
colon, but it is usual. RFC 1036 required the |
909 |
white space, even in empty headers, and some |
910 |
existing software demands it. In MAIL, and |
911 |
arguably in RFC 1036 (although the wording is |
912 |
vague), it is technically legitimate for the white |
913 |
space to be part of a continuation line rather |
914 |
than the start-line, but not all existing software |
915 |
will accept this. Deleting empty headers and |
916 |
placing some content on the start-line avoids this |
917 |
issue... which is desirable because trailing |
918 |
blanks, easily deleted by accident, are best not |
931 |
made significant in headers. |
932 |
|
933 |
In general, posters and posting agents SHOULD use blank |
934 |
(ASCII 32), not tab (ASCII 9), where white space is desired |
935 |
in headers. Existing software does not consistently accept |
936 |
tab as synonymous with blank in all contexts. In particu- |
937 |
lar, RFC 1036 appeared to specify that the character immedi- |
938 |
ately following the colon after a header name was required |
939 |
to be a blank, and some news software insists on that, so |
940 |
this character MUST be a blank. Again, posting agents MUST |
941 |
enforce these restrictions but relayers SHOULD be more tol- |
942 |
erant. |
943 |
|
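NOTE: A posting agent's check of a start-line could be
sketched in Python as below. The function name and the
has_continuation flag are assumptions for the example; a
relayer would be more tolerant than this.

```python
def start_line_ok(start_line, has_continuation=False):
    """Posting-agent check of a header start-line (EOL removed)."""
    name, colon, rest = start_line.partition(":")
    if not name or not colon:
        return False
    # The character right after the colon has to be a blank
    # (ASCII 32), even when the header is empty.
    if not rest.startswith(" "):
        return False
    # A header that continues on later lines is not empty, so some
    # of its content has to appear on the start-line.
    if has_continuation and rest.strip(" \t") == "":
        return False
    return True
```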
944 |
Since the white space beginning a continuation line remains |
945 |
a part of the logical line, headers can be "broken" into |
946 |
multiple lines only at white space. Posting agents SHOULD |
946 |
NOT break headers unnecessarily. Relayers SHOULD preserve |
947 |
existing header breaks, and SHOULD NOT introduce new breaks. |
949 |
Breaking headers SHOULD be a last resort; relayers and read- |
950 |
ing agents SHOULD handle long header lines gracefully. (See |
951 |
the discussion of size limits in section 4.6.) |
952 |
|
953 |
|
954 |
4.3. Body |
955 |
|
956 |
Although the article body is unstructured for most of the |
957 |
purposes of this Draft, structure MAY be imposed on it by |
958 |
other means, notably MIME headers (see appendix B). |
959 |
|
960 |
|
961 |
4.3.1. Body Format Issues |
962 |
|
963 |
The body of an article MAY be empty, although posting agents |
964 |
SHOULD consider this an error condition (meriting returning |
965 |
the article to the poster for revision). A posting agent |
966 |
which does not reject such an article SHOULD issue a warning |
967 |
message to the poster and supply a non-empty body. Note |
968 |
that the separator line MUST be present even if the body is |
969 |
empty. |
970 |
|
971 |
NOTE: An empty body is probably a poster error |
972 |
except, arguably, for some control messages... and |
973 |
even they really ought to have a body explaining |
974 |
the reason for the control message. Some old |
975 |
reading agents are known to generate empty bodies |
976 |
for "cancel" control messages, so posting agents |
977 |
might opt not to reject body-less articles in such |
978 |
cases (although it would be better to fix the |
979 |
reading agents to request a body). However, some |
980 |
existing news software is known to react badly to |
981 |
body-less articles, hence the request for posting |
982 |
agents to insert a body in such cases. |
983 |
|
997 |
NOTE: A possible posting-agent-supplied body text |
998 |
(already used by one widespread posting agent) is |
999 |
"This article was probably generated by a buggy |
1000 |
news reader.". (The use of "reader" to refer to |
1001 |
the reading agent is traditional, although this |
1002 |
Draft uses more precise terminology.) |
1003 |
|
1004 |
NOTE: The requirement for the separator line even |
1005 |
in a bodyless article is inherited from MAIL, and |
1006 |
also distinguishes legitimately-bodyless articles |
1007 |
from articles accidentally truncated in the middle |
1008 |
of the headers. |
1009 |
|
1010 |
Note that an article body is a sequence of lines terminated |
1011 |
by EOLs, not arbitrary binary data, and in particular it |
1012 |
MUST end with an EOL. However, relayers SHOULD treat the |
1013 |
body of an article as an uninterpreted sequence of octets |
1014 |
(except as mandated by changes of EOL representation and by |
1015 |
control-message processing) and SHOULD avoid imposing con- |
1016 |
straints on it. See also section 4.6. |
1017 |
|
1018 |
|
1019 |
4.3.2. Body Conventions |
1020 |
|
1021 |
Although body lines can in principle be very long (see sec- |
1022 |
tion 4.6 for some discussion of length limits), posters |
1023 |
SHOULD restrict body line lengths to circa 70-75 characters. |
1024 |
On systems where text is conventionally stored with EOLs |
1025 |
only at paragraph breaks and other "hard return" points, |
1026 |
with software breaking lines as appropriate for display or |
1027 |
manipulation, posting agents SHOULD insert EOLs as necessary |
1028 |
so that posted articles comply with this restriction. |
1029 |
|
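NOTE: One way a posting agent might insert EOLs is sketched
below in Python, using the standard textwrap module. The
width of 72 is merely one value within the suggested range,
and the name wrap_body is hypothetical.

```python
import textwrap

def wrap_body(text, width=72):
    """Break over-long lines at white space; leave other lines alone."""
    out = []
    for line in text.split("\n"):
        if len(line) <= width:
            # Short lines (including empty ones and lines with
            # trailing blanks) pass through untouched.
            out.append(line)
        else:
            out.extend(textwrap.wrap(line, width=width,
                                     break_long_words=False,
                                     break_on_hyphens=False))
    return "\n".join(out)
```

Words longer than the width (long URLs, for example) are
deliberately left unbroken in this sketch.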
1030 |
NOTE: News originated in environments where line |
1031 |
breaks in plain text files were supplied by the |
1032 |
user, not the software. Be this good or bad, much |
1033 |
reading-agent and posting-agent software assumes |
1034 |
that news articles follow this convention, so it |
1035 |
is often inconvenient to read or respond to arti- |
1036 |
cles which violate it. The "70-75" number comes |
1037 |
from the widespread use of display devices which |
1038 |
are 80 columns wide, and the desire to leave a bit |
1039 |
of margin for quoting etc. (see below). |
1040 |
|
1041 |
Reading agents confronted with body lines much longer than |
1042 |
the available output-device width SHOULD break lines as |
1043 |
appropriate. Posters are warned that such breaks may not |
1044 |
occur exactly where the poster intends. |
1045 |
|
1046 |
NOTE: "As appropriate" would typically include |
1047 |
breaking lines when supplying the text of an arti- |
1048 |
cle to be quoted in a reply or followup, something |
1049 |
that line-breaking reading agents often neglect to |
1050 |
do now. |
1051 |
|
1063 |
Although styles vary widely, for plain text it is usual to |
1064 |
use no left margin, leave the right edge ragged, use a sin- |
1065 |
gle empty line to separate paragraphs, and employ normal |
1066 |
natural-language usage on matters such as upper/lowercase. |
1067 |
(In particular, articles SHOULD NOT be written entirely in |
1068 |
uppercase. In environments where posters have access only |
1069 |
to uppercase, posting agents SHOULD translate it to lower- |
1070 |
case.) |
1071 |
|
1072 |
NOTE: Most people find substantial bodies of text |
1073 |
entirely in uppercase relatively hard to read, |
1074 |
while all-lowercase text merely looks slightly |
1075 |
odd. The common association of uppercase with |
1076 |
strong emphasis adds to this. |
1077 |
|
1078 |
Tone of voice does not carry well in written text, and mis- |
1079 |
understandings are common when sarcasm, parody, or exaggera- |
1080 |
tion for humorous effect is attempted without explicit warn- |
1081 |
ing. It has become conventional to use the sequence ":-)", |
1082 |
which (on most output devices) resembles a rotated "smiley |
1083 |
face" symbol, as a marker for text not meant to be taken |
1084 |
literally, especially when humor is intended. This practice |
1085 |
aids communication and averts unintended ill-will; posters |
1086 |
are urged to use it. A variety of analogous sequences are |
1087 |
used with less-standardized meanings [Sanderson]. |
1088 |
|
1089 |
The order of arrival of news articles at a particular host |
1090 |
depends somewhat on transmission paths, and occasionally |
1091 |
articles are lost for various reasons. When responding to a |
1092 |
previous article, posters SHOULD NOT assume that all readers |
1093 |
understand the exact context. It is common to quote some of |
1094 |
the previous article to establish context. This SHOULD be |
1095 |
done by prefacing each quoted line (even if it is empty) |
1096 |
with the character ">". This will result in multiple levels |
1097 |
of ">" when quoted context itself contains quoted context. |
1098 |
|
1099 |
NOTE: It may seem superfluous to put a prefix on |
1100 |
empty lines, but it simplifies implementation of |
1101 |
functions such as "skip all quoted text" in read- |
1102 |
ing agents. |
1103 |
|
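NOTE: The quoting convention, and the "skip all quoted
text" function it simplifies, can be sketched in Python as
follows. Both function names are hypothetical, and the body
is assumed to end with an EOL represented as "\n".

```python
def quote(body):
    """Prefix every line of body, empty ones included, with '>'."""
    lines = body.split("\n")
    # The body ends with an EOL, so the last element of the split
    # is an empty string rather than a line.
    if lines and lines[-1] == "":
        lines.pop()
    return "".join(">" + line + "\n" for line in lines)

def new_text(body):
    """Return the lines of body that are not quoted context."""
    return [line for line in body.split("\n")[:-1]
            if not line.startswith(">")]
```

Quoting an already-quoted body simply adds another level of
">".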
1104 |
Readability is enhanced if quoted text and new text are sep- |
1105 |
arated by an empty line. |
1106 |
|
1107 |
Posters SHOULD edit quoted context to trim it down to the |
1108 |
minimum necessary. However, posting agents SHOULD NOT |
1109 |
attempt to enforce this by imposing overly-simplistic rules |
1110 |
like "no more than 50% of the lines should be quotes". |
1111 |
|
1112 |
NOTE: While encouraging trimming is desirable, the |
1113 |
50% rule imposed by some old posting agents is |
1114 |
both inadequate and counterproductive. Posters do |
1115 |
not respond to it by being more selective about |
1116 |
quoting; they respond by padding short responses, |
1129 |
or by using different quoting styles to defeat |
1130 |
automatic analysis. The former adds unnecessary |
1131 |
noise and volume, while the latter also defeats |
1132 |
more useful forms of automatic analysis that read- |
1133 |
ing agents might wish to do. |
1134 |
|
1135 |
NOTE: At the very least, if a minimum-unquoted |
1136 |
quota is being set, article bodies shorter than |
1137 |
(say) 20 lines, or perhaps articles which exceed |
1138 |
the quota by only a few lines, should be exempt. |
1139 |
This avoids the ridiculous situation of complain- |
1140 |
ing about a 5-line response to a 6-line quote. |
1141 |
|
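NOTE: For a posting agent that insists on a quota, the
exemptions just described might be sketched in Python as
below. The slack of 3 lines is an arbitrary assumption, as
is the function name; the 20-line threshold is the "(say)"
figure above.

```python
def quota_complaint(body_lines, max_quoted=0.5, short_body=20, slack=3):
    """Return True only if the quota is clearly exceeded."""
    total = len(body_lines)
    quoted = sum(1 for line in body_lines if line.startswith(">"))
    # Short bodies are exempt altogether.
    if total < short_body:
        return False
    # Longer bodies are allowed to exceed the quota by a few lines.
    return quoted > max_quoted * total + slack
```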
1142 |
NOTE: A more subtle posting-agent rule, suggested |
1143 |
for experimental use, is to reject articles that |
1144 |
appear to contain quoted signatures (see below). |
1145 |
This is almost certainly the result of a careless |
1146 |
poster not bothering to trim down quoted context. |
1147 |
Also, if a posting agent or followup agent pre- |
1148 |
sents an article template to the poster for edit- |
1149 |
ing, it really should take note of whether the |
1150 |
poster actually made any changes, and refrain from |
1151 |
posting an unmodified template. |
1152 |
|
1153 |
Some followup agents supply "attribution" lines for quoted |
1154 |
context, indicating where it first appeared and under whose |
1155 |
name. When multiple levels of quoting are present and |
1156 |
quoted context is edited for brevity, "inner" attribution |
1157 |
lines are not always retained. The editing process is also |
1158 |
somewhat error-prone. Reading agents (and readers) are |
1159 |
warned not to assume that attributions are accurate. |
1160 |
|
1161 |
UNRESOLVED ISSUE: Should a standard format for |
1162 |
attribution lines be defined? There is already |
1163 |
considerable diversity... but automatic news anal- |
1164 |
ysis would be substantially aided by a standard |
1165 |
convention. |
1166 |
|
1167 |
Early difficulties in inferring return addresses from arti- |
1168 |
cle headers led to "signatures": short closing texts, auto- |
1169 |
matically added to the end of articles by posting agents, |
1170 |
identifying the poster and giving network addresses etc. |
1171 |
If a poster or posting agent does append a signature to an |
1172 |
article, the signature SHOULD be preceded with a delimiter |
1173 |
line containing (only) two hyphens (ASCII 45) followed by |
1174 |
one blank (ASCII 32). Posting agents SHOULD limit the |
1175 |
length of signatures, since verbose excess bordering on |
1176 |
abuse is common if no restraint is imposed; 4 lines is a |
1177 |
common limit. |
1178 |
|
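NOTE: A Python sketch of delimiter handling follows. It
assumes that the last delimiter line in the body is the one
that counts, which this Draft does not specify; the names
are hypothetical.

```python
SIG_DELIMITER = "-- "    # two hyphens followed by one blank

def split_signature(lines):
    """Split body lines into (text, signature); signature may be []."""
    # Search from the end, so that a delimiter inside quoted
    # context is not mistaken for the poster's own.
    for i in range(len(lines) - 1, -1, -1):
        if lines[i] == SIG_DELIMITER:
            return lines[:i], lines[i + 1:]
    return lines, []

def signature_too_long(lines, limit=4):
    return len(split_signature(lines)[1]) > limit
```

Note that a line containing only "--", with the trailing
blank lost, is not recognized as a delimiter.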
1179 |
NOTE: While signatures are arguably a blemish, |
1180 |
they are a well-understood convention, and convey- |
1181 |
ing the same information in headers exposes it to |
1182 |
mangling and makes it rather less conspicuous. A |
1195 |
standard delimiter line makes it possible for |
1196 |
reading agents to handle signatures specially if |
1197 |
desired. (This is unfortunately hampered by |
1198 |
extensive misunderstanding of, and misuse of, the |
1199 |
delimiter.) |
1200 |
|
1201 |
NOTE: The choice of delimiter is somewhat unfortu- |
1202 |
nate, since it relies on preservation of trailing |
1203 |
white space, but it is too well-established to |
1204 |
change. There is work underway to define a more |
1205 |
sophisticated signature scheme as part of MIME, |
1206 |
and this will presumably supersede the current |
1207 |
convention in due time. |
1208 |
|
1209 |
NOTE: Four 75-column lines of signature text is |
1210 |
300 characters, which is ample to convey name and |
1211 |
mail-address information in all but the most |
1212 |
bizarre situations. |
1213 |
|
1214 |
|
1215 |
4.4. Characters And Character Sets |
1216 |
|
1217 |
Header and body lines MAY contain any ASCII characters other |
1218 |
than CR (ASCII 13), LF (ASCII 10), and NUL (ASCII 0). |
1219 |
|
1220 |
NOTE: CR and LF are excluded because they clash |
1221 |
with common EOL conventions. NUL is excluded |
1222 |
because it clashes with the C end-of-string con- |
1223 |
vention, which is significant to most existing |
1224 |
news software. These three characters are |
1225 |
unlikely to be transmitted successfully. |
1226 |
|
1227 |
However, posters SHOULD avoid using ASCII control characters |
1228 |
except for tab (ASCII 9), formfeed (ASCII 12), and backspace |
1229 |
(ASCII 8). Tab signifies sufficient horizontal white space |
1230 |
to reach the next of a set of fixed positions; posters are |
1231 |
warned that there is no standard set of positions, so tabs |
1232 |
should be avoided if precise spacing is essential. Formfeed |
1233 |
signifies a point at which a reading agent SHOULD pause and |
1234 |
await reader interaction before displaying further text. |
1235 |
Backspace SHOULD be used only for underlining, done by a |
1236 |
sequence of underscores (ASCII 95) followed by an equal num- |
1237 |
ber of backspaces, signifying that the same number of text |
1238 |
characters following are to be underlined. Posters are |
1239 |
warned that underlining is not available on all output |
1240 |
devices and is best not relied on for essential meaning. |
1241 |
Reading agents SHOULD recognize underlining and translate it |
1242 |
to the appropriate commands for devices that support it. |
1243 |
|
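NOTE: The underlining convention can be sketched in Python
as below: underline() produces the sequence, and
decode_underlining() recovers the plain text plus the
underlined spans for a reading agent to render as it sees
fit. Both names are hypothetical.

```python
import re

BS = "\x08"                                   # backspace, ASCII 8
_RUN = re.compile(r"(_+)(\x08+)")             # underscores, then backspaces

def underline(text):
    n = len(text)
    return "_" * n + BS * n + text

def decode_underlining(line):
    """Return (plain_text, [(start, end), ...]) for one line."""
    plain, spans, i = [], [], 0
    while i < len(line):
        m = _RUN.match(line, i)
        if m and len(m.group(1)) == len(m.group(2)):
            n = len(m.group(1))
            chunk = line[m.end():m.end() + n]  # the text to underline
            spans.append((len(plain), len(plain) + len(chunk)))
            plain.extend(chunk)
            i = m.end() + n
        else:
            plain.append(line[i])
            i += 1
    return "".join(plain), spans
```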
1244 |
NOTE: Interpretation of almost all control charac- |
1245 |
ters is device-specific to some degree, and |
1246 |
devices differ. Tabs and underlining are sup- |
1247 |
ported, to some extent, by most modern devices and |
1248 |
reading agents, hence the cautious exemptions for |
1261 |
them. The underlining method is specified because |
1262 |
the inverse method, text and then underscores, is |
1263 |
tempting to the naive... but if sent unaltered to |
1264 |
a device that shows only the most recent of sev- |
1265 |
eral overstruck characters rather than a compos- |
1266 |
ite, the result can be utterly unreadable. |
1267 |
|
1268 |
NOTE: A common interpretation of tab is that it is |
1269 |
a request to space forward to the next position |
1270 |
whose number is one more than a multiple of 8, |
1271 |
with positions numbered sequentially starting at |
1272 |
1. (So tab positions are 9, 17, 25, ...) Reading |
1273 |
agents not constrained by existing system conven- |
1274 |
tions might wish to use this interpretation. |
1275 |
|
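NOTE: That interpretation of tab can be sketched in Python
as follows. The function name is hypothetical, and the
sketch makes no attempt to account for backspaces.

```python
def expand_tabs(line, interval=8):
    """Replace tabs by blanks, with tab positions 9, 17, 25, ..."""
    out = []
    column = 1                        # positions are numbered from 1
    for ch in line:
        if ch == "\t":
            # Next position that is one more than a multiple of
            # the interval.
            stop = ((column - 1) // interval + 1) * interval + 1
            out.append(" " * (stop - column))
            column = stop
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```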
1276 |
NOTE: It will typically be necessary for a reading |
1277 |
agent to catch and interpret formfeed, not just |
1278 |
send it to the output device. The actions per- |
1279 |
formed by typical output devices on receiving a |
1280 |
formfeed are neither adequate for nor appropriate |
1281 |
to the pause-for-interaction meaning. |
1282 |
|
1283 |
Cooperating subnets which wish to employ non-ASCII character |
1284 |
sets by using escape sequences (employing, e.g., ESC (ASCII |
1285 |
27), SO (ASCII 14), and SI (ASCII 15)) to alter the meaning |
1286 |
of superficially-ASCII characters MAY do so, but MUST use |
1287 |
MIME headers to alert reading agents to the particular char- |
1288 |
acter set(s) and escape sequences in use. A reading agent |
1289 |
SHOULD NOT pass such an escape sequence through, unaltered, |
1290 |
to the output device unless the agent confirms that the |
1291 |
sequence is one used to affect character sets and has reason |
1292 |
to believe that the device is capable of interpreting that |
1293 |
particular sequence properly. |
1294 |
|
1295 |
NOTE: Cooperating-subnet organizers are warned |
1296 |
that some very old relayers strip certain control |
1297 |
characters out of articles they pass along. ESC |
1298 |
is known to be among the affected characters. |
1299 |
|
1300 |
NOTE: There are now standard Internet encodings |
1301 |
for Japanese [rrr] and Vietnamese [rrr] in partic- |
1302 |
ular. |
1303 |
|
1304 |
Articles MUST NOT contain any octet with value exceeding |
1305 |
127, i.e. any octet that is not an ASCII character. |
1306 |
|
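NOTE: As an illustration, a Python check of one header or
body line (EOL already removed) against the octet rules of
this section might look like this. The function name and
the wording of the reports are hypothetical.

```python
def line_violations(line: bytes):
    """Return a list of (offset, problem) pairs for one line."""
    problems = []
    for offset, octet in enumerate(line):
        if octet > 127:
            problems.append((offset, "non-ASCII octet"))
        elif octet in (0, 10, 13):
            problems.append((offset, "NUL, LF, or CR"))
    return problems
```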
1307 |
NOTE: This rule, like others, may be relaxed by |
1308 |
unanimous consent of the members of a cooperating |
1309 |
subnet, provided suitable precautions are taken to |
1310 |
ensure that rule-violating articles do not leak |
1311 |
out of the subnet. (This has already been done in |
1312 |
many areas where ASCII is not adequate for the |
1313 |
local language(s).) Beware that articles contain- |
1314 |
ing non-ASCII octets in headers are a violation of |
1327 |
the MAIL specifications and are not valid MAIL |
1328 |
messages. MIME offers a way to encode non-ASCII |
1329 |
characters in ASCII for use in headers; see sec- |
1330 |
tion 4.5. |
1331 |
|
1332 |
NOTE: While there is great interest in using 8-bit |
1333 |
character sets, not all software can yet handle |
1334 |
them correctly. Hence the restriction to cooper- |
1335 |
ating subnets. MIME encodings can be used to |
1336 |
transmit such characters while remaining within |
1337 |
the octet restriction. |
1338 |
|
1339 |
In anticipation of the day when it is possible to use non- |
1340 |
ASCII characters safely anywhere, and to provide for the |
1341 |
(substantial) cooperating subnets that are already using |
1342 |
them, transmission paths SHOULD treat news articles as unin- |
1343 |
terpreted sequences of octets (except perhaps for transfor- |
1344 |
mations between EOL representations) and relayers SHOULD |
1345 |
treat non-ASCII characters in articles as ordinary charac- |
1346 |
ters. |
1347 |
|
1348 |
NOTE: 8-bit enthusiasts are warned that not all |
1349 |
software conforms to these recommendations yet. |
1350 |
In particular, standard NNTP [rrr] is a 7-bit pro- |
1351 |
tocol, and there may be implementations which |
1352 |
enforce this rule. Be warned, also, that it will |
1353 |
never be safe to send raw binary data in the body |
1354 |
of news articles, because changes of EOL represen- |
1355 |
tation may (will!) corrupt it. |
1356 |
|
1357 |
Except where cooperating subnets permit more direct |
1358 |
approaches, MIME [rrr] headers and encodings SHOULD be used |
1359 |
to transmit non-ASCII content using ASCII characters; see |
1360 |
section 4.5, appendix B, and the MIME RFCs for details. If |
1361 |
article content can be expressed in ASCII, it SHOULD be. |
1362 |
Failing that, the order of preference for character sets is |
1363 |
that described in MIME [rrr]. |
1364 |
|
1365 |
NOTE: Using the MIME facilities, it is possible to |
1366 |
transmit ANY character set, and ANY form of binary |
1367 |
data, using only ASCII characters. Equally impor- |
1368 |
tant, such articles are self-describing and the |
1369 |
reading agent can tell which octet-to-symbol map- |
1370 |
ping is intended! Designation of some preferred |
1371 |
character sets is intended to minimize the number |
1372 |
of character sets that a reading agent must under- |
1373 |
stand in order to display most articles properly. |
1374 |
|
1375 |
Articles containing non-ASCII characters, articles using |
1376 |
ASCII characters (values 0 through 127) to refer to non- |
1377 |
ASCII symbols, and articles using escape sequences to shift |
1378 |
character sets SHOULD include MIME headers indicating which |
1379 |
character set(s) and conventions are being used, and MUST do |
1380 |
so unless such articles are strictly confined to a |
1393 |
cooperating subnet which has its own pre-agreed conventions. |
1394 |
MIME encodings are preferred over all these techniques. If |
1395 |
it comes to a relayer's attention that it is being asked to |
1396 |
pass an article using such techniques outward across what it |
1397 |
knows to be the boundary of such a cooperating subnet, it |
1398 |
MUST report this error to its administrator, and MAY refuse |
1399 |
to pass the article beyond the subnet boundary. If it does |
1400 |
pass the article, it MUST re-encode it with MIME encodings |
1401 |
to make it conform to this Draft. |
1402 |
|
1403 |
NOTE: Such re-encoding is a non-trivial task, due |
1404 |
to MIME rules such as the prohibition of nested |
1405 |
encodings. It's not just a matter of pouring the |
1406 |
body through a simple filter. |
1407 |
|
1408 |
Reading agents SHOULD note MIME headers and attempt to show |
1409 |
the reader the closest possible approximation to the |
1410 |
intended content. They SHOULD NOT just send the octets of |
1411 |
the article to the output device unaltered, unless there is |
1412 |
reason to believe that the output device will indeed |
1413 |
interpret them correctly. Reading agents MUST NOT pass ASCII |
1414 |
control characters or escape sequences, other than as dis- |
1415 |
cussed above, unaltered to the output device; only by chance |
1416 |
would the result be the desired one, and there is serious |
1417 |
potential for harmful side effects, either accidental or |
1418 |
malicious. |
1419 |
|
1420 |
NOTE: Exactly what to do with unwanted control |
1421 |
characters/sequences depends on the philosophy of |
1422 |
the reading agent, but passing them straight to |
1423 |
the output device is almost always wrong. If the |
1424 |
reading agent wants to mark the presence of such a |
1425 |
character/sequence in circumstances where only |
1426 |
ASCII printable characters are available, trans- |
1427 |
lating it to "#" might be a suitable method; "#" |
1428 |
is a conspicuous character seldom used in normal |
1429 |
text. |
1430 |
|
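NOTE: The "#" substitution can be sketched in Python as
below. The sketch leaves tab, formfeed, and backspace in
place on the assumption that a later stage of the reading
agent interprets them; the names are hypothetical.

```python
ALLOWED_CONTROLS = {8, 9, 12}     # backspace, tab, formfeed

def sanitize(line: bytes) -> str:
    """Replace unwanted octets in one line with '#'."""
    out = []
    for octet in line:
        if 32 <= octet <= 126 or octet in ALLOWED_CONTROLS:
            out.append(chr(octet))
        else:
            # Other control characters, DEL, and non-ASCII octets.
            out.append("#")
    return "".join(out)
```

Replacing non-ASCII octets here, rather than passing them
on, also avoids their becoming control characters on
devices that zero out the top bit.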
1431 |
NOTE: Reading agents should be aware that many old |
1432 |
output devices (or the transmission paths to them) |
1433 |
zero out the top bit of octets sent to them. This |
1434 |
can transform non-ASCII characters into ASCII con- |
1435 |
trol characters. |
1436 |
|
1437 |
Followup agents MUST be careful to apply appropriate trans- |
1438 |
formations of representation to the outbound followup as |
1439 |
well as the inbound precursor. A followup to an article |
1440 |
containing non-ASCII material is very likely to contain non- |
1441 |
ASCII material itself. |
1442 |
|
1443 |
|
1459 |
4.5. Non-ASCII Characters In Headers |
1460 |
|
1461 |
All octets found in headers MUST be ASCII characters. How- |
1462 |
ever, it is desirable to have a way of encoding non-ASCII |
1463 |
characters, especially in "human-readable" headers such as |
1464 |
Subject. MIME [rrr] provides a way to do this. Full |
1465 |
details may be found in the MIME specifications; herewith a |
1466 |
quick summary to alert software authors to the issues... |
1467 |
|
1468 |
encoded-word = "=?" charset "?" encoding "?" codes "?=" |
1469 |
charset = 1*tag-char |
1470 |
encoding = 1*tag-char |
1471 |
tag-char = <ASCII printable character except !()<>@,;:\"[]/?=> |
1472 |
codes = 1*code-char |
1473 |
code-char = <ASCII printable character except ?> |
1474 |
|
1475 |
An encoded word is a sequence of ASCII printable characters |
1476 |
that specifies the character set, encoding method, and bits |
1477 |
of (potentially) non-ASCII characters. Encoded words are |
1478 |
allowed only in certain positions in certain headers. Spe- |
1479 |
cific headers impose restrictions on the content of encoded |
1480 |
words beyond that specified in this section. Posting agents |
1481 |
MUST ensure that any material resembling an encoded word |
1482 |
(complete with all delimiters), in a context where encoded |
1483 |
words may appear, really is an encoded word. |
1484 |
|
1485 |
NOTE: The syntax is a bit ugly, but it was |
1486 |
designed to minimize chances of confusion with |
1487 |
legitimate header contents, and to satisfy diffi- |
1488 |
cult constraints on use within existing headers. |
1489 |
|
1490 |
An encoded word MUST not be more than 75 octets long. Each |
1491 |
line of a header containing encoded word(s) MUST be at most |
1492 |
76 octets long, not counting the EOL. |
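
As an illustration only, the grammar and the two length limits above
can be sketched as a small recognizer in Python. The names
is_encoded_word and line_within_limit are hypothetical, and the sketch
assumes that "ASCII printable character" means codes 33 through 126
(no space). It checks syntax and length only; it does not decode.

```python
import re

# tag-char: printable ASCII (assumed 33..126) except !()<>@,;:\"[]/?=
_TAG = r'(?:(?![!()<>@,;:\\"\[\]/?=])[\x21-\x7e])+'
# code-char: printable ASCII (assumed 33..126) except "?"
_CODES = r"[\x21-\x3e\x40-\x7e]+"
ENCODED_WORD = re.compile(rf"=\?{_TAG}\?{_TAG}\?{_CODES}\?=")

MAX_WORD_OCTETS = 75   # limit on one encoded word
MAX_LINE_OCTETS = 76   # limit on a header line containing encoded words


def is_encoded_word(token):
    """True if token has encoded-word syntax and respects the 75-octet limit."""
    return (len(token) <= MAX_WORD_OCTETS
            and ENCODED_WORD.fullmatch(token) is not None)


def line_within_limit(line):
    """True unless the line contains an encoded word and exceeds 76 octets."""
    has_word = any(is_encoded_word(t) for t in line.split())
    return not has_word or len(line) <= MAX_LINE_OCTETS
```

Because both limits are small, a scanner that sees "=?" never needs to
look further than the rest of the current line to decide.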
1493 |
|
1494 |
NOTE: These limits are meant to bound the looka- |
1495 |
head needed to determine whether text that begins |
1496 |
"=?" is really an encoded word. |
1497 |
|
1498 |
The details of charsets and encodings are defined by MIME |
1499 |
[rrr]; the sequence of preferred character sets is the same |
1500 |
as MIME's. Encoded words SHOULD not be used for content |
1501 |
expressible in ASCII. |
1502 |
|
1503 |
When an encoded word is used, other than in a newsgroup name |
1504 |
(see section 5.5), it MUST be separated from any adjacent |
1505 |
non-space characters (including other encoded words) by |
1506 |
white space. Reading agents displaying the contents of |
1507 |
encoded words (as opposed to their encoded form) should |
1508 |
ignore white space adjacent to encoded words. |
1509 |
|
1510 |
UNRESOLVED ISSUE: Should this section be deleted |
1511 |
entirely, or made much more terse? The material |
1512 |
is relevant, but too complex to discuss fully. |
1513 |
|
1514 |
NOTE: The deletion of intervening white space per- |
1526 |
mits using multiple encoded words, implicitly con- |
1527 |
catenated by the deletion, to encode text that |
1528 |
will not fit within a single 75-character encoded |
1529 |
word. |
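
A minimal sketch of the concatenation described in the NOTE above,
assuming the MIME "B" (base64) encoding and the ISO-8859-1 charset.
The function names are hypothetical, and a real posting agent would
also have to honor the 76-octet line limit when folding the header.

```python
import base64

MAX_WORD_OCTETS = 75


def encode_words(text, charset="iso-8859-1"):
    """Split text into "B" encoded words of at most 75 octets each."""
    prefix, suffix = f"=?{charset}?B?", "?="
    room = MAX_WORD_OCTETS - len(prefix) - len(suffix)
    max_raw = (room // 4) * 3          # base64 turns 3 octets into 4 characters
    words, chunk = [], b""
    for ch in text:                    # never split inside one character
        raw = ch.encode(charset)
        if chunk and len(chunk) + len(raw) > max_raw:
            words.append(prefix + base64.b64encode(chunk).decode("ascii") + suffix)
            chunk = b""
        chunk += raw
    if chunk:
        words.append(prefix + base64.b64encode(chunk).decode("ascii") + suffix)
    return words


def decode_words(field):
    """Decode white-space-separated "B" encoded words, dropping the white space."""
    parts = []
    for word in field.split():
        _, charset, encoding, codes, _ = word.split("?")
        if encoding.upper() != "B":
            raise ValueError("this sketch handles only the B encoding")
        parts.append(base64.b64decode(codes).decode(charset))
    return "".join(parts)
```

The white space that separates the words on the wire does not appear
in the decoded text, so a long phrase survives the split unchanged.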
1530 |
|
1531 |
Reading-agent implementors are warned that although this |
1532 |
Draft completely specifies where encoded words may appear in |
1533 |
the headers it defines, there are other headers (e.g. the |
1534 |
MIME Content-Description header) that MAY contain them. |
1535 |
|
1536 |
|
1537 |
4.6. Size Limits |
1538 |
|
1539 |
Implementations SHOULD avoid fixed constraints on the sizes |
1540 |
of lines within an article and on the size of the entire |
1541 |
article. |
1542 |
|
1543 |
Relayers SHOULD treat the body of an article as an uninter- |
1544 |
preted sequence of octets (except as mandated by changes of |
1545 |
EOL representation and processing of control messages), not |
1546 |
to be altered or constrained in any way. |
1547 |
|
1548 |
If it is absolutely necessary for an implementation to |
1549 |
impose a limit on the length of header lines, body lines, or |
1550 |
header logical lines, that limit MUST be at least 1000 |
1551 |
octets, including EOL representations. Relayers and trans- |
1552 |
mission paths confronted with lines beyond their internal |
1553 |
limits (if any) MUST not simply inject EOLs at random |
1554 |
places; they MAY break headers (as described in 4.2.3) as a |
1555 |
last resort, and otherwise they MUST either pass the long |
1556 |
lines through unaltered, or refuse to pass the article at |
1557 |
all (see section 9.1 for further discussion). |
1558 |
|
1559 |
NOTE: The limit here is essentially the same mini- |
1560 |
mum as that specified for SMTP mail in RFC 821 |
1561 |
[rrr]. Implementors are warned that Path (see |
1562 |
section 5.6) and References (see section 6.5) |
1563 |
headers, in particular, often become several hun- |
1564 |
dred characters long, so 1000 is not an overly |
1565 |
generous limit. |
1566 |
|
1567 |
All implementations MUST be able to handle an article |
1568 |
totalling at least 65,000 octets, including headers and EOL |
1569 |
representations, gracefully and efficiently. All implemen- |
1570 |
tations SHOULD be able to handle an article totalling at |
1571 |
least 1,000,000 (one million) octets, including headers and |
1572 |
EOL representations, gracefully and efficiently. "Grace- |
1573 |
fully and efficiently" is intended to preclude not only |
1574 |
failures, but also major loss of performance, serious prob- |
1575 |
lems in error recovery, or resource consumption beyond what |
1576 |
is reasonably necessary. |
1577 |
|
1578 |
NOTE: The intent here is to prohibit lowering the |
1592 |
existing de-facto limit any further, while |
1593 |
strongly encouraging movement towards a higher |
1594 |
one. Actually, although improvements are desir- |
1595 |
able in some cases, much news software copes rea- |
1596 |
sonably well with very large articles. The same |
1597 |
cannot be said of the communications software and |
1598 |
protocols used to transmit news from one host to |
1599 |
another, especially when slow communications links |
1600 |
are involved. Occasional huge articles that |
1601 |
appear now (by accident or through ignorance) typ- |
1602 |
ically leave trails of failing software, system |
1603 |
problems, and irate administrators in their wake. |
1604 |
|
1605 |
NOTE: It is intended that the successor to this |
1606 |
Draft will raise the "MUST" limit to 1,000,000 and |
1607 |
the "SHOULD" limit still further. |
1608 |
|
1609 |
Posters SHOULD limit posted articles to at most 60,000 |
1610 |
octets, including headers and EOL representations, unless |
1611 |
the articles are being posted only within a cooperating sub- |
1612 |
net which is known to be capable of handling larger articles |
1613 |
gracefully. Posting agents presented with a large article |
1614 |
SHOULD warn the poster and request confirmation. |
1615 |
|
1616 |
NOTE: The difference between this and the earlier |
1617 |
"MUST" limit is margin for header growth, differ- |
1618 |
ing EOL representations, and transmission over- |
1619 |
heads. |
1620 |
|
1621 |
NOTE: Disagreeable though these limits are, it is |
1622 |
a fact that in current networks, an article larger |
1623 |
than 64K (after header growth etc.) simply is not |
1624 |
transmitted reliably. Note also the comments |
1625 |
above on the trauma caused by single extremely- |
1626 |
large articles now; the problems are real and cur- |
1627 |
rent. These problems arguably should be fixed, |
1628 |
but this will not happen network-wide in the imme- |
1629 |
diate future. Hence the restriction of larger |
1630 |
articles to cooperating subnets, for now. |
1631 |
|
1632 |
Posters using non-ASCII characters in their text MUST take |
1633 |
into account the overhead involved in MIME encoding, unless |
1634 |
the article's propagation will be entirely limited to a |
1635 |
cooperating subnet which does not use MIME encodings for |
1636 |
non-ASCII characters. For example, MIME base64 encoding |
1637 |
involves growth by a factor of approximately 4/3, so an |
1638 |
article which would likely have to use this encoding should |
1639 |
be at most about 45,000 octets before encoding. |
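
The arithmetic behind the 45,000 figure can be checked with a short
Python sketch. The name max_raw_octets is hypothetical; the 60,000
value is the poster limit given above. Note that the budget also has
to cover the headers, and that base64 as actually transmitted is
broken into lines, which adds a little more overhead.

```python
import base64

POSTER_LIMIT = 60_000   # octets, the SHOULD limit for posters


def max_raw_octets(limit=POSTER_LIMIT):
    """Largest pre-encoding size whose unwrapped base64 form fits in limit."""
    return (limit // 4) * 3            # 3 raw octets become 4 encoded characters


raw = bytes(max_raw_octets())          # 45,000 zero octets as stand-in content
unwrapped = base64.b64encode(raw)      # no line breaks
wrapped = base64.encodebytes(raw)      # a line break after every 76 characters

print(len(raw), len(unwrapped), len(wrapped))
```

The unwrapped form is exactly 4/3 of the input; the wrapped form is
slightly larger, which is why the text says "about" 45,000.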
1640 |
|
1641 |
Posters SHOULD use MIME "message/partial" conventions to |
1642 |
facilitate automatic reassembly of a large document split |
1643 |
into smaller pieces for posting. It is recommended that the |
1644 |
content identifier used should be a message ID, generated by |
1645 |
the same means as article message IDs (see section 5.3), and |
1658 |
that all parts should have a See-Also header (see section |
1659 |
6.16) giving the message IDs of at least the previous parts |
1660 |
and preferably all the parts. |
1661 |
|
1662 |
NOTE: See-Also is more correct for this purpose |
1663 |
than References, although References is in common |
1664 |
use today (with less-formal reassembly arrange- |
1665 |
ments). MIME reassemblers should probably examine |
1666 |
articles suggested by References headers if See- |
1667 |
Also headers are not present to indicate the |
1668 |
whereabouts of the other parts of "mes- |
1669 |
sage/partial" articles. |
1670 |
|
1671 |
To repeat: implementations SHOULD avoid fixed constraints on |
1672 |
the sizes of lines within an article and on the size of the |
1673 |
entire article. |
1674 |
|
1675 |
|
1676 |
4.7. Example |
1677 |
|
1678 |
Here is a sample article: |
1679 |
|
1680 |
From: jerry@eagle.ATT.COM (Jerry Schwarz) |
1681 |
Path: cbosgd!mhuxj!mhuxt!eagle!jerry |
1682 |
Newsgroups: news.announce |
1683 |
Subject: Usenet Etiquette -- Please Read |
1684 |
Message-ID: <642@eagle.ATT.COM> |
1685 |
Date: Mon, 17 Jan 1994 11:14:55 -0500 (EST) |
1686 |
Followup-To: news.misc |
1687 |
Expires: Wed, 19 Jan 1994 00:00:00 -0500 |
1688 |
Organization: AT&T Bell Laboratories, Murray Hill |
1689 |
|
1690 |
body |
1691 |
body |
1692 |
body |
1693 |
|
1694 |
|
1695 |
|
1696 |
5. Mandatory Headers |
1697 |
|
1698 |
An article MUST have one, and only one, of each of the fol- |
1699 |
lowing headers: Date, From, Message-ID, Subject, Newsgroups, |
1700 |
Path. |
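
A minimal sketch of this check in Python, run over the header lines of
an article. The function name is hypothetical, and two assumptions are
made that this section does not itself state: header names are
compared without regard to case, and a line beginning with white space
continues the previous header.

```python
from collections import Counter

MANDATORY = ("Date", "From", "Message-ID", "Subject", "Newsgroups", "Path")


def mandatory_header_problems(header_lines):
    """Return {header name: count} for mandatory headers not appearing exactly once."""
    counts = Counter()
    for line in header_lines:
        if line[:1] in (" ", "\t"):    # assumed continuation line, not a new header
            continue
        name, colon, _ = line.partition(":")
        if colon:
            counts[name.strip().lower()] += 1
    return {h: counts[h.lower()] for h in MANDATORY if counts[h.lower()] != 1}
```

An empty result means the article satisfies the rule above; it says
nothing about whether the header contents are themselves valid.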
1701 |
|
1702 |
NOTE: MAIL specifies (if read most carefully) that |
1703 |
there must be exactly one Date header and exactly |
1704 |
one From header, but otherwise does not restrict |
1705 |
multiple appearances of headers. (Notably, it |
1706 |
permits multiple Message-ID headers!) This |
1707 |
appears singularly useless, or even harmful, in |
1708 |
the context of news, and much current news soft- |
1709 |
ware will not tolerate multiple appearances of |
1710 |
mandatory headers. |
1711 |
|
1712 |
Note also that there are situations, discussed in the rele- |
1724 |
vant parts of section 6, where References, Sender, or |
1725 |
Approved headers are mandatory. |
1726 |
|
1727 |
In the discussions of the individual headers, the content of |
1728 |
each is specified using the syntax notation. The convention |
1729 |
used is that the content of, for example, the Subject header |
1730 |
is defined as <Subject-content>. |
1731 |
|
1732 |
|
1733 |
5.1. Date |
1734 |
|
1735 |
The Date header contains the date and time when the article |
1736 |
was submitted for transmission: |
1737 |
|
1738 |
Date-content = [ weekday "," space ] date space time |
1739 |
weekday = "Mon" / "Tue" / "Wed" / "Thu" |
1740 |
/ "Fri" / "Sat" / "Sun" |
1741 |
date = day space month space year |
1742 |
day = 1*2digit |
1743 |
month = "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun" |
1744 |
/ "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec" |
1745 |
year = 4digit / 2digit |
1746 |
time = hh ":" mm [ ":" ss ] space timezone |
1747 |
timezone = "UT" / "GMT" |
1748 |
/ ( "+" / "-" ) hh mm [ space "(" zone-name ")" ] |
1749 |
hh = 2digit |
1750 |
mm = 2digit |
1751 |
ss = 2digit |
1752 |
zone-name = 1*( <ASCII printable character except ()\> / space ) |
1753 |
|
1754 |
This is a restricted subset of the MAIL date format. |
1755 |
|
1756 |
If a weekday is given, it MUST be consistent with the date. |
1757 |
The modern Gregorian calendar is used, and dates MUST be |
1758 |
consistent with its usual conventions; for example, if the |
1759 |
month is May, the day must be between 1 and 31 inclusive. |
1760 |
The year SHOULD be given as four digits, and posting agents |
1761 |
SHOULD enforce this; however, relayers MUST accept the two- |
1762 |
digit form, and MUST interpret it as having the implicit |
1763 |
prefix "19". |
1764 |
|
1765 |
NOTE: Two-digit year numbers can, should, and must |
1766 |
be phased out by 1999. |
1767 |
|
1768 |
The time is given on the 24-hour clock, e.g. two hours |
1769 |
before midnight is "22:00" or "22:00:00". The hh must be |
1770 |
between 00 and 23 inclusive, the mm between 00 and 59 inclu- |
1771 |
sive, and the ss between 00 and 61 inclusive. |
1772 |
|
1773 |
NOTE: Leap seconds very occasionally result in |
1774 |
minutes that are 61 or 62 seconds long. |
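
The Date-content syntax and the consistency rules above can be
sketched as a Python parser. This is an illustration under stated
assumptions, not a reference implementation: the name parse_date is
hypothetical, "space" is assumed to mean one or more blanks or tabs,
and "ASCII printable" is assumed to mean codes 33 through 126.

```python
import datetime
import re

_MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()
_WEEKDAYS = "Mon Tue Wed Thu Fri Sat Sun".split()   # same order as date.weekday()

_DATE = re.compile(r"""
    (?: (?P<weekday> Mon|Tue|Wed|Thu|Fri|Sat|Sun ) , [ \t]+ )?
    (?P<day> \d{1,2} ) [ \t]+
    (?P<month> Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec ) [ \t]+
    (?P<year> \d{4} | \d{2} ) [ \t]+
    (?P<hh> \d{2} ) : (?P<mm> \d{2} ) (?: : (?P<ss> \d{2} ) )? [ \t]+
    (?: (?P<ut> UT | GMT )
      | (?P<sign> [+-] ) (?P<zh> \d{2} ) (?P<zm> \d{2} )
        (?: [ \t]+ \( (?: (?![()\\]) [\x20-\x7e\t] )+ \) )? )
""", re.VERBOSE)


def parse_date(content):
    """Return (date, (hh, mm, ss), offset in minutes east of UT) or raise ValueError."""
    m = _DATE.fullmatch(content)
    if m is None:
        raise ValueError("not a valid Date-content")
    year = int(m["year"])
    if len(m["year"]) == 2:
        year += 1900                   # implicit prefix "19"
    # datetime.date rejects impossible dates such as 31 Feb
    date = datetime.date(year, _MONTHS.index(m["month"]) + 1, int(m["day"]))
    if m["weekday"] and _WEEKDAYS[date.weekday()] != m["weekday"]:
        raise ValueError("weekday inconsistent with date")
    hh, mm, ss = int(m["hh"]), int(m["mm"]), int(m["ss"] or 0)
    if hh > 23 or mm > 59 or ss > 61:  # ss limit as given in this section
        raise ValueError("time out of range")
    if m["ut"]:
        offset = 0
    else:
        if int(m["zm"]) > 59:
            raise ValueError("timezone minutes out of range")
        offset = int(m["zh"]) * 60 + int(m["zm"])
        if m["sign"] == "-":
            offset = -offset
    return date, (hh, mm, ss), offset
```

The time is returned as a plain tuple rather than a datetime.time
because the latter cannot represent a seconds value above 59.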
1775 |
|
1776 |
The date and time SHOULD be given in the poster's local |
1790 |
timezone, including a specification of that timezone as a |
1791 |
numeric offset (which SHOULD include the timezone name, e.g. |
1792 |
"EST", supplied in parentheses like a MAIL comment). If |
1793 |
not, they MUST be given in Universal Time (abbreviated "UT"; |
1794 |
"GMT" is a historical synonym for "UT"). The timezone name |
1795 |
in parentheses, if present, is a comment; software MUST |
1796 |
ignore it, except that reading agents might wish to display |
1797 |
it to the reader. Timezone names other than "UT" and "GMT" |
1798 |
MUST appear only in the comment. |
1799 |
|
1800 |
NOTE: Attempts to deal with a full set of timezone |
1801 |
names have all foundered on the vast number of |
1802 |
such names in use and the duplications (for exam- |
1803 |
ple, there are at least FIVE different timezones |
1804 |
called "EST" by somebody). Even the limited set |
1805 |
of North American zone names authorized by MAIL is |
1806 |
subject to confusion and misinterpretation. Hence |
1807 |
the flat ban on non-UT timezone names except as |
1808 |
comments. |
1809 |
|
1810 |
NOTE: RFC 1036 specified that use of GMT (aka UT, |
1811 |
UTC) was preferred. However, the local time (in |
1812 |
the poster's timezone) is arguably information of |
1813 |
possible interest to the reader, and this requires |
1814 |
some indication of the poster's timezone. Numeric |
1815 |
offsets are an unambiguous way of doing this, and |
1816 |
their use was indeed sanctioned by RFC 1036 (that |
1817 |
is, this is a change of preference only). |
1818 |
|
1819 |
NOTE: There is frequent confusion, including |
1820 |
errors in some news software, regarding the sign |
1821 |
of numeric timezones. Zones west of Greenwich |
1822 |
have negative offsets. For example, North Ameri- |
1823 |
can Eastern Standard Time is zone -0500 and North |
1824 |
American Eastern Daylight Time is zone -0400. |
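
A short worked example of the sign convention, using Python's standard
datetime module. The helper name to_ut is hypothetical. The numeric
zone is the amount already added to UT to get local time, so
converting back to UT subtracts it.

```python
from datetime import datetime, timedelta, timezone


def to_ut(local, zone):
    """Convert a naive local datetime plus a numeric zone like "-0500" to UT."""
    sign = -1 if zone[0] == "-" else 1
    delta = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[3:5]))
    return local.replace(tzinfo=timezone(delta)).astimezone(timezone.utc)


# 11:14:55 in zone -0500 (west of Greenwich) is 16:14:55 UT.
print(to_ut(datetime(1994, 1, 17, 11, 14, 55), "-0500"))
```

Software that gets the sign backwards would report 06:14:55 UT for
the same header, a ten-hour error.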
1825 |
|
1826 |
NOTE: Implementors are warned that the hh in a |
1827 |
timezone can go up to about 14; it is not limited |
1828 |
to 12. This is because the International Date |
1829 |
Line does not run exactly along the boundary |
1830 |
between zone -1200 and zone +1200. |
1831 |
|
1832 |
NOTE: The comments in section 2.6 regarding trans- |
1833 |
lation to other languages are relevant here. The |
1834 |
Date-content format, and the spellings of its com- |
1835 |
ponents, as found in articles themselves, are |
1836 |
always as defined in this Draft, regardless of the |
1837 |
language used to interact with readers and |
1838 |
posters. Reading and posting agents should trans- |
1839 |
late as appropriate. Actually, even English- |
1840 |
language reading and posting agents will probably |
1841 |
want to do some degree of translation on dates, if |
1842 |
only to abbreviate the lengthy format and |
1843 |
(perhaps) translate to and from the reader's time- |
1856 |
zone. |
1857 |
|
1858 |
|
1859 |
5.2. From |
1860 |
|
1861 |
The From header contains the electronic address, and possi- |
1862 |
bly the full name, of the article's author: |
1863 |
|
1864 |
From-content = address [ space "(" paren-phrase ")" ] |
1865 |
/ [ plain-phrase space ] "<" address ">" |
1866 |
paren-phrase = 1*( paren-char / space / encoded-word ) |
1867 |
paren-char = <ASCII printable character except ()<>\> |
1868 |
plain-phrase = plain-word *( space plain-word ) |
1869 |
plain-word = unquoted-word / quoted-word / encoded-word |
1870 |
unquoted-word = 1*unquoted-char |
1871 |
unquoted-char = <ASCII printable character except !()<>@,;:\".[]> |
1872 |
quoted-word = quote 1*( quoted-char / space ) quote |
1873 |
quote = <" (ASCII 34)> |
1874 |
quoted-char = <ASCII printable character except "()<>\> |
1875 |
address = local-part "@" domain |
1876 |
local-part = unquoted-word *( "." unquoted-word ) |
1877 |
domain = unquoted-word *( "." unquoted-word ) |
1878 |
|
1879 |
(Encoded words are described in section 4.5.) The full name |
1880 |
is distinguished from the electronic address either by |
1881 |
enclosing the former in parentheses (making it resemble a |
1882 |
MAIL comment, after the address) or by enclosing the latter |
1883 |
in angle brackets. The second form is preferred. In the |
1884 |
first form, encoded words inside the full name MUST be com- |
1885 |
posed entirely of <paren-char>s. In the second form, |
1886 |
encoded words inside the full name may not contain charac- |
1887 |
ters other than letters (of either case), digits, and the |
1888 |
characters "!", "*", "+", "-", "/", "=", and "_". The local |
1889 |
part is case-sensitive (except that all case counterparts of |
1890 |
"postmaster" are deemed equivalent), the domain is case- |
1891 |
insensitive, and all other parts of the From content are |
1892 |
comments which MUST be ignored by news software (except |
1893 |
insofar as reading agents may wish to display them to the |
1894 |
reader). Posters and posting agents MUST restrict them- |
1895 |
selves to this subset of the MAIL From syntax; relayers MAY |
1896 |
accept a broader subset, but see the discussion in section |
1897 |
9.1. |
1898 |
|
1899 |
NOTE: The syntax here is a restricted subset of |
1900 |
the MAIL From syntax, with quoting particularly |
1901 |
restricted, for simple parsing. In particular, |
1902 |
the presence of "<" in the From content indicates |
1903 |
that the second form is being used, otherwise the |
1904 |
first form is being used. The major restrictions |
1905 |
here are those already de-facto imposed by exist- |
1906 |
ing software. |
1907 |
|
1908 |
NOTE: Overly-lenient posting agents sometimes per- |
1922 |
mit the second form with a full name containing |
1923 |
"(" or ")", but it is extremely rare for a full |
1924 |
name to contain "<" or ">" even in mail. Accord- |
1925 |
ingly, reading agents wishing to robustly deter- |
1926 |
mine which form is in use in a particular article |
1927 |
should key on the presence or absence of "<", not |
1928 |
the presence or absence of "(". |
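
The advice in the NOTE above might be sketched as follows in Python.
The name split_from is hypothetical, and the sketch is deliberately
lenient: it separates the address from the full name but does not
validate either against the grammar.

```python
def split_from(content):
    """Return (address, full name or None), keying on the presence of "<"."""
    content = content.strip()
    if "<" in content:                           # second form: [phrase] <address>
        name, _, rest = content.partition("<")
        if not rest.endswith(">"):
            raise ValueError("missing closing '>'")
        return rest[:-1], (name.strip() or None)
    address, paren, rest = content.partition("(")   # first form: address (name)
    if not paren:
        return address.strip(), None
    if not rest.endswith(")"):
        raise ValueError("missing closing ')'")
    return address.strip(), rest[:-1]
```

Testing for "<" first means that a second-form header whose full name
happens to contain parentheses is still split correctly.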
1929 |
|
1930 |
The address SHOULD be a valid and complete Internet domain |
1931 |
address, capable of being successfully mailed to by an |
1932 |
Internet host (possibly via an MX record and a forwarder). |
1933 |
The pseudo-domain ".uucp" MAY be used for hosts registered |
1934 |
in the UUCP maps (e.g. name "xyz.uucp" for registered site |
1935 |
"xyz"), but such hosts SHOULD discontinue this usage (either |
1936 |
by arranging a proper Internet address and forwarder, or by |
1937 |
using the "% hack" (see below)), as soon as possible. Bit- |
1938 |
net hosts SHOULD use Internet addresses, avoiding the obso- |
1939 |
lescent ".bitnet" pseudo-domain. Other forms of address |
1940 |
MUST not be used. |
1941 |
|
1942 |
NOTE: "Other forms" specifically include UK-style |
1943 |
"backward" domains ("uk.oxbridge.cs" is in the |
1944 |
Czech Republic, not the UK), pure-UUCP addressing |
1945 |
("knee!shin!foot" instead of |
1946 |
"foot%shin@knee.uucp"), and abbreviated domains |
1947 |
("zebra.zoo" instead of "zebra.zoo.toronto.edu"). |
1948 |
|
1949 |
If it is necessary to use the local part to specify a rout- |
1950 |
ing relative to the nearest Internet host, this MUST be done |
1951 |
using the "% hack", using "%" as a secondary "@". For exam- |
1952 |
ple, to specify that mail to the address should go to Inter- |
1953 |
net host "foo.bar.edu", then to non-Internet host "ein", |
1954 |
then to non-Internet host "deux", for delivery there to |
1955 |
mailbox "fred", a suitable address would be: |
1956 |
|
1957 |
fred%deux%ein@foo.bar.edu |
1958 |
|
1959 |
Analogous forms using "!" in the local part MUST not be |
1960 |
used, as they are ambiguous; they should be expressed in the |
1961 |
"%" form. |
1962 |
|
1963 |
NOTE: "a!b@c" can be interpreted as either "b%c@a" |
1964 |
or "b%a@c", and there is no consistency in which |
1965 |
choice is made. Such addresses consequently are |
1966 |
unreliable. The "%" form does not suffer from |
1967 |
this problem, and although its use is officially |
1968 |
discouraged, it is a de-facto standard, to the |
1969 |
point that MAIL recognizes it. |
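
To make the reading of the "% hack" concrete, here is a minimal Python
sketch that unpacks such an address into its mailbox and the sequence
of hosts the mail visits. The name percent_route is hypothetical.

```python
def percent_route(address):
    """Return (mailbox, hosts in delivery order) for a "% hack" address."""
    local, _, host = address.rpartition("@")
    hops = [host]                      # the Internet host comes first
    while "%" in local:
        # the rightmost "%" names the next host to visit
        local, _, next_host = local.rpartition("%")
        hops.append(next_host)
    return local, hops


print(percent_route("fred%deux%ein@foo.bar.edu"))
```

For the example address in the text this yields mailbox "fred" and
the hosts "foo.bar.edu", "ein", "deux", in that order. Each host
strips only the rightmost element, so no step is ambiguous.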
1970 |
|
1971 |
Relayers MUST not, repeat MUST not, repeat MUST not, rewrite |
1972 |
From lines, in any way, however minor or innocent-seeming. |
1973 |
Trying to "fix" a non-conforming address has a very high |
1974 |
probability of making things worse. Either pass it along |
1975 |
unchanged, or reject the article. |
1988 |
|
1989 |
NOTE: An additional reason for banning the use of |
1990 |
"!" addressing is that it has a much higher proba- |
1991 |
bility of being rewritten into mangled unrecogniz- |
1992 |
ability by old relayers. |
1993 |
|
1994 |
Posters and posting agents SHOULD avoid use of the charac- |
1995 |
ters "!" and "@" in full names, as they may trigger unwanted |
1996 |
header rewriting by old, simple-minded news software. |
1997 |
|
1998 |
NOTE: Also, the characters "." and ",", not infre- |
1999 |
quently found in names (e.g., "John W. Campbell, |
2000 |
Jr."), are NOT, repeat NOT, allowed in an unquoted |
2001 |
word. A From header like the following MUST not |
2002 |
be written without the quotation marks: |
2003 |
|
2004 |
From: "John W. Campbell, Jr." <editor@analog.com> |
2005 |
|
2006 |
|
2007 |
|
2008 |
5.3. Message-ID |
2009 |
|
2010 |
The Message-ID header contains the article's message ID, a |
2011 |
unique identifier distinguishing the article from every |
2012 |
other article: |
2013 |
|
2014 |
Message-ID-content = message-id |
2015 |
message-id = "<" local-part "@" domain ">" |
2016 |
|
2017 |
As with From addresses, a message ID's local part is case- |
2018 |
sensitive and its domain is case-insensitive. The "<" and |
2019 |
">" are parts of the message ID, not peculiarities of the |
2020 |
Message-ID header. |
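
One way to honor these case rules when comparing message IDs is to
reduce each ID to a canonical key first. The following Python sketch
does that; the name message_id_key is hypothetical, and the sketch
does not otherwise validate the local part or the domain.

```python
def message_id_key(message_id):
    """Return a comparison key: local part kept as is, domain lowercased."""
    if not (message_id.startswith("<") and message_id.endswith(">")):
        raise ValueError("message ID must include its angle brackets")
    local, at, domain = message_id[1:-1].rpartition("@")
    if not (at and local and domain):
        raise ValueError("message ID must have the form <local-part@domain>")
    return f"<{local}@{domain.lower()}>"
```

Two message IDs then refer to the same article exactly when their
keys are equal.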
2021 |
|
2022 |
NOTE: News message IDs are a restricted subset of |
2023 |
MAIL message IDs. In particular, no existing news |
2024 |
software copes properly with MAIL quoting conven- |
2025 |
tions within the local part, so they are forbid- |
2026 |
den. This is unfortunate, particularly for X.400 |
2027 |
gateways that often wish to include characters |
2028 |
which are not legal in unquoted message IDs, but |
2029 |
it is impossible to fix net-wide. See the notes |
2030 |
on gatewaying in section 10. |
2031 |
|
2032 |
The domain in the message ID SHOULD be the full Internet |
2033 |
domain name of the posting agent's host. Use of the ".uucp" |
2034 |
pseudo-domain (for hosts registered in the UUCP maps) or the |
2035 |
".bitnet" pseudo-domain (for Bitnet hosts) is permissible, |
2036 |
but SHOULD be avoided. |
2037 |
|
2038 |
Posters and posting agents MUST generate the local part of a |
2039 |
message ID using an algorithm which obeys the specified syn- |
2040 |
tax (words separated by ".", with certain characters not |
2041 |
permitted) (see section 5.2 for details), and will not |
2054 |
repeat itself (ever). The algorithm SHOULD not generate |
2055 |
message IDs which differ only in case of letters. Note the |
2056 |
specification in section 6.5 of a recommended convention for |
2057 |
indicating subject changes. Otherwise the algorithm is up |
2058 |
to the implementor. |
2059 |
|
2060 |
NOTE: The crucial use of message IDs is to distin- |
2061 |
guish circulating articles from each other and |
2062 |
from articles circulated recently. They are also |
2063 |
potentially useful as permanent indexing keys, |
2064 |
hence the requirement for permanent uniqueness... |
2065 |
but indexers cannot absolutely rely on this |
2066 |
because the earlier RFCs urged it but did not |
2067 |
demand it. All major implementations have always |
2068 |
generated permanently-unique message IDs by |
2069 |
design, but in some cases this is sensitive to |
2070 |
proper administration, and duplicates may have |
2071 |
occurred by accident. |
2072 |
|
2073 |
NOTE: The most popular method of generating local |
2074 |
parts is to use the date and time, plus some way |
2075 |
of distinguishing between simultaneous postings on |
2076 |
the same host (e.g. a process number), and encode |
2077 |
them in a suitably-restricted alphabet. An older |
2078 |
but now less-popular alternative is to use a |
2079 |
sequence number, incremented each time the host |
2080 |
generates a new message ID; this is workable, but |
2081 |
requires careful design to cope properly with |
2082 |
simultaneous posting attempts, and is not as |
2083 |
robust in the presence of crashes and other mal- |
2084 |
functions. |
2085 |
|
2086 |
NOTE: Some buggy news software considers message |
2087 |
IDs completely case-insensitive, hence the advice |
2088 |
to avoid relying on case distinctions. The |
2089 |
restrictions placed on the "alphabet" of local |
2090 |
parts and domains in section 5.2 have the useful |
2091 |
side effect of making it unnecessary to parse mes- |
2092 |
sage IDs in complex ways to break them into case- |
2093 |
sensitive and case-insensitive portions. |
2094 |
|
2095 |
The local part of a message ID MUST not be "postmaster" or |
2096 |
any other string that would compare equal to "postmaster" in |
2097 |
a case-insensitive comparison. Message IDs MUST be no |
2098 |
longer than 250 octets, including the "<" and ">". |
2099 |
|
2100 |
NOTE: "Postmaster" is an irksome exception to |
2101 |
case-sensitivity in local parts, inherited from |
2102 |
MAIL, and simply avoiding it is the best way to |
2103 |
deal with it (not that it's likely, but the issue |
2104 |
needs to be dealt with). The length limit is |
2105 |
undesirable, but is present in widely-used exist- |
2106 |
ing software. The limit is actually 255, but a |
2107 |
small safety margin is wise. |
2120 |
|
2121 |
|
2122 |
5.4. Subject |
2123 |
|
2124 |
The Subject header's content (the "subject" of the article) |
2125 |
is a short phrase describing the topic of the article: |
2126 |
|
2127 |
Subject-content = [ "Re: " ] nonblank-text |
2128 |
|
2129 |
Encoded words MAY appear in this header. |
2130 |
|
2131 |
If the article is a followup, the subject SHOULD begin with |
2132 |
"Re: " (a "back reference"). If the article is not a fol- |
2133 |
lowup, the subject MUST not begin with a back reference. |
2134 |
Back references are case-insensitive, although "Re: " is the |
2135 |
preferred form. A followup agent assisting a poster in |
2136 |
preparing a followup SHOULD prepend a back reference, UNLESS |
2137 |
the subject already begins with one. If the poster deter- |
2138 |
mines that the topic of the followup differs significantly |
2139 |
from what is described in the subject, a new, more descrip- |
2140 |
tive, subject SHOULD be substituted (with no back refer- |
2141 |
ence). An article whose subject begins with a back refer- |
2142 |
ence MUST have a References header referencing the precur- |
2143 |
sor. |
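
The followup agent's part of this rule is small enough to show in
full. A sketch in Python, with a hypothetical function name; it
leaves an existing back reference untouched, whatever its case.

```python
def followup_subject(precursor_subject):
    """Prepend "Re: " unless the subject already begins with a back reference."""
    if precursor_subject[:4].lower() == "re: ":   # four characters, blank included
        return precursor_subject
    return "Re: " + precursor_subject
```

Applying the function repeatedly never produces "Re: Re: ", which is
the usual symptom of an agent that skips the check.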
2144 |
|
2145 |
NOTE: A back reference is FOUR characters, the |
2146 |
fourth being a blank. RFC 1036 was confused about |
2147 |
this. Observe also that only ONE back reference |
2148 |
should be present. |
2149 |
|
2150 |
NOTE: There is a semi-standard convention, often |
2151 |
used, in which a subject change is flagged by mak- |
2152 |
ing the new Subject-content of the form: |
2153 |
|
2154 |
new topic (was: old topic) |
2155 |
|
2156 |
possibly with "old topic" somewhat truncated. |
2157 |
Posters wishing to do something like this are |
2158 |
urged to use this exact form, to simplify auto- |
2159 |
mated analysis. |
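
As an example of the automated analysis this form permits, a Python
sketch that separates the new topic from the old one. The name
split_subject_change is hypothetical, and the pattern accepts only
the exact form shown above.

```python
import re

_WAS = re.compile(r"(?P<new>.+) \(was: (?P<old>.+)\)")


def split_subject_change(subject):
    """Return (new topic, old topic), or (subject, None) if no change is flagged."""
    m = _WAS.fullmatch(subject)
    return (m["new"], m["old"]) if m else (subject, None)
```

Variants such as "(Was ...)" or "[was: ...]" fall through unchanged,
which is the reason for urging the exact form.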
2160 |
|
2161 |
For historical reasons, the subject MUST not begin with |
2162 |
"cmsg " (note that this sequence ends with a blank). |
2163 |
|
2164 |
NOTE: Some old news software takes a subject |
2165 |
beginning with "cmsg " as an indication that the |
2166 |
article is a control message (see sections 6.6 and |
2167 |
7). This mechanism is obsolete and undesirable, |
2168 |
but accidental triggering of it is still possible. |
2169 |
|
2170 |
The subject SHOULD be terse. Posters SHOULD avoid trying to |
2171 |
cram their entire article into the headers; even the sim- |
2172 |
plest query usually benefits from a sentence or two of |
2173 |
elaboration and context, and the details of header display |
2186 |
vary widely among reading agents. |
2187 |
|
2188 |
NOTE: All-in-the-subject articles are sometimes |
2189 |
the result of misunderstandings over the interac- |
2190 |
tion protocol of a posting agent. Posting agents |
2191 |
might wish to give special attention to the possi- |
2192 |
bility that a poster specifying a very long sub- |
2193 |
ject might have thought he was typing the body of |
2194 |
the article. |
2195 |
|
2196 |
|
2197 |
5.5. Newsgroups |
2198 |
|
2199 |
The Newsgroups header's content specifies which newsgroup(s) |
2200 |
the article is posted to: |
2201 |
|
2202 |
Newsgroups-content = newsgroup-name *( ng-delim newsgroup-name ) |
2203 |
newsgroup-name = plain-component *( "." component ) |
2204 |
component = plain-component / encoded-word |
2205 |
plain-component = component-start *13component-rest |
2206 |
component-start = lowercase / digit |
2207 |
lowercase = <letter a-z> |
2208 |
component-rest = component-start / "+" / "-" / "_" |
2209 |
ng-delim = "," |
2210 |
|
2211 |
Encoded words used in newsgroup names MUST not contain char- |
2212 |
acters other than letters, digits, "+", "-", "/", "_", "=", |
2213 |
and "?" (although they may encode them). |
2214 |
|
2215 |
A newsgroup name consists of one or more components, which |
2216 |
may be plain components or (except for the first) encoded |
2217 |
words. A plain component MUST contain at least one letter, |
2218 |
MUST begin with a letter or digit, and MUST not be longer |
2219 |
than 14 characters. The first component MUST begin with a |
2220 |
letter; subsequent components SHOULD begin with a letter. |
2221 |
Newsgroup names MUST not contain uppercase letters, except |
2222 |
where required by encodings in encoded words. The sequences |
2223 |
"all" and "ctl" MUST not be used as components. |
2224 |
|
2225 |
NOTE: The alphabet and syntax specified encom- |
2226 |
passes all existing names of widespread news- |
2227 |
groups, while avoiding various forms that are |
2228 |
known to cause problems. Important existing soft- |
2229 |
ware uses various non-alphanumeric characters as |
2230 |
punctuation adjacent to newsgroup names. (It |
2231 |
would, in fact, be preferable to ban "+" from |
2232 |
newsgroup names, were it not that several |
2233 |
widespread newsgroups related to the C++ program- |
2234 |
ming language already use it.) |
2235 |
|
2236 |
NOTE: Much existing software converts the news- |
2237 |
group name into a directory path and stores the |
2238 |
articles themselves using numeric filenames, so |
2251 |
all-digit name components can be troublesome; the |
2252 |
"Great Renaming" early in the history of Usenet |
2253 |
included revisions of several newsgroup names to |
2254 |
eliminate such components. |
2255 |
|
2256 |
NOTE: The same storage technique is the reason for |
2257 |
the 14-character limit. The limit is now largely |
2258 |
historical, since most modern systems have much |
2259 |
larger limits on the length of a directory entry's |
2260 |
name, but many old systems are still in use. Sys- |
2261 |
tems with shorter limits also exist, but news |
2262 |
software on such systems has had to deal with the |
2263 |
problem already, since there are several |
2264 |
widespread newsgroups with 14-character components |
2265 |
in their names. Implementors are warned that it |
2266 |
is intended that the successor to this Draft will |
2267 |
increase the 14-character limit, and are urged to |
2268 |
fix their software to handle longer names grace- |
2269 |
fully (if such fixes are necessary, given the |
2270 |
intended domain of application of the particular |
2271 |
software). |
2272 |
|
2273 |
NOTE: The requirement that the first character of |
2274 |
a name be a letter accommodates existing software |
2275 |
which assumes it can tell the difference between a |
2276 |
newsgroup name and other possible syntactic enti- |
2277 |
ties by inspecting the first character. Similar |
2278 |
considerations motivate excluding "+", "-", and |
2279 |
"_" from coming first in a component, and the |
2280 |
preference for components that do not begin with |
2281 |
digits. The "all" sequence is used as a wildcard |
2282 |
symbol in much existing software, and the "ctl" |
2283 |
sequence was involved in an obsolete historical |
2284 |
mechanism for marking control messages, so they |
2285 |
are best avoided. |
2286 |
|
2287 |
NOTE: Possibly newsgroup names should have been |
2288 |
case-insensitive, but all existing software treats |
2289 |
them as case-sensitive. (RFC 977 [rrr] claims |
2290 |
that they are case-insensitive in NNTP, but exist- |
2291 |
ing implementations are believed to ignore this.) |
2292 |
The simplest solution is just to ban use of upper- |
2293 |
case letters, since no widespread newsgroup name |
2294 |
uses them anyway; this avoids any possibility of |
2295 |
confusion. |
2296 |
|
2297 |
NOTE: The syntax has the disadvantage of contain- |
2298 |
ing no white space, making it impossible to con- |
2299 |
tinue a Newsgroups header across several lines. |
2300 |
Implementors of relayers and reading agents are |
2301 |
warned that it is intended that the successor to |
2302 |
this Draft will change the definition of ng-delim |
2303 |
to: |
2304 |
|
2317 |
ng-delim = "," [ space ] |
2318 |
|
2319 |
and are urged to fix their software to handle |
2320 |
(i.e., ignore) white space following the commas. |
2321 |
Meanwhile, posters must avoid inserting such space |
2322 |
(despite the natural-language convention which |
2323 |
permits it) and posting agents should strip it |
2324 |
out. |
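NOTE: A minimal Python sketch of the tolerant handling urged above, offered as an illustration rather than a required implementation. The function names split_newsgroups and strip_delimiter_space are hypothetical, and treating blanks and tabs as the white space to be ignored is an assumption.

```python
def split_newsgroups(content):
    """Split Newsgroups content into names, ignoring white space after commas."""
    return [name.lstrip(" \t") for name in content.split(",")]


def strip_delimiter_space(content):
    """Rebuild the content with bare "," delimiters, as a posting agent might."""
    return ",".join(split_newsgroups(content))
```

A relayer or reading agent could use the first function when interpreting the header; a posting agent could use the second before the article leaves the host.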
2325 |
|
2326 |
NOTE: Encoded words as components are somewhat |
2327 |
problematic, but are clearly desirable for use in |
2328 |
non-English-speaking nations. They are not sub- |
2329 |
ject to the 14-character limit, and this (plus the |
2330 |
possibility of "/" within them) may require spe- |
2331 |
cial handling in news software. |
2332 |
|
2333 |
Encoded words are allowed in newsgroup names ONLY where non- |
2334 |
ASCII characters are necessary to the name, and must use the |
2335 |
"b" encoding [rrr] and the first suitable character set in |
2336 |
the MIME order of preferred character sets [rrr]. |
2337 |
|
2338 |
NOTE: Since the newsgroup name is the encoded |
2339 |
form, NOT the underlying non-ASCII form, there is |
2340 |
room for terrible confusion here if the choice of |
2341 |
encoding for a particular name is not fully stan- |
2342 |
dardized. |
2343 |
|
2344 |
Posters SHOULD use only the names of existing newsgroups in |
2345 |
the Newsgroups header, because newsgroups are NOT created |
2346 |
simply by being posted to. However, it is legitimate to |
2347 |
cross-post to newsgroup(s) which do not exist on the posting |
2348 |
agent's host, provided that at least one of the newsgroups |
2349 |
DOES exist there, and followup agents MUST accept this |
2350 |
(posting agents MAY accept it, but SHOULD at least alert the |
2351 |
poster to the situation and request confirmation). Relayers |
2352 |
MUST not rewrite Newsgroups headers in any way, even if some |
2353 |
or all of the newsgroups do not exist on the relayer's host. |
2354 |
|
2355 |
NOTE: Early experience with news software that |
2356 |
created newsgroups when they were mentioned in a |
2357 |
Newsgroups header was thoroughly negative: posters |
2358 |
frequently mistype newsgroup names. |
2359 |
|
2360 |
NOTE: While it is legitimate for some of an arti- |
2361 |
cle's newsgroups not to exist on the host where it |
2362 |
is posted, this IS a rather unusual situation |
2363 |
except in followups (which should go to all news- |
2364 |
groups the precursor was posted to, even if not |
2365 |
all of them reach the site where the followup is |
2366 |
being posted). |
2367 |
|
2368 |
NOTE: Rewriting Newsgroups headers to strip |
2369 |
locally-unknown newsgroups is superficially |
2370 |
attractive. However, early experience with |
2383 |
exactly that policy was thoroughly negative: news |
2384 |
propagation is more redundant and much less |
2385 |
orderly than many people imagine, and in particu- |
2386 |
lar it is not unheard-of for the (sometimes) |
2387 |
fastest path between two (say) U of Toronto sites |
2388 |
to pass outside U of Toronto... in which case |
2389 |
newsgroup stripping can cause incomplete propaga- |
2390 |
tion. Having an article's set of newsgroups |
2391 |
change as it propagates can also result in fol- |
2392 |
lowups not achieving the same propagation as the |
2393 |
original. It's been tried; it's more trouble than |
2394 |
it's worth; don't do it. |
2395 |
|
2396 |
NOTE: In particular, newsgroup stripping superfi- |
2397 |
cially looks like a solution to the problem of |
2398 |
duplicate regional newsgroup names. For example, |
2399 |
both University of Toronto and University of Texas |
2400 |
have "ut.general" newsgroups, and material cross- |
2401 |
posted to that name and a global newsgroup appears |
2402 |
in both universities' local newsgroups. However, |
2403 |
the side effects of stripping are sufficiently |
2404 |
unacceptable to disqualify it for this purpose. |
2405 |
Don't do it. |
2406 |
|
2407 |
Cross-posting an article to several relevant newsgroups is |
2408 |
far superior to posting separate articles with duplicated |
2409 |
content to each newsgroup, because reading agents can detect |
2410 |
the situation and show the article to a reader only once. |
2411 |
Posters SHOULD cross-post rather than duplicate-post. |
2412 |
|
2413 |
NOTE: On the other hand, cross-posting to a large |
2414 |
number of newsgroups usually indicates that the |
2415 |
poster has not thought about his audience; arti- |
2416 |
cles are rarely pertinent to more than (say) half |
2417 |
a dozen newsgroups. Posting agents might wish to |
2418 |
request confirmation when the number of newsgroups |
2419 |
exceeds (say) five in the presence of a Followup- |
2420 |
To header, or (say) two in the absence of such a |
2421 |
header. |
2422 |
|
2423 |
NOTE: One problem with cross-postings is what to |
2424 |
do with an article cross-posted to a set of news- |
2425 |
groups including both moderated and unmoderated |
2426 |
ones. Posters tend to expect such an article to |
2427 |
show up immediately in the unmoderated newsgroups, |
2428 |
especially if they do not realize that one or more |
2429 |
of the newsgroups is moderated. However, since it |
2430 |
is not possible for a moderator to retroactively |
2431 |
add an already-posted article to a moderated news- |
2432 |
group, the only correct action is to mail such an |
2433 |
article to one (and only one) of the moderators |
2434 |
for action. It is probably best for the posting |
2435 |
agent to detect this situation and ask the poster |
2436 |
what action is preferred. The acceptable choices |
2449 |
are to alter the newsgroup list or to mail to a |
2450 |
moderator of the poster's choice; the posting |
2451 |
agent should NOT offer duplicate-posting as an |
2452 |
easy-to-request option (if only because many mod- |
2453 |
erators will reject a submission that has already |
2454 |
been posted to unmoderated newsgroups). |
2455 |
|
2456 |
NOTE: An article cross-posted to multiple moder- |
2457 |
ated newsgroups really should have approval from |
2458 |
all the moderators involved. In practice, the |
2459 |
only straightforward way to do this is to send the |
2460 |
article to one of them and have him consult the |
2461 |
others. |
2462 |
|
2463 |
A newsgroup SHOULD not appear more than once in the News- |
2464 |
groups header. |
2465 |
|
2466 |
Newsgroup names having only one component are reserved for |
2467 |
newsgroups whose propagation is restricted to a single host |
2468 |
(or the administrative equivalent). It is inadvisable to |
2469 |
name a newsgroup "poster" because that word has special |
2470 |
meaning in the Followup-To header (see section 6.1). The |
2471 |
names "control" and "junk" are frequently used for pseudo- |
2472 |
newsgroups internal to relayer implementations, and hence |
2473 |
are also best avoided. |
2474 |
|
2475 |
NOTE: Beware of the duplicate-regional-newsgroup- |
2476 |
names problem mentioned above. In particular, |
2477 |
there are many, many hosts with a newsgroup named |
2478 |
"general", and some surprising things show up in |
2479 |
such newsgroups when people cross-post. It is |
2480 |
probably better to use multi-component names, |
2481 |
which are less likely to be duplicated. Fred's |
2482 |
Widget House should use "fwh.general" rather than |
2483 |
just "general" as its in-house general-topics |
2484 |
newsgroup. |
2485 |
|
2486 |
It is conventional to reserve newsgroup names beginning with |
2487 |
"to." for test messages sent on an essentially point-to- |
2488 |
point basis (see also the ihave/sendme protocol described in |
2489 |
section 7.2); newsgroup names beginning with "to." SHOULD |
2490 |
not be used for any other purpose. The second (and possibly |
2491 |
later) components of such a name should, together, comprise |
2492 |
the relayer name (see section 5.6) of a relayer. The news- |
2493 |
group exists only at the named relayer and its neighbors. |
2494 |
The neighbors all pass that newsgroup to the named relayer, |
2495 |
while the named relayer does not pass it to anyone. |
2496 |
|
2497 |
The order of newsgroup names in the Newsgroups header is not |
2498 |
significant. |
2499 |
|
2500 |
|
2515 |
5.6. Path |
2516 |
|
2517 |
The Path header's content indicates which relayers the arti- |
2518 |
cle has already visited, so that unnecessary redundant |
2519 |
transmission can be avoided: |
2520 |
|
2521 |
Path-content = [ path-list path-delimiter ] local-part |
2522 |
path-list = relayer-name *( path-delimiter relayer-name ) |
2523 |
relayer-name = 1*rn-char |
2524 |
rn-char = letter / digit / "." / "-" / "_" |
2525 |
path-delimiter = "!" |
2526 |
|
2527 |
The Path content is a list of relayer names, separated by |
2528 |
path delimiters, followed (after a final delimiter) by the |
2529 |
local part of a mailing address. Each relayer MUST prepend |
2530 |
its name, and a delimiter, to the Path content in all arti- |
2531 |
cles it processes. A relayer MUST not pass an article to a |
2532 |
neighboring relayer whose name is already mentioned in an |
2533 |
article's path list, unless this is explicitly requested by |
2534 |
the neighbor in some way. The Path content is case- |
2535 |
sensitive. |
2536 |
|
2537 |
NOTE: The Path header supplied by a posting agent |
2538 |
should normally contain only the local part. The |
2539 |
relayer that the posting agent passes the article |
2540 |
to for posting will prepend its relayer name to |
2541 |
get the path list started. |
2542 |
|
2543 |
NOTE: Observe that the trailing local part is NOT |
2544 |
part of the path list. This Path header: |
2545 |
|
2546 |
Path: fee!fie!foe!fum |
2547 |
|
2548 |
contains three relayer names: "fee", "fie", and |
2549 |
"foe". A relayer named "fum" is still eligible to |
2550 |
be sent this article. |
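NOTE: The handling of the Path content described above might be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, and the exception for a neighbor that explicitly requests articles already bearing its name is not modelled.

```python
def split_path(content):
    """Split Path content into (path_list, local_part)."""
    *path_list, local_part = content.split("!")
    return path_list, local_part


def prepend_relayer(content, relayer_name):
    """Return the Path content after a relayer has added its own name."""
    return relayer_name + "!" + content


def may_send_to(content, neighbor_name):
    """True unless the neighbor's name already appears in the path list.

    The comparison is case-sensitive, and the trailing local part is
    deliberately excluded from it.
    """
    path_list, _ = split_path(content)
    return neighbor_name not in path_list
```

With the example header above, a neighbor named "fie" would not be sent the article, while one named "fum" would.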
2551 |
|
2552 |
NOTE: This syntax has the disadvantage of contain- |
2553 |
ing no white space, making it impossible to con- |
2554 |
tinue a Path header across several lines. Imple- |
2555 |
mentors of relayers and reading agents are warned |
2556 |
that it is intended that the successor to this |
2557 |
Draft will change the definition of path delimiter |
2558 |
to: |
2559 |
|
2560 |
path-delimiter = "!" [ space ] |
2561 |
|
2562 |
and are urged to fix their software to handle |
2563 |
(i.e., ignore) white space following the exclama- |
2564 |
tion points. They are urged to hurry; some ill- |
2565 |
behaved systems reportedly already feel free to |
2566 |
add such white space. |
2567 |
|
2581 |
NOTE: RFC 1036 allows considerably more flexibil- |
2582 |
ity in choice of delimiter, in theory, but this |
2583 |
flexibility has never been used and most news |
2584 |
software does not implement it properly. The |
2585 |
grammar reflects the current reality. Note, in |
2586 |
particular, that RFC 1036 treats "_" as a delim- |
2587 |
iter, but in fact it is known to appear in relayer |
2588 |
names occasionally. |
2589 |
|
2590 |
Because an article will not propagate to a relayer already |
2591 |
mentioned in its path list, the path list MUST not contain |
2592 |
any names other than those of relayers the article has |
2593 |
passed through AS NEWS. This is trivially obvious for nor- |
2594 |
mal news articles, but requires attention from the modera- |
2595 |
tors of moderated newsgroups and the implementors and main- |
2596 |
tainers of gateways. |
2597 |
|
2598 |
NOTE: For the same reason, a relayer and its |
2599 |
neighbors need to agree on the choice of relayer |
2600 |
name, and names should not be changed without |
2601 |
notifying neighbors. |
2602 |
|
2603 |
Relayer names need to be unique among all relayers which |
2604 |
will ever see the articles using them. A relayer name is |
2605 |
normally either an "official" name for the host the relayer |
2606 |
runs on, or some other "official" name controlled by the |
2607 |
same organization. Except in cooperating subnets that agree |
2608 |
to some other convention, and don't let articles using it |
2609 |
escape beyond the subnet, a relayer name MUST be either a |
2610 |
UUCP name registered in the UUCP maps (without any domain |
2611 |
suffix such as ".UUCP"), or a complete Internet domain name. |
2612 |
Use of a (registered) UUCP name is recommended, where prac- |
2613 |
tical, to keep the length of the path list down. |
2614 |
|
2615 |
The use of Internet domain names in the path list presents |
2616 |
one problem: domain names are case-insensitive, but the path |
2617 |
list is case-sensitive. Relayers using domain names as |
2618 |
their relayer names MUST pick a standard form for the name, |
2619 |
and use that form consistently to the exclusion of all oth- |
2620 |
ers. The preferred form for this purpose, which relayers |
2621 |
SHOULD use, is the all-lowercase form. |
2622 |
|
2623 |
NOTE: It is arguably unfortunate that the path |
2624 |
list is case-sensitive, but it is much too late to |
2625 |
change this. Most Internet sites do, in any |
2626 |
event, use one standardized form of their name |
2627 |
almost everywhere. |
2628 |
|
2629 |
In the ordinary case, where the poster is the author of the |
2630 |
article, the local part following the path list SHOULD be |
2631 |
the local part of the poster's full Internet domain mailing |
2632 |
address. |
2633 |
|
2647 |
NOTE: It should be just the local part, not the |
2648 |
full address. The character "@" does not appear |
2649 |
in a Path header. |
2650 |
|
2651 |
The Path content somewhat resembles a mailing address, par- |
2652 |
ticularly in the UUCP world with its manual routing and "!" |
2653 |
address syntax. Historically, this resemblance was impor- |
2654 |
tant, and the Path content was often used as a reply |
2655 |
address. This practice has always been somewhat unreliable, |
2656 |
since news paths are not always mail paths and news relayer |
2657 |
names are not always recognized by mail handlers, and its |
2658 |
reliability has generally worsened in recent times. The |
2659 |
widespread use and recognition of Internet domain
addresses, even outside the actual Internet, has largely
eliminated the need for Path-derived reply addresses.
Readers SHOULD not use the Path
2662 |
content as a reply address. On the other hand, relayer |
2663 |
administrators are urged not to break this usage without |
2664 |
good reason; where practical, paths followed by news SHOULD |
2665 |
be traversable by mail, and mail handlers SHOULD recognize |
2666 |
relayer names as host names. |
2667 |
|
2668 |
It will typically be difficult or impractical for gateways |
2669 |
and moderators to supply a Path content that is useful as a |
2670 |
reply address for the author, bearing in mind that the path |
2671 |
list they supply will normally be empty. (To reiterate: the |
2672 |
path list MUST not contain any names other than those of |
2673 |
relayers the article has passed through AS NEWS.) They |
2674 |
SHOULD supply a local part that will result in replies to a |
2675 |
Path-derived address being returned to the sender with a |
2676 |
brief explanation. Software permitting, the local part |
2677 |
"not-for-mail" is recommended. |
2678 |
|
2679 |
NOTE: A moderator or gateway administrator who |
2680 |
supplies a local part that delivers such mail to |
2681 |
an administrative mailbox will quickly discover |
2682 |
why it should be bounced automatically! It is |
2683 |
best, however, for the returned message to include |
2684 |
an explanation of what has probably happened, |
2685 |
rather than just a mysterious "undeliverable mail" |
2686 |
complaint, since the sender may not be aware that |
2687 |
his/her software is unwisely using the Path con- |
2688 |
tent as a reply address. Reply software might |
2689 |
wish to question attempts to reply to a Path- |
2690 |
derived address ending in "not-for-mail" (which is |
2691 |
why a specific name is being recommended here). |
2692 |
|
2693 |
|
2694 |
6. Optional Headers |
2695 |
|
2696 |
Many MAIL headers, and many of those specified in present |
2697 |
and future MAIL extensions, are potentially applicable to |
2698 |
news. Headers specific to MAIL's point-to-point transmis- |
2699 |
sion paradigm, e.g. To and Cc, SHOULD not appear in news |
2700 |
articles. (Gateways wishing to preserve such information |
2713 |
for debugging probably SHOULD hide it under different names; |
2714 |
prefixing "X-" to the original headers, resulting in e.g. |
2715 |
"X-To", is suggested.) |
2716 |
|
2717 |
The following optional headers are either specific to news |
2718 |
or of particular note in news articles; an article MAY con- |
2719 |
tain some or all of them. (Note that there are some circum- |
2720 |
stances in which some of them are mandatory; these are |
2721 |
explained under the individual headers.) An article MUST |
2722 |
not contain two or more headers with any one of these header |
2723 |
names. |
2724 |
|
2725 |
NOTE: The ban on duplicate header names does not |
2726 |
apply to headers not specified in this Draft at |
2727 |
all, such as "X-" headers. Software should not |
2728 |
assume that all header names in a given article |
2729 |
are unique. |
2730 |
|
2731 |
|
2732 |
6.1. Followup-To |
2733 |
|
2734 |
The Followup-To header content specifies which newsgroup(s)
2735 |
followups should be posted to: |
2736 |
|
2737 |
Followup-To-content = Newsgroups-content / "poster" |
2738 |
|
2739 |
The syntax is the same as that of the Newsgroups content, |
2740 |
with the exception that the magic word "poster" means that |
2741 |
followups should be mailed to the article's reply address |
2742 |
rather than posted. In the absence of Followup-To, the |
2743 |
default newsgroup(s) for a followup are those in the News- |
2744 |
groups header. |
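NOTE: As an illustration, a followup agent's choice of destination might be sketched in Python as follows. The function name followup_destination and the representation of headers as a dictionary are assumptions made for the example. The reply address is taken to be the Reply-To content when present and the From content otherwise (see section 6.3).

```python
def followup_destination(headers):
    """Decide where a followup goes, given the precursor's headers as a dict.

    Returns ("mail", address) or ("post", list_of_newsgroup_names).
    """
    followup_to = headers.get("Followup-To")
    if followup_to == "poster":
        # Mail to the reply address: Reply-To if present, else From.
        return "mail", headers.get("Reply-To", headers["From"])
    # Without Followup-To, default to the precursor's Newsgroups content.
    content = followup_to if followup_to is not None else headers["Newsgroups"]
    return "post", content.split(",")
```

The sketch only selects a destination. Whether to suggest a shorter list of newsgroups remains a matter between the posting agent and the poster.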
2745 |
|
2746 |
NOTE: The way to request that followups be mailed |
2747 |
to a specific address other than that in the From |
2748 |
line is to supply "Followup-To: poster" and a |
2749 |
Reply-To header. Putting a mailing address in the |
2750 |
Followup-To line is incorrect; posting agents |
2751 |
should reject or rewrite such headers. |
2752 |
|
2753 |
NOTE: There is no syntax for "no followups |
2754 |
allowed" because "Followup-To: poster" accom- |
2755 |
plishes this effect without extra machinery. |
2756 |
|
2757 |
Although it is generally desirable to limit followups to the |
2758 |
smallest reasonable set of newsgroups, especially when the |
2759 |
precursor was cross-posted widely, posting agents SHOULD not |
2760 |
supply a Followup-To header except at the poster's explicit |
2761 |
request. |
2762 |
|
2763 |
NOTE: In particular, it is incorrect for the post- |
2764 |
ing agent to assume that followups to a cross- |
2765 |
posted article should be directed to the first |
2766 |
newsgroup only. Trimming the list of newsgroups |
2779 |
should be the poster's decision, not the posting |
2780 |
agent's. However, when an article is to be cross- |
2781 |
posted to a considerable number of newsgroups, a |
2782 |
posting agent might wish to SUGGEST to the poster |
2783 |
that followups go to a shorter list. |
2784 |
|
2785 |
|
2786 |
6.2. Expires |
2787 |
|
2788 |
The Expires header content specifies a date and time when |
2789 |
the article is deemed to be no longer useful and should be |
2790 |
removed ("expired"): |
2791 |
|
2792 |
Expires-content = Date-content |
2793 |
|
2794 |
The content syntax is the same as that of the Date content. |
2795 |
In the absence of Expires, the default is decided by the |
2796 |
administrators of each host the article reaches, who MAY |
2797 |
also restrict the extent to which the Expires header is hon- |
2798 |
ored. |
2799 |
|
2800 |
The Expires header has two main applications: removing arti- |
2801 |
cles whose utility ends on a specific date (e.g., event |
2802 |
announcements which can be removed once the day of the event |
2803 |
is past) and preserving articles expected to be of prolonged |
2804 |
usefulness (e.g., information aimed at new readers of a |
2805 |
newsgroup). The latter application is sometimes abused. |
2806 |
Since individual hosts have local policies for expiration of |
2807 |
news (depending on available disk space, for instance), |
2808 |
posters SHOULD not provide Expires headers for articles |
2809 |
unless there is a natural expiration date associated with |
2810 |
the topic. Posting agents MUST not provide a default |
2811 |
Expires header. Leave it out and allow local policies to be |
2812 |
used unless there is a good reason not to. Expiry dates are |
2813 |
properly the decision of individual host administrators; |
2814 |
posters and moderators SHOULD set only expiry dates that |
2815 |
most administrators would agree with. |
2816 |
|
2817 |
NOTE: A poster preparing an Expires header for an |
2818 |
article whose utility ends on a specific day |
2819 |
should typically specify the NEXT day as the |
2820 |
expiry date. A meeting on July 7th remains of |
2821 |
interest on the 7th. |
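NOTE: A small Python sketch of the "next day" rule, for illustration only. The function name expires_after_event is hypothetical. The sketch assumes that the RFC 822 style date produced by Python's standard email.utils module is acceptable as Date content, and that midnight GMT is a suitable boundary; a poster in a distant time zone may prefer a later hour.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_after_event(year, month, day):
    """Return Expires content for the start of the day after the event (GMT)."""
    event_day = datetime(year, month, day, tzinfo=timezone.utc)
    return format_datetime(event_day + timedelta(days=1), usegmt=True)
```

For the meeting on July 7th mentioned above, expires_after_event(1994, 7, 7) gives "Fri, 08 Jul 1994 00:00:00 GMT".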
2822 |
|
2823 |
|
2824 |
6.3. Reply-To |
2825 |
|
2826 |
The Reply-To header content specifies a reply address dif- |
2827 |
ferent from the author's address given in the From header: |
2828 |
|
2829 |
Reply-To-content = From-content |
2830 |
|
2831 |
In the absence of Reply-To, the reply address is the address |
2832 |
in the From header. |
2833 |
|
2845 |
Use of a Reply-To header is preferable to including a simi- |
2846 |
lar request in the article body, because reply-preparation |
2847 |
software can take account of Reply-To automatically. |
2848 |
|
2849 |
|
2850 |
6.4. Sender |
2851 |
|
2852 |
The Sender header identifies the poster, in the event that |
2853 |
this differs from the author identified in the From header: |
2854 |
|
2855 |
Sender-content = From-content |
2856 |
|
2857 |
In the absence of Sender, the default poster is the author |
2858 |
(named in the From header). |
2859 |
|
2860 |
NOTE: The intent is that the Sender header have a |
2861 |
fairly high probability of identifying the person |
2862 |
who really posted the article. The ability to |
2863 |
specify a From header naming someone other than |
2864 |
the poster is useful but can be abused. |
2865 |
|
2866 |
If the poster supplies a From header, the posting agent MUST |
2867 |
ensure that a Sender header is present, unless it can verify |
2868 |
that the mailing address in the From header is a valid mail- |
2869 |
ing address for the poster. A poster-supplied Sender header |
2870 |
MAY be used, if its mailing address is verifiably a valid |
2871 |
mailing address for the poster; otherwise the posting agent |
2872 |
MUST supply a Sender header and delete (or rename, e.g. to |
2873 |
X-Unverifiable-Sender) any poster-supplied Sender header. |
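NOTE: One possible reading of the rule above, sketched in Python for illustration. The function name apply_sender_rule is hypothetical, as is the is_valid_for_poster callable, which stands in for whatever verification the posting agent is able to perform. The sketch chooses to rename an unverifiable poster-supplied Sender header rather than delete it.

```python
def apply_sender_rule(headers, agent_sender, is_valid_for_poster):
    """Return a copy of the poster's headers with the Sender rule applied.

    headers: dict mapping header names to contents, as supplied by the poster.
    agent_sender: Sender content that the posting agent can vouch for.
    is_valid_for_poster: callable taking a From or Sender content and
        returning True if its address is verifiably the poster's.
    """
    result = dict(headers)
    supplied = result.get("Sender")
    if supplied is not None:
        if is_valid_for_poster(supplied):
            return result  # a verifiable poster-supplied Sender may be kept
        # Unverifiable: rename it and substitute the agent's own Sender.
        result["X-Unverifiable-Sender"] = result.pop("Sender")
        result["Sender"] = agent_sender
    elif "From" in result and not is_valid_for_poster(result["From"]):
        result["Sender"] = agent_sender
    return result
```

When the From address is verifiably the poster's and no Sender header was supplied, the headers pass through unchanged.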
2874 |
|
2875 |
NOTE: It might be useful to preserve a poster- |
2876 |
supplied Sender header so that the poster can sup- |
2877 |
ply the full-name part of the content. The mail- |
2878 |
ing address, however, must be right. Hence, the |
2879 |
posting agent must generate the Sender header if |
2880 |
it is unable to verify the mailing address of a |
2881 |
poster-supplied one. |
2882 |
|
2883 |
NOTE: NNTP implementors, in particular, are urged |
2884 |
to note this requirement (which would eliminate |
2885 |
the need for ad hoc headers like NNTP-Posting- |
2886 |
Host), although there are admittedly some imple- |
2887 |
mentation difficulties. A user name from an RFC |
2888 |
1413 server and a host name from an inverse map- |
2889 |
ping of the address, perhaps with a "full name" |
2890 |
comment noting the origin of the information, |
2891 |
would be at least a first approximation: |
2892 |
|
2893 |
Sender: fred@zoo.toronto.edu (RFC-1413@reverse-lookup; not verified) |
2894 |
|
2895 |
While this does not completely meet the specs, it |
2896 |
comes a lot closer than not having a Sender header |
2897 |
at all. Even just supplying a placeholder for the |
2898 |
user name: |
2899 |
|
2911 |
Sender: somebody@zoo.toronto.edu (user name unknown) |
2912 |
|
2913 |
would be better than nothing. |
2914 |
|
2915 |
|
2916 |
6.5. References |
2917 |
|
2918 |
The References header content lists message IDs of precur- |
2919 |
sors: |
2920 |
|
2921 |
References-content = message-id *( space message-id ) |
2922 |
|
2923 |
A followup MUST have a References header, and an article
which is not a followup MUST not have a References header.
A followup to an article which had no References header MUST
have a References header containing the precursor's message
ID. A followup to an article which had a References header
MUST have a References header containing the precursor's
References content, plus the precursor's message ID appended
to the end of the list.
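NOTE: The construction of a followup's References content might be sketched in Python as follows. The function name followup_references is hypothetical. The sketch assumes that a precursor without a References header contributes only its own message ID, and that a single blank separates message IDs.

```python
def followup_references(precursor_message_id, precursor_references=None):
    """Return the References content for a followup.

    precursor_references is the precursor's References content, or None
    if the precursor had no References header.
    """
    if not precursor_references:
        return precursor_message_id
    return precursor_references + " " + precursor_message_id
```

The message IDs are handled as opaque strings and are never reordered.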
2933 |
|
2934 |
NOTE: Use the See-Also header (section 6.16) for |
2935 |
interconnection of articles which are not in a |
2936 |
followup relationship to each other. |
2937 |
|
2938 |
NOTE: In retrospect, RFCs 850 and 1036, and the |
2939 |
implementations whose practice they represented, |
2940 |
erred here. The proper MAIL header to use for |
2941 |
references to precursors is In-Reply-To, and the |
2942 |
References header is meant to be used for the pur- |
2943 |
poses here ascribed to See-Also. This incompati- |
2944 |
bility is far too solidly established to be fixed, |
2945 |
unfortunately. The best that can be done is to |
2946 |
provide a clear mapping between the two, and urge |
2947 |
gateways to do the transformation. The news usage |
2948 |
is (now) a deliberate violation of the MAIL speci- |
2949 |
fications; articles containing news References |
2950 |
headers are technically not valid MAIL messages, |
2951 |
although it is unlikely that much MAIL software |
2952 |
will notice because the incompatibility is at a |
2953 |
subtle semantic level that does not affect the |
2954 |
syntax. |
2955 |
|
2956 |
UNRESOLVED ISSUE: Would it be better to just give |
2957 |
up and admit that news uses References for both |
2958 |
purposes? |
2959 |
|
2960 |
UNRESOLVED ISSUE: Should the syntax be generalized |
2961 |
to include URLs as alternatives to message IDs? |
2962 |
Perhaps not; too many things know about References |
2963 |
already. And non-articles can't be precursors of |
2964 |
articles, not really. |
2965 |
|
2977 |
Followup agents SHOULD not shorten References headers. If |
2978 |
it is absolutely necessary to shorten the header, as a des- |
2979 |
perate last resort, a followup agent MAY do this by deleting |
2980 |
some of the message IDs. However, it MUST not delete the |
2981 |
first message ID, the last three message IDs (including that |
2982 |
of the immediate precursor), or any message ID mentioned in |
2983 |
the body of the followup. If it is possible for the fol- |
2984 |
lowup agent to determine the Subject content of the articles |
2985 |
identified in the References header, it MUST not delete the |
2986 |
message ID of any article where the Subject content changed |
2987 |
(other than by prepending of a back reference). The fol- |
2988 |
lowup agent MUST not delete any message ID whose local part |
2989 |
ends with "_-_" (underscore (ASCII 95), hyphen (ASCII 45), |
2990 |
underscore); followup agents are urged to use this form to |
2991 |
mark subject changes, and to avoid using it otherwise. |
2992 |
|
2993 |
NOTE: As software capable of exploiting References |
2994 |
chains has grown more common, the random shorten- |
2995 |
ing permitted by RFC 1036 has become increasingly |
2996 |
troublesome. ANY shortening is undesirable, and |
2997 |
software should do it only in cases of dire neces- |
2998 |
sity. In such cases, these rules attempt to limit |
2999 |
the damage. |
3000 |
|
3001 |
NOTE: The first message ID is very important as |
3002 |
the starting point of the "thread" of discussion, |
3003 |
and absolutely should not be deleted. Keeping the |
3004 |
last three message IDs gives thread-following |
3005 |
software a fighting chance to reconstruct a full |
3006 |
thread even if an article or two is missing. |
3007 |
Keeping message IDs mentioned in the body is obvi- |
3008 |
ously desirable. |
3009 |
|
3010 |
NOTE: Subject changes are difficult to determine, |
3011 |
but they are significant as possible beginnings of |
3012 |
new threads. The "_-_" convention is provided so |
3013 |
that posting agents (which have more information |
3014 |
about subjects) can flag articles containing a |
3015 |
subject change in a way that followup agents can |
3016 |
detect without access to the articles themselves. |
3017 |
The sequence is chosen as one that is fairly |
3018 |
unlikely to occur by accident. |
3019 |
|
3020 |
NOTE: Is "_-_" really worth having? |
3021 |
|
3022 |
When a References header is shortened, at least three blanks |
3023 |
SHOULD be left between adjacent message IDs at each point |
3024 |
where deletions were made. Software preparing new Refer- |
3025 |
ences headers SHOULD preserve multiple blanks in older Ref- |
3026 |
erences content. |
3027 |
|
3028 |
NOTE: It's desirable to have some marker of where |
3029 |
deletions occurred, but the restricted syntax of |
3030 |
the header makes this difficult. Extra white |
3043 |
space is not a very good marker, since it may be |
3044 |
deleted by software that ill-advisedly rewrites |
3045 |
headers, but at least it doesn't break existing |
3046 |
software. |
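NOTE: As an illustration only, the shortening rules above might be sketched as follows. The function name and its arguments are hypothetical, the header is assumed to be already unfolded onto one line, and the Subject-change rule is omitted because it needs access to the referenced articles. The sketch deletes every message ID that is not protected, which is the most drastic shortening the rules allow; a real followup agent would delete far less, and preferably nothing.

```python
import re


def shorten_references(refs: str, body: str) -> str:
    """Drop every message ID that the shortening rules do not protect."""
    tokens = re.split(r"(\s+)", refs.strip())
    ids = tokens[0::2]            # the message IDs, in order
    seps = [""] + tokens[1::2]    # seps[i] is the white space before ids[i]
    last = len(ids) - 1
    out = []
    gap = False  # True once a deletion (old or new) precedes the next kept ID
    for i, mid in enumerate(ids):
        if len(seps[i]) >= 3:
            gap = True            # keep a marker left by earlier shortening
        local_part = mid.strip("<>").rsplit("@", 1)[0]
        protected = (
            i == 0                          # starting point of the thread
            or i >= last - 2                # last three, incl. the precursor
            or mid in body                  # mentioned in the followup body
            or local_part.endswith("_-_")   # flagged subject change
        )
        if not protected:
            gap = True
            continue
        if out:
            # At least three blanks mark each point where IDs were deleted.
            out.append("   " if gap else " ")
        out.append(mid)
        gap = False
    return "".join(out)
```

Multiple blanks already present in the old header are carried forward, so that evidence of earlier deletions is not lost when the new header is built.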
3047 |
|
3048 |
To repeat: followup agents SHOULD not shorten References |
3049 |
headers. |
3050 |
|
3051 |
NOTE: Unfortunately, reading agents and other |
3052 |
software analyzing References patterns have to be |
3053 |
prepared for the worst anyway. The worst includes |
3054 |
random deletions and the possibility of circular |
3055 |
References chains (when References is misused in |
3056 |
place of See-Also, section 6.16). |
3057 |
|
3058 |
|
3059 |
6.6. Control |
3060 |
|
3061 |
The Control header content marks the article as a control |
3062 |
message, and specifies the desired actions (other than the |
3063 |
usual ones of filing and passing on the article): |
3064 |
|
3065 |
Control-content = verb *( space argument ) |
3066 |
verb = 1*( letter / digit ) |
3067 |
argument = 1*<ASCII printable character> |
3068 |
|
3069 |
The verb indicates what action should be taken, and the |
3070 |
argument(s) (if any) supply details. In some cases, the |
3071 |
body of the article may also contain details. Section 7 |
3072 |
describes the standard verbs. See also the Also-Control |
3073 |
header (section 6.15). |
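NOTE: As an illustration only, the Control-content grammar above might be parsed as follows. The function name is hypothetical, and "ASCII printable character" is assumed here to mean codes 33 through 126; the sketch also tolerates runs of white space between arguments, which is more lenient than the grammar.

```python
import re

VERB = re.compile(r"[A-Za-z0-9]+")  # verb = 1*( letter / digit )


def parse_control(content: str):
    """Split Control-content into (verb, [arguments]) or raise ValueError."""
    parts = content.split()
    if not parts or not VERB.fullmatch(parts[0]):
        raise ValueError("verb must consist of letters and digits only")
    for arg in parts[1:]:
        # Arguments are NOT restricted to letters and digits.
        if not all(33 <= ord(ch) <= 126 for ch in arg):
            raise ValueError("argument contains a non-printable character")
    return parts[0], parts[1:]
```

Because arguments may contain characters significant to command interpreters, software should avoid handing them to a shell unquoted.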
3074 |
|
3075 |
NOTE: Control messages are often processed and |
3076 |
filed rather differently than normal articles. |
3077 |
|
3078 |
NOTE: The restriction of verbs to letters and dig- |
3079 |
its is new, but is consistent with existing prac- |
3080 |
tice and potentially simplifies implementation by |
3081 |
avoiding characters significant to command inter- |
3082 |
preters. Beware that the arguments are under no |
3083 |
such restriction in general. |
3084 |
|
3085 |
NOTE: Two other conventions for distinguishing |
3086 |
control messages from normal articles were for- |
3087 |
merly in use: a three-component newsgroup name |
3088 |
ending in ".ctl" or a subject beginning with |
3089 |
"cmsg " was considered to imply that the article |
3090 |
was a control message. These conventions are |
3091 |
obsolete. Do not use them. |
3092 |
|
3093 |
An article with a Control header MUST not have an Also- |
3094 |
Control or Supersedes header. |
3095 |
|
3096 |
|
3109 |
6.7. Distribution |
3110 |
|
3111 |
The Distribution header content specifies geographic or |
3112 |
organizational limits on an article's propagation: |
3113 |
|
3114 |
Distribution-content = distribution *( dist-delim distribution ) |
3115 |
dist-delim = "," |
3116 |
distribution = plain-component |
3117 |
|
3118 |
A distribution is syntactically identical to a one-component |
3119 |
newsgroup name, and must satisfy the same rules and restric- |
3120 |
tions. In the absence of Distribution, the default distri- |
3121 |
bution is "world". |
3122 |
|
3123 |
NOTE: This syntax has the disadvantage of contain- |
3124 |
ing no white space, making it impossible to con- |
3125 |
tinue a Distribution header across several lines. |
3126 |
Implementors of relayers and reading agents are |
3127 |
warned that it is intended that the successor to |
3128 |
this Draft will change the definition of |
3129 |
dist-delim to: |
3130 |
|
3131 |
dist-delim = "," [ space ] |
3132 |
|
3133 |
and are urged to fix their software to handle |
3134 |
(i.e., ignore) white space following the commas. |
3135 |
|
3136 |
A relayer MUST not pass an article to another relayer unless |
3137 |
configuration information specifies transmission to that |
3138 |
other relayer of BOTH (a) at least one of the article's |
3139 |
newsgroup(s), and (b) at least one of the article's distri- |
3140 |
bution(s). In effect, the only role of distributions is to |
3141 |
limit propagation, by preventing transmission of articles |
3142 |
that would have been transmitted had the decision been based |
3143 |
solely on newsgroups. |
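NOTE: As an illustration only, the relaying rule above might be sketched as follows. The function names are hypothetical, and the per-neighbour configuration is assumed to be two explicit sets of names; real relayers generally use pattern matching, which this Draft does not constrain. The parser already ignores white space after commas, as implementors are urged to do.

```python
def parse_distributions(header_value):
    """Return the list of distributions; None means the header is absent."""
    if header_value is None:
        return ["world"]  # default distribution
    return [d.strip() for d in header_value.split(",") if d.strip()]


def may_relay(newsgroups, distributions, peer_groups, peer_dists):
    """True only if BOTH a newsgroup and a distribution match the peer."""
    group_ok = bool(set(newsgroups) & set(peer_groups))
    dist_ok = bool(set(distributions) & set(peer_dists))
    return group_ok and dist_ok
```

An article in a newsgroup the neighbour takes is therefore still withheld if none of its distributions is configured for that neighbour.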
3144 |
|
3145 |
A posting agent might wish to present a menu of possible |
3146 |
distributions, or suggest a default, but normally SHOULD not |
3147 |
supply a default without giving the poster a chance to over- |
3148 |
ride it. A followup agent SHOULD initially supply the same |
3149 |
Distribution header as found in the precursor, although the |
3150 |
poster MAY alter this if appropriate. |
3151 |
|
3152 |
Despite the syntactic similarity and some historical confu- |
3153 |
sion, distributions are NOT newsgroup names. The whole |
3154 |
point of putting a distribution on an article is that it is |
3155 |
DIFFERENT from the newsgroup(s). In general, a meaningful |
3156 |
distribution corresponds to some sort of region of propaga- |
3157 |
tion: a geographical area, an organization, or a cooperating |
3158 |
subnet. |
3159 |
|
3160 |
NOTE: Distributions have historically suffered |
3161 |
from the completely uncontrolled nature of their |
3162 |
name space, the lack of feedback to posters on |
3175 |
incomplete propagation resulting from use of ran- |
3176 |
dom trash in Distribution headers, and confusion |
3177 |
with newsgroups (arising partly because many |
3178 |
regions and organizations DO have internal news- |
3179 |
groups with names resembling their internal dis- |
3180 |
tributions). This has resulted in much garbage in |
3181 |
Distribution headers, notably the pointless prac- |
3182 |
tice of automatically supplying the first compo- |
3183 |
nent of the newsgroup name as a distribution |
3184 |
(which is MOST unlikely to restrict propagation!). |
3185 |
Many sites have opted to maximize propagation of |
3186 |
such ill-formed articles by essentially ignoring |
3187 |
distributions. This unfortunately interferes with |
3188 |
legitimate uses. The situation is bad enough that |
3189 |
distributions must be considered largely useless |
3190 |
except within cooperating subnets that make an |
3191 |
organized effort to restrain propagation of their |
3192 |
internal distributions. |
3193 |
|
3194 |
NOTE: The distributions "world" and "local" have |
3195 |
no standard magic meaning (except that the former |
3196 |
is the default distribution if none is given). |
3197 |
Some pieces of software do assign such meanings to |
3198 |
them. |
3199 |
|
3200 |
|
3201 |
6.8. Keywords |
3202 |
|
3203 |
The Keywords header content is one or more phrases intended |
3204 |
to describe some aspect of the content of the article: |
3205 |
|
3206 |
Keywords-content = plain-phrase *( "," [ space ] plain-phrase ) |
3207 |
|
3208 |
Keywords, separated by commas, each follow the <plain- |
3209 |
phrase> syntax defined in section 5.2. Encoded words in |
3210 |
keywords MUST not contain characters other than letters (of |
3211 |
either case), digits, and the characters "!", "*", "+", "-", |
3212 |
"/", "=", and "_". |
3213 |
|
3214 |
NOTE: Posters and posting agents are asked to take |
3215 |
note that keywords are separated by commas, not by |
3216 |
white space. The following Keywords header |
3217 |
contains only one keyword (a rather improbable |
3218 |
one): |
3219 |
|
3220 |
Keywords: Thompson Ritchie Multics Linux |
3221 |
|
3222 |
and should probably have been written: |
3223 |
|
3224 |
Keywords: Thompson, Ritchie, Multics, Linux |
3225 |
|
3226 |
This particular error is unfortunately rather |
3227 |
widespread. |
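NOTE: As an illustration only, splitting Keywords content might be sketched as follows. The function name is hypothetical, and the sketch assumes that a plain phrase never contains a comma.

```python
def parse_keywords(content: str):
    """Split Keywords-content on commas, normalizing inner white space."""
    return [" ".join(k.split()) for k in content.split(",") if k.strip()]
```

Applied to the two headers above, the first yields the single keyword "Thompson Ritchie Multics Linux" and the second yields four separate keywords.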
3228 |
|
3241 |
NOTE: Reading agents and archivers preparing |
3242 |
indexes of articles should bear in mind that user- |
3243 |
chosen keywords are notoriously poor for indexing |
3244 |
purposes unless the keywords are picked from a |
3245 |
predefined set (which they are not in this case). |
3246 |
Also, some followup agents unwisely propagate the |
3247 |
Keywords header from the precursor into the fol- |
3248 |
lowup by default. At least one news-based experi- |
3249 |
ment has found the contents of Keywords headers to |
3250 |
be completely valueless for indexing. |
3251 |
|
3252 |
|
3253 |
6.9. Summary |
3254 |
|
3255 |
The Summary header content is a short phrase summarizing the |
3256 |
article's content: |
3257 |
|
3258 |
Summary-content = nonblank-text |
3259 |
|
3260 |
As with the subject, no restriction is placed on the content |
3261 |
since it is intended solely for display to humans. |
3262 |
|
3263 |
NOTE: Reading agents should be aware that the Sum- |
3264 |
mary header is often used as a sort of secondary |
3265 |
Subject header, and (if present) its contents |
3266 |
should perhaps be displayed when the subject is |
3267 |
displayed. |
3268 |
|
3269 |
The summary SHOULD be terse. Posters SHOULD avoid trying to |
3270 |
cram their entire article into the headers; even the sim- |
3271 |
plest query usually benefits from a sentence or two of elab- |
3272 |
oration and context, and not all reading agents display all |
3273 |
headers. |
3274 |
|
3275 |
|
3276 |
6.10. Approved |
3277 |
|
3278 |
The Approved header content indicates the mailing addresses |
3279 |
(and possibly the full names) of the persons or entities |
3280 |
approving the article for posting: |
3281 |
|
3282 |
Approved-content = From-content *( "," [ space ] From-content ) |
3283 |
|
3284 |
An Approved header is required in all postings to moderated |
3285 |
newsgroups; the presence or absence of this header allows a |
3286 |
posting agent to distinguish between articles posted by the |
3287 |
moderator (which are normal articles to be posted normally) |
3288 |
and attempted contributions by others (which should be |
3289 |
mailed to the moderator for approval). An Approved header |
3290 |
is also required in certain control messages, to reduce the |
3291 |
probability of accidental posting of same; see the relevant |
3292 |
parts of section 7. |
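NOTE: As an illustration only, a posting agent's decision might be sketched as follows. The function name, the return values, and the set of moderated newsgroups are all hypothetical; how a posting agent learns which newsgroups are moderated, and where to mail contributions, is outside this sketch. Header names are assumed to be already in canonical case.

```python
def route_submission(headers: dict, moderated_groups: set) -> str:
    """Decide whether a posting agent posts an article or mails it."""
    groups = [g.strip() for g in headers.get("Newsgroups", "").split(",")
              if g.strip()]
    targets_moderated = any(g in moderated_groups for g in groups)
    if targets_moderated and "Approved" not in headers:
        # An attempted contribution: goes to the moderator for approval.
        return "mail-to-moderator"
    # Either unmoderated, or posted by the moderator with Approved present.
    return "post"
```

This decision belongs to the posting agent only; a relayer receiving an unapproved article in a moderated newsgroup discards it rather than mailing it.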
3293 |
|
3307 |
NOTE: There is, at present, no way to authenticate |
3308 |
Approved headers to ensure that the claimed |
3309 |
approval really was bestowed. Nor is there an |
3310 |
established mechanism for even maintaining a list |
3311 |
of legitimate approvers (such a list would quickly |
3312 |
become out of date if it had to be maintained by |
3313 |
hand). Such mechanisms, presumably relying on |
3314 |
cryptographic authentication, would be a worth- |
3315 |
while extension to this Draft, and experimental |
3316 |
work in this area is encouraged. (The problem is |
3317 |
harder than it sounds because news is used on many |
3318 |
systems which do not have real-time access to key |
3319 |
servers.) |
3320 |
|
3321 |
NOTE: Relayer implementors, please note well: it |
3322 |
is the POSTING AGENT that is authorized to distin- |
3323 |
guish between moderator postings and attempted |
3324 |
contributions, and to mail the latter to the mod- |
3325 |
erator. As discussed in section 9.1, relayers |
3326 |
MUST not, repeat MUST not, send such mail; on |
3327 |
receipt of an unApproved article in a moderated |
3328 |
newsgroup, they should discard the article, NOT |
3329 |
transform it into a mail message (except perhaps |
3330 |
to a local administrator). |
3331 |
|
3332 |
NOTE: RFC 1036 restricted Approved to a single |
3333 |
From-content. However, multiple moderation is no |
3334 |
longer rare, and multi-moderator Approved headers |
3335 |
are already in use. |
3336 |
|
3337 |
|
3338 |
6.11. Lines |
3339 |
|
3340 |
The Lines header content indicates the number of lines in |
3341 |
the body of the article: |
3342 |
|
3343 |
Lines-content = 1*digit |
3344 |
|
3345 |
The line count includes all body lines, including the |
3346 |
signature (if any) and any empty lines at the beginning or |
3347 |
end of the body. (The single empty separator line between |
3348 |
the headers and the body is not part of the body.) The |
3349 |
"body" here is the body as found in the posted article, |
3350 |
AFTER all transformations such as MIME encodings. |
3351 |
|
3352 |
Reading agents SHOULD not rely on the presence of this |
3353 |
header, since it is optional (and some posting agents do not |
3354 |
supply it). They MUST not rely on it being precise, since |
3355 |
it frequently is not. |
3356 |
|
3357 |
NOTE: The average line length in article bodies is |
3358 |
surprisingly consistent at about 40 characters, |
3359 |
and since the line count typically is used only |
3360 |
for approximate judgements ("is this too long to |
3373 |
read quickly?"), dividing the byte count of the |
3374 |
body by 40 gives an estimate of the body line |
3375 |
count that is adequate for normal use. This esti- |
3376 |
mate is NOT adequate if the body has been MIME |
3377 |
encoded... but neither is the Lines header, since |
3378 |
at least one major relayer will supply a Lines |
3379 |
header for an article that lacks one, and will not |
3380 |
consider the possibility of MIME encodings when |
3381 |
computing the line count. |
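NOTE: As an illustration only, the exact count and the rough estimate might be sketched as follows. The function names are hypothetical. The body is assumed to be passed in as posted (after any encoding), without the empty separator line, and with lines ended by a newline character.

```python
def count_body_lines(body: str) -> int:
    """Count all body lines, including empty ones at either end."""
    n = body.count("\n")
    if body and not body.endswith("\n"):
        n += 1  # a final line lacking its terminator still counts
    return n


def estimate_body_lines(body_bytes: int) -> int:
    """Rough estimate from the body size, assuming about 40 bytes per line."""
    return body_bytes // 40
```

The estimate is meant only for approximate judgements, and like the Lines header itself it is unreliable for encoded bodies.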
3382 |
|
3383 |
NOTE: It would be better to have a Content-Size |
3384 |
header as part of MIME, so that body parts could |
3385 |
have their own sizes, and so that the units used |
3386 |
could be appropriate to the data type (line count |
3387 |
is not a useful measure of the size of an encoded |
3388 |
image, for example). Doing this is preferable to |
3389 |
trying to fix Lines. |
3390 |
|
3391 |
UNRESOLVED ISSUE: Update on Content-Size? |
3392 |
|
3393 |
Relayers SHOULD discard this header if they find it neces- |
3394 |
sary to re-encode the article in such a way that the origi- |
3395 |
nal Lines header would be rendered incorrect. |
3396 |
|
3397 |
|
3398 |
6.12. Xref |
3399 |
|
3400 |
The Xref header content indicates where an article was filed |
3401 |
by the last relayer to process it: |
3402 |
|
3403 |
Xref-content = relayer 1*( space location ) |
3404 |
relayer = relayer-name |
3405 |
location = newsgroup-name ":" article-locator |
3406 |
article-locator = 1*<ASCII printable character> |
3407 |
|
3408 |
The relayer's name is included so that software can deter- |
3409 |
mine which relayer generated the header (and specifically, |
3410 |
whether it really was the one that filed the copy being |
3411 |
examined). The locations specify what newsgroups the arti- |
3412 |
cle was filed under (which may differ from those in the |
3413 |
Newsgroups header) and where it was filed under them. The |
3414 |
exact form of an article locator is implementation-specific. |
3415 |
|
3416 |
NOTE: Reading agents can exploit this information |
3417 |
to avoid presenting the same article to a reader |
3418 |
several times. The information is sometimes |
3419 |
available in system databases, but having it in |
3420 |
the article is convenient. Relayers traditionally |
3421 |
generate an Xref header only if the article is |
3422 |
cross-posted, but this is not mandatory, and there |
3423 |
is at least one new application ("mirroring": |
3424 |
keeping news databases on two hosts identical) |
3425 |
where the header is useful in all articles. |
3426 |
|
3439 |
NOTE: The traditional form of an article locator |
3440 |
is a decimal number, with articles in each news- |
3441 |
group numbered consecutively starting from 1. |
3442 |
NNTP [rrr] demands that such a model be provided, |
3443 |
and there may be other software which expects it, |
3444 |
but it seems desirable to permit flexibility for |
3445 |
unorthodox implementations. |
3446 |
|
3447 |
A relayer inserting an Xref header into an article MUST |
3448 |
delete any previous Xref header. A relayer which is not |
3449 |
inserting its own Xref header SHOULD delete any previous |
3450 |
Xref header. A relayer MAY delete the Xref header when |
3451 |
passing an article on to another relayer. |
3452 |
|
3453 |
NOTE: RFC 1036 specified that the Xref header was |
3454 |
not transmitted when an article was passed to |
3455 |
another relayer, but the major news implementa- |
3456 |
tions have never obeyed this rule, and applica- |
3457 |
tions like mirroring depend on this disobedience. |
3458 |
|
3459 |
A relayer MUST use the same name in Xref headers as it uses |
3460 |
in Path headers. Reading agents MUST ignore an Xref header |
3461 |
containing a relayer name that differs from the one that |
3462 |
begins the path list. |
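NOTE: As an illustration only, a reading agent's handling of Xref might be sketched as follows. The function names are hypothetical, and the path list is assumed to use "!" as its delimiter, with the most recent relayer first.

```python
def parse_xref(content: str):
    """Split Xref-content into (relayer, [(newsgroup, locator), ...])."""
    relayer, *locations = content.split()
    if not locations:
        raise ValueError("Xref needs at least one location")
    parsed = []
    for loc in locations:
        # Newsgroup names contain no colon, so split at the first one.
        group, sep, locator = loc.partition(":")
        if not sep or not group or not locator:
            raise ValueError("malformed location: " + loc)
        parsed.append((group, locator))
    return relayer, parsed


def trusted_xref(xref_content: str, path_content: str):
    """Return the locations, or None if another relayer wrote the header."""
    relayer, locations = parse_xref(xref_content)
    first_in_path = path_content.split("!", 1)[0]
    return locations if relayer == first_in_path else None
```

The locator is kept as an opaque string, since its exact form is implementation-specific.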
3463 |
|
3464 |
|
3465 |
6.13. Organization |
3466 |
|
3467 |
The Organization header content is a short phrase identify- |
3468 |
ing the poster's organization: |
3469 |
|
3470 |
Organization-content = nonblank-text |
3471 |
|
3472 |
This header is typically supplied by the posting agent. The |
3473 |
Organization content SHOULD mention geographical location |
3474 |
(e.g. city and country) when it is not obvious from the |
3475 |
organization's name. |
3476 |
|
3477 |
NOTE: The motive here is that the organization is |
3478 |
often difficult to guess from the mailing address, |
3479 |
is not always supplied in a signature, and can |
3480 |
help identify the poster to the reader. |
3481 |
|
3482 |
NOTE: There is no "s" in "Organization". |
3483 |
|
3484 |
The Organization content is provided for identification |
3485 |
only, and does not imply that the poster speaks for the |
3486 |
organization or that the article represents organization |
3487 |
policy. Posting agents SHOULD permit the poster to override |
3488 |
a local default Organization header. |
3489 |
|
3490 |
|
3505 |
6.14. Supersedes |
3506 |
|
3507 |
The Supersedes header content specifies articles to be can- |
3508 |
celled on arrival of this one: |
3509 |
|
3510 |
Supersedes-content = message-id *( space message-id ) |
3511 |
|
3512 |
Supersedes is equivalent to Also-Control (section 6.15) with |
3513 |
an implicit verb of "cancel" (section 7.1). |
3514 |
|
3515 |
NOTE: Supersedes is normally used where the arti- |
3516 |
cle is an updated version of the one(s) being can- |
3517 |
celled. |
3518 |
|
3519 |
NOTE: Although the ability to use multiple message |
3520 |
IDs in Supersedes is highly desirable (see section |
3521 |
7.1), posters are warned that existing implementa- |
3522 |
tions often do not correctly handle more than one. |
3523 |
|
3524 |
NOTE: There is no "c" in "Supersedes". |
3525 |
|
3526 |
An article with a Supersedes header MUST not have an Also- |
3527 |
Control or Control header. |
3528 |
|
3529 |
|
3530 |
6.15. Also-Control |
3531 |
|
3532 |
The Also-Control header content marks the article as being a |
3533 |
control message IN ADDITION to being a normal news article, |
3534 |
and specifies the desired actions: |
3535 |
|
3536 |
Also-Control-content = Control-content |
3537 |
|
3538 |
An article with an Also-Control header is filed and passed |
3539 |
on normally, but the content of the Also-Control header is |
3540 |
processed as if it were found in a Control header. |
3541 |
|
3542 |
NOTE: It is sometimes desirable to piggyback con- |
3543 |
trol actions on a normal article, so that the |
3544 |
article will be filed normally but will also be |
3545 |
acted on as a control message. This header is |
3546 |
essentially a generalization of Supersedes. |
3547 |
|
3548 |
NOTE: Be warned that some old relayers do not |
3549 |
implement Also-Control. |
3550 |
|
3551 |
An article with an Also-Control header MUST not have a Con- |
3552 |
trol or Supersedes header. |
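NOTE: As an illustration only, the relationship among Control, Also-Control, and Supersedes might be sketched as follows. The function name and the shape of its result are hypothetical. Header names are assumed to be already in canonical case.

```python
CONTROL_HEADERS = ("Control", "Also-Control", "Supersedes")


def effective_control(headers: dict):
    """Return (control_content, also_normal_article), or None if no control."""
    present = [h for h in CONTROL_HEADERS if h in headers]
    if len(present) > 1:
        # At most one of the three may appear in an article.
        raise ValueError("conflicting headers: " + ", ".join(present))
    if not present:
        return None
    name = present[0]
    content = headers[name]
    if name == "Supersedes":
        # Equivalent to Also-Control with an implicit verb of "cancel".
        content = "cancel " + content
    # Only a plain Control header makes the article purely a control message.
    return content, name != "Control"
```

For example, an article carrying "Supersedes: <old@example.com>" is treated as a normal article that also requests "cancel <old@example.com>".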
3553 |
|
3554 |
|
3555 |
6.16. See-Also |
3556 |
|
3557 |
The See-Also header content lists message IDs of articles |
3558 |
that are related to this one but are not its precursors: |
3559 |
|
3560 |
|
3561 |
|
3562 |
2 June 1994 - 54 - expires 15 July 1994 |
3563 |
|
3564 |
|
3565 |
|
3566 |
|
3567 |
|
3568 |
INTERNET DRAFT to be NEWS sec. 6.16 |
3569 |
|
3570 |
|
3571 |
See-Also-content = message-id *( space message-id ) |
3572 |
|
3573 |
See-Also resembles References, but without the restrictions |
3574 |
imposed on References by the followup rules. |
3575 |
|
3576 |
NOTE: See-Also provides a way to group related |
3577 |
articles, such as the parts of a single document |
3578 |
that had to be split across multiple articles due |
3579 |
to its size, or to cross-reference between paral- |
3580 |
lel threads. |
3581 |
|
3582 |
NOTE: See the discussion (in section 6.5) on MAIL |
3583 |
compatibility issues of References and See-Also. |
3584 |
|
3585 |
NOTE: In the specific case where it is desired to |
3586 |
essentially make another article PART of the cur- |
3587 |
rent one, e.g. for annotation of the other arti- |
3588 |
cle, MIME's "message/external-body" convention can |
3589 |
be used to do so without actual inclusion. IANA |
3590 |
registered "news-message-ID" as a standard |
3591 |
external-body access method on 22 June 1993, with |
3592 |
a mandatory NAME parameter giving the message ID |
3593 |
and an optional SITE parameter suggesting an NNTP |
3594 |
site that might have the article available (if it |
3595 |
is not available locally). |
3596 |
|
3597 |
UNRESOLVED ISSUE: Could the syntax be generalized |
3598 |
to include URLs as alternatives to message IDs? |
3599 |
Here it makes much more sense than in References. |
3600 |
|
3601 |
|
3602 |
6.17. Article-Names |
3603 |
|
3604 |
The Article-Names header content indicates any special sig- |
3605 |
nificance the article may have in particular newsgroups: |
3606 |
|
3607 |
Article-Names-content = 1*( name-clause space ) |
3608 |
name-clause = newsgroup-name ":" article-name |
3609 |
article-name = letter 1*( letter / digit / "-" ) |
3610 |
|
3611 |
Each name clause specifies a newsgroup (which SHOULD be |
3612 |
among those in the Newsgroups header) and an article name |
3613 |
local to that newsgroup. Article names MAY be used by |
3614 |
relayers to file the article in special ways, or they MAY |
3615 |
just be noted for possible special attention by reading |
3616 |
agents. Article names are case-sensitive. |
3617 |
|
3618 |
NOTE: This header provides a way to mark special |
3619 |
postings, such as introductions, frequently-asked- |
3620 |
question lists, etc., so that reading agents have |
3621 |
a way of finding them automatically. The news- |
3622 |
group name is specified for each article name |
3623 |
because the names may be newsgroup-specific; for |
3624 |
example, many frequently-asked-question lists are |
3637 |
posted to "news.answers" in addition to their |
3638 |
"home" newsgroup, and they would not be known by |
3639 |
the same name(s) in both newsgroups. |
3640 |
|
3641 |
The Article-Names header SHOULD be ignored unless the arti- |
3642 |
cle also contains an Approved header. |
3643 |
|
3644 |
NOTE: This stipulation is made in anticipation of |
3645 |
the possibility that Approved headers will be |
3646 |
involved in cryptographic authentication. |
3647 |
|
3648 |
The presence of an Article-Names header does not necessarily |
3649 |
imply that the article will be retained unusually long |
3650 |
before expiration, or that previous article(s) with similar |
3651 |
Article-Names headers will be cancelled by its arrival. |
3652 |
Posters preparing special postings SHOULD include appropri- |
3653 |
ate other headers, such as Expires and Supersedes, to |
3654 |
request such actions. |
3655 |
|
3656 |
Different networks MAY establish different sets of article |
3657 |
names for the special postings they deem significant; it is |
3658 |
preferable for usage to be standardized within networks, |
3659 |
although it might be desirable for individual newsgroups to |
3660 |
have different naming conventions in some situations. |
3661 |
Article names MUST be 14 characters or fewer. The following |
3662 |
names are suggested but are not mandatory: |
3663 |
|
3664 |
intro Introduction to the newsgroup for newcomers. |
3665 |
|
3666 |
charter Charter, rules, organization, moderation poli- |
3667 |
cies, etc. |
3668 |
|
3669 |
background Biographies of special participants, history of |
3670 |
the newsgroup, notes on related newsgroups, etc. |
3671 |
|
3672 |
subgroups Descriptions of sub-newsgroups under this news- |
3673 |
group, e.g. "sci.space.news" under "sci.space". |
3674 |
|
3675 |
facts Information relating to the purpose of the news- |
3676 |
group, e.g. an acronym glossary in "sci.space". |
3677 |
|
3678 |
references Where to get more information: books, journals, |
3679 |
FTP repositories, etc. |
3680 |
|
3681 |
faq Answers to frequently-asked questions. |
3682 |
|
3683 |
menu If present, a list of all the other article |
3684 |
names local to this newsgroup, with brief |
3685 |
descriptions of their contents. |
3686 |
|
3687 |
Such articles may be divided into subsections using the MIME |
3688 |
"multipart/mixed" conventions. If size considerations make |
3689 |
it necessary to split such articles, names ending in a |
3690 |
hyphen and a part number are suggested; for example, a |
3703 |
three-part frequently-asked-questions list could have arti- |
3704 |
cle names "faq-1", "faq-2", and "faq-3". |
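NOTE: As an illustration only, parsing and checking Article-Names content might be sketched as follows. The function name is hypothetical, and newsgroup names are not validated here beyond being non-empty.

```python
import re

# article-name = letter 1*( letter / digit / "-" )
ARTICLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]+")
MAX_NAME_LENGTH = 14


def parse_article_names(content: str):
    """Return [(newsgroup, article_name), ...] or raise ValueError."""
    clauses = []
    for clause in content.split():
        group, sep, name = clause.partition(":")
        if not sep or not group:
            raise ValueError("malformed name clause: " + clause)
        if not ARTICLE_NAME.fullmatch(name) or len(name) > MAX_NAME_LENGTH:
            raise ValueError("invalid article name: " + name)
        clauses.append((group, name))  # names are case-sensitive: keep as is
    return clauses
```

A reading agent using such a parser would still ignore the result unless the article also carried an Approved header.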
3705 |
|
3706 |
NOTE: It is somewhat premature to attempt to stan- |
3707 |
dardize article names, since this is essentially a |
3708 |
new feature with no experience behind it. How- |
3709 |
ever, if reading agents are to attach special sig- |
3710 |
nificance to these names, some attempt at standard |
3711 |
conventions is imperative. This is a first |
3712 |
attempt at providing some. |
3713 |
|
3714 |
|
3715 |
6.18. Article-Updates |
3716 |
|
3717 |
The Article-Updates header content indicates what previous |
3718 |
articles this one is deemed (by the poster) to update (i.e., |
3719 |
replace): |
3720 |
|
3721 |
Article-Updates-content = message-id *( space message-id ) |
3722 |
|
3723 |
Each message ID identifies a previous article that this one |
3724 |
is deemed to update. This MUST not cause the previous arti- |
3725 |
cle(s) to be cancelled or otherwise altered, unless this is |
3726 |
implied by other headers (e.g. Supersedes); Article-Updates |
3727 |
is merely an advisory which MAY be noted for special atten- |
3728 |
tion by reading agents. |
3729 |
|
3730 |
NOTE: This header provides a way to mark articles |
3731 |
which are only minor updates of previous ones, |
3732 |
containing no significant new information and not |
3733 |
worth reading if the previous ones have been read. |
3734 |
|
3735 |
NOTE: If suitable conventions using MIME multipart |
3736 |
bodies and the "message/external-body" body-part |
3737 |
type can be developed, a replacing article might |
3738 |
contain only differences between the old text and |
3739 |
the new text, rather than a complete new copy. |
3740 |
This is the motivation for not making Article- |
3741 |
Updates also function as Supersedes does: the |
3742 |
replacing article might depend on the continued |
3743 |
presence of the replaced article. |
3744 |
|
3745 |
|
3746 |
7. Control Messages |
3747 |
|
3748 |
The following sections document the currently-defined con- |
3749 |
trol messages. "Message" is used herein as a synonym for |
3750 |
"article" unless context indicates otherwise. |
3751 |
|
3752 |
Posting agents are warned that since certain control mes- |
3753 |
sages require article bodies in quite specific formats, sig- |
3754 |
natures SHOULD not be appended to such articles, and it may |
3755 |
be wise to take greater care than usual to avoid unintended |
3756 |
(although perhaps well-meaning) alterations to text supplied |
3769 |
by the poster. Relayers MUST assume that control messages |
3770 |
mean what they say; they MAY be obeyed as is or rejected, |
3771 |
but MUST not be reinterpreted. |
3772 |
|
3773 |
The execution of the actions requested by control messages |
3774 |
is subject to local administrative restrictions, which MAY |
3775 |
deny requests or refer them to an administrator for |
3776 |
approval. The descriptions below are generally phrased in |
3777 |
terms suggesting mandatory actions, but any or all of these |
3778 |
MAY be subject to local administrative approval (either as a |
3779 |
class or case-by-case). Analogously, where the description |
3780 |
below specifies that a message or portion thereof is to be |
3781 |
ignored, this action MAY include reporting it to an adminis- |
3782 |
trator. |
3783 |
|
3784 |
NOTE: The exact choice of local action might |
3785 |
depend on what action the control message |
3786 |
requests, who it claims to come from, etc. |
3787 |
|
3788 |
Relayers MUST propagate even control messages they do not |
3789 |
understand. |
3790 |
|
3791 |
In the following sections, each type of control message is |
3792 |
defined syntactically by defining its arguments and its |
3793 |
body. For example, "cancel" is defined by defining cancel- |
3794 |
arguments and cancel-body. |
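
The general rules above might be sketched as follows. This is an
illustration only: it assumes the control verb and its arguments are
already available as one whitespace-separated string, and the names
split_control, process_control, and the handlers table are
hypothetical, not part of this Draft.

```python
def split_control(value):
    """Split a control message string into (verb, arguments)."""
    words = value.split()
    if not words:
        raise ValueError("empty control message")
    return words[0], words[1:]


def process_control(value, handlers):
    """Return (acted, propagate) for one control message.

    handlers maps a verb to a callable that takes the argument list and
    returns True if the request was obeyed, False if it was rejected.
    """
    verb, args = split_control(value)
    handler = handlers.get(verb)
    # Obey as is or reject; never reinterpret the request.
    acted = bool(handler(args)) if handler is not None else False
    # Local execution does not decide propagation: even a verb that is
    # not understood is still passed onward.
    return acted, True
```

Keeping execution and propagation as separate results makes it hard
for a local rejection to suppress onward relaying by accident.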
3795 |
|
3796 |
|
3797 |
7.1. cancel |
3798 |
|
3799 |
The cancel message requests that one or more previous arti- |
3800 |
cles be "cancelled": |
3801 |
|
3802 |
cancel-arguments = message-id *( space message-id ) |
3803 |
cancel-body = body |
3804 |
|
3805 |
The argument(s) identify the articles to be cancelled, by |
3806 |
message ID. The body is a comment, which software MUST |
3807 |
ignore, and SHOULD contain an indication of why the cancel- |
3808 |
lation was requested. The cancel message SHOULD be posted |
3809 |
to the same newsgroup(s), with the same distribution(s), as |
3810 |
the article(s) it is attempting to cancel. |
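
As a hedged sketch, the cancel-arguments might be split into message
IDs as shown here. The regular expression is a deliberately simplified
stand-in for the message-id syntax defined elsewhere in this Draft,
and parse_cancel_arguments is a hypothetical name.

```python
import re

# Simplified assumption: "<" something "@" something ">", no whitespace.
MESSAGE_ID = re.compile(r"<[^<>@\s]+@[^<>@\s]+>")


def parse_cancel_arguments(arguments):
    """Return the list of message IDs named by a cancel message."""
    ids = arguments.split()
    if not ids or not all(MESSAGE_ID.fullmatch(i) for i in ids):
        raise ValueError("malformed cancel-arguments: %r" % arguments)
    return ids
```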
3811 |
|
3812 |
NOTE: Using the same newsgroups and distributions |
3813 |
maximizes the chances of the cancel message propa- |
3814 |
gating everywhere the target articles went. |
3815 |
|
3816 |
NOTE: RFC 1036 permitted only a single message-id |
3817 |
in a cancel message. Support for cancelling mul- |
3818 |
tiple articles is highly desirable, especially for |
3819 |
use with Supersedes (see section 6.14). If sev- |
3820 |
eral revisions of an article appear in fast suc- |
3821 |
cession, each using Supersedes to cancel the pre- |
3822 |
vious one, it is possible for a middle revision to |
3835 |
be destroyed by cancellation before it is propa- |
3836 |
gated onward to cancel its predecessor. Allowing |
3837 |
each article to cancel several predecessors |
3838 |
greatly alleviates this problem. (Posting agents |
3839 |
preparing a cancel of an article which itself can- |
3840 |
cels other articles might wish to add those arti- |
3841 |
cles to the cancel-arguments.) However, posters |
3842 |
should be aware that much old software does not |
3843 |
implement multiple cancellation properly, and |
3844 |
should avoid using it when reliable cancellation |
3845 |
is vitally important. |
3846 |
|
3847 |
When an article (the "target article") is to be cancelled, |
3848 |
there are four cases of interest: the article hasn't arrived |
3849 |
yet, it has arrived and been filed and is available for |
3850 |
reading, it has expired and been archived on some less- |
3851 |
accessible storage medium, or it has expired and been |
3852 |
deleted. The next few paragraphs discuss each case in turn |
3853 |
(in reverse order, which is convenient for the explanation). |
3854 |
|
3855 |
EXPIRED AND DELETED. Take no action. |
3856 |
|
3857 |
EXPIRED AND ARCHIVED. If the article is readily accessible |
3858 |
and can be deleted or made unreadable easily, treat as under |
3859 |
AVAILABLE below. Otherwise treat as under EXPIRED AND |
3860 |
DELETED. |
3861 |
|
3862 |
NOTE: While it is desirable for archived articles |
3863 |
to be cancellable, this can easily involve rewrit- |
3864 |
ing an entire archive volume just to get rid of |
3865 |
one article, perhaps with manual actions required |
3866 |
to arrange it. It is difficult to envision a sit- |
3867 |
uation so dire as to require such measures from |
3868 |
hundreds or thousands of administrators, or for |
3869 |
that matter one in which widespread compliance |
3870 |
with such a request is likely. |
3871 |
|
3872 |
AVAILABLE. Compare the mailing addresses from the From |
3873 |
lines of the cancel message and the target article, bearing |
3874 |
in mind that local parts (except for "postmaster") are case- |
3875 |
sensitive and domains are case-insensitive. If they do not |
3876 |
match, either refer the issue to an administrator for a |
3877 |
case-by-case decision, or treat as if they matched. |
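
The comparison rule can be illustrated with a short sketch. It assumes
the two mailing addresses have already been extracted from the From
lines in local-part@domain form; addresses_match is a hypothetical
name.

```python
def addresses_match(a, b):
    """Compare two mailing addresses of the form local-part@domain."""
    local_a, _, domain_a = a.rpartition("@")
    local_b, _, domain_b = b.rpartition("@")
    # Domains are case-insensitive.
    if domain_a.lower() != domain_b.lower():
        return False
    # "postmaster" is the one local part compared without regard to case.
    if local_a.lower() == "postmaster" and local_b.lower() == "postmaster":
        return True
    # All other local parts are case-sensitive.
    return local_a == local_b
```

A False result here selects the "refer or treat as matched" rule; it
is not grounds for silently ignoring the cancel.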
3878 |
|
3879 |
NOTE: It is generally trivial to forge articles, |
3880 |
so nothing short of cryptographic authentication |
3881 |
is really adequate to ensure that a cancel came |
3882 |
from the original article's author. Moreover, it |
3883 |
is highly desirable to permit authorities other |
3884 |
than the author to cancel articles, to allow for |
3885 |
cases in which the author is unavailable, uncoop- |
3886 |
erative, or malicious, and in which damage and/or |
3887 |
legal problems may be minimized by prompt cancel- |
3888 |
lation. Reliable authentication that would permit |
3901 |
such administrative cancels would be a worthwhile |
3902 |
extension to this Draft, and experimental work in |
3903 |
this area is encouraged. |
3904 |
|
3905 |
NOTE: Meanwhile, a simple check of addresses is |
3906 |
useful accident prevention and catches at least |
3907 |
the most simple-minded forgers. Since the intent |
3908 |
is accident prevention rather than ironclad secu- |
3909 |
rity, use of the From address is appropriate, all |
3910 |
the more so because in the presence of gateways |
3911 |
(especially redundant multiple gateways), the |
3912 |
author may not have full control over Sender head- |
3913 |
ers. |
3914 |
|
3915 |
NOTE: The "refer... or treat as if they matched" |
3916 |
rule is intended to specifically forbid quietly |
3917 |
ignoring cancels with mismatched addresses. |
3918 |
|
3919 |
If the addresses match, then if technically possible, the |
3920 |
relayer MUST delete the target article completely and imme- |
3921 |
diately. Failing that, it MUST make the target article |
3922 |
unreadable (preferably to everyone, minimally to everyone |
3923 |
but the administrator) and either arrange for it to be |
3924 |
deleted as soon as possible or notify an administrator at |
3925 |
once. |
3926 |
|
3927 |
NOTE: To allow for events such as criminal |
3928 |
actions, malicious forgeries, and copyright |
3929 |
infringements, where damage and/or legal problems |
3930 |
may be minimized by prompt cancellation, complete |
3931 |
removal is strongly preferred over merely making |
3932 |
the target article unreadable. The potential for |
3933 |
malice is outweighed by the importance of really |
3934 |
getting rid of the target article in some legiti- |
3935 |
mate cases. (In cases of inadvertent copyright |
3936 |
violation in particular, the ability to quickly |
3937 |
remedy the violation is of considerable legal |
3938 |
importance.) Failing that, making it unreadable |
3939 |
is better than nothing. |
3940 |
|
3941 |
NOTE: Merely annotating the article so that read- |
3942 |
ers see an indication that the author wanted it |
3943 |
cancelled is not acceptable. Making the article |
3944 |
unreadable is the minimum action. |
3945 |
|
3946 |
NOTE: There have been experiments with making can- |
3947 |
celled articles unreadable, so that local news |
3948 |
administrators could reverse cancellations. In |
3949 |
practice, administrators almost never find cause |
3950 |
to do so. Removal appears to be clearly prefer- |
3951 |
able where technically feasible. |
3952 |
|
3953 |
NOT ARRIVED YET. If practical, retain the cancel message |
3954 |
until the target article does arrive, or until there is no |
3967 |
further possibility of it arriving and being accepted (see |
3968 |
section 9.2), and then treat as under AVAILABLE. Failing |
3969 |
that, arrange for the target article to be rejected and dis- |
3970 |
carded if it does arrive. |
3971 |
|
3972 |
NOTE: It may well be impractical to retain the |
3973 |
control message, given uncertainty about whether |
3974 |
the target article will ever arrive. Existing |
3975 |
practice in such cases is to assume that addresses |
3976 |
would match and arrange the equivalent of dele- |
3977 |
tion. This is often done by making a spurious |
3978 |
entry in a database of already-seen message IDs |
3979 |
(see section 9.3), so that if the article does |
3980 |
arrive, it will be rejected as a duplicate. |
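
The spurious-entry technique described in this note might look like
the following toy sketch. The History class and its method names are
hypothetical, and a real relayer would keep this database on disk
rather than in memory.

```python
class History:
    """Toy database of already-seen message IDs."""

    def __init__(self):
        self.seen = set()

    def offer(self, message_id):
        """Return True to accept an arriving article, False to reject
        it as a duplicate."""
        if message_id in self.seen:
            return False
        self.seen.add(message_id)
        return True

    def cancel_unarrived(self, message_id):
        """Record a spurious entry so that a later arrival is rejected."""
        self.seen.add(message_id)
```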
3981 |
|
3982 |
The cancel message MUST be propagated onward in the usual |
3983 |
fashion, regardless of which of the four cases applied, so |
3984 |
that the target article will be cancelled everywhere even if |
3985 |
cancellation and target article follow different routes. |
3986 |
|
3987 |
NOTE: RFC 1036 appeared to require stopping cancel |
3988 |
propagation in the NOT ARRIVED YET case, although |
3989 |
the wording was somewhat unclear. This appears to |
3990 |
have been an unwise decision; there are known |
3991 |
cases of important cancellations (in situations |
3992 |
of, e.g., inadvertent copyright violation) achiev- |
3993 |
ing rather poorer propagation than the target |
3994 |
article. News propagation is often a much less |
3995 |
orderly process than the authors of RFC 1036 |
3996 |
apparently envisioned. Modern implementations |
3997 |
generally propagate the cancellation regardless. |
3998 |
|
3999 |
Posting agents meant for use by ordinary posters SHOULD |
4000 |
reject an attempt to post a cancel message if the target |
4001 |
article is available and the mailing address in its From |
4002 |
header does not match the one in the cancel message's From |
4003 |
header. |
4004 |
|
4005 |
NOTE: This, again, is primarily accident preven- |
4006 |
tion. |
4007 |
|
4008 |
|
4009 |
7.2. ihave, sendme |
4010 |
|
4011 |
The ihave and sendme control messages implement a crude |
4012 |
batched predecessor of NNTP [rrr]. They are |
4013 |
largely obsolete in the Internet, but still see use in the |
4014 |
UUCP environment, especially for backup feeds that normally |
4015 |
are active only when a primary feed path has failed. |
4016 |
|
4017 |
NOTE: The ihave and sendme messages defined here |
4018 |
have ABSOLUTELY NOTHING TO DO WITH NNTP, despite |
4019 |
similarities of terminology. |
4020 |
|
4033 |
The two messages share the same syntax: |
4034 |
|
4035 |
ihave-arguments = *( message-id space ) relayer-name |
4036 |
sendme-arguments = ihave-arguments |
4037 |
ihave-body = *( message-id eol ) |
4038 |
sendme-body = ihave-body |
4039 |
|
4040 |
Message IDs MUST appear in either the arguments or the body, |
4041 |
but not both. Relayers SHOULD generate the form putting |
4042 |
message IDs in the body, but the other form MUST be sup- |
4043 |
ported for backward compatibility. |
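
A receiving relayer has to accept both forms. The sketch below shows
one way to do so, assuming the arguments and the body are supplied as
plain strings with newline-separated body lines; parse_ihave is a
hypothetical name, and the same function serves for sendme since the
syntax is shared.

```python
def parse_ihave(arguments, body):
    """Return (relayer_name, message_ids) for an ihave or sendme message."""
    words = arguments.split()
    if not words:
        raise ValueError("relayer name is required")
    # The relayer name is the last argument; any earlier ones are IDs.
    relayer, arg_ids = words[-1], words[:-1]
    body_ids = [line.strip() for line in body.splitlines() if line.strip()]
    if arg_ids and body_ids:
        raise ValueError("message IDs in both arguments and body")
    return relayer, arg_ids or body_ids
```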
4044 |
|
4045 |
NOTE: RFC 1036 made the relayer name optional, but |
4046 |
difficulties could easily ensue in determining the |
4047 |
origin of the message, and this option is believed |
4048 |
to be unused nowadays. Putting the message IDs in |
4049 |
the body is strongly preferred over putting them |
4050 |
in the arguments because it lends itself much bet- |
4051 |
ter to large numbers of message IDs and avoids the |
4052 |
empty-body problem mentioned in section 4.3.1. |
4053 |
|
4054 |
The ihave message states that the named relayer has filed |
4055 |
articles with the specified message IDs, which may be of |
4056 |
interest to the relayer(s) receiving the ihave message. The |
4057 |
sendme message requests that the relayer receiving it send |
4058 |
the articles having the specified message IDs to the named |
4059 |
relayer. |
4060 |
|
4061 |
These control messages are normally sent essentially as |
4062 |
point-to-point messages, by using "to." newsgroups (see sec- |
4063 |
tion 5.5) that are sent only to the relayer the messages are |
4064 |
intended for. The two relayers MUST be neighbors, exchang- |
4065 |
ing news directly with each other. Each relayer advertises |
4066 |
its new arrivals to the other using ihave messages, and each |
4067 |
uses sendme messages to request the articles it lacks. |
4068 |
|
4069 |
NOTE: Arguably these point-to-point control mes- |
4070 |
sages should flow by some other protocol, e.g. |
4071 |
mail, but administrative and interfacing issues |
4072 |
are simplified if the news system doesn't need to |
4073 |
talk to the mail system. |
4074 |
|
4075 |
To reduce overhead, ihave and sendme messages SHOULD be sent |
4076 |
relatively infrequently and SHOULD contain substantial num- |
4077 |
bers of message IDs. If ihave and sendme are being used to |
4078 |
implement a backup feed, it may be desirable to insert a |
4079 |
delay between reception of an ihave and generation of a |
4080 |
sendme, so that a slightly slow primary feed will not cause |
4081 |
large numbers of articles to be requested unnecessarily via |
4082 |
sendme. |
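
The delay suggested for backup feeds could be arranged roughly as
follows. This is a sketch under assumptions: times are plain numbers
in any consistent unit, and the names due_requests, pending, and have
are hypothetical.

```python
def due_requests(pending, have, now, delay):
    """Pick the message IDs worth requesting with sendme.

    pending maps message ID -> time the ihave naming it was received;
    have is the set of message IDs already filed locally.
    Entries that are resolved either way are removed from pending.
    """
    wanted = []
    for message_id, received in list(pending.items()):
        if message_id in have:
            # The primary feed delivered it in the meantime.
            del pending[message_id]
        elif now - received >= delay:
            wanted.append(message_id)
            del pending[message_id]
    return wanted
```

Batching the returned IDs into a single sendme also serves the goal
of infrequent messages with substantial numbers of message IDs.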
4083 |
|
4084 |
|
4099 |
7.3. newgroup |
4100 |
|
4101 |
The newgroup control message requests that a new newsgroup |
4102 |
be created: |
4103 |
|
4104 |
newgroup-arguments = newsgroup-name [ space moderation ] |
4105 |
moderation = "moderated" / "unmoderated" |
4106 |
newgroup-body = body |
4107 |
/ [ body ] descriptor [ body ] |
4108 |
descriptor = descriptor-tag eol description-line eol |
4109 |
descriptor-tag = "For your newsgroups file:" |
4110 |
description-line = newsgroup-name space description |
4111 |
description = nonblank-text [ " (Moderated)" ] |
4112 |
|
4113 |
The first argument names the newsgroup to be created, and |
4114 |
the second one (if present) indicates whether it is moder- |
4115 |
ated. If there is no second argument, the default is |
4116 |
"unmoderated". |
4117 |
|
4118 |
NOTE: Implementors are warned that there is occa- |
4119 |
sional use of other forms in the second argument. |
4120 |
It is suggested that such violations of this |
4121 |
Draft, which are also violations of RFC 1036, |
4122 |
cause the newgroup message to be ignored. RFC |
4123 |
1036 was slightly vague about how second arguments |
4124 |
other than "moderated" were to be treated (specif- |
4125 |
ically, whether they were illegal or just |
4126 |
ignored), but it is thought that all existing |
4127 |
major implementations will handle "unmoderated" |
4128 |
correctly, and it appears desirable to tighten up |
4129 |
the specs to make it possible for other forms to |
4130 |
be used in future. |
4131 |
|
4132 |
The body is a comment, which software MUST ignore, except |
4133 |
that if it contains a descriptor, the description line is |
4134 |
intended to be suitable for addition to a list of newsgroup |
4135 |
descriptions. The description cannot be continued onto |
4136 |
later lines, but is not constrained to any particular |
4137 |
length. Moderated newsgroups have descriptions that end |
4138 |
with the string " (Moderated)" (note that this string begins |
4139 |
with a blank). |
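
A hedged sketch of extracting this information from a newgroup
message follows. It assumes the arguments and body are plain strings;
parse_newgroup is a hypothetical name, and raising an error for an
unrecognized second argument is one way to let the caller ignore the
message.

```python
TAG = "For your newsgroups file:"


def parse_newgroup(arguments, body):
    """Return (name, moderated, description) for a newgroup message.

    description is None when the body holds no usable descriptor.
    """
    words = arguments.split()
    if not 1 <= len(words) <= 2:
        raise ValueError("expected one or two arguments")
    status = words[1] if len(words) == 2 else "unmoderated"
    if status not in ("moderated", "unmoderated"):
        raise ValueError("unrecognized second argument: %r" % status)
    lines = body.splitlines()
    description = None
    for i, line in enumerate(lines[:-1]):
        # The tag is case-sensitive and must be alone on its line.
        if line == TAG:
            parts = lines[i + 1].split(None, 1)
            if len(parts) == 2 and parts[0] == words[0]:
                description = parts[1]
            break
    return words[0], status == "moderated", description
```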
4140 |
|
4141 |
NOTE: It is unfortunate that the description line |
4142 |
is part of the body, rather than being supplied in |
4143 |
a header, but this is established practice. News- |
4144 |
group creators are cautioned that the descriptor |
4145 |
tag must be reproduced exactly as given above, |
4146 |
alone on a line, and is case-sensitive. (To |
4147 |
reduce errors in this regard, posting agents might |
4148 |
wish to question or reject newgroup messages which |
4149 |
do not contain a descriptor.) Given the desire |
4150 |
for short lines, description writers should avoid |
4151 |
content-free phrases like "discussion of" and |
4152 |
"news about", and stick to defining what the |
4165 |
newsgroup is about. |
4166 |
|
4167 |
The remainder of the body SHOULD contain an explanation of |
4168 |
the purpose of the newsgroup and the decision to create it. |
4169 |
|
4170 |
NOTE: Criteria for newsgroup creation vary widely |
4171 |
and are outside the scope of this Draft, but if |
4172 |
formal procedures of one kind or another were fol- |
4173 |
lowed in the decision, the body should mention |
4174 |
this. Administrators often look for such informa- |
4175 |
tion when deciding whether to comply with cre- |
4176 |
ation/deletion requests. |
4177 |
|
4178 |
A newgroup message which lacks an Approved header MUST be |
4179 |
ignored. |
4180 |
|
4181 |
NOTE: It would also be desirable to ignore a new- |
4182 |
group message unless its Approved header names a |
4183 |
person who is authorized (in some sense) to create |
4184 |
such a newsgroup. A cooperating subnet with suf- |
4185 |
ficiently strong coordination to maintain a cor- |
4186 |
rect and current list of authorized creators might |
4187 |
wish to do so for its internal newsgroups. It |
4188 |
also (or alternatively) might wish to ignore a |
4189 |
newgroup message for an internal newsgroup that |
4190 |
was posted (or cross-posted) to a non-internal |
4191 |
newsgroup. |
4192 |
|
4193 |
NOTE: As mentioned in section 6.10, some form of |
4194 |
(cryptographic?) authentication of Approved head- |
4195 |
ers would be highly desirable, especially for con- |
4196 |
trol messages. |
4197 |
|
4198 |
It would be desirable to provide some way of supplying a |
4199 |
moderator's address in a newgroup message for a moderated |
4200 |
newsgroup, but this will cause problems unless effective |
4201 |
authentication is available, so it is left for future work. |
4202 |
|
4203 |
NOTE: This leaves news administrators stuck with |
4204 |
the annoying chore of arranging proper mailing of |
4205 |
moderated-newsgroup submissions. On Usenet, this |
4206 |
can be simplified by exploiting a forwarding |
4207 |
facility that some major sites provide: they main- |
4208 |
tain forwarding addresses, each the name of a mod- |
4209 |
erated newsgroup with all periods (".", ASCII 46) |
4210 |
replaced by hyphens ("-", ASCII 45), which forward |
4211 |
mail to the current newsgroup moderators. More |
4212 |
advice on the subject of forwarding to moderators |
4213 |
can be found in the document titled "How to Con- |
4214 |
struct the Mailpaths File", posted regularly to |
4215 |
the Usenet newsgroups news.lists, news.admin.misc, |
4216 |
and news.answers. |
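
The forwarding-address convention in this note amounts to a simple
substitution, sketched here. The function name and the forwarding
host used in the usage line are hypothetical.

```python
def moderator_forwarding_address(newsgroup, forwarder):
    """Build the forwarding address for a moderated newsgroup by
    replacing each period in its name with a hyphen."""
    return newsgroup.replace(".", "-") + "@" + forwarder
```

For example, moderator_forwarding_address("comp.lang.example",
"forwarder.example") yields "comp-lang-example@forwarder.example".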
4217 |
|
4231 |
A newgroup message naming a newsgroup that already exists is |
4232 |
requesting a change in the moderation status or description |
4233 |
of the newsgroup. The same rules apply. |
4234 |
|
4235 |
|
4236 |
7.4. rmgroup |
4237 |
|
4238 |
The rmgroup message requests that a newsgroup be deleted: |
4239 |
|
4240 |
rmgroup-arguments = newsgroup-name |
4241 |
rmgroup-body = body |
4242 |
|
4243 |
The sole argument is the newsgroup name. The body is a com- |
4244 |
ment, which software MUST ignore; it SHOULD contain an |
4245 |
explanation of the decision to delete the newsgroup. |
4246 |
|
4247 |
NOTE: Criteria for newsgroup deletion vary widely |
4248 |
and are outside the scope of this Draft, but if |
4249 |
formal procedures of one kind or another were fol- |
4250 |
lowed in the decision, the body should mention |
4251 |
this. Administrators often look for such informa- |
4252 |
tion when deciding whether to comply with cre- |
4253 |
ation/deletion requests. |
4254 |
|
4255 |
A rmgroup message which lacks an Approved header MUST be |
4256 |
ignored. |
4257 |
|
4258 |
NOTE: It would also be desirable to ignore a |
4259 |
rmgroup message unless its Approved header names a |
4260 |
person who is authorized (in some sense) to delete |
4261 |
such a newsgroup. A cooperating subnet with suf- |
4262 |
ficiently strong coordination to maintain a cor- |
4263 |
rect and current list of authorized deleters might |
4264 |
wish to do so for its internal newsgroups. It |
4265 |
also (or alternatively) might wish to ignore a |
4266 |
rmgroup message for an internal newsgroup that was |
4267 |
posted (or cross-posted) to a non-internal news- |
4268 |
group. |
4269 |
|
4270 |
Unexpected deletion of a newsgroup being a disruptive |
4271 |
action, implementations are strongly advised to refer |
4272 |
rmgroup messages to an administrator by default, unless per- |
4273 |
haps the message can be determined to have originated within |
4274 |
a cooperating subnet whose members are considered trustwor- |
4275 |
thy. Abuses have occurred. |
4276 |
|
4277 |
|
4278 |
7.5. sendsys, version, whogets |
4279 |
|
4280 |
The sendsys message requests that a description of the |
4281 |
relayer's news feeds to other relayers be mailed to the |
4282 |
article's reply address: |
4283 |
|
4297 |
sendsys-arguments = [ relayer-name ] |
4298 |
sendsys-body = body |
4299 |
|
4300 |
If there is an argument, relayers other than the one named |
4301 |
by the argument MUST not respond. The body is a comment, |
4302 |
which software MUST ignore; it SHOULD contain an explanation |
4303 |
of the reason for the request. |
4304 |
|
4305 |
The version message requests that the name and version of |
4306 |
the relayer software be mailed to the reply address: |
4307 |
|
4308 |
version-arguments = |
4309 |
version-body = body |
4310 |
|
4311 |
There are no arguments. The body is a comment, which soft- |
4312 |
ware MUST ignore; it SHOULD contain an explanation of the |
4313 |
reason for the request. |
4314 |
|
4315 |
The whogets message requests that a description of the |
4316 |
relayer and its news feeds to other relayers be mailed to |
4317 |
the article's reply address: |
4318 |
|
4319 |
whogets-arguments = newsgroup-name [ space relayer-name ] |
4320 |
whogets-body = body |
4321 |
|
4322 |
The first argument is the name of the "target newsgroup", |
4323 |
specifying the newsgroup for which propagation information |
4324 |
is desired. This MUST be a complete newsgroup name, not the |
4325 |
name of a hierarchy or a portion of a newsgroup name that is |
4326 |
not itself the name of a newsgroup. If there is a second |
4327 |
argument, only the relayer named by that argument should |
4328 |
respond. The body is a comment, which software MUST ignore; |
4329 |
it SHOULD contain an explanation of the reason for the |
4330 |
request. |
4331 |
|
4332 |
NOTE: Whogets is intended as a replacement for |
4333 |
sendsys (and version) with a precisely-specified |
4334 |
reply format. Since the syntax for specifying |
4335 |
what newsgroups get sent to what other relayers |
4336 |
varies widely between different forms of relayer |
4337 |
software, the only practical way to standardize |
4338 |
the reply format is to indicate a specific news- |
4339 |
group and ask where THAT newsgroup propagates. |
4340 |
The requirement that it be a complete newsgroup |
4341 |
name is intended to (largely) avoid the problem of |
4342 |
having to answer "yes and no" in cases where not |
4343 |
all newsgroups in a hierarchy are sent. |
4344 |
|
4345 |
Any of these messages lacking an Approved header MUST be |
4346 |
ignored. Response to any of these messages SHOULD be |
4347 |
delayed for at least 24 hours, and no response SHOULD be |
4348 |
attempted if the message has been cancelled in that time. |
4349 |
Also, no response SHOULD be attempted unless the local part |
4350 |
of the destination address is "newsmap". News |
4363 |
administrators SHOULD arrange for mail to "newsmap" on their |
4364 |
systems to be discarded (without reply) unless legitimate |
4365 |
use is in progress. |
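
The conditions on responding might be combined as in this sketch. It
assumes times are in seconds and that the caller already knows
whether an Approved header was present and whether the request has
been cancelled; may_respond and its parameters are hypothetical
names.

```python
DELAY_SECONDS = 24 * 60 * 60


def may_respond(has_approved, reply_address, received, now, cancelled):
    """Decide whether to mail a reply to a sendsys, version, or
    whogets message."""
    if not has_approved:
        return False        # lacks an Approved header: ignore
    if now - received < DELAY_SECONDS:
        return False        # still within the waiting period
    if cancelled:
        return False        # cancelled during the delay
    # Only reply when the destination local part is "newsmap".
    local_part = reply_address.rpartition("@")[0]
    return local_part == "newsmap"
```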
4366 |
|
4367 |
NOTE: Because these messages can cause many, many |
4368 |
relayers to send mail to one person, such mes- |
4369 |
sages, specifying mailing to an innocent person's |
4370 |
mailbox, have been forged as a half-witted practi- |
4371 |
cal joke. A delay gives administrators time to |
4372 |
notice a fraudulent message and act (by cancelling |
4373 |
the message, preparing to divert the flood of mail |
4374 |
into the bit bucket, or both). Restriction of the |
4375 |
destination address to "newsmap" reduces the |
4376 |
appeal of fraud by making it impossible to use it |
4377 |
to harass a normal user. (A site which does NOT |
4378 |
discard mail to "newsmap", but rather bounces it |
4379 |
back, may incur higher communications costs than |
4380 |
if the mail had been accepted into a user's mail- |
4381 |
box... but a malicious forger could accomplish |
4382 |
this anyway, by using an address whose local part |
4383 |
is very unlikely to be a legitimate mailbox name.) |
4384 |
|
4385 |
NOTE: RFC 1036 did not require the Approved header |
4386 |
for these control messages. This has been added |
4387 |
because of the possibility that cryptographic |
4388 |
authentication of Approved headers will become |
4389 |
available. |
4390 |
|
4391 |
The body of the reply to a sendsys message SHOULD be of the |
4392 |
form: |
4393 |
|
4394 |
sendsys-reply = responder 1*sys-line |
4395 |
responder = "Responding-System:" space domain eol |
4396 |
sys-line = relayer-name ":" newsgroup-patterns [ ":" text ] eol |
4397 |
newsgroup-patterns = newsgroup-name *( "," newsgroup-name ) |
4398 |
|
4399 |
The first line identifies the responding system, using a |
4400 |
syntax resembling a header (but note that it is part of the |
4401 |
BODY). Remaining lines indicate what newsgroups are sent to |
4402 |
what other systems. The syntax of newsgroup patterns is not |
4403 |
well standardized; the form described is common (often with |
4404 |
newsgroup names only partially given, denoting all names |
4405 |
starting with a particular set of components) but not uni- |
4406 |
versal. The whogets message provides a better-defined |
4407 |
alternative. |
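
A sendsys reply body in the common form could be generated as in the
following sketch. It omits the optional trailing text field, treats
eol as a newline character, and uses hypothetical names throughout.

```python
def format_sendsys_reply(domain, feeds):
    """Build a sendsys reply body.

    feeds is a list of (relayer_name, newsgroup_patterns) pairs, where
    newsgroup_patterns is a list of strings.
    """
    if not feeds:
        raise ValueError("at least one sys-line is required")
    # The first line resembles a header but is part of the body.
    lines = ["Responding-System: " + domain]
    for relayer, patterns in feeds:
        lines.append(relayer + ":" + ",".join(patterns))
    return "\n".join(lines) + "\n"
```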
4408 |
|
4409 |
The reply to a version message is of somewhat ill-defined |
4410 |
form, with a body normally consisting of a single line of |
4411 |
text that somehow describes the version of the relayer soft- |
4412 |
ware. The whogets message provides a better-defined alter- |
4413 |
native. |
4414 |
|
4415 |
The body of the reply to a whogets message MUST be of the |
4416 |
form: |
4417 |
|
4429 |
whogets-reply = responder-domain responder-relayer response-date |
4430 |
responding-to arrived-via responder-version |
4431 |
whogets-delimiter *pass-line |
4432 |
responder-domain = "Responding-System:" space domain eol |
4433 |
responder-relayer = "Responding-Relayer:" space relayer-name eol |
4434 |
response-date = "Response-Date:" space date eol |
4435 |
responding-to = "Responding-To:" space message-id eol |
4436 |
arrived-via = "Arrived-Via:" space path-list eol |
4437 |
responder-version = "Responding-Version:" space nonblank-text eol |
4438 |
whogets-delimiter = eol |
4439 |
pass-line = relayer-name [ space domain ] eol |
4440 |
|
4441 |
The first six lines identify the responding relayer by its |
4442 |
Internet domain name (use of the ".uucp" and ".bitnet" |
4443 |
pseudo-domains is permissible, for registered hosts in them, |
4444 |
but discouraged) and its relayer name, specify the date when |
4445 |
the reply was generated and the message ID of the whogets |
4446 |
message being replied to, give the path list (from the Path |
4447 |
header) of the whogets message (which MAY, if absolutely |
4448 |
necessary, be truncated to a convenient length, but MUST |
4449 |
contain at least the leading three relayer names), and indi- |
4450 |
cate the version of relayer software responding. Note that |
4451 |
these lines are part of the BODY even though their format |
4452 |
resembles that of headers. Despite the apparently-fixed |
4453 |
order specified by the syntax above, they can appear in any |
4454 |
order, but there must be exactly one of each. |
4455 |
|
4456 |
After those preliminaries, and an empty line to unambigu- |
4457 |
ously define their end, the remaining lines are the relayer |
4458 |
names (which MAY be accompanied by the corresponding domain |
4459 |
names, if known) of systems which the responding system |
4460 |
passes the target newsgroup to. Only the names of news |
4461 |
relayers are to be included. |
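
Software collecting replies might parse a whogets reply body along
these lines. The sketch assumes newline-terminated lines, does not
validate the individual field values, and uses the hypothetical name
parse_whogets_reply.

```python
REQUIRED = (
    "Responding-System", "Responding-Relayer", "Response-Date",
    "Responding-To", "Arrived-Via", "Responding-Version",
)


def parse_whogets_reply(body):
    """Return (fields, pass_lines) from the body of a whogets reply."""
    # The empty line marks the end of the preliminaries.
    head, sep, tail = body.partition("\n\n")
    fields = {}
    for line in head.splitlines():
        name, colon, value = line.partition(":")
        # Any order is accepted, but each line must appear exactly once.
        if not colon or name not in REQUIRED or name in fields:
            raise ValueError("bad preliminary line: %r" % line)
        fields[name] = value.strip()
    if not sep or len(fields) != len(REQUIRED):
        raise ValueError("preliminaries incomplete or delimiter missing")
    # Each pass line is a relayer name, optionally followed by a domain.
    passes = [tuple(line.split()) for line in tail.splitlines() if line.strip()]
    return fields, passes
```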
4462 |
|
4463 |
NOTE: It is desirable for a reply to identify its |
4464 |
source by both domain name and relayer name |
4465 |
because news propagation is governed by the latter |
4466 |
but location in a broader context is best deter- |
4467 |
mined by the former. The date and whogets message |
4468 |
ID should, in principle, be present in the MAIL |
4469 |
headers, but are included in the body for robust- |
4470 |
ness in the presence of uncooperative mail sys- |
4471 |
tems. The reason for the path list is discussed |
4472 |
below. Adding version information eliminates the |
4473 |
need for a separate message to gather it. |
4474 |
|
4475 |
NOTE: The limitation of pass lines to contain only |
4476 |
names of news relayers is meant to exclude names |
4477 |
used within a single host (as identifiers for mail |
4478 |
gateways, portions of ihave/sendme implementa- |
4479 |
tions, etc.), which do not actually refer to other |
4480 |
hosts. |
4481 |
|
4495 |
A relayer which is unaware of the existence of the target |
4496 |
newsgroup MUST not reply to a whogets message at all, |
4497 |
although this MUST not influence decisions on whether to |
4498 |
pass the article on to other relayers. |
4499 |
|
4500 |
NOTE: While this may result in discontinuous maps |
4501 |
in cases where some hosts have not honored |
4502 |
requests for creation of a newsgroup, it will also |
4503 |
prevent a flood of useless responses in the event |
4504 |
that a whogets message intended to map a small |
4505 |
region "leaks" out to a larger one. The possibil- |
4506 |
ity of discontinuous recognition of a newsgroup |
4507 |
does make it important that the whogets message |
4508 |
itself continue to propagate (if other criteria |
4509 |
permit). This is also the reason for the inclu- |
4510 |
sion of the whogets message's path list, or at |
4511 |
least the leading portion of it, in the reply: to |
4512 |
permit reconstruction of at least small gaps in |
4513 |
maps. |
4514 |
|
4515 |
Different networks set different rules for the legitimacy of |
4516 |
these messages, given that they may reveal details of orga- |
4517 |
nization-internal topology that are sometimes considered |
4518 |
proprietary. |
4519 |
|
4520 |
NOTE: On Usenet, in particular, willingness to |
4521 |
respond to these messages is held to be a condi- |
4522 |
tion of network membership: the topology of Usenet |
4523 |
is public information. Organizations wishing to |
4524 |
belong to such networks while keeping their inter- |
4525 |
nal topology confidential might wish to organize |
4526 |
their internal news software so that all articles |
4527 |
reaching outsiders appear to be from a single |
4528 |
"gatekeeper" system, with the details of internal |
4529 |
topology hidden behind that system. |
4530 |
|
4531 |
UNRESOLVED ISSUE: It might be useful to have a way |
4532 |
to set some sort of hop limit for these. |
4533 |
|
4534 |
|
4535 |
7.6. checkgroups |
4536 |
|
4537 |
The checkgroups control message contains a supposedly |
4538 |
authoritative list of the valid newsgroups within some sub- |
4539 |
set of the newsgroup name space: |
4540 |
|
4541 |
checkgroups-arguments = |
4542 |
checkgroups-body = [ invalidation ] valid-groups |
4543 |
/ invalidation |
4544 |
invalidation = "!" plain-component *( "," plain-component ) eol |
4545 |
valid-groups = 1*( description-line eol ) |
4546 |
|
4547 |
There are no arguments. The body lines (except possibly for |
4548 |
an initial invalidation) each contain a description line for |
4549 |
|
4561 |
a newsgroup, as defined under the newgroup message (section |
4562 |
7.3). |
4563 |
|
4564 |
NOTE: Some other, ill-defined, forms of the check- |
4565 |
groups body were formerly used. See appendix A. |
4566 |
|
4567 |
The checkgroups message applies to all hierarchies contain- |
4568 |
ing any of the newsgroups listed in the body. The check- |
4569 |
groups message asserts that the newsgroups it lists are the |
4570 |
only newsgroups in those hierarchies. If there is an inval- |
4571 |
idation, it asserts that the hierarchies it names no longer |
4572 |
contain any newsgroups. |
4573 |
|
4574 |
Processing a checkgroups message MAY cause a local list of |
4575 |
newsgroup descriptions to be updated. It SHOULD also cause |
4576 |
the local lists of newsgroups (and their moderation sta- |
4577 |
tuses) in the mentioned hierarchies to be checked against |
4578 |
the message. The results of the check MAY be used for auto- |
4579 |
matic corrective action, or MAY be reported to the news |
4580 |
administrator in some way. |
4581 |
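The check described above might be sketched as follows. This is
only an illustration, not a prescribed implementation. It assumes
that a description line is a newsgroup name, a tab, and a
description, and that a hierarchy is identified by the first
component of a newsgroup name; the function names are hypothetical.
It only reports differences, leaving any corrective action to the
administrator.

```python
def parse_checkgroups(body: str):
    """Split a checkgroups body into (invalidated hierarchies, {group: description})."""
    invalidated, groups = [], {}
    lines = [ln for ln in body.splitlines() if ln.strip()]
    # An optional first line "!a,b,c" names hierarchies that are now empty.
    if lines and lines[0].startswith("!"):
        invalidated = lines.pop(0)[1:].split(",")
    for ln in lines:
        # Assumed layout: newsgroup name, tab, description.
        name, _, desc = ln.partition("\t")
        groups[name] = desc.strip()
    return invalidated, groups


def compare_with_local(body: str, local_groups: set):
    """Report differences between the message and the local list; take no action."""
    invalidated, listed = parse_checkgroups(body)
    # The message covers every hierarchy containing a listed group,
    # plus every hierarchy named in the invalidation line.
    hierarchies = {name.split(".")[0] for name in listed} | set(invalidated)
    covered = {g for g in local_groups if g.split(".")[0] in hierarchies}
    return {
        "missing_locally": sorted(set(listed) - local_groups),
        "not_in_message": sorted(covered - set(listed)),
    }
```

Both lists are candidates for the administrator's attention rather
than instructions to be carried out.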
|
4582 |
NOTE: Automatically updating descriptions of |
4583 |
existing newsgroups is relatively safe. In the |
4584 |
case of newsgroup additions or deletions, simply |
4585 |
notifying the administrator is generally the wis- |
4586 |
est action, unless perhaps the message can be |
4587 |
determined to have originated within a cooperating |
4588 |
subnet whose members are considered trustworthy. |
4589 |
|
4590 |
NOTE: There is a problem with the checkgroups con- |
4591 |
cept: not all newsgroups in a hierarchy necessar- |
4592 |
ily propagate to the same set of machines. |
4593 |
(Notably, there is a set of newsgroups known as |
4594 |
the "inet" newsgroups, which have relatively lim- |
4595 |
ited distribution but coexist in several hierar- |
4596 |
chies with more widely-distributed newsgroups.) |
4597 |
The advice of checkgroups should always be taken |
4598 |
with a grain of salt, and should never be followed |
4599 |
blindly. |
4600 |
|
4601 |
|
4602 |
8. Transmission Formats |
4603 |
|
4604 |
While this Draft does not specify transmission methods |
4605 |
except to place a few constraints on them, there are some |
4606 |
data formats used only for transmission that are unique to |
4607 |
news. |
4608 |
|
4609 |
|
4610 |
8.1. Batches |
4611 |
|
4612 |
For efficient bulk transmission and processing of news arti- |
4613 |
cles, it is often desirable to transmit a number of them as |
4614 |
a single block of data, a "batch". The format of a batch |
4627 |
is: |
4628 |
|
4629 |
batch = 1*( batch-header article ) |
4630 |
batch-header = "#! rnews " article-size eol |
4631 |
article-size = 1*digit |
4632 |
|
4633 |
A batch is a sequence of articles, each prefixed by a header |
4634 |
line that includes its size. The article size is a decimal |
4635 |
count of the octets in the article, counting each EOL as one |
4636 |
octet regardless of how it is actually represented. |
4637 |
|
4638 |
NOTE: A relayer might wish to accept either a sin- |
4639 |
gle article or a batch as input. Since "#" cannot |
4640 |
appear in a header name, examination of the first |
4641 |
octet of the input will reveal its nature. |
4642 |
|
4643 |
NOTE: In the header line, there is exactly one |
4644 |
blank before "rnews", there is exactly one blank |
4645 |
after "rnews", and the EOL immediately follows the |
4646 |
article size. Beware that some software inserts |
4647 |
non-standard trash after the size. |
4648 |
|
4649 |
NOTE: Despite the similarity of this format to the |
4650 |
executable-script format used by some operating |
4651 |
systems, it is EXTREMELY unwise to just feed |
4652 |
incoming batches to a command interpreter in the |
4653 |
anticipation that it will run a command named |
4654 |
"rnews" to process the batch. Unless arrangements |
4655 |
are made to very tightly restrict the range of |
4656 |
commands that can be executed by this means, the |
4657 |
security implications are disastrous. |
4658 |
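A minimal sketch of a reader for this format follows. It assumes
that each EOL is stored as a single LF octet, so that the article
size is a plain byte count; other representations would need
conversion first. The function name is hypothetical. The sketch
looks at the first octet to tell a batch from a lone article, and
it rejects header lines carrying anything after the size (a more
forgiving reader might tolerate such trash).

```python
def split_input(data: bytes):
    """Return the articles in `data`, which is either a batch or one article."""
    if not data.startswith(b"#"):
        return [data]  # "#" cannot begin a header name, so this is a lone article
    prefix = b"#! rnews "
    articles, pos = [], 0
    while pos < len(data):
        eol = data.index(b"\n", pos)  # raises ValueError if the header has no EOL
        header = data[pos:eol]
        size_text = header[len(prefix):]
        # Exactly one blank on each side of "rnews", then decimal digits, then EOL.
        if not header.startswith(prefix) or not size_text.isdigit():
            raise ValueError("malformed batch header: %r" % header)
        size = int(size_text)
        start = eol + 1
        if start + size > len(data):
            raise ValueError("batch is shorter than its header claims")
        articles.append(data[start:start + size])
        pos = start + size
    return articles
```

The data is treated purely as octets and is never handed to a
command interpreter.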
|
4659 |
|
4660 |
8.2. Encoded Batches |
4661 |
|
4662 |
When transmitting news, especially over communications links |
4663 |
that are slow or are billed by the bit, it is often desir- |
4664 |
able to batch news and apply data compression to the |
4665 |
batches. Transmission links sending compressed batches |
4666 |
SHOULD use out-of-band means of communication to specify the |
4667 |
compression algorithm being used. If there is no way to |
4668 |
send out-of-band information along with a batch, the follow- |
4669 |
ing encapsulation for a compressed batch MAY be used: |
4670 |
|
4671 |
ec-batch = "#! " compression-keyword eol compressed-batch |
4672 |
compression-keyword = "cunbatch" |
4673 |
|
4674 |
A line containing a keyword indicating the type of compres- |
4675 |
sion is followed by the compressed batch. The only truly |
4676 |
widespread compression keyword at present is "cunbatch", |
4677 |
indicating compression using the widely-distributed "com- |
4678 |
press" program. Other compression keywords MAY be used by |
4679 |
mutual agreement between the hosts involved. |
4680 |
|
4693 |
NOTE: An encapsulated compressed batch is NOT, in |
4694 |
general, a text file, despite having an initial |
4695 |
text line. This combination of text and non-text |
4696 |
data is often awkward to handle; for example, |
4697 |
standard decompression programs cannot be used |
4698 |
without first stripping off the initial line, and |
4699 |
that in turn is painful to do because many text- |
4700 |
handling tools that are superficially suited to |
4701 |
the job do not cope well with non-text data. |
4702 |
Hence the recommendation that out-of-band communi- |
4703 |
cation be used instead when possible. |
4704 |
|
4705 |
NOTE: For UUCP transmission, where a batch is typ- |
4706 |
ically transmitted by invoking the remote command |
4707 |
"rnews" with the batch as its input stream, a |
4708 |
plausible out-of-band method for indicating a com- |
4709 |
pression type would be to give a compression key- |
4710 |
word in an option to "rnews", perhaps in the form: |
4711 |
|
4712 |
rnews -d decompressor |
4713 |
|
4714 |
where "decompressor" is the name of a decompres- |
4715 |
sion program (e.g. "uncompress" for a batch com- |
4716 |
pressed with "compress" or "gunzip" for a batch |
4717 |
compressed with "gzip"). How this decompression |
4718 |
program is located and invoked by the receiving |
4719 |
relayer is implementation-specific. |
4720 |
|
4721 |
NOTE: See the notes in section 8.1 on the inadvis- |
4722 |
ability of feeding batches directly to command |
4723 |
interpreters. |
4724 |
|
4725 |
NOTE: There is exactly one blank between "#!" and |
4726 |
the compression keyword, and the EOL immediately |
4727 |
follows the keyword. |
4728 |
|
4729 |
|
4730 |
8.3. News Within Mail |
4731 |
|
4732 |
It is often desirable to transmit news as mail, either for |
4733 |
the convenience of a human recipient or because that is the |
4734 |
only type of transmission available on a restrictive commu- |
4735 |
nication path. |
4736 |
|
4737 |
Given the similarity between the news format and the MAIL |
4738 |
format, it is superficially attractive to just send the news |
4739 |
article as a mail message. This is typically a mistake: |
4740 |
mail-handling software often feels free to manipulate vari- |
4741 |
ous headers in undesirable ways (in some cases, such as |
4742 |
Sender, such manipulation is actually mandatory), and mail |
4743 |
transmission problems etc. MUST be reported to the adminis- |
4744 |
trators responsible for the mail transmission rather than to |
4745 |
the article's author. In general, news sent as mail should |
4746 |
be encapsulated to separate the mail headers and the news |
4759 |
headers. |
4760 |
|
4761 |
When the intended recipient is a human, any convenient form |
4762 |
of encapsulation may be used. Recommended practice is to |
4763 |
use MIME encapsulation with a content type of "mes- |
4764 |
sage/news", given that news articles have additional seman- |
4765 |
tics beyond what "message/rfc822" implies. |
4766 |
|
4767 |
NOTE: "message/news" was registered as a standard |
4768 |
subtype by IANA 22 June 1993. |
4769 |
|
4770 |
When mail is being used as a transmission path between two |
4771 |
relayers, however, a standard method is desirable. Cur- |
4772 |
rently the standard method is to send the mail to an address |
4773 |
whose local part is "rnews", with whatever mail headers are |
4774 |
necessary for successful transmission. The news article |
4775 |
(including its headers) is sent as the body of the mail mes- |
4776 |
sage, with an "N" prepended to each line. |
4777 |
|
4778 |
NOTE: The "N" reduces the probability of an inno- |
4779 |
cent line in a news article being taken as a magic |
4780 |
command to mail software, and makes it easy for |
4781 |
receiving software to strip off any lines added by |
4782 |
mail software (e.g. the trailing empty line added |
4783 |
by some UUCP mail software). |
4784 |
|
4785 |
This method has its weaknesses. In particular, it assumes |
4786 |
that the mail transmission channel can transmit nearly- |
4787 |
arbitrary body text undamaged. When mail is being used as a |
4788 |
transmission path of last resort, however, the mail system |
4789 |
often has inconvenient preconceived notions about the format |
4790 |
of message bodies. Various ad-hoc encoding schemes have |
4791 |
been used to avoid such problems. The recommended method is |
4792 |
to send a news article or batch as the body of a MIME mail |
4793 |
message, using content type "application/news-transmission" |
4794 |
and MIME's "base64" encoding (which is specifically designed |
4795 |
to survive all known major mail systems). |
4796 |
|
4797 |
NOTE: In the process, MIME conventions could be |
4798 |
used to fragment and reassemble an article which |
4799 |
is too large to be sent as a single mail message |
4800 |
over a transmission path that restricts message |
4801 |
length. In addition, the "conversions" parameter |
4802 |
to the content type could be used to indicate what |
4803 |
(if any) compression method has been used. And |
4804 |
the Content-MD5 header [rrr 1544] can be used as a |
4805 |
"checksum" to provide high confidence of detecting |
4806 |
accidental damage to the contents. |
4807 |
|
4808 |
UNRESOLVED ISSUE: The "conversions" parameter no |
4809 |
longer exists. What should be done about this, if |
4810 |
anything? |
4811 |
|
4825 |
NOTE: It might look tempting to use a content type |
4826 |
such as "message/X-netnews", but MIME bans non- |
4827 |
trivial encodings of the entire body of messages |
4828 |
with content type "message". The intent is to |
4829 |
avoid obscuring nested structure underneath encod- |
4830 |
ings. For inter-relayer news transmission, there |
4831 |
is no nested structure of interest, and it is |
4832 |
important that the entire article (including its |
4833 |
headers, not just its body) be protected against |
4834 |
the vagaries of intervening mail software. This |
4835 |
situation appears to fit the MIME description of |
4836 |
circumstances in which "application" is the proper |
4837 |
content type. |
4838 |
|
4839 |
NOTE: "application/news-transmission", with a |
4840 |
"conversions" parameter, was registered as a stan- |
4841 |
dard subtype by IANA 22 June 1993. |
4842 |
|
4843 |
UNRESOLVED ISSUE: The "conversions" parameter no |
4844 |
longer exists in MIME. What should we do about |
4845 |
this? |
4846 |
|
4847 |
|
4848 |
8.4. Partial Batches |
4849 |
|
4850 |
UNRESOLVED ISSUE: The existing batch conventions |
4851 |
assemble (potentially) many articles into one |
4852 |
batch. Handling very large articles would be sub- |
4853 |
stantially less troublesome if there was also a |
4854 |
fragmentation convention for splitting a large |
4855 |
article into several batches. Is this worth |
4856 |
defining at this time? |
4857 |
|
4858 |
|
4859 |
9. Propagation and Processing |
4860 |
|
4861 |
Most aspects of news propagation and processing are imple- |
4862 |
mentation-specific. The basic propagation algorithms, and |
4863 |
certain details of how they are implemented, nevertheless |
4864 |
need to be standard. |
4865 |
|
4866 |
There are two important principles that news implementors |
4867 |
(and administrators) need to keep in mind. The first is the |
4868 |
well-known Internet Robustness Principle: |
4869 |
|
4870 |
Be liberal in what you accept, and conservative in what you send. |
4871 |
|
4872 |
However, in the case of news there is an even more important |
4873 |
principle, derived from a much older code of practice, the |
4874 |
Hippocratic Oath (we will thus call this the Hippocratic |
4875 |
Principle): |
4876 |
|
4877 |
First, do no harm. |
4878 |
|
4891 |
It is VITAL to realize that decisions which might be merely |
4892 |
suboptimal in a smaller context can become devastating mis- |
4893 |
takes when amplified by the actions of thousands of hosts |
4894 |
within a few hours. |
4895 |
|
4896 |
|
4897 |
9.1. Relayer General Issues |
4898 |
|
4899 |
Relayers MUST not alter the content of articles unnecessar- |
4900 |
ily. Well-intentioned attempts to "improve" headers, in |
4901 |
particular, typically do more harm than good. It is neces- |
4902 |
sary for a relayer to prepend its own name to the Path con- |
4903 |
tent (see section 5.6) and permissible for it to rewrite or |
4904 |
delete the Xref header (see section 6.12). Relayers MAY |
4905 |
delete the thoroughly-obsolete headers described in appendix |
4906 |
A.3, although this behavior no longer seems useful enough to |
4907 |
encourage. Other alterations SHOULD be avoided at all |
4908 |
costs, as per the Hippocratic Principle. |
4909 |
|
4910 |
NOTE: As discussed in section 2.3, tidying up the |
4911 |
headers of a user-prepared article is the job of |
4912 |
the posting agent, not the relayer. The relayer's |
4913 |
purpose is to move already-compliant articles |
4914 |
around efficiently without damaging them. Note |
4915 |
that in existing implementations, specific pro- |
4916 |
grams may contain both posting-agent functions and |
4917 |
relayer functions. The distinction is that post- |
4918 |
ing-agent functions are invoked only on articles |
4919 |
posted by local posters, never on articles |
4920 |
received from other relayers. |
4921 |
|
4922 |
NOTE: A particular corollary of this rule is that |
4923 |
relayers should not add headers unless truly nec- |
4924 |
essary. In particular, this is not SMTP; do not |
4925 |
add Received headers. |
4926 |
|
4927 |
Relayers MUST not pass non-conforming articles on to other |
4928 |
relayers, except perhaps in a cooperating subnet that has |
4929 |
agreed to permit certain kinds of non-conforming behavior. |
4930 |
This is a direct consequence of the Internet Robustness |
4931 |
Principle. |
4932 |
|
4933 |
The two preceding paragraphs may appear to be in conflict. |
4934 |
What is to be done when a non-conforming article is |
4935 |
received? The Robustness Principle argues that it should be |
4936 |
accepted but must not be passed on to other relayers while |
4937 |
still non-conforming, and the Hippocratic Principle strongly |
4938 |
discourages attempts at repair. The conclusion that this |
4939 |
appears to lead to is correct: a non-conforming article MAY |
4940 |
be accepted for local filing and processing, or it MAY be |
4941 |
discarded entirely, but it MUST not be passed on to other |
4942 |
relayers. |
4943 |
|
4957 |
A relayer MUST not respond to the arrival of an article by |
4958 |
sending mail to any destination, other than a local adminis- |
4959 |
trator, except by explicit prearrangement with the recipi- |
4960 |
ent. Neither posting an article (other than certain types |
4961 |
of control message, see section 7.5) nor being the moderator |
4962 |
of a moderated newsgroup constitutes such prearrangement. |
4963 |
UNDER NO CIRCUMSTANCES WHATSOEVER may a relayer attempt to |
4964 |
send mail to either an article's originator or a moderator. |
4965 |
|
4966 |
NOTE: Reporting apparent errors in message compo- |
4967 |
sition is the job of a posting agent, not a |
4968 |
relayer. The same is true of mailing moderated- |
4969 |
newsgroup postings to moderators. In networks of |
4970 |
thousands of cooperating relayers, it is simply |
4971 |
unacceptable for there to be any circumstance |
4972 |
whatsoever that causes any significant fraction of |
4973 |
them to simultaneously send mail to the same des- |
4974 |
tination. (Some control messages are exceptions, |
4975 |
although perhaps ill-advised ones.) What might, |
4976 |
in a smaller network, be a useful notification or |
4977 |
forwarding becomes a deluge of near-identical mes- |
4978 |
sages that can bring mail software to its knees |
4979 |
and severely inconvenience recipients. Modera- |
4980 |
tors, in particular, historically have suffered |
4981 |
grievously from this. |
4982 |
|
4983 |
Notification of problems in incoming articles MAY go to |
4984 |
local administrators, or at most (by prearrangement!) to |
4985 |
the administrators of the neighboring relayer(s) that passed |
4986 |
on the problematic articles. |
4987 |
|
4988 |
NOTE: It would be desirable to notify the author |
4989 |
that his posting is not propagating as he expects. |
4990 |
However, there is no known method for doing this |
4991 |
that will scale up gracefully. (In particular, |
4992 |
"notify only if within N relayers of the origina- |
4993 |
tor" falls down in the presence of commercial news |
4994 |
services like UUNET: there may be hundreds or |
4995 |
thousands of relayers within a couple of hops of |
4996 |
the originator.) The best that can be done right |
4997 |
now is to notify neighbors, in hopes that the word |
4998 |
will eventually propagate up the line, or organize |
4999 |
regional monitoring at major hubs. |
5000 |
|
5001 |
If it is necessary to alter an article, e.g. translate it to |
5002 |
another character set or alter its EOL representation, |
5003 |
strenuous efforts should be made to ensure that such trans- |
5004 |
formations are reversible, and that relayers or other soft- |
5005 |
ware that might wish to reverse them know exactly how to do |
5006 |
so. |
5007 |
|
5008 |
NOTE: For example, a cooperating subnet that |
5009 |
exchanges articles using a non-ASCII character set |
5010 |
like EBCDIC should define a standard, reversible |
5023 |
ASCII-EBCDIC mapping and take pains to see that it |
5024 |
is used at all points where the subnet meets the |
5025 |
outside. If the only reason for using EBCDIC is |
5026 |
that the readers typically employ EBCDIC devices, |
5027 |
it would be more robust to employ ASCII as the |
5028 |
interchange format and do the transformation in |
5029 |
the reading and posting agents. |
5030 |
|
5031 |
|
5032 |
9.2. Article Acceptance And Propagation |
5033 |
|
5034 |
When a relayer first receives an article, it must decide |
5035 |
whether to accept it. (This applies regardless of whether |
5036 |
the article arrived by itself or as part of a batch, and in |
5037 |
principle regardless of whether it originated as a local |
5038 |
posting or as traffic from another relayer.) In a cooperat- |
5039 |
ing subnet with well-controlled propagation paths, some of |
5040 |
the tests specified here MAY be delegated to centrally- |
5041 |
located relayers; that is, relayers that can receive news |
5042 |
ONLY via one of the central relayers might simplify accep- |
5043 |
tance testing based on the assumption that incoming traffic |
5044 |
has already passed the full set of tests at a central |
5045 |
relayer. |
5046 |
|
5047 |
The wording that follows is based on a model in which arti- |
5048 |
cles arrive on a relayer's host before acceptance tests are |
5049 |
done. However, depending on the degree of integration of |
5050 |
the transport mechanisms and the relayer, some or all of |
5051 |
these tests MAY be done before the article is actually |
5052 |
transmitted, so that articles which definitely will not be |
5053 |
accepted need not be transmitted at all. |
5054 |
|
5055 |
The wording that follows also specifies a particular order |
5056 |
for the acceptance tests. While this order is the obvious |
5057 |
one, the tests MAY be done in any order. |
5058 |
|
5059 |
First, the relayer MUST verify that the article is a legal |
5060 |
news article, with all mandatory headers present with legal |
5061 |
contents. |
5062 |
|
5063 |
NOTE: This check in principle is done by the first |
5064 |
relayer to see an article, so an article received |
5065 |
from another relayer should always be legal, but |
5066 |
there is enough old software still operational |
5067 |
that this cannot be taken for granted; see the |
5068 |
discussion of the Internet Robustness Principle in |
5069 |
section 9.1. |
5070 |
|
5071 |
Second, the relayer MUST determine whether it has already |
5072 |
seen this article (identified by its message ID). This is |
5073 |
normally done by retaining a history of all article message |
5074 |
IDs seen in the last N days, where the value of N is decided |
5075 |
by the relayer's administrator but SHOULD be at least 7. |
5076 |
Since N cannot practically be infinite, articles whose Date |
5089 |
content indicates that they are older than N days are |
5090 |
declared "stale" and are deemed to have been seen already. |
5091 |
|
5092 |
NOTE: This check is important because news propa- |
5093 |
gation topology is typically redundant, often |
5094 |
highly so, and it is not at all uncommon for a |
5095 |
relayer to receive the same article from several |
5096 |
neighbors. The history of already-seen message |
5097 |
IDs can get quite large, hence the desire to limit |
5098 |
its length... but it is important that it be long |
5099 |
enough that slowly-propagating articles are not |
5100 |
classed as stale. News propagation within the |
5101 |
Internet is normally very rapid, but when UUCP |
5102 |
links are involved, end-to-end delays of several |
5103 |
days are not rare, so a week is not a particularly |
5104 |
generous minimum. |
5105 |
|
5106 |
NOTE: Despite generally more rapid propagation in |
5107 |
recent times, it is still not unheard-of for some |
5108 |
propagation paths to be very slow. This can |
5109 |
introduce the possibility of old articles arriving |
5110 |
again after they are gone from the history. Hence |
5111 |
the "stale" rule. |
5112 |
|
5113 |
Third, the relayer MUST determine whether any of the arti- |
5114 |
cle's newsgroups are "subscribed to" by the host, i.e. fit a |
5115 |
description of what hierarchies or newsgroups the site wants |
5116 |
to receive. |
5117 |
|
5118 |
NOTE: This check is significant because informa- |
5119 |
tion on what newsgroups a relayer wishes to |
5120 |
receive is often stored at its neighbors, who may |
5121 |
not have up-to-date information or may simplify |
5122 |
the rules for implementation reasons. As a hedge |
5123 |
against the possibility of missed or delayed new- |
5124 |
group control messages, relayers may wish to |
5125 |
observe a notion of a newsgroup subscription that |
5126 |
is independent of the list of newsgroups actually |
5127 |
known to the relayer. This would permit reception |
5128 |
and relaying of articles in newsgroups that the |
5129 |
relayer is not (yet) aware of, subject to more |
5130 |
general criteria indicating that they are likely |
5131 |
to be of interest. |
5132 |
|
5133 |
Once an article has been accepted, it may be passed on to |
5134 |
other relayers. The fundamental news propagation rule is a |
5135 |
flooding algorithm: on receiving and accepting an article, |
5136 |
send it to all neighboring relayers not already in its path |
5137 |
list that are sent its newsgroup(s) and distribution(s). |
5138 |
|
5139 |
NOTE: The path list's role in loop prevention may |
5140 |
appear relatively unimportant, given that looping |
5141 |
articles would typically be rejected as duplicates |
5142 |
anyway. However, the path list's role in |
5155 |
preventing superfluous transmissions is not triv- |
5156 |
ial. In particular, the path list is the only |
5157 |
thing that prevents relayer X, on receiving an |
5158 |
article from relayer Y, from sending it back to Y |
5159 |
again. (Indeed, the usual symptom of confusion |
5160 |
about relayer names is that incoming news loops |
5161 |
back in this manner.) The looping articles would |
5162 |
be rejected as duplicates, but doubling the commu- |
5163 |
nications load on every news transmission path is |
5164 |
not to be taken lightly! |
5165 |
|
5166 |
In general, relayers SHOULD not make propagation decisions |
5167 |
by "anticipation": relayer X, noting that the article's path |
5168 |
list already contains relayer Y, decides not to send it to |
5169 |
relayer Z because X anticipates that Z will get the article |
5170 |
by a better path. If that is generally true, then why is |
5171 |
there a news feed from X to Z at all? In fact, the "better |
5172 |
path" may be running slowly or may be down. News propaga- |
5173 |
tion is very robust precisely because some redundant trans- |
5174 |
mission is done "just in case". If it is imperative to |
5175 |
limit unnecessary traffic on a path, use of NNTP [rrr] or |
5176 |
ihave/sendme (see section 7.2) to pass articles only when |
5177 |
necessary is better than arbitrary decisions not to pass |
5178 |
articles at all. |
5179 |
|
5180 |
Anticipation is occasionally justified in special cases. |
5181 |
Such cases should involve both (1) a cooperating subnet |
5182 |
whose propagation paths are well-understood and well- |
5183 |
monitored, with failures and slowdowns noticed and dealt |
5184 |
with promptly, and (2) a persistent pattern of heavy unnec- |
5185 |
essary traffic on a path that is either slow or costly. In |
5186 |
addition, there should be some reason why neither NNTP nor |
5187 |
ihave/sendme is suitable as a solution to the problem. |
5188 |
|
5189 |
|
5190 |
9.3. Administrator Contact |
5191 |
|
5192 |
It is desirable to have a standardized contact address for a |
5193 |
relayer's administrators, in the spirit of the "postmaster" |
5194 |
address for mail administrators. Mail addressed to "news- |
5195 |
master" on a relayer's host MUST go to the administrator(s) |
5196 |
of that relayer. Mail addressed to "usenet" on the |
5197 |
relayer's host SHOULD be handled likewise. Mail addressed |
5198 |
to either address on other hosts using the same news |
5199 |
database SHOULD be handled likewise. |
5200 |
|
5201 |
NOTE: These addresses are case-sensitive, although |
5202 |
it would be desirable for sequences equivalent to |
5203 |
them using case-insensitive comparison to be han- |
5204 |
dled likewise. While "newsmaster" seems the pre- |
5205 |
ferred network-independent address, by analogy to |
5206 |
"postmaster", there is an existing practice of |
5207 |
using "usenet" for this purpose, and so "usenet" |
5208 |
should be supported if at all possible (especially |
5221 |
on hosts belonging to Usenet!). The address |
5222 |
"news" is also sometimes used for purposes like |
5223 |
this, but less consistently. |
5224 |
|
5225 |
|
5226 |
10. Gatewaying |
5227 |
|
5228 |
Gatewaying of traffic between news networks using this Draft |
5229 |
and those using other exchange mechanisms can be useful, but |
5230 |
must be done cautiously. Gateway administrators are taking |
5231 |
on significant responsibilities, and must recognize that the |
5232 |
consequences of error can be quite serious. |
5233 |
|
5234 |
|
5235 |
10.1. General Gatewaying Issues |
5236 |
|
5237 |
This section will primarily address the problems of gateway- |
5238 |
ing traffic INTO news networks. Little can be said about |
5239 |
the other direction without some specific knowledge of the |
5240 |
network(s) involved. However, the two issues are not |
5241 |
entirely independent: if a non-news network is gatewayed |
5242 |
into a news network at more than one point, traffic injected |
5243 |
into the non-news network by one gateway may appear at |
5244 |
another as a candidate for injection back into the news net- |
5245 |
work. |
5246 |
|
5247 |
This raises a more general principle, the single most impor- |
5248 |
tant issue for gatewaying: |
5249 |
|
5250 |
Above all, prevent loops. |
5251 |
|
5252 |
The normal loop prevention of news transmission is vitally |
5253 |
dependent on the Message-ID header. Any gateway which finds |
5254 |
it necessary to remove this header, alter it, or supersede |
5255 |
it (by moving it into the body), MUST take equally effective |
5256 |
precautions against looping. |
5257 |
|
5258 |
NOTE: There are few things more effective at turn- |
5259 |
ing news readers into a lynch mob than a malfunc- |
5260 |
tioning gateway, or pair of gateways, that takes |
5261 |
in news articles, mangles them just enough to pre- |
5262 |
vent news relayers from recognizing them as dupli- |
5263 |
cates, and regurgitates them back into the news |
5264 |
stream. This happens rather too often. |
5265 |
|
5266 |
Gateway implementors should realize that gateways have all |
5267 |
the responsibilities of relayers, plus the added complica- |
5268 |
tions introduced by transformations between different infor- |
5269 |
mation formats. Much of section 9's discussion of relayer |
5270 |
issues is relevant to gateways as well. In particular, |
5271 |
gateways SHOULD keep a history of recently-seen articles, as |
5272 |
described in section 9.2, and not assume that articles will |
5273 |
never reappear. This is particularly important for networks |
5274 |
that have their own concept analogous to message IDs: a |
5287 |
gateway should keep a history of traffic seen from BOTH |
5288 |
directions. |
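
A two-sided history of this kind might be sketched as follows, in Python. This is a minimal illustration, not anything prescribed by this Draft: the class name GatewayHistory, the "news"/"other" labels, the retention period, and the in-memory table are all assumptions, and a real gateway would need persistent storage.

```python
import time


class GatewayHistory:
    """Hypothetical record of identifiers seen on either side of a gateway."""

    def __init__(self, retention_seconds=14 * 24 * 3600, clock=time.time):
        # The retention period is an assumed value, not taken from this Draft.
        self.retention = retention_seconds
        self.clock = clock
        self.seen = {}  # (network, identifier) -> time recorded

    def _expire(self):
        """Forget entries older than the retention period."""
        cutoff = self.clock() - self.retention
        self.seen = {k: t for k, t in self.seen.items() if t >= cutoff}

    def is_known(self, network, identifier):
        """True if this identifier was already seen on, or sent to, network."""
        self._expire()
        return (network, identifier) in self.seen

    def record(self, news_id, other_id):
        """Remember an article under BOTH its news ID and its other-side ID."""
        now = self.clock()
        self.seen[("news", news_id)] = now
        self.seen[("other", other_id)] = now


history = GatewayHistory()
# An article gatewayed out of news is recorded under both identifiers...
history.record("<1234@example.com>", "LIST-0042")
# ...so its reappearance from the other side can be recognized and dropped.
print(history.is_known("other", "LIST-0042"))   # True
print(history.is_known("other", "LIST-0043"))   # False
```

Recording both identifiers at the moment of gatewaying is what lets the gateway recognize its own traffic when it returns from the far side under a different name.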
5289 |
|
5290 |
If at all possible, articles entering the non-news network |
5291 |
SHOULD be marked in some way so that they will NOT be re- |
5292 |
gatewayed back into news. Multiple gateways obviously must |
5293 |
agree on the marking method used; if it is done by having |
5294 |
them know each other's names, name changes MUST be |
5295 |
coordinated with great care. If marking cannot be done, all |
5296 |
transformations MUST be reversible so that a re-gatewayed |
5297 |
article is identical to the original (except perhaps for a |
5298 |
longer Path header). |
5299 |
|
5300 |
Gateways MUST NOT pass control messages (articles containing |
5301 |
Control, Also-Control, or Supersedes headers) without remov- |
5302 |
ing the headers that make them control messages, unless |
5303 |
there are compelling reasons to believe that they are rele- |
5304 |
vant to both sides and that conventions are compatible. If |
5305 |
it is truly desirable to pass them unaltered, suitable pre- |
5306 |
cautions MUST be taken to ensure that there is NO POSSIBIL- |
5307 |
ITY of a looping control message. |
5308 |
|
5309 |
NOTE: The damage done by looping articles is mul- |
5310 |
tiplied a thousandfold if one of the affected |
5311 |
articles is something like a sendsys message (see |
5312 |
section 7.3) that requests multiple automatic |
5313 |
replies. Most gateways simply should not pass |
5314 |
control messages at all. If some unusual reason |
5315 |
dictates doing so, gateway implementors and admin- |
5316 |
istrators are urged to consider bulletproof rate- |
5317 |
limiting measures for the more destructive ones |
5318 |
like sendsys, e.g. passing only one per hour no |
5319 |
matter how many are offered. |
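
Removal of the headers that make an article a control message could look roughly like the Python sketch below. The function name and the representation of headers as (name, value) pairs are assumptions made for illustration. Matching header names without regard to case is also an assumption; it errs on the side of removing too much rather than too little.

```python
# Headers named in this section as making an article a control message.
CONTROL_HEADERS = {"control", "also-control", "supersedes"}


def strip_control_headers(headers):
    """Return the headers with Control, Also-Control and Supersedes removed.

    headers is a list of (name, value) pairs (a hypothetical representation).
    """
    return [(name, value) for name, value in headers
            if name.lower() not in CONTROL_HEADERS]


article = [("From", "a@example.com"),
           ("Control", "sendsys"),
           ("Subject", "test")]
print(strip_control_headers(article))
```

An article treated this way crosses the gateway as an ordinary article, so relayers on the far side take no control action on it.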
5320 |
|
5321 |
Gateways, like relayers, SHOULD make determined efforts to |
5322 |
avoid mangling articles unnecessarily. In the case of gate- |
5323 |
ways, some transformations may be inevitable, but keeping |
5324 |
them to a minimum and ensuring that they are reversible is |
5325 |
still highly desirable. |
5326 |
|
5327 |
Gateways MUST avoid destroying information. In particular, |
5328 |
the restrictions of section 4.2.2 are best taken with a |
5329 |
grain of salt in the context of gateways. Information that |
5330 |
does not translate directly into news headers SHOULD be |
5331 |
retained, perhaps in "X-" headers, both because it may be of |
5332 |
interest to sophisticated readers and because it may be cru- |
5333 |
cial to tracing propagation problems. |
5334 |
|
5335 |
Gateway implementors should take particular note of the dis- |
5336 |
cussion of mailed replies, or more precisely the ban on |
5337 |
same, in section 9.1. Gateway problems MUST be reported to |
5338 |
the local administration, not to the innocent originator of |
5339 |
traffic. "Gateway problems" here includes all forms of |
5340 |
propagation anomaly on the non-news side of the gateway, |
5353 |
e.g. unreachable addresses on a mailing list. Note that |
5354 |
this requires consideration of possible misbehavior of |
5355 |
"downstream" hosts, not just the gateway host. |
5356 |
|
5357 |
|
5358 |
10.2. Header Synthesis |
5359 |
|
5360 |
News articles prepared by gateways MUST be legal news arti- |
5361 |
cles. In particular, they MUST include all of the mandatory |
5362 |
headers (see section 5) and MUST fully conform to the |
5363 |
restrictions on said headers. This often requires that a |
5364 |
gateway function not only as a relayer, but also partly as a |
5365 |
posting agent, aiding in the synthesis of a conforming arti- |
5366 |
cle from non-conforming input. |
5367 |
|
5368 |
NOTE: The full-conformance requirement needs par- |
5369 |
ticularly careful attention when gatewaying mail- |
5370 |
ing lists to news, because a number of constructs |
5371 |
that are legal in MAIL headers are NOT permissible |
5372 |
in news headers. (Note also that not all mail |
5373 |
traffic fully conforms to even the MAIL specifica- |
5374 |
tion.) The rest of this section will be phrased |
5375 |
in terms of mail-to-news gatewaying, but most of |
5376 |
it is more generally applicable. |
5377 |
|
5378 |
The mandatory headers generally present few problems. |
5379 |
|
5380 |
If no date information is available, the gateway should sup- |
5381 |
ply a Date header with the gateway's current date. If only |
5382 |
partial information is available (e.g. date but not time), |
5383 |
this should be fleshed out to a full Date header by adding |
5384 |
default values, not by mixing in parts of the gateway's cur- |
5385 |
rent date. (Defaults should be chosen so that fleshed-out |
5386 |
dates will not be in the future!) It may be necessary to |
5387 |
map timezone information to the restricted forms permitted |
5388 |
in the news Date header. See section 5.1. |
5389 |
|
5390 |
NOTE: The prohibition of mixing dates is on the |
5391 |
theory that it is better to admit ignorance than |
5392 |
to lie. |
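
The rule about partial dates can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name is hypothetical, the default of midnight GMT is one possible choice of default (picked because the start of a day is the earliest instant within it), and the output form produced by the standard email.utils module should be checked against section 5.1 rather than taken as conforming.

```python
from datetime import date, datetime, time, timezone
from email.utils import format_datetime


def synthesize_date(known_date=None, known_time=None, now=None):
    """Build a Date header value from possibly partial information.

    known_date is a datetime.date or None; known_time is a datetime.time
    (taken to be GMT) or None.  now exists only to make testing possible.
    """
    if known_date is None:
        # No date information at all: use the gateway's current date.
        stamp = now if now is not None else datetime.now(timezone.utc)
    else:
        # Partial information: fill in a fixed default rather than
        # borrowing the time of day from the gateway's clock.
        if known_time is None:
            known_time = time(0, 0, 0)
        stamp = datetime.combine(known_date, known_time, tzinfo=timezone.utc)
    return format_datetime(stamp, usegmt=True)


print(synthesize_date(date(1982, 11, 19)))   # Fri, 19 Nov 1982 00:00:00 GMT
```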
5393 |
|
5394 |
If the author's address as supplied in the original message |
5395 |
is not suitable for inclusion in a From header, the gateway |
5396 |
MUST transform it so it is, e.g. by use of the "% hack" and |
5397 |
the domain address of the gateway. The desire to preserve |
5398 |
information is NOT an excuse for violating the rules. If |
5399 |
the transformation is drastic enough that there is reason to |
5400 |
suspect loss of information, it may be desirable to include |
5401 |
the original form in an X- header, but the From header's |
5402 |
contents MUST be as specified in section 5.2. |
5403 |
|
5404 |
If the message contains a Message-ID header, the contents |
5405 |
should be dealt with as discussed in section 10.3. If there |
5406 |
is no message ID present, it will be necessary to synthesize |
5419 |
one, following the news rules (see section 5.3). |
5420 |
|
5421 |
Every effort should be made to produce a meaningful Subject |
5422 |
header; see section 5.4. Many news readers select articles |
5423 |
to read based on Subject headers, and inserting a place- |
5424 |
holder like "<no subject available>" is considered highly |
5425 |
objectionable. Even synthesizing a Subject header by pick- |
5426 |
ing out the first half-dozen nouns and adjectives in the |
5427 |
article body is better than using a placeholder, since it |
5428 |
offers SOME indication of what the article might contain. |
5429 |
|
5430 |
The contents of the Newsgroups header (section 5.5) are usu- |
5431 |
ally predetermined by gateway configuration, but a gateway |
5432 |
to a network that has its own concept of newsgroups or dis- |
5433 |
cussions might have to make transformations. Such transfor- |
5434 |
mations should be reversible; otherwise confusion is likely |
5435 |
on both sides. |
5436 |
|
5437 |
It will rarely be possible for gateways to provide a Path |
5438 |
header that is both an accurate history of the relayers the |
5439 |
article has passed through AS NEWS and a usable reply |
5440 |
address. The history function MUST be given priority; see |
5441 |
the discussion in section 5.6. It will usually be necessary |
5442 |
for a gateway to supply an empty path list, abandoning the |
5443 |
reply function. |
5444 |
|
5445 |
It is desirable for gatewayed articles to convey as much |
5446 |
useful information as possible, e.g. by use of optional news |
5447 |
headers (see section 6) when the relevant information is |
5448 |
available. Synthesis of optional headers can generally fol- |
5449 |
low similar rules. |
5450 |
|
5451 |
Software synthesizing References headers should note the |
5452 |
discussion in section 6.5 concerning the incompatibility |
5453 |
between MAIL and news. Also of interest is the possibility |
5454 |
of incorporating information from In-Reply-To headers and |
5455 |
from attribution lines in the body; an incomplete or some- |
5456 |
what conjectural References header is much better than none |
5457 |
at all, and reading agents already have to cope with incom- |
5458 |
plete or slightly erroneous References lists. |
5459 |
|
5460 |
|
5461 |
10.3. Message ID Mapping |
5462 |
|
5463 |
This section, like the previous one, is phrased in terms of |
5464 |
mail being gatewayed into news, but most of the discussion |
5465 |
should be more generally applicable. |
5466 |
|
5467 |
A particularly sticky problem of gatewaying mail into news |
5468 |
is supplying legal news message IDs. Note, in particular, |
5469 |
that not all MAIL message IDs are legal in news; the news |
5470 |
syntax (specified in section 5.3, with related material in |
5471 |
5.2) is more restrictive. Generating a fully-conforming |
5472 |
news article from a mail message may require transforming |
5485 |
the message ID somewhat. |
5486 |
|
5487 |
Generation and transformation of message IDs assumes partic- |
5488 |
ular importance if a given mailing list (or whatever) is |
5489 |
being handled by more than one gateway. It is highly desir- |
5490 |
able that the same article contents not appear twice in the |
5491 |
same newsgroup, which requires that they receive the same |
5492 |
message ID from all gateways. Gateways SHOULD use the fol- |
5493 |
lowing algorithm (possibly modified by the later discussion |
5494 |
of gatewaying into more than one newsgroup) unless local |
5495 |
considerations dictate another: |
5496 |
|
5497 |
1. Separate message ID from surroundings, if necessary. |
5498 |
A plausible method for this is to start at the first |
5499 |
"<", end at the next ">", and reject the message if |
5500 |
no ">" is found or a second "<" is seen before the |
5501 |
">". Also reject the message if the message ID con- |
5502 |
tains no "@" or more than one "@", or if it contains |
5503 |
no ".". Also reject the message if the message ID |
5504 |
contains non-ASCII characters, ASCII control charac- |
5505 |
ters, or white space. |
5506 |
|
5507 |
NOTE: Any legitimate domain will include at |
5508 |
least one ".". RFC 822 section 6.2.2 forbids |
5509 |
white space in this context when passing mail |
5510 |
on to non-MAIL software. |
5511 |
|
5512 |
2. Delete the leading "<" and trailing ">". Separate |
5513 |
message ID into local part and domain at the "@". |
5514 |
|
5515 |
3. In both components, transliterate leading dots |
5516 |
(".", ASCII 46), trailing dots, and dots after the |
5517 |
first in sequences of two or more consecutive |
5518 |
dots, into underscores (ASCII 95). |
5519 |
|
5520 |
4. In both components, transliterate disallowed char- |
5521 |
acters other than dots (see the definition of |
5522 |
<unquoted-char> in section 5.2) to underscores |
5523 |
(ASCII 95). |
5524 |
|
5525 |
5. Form the message ID as |
5526 |
|
5527 |
"<" local-part "@" domain ">" |
5528 |
|
5529 |
|
5530 |
NOTE: This algorithm is approximately that of Rich |
5531 |
Salz's successful gatewaying package. |
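
The five steps above can be sketched in Python as follows. This is an illustration only, not the code of any existing gatewaying package. The function and exception names are hypothetical, and the set DISALLOWED is an assumed stand-in for the characters excluded from <unquoted-char>; it must be checked against section 5.2. Treating every dot in a trailing run as a "trailing dot" is likewise one reading of step 3.

```python
import re

# ASSUMPTION: characters (other than the dot) excluded from <unquoted-char>
# in section 5.2.  Verify this set against that section before relying on it.
DISALLOWED = set('!()<>@,;:\\"[]')


class Rejected(ValueError):
    """Raised when the message must be rejected (step 1)."""


def _underscores(match):
    return "_" * len(match.group())


def fix_component(part, disallowed=DISALLOWED):
    """Apply steps 3 and 4 to a local part or a domain."""
    part = re.sub(r"^\.+", _underscores, part)    # leading dots
    part = re.sub(r"\.+$", _underscores, part)    # trailing dots
    # Dots after the first in a run of two or more consecutive dots.
    part = re.sub(r"\.{2,}", lambda m: "." + "_" * (len(m.group()) - 1), part)
    # Step 4: remaining disallowed characters become underscores.
    return "".join("_" if ch in disallowed else ch for ch in part)


def map_message_id(field, disallowed=DISALLOWED):
    """Map the contents of a mail Message-ID field to a news message ID."""
    # Step 1: separate the message ID from its surroundings and validate it.
    start = field.find("<")
    end = field.find(">", start + 1) if start >= 0 else -1
    if start < 0 or end < 0:
        raise Rejected("no <...> found")
    inner = field[start + 1:end]       # step 2: "<" and ">" are dropped here
    if "<" in inner:
        raise Rejected("second '<' before '>'")
    if inner.count("@") != 1 or "." not in inner:
        raise Rejected("need exactly one '@' and at least one '.'")
    if any(ord(ch) <= 32 or ord(ch) >= 127 for ch in inner):
        raise Rejected("non-ASCII, control character, or white space")
    # Step 2 (continued): split into local part and domain at the "@".
    local, domain = inner.split("@")
    # Steps 3 to 5: clean both components and reassemble.
    return "<%s@%s>" % (fix_component(local, disallowed),
                        fix_component(domain, disallowed))


print(map_message_id("<foo..bar@example.com>"))   # <foo._bar@example.com>
```

Because the mapping depends only on the original message ID, two gateways running the same code produce the same news message ID for the same mail message, which is the property this section is after.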
5532 |
|
5533 |
Despite the desire to keep message IDs consistent across |
5534 |
multiple gateways, there is also a more subtle issue that |
5535 |
can require a different approach. If the same articles are |
5536 |
being gatewayed into more than one newsgroup, and it is not |
5537 |
possible to arrange that all gateways gateway them to the |
5538 |
same cross-posted set of newsgroups, then the message IDs in |
5551 |
the different newsgroups MUST be DIFFERENT. |
5552 |
|
5553 |
NOTE: Otherwise, arrival of an article in one |
5554 |
newsgroup will prevent it from appearing in |
5555 |
another, and which newsgroup a particular article |
5556 |
appears in will be an accident of which direction |
5557 |
it arrives from first. It is very difficult to |
5558 |
maintain a coherent discussion when each partici- |
5559 |
pant sees a randomly-selected 50% of the traffic. |
5560 |
The fundamental problem here is that the basic |
5561 |
assumption behind message IDs is being violated: |
5562 |
the gateways are assigning the same message ID to |
5563 |
articles that differ in an important respect |
5564 |
(Newsgroups header). |
5565 |
|
5566 |
In such cases, it is suggested that the newsgroup name, or |
5567 |
an agreed-on abbreviation thereof, be prepended to the local |
5568 |
part of the message ID (with a separating ".") by the gate- |
5569 |
way. This will ensure that multiple gateways generate the |
5570 |
same message ID, while also ensuring that different news- |
5571 |
groups can be read independently. |
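
As a small illustration of the suggestion, the Python sketch below prepends a newsgroup name to the local part. The function name is hypothetical, and the sketch assumes the message ID passed in is already a legal news message ID containing exactly one "@".

```python
def prefix_message_id(message_id, newsgroup):
    """Prepend newsgroup (or an agreed abbreviation) to the local part."""
    local, domain = message_id[1:-1].split("@", 1)
    return "<%s.%s@%s>" % (newsgroup, local, domain)


print(prefix_message_id("<1234@example.com>", "comp.lang.c"))
# <comp.lang.c.1234@example.com>
```

All gateways feeding a given newsgroup must of course use the same name or abbreviation, or the message IDs will again differ between gateways.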
5572 |
|
5573 |
NOTE: It is preferable to have the gateway(s) |
5574 |
cross-post the article, avoiding the issue alto- |
5575 |
gether, but this may not be feasible, especially |
5576 |
if one newsgroup is widespread and the other is |
5577 |
purely local. |
5578 |
|
5579 |
|
5580 |
10.4. Mail to and from News |
5581 |
|
5582 |
Gatewaying mail to news, and vice-versa, is the most obvious |
5583 |
form of news gatewaying. It is common to set up gateways |
5584 |
between news and mail rather too casually. |
5585 |
|
5586 |
It is hard to go very wrong in gatewaying news into a mail- |
5587 |
ing list, except for the non-trivial matter of making sure |
5588 |
that error reports go to the local administration rather |
5589 |
than to the authors of news articles. (This requires atten- |
5590 |
tion to the "envelope address" as well as to the message |
5591 |
headers.) Doing the reverse connection correctly is much |
5592 |
harder than it looks. |
5593 |
|
5594 |
NOTE: In particular, just feeding the mail message |
5595 |
to "inews -h" or the equivalent is NOT, repeat |
5596 |
NOT, adequate to gateway mail to news. Signifi- |
5597 |
cant gatewaying software is necessary to do it |
5598 |
right. Not all headers of mail messages conform |
5599 |
to even the MAIL specifications, never mind the |
5600 |
stricter rules for news. |
5601 |
|
5602 |
It is useful to distinguish between two different forms of |
5603 |
mail-to-news gatewaying: gatewaying a mailing list into a |
5604 |
newsgroup, and operating a "post-by-mail" service in which |
5617 |
individual articles can be posted to a newsgroup by mailing |
5618 |
them to a specific address. In the first case, the message |
5619 |
is already being "broadcast", and the situation can be |
5620 |
viewed as gatewaying one form of news into another. The |
5621 |
second case is closer to that of a moderator posting submis- |
5622 |
sions to a moderated newsgroup. |
5623 |
|
5624 |
In either case, the discussions in the preceding two sec- |
5625 |
tions are relevant, as is the Hippocratic Principle of sec- |
5626 |
tion 9. However, some additional considerations are spe- |
5627 |
cific to mail-to-news gatewaying. |
5628 |
|
5629 |
As mentioned in section 6, point-to-point headers like To |
5630 |
and Cc SHOULD NOT appear as such in news, although it is |
5631 |
suggested that they be transformed to "X-" headers, e.g. X- |
5632 |
To and X-Cc, to preserve their information content for pos- |
5633 |
sible use by readers or troubleshooters. The Received |
5634 |
header is entirely specific to MAIL and SHOULD be deleted |
5635 |
completely during gatewaying, except perhaps for the |
5636 |
Received header supplied by the gateway host itself. |
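
The treatment of To, Cc, and Received might be sketched in Python as follows. The function name and the (name, value) representation are assumptions for illustration. For simplicity the sketch deletes every Received header, including the one the gateway host itself may have supplied, and it matches header names without regard to case.

```python
# Point-to-point headers to be preserved under "X-" names.
RENAME = {"to": "X-To", "cc": "X-Cc"}


def mail_headers_to_news(headers):
    """Rename To/Cc to X-To/X-Cc and drop Received headers.

    headers is a list of (name, value) pairs (a hypothetical representation).
    """
    result = []
    for name, value in headers:
        key = name.lower()
        if key == "received":
            continue                      # entirely specific to MAIL
        result.append((RENAME.get(key, name), value))
    return result


mail = [("To", "list@example.org"),
        ("Received", "from a.example by b.example"),
        ("Subject", "hello")]
print(mail_headers_to_news(mail))
```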
5637 |
|
5638 |
The Sender header is a tricky case, one where mailing-list |
5639 |
and post-by-mail practice should differ. For gatewaying |
5640 |
mailing lists, the mailing-list host should be considered a |
5641 |
relayer, and the From and Sender headers supplied in its |
5642 |
transmissions left strictly untouched. For post-by-mail, as |
5643 |
for a moderator posting a mailed submission, the Sender |
5644 |
header should reflect the poster rather than the author. If |
5645 |
a post-by-mail gateway receives a message with its own |
5646 |
Sender header, it might wish to preserve the content in an |
5647 |
X-Sender header. |
5648 |
|
5649 |
It will generally be necessary to transform between mail's |
5650 |
In-Reply-To/References convention and news's References/See- |
5651 |
Also convention, to preserve correct semantics of cross ref- |
5652 |
erences. This also requires attention when going the other |
5653 |
way, from news to mail. See the discussion of the differ- |
5654 |
ence in section 6.5. |
5655 |
|
5656 |
|
5657 |
10.5. Gateway Administration |
5658 |
|
5659 |
Any news system will benefit from an attentive administra- |
5660 |
tor, preferably assisted by automated monitoring for anoma- |
5661 |
lies. This is particularly true of gateways. Gateway soft- |
5662 |
ware SHOULD be instrumented so that unusual occurrences, |
5663 |
such as sudden massive surges in traffic, are reported |
5664 |
promptly. It is desirable, in fact, to go further: gateway |
5665 |
software SHOULD endeavour to limit damage in the event that |
5666 |
the administrator does not respond promptly. |
5667 |
|
5668 |
NOTE: For example, software might limit the gate- |
5669 |
waying rate by queueing incoming traffic and emp- |
5670 |
tying the queue at a finite maximum rate (well |
5683 |
below the maximum that the host is capable of!) |
5684 |
which is set by the administrator and is not |
5685 |
raised automatically. |
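
A rate limit of the kind described in the NOTE could be sketched in Python as below. The class name, the choice of an hourly figure, and the in-memory queue are assumptions for illustration; a real gateway would also want the queue kept on disk and its length monitored.

```python
import collections
import time


class ThrottledQueue:
    """Queue incoming traffic and release it no faster than a fixed rate."""

    def __init__(self, max_per_hour, clock=time.monotonic):
        # max_per_hour is set by the administrator; nothing here raises it.
        self.interval = 3600.0 / max_per_hour
        self.clock = clock
        self.pending = collections.deque()
        self.next_release = clock()

    def offer(self, article):
        """Accept an incoming article for later gatewaying."""
        self.pending.append(article)

    def release(self):
        """Return one article if the rate allows it, otherwise None."""
        now = self.clock()
        if not self.pending or now < self.next_release:
            return None
        # Measuring from now, not from the previous slot, prevents a burst
        # after the queue has been idle.
        self.next_release = now + self.interval
        return self.pending.popleft()


queue = ThrottledQueue(max_per_hour=60)
queue.offer("article 1")
queue.offer("article 2")
print(queue.release())   # article 1
print(queue.release())   # None (the next slot has not arrived yet)
```

With this arrangement a sudden surge only lengthens the queue, which is itself an anomaly worth reporting to the administrator.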
5686 |
|
5687 |
Traffic gatewayed into a news network SHOULD include a suit- |
5688 |
able header, perhaps X-Gateway-Administrator, giving an |
5689 |
electronic address that can be used to report problems. |
5690 |
This SHOULD be an address that goes direct to a human, not |
5691 |
to a "routine administrative issues" mailbox that is exam- |
5692 |
ined only occasionally, since the point is to be able to |
5693 |
reach the administrator quickly in an emergency. Gateway |
5694 |
administrators SHOULD arrange substitutes to cover gateway |
5695 |
operation (with suitable redirection of mail) when they are |
5696 |
on vacation etc. |
5697 |
|
5698 |
|
5699 |
11. Security And Related Issues |
5700 |
|
5701 |
Although the interchange format itself raises no significant |
5702 |
security issues, the wider context does. |
5703 |
|
5704 |
|
5705 |
11.1. Leakage |
5706 |
|
5707 |
The most obvious form of security problem with news is |
5708 |
"leakage" of articles which are intended to have only |
5709 |
restricted circulation. The flooding algorithm is EXTREMELY |
5710 |
good at finding any path by which articles can leave a sub- |
5711 |
net with supposedly-restrictive boundaries. Substantial |
5712 |
administrative effort is required to ensure that local news- |
5713 |
groups remain local, unless connections to the outside world |
5714 |
are tightly restricted. |
5715 |
|
5716 |
A related problem is that the sendme control message can be |
5717 |
used to ask for any article by its message ID. The useful- |
5718 |
ness of this has declined as message-ID generation algo- |
5719 |
rithms have become less predictable, but it remains a poten- |
5720 |
tial problem for "secure" newsgroups. Hosts with such news- |
5721 |
groups may wish to disable the sendme control message |
5722 |
entirely. |
5723 |
|
5724 |
The sendsys, version, and whogets control messages also |
5725 |
allow "outsiders" to request information from "inside", |
5726 |
which may reveal details of internal topology (etc.) that |
5727 |
are considered confidential. (Note that at least limited |
5728 |
openness about such matters may be a condition of membership |
5729 |
in such networks, e.g. Usenet.) |
5730 |
|
5731 |
Organizations wishing to control these forms of leakage are |
5732 |
strongly advised to designate a small number of "official |
5733 |
gateway" hosts to handle all news exchange with the outside |
5734 |
world, so that a bounded amount of administrative effort is |
5735 |
needed to control propagation and eliminate problems. |
5736 |
Attempts to keep news out entirely, by refusing to support |
5749 |
an official gateway, typically result in large numbers of |
5750 |
unofficial partial gateways appearing over time. Such a |
5751 |
configuration is much more difficult to troubleshoot. |
5752 |
|
5753 |
A somewhat-related problem is the possibility of proprietary |
5754 |
material being disclosed unintentionally by a poster who |
5755 |
does not realize how far his words will propagate, either |
5756 |
from sheer misunderstanding or because of errors made (by |
5757 |
human or software) in followup preparation. There is little |
5758 |
that can be done about this except education. |
5759 |
|
5760 |
|
5761 |
11.2. Attacks |
5762 |
|
5763 |
Although the limitations of the medium restrict what can be |
5764 |
done to attack a host via news, some possibilities exist, |
5765 |
most of them problems news shares with mail. |
5766 |
|
5767 |
If reading agents are careless about transmitting non- |
5768 |
printable characters to output devices, malicious posters |
5769 |
may post articles containing control sequences ("letter- |
5770 |
bombs") meant to have various destructive effects on output |
5771 |
devices. Possible effects depend on the device, but they |
5772 |
can include hardware damage (e.g. by repeated writing of |
5773 |
values into configuration memories that can tolerate only a |
5774 |
limited number of write cycles) and security violation (e.g. |
5775 |
by reprogramming function keys potentially used by privi- |
5776 |
leged readers). |
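
One defensive measure a reading agent might take is to render control characters harmless before they reach the output device. The Python sketch below is a minimal illustration under assumed policy choices: newline and tab pass through, other control characters are shown in caret or hexadecimal notation, and remaining non-ASCII characters are left alone. The function name is hypothetical.

```python
def make_safe_for_display(text):
    """Replace control characters with visible, inert representations."""
    safe = []
    for ch in text:
        code = ord(ch)
        if ch in "\n\t" or 32 <= code <= 126 or code >= 160:
            safe.append(ch)                    # ordinary text
        elif code < 32:
            safe.append("^" + chr(code + 64))  # e.g. ESC becomes ^[
        elif code == 127:
            safe.append("^?")                  # DEL
        else:
            safe.append("\\x%02x" % code)      # 8-bit controls, 128 to 159
    return "".join(safe)


print(make_safe_for_display("before\x1b[2Jafter"))   # before^[[2Jafter
```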
5777 |
|
5778 |
A more sophisticated variation on the letterbomb is inclu- |
5779 |
sion of "Trojan horses" in programs. Obviously, readers |
5780 |
must be cautious about using software found in news, but |
5781 |
more subtly, reading agents must also exercise care. MIME |
5782 |
messages can include material that is executable in some |
5783 |
sense, such as PostScript documents (which are programs!), |
5784 |
and letterbombs may be introduced into such material. |
5785 |
|
5786 |
Given the presence of finite resources and other software |
5787 |
limitations, some degree of system disruption can be |
5788 |
achieved by posting otherwise-innocent material in great |
5789 |
volume, either in single huge articles (see section 4.6) or |
5790 |
in a stream of modest-sized articles. (Some would say that |
5791 |
the steady growth of Usenet volume constitutes a subtle and |
5792 |
unintentional attack of the latter type; certainly it can |
5793 |
have disruptive effects if administrators are inattentive.) |
5794 |
Systems need some ability to cope with surges, because sin- |
5795 |
gle huge articles occur occasionally as the result of soft- |
5796 |
ware error, innocent misunderstanding, or deliberate malice, |
5797 |
and downtime at upstream hosts can cause droughts, followed |
5798 |
by floods, of legitimate articles. (There is also a certain |
5799 |
amount of normal variation; for example, Usenet traffic is |
5800 |
noticeably lighter on weekends and during Christmas holi- |
5801 |
days, and rises noticeably at the start of the school term |
5802 |
of North American universities.) However, a site that |
5815 |
normally receives little traffic may be quite vulnerable to |
5816 |
"swamping" attack if its software is insufficiently careful. |
5817 |
|
5818 |
In general, careless implementation may open doors that are |
5819 |
not intrinsic to news. In particular, implementation of |
5820 |
control messages (see sections 6.6 and 7) and unbatchers |
5821 |
(see sections 8.1 and 8.2) via a command interpreter requires |
5822 |
substantial precautions to ensure that only the intended |
5823 |
capabilities are available. Care must also be taken that |
5824 |
article-supplied text is not fed to programs that have |
5825 |
escapes to command interpreters. |
5826 |
|
5827 |
Finally, there is considerable potential for malice in the |
5828 |
sendsys, version, and whogets control messages. They are |
5829 |
not harmful to the hosts receiving them as news, but they |
5830 |
can be used to enlist those hosts (by the thousands) as |
5831 |
unwitting allies in a mail-swamping attack on a victim who |
5832 |
may not even receive news. The precautions discussed in |
5833 |
section 7.5 can reduce the potential for such attacks con- |
5834 |
siderably, but the hazard cannot be eliminated as long as |
5835 |
these control messages exist. |
5836 |
|
5837 |
|
5838 |
11.3. Anarchy |
5839 |
|
5840 |
The highly distributed nature of news propagation, and the |
5841 |
lack of adequate authentication protocols (especially for |
5842 |
use over the less-interactive transport mechanisms such as |
5843 |
UUCP), make article forgery relatively straightforward. It |
5844 |
may be possible to at least track a forgery to its source, |
5845 |
once it is recognized as such, but clever forgers can make |
5846 |
even that relatively difficult. The assumption that forg- |
5847 |
eries will be recognized as such is also not to be taken for |
5848 |
granted; readers are notoriously prone to blindly assuming |
5849 |
authenticity. If a forged article's initial path list |
5850 |
includes the relayer name of the supposed poster's host, the |
5851 |
article will never be sent to that host, and the alleged |
5852 |
author may learn about the forgery secondhand or not at all. |
5853 |
|
5854 |
A particularly noxious form of forgery is the forged "can- |
5855 |
cel" control message. Notably, it is relatively straight- |
5856 |
forward to write software that will automatically send out a |
5857 |
(forged) cancel message for any article meeting some crite- |
5858 |
rion, e.g. written by a specific author. The authentication |
5859 |
problems discussed in section 7.1 make it difficult to solve |
5860 |
this without crippling cancel's important functionality. |
5861 |
|
5862 |
A related problem is the possibility of disagreements over |
5863 |
newsgroup creation, on networks where such things are not |
5864 |
decided by central authorities. There have been cases of |
5865 |
"rmgroup wars", where one poster persistently sends out new- |
5866 |
group messages to create a newsgroup and another, equally |
5867 |
persistently, sends out rmgroup messages asking that it be |
5868 |
removed. This is not particularly damaging, if relayers are |
5881 |
configured to be cautious, but can cause serious confusion |
5882 |
among innocent third parties who just want to know whether |
5883 |
they can use the newsgroup for communication or not. |
5884 |
|
5885 |
|
5886 |
11.4. Liability |
5887 |
|
5888 |
News shares the legal uncertainty surrounding other forms of |
5889 |
electronic communication: what rules apply to this new |
5890 |
medium of information exchange? News is a particularly |
5891 |
problematic case because it is a broadcast medium rather |
5892 |
than a point-to-point one like mail, and analogies to older |
5893 |
forms of communication are particularly weak. |
5894 |
|
5895 |
Are news-carrying hosts common carriers, like the phone com- |
5896 |
panies, providing communications paths without having either |
5897 |
authority over or responsibility for content? Or are they |
5898 |
publishers, responsible for the content regardless of |
5899 |
whether they are aware of it or not? Or something in |
5900 |
between? Such questions are particularly significant when |
5901 |
the content is technically criminal, e.g. some types of sex- |
5902 |
ually-oriented material in some jurisdictions, in which case |
5903 |
ignorance of its presence may not be an adequate defence. |
5904 |
|
5905 |
Even in milder situations such as libel or copyright viola- |
5906 |
tion, the responsibilities of the poster, his host, and |
5907 |
other hosts carrying the traffic are unclear. Note, in par- |
5908 |
ticular, the problems arising when the article is a forgery, |
5909 |
or when the alleged author claims it is a forgery but cannot |
5910 |
prove this. |
5911 |
|
5912 |
|
5913 |
A. Archeological Notes |
5914 |
|
5915 |
|
5916 |
A.1. A-News Article Format |
5917 |
|
5918 |
The obsolete "A News" article format consisted of exactly |
5919 |
five lines of header information, followed by the body. For |
5920 |
example: |
5921 |
|
5922 |
Aeagle.642 |
5923 |
news.misc |
5924 |
cbosgd!mhuxj!mhuxt!eagle!jerry |
5925 |
Fri Nov 19 16:14:55 1982 |
5926 |
Usenet Etiquette - Please Read |
5927 |
body |
5928 |
body |
5929 |
body |
5930 |
|
5931 |
The first line consisted of an "A" followed by an article ID |
5932 |
(analogous to a message ID and used for similar purposes). |
5933 |
The second line was the list of newsgroups. The third line |
5934 |
was the path. The fourth was the date, in the format above |
5947 |
(all fields fixed width), resembling an Internet date but |
5948 |
not quite the same. The fifth was the subject. |
5949 |
|
5950 |
This format is documented for archeological purposes only. |
5951 |
Do not generate articles in this format. |
5952 |
|
5953 |
|
5954 |
A.2. Early B-News Article Format |
5955 |
|
5956 |
The obsolete pseudo-Internet article format, used briefly |
5957 |
during the transition between the A News format and the mod- |
5958 |
ern format, followed the general outline of a MAIL message |
5959 |
but with some non-standard headers. For example: |
5960 |
|
5961 |
From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz) |
5962 |
Newsgroups: news.misc |
5963 |
Title: Usenet Etiquette -- Please Read |
5964 |
Article-I.D.: eagle.642 |
5965 |
Posted: Fri Nov 19 16:14:55 1982 |
5966 |
Received: Fri Nov 19 16:59:30 1982 |
5967 |
Expires: Mon Jan 1 00:00:00 1990 |
5968 |
|
5969 |
body |
5970 |
body |
5971 |
body |
5972 |
|
5973 |
The From header contained the information now found in the |
5974 |
Path header, plus possibly the full name now typically found |
5975 |
in the From header. The Title header contained what is now |
5976 |
the Subject content. The Posted header contained what is |
5977 |
now the Date content. The Article-I.D. header contained an |
5978 |
article ID, analogous to a message ID and used for similar |
5979 |
purposes. The Newsgroups and Expires headers were approxi- |
5980 |
mately as now. The Received header contained the date when |
5981 |
the latest relayer to process the article first saw it. All |
5982 |
dates were in the above format, with all fields fixed width, |
5983 |
resembling an Internet date but not quite the same. |
5984 |
|
5985 |
This format is documented for archeological purposes only. |
5986 |
Do not generate articles in this format. |
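An archive reader might relabel these old headers roughly as follows. This is only a sketch for reading old material: the function name read_early_b_news and the "Full-Name" key are hypothetical, header continuation lines are not handled, and Article-I.D. is left untouched because it is only analogous to a message ID, not identical to one.

```python
import re

# Old header names whose content corresponds directly to a modern header.
RENAMED = {"Title": "Subject", "Posted": "Date"}


def read_early_b_news(text):
    """Return (fields, body) with old header names mapped to modern ones."""
    head, _, body = text.partition("\n\n")
    fields = {}
    for line in head.split("\n"):
        name, _, value = line.partition(": ")
        fields[RENAMED.get(name, name)] = value
    # The old From header held the path, optionally followed by "(Full Name)".
    match = re.fullmatch(r"(\S+)(?:\s+\((.*)\))?", fields.pop("From", ""))
    if match:
        fields["Path"] = match.group(1)
        if match.group(2):
            fields["Full-Name"] = match.group(2)  # hypothetical key
    return fields, body
```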
5987 |
|
5988 |
|
5989 |
A.3. Obsolete Headers |
5990 |
|
5991 |
Early versions of news software following the modern format |
5992 |
sometimes generated headers like the following: |
5993 |
|
5994 |
Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP |
5995 |
Posting-Version: version B 2.10 2/13/83; site eagle.UUCP |
5996 |
Date-Received: Friday, 19-Nov-82 16:59:30 EST |
5997 |
|
5998 |
Relay-Version contained version information about the |
5999 |
relayer that last processed the article. Posting-Version |
6000 |
contained version information about the posting agent that |
6013 |
posted the article. Date-Received contained the date when |
6014 |
the last relayer to process the article first saw it (in a |
6015 |
slightly nonstandard format). |
6016 |
|
6017 |
These headers are documented for archeological purposes |
6018 |
only. Do not generate articles using them. |
6019 |
|
6020 |
|
6021 |
A.4. Obsolete Control Messages |
6022 |
|
6023 |
There once was a senduuname control message, resembling |
6024 |
sendsys but requesting transmission of the list of hosts |
6025 |
that the receiving host had UUCP connections to. This |
6026 |
rapidly ceased to be of much use, and many organizations |
6027 |
consider information about their internal connectivity to be |
6028 |
confidential. |
6029 |
|
6030 |
Historically, a checkgroups body consisting of one or two |
6031 |
lines, the first of the form "-n newsgroup", caused check- |
6032 |
groups to apply to only that single newsgroup. This form is |
6033 |
documented for archeological purposes only; do not use it. |
6034 |
|
6035 |
Historically, an article posted to a newsgroup whose name |
6036 |
had exactly three components of which the third was "ctl" |
6037 |
signified that the article was to be taken as a control message. |
6038 |
The Subject header specified the actions, in the same way |
6039 |
the Control header does now. This form is documented for |
6040 |
archeological purposes only; do not use it; do not implement |
6041 |
it. |
6042 |
|
6043 |
|
6044 |
B. A Quick Tour Of MIME |
6045 |
|
6046 |
(The editor wishes to thank Luc Rooijakkers; most of this |
6047 |
appendix is a lightly-edited version of a summary he kindly |
6048 |
supplied.) |
6049 |
|
6050 |
MIME (Multipurpose Internet Mail Extensions) is an upward- |
6051 |
compatible set of extensions to RFC 822, currently docu- |
6052 |
mented in RFCs 1341 and 1342. This appendix summarizes |
6053 |
these documents. See the MIME RFCs for more information; |
6054 |
they are very readable. |
6055 |
|
6056 |
UNRESOLVED ISSUE: These RFC numbers (here and |
6057 |
elsewhere in this Draft) need updating when the |
6058 |
new MIME RFCs come out. |
6059 |
|
6060 |
MIME defines the following new headers: |
6061 |
|
6079 |
MIME-Version |
6080 |
Content-Type |
6081 |
Content-Transfer-Encoding |
6082 |
Content-ID |
6083 |
Content-Description |
6084 |
|
6085 |
|
6086 |
The MIME-Version header is mandatory for all messages con- |
6087 |
forming to the MIME specification and carries the version |
6088 |
number of the MIME specification. Example: |
6089 |
|
6090 |
MIME-Version: 1.0 |
6091 |
|
6092 |
|
6093 |
The Content-Type header indicates the content type of the |
6094 |
message. Content types are split into a top-level type and |
6095 |
a subtype, separated by a slash. Auxiliary information can |
6096 |
also be supplied, using an attribute-value notation. Exam- |
6097 |
ple: |
6098 |
|
6099 |
Content-Type: text/plain; charset=us-ascii |
6100 |
|
6101 |
(In the absence of a Content-Type header this is in fact the |
6102 |
default content type.) |
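The type/subtype split, the attribute-value notation, and the default can be illustrated with a short sketch. The function name parse_content_type is hypothetical, and the parsing is deliberately naive: it assumes the header value contains no comments and no semicolons inside quoted strings, which a real MIME parser must handle.

```python
DEFAULT_CONTENT_TYPE = "text/plain; charset=us-ascii"


def parse_content_type(value=None):
    """Return (top_level_type, subtype, attributes) for a Content-Type value.

    A missing header (value=None) yields the default content type.
    """
    if value is None:
        value = DEFAULT_CONTENT_TYPE
    main, *params = value.split(";")
    top, _, sub = main.partition("/")
    attrs = {}
    for param in params:
        name, sep, val = param.partition("=")
        if sep:
            # attribute names are case-insensitive; values may be quoted
            attrs[name.strip().lower()] = val.strip().strip('"')
    # type and subtype names are case-insensitive
    return top.strip().lower(), sub.strip().lower(), attrs
```

For example, parse_content_type("text/plain; charset=us-ascii") and parse_content_type(None) both yield the type "text", the subtype "plain", and a single attribute "charset" with value "us-ascii".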
6103 |
|
6104 |
Important type/subtype combinations are |
6105 |
|
6106 |
text/plain Plain text, possibly in a non- |
6107 |
ASCII character set. |
6108 |
|
6109 |
text/enriched A very simple wordprocessor-like |
6110 |
language supporting character |
6111 |
attributes (e.g., underlining), |
6112 |
justification control, and multi- |
6113 |
ple character sets. (This pro- |
6114 |
posal has gone through several |
6115 |
iterations and has recently split |
6116 |
off from the main MIME RFCs into a |
6117 |
separate document.) |
6118 |
|
6119 |
message/rfc822 A mail message conforming to a |
6120 |
slightly-relaxed version of RFC |
6121 |
822. |
6122 |
|
6123 |
message/partial Part of a message (supporting the |
6124 |
transparent splitting and joining |
6125 |
of messages when they are too |
6126 |
large to be handled by some trans- |
6127 |
port agent). |
6128 |
|
6129 |
message/external-body A message whose body is external. |
6130 |
Possible access methods include |
6131 |
via mail, FTP, local file, etc. |
6132 |
|
6145 |
multipart/mixed A message whose body consists of |
6146 |
multiple parts, possibly of dif- |
6147 |
ferent types, intended to be |
6148 |
viewed in serial order. Each part |
6149 |
looks like an RFC 822 message, |
6150 |
consisting of headers and a body. |
6151 |
Most of the RFC 822 headers have |
6152 |
no defined semantics for body |
6153 |
parts. |
6154 |
|
6155 |
multipart/parallel Likewise, except that the parts |
6156 |
are intended to be viewed in par- |
6157 |
allel (on user agents that support |
6158 |
it). |
6159 |
|
6160 |
multipart/alternative Likewise, except that the parts |
6161 |
are intended to be semantically |
6162 |
equivalent such that the part that |
6163 |
best matches the capabilities of |
6164 |
the environment should be dis- |
6165 |
played. For example, a message |
6166 |
may include plain-text, enriched- |
6167 |
text, and PostScript versions of |
6168 |
some document. |
6169 |
|
6170 |
multipart/digest A variant of multipart/mixed espe- |
6171 |
cially intended for message |
6172 |
digests (the default type of the |
6173 |
parts is message/rfc822 instead of |
6174 |
text/plain, saving on the number |
6175 |
of headers for the parts). |
6176 |
|
6177 |
application/postscript A PostScript document. |
6178 |
(PostScript is a trademark of |
6179 |
Adobe.) |
6180 |
|
6181 |
Other top-level types exist for still images, audio, and |
6182 |
video samples. |
6183 |
|
6184 |
Some of the above types require the ability to transport |
6185 |
binary data. Since the existing message systems usually do |
6186 |
not support this, MIME provides a Content-Transfer-Encoding |
6187 |
header to indicate the kind of encoding used. The possible |
6188 |
encodings are: |
6189 |
|
6190 |
7bit No encoding; the data consists of short |
6191 |
(less than 1000 characters) lines of |
6192 |
7-bit ASCII data, delimited by EOL |
6193 |
sequences. This is the default encod- |
6194 |
ing. |
6195 |
|
6196 |
8bit Like 7bit, except that bytes with the |
6197 |
high-order bit set may be present. |
6198 |
Many transmission paths are incapable |
6211 |
of carrying messages which use this |
6212 |
encoding. |
6213 |
|
6214 |
binary No encoding; any sequence of bytes may |
6215 |
be present. Many transmission paths |
6216 |
are incapable of carrying messages |
6217 |
which use this encoding. |
6218 |
|
6219 |
base64 The data is encoded by representing |
6220 |
every group of 3 bytes as 4 characters |
6221 |
from the alphabet "A-Za-z0-9+/", which |
6222 |
was chosen for its high robustness |
6223 |
through mail gateways (the alphabet |
6224 |
used by uuencode does not survive |
6225 |
ASCII-EBCDIC-ASCII translations). In |
6226 |
the final group of 4 characters, "=" is |
6227 |
used for those characters not repre- |
6228 |
senting data bytes. Line length is |
6229 |
limited and EOLs in the encoded form |
6230 |
are ignored. |
6231 |
|
6232 |
quoted-printable Any byte can be represented by a three |
6233 |
character "=XX" sequence where the X's |
6234 |
are upper case hexadecimal digits. |
6235 |
Bytes representing printable 7-bit US- |
6236 |
ASCII characters except "=" may be rep- |
6237 |
resented literally. Tabs and blanks |
6238 |
may be represented literally if not at |
6239 |
the end of a line. Line length is lim- |
6240 |
ited, and an EOL preceded by "=" was |
6241 |
inserted for this purpose and is not |
6242 |
present in the original. |
6243 |
|
6244 |
The base64 and quoted-printable encodings are applied to |
6245 |
data in Internet canonical form, which means that any EOL |
6246 |
encoded as anything but EOL must be an Internet canonical |
6247 |
EOL: CR followed by LF. |
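The two encodings, and the conversion to canonical form that must precede them, can be tried out with the Python standard library (the base64 and quopri modules). This is a sketch under stated assumptions, not a complete encoder: the helper names to_canonical and encode_base64_body are hypothetical, local text is assumed to use LF or CRLF line ends, and the output is not wrapped to the limited line length that a real encoder must respect.

```python
import base64
import quopri


def to_canonical(text, charset="us-ascii"):
    """Convert local line ends (LF or CRLF) to Internet canonical CR LF."""
    return text.replace("\r\n", "\n").replace("\n", "\r\n").encode(charset)


def encode_base64_body(text, charset="us-ascii"):
    """base64-encode text after putting it into canonical form (no wrapping)."""
    return base64.b64encode(to_canonical(text, charset)).decode("ascii")


# Every group of 3 bytes becomes 4 characters; "=" pads the final group.
three_bytes = base64.b64encode(b"Man")   # a full group, no padding
one_byte = base64.b64encode(b"M")        # two "=" pad characters

# quoted-printable: "=XX" stands for one byte, and "=" followed by EOL is
# a soft line break that is not part of the original data.
decoded = quopri.decodestring(b"caf=E9=3D1")
rejoined = quopri.decodestring(b"abc=\ndef")
```

Here the EOL of the local text "hi\n" is encoded as CR LF, so encode_base64_body("hi\n") encodes the four bytes "h", "i", CR, LF.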
6248 |
|
6249 |
The Content-Description header allows further description of |
6250 |
a body part, analogous to the use of Subject for messages. |
6251 |
|
6252 |
Finally, the Content-ID header can be used to assign an |
6253 |
identification to body parts, analogous to the assignment of |
6254 |
identifications to messages by Message-ID. |
6255 |
|
6256 |
Note that most of these headers are structured header |
6257 |
fields, as defined in RFC 822. Consequently, comments are |
6258 |
allowed in their values. The following is a legal MIME |
6259 |
header: |
6260 |
|
6261 |
Content-Type: (a comment) text (yeah) / |
6262 |
plain (and now some params:) ; charset= (guess what) |
6263 |
iso-8859-1 (we don't have iso-10646 yet, pity) |
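To see what such a header reduces to, the comments can be removed before parsing. The following is a minimal sketch, not a full RFC 822 parser: the function names strip_comments and normalize are hypothetical, and normalize simply discards all remaining white space, which is acceptable for this example but would damage white space inside quoted strings.

```python
def strip_comments(value):
    """Remove RFC 822 comments (possibly nested) from a structured header value.

    Quoted strings are copied unchanged; a backslash quotes the next character.
    """
    out = []
    depth = 0          # current comment nesting depth
    in_quotes = False
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            if depth == 0:
                out.append(value[i:i + 2])   # keep quoted-pair outside comments
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif ch == '"' and depth == 0:
            in_quotes = True
            out.append(ch)
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def normalize(value):
    """Strip comments, then drop all white space (including folding)."""
    return "".join(strip_comments(value).split())
```

Applied to the value of the header above, normalize yields "text/plain;charset=iso-8859-1".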
6264 |
|
6277 |
NOTE: Although the MIME specification was devel- |
6278 |
oped for mail, there is nothing precluding its use |
6279 |
for news as well. While it might simplify imple- |
6280 |
mentation to restrict the MIME headers somewhat, |
6281 |
in the same way that other news headers (e.g. |
6282 |
From) are restricted subsets of the RFC-822 origi- |
6283 |
nals, this would add yet another divergence |
6284 |
between two formats that ought to be as compatible |
6285 |
as possible. In the case of the MIME headers, |
6286 |
there is no body of existing code posing compati- |
6287 |
bility concerns. A full-featured MIME reading |
6288 |
agent needs a full RFC-822 parser anyway, to prop- |
6289 |
erly handle body parts of types like mes- |
6290 |
sage/rfc822, so there is little gain from |
6291 |
restricting MIME headers. Adopting the MIME spec- |
6292 |
ification unchanged seems best. However, article- |
6293 |
level MIME headers must still comply with the |
6294 |
overall news header syntax given in section 4, so |
6295 |
that news software which is NOT interested in MIME |
6296 |
need not contain a full RFC-822 parser. |
6297 |
|
6298 |
The second part of MIME, RFC 1342 (Representation of Non- |
6299 |
ASCII Text in Internet Message Headers), addresses the prob- |
6300 |
lem of non-ASCII characters in headers. An example of a |
6301 |
header using the RFC 1342 mechanism is |
6302 |
|
6303 |
From: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be> |
6304 |
|
6305 |
Such encodings are allowed in selected headers, subject to |
6306 |
the restrictions listed in RFC 1342. |
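The mechanism can be illustrated by decoding the encoded words in a header value. This is a simplified sketch with a hypothetical function name, decode_encoded_words; it decodes each "=?charset?encoding?text?=" word in place and does not implement the other rules of RFC 1342, such as the treatment of white space between adjacent encoded words.

```python
import base64
import re

# =?charset?encoding?encoded-text?=  (encoding is Q or B, case-insensitive)
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=")


def _decode_word(match):
    charset, encoding, text = match.group(1), match.group(2), match.group(3)
    if encoding.upper() == "B":
        raw = base64.b64decode(text)
    else:
        # Q encoding: "_" stands for a space, "=XX" for the byte with hex value XX
        data = text.replace("_", " ").encode("ascii")
        raw = re.sub(rb"=([0-9A-Fa-f]{2})",
                     lambda h: bytes([int(h.group(1), 16)]), data)
    return raw.decode(charset)


def decode_encoded_words(value):
    """Replace each encoded word in a header value by its decoded text."""
    return ENCODED_WORD.sub(_decode_word, value)
```

For the From header above, the encoded word decodes to "André" followed by a space, and the rest of the header is left unchanged.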
6307 |
|
6308 |
The MIME effort has also produced an RFC defining a Content- |
6309 |
MD5 header [RFC 1544], containing an MD5-based "checksum" of |
6310 |
the contents of an article or body part, giving high confi- |
6311 |
dence of detecting accidental modifications to the contents. |
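A Content-MD5 value is the base64 encoding of the 128-bit MD5 digest of the contents. A minimal sketch using the Python standard library follows; the function names content_md5 and matches_content_md5 are hypothetical, and the caller is assumed to supply the contents already in canonical form, before any content-transfer-encoding is applied.

```python
import base64
import hashlib


def content_md5(canonical_body: bytes) -> str:
    """Return the Content-MD5 header value for contents in canonical form."""
    digest = hashlib.md5(canonical_body).digest()     # 16 raw bytes
    return base64.b64encode(digest).decode("ascii")   # 24 characters


def matches_content_md5(canonical_body: bytes, header_value: str) -> bool:
    """Check received contents against a Content-MD5 header value."""
    return content_md5(canonical_body) == header_value.strip()
```

As the text notes, this detects accidental modification only; anyone able to alter the contents can also recompute the header.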
6312 |
|
6313 |
The "metamail" software package [rrr] helps provide MIME |
6314 |
support with minimal changes to mailers, and may also be |
6315 |
relevant to news reading agents. |
6316 |
|
6317 |
The PEM (Privacy Enhanced Mail) effort is pursuing analogous |
6318 |
facilities to offer stronger guarantees against malicious |
6319 |
modifications, unauthorized eavesdropping, and forgery. |
6320 |
This work too may be applicable to news, once it is recon- |
6321 |
ciled with MIME (by efforts now underway). |
6322 |
|
6323 |
|
6324 |
C. Summary of Changes Since RFC 1036 |
6325 |
|
6326 |
This Draft is much longer than RFC 1036, so there is obvi- |
6327 |
ously much change in content. Much of this is just |
6328 |
increased precision and rigor. Noteworthy changes and addi- |
6329 |
tions include: |
6330 |
|
6343 |
+ section 4.3's restrictions on article bodies |
6344 |
|
6345 |
+ all references to MIME facilities |
6346 |
|
6347 |
+ size limits on articles |
6348 |
|
6349 |
+ precise specification of Date-content syntax |
6350 |
|
6351 |
+ message IDs must never be re-used, ever |
6352 |
|
6353 |
+ "!" is the only Path delimiter |
6354 |
|
6355 |
+ multiple moderators in the Approved header |
6356 |
|
6357 |
+ rules on References trimming, and the _-_ mechanism |
6358 |
|
6359 |
+ generalization of the Xref rules |
6360 |
|
6361 |
+ multiple message IDs in Cancel and Supersedes |
6362 |
|
6363 |
+ Also-Control |
6364 |
|
6365 |
+ See-Also |
6366 |
|
6367 |
+ Article-Names |
6368 |
|
6369 |
+ Article-Replacing |
6370 |
|
6371 |
+ more precise rules for cancellation |
6372 |
|
6373 |
+ cancellation authorization based on From, not Sender |
6374 |
|
6375 |
+ "unmoderated" and descriptors in newgroup messages |
6376 |
|
6377 |
+ restrictive rules on handling of sendsys and version |
6378 |
messages |
6379 |
|
6380 |
+ the whogets control message |
6381 |
|
6382 |
+ precise specification of checkgroups messages |
6383 |
|
6384 |
+ compression type preferably specified out-of-band |
6385 |
|
6386 |
+ rules for encapsulating news in MIME mail |
6387 |
|
6388 |
+ tighter specification of relayer functioning (section |
6389 |
9.1) |
6390 |
|
6391 |
+ the "newsmaster" contact address |
6392 |
|
6393 |
+ rules for gatewaying (section 10) |
6394 |
|
6395 |
+ discussion of security issues (section 11) |
6396 |
|
6397 |
|
6409 |
D. Summary of Completely New Features |
6410 |
|
6411 |
Most of this Draft merely documents existing practice, but |
6412 |
there are a few attempts to extend it. These are: |
6413 |
|
6414 |
TBW |
6415 |
|
6416 |
|
6417 |
E. Summary of Differences From RFC 822+1123 |
6418 |
|
6419 |
The following are noteworthy differences between this |
6420 |
Draft's articles and MAIL messages: |
6421 |
|
6422 |
+ generally less-permissive header syntax |
6423 |
|
6424 |
+ notably, limited From syntax |
6425 |
|
6426 |
+ MAIL header comments allowed in only a few contexts |
6427 |
|
6428 |
+ slightly more restricted message-ID syntax |
6429 |
|
6430 |
+ several more mandatory headers |
6431 |
|
6432 |
+ duplicate headers forbidden |
6433 |
|
6434 |
+ References/See-Also versus In-Reply-To/References |
6435 |
(section 6.5) |
6436 |
|
6437 |
+ case sensitivity in some contexts |
6438 |
|
6439 |
+ point-to-point headers, e.g. To and Cc, forbidden |
6440 |
(section 6) |
6441 |
|
6442 |
+ several new headers |
6443 |
|
6444 |
|
6445 |
References |
6446 |
|
6447 |
[Sanderson] "Smileys", David Sanderson, O'Reilly & Associ- |
6448 |
ates Ltd., 1993. |
6449 |
|
6450 |
TBW |
6451 |
|
6452 |
|
6453 |
Security Considerations |
6454 |
|
6455 |
Section 11 discusses security considerations in detail. |
6456 |
|
6457 |
|
6458 |
Author's Address |
6459 |
|
6475 |
Henry Spencer |
6476 |
henry@zoo.toronto.edu |
6477 |
|
6478 |
SP Systems |
6479 |
Box 280 Stn. A |
6480 |
Toronto, Ont. M5W1B2 Canada |