INTERNET DRAFT to be NEWS sec. -


News Article Format and Transmission

Henry Spencer


Status of this Memo

This document is intended to become an Internet Draft. Internet
Drafts are working documents of the Internet Engineering Task Force
(IETF), its Areas, and its Working Groups. Note that other groups may
also distribute working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by
other documents at any time. It is not appropriate to use Internet
Drafts as reference material or to cite them other than as a "working
draft" or "work in progress".

Please check the I-D abstract listing contained in each Internet
Draft directory to learn the current status of this or any other
Internet Draft. (Actually, this draft is at too early a stage to
even be listed there yet.)

It is hoped that a later version of this Draft will obsolete RFC 1036
and will become an Internet standard.

References to the "successor to this Draft" refer not to later
versions of this draft, but to a hypothetical future rewrite of this
Draft (in the same way that this Draft is a rewrite of RFC 1036).

Distribution of this memo is unlimited.


Abstract

This Draft defines the format and procedures for interchange of
network news articles. It is hoped that a later version of this Draft
will obsolete RFC 1036, reflecting more recent experience and
accommodating future directions.

Network news articles resemble mail messages but are broadcast to
potentially large audiences, using a flooding algorithm that
propagates one copy to each interested host (or group thereof),
typically stores only one copy per host, and does not require any
central administration or systematic registration of interested
users. Network news originated as the medium of communication for
Usenet, circa 1980. Since then Usenet has grown explosively, and many
Internet sites participate in it. In addition, the news technology is
now in widespread use for other purposes, on the Internet and
elsewhere.

This Draft primarily codifies and organizes existing practice. A few
small extensions have been added in an attempt to solve problems that
are considered serious. Major extensions (e.g. cryptographic
authentication) that need significant development effort are left to
be undertaken as independent efforts.

2 June 1994 - 1 - expires 15 July 1994


Table of Contents

TBW


1. Introduction

Network news articles resemble mail messages but are broadcast to
potentially large audiences, using a flooding algorithm that
propagates one copy to each interested host (or group thereof),
typically stores only one copy per host, and does not require any
central administration or systematic registration of interested
users. Network news originated as the medium of communication for
Usenet, circa 1980. Since then Usenet has grown explosively, and many
Internet sites participate in it. In addition, the news technology is
now in widespread use for other purposes, on the Internet and
elsewhere.

The earliest news interchange used the so-called "A News" article
format. Shortly thereafter, an article format vaguely resembling
Internet mail was devised and used briefly. Both of those formats are
completely obsolete; they are documented in appendix A for historical
reasons only. With publication of RFC 850 [rrr] in 1983, news
articles came to closely resemble Internet mail messages, with some
restrictions and some additional headers. RFC 1036 [rrr] in 1987
updated RFC 850 without making major changes.

In the intervening five years, the RFC 1036 article format has proven
quite satisfactory, although minor extensions appear desirable to
match recent developments in areas such as multi-media mail. RFC 1036
itself has not proven quite so satisfactory. It is often rather vague
and does not address some issues at all; this has caused significant
interoperability problems at times, and implementations have diverged
somewhat. Worse, although it was intended primarily to document
existing practice, it did not precisely match existing practice even
at the time it was published, and the deviations have grown since.

This Draft attempts to specify the format of articles, and the
procedures used to exchange them and process them, in sufficient
detail to allow full interoperability. In addition, some tentative
suggestions are made about directions for future development, in an
attempt to avert unnecessary divergence and consequent loss of
interoperability. Major extensions (e.g. cryptographic
authentication) that need significant development effort are left to
be undertaken as independent efforts.

     NOTE: One question this all may raise is: why is there no
     News-Version header, analogous to MIME-Version, specifying a
     version number corresponding to this specification? The answer
     is: it doesn't appear to be useful, given news's
     backward-compatibility constraints. The major use of a version
     number is indicating which of several INCOMPATIBLE
     interpretations is relevant. The impossibility of orchestrating
     any sort of simultaneous change over news's installed base makes
     it necessary to avoid such incompatible changes (as opposed to
     extensions) entirely. MIME has a version number mostly because
     it introduced incompatible changes to the interpretation of
     several "Content-" headers. This Draft attempts no changes in
     interpretation and it appears doubtful that future Drafts will
     find it feasible to introduce any.

     UNRESOLVED ISSUE: Should this be reconsidered? Only if the
     header has SPECIFIC IDENTIFIABLE uses today. Otherwise it's just
     useless added bulk.

As in this Draft's predecessors, the exact means used to transmit
articles from one host to another is not specified. NNTP [rrr] is
probably the most common transmission method on the Internet, but a
number of others are known to be in use, including the UUCP protocol
[rrr] extensively used in the early days of Usenet and still much
used on its fringes today.

Several of the mechanisms described in this Draft may seem somewhat
strange or even bizarre at first reading. As with Internet mail,
there is no reasonable possibility of updating the entire installed
base of news software promptly, so interoperability with old software
is crucial and will remain so. Compatibility with existing practice
and robustness in an imperfect world necessarily take priority over
elegance.


2. Definitions, Notations, and Conventions


2.1. Textual Notations

Throughout this Draft, "MAIL" is short for "RFC 822 [rrr] as amended
by RFC 1123 [rrr]". (RFC 1123's amendments are mostly relatively
small, but they are not insignificant.) See also the discussion in
section 3 about this Draft's relationship to MAIL. "MIME" is short
for "RFCs 1341 and 1342" (or their updated replacements).

     UNRESOLVED ISSUE: Update these numbers.

"ASCII" is short for "the ANSI X3.4 character set" [rrr]. While
"ASCII" is often misused to refer to various character sets somewhat
similar to X3.4, in this Draft, "ASCII" means X3.4 and only X3.4.

     NOTE: The name is traditional (to the point where the ANSI
     standard sanctions it) even though it is no longer an acronym
     for the name of the standard.

     NOTE: ASCII, X3.4, contains 128 characters, not all of them
     printable. Character sets with more characters are not ASCII,
     although they may include it as a subset.

Certain words used to define the significance of individual
requirements are capitalized. "MUST" means that the item is an
absolute requirement of the specification. "SHOULD" means that the
item is a strong recommendation: there may be valid reasons to ignore
it in unusual circumstances, but this should be done only after
careful study of the full implications and a firm conclusion that it
is necessary, because there are serious disadvantages to doing so.
"MAY" means that the item is truly optional, and implementors and
users are warned that conformance is possible but not to be relied
on.

The term "compliant", applied to implementations etc., indicates
satisfaction of all relevant "MUST" and "SHOULD" requirements. The
term "conditionally compliant" indicates satisfaction of all relevant
"MUST" requirements but violation of at least one relevant "SHOULD"
requirement.

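The two compliance terms amount to a small decision rule. The
following Python sketch is illustrative only: it assumes each
relevant requirement has already been checked and recorded as a
(level, satisfied) pair, and the function name and the label "not
compliant" are inventions of this example, not terms defined by this
Draft.

```python
def compliance_level(results):
    """Classify an implementation from (level, satisfied) pairs.

    level is "MUST", "SHOULD", or "MAY"; satisfied is a bool.
    "MAY" items never affect the outcome, since they are optional.
    """
    must_ok = all(ok for level, ok in results if level == "MUST")
    should_ok = all(ok for level, ok in results if level == "SHOULD")
    if not must_ok:
        # Hypothetical label: the Draft names only the two terms below.
        return "not compliant"
    return "compliant" if should_ok else "conditionally compliant"
```

An implementation that meets every "MUST" but misses one "SHOULD"
comes out as "conditionally compliant" under this rule.
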
This Draft contains explanatory notes using the following format.
These may be skipped by persons interested solely in the content of
the specification. The purpose of the notes is to explain why choices
were made, to place them in context, or to suggest possible
implementation techniques.

     NOTE: While such explanatory notes may seem superfluous in
     principle, they often help the less-than-omniscient reader grasp
     the purpose of the specification and the constraints involved.
     Given the limitations of natural language for descriptive
     purposes, this improves the probability that implementors and
     users will understand the true intent of the specification in
     cases where the wording is not entirely clear.

All numeric values are given in decimal unless otherwise indicated.
Octets are assumed to be unsigned values for this purpose. Large
numbers are written using the North American convention, in which ","
separates groups of three digits but otherwise has no significance.


2.2. Syntax Notation

Although the mechanisms specified in this Draft are all described in
prose, most are also described formally in the modified BNF notation
of RFC 822. Implementors will need to be familiar with this notation
to fully understand this specification, and are referred to RFC 822
for a complete explanation of the modified BNF notation. Here is a
brief illustrative example:

     sentence = clause *( punct clause ) "."
     punct    = ":" / ";"
     clause   = 1*word [ "(" clause ")" / "," 1*word ]
     word     = <any English word>

This defines a sentence as some clauses separated by puncts and ended
by a period, a punct as a colon or semicolon, a clause as at least
one <word> optionally followed by either a parenthesized clause or a
comma and at least one more <word>, and a <word> as (informally) any
English word. <> are used to enclose names when (and only when)
distinguishing them from surrounding text is useful. The full form of
the repetition notation is <m>"*"<n><thing>, denoting <m> through <n>
repetitions of <thing>; <m> defaults to zero, <n> to infinity, and
the "*" and <n> can be omitted if <m> and <n> are equal, so 1*word is
one or more words, 1*5word is one through five words, and 2word is
exactly two words.

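The defaults of the repetition notation can be made concrete with a
short Python sketch. It is not part of the notation itself: the
function name is hypothetical, and it assumes <thing> is a simple
rule name made of letters, digits, and "-" rather than a
parenthesized group.

```python
import re

# <m>"*"<n><thing>: both counts are optional, and the "*" may be
# omitted when <m> and <n> are equal.
_REPETITION = re.compile(r"(\d*)(?:(\*)(\d*))?([A-Za-z][A-Za-z0-9-]*)")


def parse_repetition(spec):
    """Return (minimum, maximum, thing); maximum None means infinity."""
    match = _REPETITION.fullmatch(spec)
    if match is None:
        raise ValueError("not a repetition: %r" % (spec,))
    low, star, high, thing = match.groups()
    if star is None:
        # No "*": an exact count, or exactly one if no count is given.
        count = int(low) if low else 1
        return count, count, thing
    # With "*": <m> defaults to zero and <n> to infinity.
    return (int(low) if low else 0,
            int(high) if high else None,
            thing)
```

Here parse_repetition("1*5word") yields (1, 5, "word"), and
parse_repetition("2word") yields (2, 2, "word"), matching the
readings given in the text.
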
The character "\" is not special in any way in this notation.

This Draft is intended to be self-contained; all syntax rules used in
it are defined within it, and a rule with the same name as one found
in MAIL does not necessarily have the same definition. The lexical
layer of MAIL is NOT, repeat NOT, used in this Draft, and its
presence must not be assumed; notably, this Draft spells out all
places where white space is permitted/required and all places where
constructs resembling MAIL comments can occur.

     NOTE: News parsers historically have been much less permissive
     than MAIL parsers.


2.3. Definitions

The term "character set", wherever it is used in this Draft, refers
to a coded character set, in the sense of ISO character set
standardization work, and must not be misinterpreted as meaning
merely "a set of characters".

In this Draft, ASCII character 32 is referred to as "blank"; the word
"space" has a more generic meaning.

An "article" is the unit of news, analogous to a MAIL "message".

A "poster" is a human being (or software equivalent) submitting a
possibly-compliant article to be "posted": made available for reading
on all relevant hosts. A "posting agent" is software that assists
posters in preparing articles, including determining whether the
final article is compliant, passing it on to a relayer for posting if
so, and returning it to the poster with an explanation if not. A
"relayer" is software which receives allegedly-compliant articles
from posting agents and/or other relayers, files copies in a "news
database", and possibly passes copies on to other relayers.

     NOTE: While the same software may well function both as a
     relayer and as part of a posting agent, the two functions are
     distinct and should not be confused. The posting agent's purpose
     is (in part) to validate an article, supply header information
     that can or should be supplied automatically, and generally take
     reasonable actions in an attempt to transform the poster's
     submission into a compliant article. The relayer's purpose is to
     move already-compliant articles around efficiently without
     damaging them.

A "reader" is a human being reading news articles. A "reading agent"
is software which presents articles to a reader.

     NOTE: Informal usage often uses "reader" for both these
     meanings, but this introduces considerable potential for
     confusion and misunderstanding, so this Draft takes care to make
     the distinction.

A "newsgroup" is a single news forum, a logical bulletin board,
having a name and nominally intended for articles on a specific
topic. An article is "posted to" a single newsgroup or several
newsgroups. When an article is posted to more than one newsgroup, it
is said to be "cross-posted"; note that this differs from posting the
same text as part of each of several articles, one per newsgroup. A
"hierarchy" is the set of all newsgroups whose names share a first
component (see the name syntax in section 5.5).

A newsgroup may be "moderated", in which case submissions are not
posted directly, but mailed to a "moderator" for consideration and
possible posting. Moderators are typically human but may be
implemented partially or entirely in software.

A "followup" is an article containing a response to the contents of
an earlier article (the followup's "precursor"). A "followup agent"
is a combination of reading agent and posting agent that aids in the
preparation and posting of a followup.

Text comparisons are "case-sensitive" if they consider uppercase
letters (e.g. "A") different from lowercase letters (e.g. "a"), and
"case-insensitive" if letters differing only in case (e.g. "A" and
"a") are considered identical. Categories of text are said to be
case-(in)sensitive if comparisons of such texts to others are
case-(in)sensitive.

A "cooperating subnet" is a set of news-exchanging hosts which is
sufficiently well-coordinated (typically via a central administration
of some sort) that stronger assumptions can be made about hosts in
the set than about news hosts in general. This is typically used to
relax restrictions which are otherwise required for worst-case
interoperability; members of a cooperating subnet MAY interchange
articles that do not conform to this Draft's specifications, provided
all members have agreed to this and provided the articles are not
permitted to leak out of the subnet. The word "subnet" is used to
emphasize that a cooperating subnet is typically not an isolated
universe; care must be taken that traffic leaving the subnet complies
with the restrictions of the larger net, not just those of the
cooperating subnet.

A "message ID" is a unique identifier for an article, usually
supplied by the posting agent which posted it. It distinguishes the
article from every other article ever posted anywhere (in theory).
Articles with the same message ID are treated as identical copies of
the same article even if they are not in fact identical.

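One consequence of the message-ID rule is that a relayer can discard
repeat copies by remembering the IDs it has already seen, without
comparing article contents. The Python sketch below is a minimal
illustration under stated assumptions: the class and method names are
hypothetical, IDs are compared as exact strings, and the set is kept
in memory, whereas a real relayer would need persistent storage.

```python
class SeenArticles:
    """Remember message IDs so repeat copies can be dropped."""

    def __init__(self):
        self._seen = set()

    def accept(self, message_id):
        """Return True the first time an ID is offered, else False.

        Only the ID is examined: two articles with the same ID are
        treated as the same article even if their text differs.
        """
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True
```

A relayer built this way would file and pass on an article only when
accept() returns True for its message ID.
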
A "gateway" is software which receives news articles and converts
them to messages of some other kind (e.g. mail to a mailing list), or
vice versa; in essence it is a translating relayer that straddles
boundaries between different methods of message exchange. The most
common type of gateway connects newsgroup(s) to mailing list(s),
either unidirectionally or bidirectionally, but there are also
gateways between news networks using this Draft's news format and
those using other formats.

A "control message" is an article which is marked as containing
control information; a relayer receiving such an article will
(subject to permissions etc.) take actions beyond just filing and
passing on the article.

     NOTE: "Control article" would be more consistent terminology,
     but "control message" is already well established.

An article's "reply address" is the address to which mailed replies
should be sent. This is the address specified in the article's From
header (see section 5.2), unless the article also has a Reply-To
header (see section 6.3), in which case the address given there is
used.

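The reply-address rule can be sketched in a few lines of Python. This
is an illustration only: it assumes the headers have already been
parsed into a dictionary keyed by header name exactly as spelled
here, and the function name is hypothetical.

```python
def reply_address(headers):
    """Pick the address for mailed replies from parsed headers.

    headers maps header names (e.g. "From") to their contents.
    Reply-To, when present, takes precedence over From.
    """
    if "Reply-To" in headers:
        return headers["Reply-To"]
    return headers["From"]
```

Extracting the address proper from the header content depends on the
From and Reply-To syntax defined in sections 5.2 and 6.3, which this
sketch does not attempt.
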
The notation (e.g.) "(ASCII 17)" following a name means "this name
refers to the ASCII character having value 17". An "ASCII printable
character" is an ASCII character in the range 33-126. An "ASCII
control character" is an ASCII character in the range 0-31, or the
character DEL (ASCII 127). A "non-ASCII character" is a character
having a value exceeding 127.

     NOTE: Blank is neither an "ASCII printable character" nor an
     "ASCII control character".

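The character classes just defined partition the character values
into four disjoint groups. A minimal Python sketch of that partition
follows; the function name and the returned labels are choices made
for this example only.

```python
def classify(value):
    """Classify a character value using the definitions above."""
    if value < 0:
        raise ValueError("character values are non-negative")
    if value > 127:
        return "non-ASCII"
    if value <= 31 or value == 127:
        return "control"      # 0-31, or DEL (ASCII 127)
    if value == 32:
        return "blank"        # neither printable nor control
    return "printable"        # 33-126
```

Note that HT (ASCII 9) is a control character under these
definitions, even though it counts as white space in the article
grammar.
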

2.4. End Of Line

How the end of a text line is represented depends on the context and
the implementation. For Internet transmission via protocols such as
SMTP [rrr], an end-of-line is a CR (ASCII 13) followed by an LF
(ASCII 10). ISO C [rrr] and many modern operating systems indicate
end-of-line with a single character, typically ASCII LF (aka
"newline"), and this is the normal convention when news is
transmitted via UUCP. A variety of other methods are in use,
including out-of-band methods in which there is no specific character
that means end-of-line.

This Draft does not constrain how end-of-line is represented in news,
except that characters other than CR and LF MUST NOT be usurped for
use in end-of-line representations. Also, obviously, all software
dealing with a particular copy of an article must agree on the
convention to be used. "EOL" is used to mean "whatever end-of-line
representation is appropriate"; it is not necessarily a character or
sequence of characters.

     NOTE: If faced with picking an EOL representation in the absence
     of other constraints, use of a single character simplifies
     processing, and the ASCII standard [rrr] specifies that if one
     character is to be used for this purpose, it should be LF (ASCII
     10).

     NOTE: Inside MIME encodings, use of the Internet canonical EOL
     representation (CR followed by LF) is mandatory. See [rrr].

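As an illustration of two of the conventions mentioned above, the
following Python sketch converts between the CR LF form used on the
wire by protocols such as SMTP and a single-LF local form. The
function names are hypothetical, and the sketch assumes the local
form contains no bare CR and no CR LF pairs of its own.

```python
def to_local_eol(wire_bytes):
    """Convert CR LF (wire form) to a single LF (local form)."""
    return wire_bytes.replace(b"\r\n", b"\n")


def to_wire_eol(local_bytes):
    """Convert a single LF (local form) to CR LF (wire form).

    Assumes every LF in the input is an end-of-line and that the
    input does not already contain CR LF pairs.
    """
    return local_bytes.replace(b"\n", b"\r\n")
```

Under that assumption the two functions are inverses, so an article
can cross between the two conventions without other changes.
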

2.5. Case-Sensitivity

Text in newsgroup names, header parameters, etc. is case-sensitive
unless stated otherwise.

     NOTE: This is at variance with MAIL, which is case-insensitive
     unless stated otherwise, but is consistent with historical news
     practice and existing news software. See the comments on
     backward compatibility in section 1.


2.6. Language

Various constant strings in this Draft, such as header names and
month names, are derived from English words. Despite their
derivation, these words do NOT change when the poster or reader
employing them is interacting in a language other than English.
Posting and reading agents SHOULD translate as appropriate in their
interaction with the poster or reader, but the forms that actually
appear in articles are always the English-derived ones defined in
this Draft.


3. Relation To MAIL (RFC 822 etc.)

The primary intent of this Draft is to completely describe the news
article format as a subset of MAIL's message format augmented by some
new headers. Unless explicitly noted otherwise, the intent throughout
is that an article MUST also be a valid MAIL message.

     NOTE: Despite obvious similarities between news and mail,
     opinions vary on whether it is possible or desirable to unify
     them into a single service. However, it is unquestionably both
     possible and useful to employ some of the same tools for
     manipulating both mail messages and news articles, so there is
     specific advantage to be had in defining them compatibly.
     Furthermore, there is no apparent need to re-invent the wheel
     when slight extensions to an existing definition will suffice.

Given that this Draft attempts to be self-contained, it inevitably
contains considerable repetition of information found in MAIL. This
raises the possibility of unintentional conflicts. Unless
specifically noted otherwise, any wording in this Draft which permits
behavior that is not MAIL-compliant is erroneous and should be
followed only to the extent that the result remains compliant with
MAIL.

     NOTE: RFC 1036 said "where this standard conflicts with [RFC
     822], RFC-822 should be considered correct and this standard in
     error". Taken literally, this was obviously incorrect, since RFC
     1036 imposed a number of restrictions not found in RFC 822. The
     intent, however, was reasonable: to indicate that UNINTENTIONAL
     differences were errors in RFC 1036.

Implementors and users should note that MAIL is deliberately an
extensible standard, and most extensions devised for mail are also
relevant to (and compatible with) news. Note particularly MIME [rrr],
summarized briefly in appendix B, which extends MAIL in a number of
useful ways that are definitely relevant to news. Also of note is the
work in progress on reconciling PEM (Privacy Enhanced Mail, which
defines extensions for authentication and security) with MIME, after
which PEM may also be relevant to news.

     UNRESOLVED ISSUE: Update the MIME/PEM information.

Similarly, descriptions here of MIME facilities should be considered
correct only to the extent that they do not require or legitimize
practices that would violate those RFCs. (Note that this Draft does
extend the application of some MIME facilities, but this is an
extension rather than an alteration.)


4. Basic Format


4.1. Overall Syntax

The overall syntax of a news article is:

     article        = 1*header separator body
     header         = start-line *continuation
     start-line     = header-name ":" space [ nonblank-text ] eol
     continuation   = space nonblank-text eol
     header-name    = 1*name-character *( "-" 1*name-character )
     name-character = letter / digit
     letter         = <ASCII letter A-Z or a-z>
     digit          = <ASCII digit 0-9>
     separator      = eol
     body           = *( [ nonblank-text / space ] eol )
     eol            = <EOL>
     nonblank-text  = [ space ] text-character *( space-or-text )
     text-character = <any ASCII character except NUL (ASCII 0),
                        HT (ASCII 9), LF (ASCII 10), CR (ASCII 13),
                        or blank (ASCII 32)>
     space          = 1*( <HT (ASCII 9)> / <blank (ASCII 32)> )
     space-or-text  = space / text-character

685 |
|
|
An article consists of some headers followed by a body. An |
686 |
|
|
empty line separates the two. The headers contain struc- |
687 |
|
|
tured information about the article and its transmission. A |
688 |
|
|
header begins with a header name identifying it, and can be |
689 |
|
|
continued onto subsequent lines by beginning the continua- |
690 |
|
|
tion line(s) with white space. (Note that section 4.2.3 |
691 |
|
|
adds some restrictions to the header syntax indicated here.) |
692 |
|
|
The body is largely-unstructured text significant only to |
693 |
|
|
the poster and the readers. |
694 |
|
|
|
695 |
|
|
NOTE: Terminology here follows the current custom |
696 |
|
|
in the news community, rather than the MAIL con- |
697 |
|
|
vention of (sometimes) referring to what is here |
698 |
|
|
called a "header" as a "header field" or "field". |
699 |
|
|
|
700 |
|
|
Note that the separator line must be truly empty, not just a |
701 |
|
|
line containing white space. Further empty lines following |
702 |
|
|
it are part of the body, as are empty lines at the end of |
703 |
|
|
the article. |
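The separator rule can be illustrated with a minimal Python sketch. It assumes EOLs have already been converted to a single "\n" character, which is a convenience of the example and not a requirement of this Draft; the function name split_article is hypothetical.

```python
def split_article(text):
    """Split an article into (header_lines, body_lines).

    Assumes EOLs were normalized to "\n" beforehand (an assumption of
    this sketch, not something the Draft mandates).
    """
    lines = text.split("\n")
    # A well-formed article ends with an EOL, so split() leaves a final "".
    if lines and lines[-1] == "":
        lines.pop()
    for i, line in enumerate(lines):
        if line == "":  # truly empty; a white-space-only line does not count
            return lines[:i], lines[i + 1:]
    raise ValueError("no separator line; article truncated in the headers?")
```

Empty lines after the separator, and empty lines at the end of the article, stay in the returned body, as the text above requires.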
704 |
|
|
|
705 |
|
|
NOTE: Some systems make no distinction between |
706 |
|
|
empty lines and lines consisting entirely of white |
707 |
|
|
space; indeed, some systems cannot represent |
708 |
|
|
entirely empty lines. The grammar's requirement |
709 |
|
|
that header continuation lines contain some print- |
710 |
|
|
able text is meant to ensure that the empty/space |
711 |
|
|
distinction cannot confuse identification of the |
712 |
|
|
separator line. |
713 |
|
|
|
714 |
|
|
NOTE: It is tempting to authorize posting agents |
715 |
|
|
to strip empty lines at the beginning and end of |
716 |
|
|
the body, but such empty lines could possibly be |
717 |
|
|
part of a preformatted document. |
718 |
|
|
|
719 |
|
|
Implementors are warned that trailing white space, whether |
720 |
|
|
alone on the line or not, MAY be significant in the body, |
733 |
|
|
notably in early versions of the "uuencode" encoding for |
734 |
|
|
binary data. Trailing white space MUST be preserved unless |
735 |
|
|
the article is known to have originated within a cooperating |
736 |
|
|
subnet that avoids using significant trailing white space, |
737 |
|
|
and SHOULD be preserved regardless. Posters SHOULD avoid |
738 |
|
|
using conventions or encodings which make trailing white |
739 |
|
|
space significant; for encoding of binary data, MIME's |
740 |
|
|
"base64" encoding is recommended. Implementors are warned |
741 |
|
|
that ISO C implementations are not required to preserve |
742 |
|
|
trailing white space, and special precautions may be neces- |
743 |
|
|
sary in implementations which do not. |
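As an illustration of why "base64" sidesteps the trailing-white-space problem, the following Python sketch encodes binary data into body lines that contain no white space at all. The function name and the 76-character line width (the limit MIME places on base64 lines) are choices of this example; a real article would also need the appropriate MIME headers, which are not shown.

```python
import base64


def encode_binary_for_body(data, width=76):
    """Return base64 text split into lines of at most `width` characters."""
    text = base64.b64encode(data).decode("ascii")
    return [text[i:i + width] for i in range(0, len(text), width)]
```

Since no encoded line contains white space, software that strips trailing blanks leaves such lines unchanged.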
744 |
|
|
|
745 |
|
|
NOTE: Unfortunately, the signature-delimiter con- |
746 |
|
|
vention (described in section 4.3.2) does use sig- |
747 |
|
|
nificant trailing white space. It's too late to |
748 |
|
|
fix this; there is work underway on defining an |
749 |
|
|
organized signature convention as part of MIME, |
750 |
|
|
which is a preferable solution in the long run. |
751 |
|
|
|
752 |
|
|
Posters are warned that some very old relayer software mis- |
753 |
|
|
behaves when the first non-empty line of an article body |
754 |
|
|
begins with white space. |
755 |
|
|
|
756 |
|
|
|
757 |
|
|
4.2. Headers |
758 |
|
|
|
759 |
|
|
|
760 |
|
|
4.2.1. Names and Contents |
761 |
|
|
|
762 |
|
|
Despite the restrictions on header-name syntax imposed by |
763 |
|
|
the grammar, relayers and reading agents SHOULD tolerate |
764 |
|
|
header names containing any ASCII printable character other |
765 |
|
|
than colon (":", ASCII 58). |
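The difference between the strict grammar and the tolerance asked of relayers and reading agents can be sketched in Python as follows. The function names are hypothetical, and "ASCII printable" is taken here to mean codes 33 through 126, which is an assumption of the example.

```python
import re

# header-name per the grammar in section 4.1: alphanumeric runs joined by "-".
STRICT_NAME = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")
# Tolerant form: any character from "!" (33) to "~" (126) except ":" (58).
TOLERANT_NAME = re.compile(r"[!-9;-~]+")


def is_strict_name(name):
    """True if the name satisfies the header-name grammar."""
    return STRICT_NAME.fullmatch(name) is not None


def is_tolerable_name(name):
    """True if a relayer or reading agent should still cope with the name."""
    return TOLERANT_NAME.fullmatch(name) is not None
```

A posting agent might apply the strict test to what it generates, while a relayer would apply only the tolerant one to what it receives.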
766 |
|
|
|
767 |
|
|
NOTE: MAIL header names can contain any ASCII |
768 |
|
|
printable character (other than colon) in theory, |
769 |
|
|
but in practice, arbitrary header names are known |
770 |
|
|
to cause trouble for some news software. Section |
771 |
|
|
4.1's restriction to alphanumeric sequences sepa- |
772 |
|
|
rated by hyphens is believed to permit all widely- |
773 |
|
|
used header names without causing problems for any |
774 |
|
|
widely-used software. Software is nevertheless |
775 |
|
|
encouraged to cope correctly with the full range |
776 |
|
|
of possibilities, since aberrations are known to |
777 |
|
|
occur. |
778 |
|
|
|
779 |
|
|
Relayers MUST disregard headers not described in this Draft |
780 |
|
|
(that is, with header names not mentioned in this Draft), |
781 |
|
|
and pass them on unaltered. |
782 |
|
|
|
783 |
|
|
Posters wishing to convey non-standard information in head- |
784 |
|
|
ers SHOULD use header names beginning with "X-". No stan- |
785 |
|
|
dard header name will ever be of this form. Reading agents |
786 |
|
|
SHOULD ignore "X-" headers, or at least treat them with |
799 |
|
|
great care. |
800 |
|
|
|
801 |
|
|
The order of headers in an article is not significant. How- |
802 |
|
|
ever, posting agents are encouraged to put mandatory headers |
803 |
|
|
(see section 5) first, followed by optional headers (see |
804 |
|
|
section 6), followed by headers not defined in this Draft. |
805 |
|
|
|
806 |
|
|
NOTE: While relayers and reading agents must be |
807 |
|
|
prepared to handle any order, having the signifi- |
808 |
|
|
cant headers (the precise definition of "signifi- |
809 |
|
|
cant" depends on context) first can noticeably |
810 |
|
|
improve efficiency, especially in memory-limited |
811 |
|
|
environments where it is difficult to buffer up an |
812 |
|
|
arbitrary quantity of headers while searching for |
813 |
|
|
the few that matter. |
814 |
|
|
|
815 |
|
|
Header names are case-insensitive. There is a preferred |
816 |
|
|
case convention, which posters and posting agents SHOULD |
817 |
|
|
use: each hyphen-separated "word" has its initial letter (if |
818 |
|
|
any) in uppercase and the rest in lowercase, except that |
819 |
|
|
some abbreviations have all letters uppercase (e.g. "Mes- |
820 |
|
|
sage-ID" and "MIME-Version"). The forms used in this Draft |
821 |
|
|
are the preferred forms for the headers described herein. |
822 |
|
|
Relayers and reading agents are warned that articles might |
823 |
|
|
not obey this convention. |
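A posting agent could produce the preferred case with something like the Python sketch below. The set of all-uppercase words contains only the two abbreviations visible in the examples above ("ID" and "MIME"); any fuller list is outside what this section states, and the function names are hypothetical.

```python
# Words written entirely in uppercase; taken only from the examples
# "Message-ID" and "MIME-Version" (an assumption, not a complete list).
UPPERCASE_WORDS = {"id", "mime"}


def preferred_case(name):
    """Rewrite a header name in the preferred case convention."""
    words = []
    for word in name.split("-"):
        if word.lower() in UPPERCASE_WORDS:
            words.append(word.upper())
        else:
            words.append(word[:1].upper() + word[1:].lower())
    return "-".join(words)


def same_header(a, b):
    """Header names are case-insensitive, so compare them that way."""
    return a.lower() == b.lower()
```

Relayers and reading agents would rely on a comparison like same_header, since incoming articles might not follow the convention.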
824 |
|
|
|
825 |
|
|
NOTE: Although software must be prepared for the |
826 |
|
|
possibility of random use of case in header names |
827 |
|
|
(and other case-independent text), establishing a |
828 |
|
|
preferred convention reduces pointless diversity, |
829 |
|
|
and may permit optimized software that looks for |
830 |
|
|
the preferred forms before resorting to less- |
831 |
|
|
efficient case-insensitive searches. |
832 |
|
|
|
833 |
|
|
In general, a header can consist of several lines, with each |
834 |
|
|
continuation line beginning with white space. The EOLs pre- |
835 |
|
|
ceding continuation lines are ignored when processing such a |
836 |
|
|
header, effectively combining the start-line and the contin- |
837 |
|
|
uations into a single logical line. The logical line, less |
838 |
|
|
the header name, colon, and any white space following the |
839 |
|
|
colon, is the "header content". |
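The combining of a start-line with its continuations can be sketched in Python as below. The function name is hypothetical, and the input is assumed to be the header lines with their EOLs already removed.

```python
def parse_headers(header_lines):
    """Return a list of (name, content) pairs from raw header lines."""
    logical = []
    for line in header_lines:
        if line[:1] in (" ", "\t") and logical:
            # Continuation: the EOL is ignored, the leading white space kept.
            logical[-1] += line
        else:
            logical.append(line)
    headers = []
    for line in logical:
        # A malformed line with no colon yields the whole line as the name.
        name, _, rest = line.partition(":")
        # Content is the logical line less the name, the colon, and any
        # white space following the colon.
        headers.append((name, rest.lstrip(" \t")))
    return headers
```

Note that the white space beginning each continuation line survives inside the content; only the EOL disappears.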
840 |
|
|
|
841 |
|
|
|
842 |
|
|
4.2.2. Undesirable Headers |
843 |
|
|
|
844 |
|
|
A header whose content is empty is said to be an empty |
845 |
|
|
header. Relayers and reading agents SHOULD not consider |
846 |
|
|
presence or absence of an empty header to alter the seman- |
847 |
|
|
tics of an article (although syntactic rules, such as |
848 |
|
|
requirements that certain header names appear at most once |
849 |
|
|
in an article, MUST still be satisfied). Posting agents |
850 |
|
|
SHOULD delete empty headers from articles before posting |
851 |
|
|
them. |
852 |
|
|
|
865 |
|
|
Headers that merely state defaults explicitly (e.g., a Fol- |
866 |
|
|
lowup-To header with the same content as the Newsgroups |
867 |
|
|
header, or a MIME Content-Type header with contents |
868 |
|
|
"text/plain; charset=us-ascii") or state information that |
869 |
|
|
reading agents can typically determine easily themselves |
870 |
|
|
(e.g. the length of the body in octets) are redundant, con- |
871 |
|
|
veying no information whatsoever. Headers that state infor- |
872 |
|
|
mation which cannot possibly be of use to a significant num- |
873 |
|
|
ber of relayers, reading agents, or readers (e.g., the name |
874 |
|
|
of the software package used as the posting agent) are use- |
875 |
|
|
less and pointless. Posters and posting agents SHOULD avoid |
876 |
|
|
including redundant or useless headers in articles. |
877 |
|
|
|
878 |
|
|
NOTE: Information that someone, somewhere, might |
879 |
|
|
someday find useful is best omitted from headers. |
880 |
|
|
(There's quite enough of it in article bodies.) |
881 |
|
|
Headers should contain information of known util- |
882 |
|
|
ity only. This is not meant to preclude inclusion |
883 |
|
|
of information primarily meant for news-software |
884 |
|
|
debugging, but such information should be included |
885 |
|
|
only if there is real reason, preferably based on |
886 |
|
|
experience, to suspect that it may be genuinely |
887 |
|
|
useful. Articles passing through gateways are the |
888 |
|
|
only obvious case where inclusion of debugging |
889 |
|
|
information appears clearly legitimate. (See sec- |
890 |
|
|
tion 10.1.) |
891 |
|
|
|
892 |
|
|
NOTE: A useful rule of thumb for software imple- |
893 |
|
|
mentors is: "if I had to pay a dollar a day for |
894 |
|
|
the transmission of this header, would I still |
895 |
|
|
think it worthwhile?" |
896 |
|
|
|
897 |
|
|
|
898 |
|
|
4.2.3. White Space and Continuations |
899 |
|
|
|
900 |
|
|
The colon following the header name on the start-line MUST |
901 |
|
|
be followed by white space, even if the header is empty. If |
902 |
|
|
the header is not empty, at least some of the content MUST |
903 |
|
|
appear on the start-line. Posting agents MUST enforce these |
904 |
|
|
restrictions, but relayers (etc.) SHOULD accept even arti- |
905 |
|
|
cles that violate them. |
906 |
|
|
|
907 |
|
|
NOTE: MAIL does not require white space after the |
908 |
|
|
colon, but it is usual. RFC 1036 required the |
909 |
|
|
white space, even in empty headers, and some |
910 |
|
|
existing software demands it. In MAIL, and |
911 |
|
|
arguably in RFC 1036 (although the wording is |
912 |
|
|
vague), it is technically legitimate for the white |
913 |
|
|
space to be part of a continuation line rather |
914 |
|
|
than the start-line, but not all existing software |
915 |
|
|
will accept this. Deleting empty headers and |
916 |
|
|
placing some content on the start-line avoids this |
917 |
|
|
issue... which is desirable because trailing |
918 |
|
|
blanks, easily deleted by accident, are best not |
931 |
|
|
made significant in headers. |
932 |
|
|
|
933 |
|
|
In general, posters and posting agents SHOULD use blank |
934 |
|
|
(ASCII 32), not tab (ASCII 9), where white space is desired |
935 |
|
|
in headers. Existing software does not consistently accept |
936 |
|
|
tab as synonymous with blank in all contexts. In particu- |
937 |
|
|
lar, RFC 1036 appeared to specify that the character immedi- |
938 |
|
|
ately following the colon after a header name was required |
939 |
|
|
to be a blank, and some news software insists on that, so |
940 |
|
|
this character MUST be a blank. Again, posting agents MUST |
941 |
|
|
enforce these restrictions but relayers SHOULD be more tol- |
942 |
|
|
erant. |
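A posting agent's enforcement of these restrictions might look roughly like the Python sketch below. The function name and the wording of the messages are hypothetical; a relayer, being asked to be more tolerant, would not reject on these grounds.

```python
def start_line_problems(start_line, continuations=()):
    """List the ways a header violates the posting-agent rules above."""
    name, colon, rest = start_line.partition(":")
    if not colon:
        return ["no colon after header name"]
    problems = []
    if not rest.startswith(" "):
        # A tab, or no white space at all, fails: it MUST be a blank.
        problems.append("colon not followed by a blank (ASCII 32)")
    if continuations and rest.strip(" \t") == "":
        # A non-empty header must put some content on the start-line.
        problems.append("content begins on a continuation line")
    return problems
```

An empty header with a blank after the colon passes this check; deleting empty headers is a separate recommendation (section 4.2.2).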
943 |
|
|
|
944 |
|
|
Since the white space beginning a continuation line remains |
945 |
|
|
a part of the logical line, headers can be "broken" into |
946 |
|
|
multiple lines only at white space. Posting agents SHOULD |
947 |
|
|
not break headers unnecessarily. Relayers SHOULD preserve |
948 |
|
|
existing header breaks, and SHOULD not introduce new breaks. |
949 |
|
|
Breaking headers SHOULD be a last resort; relayers and read- |
950 |
|
|
ing agents SHOULD handle long header lines gracefully. (See |
951 |
|
|
the discussion of size limits in section 4.6.) |
952 |
|
|
|
953 |
|
|
|
954 |
|
|
4.3. Body |
955 |
|
|
|
956 |
|
|
Although the article body is unstructured for most of the |
957 |
|
|
purposes of this Draft, structure MAY be imposed on it by |
958 |
|
|
other means, notably MIME headers (see appendix B). |
959 |
|
|
|
960 |
|
|
|
961 |
|
|
4.3.1. Body Format Issues |
962 |
|
|
|
963 |
|
|
The body of an article MAY be empty, although posting agents |
964 |
|
|
SHOULD consider this an error condition (meriting returning |
965 |
|
|
the article to the poster for revision). A posting agent |
966 |
|
|
which does not reject such an article SHOULD issue a warning |
967 |
|
|
message to the poster and supply a non-empty body. Note |
968 |
|
|
that the separator line MUST be present even if the body is |
969 |
|
|
empty. |
970 |
|
|
|
971 |
|
|
NOTE: An empty body is probably a poster error |
972 |
|
|
except, arguably, for some control messages... and |
973 |
|
|
even they really ought to have a body explaining |
974 |
|
|
the reason for the control message. Some old |
975 |
|
|
reading agents are known to generate empty bodies |
976 |
|
|
for "cancel" control messages, so posting agents |
977 |
|
|
might opt not to reject body-less articles in such |
978 |
|
|
cases (although it would be better to fix the |
979 |
|
|
reading agents to request a body). However, some |
980 |
|
|
existing news software is known to react badly to |
981 |
|
|
body-less articles, hence the request for posting |
982 |
|
|
agents to insert a body in such cases. |
983 |
|
|
|
997 |
|
|
NOTE: A possible posting-agent-supplied body text |
998 |
|
|
(already used by one widespread posting agent) is |
999 |
|
|
"This article was probably generated by a buggy |
1000 |
|
|
news reader.". (The use of "reader" to refer to |
1001 |
|
|
the reading agent is traditional, although this |
1002 |
|
|
Draft uses more precise terminology.) |
1003 |
|
|
|
1004 |
|
|
NOTE: The requirement for the separator line even |
1005 |
|
|
in a bodyless article is inherited from MAIL, and |
1006 |
|
|
also distinguishes legitimately-bodyless articles |
1007 |
|
|
from articles accidentally truncated in the middle |
1008 |
|
|
of the headers. |
1009 |
|
|
|
1010 |
|
|
Note that an article body is a sequence of lines terminated |
1011 |
|
|
by EOLs, not arbitrary binary data, and in particular it |
1012 |
|
|
MUST end with an EOL. However, relayers SHOULD treat the |
1013 |
|
|
body of an article as an uninterpreted sequence of octets |
1014 |
|
|
(except as mandated by changes of EOL representation and by |
1015 |
|
|
control-message processing) and SHOULD avoid imposing con- |
1016 |
|
|
straints on it. See also section 4.6. |
1017 |
|
|
|
1018 |
|
|
|
1019 |
|
|
4.3.2. Body Conventions |
1020 |
|
|
|
1021 |
|
|
Although body lines can in principle be very long (see sec- |
1022 |
|
|
tion 4.6 for some discussion of length limits), posters |
1023 |
|
|
SHOULD restrict body line lengths to circa 70-75 characters. |
1024 |
|
|
On systems where text is conventionally stored with EOLs |
1025 |
|
|
only at paragraph breaks and other "hard return" points, |
1026 |
|
|
with software breaking lines as appropriate for display or |
1027 |
|
|
manipulation, posting agents SHOULD insert EOLs as necessary |
1028 |
|
|
so that posted articles comply with this restriction. |
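For a posting agent on a system that stores one paragraph per line, inserting EOLs could be done along the lines of this Python sketch, which uses the standard textwrap module. The function name and the default width of 72 are choices of the example. Note that textwrap expands tabs and trims white space at the new line breaks, so this is suitable only for ordinary prose, not preformatted material.

```python
import textwrap


def wrap_body(paragraph_lines, width=72):
    """Break over-long lines at white space; leave other lines untouched."""
    out = []
    for line in paragraph_lines:
        if len(line) <= width:
            out.append(line)  # short and empty lines pass through as-is
        else:
            out.extend(textwrap.wrap(line, width=width,
                                     break_long_words=False,
                                     break_on_hyphens=False))
    return out
```

With break_long_words disabled, a single word longer than the width (a long address, say) is left intact and will exceed the limit, which seems preferable to splitting it.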
1029 |
|
|
|
1030 |
|
|
NOTE: News originated in environments where line |
1031 |
|
|
breaks in plain text files were supplied by the |
1032 |
|
|
user, not the software. Be this good or bad, much |
1033 |
|
|
reading-agent and posting-agent software assumes |
1034 |
|
|
that news articles follow this convention, so it |
1035 |
|
|
is often inconvenient to read or respond to arti- |
1036 |
|
|
cles which violate it. The "70-75" number comes |
1037 |
|
|
from the widespread use of display devices which |
1038 |
|
|
are 80 columns wide, and the desire to leave a bit |
1039 |
|
|
of margin for quoting etc. (see below). |
1040 |
|
|
|
1041 |
|
|
Reading agents confronted with body lines much longer than |
1042 |
|
|
the available output-device width SHOULD break lines as |
1043 |
|
|
appropriate. Posters are warned that such breaks may not |
1044 |
|
|
occur exactly where the poster intends. |
1045 |
|
|
|
1046 |
|
|
NOTE: "As appropriate" would typically include |
1047 |
|
|
breaking lines when supplying the text of an arti- |
1048 |
|
|
cle to be quoted in a reply or followup, something |
1049 |
|
|
that line-breaking reading agents often neglect to |
1050 |
|
|
do now. |
1051 |
|
|
|
1063 |
|
|
Although styles vary widely, for plain text it is usual to |
1064 |
|
|
use no left margin, leave the right edge ragged, use a sin- |
1065 |
|
|
gle empty line to separate paragraphs, and employ normal |
1066 |
|
|
natural-language usage on matters such as upper/lowercase. |
1067 |
|
|
(In particular, articles SHOULD not be written entirely in |
1068 |
|
|
uppercase. In environments where posters have access only |
1069 |
|
|
to uppercase, posting agents SHOULD translate it to lower- |
1070 |
|
|
case.) |
1071 |
|
|
|
1072 |
|
|
NOTE: Most people find substantial bodies of text |
1073 |
|
|
entirely in uppercase relatively hard to read, |
1074 |
|
|
while all-lowercase text merely looks slightly |
1075 |
|
|
odd. The common association of uppercase with |
1076 |
|
|
strong emphasis adds to this. |
1077 |
|
|
|
1078 |
|
|
Tone of voice does not carry well in written text, and mis- |
1079 |
|
|
understandings are common when sarcasm, parody, or exaggera- |
1080 |
|
|
tion for humorous effect is attempted without explicit warn- |
1081 |
|
|
ing. It has become conventional to use the sequence ":-)", |
1082 |
|
|
which (on most output devices) resembles a rotated "smiley |
1083 |
|
|
face" symbol, as a marker for text not meant to be taken |
1084 |
|
|
literally, especially when humor is intended. This practice |
1085 |
|
|
aids communication and averts unintended ill-will; posters |
1086 |
|
|
are urged to use it. A variety of analogous sequences are |
1087 |
|
|
used with less-standardized meanings [Sanderson]. |
1088 |
|
|
|
1089 |
|
|
The order of arrival of news articles at a particular host |
1090 |
|
|
depends somewhat on transmission paths, and occasionally |
1091 |
|
|
articles are lost for various reasons. When responding to a |
1092 |
|
|
previous article, posters SHOULD not assume that all readers |
1093 |
|
|
understand the exact context. It is common to quote some of |
1094 |
|
|
the previous article to establish context. This SHOULD be |
1095 |
|
|
done by prefacing each quoted line (even if it is empty) |
1096 |
|
|
with the character ">". This will result in multiple levels |
1097 |
|
|
of ">" when quoted context itself contains quoted context. |
1098 |
|
|
|
1099 |
|
|
NOTE: It may seem superfluous to put a prefix on |
1100 |
|
|
empty lines, but it simplifies implementation of |
1101 |
|
|
functions such as "skip all quoted text" in read- |
1102 |
|
|
ing agents. |
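The quoting convention, and the simplification it allows in reading agents, can be shown with a short Python sketch. All three function names are hypothetical.

```python
def quote_lines(lines):
    """Prefix every line with ">", including empty lines."""
    return [">" + line for line in lines]


def is_quoted(line):
    """A line is quoted context if it begins with ">"."""
    return line.startswith(">")


def skip_quoted(lines):
    """Drop quoted lines, as a 'skip all quoted text' command might."""
    return [line for line in lines if not is_quoted(line)]
```

Because empty quoted lines also carry the prefix, is_quoted needs no look-ahead to decide whether an empty line belongs to the quotation.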
1103 |
|
|
|
1104 |
|
|
Readability is enhanced if quoted text and new text are sep- |
1105 |
|
|
arated by an empty line. |
1106 |
|
|
|
1107 |
|
|
Posters SHOULD edit quoted context to trim it down to the |
1108 |
|
|
minimum necessary. However, posting agents SHOULD not |
1109 |
|
|
attempt to enforce this by imposing overly-simplistic rules |
1110 |
|
|
like "no more than 50% of the lines should be quotes". |
1111 |
|
|
|
1112 |
|
|
NOTE: While encouraging trimming is desirable, the |
1113 |
|
|
50% rule imposed by some old posting agents is |
1114 |
|
|
both inadequate and counterproductive. Posters do |
1115 |
|
|
not respond to it by being more selective about |
1116 |
|
|
quoting; they respond by padding short responses, |
1129 |
|
|
or by using different quoting styles to defeat |
1130 |
|
|
automatic analysis. The former adds unnecessary |
1131 |
|
|
noise and volume, while the latter also defeats |
1132 |
|
|
more useful forms of automatic analysis that read- |
1133 |
|
|
ing agents might wish to do. |
1134 |
|
|
|
1135 |
|
|
NOTE: At the very least, if a minimum-unquoted |
1136 |
|
|
quota is being set, article bodies shorter than |
1137 |
|
|
(say) 20 lines, or perhaps articles which exceed |
1138 |
|
|
the quota by only a few lines, should be exempt. |
1139 |
|
|
This avoids the ridiculous situation of complain- |
1140 |
|
|
ing about a 5-line response to a 6-line quote. |
1141 |
|
|
|
1142 |
|
|
NOTE: A more subtle posting-agent rule, suggested |
1143 |
|
|
for experimental use, is to reject articles that |
1144 |
|
|
appear to contain quoted signatures (see below). |
1145 |
|
|
A quoted signature is almost certainly the result of a careless |
1146 |
|
|
poster not bothering to trim down quoted context. |
1147 |
|
|
Also, if a posting agent or followup agent pre- |
1148 |
|
|
sents an article template to the poster for edit- |
1149 |
|
|
ing, it really should take note of whether the |
1150 |
|
|
poster actually made any changes, and refrain from |
1151 |
|
|
posting an unmodified template. |
1152 |
|
|
|
1153 |
|
|
Some followup agents supply "attribution" lines for quoted |
1154 |
|
|
context, indicating where it first appeared and under whose |
1155 |
|
|
name. When multiple levels of quoting are present and |
1156 |
|
|
quoted context is edited for brevity, "inner" attribution |
1157 |
|
|
lines are not always retained. The editing process is also |
1158 |
|
|
somewhat error-prone. Reading agents (and readers) are |
1159 |
|
|
warned not to assume that attributions are accurate. |
1160 |
|
|
|
1161 |
|
|
UNRESOLVED ISSUE: Should a standard format for |
1162 |
|
|
attribution lines be defined? There is already |
1163 |
|
|
considerable diversity... but automatic news anal- |
1164 |
|
|
ysis would be substantially aided by a standard |
1165 |
|
|
convention. |
1166 |
|
|
|
1167 |
|
|
Early difficulties in inferring return addresses from arti- |
1168 |
|
|
cle headers led to "signatures": short closing texts, auto- |
1169 |
|
|
matically added to the end of articles by posting agents, |
1170 |
|
|
identifying the poster and giving his network addresses etc. |
1171 |
|
|
If a poster or posting agent does append a signature to an |
1172 |
|
|
article, the signature SHOULD be preceded with a delimiter |
1173 |
|
|
line containing (only) two hyphens (ASCII 45) followed by |
1174 |
|
|
one blank (ASCII 32). Posting agents SHOULD limit the |
1175 |
|
|
length of signatures, since verbose excess bordering on |
1176 |
|
|
abuse is common if no restraint is imposed; 4 lines is a |
1177 |
|
|
common limit. |
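A reading or posting agent could recognize the delimiter roughly as in the Python sketch below. The function names are hypothetical, and the choice to look for the last delimiter line in the body is an assumption of the example; the text above does not say which occurrence governs.

```python
SIG_DELIMITER = "-- "  # two hyphens and one blank; the blank is significant


def split_signature(body_lines):
    """Return (text_lines, signature_lines); the delimiter line is dropped."""
    for i in range(len(body_lines) - 1, -1, -1):
        if body_lines[i] == SIG_DELIMITER:
            return body_lines[:i], body_lines[i + 1:]
    return body_lines, []


def signature_too_long(signature_lines, limit=4):
    """True if the signature exceeds the limit (4 lines is a common one)."""
    return len(signature_lines) > limit
```

A line of two hyphens whose trailing blank has been stripped is not recognized, which is exactly the fragility the convention suffers from.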
1178 |
|
|
|
1179 |
|
|
NOTE: While signatures are arguably a blemish, |
1180 |
|
|
they are a well-understood convention, and convey- |
1181 |
|
|
ing the same information in headers exposes it to |
1182 |
|
|
mangling and makes it rather less conspicuous. A |
1195 |
|
|
standard delimiter line makes it possible for |
1196 |
|
|
reading agents to handle signatures specially if |
1197 |
|
|
desired. (This is unfortunately hampered by |
1198 |
|
|
extensive misunderstanding of, and misuse of, the |
1199 |
|
|
delimiter.) |
1200 |
|
|
|
1201 |
|
|
NOTE: The choice of delimiter is somewhat unfortu- |
1202 |
|
|
nate, since it relies on preservation of trailing |
1203 |
|
|
white space, but it is too well-established to |
1204 |
|
|
change. There is work underway to define a more |
1205 |
|
|
sophisticated signature scheme as part of MIME, |
1206 |
|
|
and this will presumably supersede the current |
1207 |
|
|
convention in due time. |
1208 |
|
|
|
1209 |
|
|
NOTE: Four 75-column lines of signature text is |
1210 |
|
|
300 characters, which is ample to convey name and |
1211 |
|
|
mail-address information in all but the most |
1212 |
|
|
bizarre situations. |
1213 |
|
|
|
1214 |
|
|
|
1215 |
|
|
4.4. Characters And Character Sets |
1216 |
|
|
|
1217 |
|
|
Header and body lines MAY contain any ASCII characters other |
1218 |
|
|
than CR (ASCII 13), LF (ASCII 10), and NUL (ASCII 0). |
1219 |
|
|
|
1220 |
|
|
NOTE: CR and LF are excluded because they clash |
1221 |
|
|
with common EOL conventions. NUL is excluded |
1222 |
|
|
because it clashes with the C end-of-string con- |
1223 |
|
|
vention, which is significant to most existing |
1224 |
|
|
news software. These three characters are |
1225 |
|
|
unlikely to be transmitted successfully. |
1226 |
|
|
|
1227 |
|
|
However, posters SHOULD avoid using ASCII control characters |
1228 |
|
|
except for tab (ASCII 9), formfeed (ASCII 12), and backspace |
1229 |
|
|
(ASCII 8). Tab signifies sufficient horizontal white space |
1230 |
|
|
to reach the next of a set of fixed positions; posters are |
1231 |
|
|
warned that there is no standard set of positions, so tabs |
1232 |
|
|
should be avoided if precise spacing is essential. Formfeed |
1233 |
|
|
signifies a point at which a reading agent SHOULD pause and |
1234 |
|
|
await reader interaction before displaying further text. |
1235 |
|
|
Backspace SHOULD be used only for underlining, done by a |
1236 |
|
|
sequence of underscores (ASCII 95) followed by an equal num- |
1237 |
|
|
ber of backspaces, signifying that the same number of text |
1238 |
|
|
characters following are to be underlined. Posters are |
1239 |
|
|
warned that underlining is not available on all output |
1240 |
|
|
devices and is best not relied on for essential meaning. |
1241 |
|
|
Reading agents SHOULD recognize underlining and translate it |
1242 |
|
|
to the appropriate commands for devices that support it. |
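The underlining convention (underscores, then an equal number of backspaces, then the text) can be produced and recognized as in this Python sketch. The function names are hypothetical, and a sequence whose underscore and backspace counts differ is simply left alone, which is a choice of the example.

```python
import re

_UNDERLINE = re.compile(r"(_+)(\x08+)")  # underscores, then backspaces


def underline(text):
    """Encode text as underlined, per the convention above."""
    return "_" * len(text) + "\b" * len(text) + text


def strip_underlining(line):
    """Return (plain_text, spans); spans are (start, end) offsets that a
    reading agent would render underlined on a capable device."""
    plain, spans, pos = [], [], 0
    for m in _UNDERLINE.finditer(line):
        n = len(m.group(1))
        if m.start() < pos or len(m.group(2)) != n:
            continue  # overlaps consumed text, or not the convention
        plain.append(line[pos:m.start()])
        start = sum(len(p) for p in plain)
        plain.append(line[m.end():m.end() + n])
        spans.append((start, start + n))
        pos = m.end() + n
    plain.append(line[pos:])
    return "".join(plain), spans
```

A reading agent for a device without underlining could display just the plain text and discard the spans.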
1243 |
|
|
|
1244 |
|
|
NOTE: Interpretation of almost all control charac- |
1245 |
|
|
ters is device-specific to some degree, and |
1246 |
|
|
devices differ. Tabs and underlining are sup- |
1247 |
|
|
ported, to some extent, by most modern devices and |
1248 |
|
|
reading agents, hence the cautious exemptions for |
1261 |
|
|
them. The underlining method is specified because |
1262 |
|
|
the inverse method, text and then underscores, is |
1263 |
|
|
tempting to the naive... but if sent unaltered to |
1264 |
|
|
a device that shows only the most recent of sev- |
1265 |
|
|
eral overstruck characters rather than a compos- |
1266 |
|
|
ite, the result can be utterly unreadable. |
1267 |
|
|
|
1268 |
|
|
NOTE: A common interpretation of tab is that it is |
1269 |
|
|
a request to space forward to the next position |
1270 |
|
|
whose number is one more than a multiple of 8, |
1271 |
|
|
with positions numbered sequentially starting at |
1272 |
|
|
1. (So tab positions are 9, 17, 25, ...) Reading |
1273 |
|
|
agents not constrained by existing system conven- |
1274 |
|
|
tions might wish to use this interpretation. |
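The common interpretation of tab described in the preceding NOTE amounts to the following arithmetic, shown as a Python sketch with a hypothetical function name. Columns are numbered from 1, as in the NOTE.

```python
def next_tab_stop(column, interval=8):
    """Return the 1-based column reached by a tab typed at `column`."""
    return ((column - 1) // interval + 1) * interval + 1
```

Python's built-in str.expandtabs(8) follows the same rule, so "a\tb".expandtabs(8) places the "b" in column 9.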
1275 |
|
|
|
1276 |
|
|
NOTE: It will typically be necessary for a reading |
1277 |
|
|
agent to catch and interpret formfeed, not just |
1278 |
|
|
send it to the output device. The actions per- |
1279 |
|
|
formed by typical output devices on receiving a |
1280 |
|
|
formfeed are neither adequate for nor appropriate |
1281 |
|
|
to the pause-for-interaction meaning. |
1282 |
|
|
|
1283 |
|
|
Cooperating subnets which wish to employ non-ASCII character |
1284 |
|
|
sets by using escape sequences (employing, e.g., ESC (ASCII |
1285 |
|
|
27), SO (ASCII 14), and SI (ASCII 15)) to alter the meaning |
1286 |
|
|
of superficially-ASCII characters MAY do so, but MUST use |
1287 |
|
|
MIME headers to alert reading agents to the particular char- |
1288 |
|
|
acter set(s) and escape sequences in use. A reading agent |
1289 |
|
|
SHOULD not pass such an escape sequence through, unaltered, |
1290 |
|
|
to the output device unless the agent confirms that the |
1291 |
|
|
sequence is one used to affect character sets and has reason |
1292 |
|
|
to believe that the device is capable of interpreting that |
1293 |
|
|
particular sequence properly. |
1294 |
|
|
|
1295 |
|
|
NOTE: Cooperating-subnet organizers are warned |
1296 |
|
|
that some very old relayers strip certain control |
1297 |
|
|
characters out of articles they pass along. ESC |
1298 |
|
|
is known to be among the affected characters. |
1299 |
|
|
|
1300 |
|
|
NOTE: There are now standard Internet encodings |
1301 |
|
|
for Japanese [rrr] and Vietnamese [rrr] in partic- |
1302 |
|
|
ular. |
1303 |
|
|
|
1304 |
|
|
Articles MUST not contain any octet with value exceeding |
1305 |
|
|
127, i.e. any octet that is not an ASCII character. |
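A check for the character restrictions of this section (no NUL, CR, or LF within a line, and no octet above 127) might be sketched in Python as follows. The function name is hypothetical, and the input is assumed to be the octets of a single line with its EOL already removed.

```python
FORBIDDEN = {0, 10, 13}  # NUL, LF, CR


def bad_octets(line):
    """Return (offset, value) for each octet the rules above disallow."""
    return [(i, octet) for i, octet in enumerate(line)
            if octet in FORBIDDEN or octet > 127]
```

Within a cooperating subnet that has agreed to relax the 8-bit rule, only the FORBIDDEN test would apply.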
1306 |
|
|
|
1307 |
|
|
NOTE: This rule, like others, may be relaxed by |
1308 |
|
|
unanimous consent of the members of a cooperating |
1309 |
|
|
subnet, provided suitable precautions are taken to |
1310 |
|
|
ensure that rule-violating articles do not leak |
1311 |
|
|
out of the subnet. (This has already been done in |
1312 |
|
|
many areas where ASCII is not adequate for the |
1313 |
|
|
local language(s).) Beware that articles contain- |
1314 |
|
|
ing non-ASCII octets in headers are a violation of |
1327 |
|
|
the MAIL specifications and are not valid MAIL |
1328 |
|
|
messages. MIME offers a way to encode non-ASCII |
1329 |
|
|
characters in ASCII for use in headers; see sec- |
1330 |
|
|
tion 4.5. |
1331 |
|
|
|
1332 |
|
|
NOTE: While there is great interest in using 8-bit |
1333 |
|
|
character sets, not all software can yet handle |
1334 |
|
|
them correctly. Hence the restriction to cooper- |
1335 |
|
|
ating subnets. MIME encodings can be used to |
1336 |
|
|
transmit such characters while remaining within |
1337 |
|
|
the octet restriction. |
1338 |
|
|
|
1339 |
|
|
In anticipation of the day when it is possible to use non- |
1340 |
|
|
ASCII characters safely anywhere, and to provide for the |
1341 |
|
|
(substantial) cooperating subnets that are already using |
1342 |
|
|
them, transmission paths SHOULD treat news articles as unin- |
1343 |
|
|
terpreted sequences of octets (except perhaps for transfor- |
1344 |
|
|
mations between EOL representations) and relayers SHOULD |
1345 |
|
|
treat non-ASCII characters in articles as ordinary charac- |
1346 |
|
|
ters. |
1347 |
|
|
|
1348 |
|
|
NOTE: 8-bit enthusiasts are warned that not all |
1349 |
|
|
software conforms to these recommendations yet. |
1350 |
|
|
In particular, standard NNTP [rrr] is a 7-bit pro- |
1351 |
|
|
tocol, and there may be implementations which |
1352 |
|
|
enforce this rule. Be warned, also, that it will |
1353 |
|
|
never be safe to send raw binary data in the body |
1354 |
|
|
of news articles, because changes of EOL represen- |
1355 |
|
|
tation may (will!) corrupt it. |
1356 |
|
|
|
1357 |
|
|
Except where cooperating subnets permit more direct |
1358 |
|
|
approaches, MIME [rrr] headers and encodings SHOULD be used |
1359 |
|
|
to transmit non-ASCII content using ASCII characters; see |
1360 |
|
|
section 4.5, appendix B, and the MIME RFCs for details. If |
1361 |
|
|
article content can be expressed in ASCII, it SHOULD be. |
1362 |
|
|
Failing that, the order of preference for character sets is |
1363 |
|
|
that described in MIME [rrr]. |
1364 |
|
|
|
1365 |
|
|
NOTE: Using the MIME facilities, it is possible to |
1366 |
|
|
transmit ANY character set, and ANY form of binary |
1367 |
|
|
data, using only ASCII characters. Equally impor- |
1368 |
|
|
tant, such articles are self-describing and the |
1369 |
|
|
reading agent can tell which octet-to-symbol map- |
1370 |
|
|
ping is intended! Designation of some preferred |
1371 |
|
|
character sets is intended to minimize the number |
1372 |
|
|
of character sets that a reading agent must under- |
1373 |
|
|
stand in order to display most articles properly. |
1374 |
|
|
|
1375 |
|
|
Articles containing non-ASCII characters, articles using |
1376 |
|
|
ASCII characters (values 0 through 127) to refer to non- |
1377 |
|
|
ASCII symbols, and articles using escape sequences to shift |
1378 |
|
|
character sets SHOULD include MIME headers indicating which |
1379 |
|
|
character set(s) and conventions are being used, and MUST do |
1380 |
|
|
so unless such articles are strictly confined to a |
1393 |
|
|
cooperating subnet which has its own pre-agreed conventions. |
1394 |
|
|
MIME encodings are preferred over all these techniques. If |
1395 |
|
|
it comes to a relayer's attention that it is being asked to |
1396 |
|
|
pass an article using such techniques outward across what it |
1397 |
|
|
knows to be the boundary of such a cooperating subnet, it |
1398 |
|
|
MUST report this error to its administrator, and MAY refuse |
1399 |
|
|
to pass the article beyond the subnet boundary. If it does |
1400 |
|
|
pass the article, it MUST re-encode it with MIME encodings |
1401 |
|
|
to make it conform to this Draft. |
1402 |
|
|
|
1403 |
|
|
NOTE: Such re-encoding is a non-trivial task, due |
1404 |
|
|
to MIME rules such as the prohibition of nested |
1405 |
|
|
encodings. It's not just a matter of pouring the |
1406 |
|
|
body through a simple filter. |
1407 |
|
|
|
1408 |
|
|
Reading agents SHOULD note MIME headers and attempt to show |
1409 |
|
|
the reader the closest possible approximation to the |
1410 |
|
|
intended content. They SHOULD not just send the octets of |
1411 |
|
|
the article to the output device unaltered, unless there is |
1412 |
|
|
reason to believe that the output device will indeed inter- |
1413 |
|
|
pret them correctly. Reading agents MUST not pass ASCII |
1414 |
|
|
control characters or escape sequences, other than as dis- |
1415 |
|
|
cussed above, unaltered to the output device; only by chance |
1416 |
|
|
would the result be the desired one, and there is serious |
1417 |
|
|
potential for harmful side effects, either accidental or |
1418 |
|
|
malicious. |
1419 |
|
|
|
1420 |
|
|
NOTE: Exactly what to do with unwanted control |
1421 |
|
|
characters/sequences depends on the philosophy of |
1422 |
|
|
the reading agent, but passing them straight to |
1423 |
|
|
the output device is almost always wrong. If the |
1424 |
|
|
reading agent wants to mark the presence of such a |
1425 |
|
|
character/sequence in circumstances where only |
1426 |
|
|
ASCII printable characters are available, trans- |
1427 |
|
|
lating it to "#" might be a suitable method; "#" |
1428 |
|
|
is a conspicuous character seldom used in normal |
1429 |
|
|
text. |
1430 |
|
|
|
1431 |
|
|
NOTE: Reading agents should be aware that many old |
1432 |
|
|
output devices (or the transmission paths to them) |
1433 |
|
|
zero out the top bit of octets sent to them. This |
1434 |
|
|
can transform non-ASCII characters into ASCII con- |
1435 |
|
|
trol characters. |
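
NOTE: As an illustration only, the "#" substitution and the
top-bit hazard described above might be sketched in Python as
follows. The names PASS_THROUGH, sanitize_for_display, and
strips_top_bit are hypothetical, and the choice of tab and line
feed as the only control characters let through is an
assumption; the earlier discussion this section refers to is
the authority on which ones are acceptable.

```python
# Assumption: tab and line feed are the only control octets passed through.
PASS_THROUGH = {0x09, 0x0A}


def is_control(value):
    """True for ASCII control octets (0-31 and DEL)."""
    return value < 0x20 or value == 0x7F


def sanitize_for_display(octets, marker=b"#", strips_top_bit=False):
    """Replace unwanted control octets with a visible marker.

    With strips_top_bit=True, an octet is judged by what an output device
    that zeroes the top bit would actually receive.
    """
    out = bytearray()
    for value in octets:
        seen = value & 0x7F if strips_top_bit else value
        if is_control(seen) and seen not in PASS_THROUGH:
            out += marker
        else:
            out.append(value)
    return bytes(out)
```

For instance, sanitize_for_display(b"ok\x1b[2Jdone\n") yields
b"ok#[2Jdone\n", and the octet 0x9B is replaced only when
strips_top_bit is set, since it turns into ESC (0x1B) on such a
device.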
1436 |
|
|
|
1437 |
|
|
Followup agents MUST be careful to apply appropriate trans- |
1438 |
|
|
formations of representation to the outbound followup as |
1439 |
|
|
well as the inbound precursor. A followup to an article |
1440 |
|
|
containing non-ASCII material is very likely to contain non- |
1441 |
|
|
ASCII material itself. |
1457 |
|
|
|
1458 |
|
|
|
1459 |
|
|
4.5. Non-ASCII Characters In Headers |
1460 |
|
|
|
1461 |
|
|
All octets found in headers MUST be ASCII characters. How- |
1462 |
|
|
ever, it is desirable to have a way of encoding non-ASCII |
1463 |
|
|
characters, especially in "human-readable" headers such as |
1464 |
|
|
Subject. MIME [rrr] provides a way to do this. Full |
1465 |
|
|
details may be found in the MIME specifications; herewith a |
1466 |
|
|
quick summary to alert software authors to the issues... |
1467 |
|
|
|
1468 |
|
|
encoded-word = "=?" charset "?" encoding "?" codes "?=" |
1469 |
|
|
charset = 1*tag-char |
1470 |
|
|
encoding = 1*tag-char |
1471 |
|
|
tag-char = <ASCII printable character except !()<>@,;:\"[]/?=> |
1472 |
|
|
codes = 1*code-char |
1473 |
|
|
code-char = <ASCII printable character except ?> |
1474 |
|
|
|
1475 |
|
|
An encoded word is a sequence of ASCII printable characters |
1476 |
|
|
that specifies the character set, encoding method, and bits |
1477 |
|
|
of (potentially) non-ASCII characters. Encoded words are |
1478 |
|
|
allowed only in certain positions in certain headers. Spe- |
1479 |
|
|
cific headers impose restrictions on the content of encoded |
1480 |
|
|
words beyond that specified in this section. Posting agents |
1481 |
|
|
MUST ensure that any material resembling an encoded word |
1482 |
|
|
(complete with all delimiters), in a context where encoded |
1483 |
|
|
words may appear, really is an encoded word. |
1484 |
|
|
|
1485 |
|
|
NOTE: The syntax is a bit ugly, but it was |
1486 |
|
|
designed to minimize chances of confusion with |
1487 |
|
|
legitimate header contents, and to satisfy diffi- |
1488 |
|
|
cult constraints on use within existing headers. |
1489 |
|
|
|
1490 |
|
|
An encoded word MUST not be more than 75 octets long. Each |
1491 |
|
|
line of a header containing encoded word(s) MUST be at most |
1492 |
|
|
76 octets long, not counting the EOL. |
1493 |
|
|
|
1494 |
|
|
NOTE: These limits are meant to bound the looka- |
1495 |
|
|
head needed to determine whether text that begins |
1496 |
|
|
"=?" is really an encoded word. |
1497 |
|
|
|
1498 |
|
|
The details of charsets and encodings are defined by MIME |
1499 |
|
|
[rrr]; the sequence of preferred character sets is the same |
1500 |
|
|
as MIME's. Encoded words SHOULD not be used for content |
1501 |
|
|
expressible in ASCII. |
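
NOTE: A minimal Python sketch of recognizing an encoded word
from the syntax and the 75-octet limit given above. It assumes
that "ASCII printable character" means octets 33 through 126
(space excluded); parse_encoded_word and the other names are
hypothetical. It checks form only and does not decode anything
or validate the charset and encoding names against MIME.

```python
import re

# Characters the grammar excludes from tag-char: !()<>@,;:\"[]/?=
_TAG_EXCLUDED = '!()<>@,;:\\"[]/?='
# Assumption: "ASCII printable" is taken to mean octets 33..126.
_PRINTABLE = [chr(c) for c in range(33, 127)]
_TAG_CLASS = "".join(re.escape(c) for c in _PRINTABLE if c not in _TAG_EXCLUDED)
_CODE_CLASS = "".join(re.escape(c) for c in _PRINTABLE if c != "?")

# encoded-word = "=?" charset "?" encoding "?" codes "?="
ENCODED_WORD = re.compile(
    r"=\?([%s]+)\?([%s]+)\?([%s]+)\?=" % (_TAG_CLASS, _TAG_CLASS, _CODE_CLASS)
)

MAX_ENCODED_WORD = 75  # octets


def parse_encoded_word(token):
    """Return (charset, encoding, codes) if token is an encoded word, else None."""
    if len(token) > MAX_ENCODED_WORD:
        return None
    match = ENCODED_WORD.fullmatch(token)
    if match is None:
        return None
    return match.groups()
```

Because the length is checked first, a scanner that sees "=?"
never needs to look more than 75 octets ahead before deciding.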
1502 |
|
|
|
1503 |
|
|
When an encoded word is used, other than in a newsgroup name |
1504 |
|
|
(see section 5.5), it MUST be separated from any adjacent |
1505 |
|
|
non-space characters (including other encoded words) by |
1506 |
|
|
white space. Reading agents displaying the contents of |
1507 |
|
|
encoded words (as opposed to their encoded form) should |
1508 |
|
|
ignore white space adjacent to encoded words. |
1509 |
|
|
|
1510 |
|
|
UNRESOLVED ISSUE: Should this section be deleted |
1511 |
|
|
entirely, or made much more terse? The material |
1512 |
|
|
is relevant, but too complex to discuss fully. |
1524 |
|
|
|
1525 |
|
|
NOTE: The deletion of intervening white space per- |
1526 |
|
|
mits using multiple encoded words, implicitly con- |
1527 |
|
|
catenated by the deletion, to encode text that |
1528 |
|
|
will not fit within a single 75-character encoded |
1529 |
|
|
word. |
1530 |
|
|
|
1531 |
|
|
Reading-agent implementors are warned that although this |
1532 |
|
|
Draft completely specifies where encoded words may appear in |
1533 |
|
|
the headers it defines, there are other headers (e.g. the |
1534 |
|
|
MIME Content-Description header) that MAY contain them. |
1535 |
|
|
|
1536 |
|
|
|
1537 |
|
|
4.6. Size Limits |
1538 |
|
|
|
1539 |
|
|
Implementations SHOULD avoid fixed constraints on the sizes |
1540 |
|
|
of lines within an article and on the size of the entire |
1541 |
|
|
article. |
1542 |
|
|
|
1543 |
|
|
Relayers SHOULD treat the body of an article as an uninter- |
1544 |
|
|
preted sequence of octets (except as mandated by changes of |
1545 |
|
|
EOL representation and processing of control messages), not |
1546 |
|
|
to be altered or constrained in any way. |
1547 |
|
|
|
1548 |
|
|
If it is absolutely necessary for an implementation to |
1549 |
|
|
impose a limit on the length of header lines, body lines, or |
1550 |
|
|
header logical lines, that limit shall be at least 1000 |
1551 |
|
|
octets, including EOL representations. Relayers and trans- |
1552 |
|
|
mission paths confronted with lines beyond their internal |
1553 |
|
|
limits (if any) MUST not simply inject EOLs at random |
1554 |
|
|
places; they MAY break headers (as described in section 4.2.3) as a |
1555 |
|
|
last resort, and otherwise they MUST either pass the long |
1556 |
|
|
lines through unaltered, or refuse to pass the article at |
1557 |
|
|
all (see section 9.1 for further discussion). |
1558 |
|
|
|
1559 |
|
|
NOTE: The limit here is essentially the same mini- |
1560 |
|
|
mum as that specified for SMTP mail in RFC 821 |
1561 |
|
|
[rrr]. Implementors are warned that Path (see |
1562 |
|
|
section 5.6) and References (see section 6.5) |
1563 |
|
|
headers, in particular, often become several hun- |
1564 |
|
|
dred characters long, so 1000 is not an overly |
1565 |
|
|
generous limit. |
1566 |
|
|
|
1567 |
|
|
All implementations MUST be able to handle an article |
1568 |
|
|
totalling at least 65,000 octets, including headers and EOL |
1569 |
|
|
representations, gracefully and efficiently. All implemen- |
1570 |
|
|
tations SHOULD be able to handle an article totalling at |
1571 |
|
|
least 1,000,000 (one million) octets, including headers and |
1572 |
|
|
EOL representations, gracefully and efficiently. "Grace- |
1573 |
|
|
fully and efficiently" is intended to preclude not only |
1574 |
|
|
failures, but also major loss of performance, serious prob- |
1575 |
|
|
lems in error recovery, or resource consumption beyond what |
1576 |
|
|
is reasonably necessary. |
1590 |
|
|
|
1591 |
|
|
NOTE: The intent here is to prohibit lowering the |
1592 |
|
|
existing de-facto limit any further, while |
1593 |
|
|
strongly encouraging movement towards a higher |
1594 |
|
|
one. Actually, although improvements are desir- |
1595 |
|
|
able in some cases, much news software copes rea- |
1596 |
|
|
sonably well with very large articles. The same |
1597 |
|
|
cannot be said of the communications software and |
1598 |
|
|
protocols used to transmit news from one host to |
1599 |
|
|
another, especially when slow communications links |
1600 |
|
|
are involved. Occasional huge articles that |
1601 |
|
|
appear now (by accident or through ignorance) typ- |
1602 |
|
|
ically leave trails of failing software, system |
1603 |
|
|
problems, and irate administrators in their wake. |
1604 |
|
|
|
1605 |
|
|
NOTE: It is intended that the successor to this |
1606 |
|
|
Draft will raise the "MUST" limit to 1,000,000 and |
1607 |
|
|
the "SHOULD" limit still further. |
1608 |
|
|
|
1609 |
|
|
Posters SHOULD limit posted articles to at most 60,000 |
1610 |
|
|
octets, including headers and EOL representations, unless |
1611 |
|
|
the articles are being posted only within a cooperating sub- |
1612 |
|
|
net which is known to be capable of handling larger articles |
1613 |
|
|
gracefully. Posting agents presented with a large article |
1614 |
|
|
SHOULD warn the poster and request confirmation. |
1615 |
|
|
|
1616 |
|
|
NOTE: The difference between this and the earlier |
1617 |
|
|
"MUST" limit is margin for header growth, differ- |
1618 |
|
|
ing EOL representations, and transmission over- |
1619 |
|
|
heads. |
1620 |
|
|
|
1621 |
|
|
NOTE: Disagreeable though these limits are, it is |
1622 |
|
|
a fact that in current networks, an article larger |
1623 |
|
|
than 64K (after header growth etc.) simply is not |
1624 |
|
|
transmitted reliably. Note also the comments |
1625 |
|
|
above on the trauma caused by single extremely- |
1626 |
|
|
large articles now; the problems are real and cur- |
1627 |
|
|
rent. These problems arguably should be fixed, |
1628 |
|
|
but this will not happen network-wide in the imme- |
1629 |
|
|
diate future. Hence the restriction of larger |
1630 |
|
|
articles to cooperating subnets, for now. |
1631 |
|
|
|
1632 |
|
|
Posters using non-ASCII characters in their text MUST take |
1633 |
|
|
into account the overhead involved in MIME encoding, unless |
1634 |
|
|
the article's propagation will be entirely limited to a |
1635 |
|
|
cooperating subnet which does not use MIME encodings for |
1636 |
|
|
non-ASCII characters. For example, MIME base64 encoding |
1637 |
|
|
involves growth by a factor of approximately 4/3, so an |
1638 |
|
|
article which would likely have to use this encoding should |
1639 |
|
|
be at most about 45,000 octets before encoding. |
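
NOTE: The arithmetic behind the 45,000 figure can be checked
with a short Python sketch. The function names are
hypothetical. Python's standard base64.encodebytes is used
here only as one convenient encoder; it adds a single newline
octet after every 76 output characters, which is an assumption
about EOL representation rather than anything this Draft
requires.

```python
import base64

POSTING_LIMIT = 60000  # octets, the SHOULD limit for posted articles


def raw_budget(limit=POSTING_LIMIT):
    """Largest raw size whose bare base64 text (no line breaks) fits in limit."""
    return (limit // 4) * 3


def encoded_size(raw_octets):
    """Size of the base64 text, including one newline octet per output line."""
    return len(base64.encodebytes(bytes(raw_octets)))
```

raw_budget() gives 45000, while encoded_size(45000) gives
60790: the line breaks alone add 790 octets, before any
headers are counted. This is why the figure above is "about"
45,000 and a cautious poster would leave some margin below it.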
1640 |
|
|
|
1641 |
|
|
Posters SHOULD use MIME "message/partial" conventions to |
1642 |
|
|
facilitate automatic reassembly of a large document split |
1643 |
|
|
into smaller pieces for posting. It is recommended that the |
1644 |
|
|
content identifier used should be a message ID, generated by |
1657 |
|
|
the same means as article message IDs (see section 5.3), and |
1658 |
|
|
that all parts should have a See-Also header (see section |
1659 |
|
|
6.16) giving the message IDs of at least the previous parts |
1660 |
|
|
and preferably all the parts. |
1661 |
|
|
|
1662 |
|
|
NOTE: See-Also is more correct for this purpose |
1663 |
|
|
than References, although References is in common |
1664 |
|
|
use today (with less-formal reassembly arrange- |
1665 |
|
|
ments). MIME reassemblers should probably examine |
1666 |
|
|
articles suggested by References headers if See- |
1667 |
|
|
Also headers are not present to indicate the |
1668 |
|
|
whereabouts of the other parts of "mes- |
1669 |
|
|
sage/partial" articles. |
1670 |
|
|
|
1671 |
|
|
To repeat: implementations SHOULD avoid fixed constraints on |
1672 |
|
|
the sizes of lines within an article and on the size of the |
1673 |
|
|
entire article. |
1674 |
|
|
|
1675 |
|
|
|
1676 |
|
|
4.7. Example |
1677 |
|
|
|
1678 |
|
|
Here is a sample article: |
1679 |
|
|
|
1680 |
|
|
From: jerry@eagle.ATT.COM (Jerry Schwarz) |
1681 |
|
|
Path: cbosgd!mhuxj!mhuxt!eagle!jerry |
1682 |
|
|
Newsgroups: news.announce |
1683 |
|
|
Subject: Usenet Etiquette -- Please Read |
1684 |
|
|
Message-ID: <642@eagle.ATT.COM> |
1685 |
|
|
Date: Mon, 17 Jan 1994 11:14:55 -0500 (EST) |
1686 |
|
|
Followup-To: news.misc |
1687 |
|
|
Expires: Wed, 19 Jan 1994 00:00:00 -0500 |
1688 |
|
|
Organization: AT&T Bell Laboratories, Murray Hill |
1689 |
|
|
|
1690 |
|
|
body |
1691 |
|
|
body |
1692 |
|
|
body |
1693 |
|
|
|
1694 |
|
|
|
1695 |
|
|
|
1696 |
|
|
5. Mandatory Headers |
1697 |
|
|
|
1698 |
|
|
An article MUST have one, and only one, of each of the fol- |
1699 |
|
|
lowing headers: Date, From, Message-ID, Subject, Newsgroups, |
1700 |
|
|
Path. |
1701 |
|
|
|
1702 |
|
|
NOTE: MAIL specifies (if read most carefully) that |
1703 |
|
|
there must be exactly one Date header and exactly |
1704 |
|
|
one From header, but otherwise does not restrict |
1705 |
|
|
multiple appearances of headers. (Notably, it |
1706 |
|
|
permits multiple Message-ID headers!) This |
1707 |
|
|
appears singularly useless, or even harmful, in |
1708 |
|
|
the context of news, and much current news soft- |
1709 |
|
|
ware will not tolerate multiple appearances of |
1710 |
|
|
mandatory headers. |
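
NOTE: As a sketch only, a check for the "one, and only one"
rule might look like this in Python. The function names are
hypothetical, header names are assumed to be matched without
regard to case, and the input is assumed to be the header
portion of an article with continuation lines beginning with
white space.

```python
MANDATORY = ("Date", "From", "Message-ID", "Subject", "Newsgroups", "Path")


def header_names(header_block):
    """Yield the name of each header; continuation lines are skipped."""
    for line in header_block.splitlines():
        if not line or line[0] in " \t":
            continue
        name, sep, _ = line.partition(":")
        if sep:
            yield name


def mandatory_header_problems(header_block):
    """Return {name: count} for mandatory headers not appearing exactly once."""
    counts = {name: 0 for name in MANDATORY}
    # Assumption: case is not significant when comparing header names.
    canonical = {name.lower(): name for name in MANDATORY}
    for name in header_names(header_block):
        key = canonical.get(name.lower())
        if key is not None:
            counts[key] += 1
    return {name: n for name, n in counts.items() if n != 1}
```

An empty result means every mandatory header appears exactly
once; a count of 0 marks a missing header and a count above 1
a duplicated one.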
1722 |
|
|
|
1723 |
|
|
Note also that there are situations, discussed in the rele- |
1724 |
|
|
vant parts of section 6, where References, Sender, or |
1725 |
|
|
Approved headers are mandatory. |
1726 |
|
|
|
1727 |
|
|
In the discussions of the individual headers, the content of |
1728 |
|
|
each is specified using the syntax notation. The convention |
1729 |
|
|
used is that the content of, for example, the Subject header |
1730 |
|
|
is defined as <Subject-content>. |
1731 |
|
|
|
1732 |
|
|
|
1733 |
|
|
5.1. Date |
1734 |
|
|
|
1735 |
|
|
The Date header contains the date and time when the article |
1736 |
|
|
was submitted for transmission: |
1737 |
|
|
|
1738 |
|
|
Date-content = [ weekday "," space ] date space time |
1739 |
|
|
weekday = "Mon" / "Tue" / "Wed" / "Thu" |
1740 |
|
|
/ "Fri" / "Sat" / "Sun" |
1741 |
|
|
date = day space month space year |
1742 |
|
|
day = 1*2digit |
1743 |
|
|
month = "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun" |
1744 |
|
|
/ "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec" |
1745 |
|
|
year = 4digit / 2digit |
1746 |
|
|
time = hh ":" mm [ ":" ss ] space timezone |
1747 |
|
|
timezone = "UT" / "GMT" |
1748 |
|
|
/ ( "+" / "-" ) hh mm [ space "(" zone-name ")" ] |
1749 |
|
|
hh = 2digit |
1750 |
|
|
mm = 2digit |
1751 |
|
|
ss = 2digit |
1752 |
|
|
zone-name = 1*( <ASCII printable character except ()\> / space ) |
1753 |
|
|
|
1754 |
|
|
This is a restricted subset of the MAIL date format. |
1755 |
|
|
|
1756 |
|
|
If a weekday is given, it MUST be consistent with the date. |
1757 |
|
|
The modern Gregorian calendar is used, and dates MUST be |
1758 |
|
|
consistent with its usual conventions; for example, if the |
1759 |
|
|
month is May, the day must be between 1 and 31 inclusive. |
1760 |
|
|
The year SHOULD be given as four digits, and posting agents |
1761 |
|
|
SHOULD enforce this; however, relayers MUST accept the two- |
1762 |
|
|
digit form, and MUST interpret it as having the implicit |
1763 |
|
|
prefix "19". |
1764 |
|
|
|
1765 |
|
|
NOTE: Two-digit year numbers can, should, and must |
1766 |
|
|
be phased out by 1999. |
1767 |
|
|
|
1768 |
|
|
The time is given on the 24-hour clock, e.g. two hours |
1769 |
|
|
before midnight is "22:00" or "22:00:00". The hh must be |
1770 |
|
|
between 00 and 23 inclusive, the mm between 00 and 59 inclu- |
1771 |
|
|
sive, and the ss between 00 and 61 inclusive. |
1772 |
|
|
|
1773 |
|
|
NOTE: Leap seconds very occasionally result in |
1774 |
|
|
minutes that are 61 or 62 seconds long. |
1788 |
|
|
|
1789 |
|
|
The date and time SHOULD be given in the poster's local |
1790 |
|
|
timezone, including a specification of that timezone as a |
1791 |
|
|
numeric offset (which SHOULD include the timezone name, e.g. |
1792 |
|
|
"EST", supplied in parentheses like a MAIL comment). If not |
1793 |
|
|
local, they MUST be given in Universal Time (abbreviated "UT"; |
1794 |
|
|
"GMT" is a historical synonym for "UT"). The timezone name |
1795 |
|
|
in parentheses, if present, is a comment; software MUST |
1796 |
|
|
ignore it, except that reading agents might wish to display |
1797 |
|
|
it to the reader. Timezone names other than "UT" and "GMT" |
1798 |
|
|
MUST appear only in the comment. |
1799 |
|
|
|
1800 |
|
|
NOTE: Attempts to deal with a full set of timezone |
1801 |
|
|
names have all foundered on the vast number of |
1802 |
|
|
such names in use and the duplications (for exam- |
1803 |
|
|
ple, there are at least FIVE different timezones |
1804 |
|
|
called "EST" by somebody). Even the limited set |
1805 |
|
|
of North American zone names authorized by MAIL is |
1806 |
|
|
subject to confusion and misinterpretation. Hence |
1807 |
|
|
the flat ban on non-UT timezone names except as |
1808 |
|
|
comments. |
1809 |
|
|
|
1810 |
|
|
NOTE: RFC 1036 specified that use of GMT (aka UT, |
1811 |
|
|
UTC) was preferred. However, the local time (in |
1812 |
|
|
the poster's timezone) is arguably information of |
1813 |
|
|
possible interest to the reader, and this requires |
1814 |
|
|
some indication of the poster's timezone. Numeric |
1815 |
|
|
offsets are an unambiguous way of doing this, and |
1816 |
|
|
their use was indeed sanctioned by RFC 1036 (that |
1817 |
|
|
is, this is a change of preference only). |
1818 |
|
|
|
1819 |
|
|
NOTE: There is frequent confusion, including |
1820 |
|
|
errors in some news software, regarding the sign |
1821 |
|
|
of numeric timezones. Zones west of Greenwich |
1822 |
|
|
have negative offsets. For example, North Ameri- |
1823 |
|
|
can Eastern Standard Time is zone -0500 and North |
1824 |
|
|
American Eastern Daylight Time is zone -0400. |
1825 |
|
|
|
1826 |
|
|
NOTE: Implementors are warned that the hh in a |
1827 |
|
|
timezone can go up to about 14; it is not limited |
1828 |
|
|
to 12. This is because the International Date |
1829 |
|
|
Line does not run exactly along the boundary |
1830 |
|
|
between zone -1200 and zone +1200. |
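
NOTE: The Date-content rules above can be gathered into a
small validator. The Python sketch below is illustrative, not
normative. It assumes that "space" in the syntax is a single
blank and that a zone name consists of octets 32 through 126;
parse_date_content is a hypothetical name. The zone hh is
deliberately not capped at 12.

```python
import datetime
import re

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

DATE_CONTENT = re.compile(
    r"(?:(?P<weekday>" + "|".join(WEEKDAYS) + r"), )?"
    r"(?P<day>[0-9]{1,2}) (?P<month>" + "|".join(MONTHS) + r") "
    r"(?P<year>[0-9]{4}|[0-9]{2}) "
    r"(?P<hh>[0-9]{2}):(?P<mm>[0-9]{2})(?::(?P<ss>[0-9]{2}))? "
    r"(?:UT|GMT|(?P<sign>[+-])(?P<zh>[0-9]{2})(?P<zm>[0-9]{2})"
    r"(?: \((?P<zone>[^()\\]+)\))?)"
)


def parse_date_content(text):
    """Return (year, month, day, hh, mm, ss, offset_minutes) or raise ValueError."""
    m = DATE_CONTENT.fullmatch(text)
    if m is None:
        raise ValueError("does not match Date-content syntax")
    year = int(m["year"])
    if len(m["year"]) == 2:
        year += 1900  # two-digit years carry the implicit prefix "19"
    month = MONTHS.index(m["month"]) + 1
    day = int(m["day"])
    date = datetime.date(year, month, day)  # ValueError for e.g. 31 Feb
    if m["weekday"] and WEEKDAYS[date.weekday()] != m["weekday"]:
        raise ValueError("weekday inconsistent with date")
    hh, mm, ss = int(m["hh"]), int(m["mm"]), int(m["ss"] or 0)
    if hh > 23 or mm > 59 or ss > 61:
        raise ValueError("time out of range")
    offset = 0  # UT and GMT mean a zero offset
    if m["sign"]:
        zone = m["zone"]
        if zone is not None and not all(32 <= ord(c) <= 126 for c in zone):
            raise ValueError("zone name must be printable ASCII")
        if int(m["zm"]) > 59:
            raise ValueError("zone minutes out of range")
        offset = int(m["zh"]) * 60 + int(m["zm"])
        if m["sign"] == "-":
            offset = -offset  # zones west of Greenwich are negative
    return (year, month, day, hh, mm, ss, offset)
```

The zone name in parentheses is checked for form and then
ignored, in keeping with its status as a comment.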
1831 |
|
|
|
1832 |
|
|
NOTE: The comments in section 2.6 regarding trans- |
1833 |
|
|
lation to other languages are relevant here. The |
1834 |
|
|
Date-content format, and the spellings of its com- |
1835 |
|
|
ponents, as found in articles themselves, are |
1836 |
|
|
always as defined in this Draft, regardless of the |
1837 |
|
|
language used to interact with readers and |
1838 |
|
|
posters. Reading and posting agents should trans- |
1839 |
|
|
late as appropriate. Actually, even English- |
1840 |
|
|
language reading and posting agents will probably |
1841 |
|
|
want to do some degree of translation on dates, if |
1842 |
|
|
only to abbreviate the lengthy format and |
1855 |
|
|
(perhaps) translate to and from the reader's time- |
1856 |
|
|
zone. |
1857 |
|
|
|
1858 |
|
|
|
1859 |
|
|
5.2. From |
1860 |
|
|
|
1861 |
|
|
The From header contains the electronic address, and possi- |
1862 |
|
|
bly the full name, of the article's author: |
1863 |
|
|
|
1864 |
|
|
From-content = address [ space "(" paren-phrase ")" ] |
1865 |
|
|
/ [ plain-phrase space ] "<" address ">" |
1866 |
|
|
paren-phrase = 1*( paren-char / space / encoded-word ) |
1867 |
|
|
paren-char = <ASCII printable character except ()<>\> |
1868 |
|
|
plain-phrase = plain-word *( space plain-word ) |
1869 |
|
|
plain-word = unquoted-word / quoted-word / encoded-word |
1870 |
|
|
unquoted-word = 1*unquoted-char |
1871 |
|
|
unquoted-char = <ASCII printable character except !()<>@,;:\".[]> |
1872 |
|
|
quoted-word = quote 1*( quoted-char / space ) quote |
1873 |
|
|
quote = <" (ASCII 34)> |
1874 |
|
|
quoted-char = <ASCII printable character except "()<>\> |
1875 |
|
|
address = local-part "@" domain |
1876 |
|
|
local-part = unquoted-word *( "." unquoted-word ) |
1877 |
|
|
domain = unquoted-word *( "." unquoted-word ) |
1878 |
|
|
|
1879 |
|
|
(Encoded words are described in section 4.5.) The full name |
1880 |
|
|
is distinguished from the electronic address either by |
1881 |
|
|
enclosing the former in parentheses (making it resemble a |
1882 |
|
|
MAIL comment, after the address) or by enclosing the latter |
1883 |
|
|
in angle brackets. The second form is preferred. In the |
1884 |
|
|
first form, encoded words inside the full name MUST be com- |
1885 |
|
|
posed entirely of <paren-char>s. In the second form, |
1886 |
|
|
encoded words inside the full name may not contain charac- |
1887 |
|
|
ters other than letters (of either case), digits, and the |
1888 |
|
|
characters "!", "*", "+", "-", "/", "=", and "_". The local |
1889 |
|
|
part is case-sensitive (except that all case counterparts of |
1890 |
|
|
"postmaster" are deemed equivalent), the domain is case- |
1891 |
|
|
insensitive, and all other parts of the From content are |
1892 |
|
|
comments which MUST be ignored by news software (except |
1893 |
|
|
insofar as reading agents may wish to display them to the |
1894 |
|
|
reader). Posters and posting agents MUST restrict them- |
1895 |
|
|
selves to this subset of the MAIL From syntax; relayers MAY |
1896 |
|
|
accept a broader subset, but see the discussion in section |
1897 |
|
|
9.1. |
1898 |
|
|
|
1899 |
|
|
NOTE: The syntax here is a restricted subset of |
1900 |
|
|
the MAIL From syntax, with quoting particularly |
1901 |
|
|
restricted, for simple parsing. In particular, |
1902 |
|
|
the presence of "<" in the From content indicates |
1903 |
|
|
that the second form is being used, otherwise the |
1904 |
|
|
first form is being used. The major restrictions |
1905 |
|
|
here are those already de-facto imposed by exist- |
1906 |
|
|
ing software. |
1920 |
|
|
|
1921 |
|
|
NOTE: Overly-lenient posting agents sometimes per- |
1922 |
|
|
mit the second form with a full name containing |
1923 |
|
|
"(" or ")", but it is extremely rare for a full |
1924 |
|
|
name to contain "<" or ">" even in mail. Accord- |
1925 |
|
|
ingly, reading agents wishing to robustly deter- |
1926 |
|
|
mine which form is in use in a particular article |
1927 |
|
|
should key on the presence or absence of "<", not |
1928 |
|
|
the presence or absence of "(". |
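
NOTE: A reading agent following the advice above might
separate the address from the full name roughly as in this
Python sketch. split_from is a hypothetical name, and the
sketch makes no attempt to validate the address itself or to
decode encoded words; in the second form the full name is
returned with any quotation marks still in place.

```python
def split_from(content):
    """Return (address, full_name_or_None) from a From header's content."""
    content = content.strip()
    if "<" in content:
        # Second form: [ plain-phrase space ] "<" address ">"
        name, _, rest = content.partition("<")
        address, close, trailing = rest.partition(">")
        if not close or trailing.strip():
            raise ValueError("malformed angle-bracket address")
        name = name.strip()
        return address, (name or None)
    # First form: address [ space "(" paren-phrase ")" ]
    address, paren, rest = content.partition("(")
    address = address.strip()
    if not paren:
        return address, None
    if not rest.endswith(")"):
        raise ValueError("unterminated parenthesized name")
    return address, rest[:-1]
```

Keying on "<" first means that an over-lenient header such as
"Fred (the Fish) <fred@foo.bar.edu>" still yields the right
address.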
1929 |
|
|
|
1930 |
|
|
The address SHOULD be a valid and complete Internet domain |
1931 |
|
|
address, capable of being successfully mailed to by an |
1932 |
|
|
Internet host (possibly via an MX record and a forwarder). |
1933 |
|
|
The pseudo-domain ".uucp" MAY be used for hosts registered |
1934 |
|
|
in the UUCP maps (e.g. name "xyz.uucp" for registered site |
1935 |
|
|
"xyz"), but such hosts SHOULD discontinue this usage (either |
1936 |
|
|
by arranging a proper Internet address and forwarder, or by |
1937 |
|
|
using the "% hack" (see below)), as soon as possible. Bit- |
1938 |
|
|
net hosts SHOULD use Internet addresses, avoiding the obso- |
1939 |
|
|
lescent ".bitnet" pseudo-domain. Other forms of address |
1940 |
|
|
MUST not be used. |
1941 |
|
|
|
1942 |
|
|
NOTE: "Other forms" specifically include UK-style |
1943 |
|
|
"backward" domains ("uk.oxbridge.cs" is in the |
1944 |
|
|
Czech Republic, not the UK), pure-UUCP addressing |
1945 |
|
|
("knee!shin!foot" instead of |
1946 |
|
|
"foot%shin@knee.uucp"), and abbreviated domains |
1947 |
|
|
("zebra.zoo" instead of "zebra.zoo.toronto.edu"). |
1948 |
|
|
|
1949 |
|
|
If it is necessary to use the local part to specify a rout- |
1950 |
|
|
ing relative to the nearest Internet host, this MUST be done |
1951 |
|
|
using the "% hack", using "%" as a secondary "@". For exam- |
1952 |
|
|
ple, to specify that mail to the address should go to Inter- |
1953 |
|
|
net host "foo.bar.edu", then to non-Internet host "ein", |
1954 |
|
|
then to non-Internet host "deux", for delivery there to |
1955 |
|
|
mailbox "fred", a suitable address would be: |
1956 |
|
|
|
1957 |
|
|
fred%deux%ein@foo.bar.edu |
1958 |
|
|
|
1959 |
|
|
Analogous forms using "!" in the local part MUST not be |
1960 |
|
|
used, as they are ambiguous; they should be expressed in the |
1961 |
|
|
"%" form. |
1962 |
|
|
|
1963 |
|
|
NOTE: "a!b@c" can be interpreted as either "b%c@a" |
1964 |
|
|
or "b%a@c", and there is no consistency in which |
1965 |
|
|
choice is made. Such addresses consequently are |
1966 |
|
|
unreliable. The "%" form does not suffer from |
1967 |
|
|
this problem, and although its use is officially |
1968 |
|
|
discouraged, it is a de-facto standard, to the |
1969 |
|
|
point that MAIL recognizes it. |
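
NOTE: To make the reading order of a "% hack" address
concrete, here is a small Python sketch that lists the hops in
delivery order. expand_percent_route is a hypothetical name;
the sketch only splits the address and does not check it
against the full address syntax.

```python
def expand_percent_route(address):
    """Return (mailbox, hops), with hops listed in delivery order."""
    local_part, at, domain = address.rpartition("@")
    if not at or not local_part or not domain:
        raise ValueError("address must have the form local-part@domain")
    if "!" in local_part:
        raise ValueError('"!" routing in the local part is ambiguous')
    mailbox, *relays = local_part.split("%")
    # The rightmost "%" names the first non-Internet hop, so reverse.
    return mailbox, [domain] + relays[::-1]
```

For "fred%deux%ein@foo.bar.edu" this returns the mailbox
"fred" and the hops "foo.bar.edu", "ein", "deux", in that
order, matching the example above.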
1970 |
|
|
|
1971 |
|
|
Relayers MUST not, repeat MUST not, repeat MUST not, rewrite |
1972 |
|
|
From lines, in any way, however minor or innocent-seeming. |
1973 |
|
|
Trying to "fix" a non-conforming address has a very high |
1974 |
|
|
probability of making things worse. Either pass it along |
1987 |
|
|
unchanged, or reject the article. |
1988 |
|
|
|
1989 |
|
|
NOTE: An additional reason for banning the use of |
1990 |
|
|
"!" addressing is that it has a much higher proba- |
1991 |
|
|
bility of being rewritten into mangled unrecogniz- |
1992 |
|
|
ability by old relayers. |
1993 |
|
|
|
1994 |
|
|
Posters and posting agents SHOULD avoid use of the charac- |
1995 |
|
|
ters "!" and "@" in full names, as they may trigger unwanted |
1996 |
|
|
header rewriting by old, simple-minded news software. |
1997 |
|
|
|
1998 |
|
|
NOTE: Also, the characters "." and ",", not infre- |
1999 |
|
|
quently found in names (e.g., "John W. Campbell, |
2000 |
|
|
Jr."), are NOT, repeat NOT, allowed in an unquoted |
2001 |
|
|
word. A From header like the following MUST not |
2002 |
|
|
be written without the quotation marks: |
2003 |
|
|
|
2004 |
|
|
From: "John W. Campbell, Jr." <editor@analog.com> |
2005 |
|
|
|
2006 |
|
|
|
2007 |
|
|
|
2008 |
|
|
5.3. Message-ID

The Message-ID header contains the article's message ID, a
unique identifier distinguishing the article from every
other article:

Message-ID-content = message-id
message-id         = "<" local-part "@" domain ">"

As with From addresses, a message ID's local part is
case-sensitive and its domain is case-insensitive. The "<"
and ">" are parts of the message ID, not peculiarities of
the Message-ID header.

NOTE: News message IDs are a restricted subset of MAIL
message IDs. In particular, no existing news software copes
properly with MAIL quoting conventions within the local
part, so they are forbidden. This is unfortunate,
particularly for X.400 gateways that often wish to include
characters which are not legal in unquoted message IDs, but
it is impossible to fix net-wide. See the notes on
gatewaying in section 10.

The domain in the message ID SHOULD be the full Internet
domain name of the posting agent's host. Use of the ".uucp"
pseudo-domain (for hosts registered in the UUCP maps) or the
".bitnet" pseudo-domain (for Bitnet hosts) is permissible,
but SHOULD be avoided.

Posters and posting agents MUST generate the local part of a
message ID using an algorithm which obeys the specified
syntax (words separated by ".", with certain characters not
permitted) (see section 5.2 for details), and will not
repeat itself (ever). The algorithm SHOULD not generate
message IDs which differ only in case of letters. Note the
specification in section 6.5 of a recommended convention for
indicating subject changes. Otherwise the algorithm is up
to the implementor.

NOTE: The crucial use of message IDs is to distinguish
circulating articles from each other and from articles
circulated recently. They are also potentially useful as
permanent indexing keys, hence the requirement for permanent
uniqueness... but indexers cannot absolutely rely on this
because the earlier RFCs urged it but did not demand it.
All major implementations have always generated
permanently-unique message IDs by design, but in some cases
this is sensitive to proper administration, and duplicates
may have occurred by accident.

NOTE: The most popular method of generating local parts is
to use the date and time, plus some way of distinguishing
between simultaneous postings on the same host (e.g. a
process number), and encode them in a suitably-restricted
alphabet. An older but now less-popular alternative is to
use a sequence number, incremented each time the host
generates a new message ID; this is workable, but requires
careful design to cope properly with simultaneous posting
attempts, and is not as robust in the presence of crashes
and other malfunctions.

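The date-and-time method described in the NOTE above might
be sketched as follows. This is an illustration only, not
a reference implementation: the name make_message_id, the
base-36 encoding, the three-word layout, and the per-process
counter are all assumptions made for the example, and the
alphabet (lowercase letters and digits, joined by ".") is a
conservative guess at what section 5.2 permits.

```python
import itertools
import os
import time

_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"
_counter = itertools.count()  # separates IDs made in the same second


def to_base36(n):
    """Encode a non-negative integer using digits and lowercase letters."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 36)
        out.append(_DIGITS[r])
    return "".join(reversed(out))


def make_message_id(domain, now=None, pid=None):
    """Build "<time.pid.counter@domain>" (hypothetical layout)."""
    now = int(time.time()) if now is None else int(now)
    pid = os.getpid() if pid is None else pid
    words = (to_base36(now), to_base36(pid), to_base36(next(_counter)))
    # Lowercase only, so no two generated IDs differ solely in case.
    return "<" + ".".join(words) + "@" + domain.lower() + ">"
```

Uniqueness in this sketch depends on the host clock never
being set backwards and on the domain being unique to the
host, which is the kind of sensitivity to proper
administration mentioned above.
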
NOTE: Some buggy news software considers message IDs
completely case-insensitive, hence the advice to avoid
relying on case distinctions. The restrictions placed on
the "alphabet" of local parts and domains in section 5.2
have the useful side effect of making it unnecessary to
parse message IDs in complex ways to break them into
case-sensitive and case-insensitive portions.

The local part of a message ID MUST not be "postmaster" or
any other string that would compare equal to "postmaster" in
a case-insensitive comparison. Message IDs MUST be no
longer than 250 octets, including the "<" and ">".

NOTE: "Postmaster" is an irksome exception to
case-sensitivity in local parts, inherited from MAIL, and
simply avoiding it is the best way to deal with it (not that
it's likely, but the issue needs to be dealt with). The
length limit is undesirable, but is present in widely-used
existing software. The limit is actually 255, but a small
safety margin is wise.

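A checker for the two rules just stated (the "postmaster"
ban and the 250-octet limit) could look roughly like this.
The function name and the wording of the messages are
hypothetical, and the sketch deliberately does not attempt
to verify the full local-part and domain syntax of section
5.2.

```python
MAX_MESSAGE_ID_OCTETS = 250  # limit stated above, "<" and ">" included


def message_id_problems(message_id):
    """Return a list of rule violations; an empty list means none found."""
    problems = []
    if len(message_id.encode("utf-8")) > MAX_MESSAGE_ID_OCTETS:
        problems.append("longer than 250 octets")
    if not (message_id.startswith("<") and message_id.endswith(">")):
        problems.append("missing enclosing angle brackets")
        return problems
    local_part, at, domain = message_id[1:-1].rpartition("@")
    if not at or not local_part or not domain:
        problems.append("not of the form local-part@domain")
        return problems
    # Case-insensitive comparison, as the rule requires.
    if local_part.lower() == "postmaster":
        problems.append('local part compares equal to "postmaster"')
    return problems
```

A posting agent might run such a check on any message ID
supplied by the poster before accepting the article.
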
5.4. Subject

The Subject header's content (the "subject" of the article)
is a short phrase describing the topic of the article:

Subject-content = [ "Re: " ] nonblank-text

Encoded words MAY appear in this header.

If the article is a followup, the subject SHOULD begin with
"Re: " (a "back reference"). If the article is not a
followup, the subject MUST not begin with a back reference.
Back references are case-insensitive, although "Re: " is the
preferred form. A followup agent assisting a poster in
preparing a followup SHOULD prepend a back reference, UNLESS
the subject already begins with one. If the poster
determines that the topic of the followup differs
significantly from what is described in the subject, a new,
more descriptive, subject SHOULD be substituted (with no
back reference). An article whose subject begins with a
back reference MUST have a References header referencing the
precursor.

NOTE: A back reference is FOUR characters, the fourth being
a blank. RFC 1036 was confused about this. Observe also
that only ONE back reference should be present.

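The back-reference rule for followup agents can be
illustrated with a short sketch. The function names are
made up for the example; the behavior follows the text
above (four characters, case-insensitive match, at most one
back reference).

```python
BACK_REFERENCE = "Re: "  # four characters; the fourth is a blank


def has_back_reference(subject):
    """True if the subject begins with "Re: " in any letter case."""
    return subject[:4].lower() == BACK_REFERENCE.lower()


def followup_subject(precursor_subject):
    """Subject a followup agent might propose for a followup."""
    if has_back_reference(precursor_subject):
        return precursor_subject  # keep the single existing back reference
    return BACK_REFERENCE + precursor_subject
```

Note that "Re:" without the trailing blank is not treated
as a back reference here, so "Re:Widgets" would become
"Re: Re:Widgets"; a real agent might choose to be more
forgiving.
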
NOTE: There is a semi-standard convention, often used, in
which a subject change is flagged by making the new
Subject-content of the form:

new topic (was: old topic)

possibly with "old topic" somewhat truncated. Posters
wishing to do something like this are urged to use this
exact form, to simplify automated analysis.

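As an example of the automated analysis this convention is
meant to simplify, the following sketch splits a subject of
that form into its two topics. The regular expression and
the function name are assumptions; the Draft specifies only
the textual form.

```python
import re

# "new topic (was: old topic)"; the pattern itself is an assumption.
_WAS = re.compile(r"^(?P<new>.*\S)\s+\(was: (?P<old>.*)\)$")


def split_subject_change(subject):
    """Return (new_topic, old_topic), or (subject, None) if not flagged."""
    match = _WAS.match(subject)
    if match is None:
        return subject, None
    return match.group("new"), match.group("old")
```

Because the old topic may be truncated, the second element
is suitable for display but not for exact matching against
earlier subjects.
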
For historical reasons, the subject MUST not begin with
"cmsg " (note that this sequence ends with a blank).

NOTE: Some old news software takes a subject beginning with
"cmsg " as an indication that the article is a control
message (see sections 6.6 and 7). This mechanism is
obsolete and undesirable, but accidental triggering of it is
still possible.

The subject SHOULD be terse. Posters SHOULD avoid trying to
cram their entire article into the headers; even the
simplest query usually benefits from a sentence or two of
elaboration and context, and the details of header display
vary widely among reading agents.

NOTE: All-in-the-subject articles are sometimes the result
of misunderstandings over the interaction protocol of a
posting agent. Posting agents might wish to give special
attention to the possibility that a poster specifying a very
long subject might have thought he was typing the body of
the article.

5.5. Newsgroups

The Newsgroups header's content specifies which newsgroup(s)
the article is posted to:

Newsgroups-content = newsgroup-name *( ng-delim newsgroup-name )
newsgroup-name     = plain-component *( "." component )
component          = plain-component / encoded-word
plain-component    = component-start *13component-rest
component-start    = lowercase / digit
lowercase          = <letter a-z>
component-rest     = component-start / "+" / "-" / "_"
ng-delim           = ","

Encoded words used in newsgroup names MUST not contain
characters other than letters, digits, "+", "-", "/", "_",
"=", and "?" (although they may encode them).

A newsgroup name consists of one or more components, which
may be plain components or (except for the first) encoded
words. A plain component MUST contain at least one letter,
MUST begin with a letter or digit, and MUST not be longer
than 14 characters. The first component MUST begin with a
letter; subsequent components SHOULD begin with a letter.
Newsgroup names MUST not contain uppercase letters, except
where required by encodings in encoded words. The sequences
"all" and "ctl" MUST not be used as components.

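The MUST-level rules for names built only from plain
components can be checked mechanically. The sketch below
is one possible reading of the grammar and rules above; the
function names are hypothetical, encoded-word components are
not handled (such names are simply rejected), and the
SHOULD-level preference for later components to begin with a
letter is not enforced.

```python
import re

# component-start *13component-rest, from the grammar above.
_PLAIN_COMPONENT = re.compile(r"[a-z0-9][a-z0-9+_-]{0,13}")
_RESERVED = {"all", "ctl"}


def is_plain_component(component):
    """Check one plain component against the MUST-level rules."""
    return (
        _PLAIN_COMPONENT.fullmatch(component) is not None
        and any(ch.isalpha() for ch in component)  # at least one letter
        and component not in _RESERVED
    )


def is_plain_newsgroup_name(name):
    """Check a name made only of plain components (no encoded words)."""
    components = name.split(".")
    return (
        all(is_plain_component(c) for c in components)
        and components[0][0].isalpha()  # first component begins with a letter
    )
```

Under these rules "comp.lang.c++" and "comp.sys.3b1" are
accepted, while "comp.all", "comp.123", and any name
containing an uppercase letter are rejected.
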
NOTE: The alphabet and syntax specified encompasses all
existing names of widespread newsgroups, while avoiding
various forms that are known to cause problems. Important
existing software uses various non-alphanumeric characters
as punctuation adjacent to newsgroup names. (It would, in
fact, be preferable to ban "+" from newsgroup names, were it
not that several widespread newsgroups related to the C++
programming language already use it.)

NOTE: Much existing software converts the newsgroup name
into a directory path and stores the articles themselves
using numeric filenames, so all-digit name components can be
troublesome; the "Great Renaming" early in the history of
Usenet included revisions of several newsgroup names to
eliminate such components.

NOTE: The same storage technique is the reason for the
14-character limit. The limit is now largely historical,
since most modern systems have much larger limits on the
length of a directory entry's name, but many old systems are
still in use. Systems with shorter limits also exist, but
news software on such systems has had to deal with the
problem already, since there are several widespread
newsgroups with 14-character components in their names.
Implementors are warned that it is intended that the
successor to this Draft will increase the 14-character
limit, and are urged to fix their software to handle longer
names gracefully (if such fixes are necessary, given the
intended domain of application of the particular software).

NOTE: The requirement that the first character of a name be
a letter accommodates existing software which assumes it can
tell the difference between a newsgroup name and other
possible syntactic entities by inspecting the first
character. Similar considerations motivate excluding "+",
"-", and "_" from coming first in a component, and the
preference for components that do not begin with digits.
The "all" sequence is used as a wildcard symbol in much
existing software, and the "ctl" sequence was involved in an
obsolete historical mechanism for marking control messages,
so they are best avoided.

NOTE: Possibly newsgroup names should have been
case-insensitive, but all existing software treats them as
case-sensitive. (RFC 977 [rrr] claims that they are
case-insensitive in NNTP, but existing implementations are
believed to ignore this.) The simplest solution is just to
ban use of uppercase letters, since no widespread newsgroup
name uses them anyway; this avoids any possibility of
confusion.

NOTE: The syntax has the disadvantage of containing no white
space, making it impossible to continue a Newsgroups header
across several lines. Implementors of relayers and reading
agents are warned that it is intended that the successor to
this Draft will change the definition of ng-delim to:

ng-delim = "," [ space ]

and are urged to fix their software to handle (i.e., ignore)
white space following the commas. Meanwhile, posters must
avoid inserting such space (despite the natural-language
convention which permits it) and posting agents should strip
it out.

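Software following this advice might read the Newsgroups
content tolerantly and write it strictly, along these
lines. The function names are invented for the example,
and the sketch is slightly more lenient than the anticipated
grammar in that it ignores white space on either side of a
comma.

```python
def parse_newsgroups(content):
    """Split Newsgroups content on commas, ignoring surrounding white space."""
    names = []
    for raw in content.split(","):
        name = raw.strip()
        if name:
            names.append(name)
    return names


def format_newsgroups(names):
    """Join names with a bare "," as the current grammar requires."""
    return ",".join(names)
```

A posting agent could pass the poster's input through
parse_newsgroups and then format_newsgroups to strip any
space the poster inserted.
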
NOTE: Encoded words as components are somewhat problematic,
but are clearly desirable for use in non-English-speaking
nations. They are not subject to the 14-character limit,
and this (plus the possibility of "/" within them) may
require special handling in news software.

Encoded words are allowed in newsgroup names ONLY where
non-ASCII characters are necessary to the name, and must use
the "b" encoding [rrr] and the first suitable character set
in the MIME order of preferred character sets [rrr].

NOTE: Since the newsgroup name is the encoded form, NOT the
underlying non-ASCII form, there is room for terrible
confusion here if the choice of encoding for a particular
name is not fully standardized.

Posters SHOULD use only the names of existing newsgroups in
the Newsgroups header, because newsgroups are NOT created
simply by being posted to. However, it is legitimate to
cross-post to newsgroup(s) which do not exist on the posting
agent's host, provided that at least one of the newsgroups
DOES exist there, and followup agents MUST accept this
(posting agents MAY accept it, but SHOULD at least alert the
poster to the situation and request confirmation). Relayers
MUST not rewrite Newsgroups headers in any way, even if some
or all of the newsgroups do not exist on the relayer's host.

NOTE: Early experience with news software that created
newsgroups when they were mentioned in a Newsgroups header
was thoroughly negative: posters frequently mistype
newsgroup names.

NOTE: While it is legitimate for some of an article's
newsgroups not to exist on the host where it is posted, this
IS a rather unusual situation except in followups (which
should go to all newsgroups the precursor was posted to,
even if not all of them reach the site where the followup is
being posted).

NOTE: Rewriting Newsgroups headers to strip locally-unknown
newsgroups is superficially attractive. However, early
experience with exactly that policy was thoroughly negative:
news propagation is more redundant and much less orderly
than many people imagine, and in particular it is not
unheard-of for the (sometimes) fastest path between two
(say) U of Toronto sites to pass outside U of Toronto... in
which case newsgroup stripping can cause incomplete
propagation. Having an article's set of newsgroups change
as it propagates can also result in followups not achieving
the same propagation as the original. It's been tried; it's
more trouble than it's worth; don't do it.

NOTE: In particular, newsgroup stripping superficially looks
like a solution to the problem of duplicate regional
newsgroup names. For example, both University of Toronto
and University of Texas have "ut.general" newsgroups, and
material cross-posted to that name and a global newsgroup
appears in both universities' local newsgroups. However,
the side effects of stripping are sufficiently unacceptable
to disqualify it for this purpose. Don't do it.

Cross-posting an article to several relevant newsgroups is
far superior to posting separate articles with duplicated
content to each newsgroup, because reading agents can detect
the situation and show the article to a reader only once.
Posters SHOULD cross-post rather than duplicate-post.

NOTE: On the other hand, cross-posting to a large number of
newsgroups usually indicates that the poster has not thought
about his audience; articles are rarely pertinent to more
than (say) half a dozen newsgroups. Posting agents might
wish to request confirmation when the number of newsgroups
exceeds (say) five in the presence of a Followup-To header,
or (say) two in the absence of such a header.

NOTE: One problem with cross-postings is what to do with an
article cross-posted to a set of newsgroups including both
moderated and unmoderated ones. Posters tend to expect such
an article to show up immediately in the unmoderated
newsgroups, especially if they do not realize that one or
more of the newsgroups is moderated. However, since it is
not possible for a moderator to retroactively add an
already-posted article to a moderated newsgroup, the only
correct action is to mail such an article to one (and only
one) of the moderators for action. It is probably best for
the posting agent to detect this situation and ask the
poster what action is preferred. The acceptable choices are
to alter the newsgroup list or to mail to a moderator of the
poster's choice; the posting agent should NOT offer
duplicate-posting as an easy-to-request option (if only
because many moderators will reject a submission that has
already been posted to unmoderated newsgroups).

NOTE: An article cross-posted to multiple moderated
newsgroups really should have approval from all the
moderators involved. In practice, the only straightforward
way to do this is to send the article to one of them and
have him consult the others.

A newsgroup SHOULD not appear more than once in the
Newsgroups header.

Newsgroup names having only one component are reserved for
newsgroups whose propagation is restricted to a single host
(or the administrative equivalent). It is inadvisable to
name a newsgroup "poster" because that word has special
meaning in the Followup-To header (see section 6.1). The
names "control" and "junk" are frequently used for
pseudo-newsgroups internal to relayer implementations, and
hence are also best avoided.

NOTE: Beware of the duplicate-regional-newsgroup-names
problem mentioned above. In particular, there are many,
many hosts with a newsgroup named "general", and some
surprising things show up in such newsgroups when people
cross-post. It is probably better to use multi-component
names, which are less likely to be duplicated. Fred's
Widget House should use "fwh.general" rather than just
"general" as its in-house general-topics newsgroup.

It is conventional to reserve newsgroup names beginning with
"to." for test messages sent on an essentially
point-to-point basis (see also the ihave/sendme protocol
described in section 7.2); newsgroup names beginning with
"to." SHOULD not be used for any other purpose. The second
(and possibly later) components of such a name should,
together, comprise the relayer name (see section 5.6) of a
relayer. The newsgroup exists only at the named relayer and
its neighbors. The neighbors all pass that newsgroup to the
named relayer, while the named relayer does not pass it to
anyone.

The order of newsgroup names in the Newsgroups header is not
significant.

5.6. Path

The Path header's content indicates which relayers the
article has already visited, so that unnecessary redundant
transmission can be avoided:

Path-content   = [ path-list path-delimiter ] local-part
path-list      = relayer-name *( path-delimiter relayer-name )
relayer-name   = 1*rn-char
rn-char        = letter / digit / "." / "-" / "_"
path-delimiter = "!"

The Path content is a list of relayer names, separated by
path delimiters, followed (after a final delimiter) by the
local part of a mailing address. Each relayer MUST prepend
its name, and a delimiter, to the Path content in all
articles it processes. A relayer MUST not pass an article
to a neighboring relayer whose name is already mentioned in
an article's path list, unless this is explicitly requested
by the neighbor in some way. The Path content is
case-sensitive.

NOTE: The Path header supplied by a posting agent should
normally contain only the local part. The relayer that the
posting agent passes the article to for posting will prepend
its relayer name to get the path list started.

NOTE: Observe that the trailing local part is NOT part of
the path list. This Path header:

Path: fee!fie!foe!fum

contains three relayer names: "fee", "fie", and "foe". A
relayer named "fum" is still eligible to be sent this
article.

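The relayer's handling of the Path content, as described
above, can be sketched in a few lines. The function names
are hypothetical, and the sketch assumes the strict "!"
delimiter of the grammar above with no white space.

```python
def path_list(path_content):
    """Relayer names in a Path content, excluding the trailing local part."""
    return path_content.split("!")[:-1]


def prepend_relayer(path_content, relayer_name):
    """Path content after the relayer adds its own name and a delimiter."""
    return relayer_name + "!" + path_content


def may_pass_to(path_content, neighbor_name):
    """False if the neighbor already appears in the path list."""
    # Comparison is exact: the Path content is case-sensitive.
    return neighbor_name not in path_list(path_content)
```

With the example header above, may_pass_to("fee!fie!foe!fum",
"fum") is true because "fum" is the local part, while
may_pass_to("fee!fie!foe!fum", "fie") is false.
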
NOTE: This syntax has the disadvantage of containing no
white space, making it impossible to continue a Path header
across several lines. Implementors of relayers and reading
agents are warned that it is intended that the successor to
this Draft will change the definition of path delimiter to:

path-delimiter = "!" [ space ]

and are urged to fix their software to handle (i.e., ignore)
white space following the exclamation points. They are
urged to hurry; some ill-behaved systems reportedly already
feel free to add such white space.

NOTE: RFC 1036 allows considerably more flexibility in
choice of delimiter, in theory, but this flexibility has
never been used and most news software does not implement it
properly. The grammar reflects the current reality. Note,
in particular, that RFC 1036 treats "_" as a delimiter, but
in fact it is known to appear in relayer names occasionally.

Because an article will not propagate to a relayer already
mentioned in its path list, the path list MUST not contain
any names other than those of relayers the article has
passed through AS NEWS. This is trivially obvious for
normal news articles, but requires attention from the
moderators of moderated newsgroups and the implementors and
maintainers of gateways.

NOTE: For the same reason, a relayer and its neighbors need
to agree on the choice of relayer name, and names should not
be changed without notifying neighbors.

Relayer names need to be unique among all relayers which
will ever see the articles using them. A relayer name is
normally either an "official" name for the host the relayer
runs on, or some other "official" name controlled by the
same organization. Except in cooperating subnets that agree
to some other convention, and don't let articles using it
escape beyond the subnet, a relayer name MUST be either a
UUCP name registered in the UUCP maps (without any domain
suffix such as ".UUCP"), or a complete Internet domain name.
Use of a (registered) UUCP name is recommended, where
practical, to keep the length of the path list down.

The use of Internet domain names in the path list presents
one problem: domain names are case-insensitive, but the path
list is case-sensitive. Relayers using domain names as
their relayer names MUST pick a standard form for the name,
and use that form consistently to the exclusion of all
others. The preferred form for this purpose, which relayers
SHOULD use, is the all-lowercase form.

NOTE: It is arguably unfortunate that the path list is
case-sensitive, but it is much too late to change this.
Most Internet sites do, in any event, use one standardized
form of their name almost everywhere.

In the ordinary case, where the poster is the author of the
article, the local part following the path list SHOULD be
the local part of the poster's full Internet domain mailing
address.

NOTE: It should be just the local part, not the full
address. The character "@" does not appear in a Path
header.

The Path content somewhat resembles a mailing address,
particularly in the UUCP world with its manual routing and
"!" address syntax. Historically, this resemblance was
important, and the Path content was often used as a reply
address. This practice has always been somewhat unreliable,
since news paths are not always mail paths and news relayer
names are not always recognized by mail handlers, and its
reliability has generally worsened in recent times. The
widespread use of and recognition of Internet domain
addresses, even outside the actual Internet, has largely
eliminated the problem. Readers SHOULD not use the Path
content as a reply address. On the other hand, relayer
administrators are urged not to break this usage without
good reason; where practical, paths followed by news SHOULD
be traversable by mail, and mail handlers SHOULD recognize
relayer names as host names.

It will typically be difficult or impractical for gateways
and moderators to supply a Path content that is useful as a
reply address for the author, bearing in mind that the path
list they supply will normally be empty. (To reiterate: the
path list MUST not contain any names other than those of
relayers the article has passed through AS NEWS.) They
SHOULD supply a local part that will result in replies to a
Path-derived address being returned to the sender with a
brief explanation. Software permitting, the local part
"not-for-mail" is recommended.

NOTE: A moderator or gateway administrator who supplies a
local part that delivers such mail to an administrative
mailbox will quickly discover why it should be bounced
automatically! It is best, however, for the returned
message to include an explanation of what has probably
happened, rather than just a mysterious "undeliverable mail"
complaint, since the sender may not be aware that his/her
software is unwisely using the Path content as a reply
address. Reply software might wish to question attempts to
reply to a Path-derived address ending in "not-for-mail"
(which is why a specific name is being recommended here).

6. Optional Headers

Many MAIL headers, and many of those specified in present
and future MAIL extensions, are potentially applicable to
news. Headers specific to MAIL's point-to-point
transmission paradigm, e.g. To and Cc, SHOULD not appear in
news articles. (Gateways wishing to preserve such
information for debugging probably SHOULD hide it under
different names; prefixing "X-" to the original headers,
resulting in e.g. "X-To", is suggested.)

The following optional headers are either specific to news
or of particular note in news articles; an article MAY
contain some or all of them. (Note that there are some
circumstances in which some of them are mandatory; these are
explained under the individual headers.) An article MUST
not contain two or more headers with any one of these header
names.

NOTE: The ban on duplicate header names does not apply to
headers not specified in this Draft at all, such as "X-"
headers. Software should not assume that all header names
in a given article are unique.

6.1. Followup-To |
2733 |
|
|
|
2734 |
|
|
The Followup-To header contents specify which newsgroup(s) |
2735 |
|
|
followups should be posted to: |
2736 |
|
|
|
2737 |
|
|
Followup-To-content = Newsgroups-content / "poster" |
2738 |
|
|
|
2739 |
|
|
The syntax is the same as that of the Newsgroups content, |
2740 |
|
|
with the exception that the magic word "poster" means that |
2741 |
|
|
followups should be mailed to the article's reply address |
2742 |
|
|
rather than posted. In the absence of Followup-To, the |
2743 |
|
|
default newsgroup(s) for a followup are those in the News- |
2744 |
|
|
groups header. |
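The defaulting rules above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function name `followup_target` and the representation of an article's headers as a plain dict are assumptions made for the example.

```python
def followup_target(headers):
    """Decide where a followup to an article should go.

    `headers` is a hypothetical dict mapping header names to their
    content strings. Returns ("mail", address) or ("post", newsgroups).
    """
    followup_to = headers.get("Followup-To")
    if followup_to is None:
        # No Followup-To: default to the precursor's Newsgroups.
        return ("post", headers["Newsgroups"].strip().split(","))
    if followup_to.strip() == "poster":
        # Magic word: mail to the reply address instead of posting.
        # The reply address is Reply-To if present, else From.
        return ("mail", headers.get("Reply-To", headers["From"]))
    # Otherwise the content is a list of newsgroups, as in Newsgroups.
    return ("post", followup_to.strip().split(","))
```

A followup agent would call this on the precursor's headers and then either prepare a mail message or fill in the Newsgroups header of the followup.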
2745 |
|
|
|
2746 |
|
|
NOTE: The way to request that followups be mailed |
2747 |
|
|
to a specific address other than that in the From |
2748 |
|
|
line is to supply "Followup-To: poster" and a |
2749 |
|
|
Reply-To header. Putting a mailing address in the |
2750 |
|
|
Followup-To line is incorrect; posting agents |
2751 |
|
|
should reject or rewrite such headers. |
2752 |
|
|
|
2753 |
|
|
NOTE: There is no syntax for "no followups |
2754 |
|
|
allowed" because "Followup-To: poster" accom- |
2755 |
|
|
plishes this effect without extra machinery. |
2756 |
|
|
|
2757 |
|
|
Although it is generally desirable to limit followups to the |
2758 |
|
|
smallest reasonable set of newsgroups, especially when the |
2759 |
|
|
precursor was cross-posted widely, posting agents SHOULD not |
2760 |
|
|
supply a Followup-To header except at the poster's explicit |
2761 |
|
|
request. |
2762 |
|
|
|
2763 |
|
|
NOTE: In particular, it is incorrect for the post- |
2764 |
|
|
ing agent to assume that followups to a cross- |
2765 |
|
|
posted article should be directed to the first |
2766 |
|
|
newsgroup only. Trimming the list of newsgroups |
2779 |
|
|
should be the poster's decision, not the posting |
2780 |
|
|
agent's. However, when an article is to be cross- |
2781 |
|
|
posted to a considerable number of newsgroups, a |
2782 |
|
|
posting agent might wish to SUGGEST to the poster |
2783 |
|
|
that followups go to a shorter list. |
2784 |
|
|
|
2785 |
|
|
|
2786 |
|
|
6.2. Expires |
2787 |
|
|
|
2788 |
|
|
The Expires header content specifies a date and time when |
2789 |
|
|
the article is deemed to be no longer useful and should be |
2790 |
|
|
removed ("expired"): |
2791 |
|
|
|
2792 |
|
|
Expires-content = Date-content |
2793 |
|
|
|
2794 |
|
|
The content syntax is the same as that of the Date content. |
2795 |
|
|
In the absence of Expires, the default is decided by the |
2796 |
|
|
administrators of each host the article reaches, who MAY |
2797 |
|
|
also restrict the extent to which the Expires header is hon- |
2798 |
|
|
ored. |
2799 |
|
|
|
2800 |
|
|
The Expires header has two main applications: removing arti- |
2801 |
|
|
cles whose utility ends on a specific date (e.g., event |
2802 |
|
|
announcements which can be removed once the day of the event |
2803 |
|
|
is past) and preserving articles expected to be of prolonged |
2804 |
|
|
usefulness (e.g., information aimed at new readers of a |
2805 |
|
|
newsgroup). The latter application is sometimes abused. |
2806 |
|
|
Since individual hosts have local policies for expiration of |
2807 |
|
|
news (depending on available disk space, for instance), |
2808 |
|
|
posters SHOULD not provide Expires headers for articles |
2809 |
|
|
unless there is a natural expiration date associated with |
2810 |
|
|
the topic. Posting agents MUST not provide a default |
2811 |
|
|
Expires header. Leave it out and allow local policies to be |
2812 |
|
|
used unless there is a good reason not to. Expiry dates are |
2813 |
|
|
properly the decision of individual host administrators; |
2814 |
|
|
posters and moderators SHOULD set only expiry dates that |
2815 |
|
|
most administrators would agree with. |
2816 |
|
|
|
2817 |
|
|
NOTE: A poster preparing an Expires header for an |
2818 |
|
|
article whose utility ends on a specific day |
2819 |
|
|
should typically specify the NEXT day as the |
2820 |
|
|
expiry date. A meeting on July 7th remains of |
2821 |
|
|
interest on the 7th. |
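The "next day" advice can be illustrated with a small Python sketch. The helper name `expires_for_event` is hypothetical, and the choice of midnight GMT as the expiry instant is an assumption made for the example; the text above only asks for the day after the event.

```python
from datetime import date, datetime, time, timedelta, timezone
from email.utils import format_datetime

def expires_for_event(event_day):
    """Return an expiry instant at the start of the day AFTER the event."""
    next_day = event_day + timedelta(days=1)
    # Midnight GMT is an arbitrary but unambiguous choice of instant.
    return datetime.combine(next_day, time(0, 0), tzinfo=timezone.utc)

# A meeting on 7 July 1994 should not expire before 8 July.
expiry = expires_for_event(date(1994, 7, 7))
print("Expires: " + format_datetime(expiry, usegmt=True))
```

The formatting step uses the Python standard library's MAIL-style date formatter; whether its output matches every detail of the Date-content syntax defined elsewhere in this Draft should be checked before relying on it.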
2822 |
|
|
|
2823 |
|
|
|
2824 |
|
|
6.3. Reply-To |
2825 |
|
|
|
2826 |
|
|
The Reply-To header content specifies a reply address dif- |
2827 |
|
|
ferent from the author's address given in the From header: |
2828 |
|
|
|
2829 |
|
|
Reply-To-content = From-content |
2830 |
|
|
|
2831 |
|
|
In the absence of Reply-To, the reply address is the address |
2832 |
|
|
in the From header. |
2833 |
|
|
|
2845 |
|
|
Use of a Reply-To header is preferable to including a simi- |
2846 |
|
|
lar request in the article body, because reply-preparation |
2847 |
|
|
software can take account of Reply-To automatically. |
2848 |
|
|
|
2849 |
|
|
|
2850 |
|
|
6.4. Sender |
2851 |
|
|
|
2852 |
|
|
The Sender header identifies the poster, in the event that |
2853 |
|
|
this differs from the author identified in the From header: |
2854 |
|
|
|
2855 |
|
|
Sender-content = From-content |
2856 |
|
|
|
2857 |
|
|
In the absence of Sender, the default poster is the author |
2858 |
|
|
(named in the From header). |
2859 |
|
|
|
2860 |
|
|
NOTE: The intent is that the Sender header have a |
2861 |
|
|
fairly high probability of identifying the person |
2862 |
|
|
who really posted the article. The ability to |
2863 |
|
|
specify a From header naming someone other than |
2864 |
|
|
the poster is useful but can be abused. |
2865 |
|
|
|
2866 |
|
|
If the poster supplies a From header, the posting agent MUST |
2867 |
|
|
ensure that a Sender header is present, unless it can verify |
2868 |
|
|
that the mailing address in the From header is a valid mail- |
2869 |
|
|
ing address for the poster. A poster-supplied Sender header |
2870 |
|
|
MAY be used, if its mailing address is verifiably a valid |
2871 |
|
|
mailing address for the poster; otherwise the posting agent |
2872 |
|
|
MUST supply a Sender header and delete (or rename, e.g. to |
2873 |
|
|
X-Unverifiable-Sender) any poster-supplied Sender header. |
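A minimal Python sketch of these rules follows. Everything named here is an assumption made for illustration: headers are held in a dict, `verified_address` stands for whatever Sender content the posting agent can vouch for, and `is_valid_for_poster` is a caller-supplied check whose implementation is outside the scope of this example.

```python
def apply_sender_rules(headers, verified_address, is_valid_for_poster):
    """Return a copy of `headers` with the Sender rules applied.

    headers             -- hypothetical dict of header name -> content
    verified_address    -- Sender content the posting agent vouches for
    is_valid_for_poster -- callable(content) -> bool; assumed to verify
                           that the mailing address belongs to the poster
    """
    out = dict(headers)
    supplied = out.get("Sender")
    if supplied is not None:
        if is_valid_for_poster(supplied):
            return out  # a verifiable poster-supplied Sender MAY be used
        # Unverifiable: rename it, then supply our own Sender.
        out["X-Unverifiable-Sender"] = out.pop("Sender")
        out["Sender"] = verified_address
        return out
    if "From" in out and not is_valid_for_poster(out["From"]):
        # Poster-supplied From that cannot be verified: add a Sender.
        out["Sender"] = verified_address
    return out
```

Renaming rather than deleting the unverifiable header is one of the two options the text allows; a posting agent could equally drop it.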
2874 |
|
|
|
2875 |
|
|
NOTE: It might be useful to preserve a poster- |
2876 |
|
|
supplied Sender header so that the poster can sup- |
2877 |
|
|
ply the full-name part of the content. The mail- |
2878 |
|
|
ing address, however, must be right. Hence, the |
2879 |
|
|
posting agent must generate the Sender header if |
2880 |
|
|
it is unable to verify the mailing address of a |
2881 |
|
|
poster-supplied one. |
2882 |
|
|
|
2883 |
|
|
NOTE: NNTP implementors, in particular, are urged |
2884 |
|
|
to note this requirement (which would eliminate |
2885 |
|
|
the need for ad hoc headers like NNTP-Posting- |
2886 |
|
|
Host), although there are admittedly some imple- |
2887 |
|
|
mentation difficulties. A user name from an RFC |
2888 |
|
|
1413 server and a host name from an inverse map- |
2889 |
|
|
ping of the address, perhaps with a "full name" |
2890 |
|
|
comment noting the origin of the information, |
2891 |
|
|
would be at least a first approximation: |
2892 |
|
|
|
2893 |
|
|
Sender: fred@zoo.toronto.edu (RFC-1413@reverse-lookup; not verified) |
2894 |
|
|
|
2895 |
|
|
While this does not completely meet the specs, it |
2896 |
|
|
comes a lot closer than not having a Sender header |
2897 |
|
|
at all. Even just supplying a placeholder for the |
2898 |
|
|
user name: |
2899 |
|
|
|
2911 |
|
|
Sender: somebody@zoo.toronto.edu (user name unknown) |
2912 |
|
|
|
2913 |
|
|
would be better than nothing. |
2914 |
|
|
|
2915 |
|
|
|
2916 |
|
|
6.5. References |
2917 |
|
|
|
2918 |
|
|
The References header content lists message IDs of precur- |
2919 |
|
|
sors: |
2920 |
|
|
|
2921 |
|
|
References-content = message-id *( space message-id ) |
2922 |
|
|
|
2923 |
|
|
A followup MUST have a References header, and an article |
2924 |
|
|
which is not a followup MUST not have a References header. |
2925 |
|
|
In a followup, if the precursor had no References header, the |
2926 |
|
|
followup's References-content is the precursor's message ID. |
2927 |
|
|
If the precursor did have a References header, the fol- |
2928 |
|
|
lowup's References-content is the precursor's References- |
2929 |
|
|
content with the precursor's message ID appended to the end |
2930 |
|
|
of the list, separated from it by white space. The header |
2931 |
|
|
thus lists the followup's ancestors, oldest first, ending |
2932 |
|
|
with the immediate precursor. |
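As a sketch of how a followup agent might build the header, assuming (for illustration only) that the precursor's headers are available as a dict and that the function name `followup_references` is free to choose:

```python
def followup_references(precursor_headers):
    """Build the References content for a followup to the given precursor."""
    message_id = precursor_headers["Message-ID"]
    earlier = precursor_headers.get("References")
    if earlier is None:
        # Precursor was not itself a followup: start a new list.
        return message_id
    # Append the precursor's own ID to its References content. The
    # older content is copied untouched, so any runs of blanks inside
    # it are preserved.
    return earlier.rstrip() + " " + message_id
```

Applied repeatedly down a thread, this yields the chain of ancestors with the original article first and the immediate precursor last.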
2933 |
|
|
|
2934 |
|
|
NOTE: Use the See-Also header (section 6.16) for |
2935 |
|
|
interconnection of articles which are not in a |
2936 |
|
|
followup relationship to each other. |
2937 |
|
|
|
2938 |
|
|
NOTE: In retrospect, RFCs 850 and 1036, and the |
2939 |
|
|
implementations whose practice they represented, |
2940 |
|
|
erred here. The proper MAIL header to use for |
2941 |
|
|
references to precursors is In-Reply-To, and the |
2942 |
|
|
References header is meant to be used for the pur- |
2943 |
|
|
poses here ascribed to See-Also. This incompati- |
2944 |
|
|
bility is far too solidly established to be fixed, |
2945 |
|
|
unfortunately. The best that can be done is to |
2946 |
|
|
provide a clear mapping between the two, and urge |
2947 |
|
|
gateways to do the transformation. The news usage |
2948 |
|
|
is (now) a deliberate violation of the MAIL speci- |
2949 |
|
|
fications; articles containing news References |
2950 |
|
|
headers are technically not valid MAIL messages, |
2951 |
|
|
although it is unlikely that much MAIL software |
2952 |
|
|
will notice because the incompatibility is at a |
2953 |
|
|
subtle semantic level that does not affect the |
2954 |
|
|
syntax. |
2955 |
|
|
|
2956 |
|
|
UNRESOLVED ISSUE: Would it be better to just give |
2957 |
|
|
up and admit that news uses References for both |
2958 |
|
|
purposes? |
2959 |
|
|
|
2960 |
|
|
UNRESOLVED ISSUE: Should the syntax be generalized |
2961 |
|
|
to include URLs as alternatives to message IDs? |
2962 |
|
|
Perhaps not; too many things know about References |
2963 |
|
|
already. And non-articles can't be precursors of |
2964 |
|
|
articles, not really. |
2965 |
|
|
|
2977 |
|
|
Followup agents SHOULD not shorten References headers. If |
2978 |
|
|
it is absolutely necessary to shorten the header, as a des- |
2979 |
|
|
perate last resort, a followup agent MAY do this by deleting |
2980 |
|
|
some of the message IDs. However, it MUST not delete the |
2981 |
|
|
first message ID, the last three message IDs (including that |
2982 |
|
|
of the immediate precursor), or any message ID mentioned in |
2983 |
|
|
the body of the followup. If it is possible for the fol- |
2984 |
|
|
lowup agent to determine the Subject content of the articles |
2985 |
|
|
identified in the References header, it MUST not delete the |
2986 |
|
|
message ID of any article where the Subject content changed |
2987 |
|
|
(other than by prepending of a back reference). The fol- |
2988 |
|
|
lowup agent MUST not delete any message ID whose local part |
2989 |
|
|
ends with "_-_" (underscore (ASCII 95), hyphen (ASCII 45), |
2990 |
|
|
underscore); followup agents are urged to use this form to |
2991 |
|
|
mark subject changes, and to avoid using it otherwise. |
2992 |
|
|
|
2993 |
|
|
NOTE: As software capable of exploiting References |
2994 |
|
|
chains has grown more common, the random shorten- |
2995 |
|
|
ing permitted by RFC 1036 has become increasingly |
2996 |
|
|
troublesome. ANY shortening is undesirable, and |
2997 |
|
|
software should do it only in cases of dire neces- |
2998 |
|
|
sity. In such cases, these rules attempt to limit |
2999 |
|
|
the damage. |
3000 |
|
|
|
3001 |
|
|
NOTE: The first message ID is very important as |
3002 |
|
|
the starting point of the "thread" of discussion, |
3003 |
|
|
and absolutely should not be deleted. Keeping the |
3004 |
|
|
last three message IDs gives thread-following |
3005 |
|
|
software a fighting chance to reconstruct a full |
3006 |
|
|
thread even if an article or two is missing. |
3007 |
|
|
Keeping message IDs mentioned in the body is obvi- |
3008 |
|
|
ously desirable. |
3009 |
|
|
|
3010 |
|
|
NOTE: Subject changes are difficult to determine, |
3011 |
|
|
but they are significant as possible beginnings of |
3012 |
|
|
new threads. The "_-_" convention is provided so |
3013 |
|
|
that posting agents (which have more information |
3014 |
|
|
about subjects) can flag articles containing a |
3015 |
|
|
subject change in a way that followup agents can |
3016 |
|
|
detect without access to the articles themselves. |
3017 |
|
|
The sequence is chosen as one that is fairly |
3018 |
|
|
unlikely to occur by accident. |
3019 |
|
|
|
3020 |
|
|
UNRESOLVED ISSUE: Is "_-_" really worth having? |
3021 |
|
|
|
3022 |
|
|
When a References header is shortened, at least three blanks |
3023 |
|
|
SHOULD be left between adjacent message IDs at each point |
3024 |
|
|
where deletions were made. Software preparing new Refer- |
3025 |
|
|
ences headers SHOULD preserve multiple blanks in older Ref- |
3026 |
|
|
erences content. |
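The shortening rules can be sketched as follows. This is a hedged illustration, not a recommended implementation: the function name and the `max_ids` limit are invented for the example, the header is assumed to be unfolded onto one line, and the rule about Subject changes is not implemented because it needs access to the other articles. If too many message IDs are protected, the result may still exceed `max_ids`.

```python
import re

def shorten_references(content, body, max_ids):
    """Drop message IDs from References content, as a last resort only."""
    # Pair each message ID with the white space that preceded it.
    pairs = re.findall(r"(\s*)(\S+)", content.strip())
    gaps = [gap for gap, _ in pairs]
    ids = [mid for _, mid in pairs]

    def protected(i):
        local_part = ids[i][1:].rsplit("@", 1)[0]  # between "<" and "@"
        return (i == 0                      # start of the thread
                or i >= len(ids) - 3        # last three, incl. precursor
                or ids[i] in body           # mentioned in the followup
                or local_part.endswith("_-_"))  # flagged subject change

    excess = len(ids) - max_ids
    drop = set()
    for i in range(len(ids)):
        if excess <= 0:
            break
        if not protected(i):
            drop.add(i)
            excess -= 1

    out = []
    deleted_before = False
    for i, mid in enumerate(ids):
        if i in drop:
            deleted_before = True
            continue
        if out:
            gap = gaps[i]
            if deleted_before and len(gap) < 3:
                gap = "   "  # three blanks mark the point of deletion
            out.append(gap)
        out.append(mid)
        deleted_before = False
    return "".join(out)
```

Existing runs of blanks are copied through unchanged, so markers left by an earlier shortening survive.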
3027 |
|
|
|
3028 |
|
|
NOTE: It's desirable to have some marker of where |
3029 |
|
|
deletions occurred, but the restricted syntax of |
3030 |
|
|
the header makes this difficult. Extra white |
3043 |
|
|
space is not a very good marker, since it may be |
3044 |
|
|
deleted by software that ill-advisedly rewrites |
3045 |
|
|
headers, but at least it doesn't break existing |
3046 |
|
|
software. |
3047 |
|
|
|
3048 |
|
|
To repeat: followup agents SHOULD not shorten References |
3049 |
|
|
headers. |
3050 |
|
|
|
3051 |
|
|
NOTE: Unfortunately, reading agents and other |
3052 |
|
|
software analyzing References patterns have to be |
3053 |
|
|
prepared for the worst anyway. The worst includes |
3054 |
|
|
random deletions and the possibility of circular |
3055 |
|
|
References chains (when References is misused in |
3056 |
|
|
place of See-Also, section 6.16). |
3057 |
|
|
|
3058 |
|
|
|
3059 |
|
|
6.6. Control |
3060 |
|
|
|
3061 |
|
|
The Control header content marks the article as a control |
3062 |
|
|
message, and specifies the desired actions (other than the |
3063 |
|
|
usual ones of filing and passing on the article): |
3064 |
|
|
|
3065 |
|
|
Control-content = verb *( space argument ) |
3066 |
|
|
verb = 1*( letter / digit ) |
3067 |
|
|
argument = 1*<ASCII printable character> |
3068 |
|
|
|
3069 |
|
|
The verb indicates what action should be taken, and the |
3070 |
|
|
argument(s) (if any) supply details. In some cases, the |
3071 |
|
|
body of the article may also contain details. Section 7 |
3072 |
|
|
describes the standard verbs. See also the Also-Control |
3073 |
|
|
header (section 6.15). |
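A minimal Python sketch of parsing Control content according to the grammar above. The function name is hypothetical, and arguments are assumed to be separated by runs of white space and to contain none themselves.

```python
import re

def parse_control(content):
    """Split Control content into (verb, [arguments])."""
    words = content.split()
    # The verb is restricted to ASCII letters and digits.
    if not words or not re.fullmatch(r"[A-Za-z0-9]+", words[0]):
        raise ValueError("bad or missing control verb")
    # Arguments are NOT restricted in this way and may contain
    # characters significant to command interpreters, so they should
    # never be handed to a shell unquoted.
    return words[0], words[1:]
```

For example, `parse_control("cancel <123@example.com>")` yields the verb `cancel` and a single argument.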
3074 |
|
|
|
3075 |
|
|
NOTE: Control messages are often processed and |
3076 |
|
|
filed rather differently than normal articles. |
3077 |
|
|
|
3078 |
|
|
NOTE: The restriction of verbs to letters and dig- |
3079 |
|
|
its is new, but is consistent with existing prac- |
3080 |
|
|
tice and potentially simplifies implementation by |
3081 |
|
|
avoiding characters significant to command inter- |
3082 |
|
|
preters. Beware that the arguments are under no |
3083 |
|
|
such restriction in general. |
3084 |
|
|
|
3085 |
|
|
NOTE: Two other conventions for distinguishing |
3086 |
|
|
control messages from normal articles were for- |
3087 |
|
|
merly in use: a three-component newsgroup name |
3088 |
|
|
ending in ".ctl" or a subject beginning with |
3089 |
|
|
"cmsg " was considered to imply that the article |
3090 |
|
|
was a control message. These conventions are |
3091 |
|
|
obsolete. Do not use them. |
3092 |
|
|
|
3093 |
|
|
An article with a Control header MUST not have an Also- |
3094 |
|
|
Control or Supersedes header. |
3095 |
|
|
|
3096 |
|
|
|
3109 |
|
|
6.7. Distribution |
3110 |
|
|
|
3111 |
|
|
The Distribution header content specifies geographic or |
3112 |
|
|
organizational limits on an article's propagation: |
3113 |
|
|
|
3114 |
|
|
Distribution-content = distribution *( dist-delim distribution ) |
3115 |
|
|
dist-delim = "," |
3116 |
|
|
distribution = plain-component |
3117 |
|
|
|
3118 |
|
|
A distribution is syntactically identical to a one-component |
3119 |
|
|
newsgroup name, and must satisfy the same rules and restric- |
3120 |
|
|
tions. In the absence of Distribution, the default distri- |
3121 |
|
|
bution is "world". |
3122 |
|
|
|
3123 |
|
|
NOTE: This syntax has the disadvantage of contain- |
3124 |
|
|
ing no white space, making it impossible to con- |
3125 |
|
|
tinue a Distribution header across several lines. |
3126 |
|
|
Implementors of relayers and reading agents are |
3127 |
|
|
warned that it is intended that the successor to |
3128 |
|
|
this Draft will change the definition of dist-delim |
3129 |
|
|
to: |
3130 |
|
|
|
3131 |
|
|
dist-delim = "," [ space ] |
3132 |
|
|
|
3133 |
|
|
and are urged to fix their software to handle |
3134 |
|
|
(i.e., ignore) white space following the commas. |
3135 |
|
|
|
3136 |
|
|
A relayer MUST not pass an article to another relayer unless |
3137 |
|
|
configuration information specifies transmission to that |
3138 |
|
|
other relayer of BOTH (a) at least one of the article's |
3139 |
|
|
newsgroup(s), and (b) at least one of the article's distri- |
3140 |
|
|
bution(s). In effect, the only role of distributions is to |
3141 |
|
|
limit propagation, by preventing transmission of articles |
3142 |
|
|
that would have been transmitted had the decision been based |
3143 |
|
|
solely on newsgroups. |
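The two-part test can be sketched in Python. This is a simplification made for illustration: real relayer configurations normally match newsgroups by pattern, whereas here the peer's configuration is reduced to two plain sets, and the names `may_relay`, `peer_newsgroups` and `peer_distributions` are invented for the example.

```python
def may_relay(headers, peer_newsgroups, peer_distributions):
    """Return True if the article may be passed to the given peer."""
    newsgroups = [ng.strip() for ng in headers["Newsgroups"].split(",")]
    # An absent Distribution header means the default, "world".
    # strip() tolerates white space after the commas.
    distributions = [d.strip()
                     for d in headers.get("Distribution", "world").split(",")]
    newsgroup_ok = any(ng in peer_newsgroups for ng in newsgroups)
    distribution_ok = any(d in peer_distributions for d in distributions)
    # BOTH conditions must hold; distributions can only ever narrow
    # what the newsgroups alone would have allowed.
    return newsgroup_ok and distribution_ok
```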
3144 |
|
|
|
3145 |
|
|
A posting agent might wish to present a menu of possible |
3146 |
|
|
distributions, or suggest a default, but normally SHOULD not |
3147 |
|
|
supply a default without giving the poster a chance to over- |
3148 |
|
|
ride it. A followup agent SHOULD initially supply the same |
3149 |
|
|
Distribution header as found in the precursor, although the |
3150 |
|
|
poster MAY alter this if appropriate. |
3151 |
|
|
|
3152 |
|
|
Despite the syntactic similarity and some historical confu- |
3153 |
|
|
sion, distributions are NOT newsgroup names. The whole |
3154 |
|
|
point of putting a distribution on an article is that it is |
3155 |
|
|
DIFFERENT from the newsgroup(s). In general, a meaningful |
3156 |
|
|
distribution corresponds to some sort of region of propaga- |
3157 |
|
|
tion: a geographical area, an organization, or a cooperating |
3158 |
|
|
subnet. |
3159 |
|
|
|
3160 |
|
|
NOTE: Distributions have historically suffered |
3161 |
|
|
from the completely uncontrolled nature of their |
3162 |
|
|
name space, the lack of feedback to posters on |
3175 |
|
|
incomplete propagation resulting from use of ran- |
3176 |
|
|
dom trash in Distribution headers, and confusion |
3177 |
|
|
with newsgroups (arising partly because many |
3178 |
|
|
regions and organizations DO have internal news- |
3179 |
|
|
groups with names resembling their internal dis- |
3180 |
|
|
tributions). This has resulted in much garbage in |
3181 |
|
|
Distribution headers, notably the pointless prac- |
3182 |
|
|
tice of automatically supplying the first compo- |
3183 |
|
|
nent of the newsgroup name as a distribution |
3184 |
|
|
(which is MOST unlikely to restrict propagation!). |
3185 |
|
|
Many sites have opted to maximize propagation of |
3186 |
|
|
such ill-formed articles by essentially ignoring |
3187 |
|
|
distributions. This unfortunately interferes with |
3188 |
|
|
legitimate uses. The situation is bad enough that |
3189 |
|
|
distributions must be considered largely useless |
3190 |
|
|
except within cooperating subnets that make an |
3191 |
|
|
organized effort to restrain propagation of their |
3192 |
|
|
internal distributions. |
3193 |
|
|
|
3194 |
|
|
NOTE: The distributions "world" and "local" have |
3195 |
|
|
no standard magic meaning (except that the former |
3196 |
|
|
is the default distribution if none is given). |
3197 |
|
|
Some pieces of software do assign such meanings to |
3198 |
|
|
them. |
3199 |
|
|
|
3200 |
|
|
|
3201 |
|
|
6.8. Keywords |
3202 |
|
|
|
3203 |
|
|
The Keywords header content is one or more phrases intended |
3204 |
|
|
to describe some aspect of the content of the article: |
3205 |
|
|
|
3206 |
|
|
Keywords-content = plain-phrase *( "," [ space ] plain-phrase ) |
3207 |
|
|
|
3208 |
|
|
Keywords, separated by commas, each follow the <plain- |
3209 |
|
|
phrase> syntax defined in section 5.2. Encoded words in |
3210 |
|
|
keywords MUST not contain characters other than letters (of |
3211 |
|
|
either case), digits, and the characters "!", "*", "+", "-", |
3212 |
|
|
"/", "=", and "_". |
3213 |
|
|
|
3214 |
|
|
NOTE: Posters and posting agents are asked to take |
3215 |
|
|
note that keywords are separated by commas, not by |
3216 |
|
|
white space. The following Keywords header con- |
3217 |
|
|
tains only one keyword (a rather improbable |
3218 |
|
|
one): |
3219 |
|
|
|
3220 |
|
|
Keywords: Thompson Ritchie Multics Linux |
3221 |
|
|
|
3222 |
|
|
and should probably have been written: |
3223 |
|
|
|
3224 |
|
|
Keywords: Thompson, Ritchie, Multics, Linux |
3225 |
|
|
|
3226 |
|
|
This particular error is unfortunately rather |
3227 |
|
|
widespread. |
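The difference between the two headers above is easy to see in a small Python sketch. It assumes, as a simplification, that no keyword contains a quoted word with an embedded comma; the function name is invented for the example.

```python
def parse_keywords(content):
    """Split Keywords content into a list of keyword phrases."""
    # Commas separate keywords; white space inside a phrase is kept.
    phrases = (phrase.strip() for phrase in content.split(","))
    return [phrase for phrase in phrases if phrase]

print(parse_keywords("Thompson Ritchie Multics Linux"))
print(parse_keywords("Thompson, Ritchie, Multics, Linux"))
```

The first call reports a single four-word keyword; the second reports four separate keywords.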
3228 |
|
|
|
3241 |
|
|
NOTE: Reading agents and archivers preparing |
3242 |
|
|
indexes of articles should bear in mind that user- |
3243 |
|
|
chosen keywords are notoriously poor for indexing |
3244 |
|
|
purposes unless the keywords are picked from a |
3245 |
|
|
predefined set (which they are not in this case). |
3246 |
|
|
Also, some followup agents unwisely propagate the |
3247 |
|
|
Keywords header from the precursor into the fol- |
3248 |
|
|
lowup by default. At least one news-based experi- |
3249 |
|
|
ment has found the contents of Keywords headers to |
3250 |
|
|
be completely valueless for indexing. |
3251 |
|
|
|
3252 |
|
|
|
3253 |
|
|
6.9. Summary |
3254 |
|
|
|
3255 |
|
|
The Summary header content is a short phrase summarizing the |
3256 |
|
|
article's content: |
3257 |
|
|
|
3258 |
|
|
Summary-content = nonblank-text |
3259 |
|
|
|
3260 |
|
|
As with the subject, no restriction is placed on the content |
3261 |
|
|
since it is intended solely for display to humans. |
3262 |
|
|
|
3263 |
|
|
NOTE: Reading agents should be aware that the Sum- |
3264 |
|
|
mary header is often used as a sort of secondary |
3265 |
|
|
Subject header, and (if present) its contents |
3266 |
|
|
should perhaps be displayed when the subject is |
3267 |
|
|
displayed. |
3268 |
|
|
|
3269 |
|
|
The summary SHOULD be terse. Posters SHOULD avoid trying to |
3270 |
|
|
cram their entire article into the headers; even the sim- |
3271 |
|
|
plest query usually benefits from a sentence or two of elab- |
3272 |
|
|
oration and context, and not all reading agents display all |
3273 |
|
|
headers. |
3274 |
|
|
|
3275 |
|
|
|
3276 |
|
|
6.10. Approved |
3277 |
|
|
|
3278 |
|
|
The Approved header content indicates the mailing addresses |
3279 |
|
|
(and possibly the full names) of the persons or entities |
3280 |
|
|
approving the article for posting: |
3281 |
|
|
|
3282 |
|
|
Approved-content = From-content *( "," [ space ] From-content ) |
3283 |
|
|
|
3284 |
|
|
An Approved header is required in all postings to moderated |
3285 |
|
|
newsgroups; the presence or absence of this header allows a |
3286 |
|
|
posting agent to distinguish between articles posted by the |
3287 |
|
|
moderator (which are normal articles to be posted normally) |
3288 |
|
|
and attempted contributions by others (which should be |
3289 |
|
|
mailed to the moderator for approval). An Approved header |
3290 |
|
|
is also required in certain control messages, to reduce the |
3291 |
|
|
probability of accidental posting of same; see the relevant |
3292 |
|
|
parts of section 7. |
3293 |
|
|
|
3307 |
|
|
NOTE: There is, at present, no way to authenticate |
3308 |
|
|
Approved headers to ensure that the claimed |
3309 |
|
|
approval really was bestowed. Nor is there an |
3310 |
|
|
established mechanism for even maintaining a list |
3311 |
|
|
of legitimate approvers (such a list would quickly |
3312 |
|
|
become out of date if it had to be maintained by |
3313 |
|
|
hand). Such mechanisms, presumably relying on |
3314 |
|
|
cryptographic authentication, would be a worth- |
3315 |
|
|
while extension to this Draft, and experimental |
3316 |
|
|
work in this area is encouraged. (The problem is |
3317 |
|
|
harder than it sounds because news is used on many |
3318 |
|
|
systems which do not have real-time access to key |
3319 |
|
|
servers.) |
3320 |
|
|
|
3321 |
|
|
NOTE: Relayer implementors, please note well: it |
3322 |
|
|
is the POSTING AGENT that is authorized to distin- |
3323 |
|
|
guish between moderator postings and attempted |
3324 |
|
|
contributions, and to mail the latter to the mod- |
3325 |
|
|
erator. As discussed in section 9.1, relayers |
3326 |
|
|
MUST not, repeat MUST not, send such mail; on |
3327 |
|
|
receipt of an unApproved article in a moderated |
3328 |
|
|
newsgroup, they should discard the article, NOT |
3329 |
|
|
transform it into a mail message (except perhaps |
3330 |
|
|
to a local administrator). |
3331 |
|
|
|
3332 |
|
|
NOTE: RFC 1036 restricted Approved to a single |
3333 |
|
|
From-content. However, multiple moderation is no |
3334 |
|
|
longer rare, and multi-moderator Approved headers |
3335 |
|
|
are already in use. |
3336 |
|
|
|
3337 |
|
|
|
3338 |
|
|
6.11. Lines |
3339 |
|
|
|
3340 |
|
|
The Lines header content indicates the number of lines in |
3341 |
|
|
the body of the article: |
3342 |
|
|
|
3343 |
|
|
Lines-content = 1*digit |
3344 |
|
|
|
3345 |
|
|
The line count includes all body lines, including the signa- |
3346 |
|
|
ture if any, including empty lines (if any) at beginning or |
3347 |
|
|
end of the body. (The single empty separator line between |
3348 |
|
|
the headers and the body is not part of the body.) The |
3349 |
|
|
"body" here is the body as found in the posted article, |
3350 |
|
|
AFTER all transformations such as MIME encodings. |
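A sketch of the counting rule in Python, under the assumption (made only for this example) that the article is held as one string with lines terminated by "\n" and the body already in its final, encoded form:

```python
def count_body_lines(article):
    """Count the body lines of an article, as for the Lines header."""
    # The first empty line separates the headers from the body and is
    # not itself part of the body.
    _headers, separator, body = article.partition("\n\n")
    if not separator:
        return 0  # no separator line, hence no body
    lines = body.split("\n")
    if lines[-1] == "":
        lines.pop()  # the final terminator does not start a new line
    # Empty lines at the beginning or end of the body, and the
    # signature, are all counted.
    return len(lines)
```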
3351 |
|
|
|
3352 |
|
|
Reading agents SHOULD not rely on the presence of this |
3353 |
|
|
header, since it is optional (and some posting agents do not |
3354 |
|
|
supply it). They MUST not rely on it being precise, since |
3355 |
|
|
it frequently is not. |
3356 |
|
|
|
3357 |
|
|
NOTE: The average line length in article bodies is |
3358 |
|
|
surprisingly consistent at about 40 characters, |
3359 |
|
|
and since the line count typically is used only |
3360 |
|
|
for approximate judgements ("is this too long to |
3373 |
|
|
read quickly?"), dividing the byte count of the |
3374 |
|
|
body by 40 gives an estimate of the body line |
3375 |
|
|
count that is adequate for normal use. This esti- |
3376 |
|
|
mate is NOT adequate if the body has been MIME |
3377 |
|
|
encoded... but neither is the Lines header, since |
3378 |
|
|
at least one major relayer will supply a Lines |
3379 |
|
|
header for an article that lacks one, and will not |
3380 |
|
|
consider the possibility of MIME encodings when |
3381 |
|
|
computing the line count. |
3382 |
|
|
|
3383 |
|
|
NOTE: It would be better to have a Content-Size |
3384 |
|
|
header as part of MIME, so that body parts could |
3385 |
|
|
have their own sizes, and so that the units used |
3386 |
|
|
could be appropriate to the data type (line count |
3387 |
|
|
is not a useful measure of the size of an encoded |
3388 |
|
|
image, for example). Doing this is preferable to |
3389 |
|
|
trying to fix Lines. |
3390 |
|
|
|
3391 |
|
|
UNRESOLVED ISSUE: Update on Content-Size? |
3392 |
|
|
|
3393 |
|
|
Relayers SHOULD discard this header if they find it neces- |
3394 |
|
|
sary to re-encode the article in such a way that the origi- |
3395 |
|
|
nal Lines header would be rendered incorrect. |
3396 |
|
|
|
3397 |
|
|
|
3398 |
|
|
6.12. Xref |
3399 |
|
|
|
3400 |
|
|
The Xref header content indicates where an article was filed |
3401 |
|
|
by the last relayer to process it: |
3402 |
|
|
|
3403 |
|
|
Xref-content = relayer 1*( space location ) |
3404 |
|
|
relayer = relayer-name |
3405 |
|
|
location = newsgroup-name ":" article-locator |
3406 |
|
|
article-locator = 1*<ASCII printable character> |
3407 |
|
|
|
3408 |
|
|
The relayer's name is included so that software can deter- |
3409 |
|
|
mine which relayer generated the header (and specifically, |
3410 |
|
|
whether it really was the one that filed the copy being |
3411 |
|
|
examined). The locations specify what newsgroups the arti- |
3412 |
|
|
cle was filed under (which may differ from those in the |
3413 |
|
|
Newsgroups header) and where it was filed under them. The |
3414 |
|
|
exact form of an article locator is implementation-specific. |
3415 |
|
|
|
3416 |
|
|
NOTE: Reading agents can exploit this information |
3417 |
|
|
to avoid presenting the same article to a reader |
3418 |
|
|
several times. The information is sometimes |
3419 |
|
|
available in system databases, but having it in |
3420 |
|
|
the article is convenient. Relayers traditionally |
3421 |
|
|
generate an Xref header only if the article is |
3422 |
|
|
cross-posted, but this is not mandatory, and there |
3423 |
|
|
is at least one new application ("mirroring": |
3424 |
|
|
keeping news databases on two hosts identical) |
3425 |
|
|
where the header is useful in all articles. |
3426 |
|
|
|
NOTE: The traditional form of an article locator |
3440 |
|
|
is a decimal number, with articles in each news- |
3441 |
|
|
group numbered consecutively starting from 1. |
3442 |
|
|
NNTP [rrr] demands that such a model be provided, |
3443 |
|
|
and there may be other software which expects it, |
3444 |
|
|
but it seems desirable to permit flexibility for |
3445 |
|
|
unorthodox implementations. |
3446 |
|
|
|
3447 |
|
|
A relayer inserting an Xref header into an article MUST |
3448 |
|
|
delete any previous Xref header. A relayer which is not |
3449 |
|
|
inserting its own Xref header SHOULD delete any previous |
3450 |
|
|
Xref header. A relayer MAY delete the Xref header when |
3451 |
|
|
passing an article on to another relayer. |
3452 |
|
|
|
3453 |
|
|
NOTE: RFC 1036 specified that the Xref header was |
3454 |
|
|
not transmitted when an article was passed to |
3455 |
|
|
another relayer, but the major news implementa- |
3456 |
|
|
tions have never obeyed this rule, and applica- |
3457 |
|
|
tions like mirroring depend on this disobedience. |
3458 |
|
|
|
3459 |
|
|
A relayer MUST use the same name in Xref headers as it uses |
3460 |
|
|
in Path headers. Reading agents MUST ignore an Xref header |
3461 |
|
|
containing a relayer name that differs from the one that |
3462 |
|
|
begins the path list. |
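
NOTE: A reading agent's check might look like the following
Python sketch. The function name is hypothetical; the sketch
assumes "!" is the path delimiter and that relayer names are
compared exactly, character for character.

```python
def xref_is_trustworthy(xref_content: str, path_content: str) -> bool:
    """True if the Xref relayer name is the one that begins the path list."""
    # First white-space-delimited field of the Xref content, if any.
    xref_relayer = xref_content.split(None, 1)[0] if xref_content.strip() else ""
    # Assumption: path entries are separated by "!".
    first_path_entry = path_content.strip().split("!", 1)[0]
    return bool(xref_relayer) and xref_relayer == first_path_entry
```

When the function returns False, the reading agent would treat
the article as if it had no Xref header at all.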
3463 |
|
|
|
3464 |
|
|
|
3465 |
|
|
6.13. Organization |
3466 |
|
|
|
3467 |
|
|
The Organization header content is a short phrase identify- |
3468 |
|
|
ing the poster's organization: |
3469 |
|
|
|
3470 |
|
|
Organization-content = nonblank-text |
3471 |
|
|
|
3472 |
|
|
This header is typically supplied by the posting agent. The |
3473 |
|
|
Organization content SHOULD mention geographical location |
3474 |
|
|
(e.g. city and country) when it is not obvious from the |
3475 |
|
|
organization's name. |
3476 |
|
|
|
3477 |
|
|
NOTE: The motive here is that the organization is |
3478 |
|
|
often difficult to guess from the mailing address, |
3479 |
|
|
is not always supplied in a signature, and can |
3480 |
|
|
help identify the poster to the reader. |
3481 |
|
|
|
3482 |
|
|
NOTE: There is no "s" in "Organization". |
3483 |
|
|
|
3484 |
|
|
The Organization content is provided for identification |
3485 |
|
|
only, and does not imply that the poster speaks for the |
3486 |
|
|
organization or that the article represents organization |
3487 |
|
|
policy. Posting agents SHOULD permit the poster to override |
3488 |
|
|
a local default Organization header. |
3489 |
|
|
|
|
3505 |
|
|
6.14. Supersedes |
3506 |
|
|
|
3507 |
|
|
The Supersedes header content specifies articles to be can- |
3508 |
|
|
celled on arrival of this one: |
3509 |
|
|
|
3510 |
|
|
Supersedes-content = message-id *( space message-id ) |
3511 |
|
|
|
3512 |
|
|
Supersedes is equivalent to Also-Control (section 6.15) with |
3513 |
|
|
an implicit verb of "cancel" (section 7.1). |
3514 |
|
|
|
3515 |
|
|
NOTE: Supersedes is normally used where the arti- |
3516 |
|
|
cle is an updated version of the one(s) being can- |
3517 |
|
|
celled. |
3518 |
|
|
|
3519 |
|
|
NOTE: Although the ability to use multiple message |
3520 |
|
|
IDs in Supersedes is highly desirable (see section |
3521 |
|
|
7.1), posters are warned that existing implementa- |
3522 |
|
|
tions often do not correctly handle more than one. |
3523 |
|
|
|
3524 |
|
|
NOTE: There is no "c" in "Supersedes". |
3525 |
|
|
|
3526 |
|
|
An article with a Supersedes header MUST NOT have an
Also-Control or Control header.
3528 |
|
|
|
3529 |
|
|
|
3530 |
|
|
6.15. Also-Control |
3531 |
|
|
|
3532 |
|
|
The Also-Control header content marks the article as being a |
3533 |
|
|
control message IN ADDITION to being a normal news article, |
3534 |
|
|
and specifies the desired actions: |
3535 |
|
|
|
3536 |
|
|
Also-Control-content = Control-content |
3537 |
|
|
|
3538 |
|
|
An article with an Also-Control header is filed and passed |
3539 |
|
|
on normally, but the content of the Also-Control header is |
3540 |
|
|
processed as if it were found in a Control header. |
3541 |
|
|
|
3542 |
|
|
NOTE: It is sometimes desirable to piggyback con- |
3543 |
|
|
trol actions on a normal article, so that the |
3544 |
|
|
article will be filed normally but will also be |
3545 |
|
|
acted on as a control message. This header is |
3546 |
|
|
essentially a generalization of Supersedes. |
3547 |
|
|
|
3548 |
|
|
NOTE: Be warned that some old relayers do not |
3549 |
|
|
implement Also-Control. |
3550 |
|
|
|
3551 |
|
|
An article with an Also-Control header MUST NOT have a
Control or Supersedes header.
3553 |
|
|
|
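NOTE: The relationship between Control, Also-Control, and
Supersedes (sections 6.14 and 6.15) can be sketched in Python
as below. The function name is hypothetical, header names are
assumed to be already canonicalized, and raising an error on a
conflicting article is this sketch's choice; the Draft does not
say what a relayer does with such an article.

```python
from typing import Mapping, Optional

CONTROL_HEADERS = ("Control", "Also-Control", "Supersedes")


def effective_control(headers: Mapping[str, str]) -> Optional[str]:
    """Return the control content to act on, or None for a plain article."""
    present = [name for name in CONTROL_HEADERS if name in headers]
    if len(present) > 1:
        # The three headers are mutually exclusive.
        raise ValueError("conflicting headers: " + ", ".join(present))
    if not present:
        return None
    name = present[0]
    content = headers[name].strip()
    if name == "Supersedes":
        # Supersedes carries an implicit verb of "cancel".
        return "cancel " + content
    return content
```

Whether the article is also filed normally (Also-Control,
Supersedes) or not (Control) is a separate decision that this
sketch does not model.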
3554 |
|
|
|
3555 |
|
|
6.16. See-Also |
3556 |
|
|
|
3557 |
|
|
The See-Also header content lists message IDs of articles |
3558 |
|
|
that are related to this one but are not its precursors: |
3559 |
|
|
|
See-Also-content = message-id *( space message-id ) |
3572 |
|
|
|
3573 |
|
|
See-Also resembles References, but without the restrictions |
3574 |
|
|
imposed on References by the followup rules. |
3575 |
|
|
|
3576 |
|
|
NOTE: See-Also provides a way to group related |
3577 |
|
|
articles, such as the parts of a single document |
3578 |
|
|
that had to be split across multiple articles due |
3579 |
|
|
to its size, or to cross-reference between paral- |
3580 |
|
|
lel threads. |
3581 |
|
|
|
3582 |
|
|
NOTE: See the discussion (in section 6.5) on MAIL |
3583 |
|
|
compatibility issues of References and See-Also. |
3584 |
|
|
|
3585 |
|
|
NOTE: In the specific case where it is desired to
essentially make another article PART of the current
one, e.g. for annotation of the other article, MIME's
"message/external-body" convention can be used to do
so without actual inclusion. On 22 June 1993, IANA
registered "news-message-ID" as a standard
external-body access method, with a mandatory NAME
parameter giving the message ID and an optional SITE
parameter suggesting an NNTP site that might have the
article available (if it is not available locally).
3596 |
|
|
|
3597 |
|
|
UNRESOLVED ISSUE: Could the syntax be generalized |
3598 |
|
|
to include URLs as alternatives to message IDs? |
3599 |
|
|
Here it makes much more sense than in References. |
3600 |
|
|
|
3601 |
|
|
|
3602 |
|
|
6.17. Article-Names |
3603 |
|
|
|
3604 |
|
|
The Article-Names header content indicates any special sig- |
3605 |
|
|
nificance the article may have in particular newsgroups: |
3606 |
|
|
|
3607 |
|
|
     Article-Names-content = 1*( name-clause space )
     name-clause           = newsgroup-name ":" article-name
     article-name          = letter 1*( letter / digit / "-" )
3610 |
|
|
|
3611 |
|
|
Each name clause specifies a newsgroup (which SHOULD be |
3612 |
|
|
among those in the Newsgroups header) and an article name |
3613 |
|
|
local to that newsgroup. Article names MAY be used by |
3614 |
|
|
relayers to file the article in special ways, or they MAY |
3615 |
|
|
just be noted for possible special attention by reading |
3616 |
|
|
agents. Article names are case-sensitive. |
3617 |
|
|
|
3618 |
|
|
NOTE: This header provides a way to mark special |
3619 |
|
|
postings, such as introductions, frequently-asked- |
3620 |
|
|
question lists, etc., so that reading agents have |
3621 |
|
|
a way of finding them automatically. The news- |
3622 |
|
|
group name is specified for each article name |
3623 |
|
|
because the names may be newsgroup-specific; for |
3624 |
|
|
example, many frequently-asked-question lists are |
posted to "news.answers" in addition to their |
3638 |
|
|
"home" newsgroup, and they would not be known by |
3639 |
|
|
the same name(s) in both newsgroups. |
3640 |
|
|
|
3641 |
|
|
The Article-Names header SHOULD be ignored unless the arti- |
3642 |
|
|
cle also contains an Approved header. |
3643 |
|
|
|
3644 |
|
|
NOTE: This stipulation is made in anticipation of |
3645 |
|
|
the possibility that Approved headers will be |
3646 |
|
|
involved in cryptographic authentication. |
3647 |
|
|
|
3648 |
|
|
The presence of an Article-Names header does not necessarily |
3649 |
|
|
imply that the article will be retained unusually long |
3650 |
|
|
before expiration, or that previous article(s) with similar |
3651 |
|
|
Article-Names headers will be cancelled by its arrival. |
3652 |
|
|
Posters preparing special postings SHOULD include appropri- |
3653 |
|
|
ate other headers, such as Expires and Supersedes, to |
3654 |
|
|
request such actions. |
3655 |
|
|
|
3656 |
|
|
Different networks MAY establish different sets of article |
3657 |
|
|
names for the special postings they deem significant; it is |
3658 |
|
|
preferable for usage to be standardized within networks, |
3659 |
|
|
although it might be desirable for individual newsgroups to |
3660 |
|
|
have different naming conventions in some situations.
Article names MUST be 14 characters or fewer. The following
names are suggested but are not mandatory:
3663 |
|
|
|
3664 |
|
|
intro        Introduction to the newsgroup for newcomers.

charter      Charter, rules, organization, moderation
             policies, etc.

background   Biographies of special participants, history of
             the newsgroup, notes on related newsgroups, etc.

subgroups    Descriptions of sub-newsgroups under this
             newsgroup, e.g. "sci.space.news" under "sci.space".

facts        Information relating to the purpose of the
             newsgroup, e.g. an acronym glossary in "sci.space".

references   Where to get more information: books, journals,
             FTP repositories, etc.

faq          Answers to frequently-asked questions.

menu         If present, a list of all the other article
             names local to this newsgroup, with brief
             descriptions of their contents.
3686 |
|
|
|
3687 |
|
|
Such articles may be divided into subsections using the MIME |
3688 |
|
|
"multipart/mixed" conventions. If size considerations make |
3689 |
|
|
it necessary to split such articles, names ending in a |
3690 |
|
|
hyphen and a part number are suggested; for example, a |
three-part frequently-asked-questions list could have arti- |
3704 |
|
|
cle names "faq-1", "faq-2", and "faq-3". |
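
NOTE: The article-name syntax and the 14-character limit can
be checked mechanically. The Python sketch below is an
illustration only; the function names are hypothetical, and
"letter" is assumed to mean an ASCII letter.

```python
import re
from typing import List, Tuple

# letter 1*( letter / digit / "-" ): a letter, then one or more of the rest.
ARTICLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]+")
MAX_NAME_LENGTH = 14


def is_valid_article_name(name: str) -> bool:
    """Check one article name against the syntax and the length limit."""
    return len(name) <= MAX_NAME_LENGTH and ARTICLE_NAME.fullmatch(name) is not None


def parse_article_names(content: str) -> List[Tuple[str, str]]:
    """Split Article-Names content into (newsgroup, article-name) pairs."""
    clauses = []
    for clause in content.split():
        group, sep, name = clause.partition(":")
        if not sep or not group or not is_valid_article_name(name):
            raise ValueError(f"malformed name clause: {clause!r}")
        clauses.append((group, name))
    return clauses
```

Names are compared as given, without case folding, because
article names are case-sensitive.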
3705 |
|
|
|
3706 |
|
|
NOTE: It is somewhat premature to attempt to stan- |
3707 |
|
|
dardize article names, since this is essentially a |
3708 |
|
|
new feature with no experience behind it. How- |
3709 |
|
|
ever, if reading agents are to attach special sig- |
3710 |
|
|
nificance to these names, some attempt at standard |
3711 |
|
|
conventions is imperative. This is a first |
3712 |
|
|
attempt at providing some. |
3713 |
|
|
|
3714 |
|
|
|
3715 |
|
|
6.18. Article-Updates |
3716 |
|
|
|
3717 |
|
|
The Article-Updates header content indicates what previous |
3718 |
|
|
articles this one is deemed (by the poster) to update (i.e., |
3719 |
|
|
replace): |
3720 |
|
|
|
3721 |
|
|
Article-Updates-content = message-id *( space message-id ) |
3722 |
|
|
|
3723 |
|
|
Each message ID identifies a previous article that this one
is deemed to update. This MUST NOT cause the previous
article(s) to be cancelled or otherwise altered, unless this
is implied by other headers (e.g. Supersedes); Article-Updates
is merely an advisory which MAY be noted for special
attention by reading agents.
3729 |
|
|
|
3730 |
|
|
NOTE: This header provides a way to mark articles |
3731 |
|
|
which are only minor updates of previous ones, |
3732 |
|
|
containing no significant new information and not |
3733 |
|
|
worth reading if the previous ones have been read. |
3734 |
|
|
|
3735 |
|
|
NOTE: If suitable conventions using MIME multipart |
3736 |
|
|
bodies and the "message/external-body" body-part |
3737 |
|
|
type can be developed, a replacing article might |
3738 |
|
|
contain only differences between the old text and |
3739 |
|
|
the new text, rather than a complete new copy. |
3740 |
|
|
This is the motivation for not making Article- |
3741 |
|
|
Updates also function as Supersedes does: the |
3742 |
|
|
replacing article might depend on the continued |
3743 |
|
|
presence of the replaced article. |
3744 |
|
|
|
3745 |
|
|
|
3746 |
|
|
7. Control Messages |
3747 |
|
|
|
3748 |
|
|
The following sections document the currently-defined con- |
3749 |
|
|
trol messages. "Message" is used herein as a synonym for |
3750 |
|
|
"article" unless context indicates otherwise. |
3751 |
|
|
|
3752 |
|
|
Posting agents are warned that since certain control
messages require article bodies in quite specific formats,
signatures SHOULD NOT be appended to such articles, and it
may be wise to take greater care than usual to avoid
unintended (although perhaps well-meaning) alterations to
text supplied by the poster. Relayers MUST assume that
control messages mean what they say; they MAY be obeyed as
is or rejected, but MUST NOT be reinterpreted.
3772 |
|
|
|
3773 |
|
|
The execution of the actions requested by control messages |
3774 |
|
|
is subject to local administrative restrictions, which MAY |
3775 |
|
|
deny requests or refer them to an administrator for |
3776 |
|
|
approval. The descriptions below are generally phrased in |
3777 |
|
|
terms suggesting mandatory actions, but any or all of these |
3778 |
|
|
MAY be subject to local administrative approval (either as a |
3779 |
|
|
class or case-by-case). Analogously, where the description |
3780 |
|
|
below specifies that a message or portion thereof is to be |
3781 |
|
|
ignored, this action MAY include reporting it to an adminis- |
3782 |
|
|
trator. |
3783 |
|
|
|
3784 |
|
|
NOTE: The exact choice of local action might |
3785 |
|
|
depend on what action the control message |
3786 |
|
|
requests, who it claims to come from, etc. |
3787 |
|
|
|
3788 |
|
|
Relayers MUST propagate even control messages they do not |
3789 |
|
|
understand. |
3790 |
|
|
|
3791 |
|
|
In the following sections, each type of control message is |
3792 |
|
|
defined syntactically by defining its arguments and its |
3793 |
|
|
body. For example, "cancel" is defined by defining cancel- |
3794 |
|
|
arguments and cancel-body. |
3795 |
|
|
|
3796 |
|
|
|
3797 |
|
|
7.1. cancel |
3798 |
|
|
|
3799 |
|
|
The cancel message requests that one or more previous arti- |
3800 |
|
|
cles be "cancelled": |
3801 |
|
|
|
3802 |
|
|
     cancel-arguments = message-id *( space message-id )
     cancel-body      = body
3804 |
|
|
|
3805 |
|
|
The argument(s) identify the articles to be cancelled, by |
3806 |
|
|
message ID. The body is a comment, which software MUST |
3807 |
|
|
ignore, and SHOULD contain an indication of why the cancel- |
3808 |
|
|
lation was requested. The cancel message SHOULD be posted |
3809 |
|
|
to the same newsgroup(s), with the same distribution(s), as |
3810 |
|
|
the article(s) it is attempting to cancel. |
3811 |
|
|
|
3812 |
|
|
NOTE: Using the same newsgroups and distributions |
3813 |
|
|
maximizes the chances of the cancel message propa- |
3814 |
|
|
gating everywhere the target articles went. |
3815 |
|
|
|
3816 |
|
|
NOTE: RFC 1036 permitted only a single message-id |
3817 |
|
|
in a cancel message. Support for cancelling mul- |
3818 |
|
|
tiple articles is highly desirable, especially for |
3819 |
|
|
use with Supersedes (see section 6.14). If sev- |
3820 |
|
|
eral revisions of an article appear in fast suc- |
3821 |
|
|
cession, each using Supersedes to cancel the pre- |
3822 |
|
|
vious one, it is possible for a middle revision to |
be destroyed by cancellation before it is propa- |
3836 |
|
|
gated onward to cancel its predecessor. Allowing |
3837 |
|
|
each article to cancel several predecessors |
3838 |
|
|
greatly alleviates this problem. (Posting agents |
3839 |
|
|
preparing a cancel of an article which itself can- |
3840 |
|
|
cels other articles might wish to add those arti- |
3841 |
|
|
cles to the cancel-arguments.) However, posters |
3842 |
|
|
should be aware that much old software does not |
3843 |
|
|
implement multiple cancellation properly, and |
3844 |
|
|
should avoid using it when reliable cancellation |
3845 |
|
|
is vitally important. |
3846 |
|
|
|
3847 |
|
|
When an article (the "target article") is to be cancelled, |
3848 |
|
|
there are four cases of interest: the article hasn't arrived |
3849 |
|
|
yet, it has arrived and been filed and is available for |
3850 |
|
|
reading, it has expired and been archived on some less- |
3851 |
|
|
accessible storage medium, or it has expired and been |
3852 |
|
|
deleted. The next few paragraphs discuss each case in turn |
3853 |
|
|
(in reverse order, which is convenient for the explanation). |
3854 |
|
|
|
3855 |
|
|
EXPIRED AND DELETED. Take no action. |
3856 |
|
|
|
3857 |
|
|
EXPIRED AND ARCHIVED. If the article is readily accessible |
3858 |
|
|
and can be deleted or made unreadable easily, treat as under |
3859 |
|
|
AVAILABLE below. Otherwise treat as under EXPIRED AND |
3860 |
|
|
DELETED. |
3861 |
|
|
|
3862 |
|
|
NOTE: While it is desirable for archived articles |
3863 |
|
|
to be cancellable, this can easily involve rewrit- |
3864 |
|
|
ing an entire archive volume just to get rid of |
3865 |
|
|
one article, perhaps with manual actions required |
3866 |
|
|
to arrange it. It is difficult to envision a sit- |
3867 |
|
|
uation so dire as to require such measures from |
3868 |
|
|
hundreds or thousands of administrators, or for |
3869 |
|
|
that matter one in which widespread compliance |
3870 |
|
|
with such a request is likely. |
3871 |
|
|
|
3872 |
|
|
AVAILABLE. Compare the mailing addresses from the From |
3873 |
|
|
lines of the cancel message and the target article, bearing |
3874 |
|
|
in mind that local parts (except for "postmaster") are case- |
3875 |
|
|
sensitive and domains are case-insensitive. If they do not |
3876 |
|
|
match, either refer the issue to an administrator for a |
3877 |
|
|
case-by-case decision, or treat as if they matched. |
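
NOTE: The address comparison described above might be sketched
in Python as follows. The function name is hypothetical, and
the sketch assumes that the bare "local-part@domain" address
has already been extracted from each From header; treating an
address without "@" as a mismatch is this sketch's choice.

```python
def addresses_match(cancel_addr: str, target_addr: str) -> bool:
    """Compare two local-part@domain addresses for the AVAILABLE case."""
    c_local, c_sep, c_domain = cancel_addr.rpartition("@")
    t_local, t_sep, t_domain = target_addr.rpartition("@")
    if not (c_sep and t_sep):
        return False  # not in local-part@domain form
    # Domains are case-insensitive.
    if c_domain.lower() != t_domain.lower():
        return False
    # "postmaster" is the one local part that ignores case.
    if c_local.lower() == "postmaster" and t_local.lower() == "postmaster":
        return True
    # All other local parts are case-sensitive.
    return c_local == t_local
```

A False result does not mean the cancel is dropped: the
relayer either refers it to an administrator or proceeds as if
the addresses had matched.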
3878 |
|
|
|
3879 |
|
|
NOTE: It is generally trivial to forge articles, |
3880 |
|
|
so nothing short of cryptographic authentication |
3881 |
|
|
is really adequate to ensure that a cancel came |
3882 |
|
|
from the original article's author. Moreover, it |
3883 |
|
|
is highly desirable to permit authorities other |
3884 |
|
|
than the author to cancel articles, to allow for |
3885 |
|
|
cases in which the author is unavailable, uncoop- |
3886 |
|
|
erative, or malicious, and in which damage and/or |
3887 |
|
|
legal problems may be minimized by prompt cancel- |
3888 |
|
|
lation. Reliable authentication that would permit |
such administrative cancels would be a worthwhile |
3902 |
|
|
extension to this Draft, and experimental work in |
3903 |
|
|
this area is encouraged. |
3904 |
|
|
|
3905 |
|
|
NOTE: Meanwhile, a simple check of addresses is |
3906 |
|
|
useful accident prevention and catches at least |
3907 |
|
|
the most simple-minded forgers. Since the intent |
3908 |
|
|
is accident prevention rather than ironclad secu- |
3909 |
|
|
rity, use of the From address is appropriate, all |
3910 |
|
|
the more so because in the presence of gateways |
3911 |
|
|
(especially redundant multiple gateways), the |
3912 |
|
|
author may not have full control over Sender head- |
3913 |
|
|
ers. |
3914 |
|
|
|
3915 |
|
|
NOTE: The "refer... or treat as if they matched" |
3916 |
|
|
rule is intended to specifically forbid quietly |
3917 |
|
|
ignoring cancels with mismatched addresses. |
3918 |
|
|
|
3919 |
|
|
If the addresses match, then if technically possible, the |
3920 |
|
|
relayer MUST delete the target article completely and imme- |
3921 |
|
|
diately. Failing that, it MUST make the target article |
3922 |
|
|
unreadable (preferably to everyone, minimally to everyone |
3923 |
|
|
but the administrator) and either arrange for it to be |
3924 |
|
|
deleted as soon as possible or notify an administrator at |
3925 |
|
|
once. |
3926 |
|
|
|
3927 |
|
|
NOTE: To allow for events such as criminal |
3928 |
|
|
actions, malicious forgeries, and copyright |
3929 |
|
|
infringements, where damage and/or legal problems |
3930 |
|
|
may be minimized by prompt cancellation, complete |
3931 |
|
|
removal is strongly preferred over merely making |
3932 |
|
|
the target article unreadable. The potential for |
3933 |
|
|
malice is outweighed by the importance of really |
3934 |
|
|
getting rid of the target article in some legiti- |
3935 |
|
|
mate cases. (In cases of inadvertent copyright |
3936 |
|
|
violation in particular, the ability to quickly |
3937 |
|
|
remedy the violation is of considerable legal |
3938 |
|
|
importance.) Failing that, making it unreadable |
3939 |
|
|
is better than nothing. |
3940 |
|
|
|
3941 |
|
|
NOTE: Merely annotating the article so that read- |
3942 |
|
|
ers see an indication that the author wanted it |
3943 |
|
|
cancelled is not acceptable. Making the article |
3944 |
|
|
unreadable is the minimum action. |
3945 |
|
|
|
3946 |
|
|
NOTE: There have been experiments with making can- |
3947 |
|
|
celled articles unreadable, so that local news |
3948 |
|
|
administrators could reverse cancellations. In |
3949 |
|
|
practice, administrators almost never find cause |
3950 |
|
|
to do so. Removal appears to be clearly prefer- |
3951 |
|
|
able where technically feasible. |
3952 |
|
|
|
3953 |
|
|
NOT ARRIVED YET. If practical, retain the cancel message |
3954 |
|
|
until the target article does arrive, or until there is no |
further possibility of it arriving and being accepted (see |
3968 |
|
|
section 9.2), and then treat as under AVAILABLE. Failing |
3969 |
|
|
that, arrange for the target article to be rejected and dis- |
3970 |
|
|
carded if it does arrive. |
3971 |
|
|
|
3972 |
|
|
NOTE: It may well be impractical to retain the |
3973 |
|
|
control message, given uncertainty about whether |
3974 |
|
|
the target article will ever arrive. Existing |
3975 |
|
|
practice in such cases is to assume that addresses |
3976 |
|
|
would match and arrange the equivalent of dele- |
3977 |
|
|
tion. This is often done by making a spurious |
3978 |
|
|
entry in a database of already-seen message IDs |
3979 |
|
|
(see section 9.3), so that if the article does |
3980 |
|
|
arrive, it will be rejected as a duplicate. |
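
NOTE: The "spurious entry" technique can be illustrated with a
toy already-seen database in Python. The class and method
names are hypothetical, and a real relayer would keep this
information on disk and expire old entries.

```python
class History:
    """A toy already-seen database keyed by message ID."""

    def __init__(self) -> None:
        self._seen = set()

    def accept(self, message_id: str) -> bool:
        """Record an arriving article; False means reject as a duplicate."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True

    def precancel(self, message_id: str) -> None:
        """Make a spurious entry for a cancelled article not yet received."""
        self._seen.add(message_id)
```

After precancel() is called for a message ID, a later accept()
for the same ID returns False, so the late article is never
filed.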
3981 |
|
|
|
3982 |
|
|
The cancel message MUST be propagated onward in the usual |
3983 |
|
|
fashion, regardless of which of the four cases applied, so |
3984 |
|
|
that the target article will be cancelled everywhere even if |
3985 |
|
|
cancellation and target article follow different routes. |
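
NOTE: The four cases can be summarized as a small decision
procedure. The Python sketch below is an illustration only:
the state names, the two flags, and the action strings are all
hypothetical, and local administrative policy may override any
of the steps.

```python
from enum import Enum
from typing import List


class TargetState(Enum):
    NOT_ARRIVED = "not arrived yet"
    AVAILABLE = "available"
    ARCHIVED = "expired and archived"
    DELETED = "expired and deleted"


def plan_cancel(state: TargetState, *, archive_is_handy: bool = False,
                can_retain_cancel: bool = False) -> List[str]:
    """Return the action names a relayer would take, in order."""
    if state is TargetState.ARCHIVED:
        # Readily accessible archives are handled like filed articles;
        # anything else is treated as already gone.
        state = TargetState.AVAILABLE if archive_is_handy else TargetState.DELETED

    actions: List[str] = []
    if state is TargetState.AVAILABLE:
        actions += ["compare-from-addresses", "delete-or-make-unreadable"]
    elif state is TargetState.NOT_ARRIVED:
        if can_retain_cancel:
            actions.append("retain-cancel-until-target-arrives")
        else:
            actions.append("reject-target-if-it-arrives")
    # EXPIRED AND DELETED: nothing to do locally.

    # Propagation happens in all four cases.
    actions.append("propagate-cancel")
    return actions
```

Every path through the function ends with the propagation
step, which reflects the rule that the cancel message is
passed on regardless of local state.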
3986 |
|
|
|
3987 |
|
|
NOTE: RFC 1036 appeared to require stopping cancel |
3988 |
|
|
propagation in the NOT ARRIVED YET case, although |
3989 |
|
|
the wording was somewhat unclear. This appears to |
3990 |
|
|
have been an unwise decision; there are known |
3991 |
|
|
cases of important cancellations (in situations |
3992 |
|
|
of, e.g., inadvertent copyright violation) achiev- |
3993 |
|
|
ing rather poorer propagation than the target |
3994 |
|
|
article. News propagation is often a much less |
3995 |
|
|
orderly process than the authors of RFC 1036 |
3996 |
|
|
apparently envisioned. Modern implementations |
3997 |
|
|
generally propagate the cancellation regardless. |
3998 |
|
|
|
3999 |
|
|
Posting agents meant for use by ordinary posters SHOULD |
4000 |
|
|
reject an attempt to post a cancel message if the target |
4001 |
|
|
article is available and the mailing address in its From |
4002 |
|
|
header does not match the one in the cancel message's From |
4003 |
|
|
header. |
4004 |
|
|
|
4005 |
|
|
NOTE: This, again, is primarily accident preven- |
4006 |
|
|
tion. |
4007 |
|
|
|
4008 |
|
|
|
4009 |
|
|
7.2. ihave, sendme |
4010 |
|
|
|
4011 |
|
|
The ihave and sendme control messages implement a crude |
4012 |
|
|
batched predecessor of the NNTP [rrr] protocol. They are |
4013 |
|
|
largely obsolete in the Internet, but still see use in the |
4014 |
|
|
UUCP environment, especially for backup feeds that normally |
4015 |
|
|
are active only when a primary feed path has failed. |
4016 |
|
|
|
4017 |
|
|
NOTE: The ihave and sendme messages defined here |
4018 |
|
|
have ABSOLUTELY NOTHING TO DO WITH NNTP, despite |
4019 |
|
|
similarities of terminology. |
4020 |
|
|
|
The two messages share the same syntax: |
4034 |
|
|
|
4035 |
|
|
     ihave-arguments  = *( message-id space ) relayer-name
     sendme-arguments = ihave-arguments
     ihave-body       = *( message-id eol )
     sendme-body      = ihave-body
4039 |
|
|
|
4040 |
|
|
Message IDs MUST appear in either the arguments or the body, |
4041 |
|
|
but not both. Relayers SHOULD generate the form putting |
4042 |
|
|
message IDs in the body, but the other form MUST be sup- |
4043 |
|
|
ported for backward compatibility. |
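
NOTE: A receiver that accepts both forms might parse an ihave
or sendme message as in the following Python sketch. The
function name is hypothetical; the sketch assumes the relayer
name is the last argument and that blank body lines can be
skipped.

```python
from typing import List, Tuple


def parse_ihave(arguments: str, body: str) -> Tuple[str, List[str]]:
    """Return (relayer_name, message_ids) for an ihave or sendme message."""
    fields = arguments.split()
    if not fields:
        raise ValueError("missing relayer name")
    # Any message IDs come first; the relayer name is the final argument.
    *arg_ids, relayer = fields
    body_ids = [line.strip() for line in body.splitlines() if line.strip()]
    if arg_ids and body_ids:
        raise ValueError("message IDs in both arguments and body")
    return relayer, arg_ids or body_ids
```

Both the preferred body form and the older argument form yield
the same result, so the rest of the relayer need not care
which one was received.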
4044 |
|
|
|
4045 |
|
|
NOTE: RFC 1036 made the relayer name optional, but |
4046 |
|
|
difficulties could easily ensue in determining the |
4047 |
|
|
origin of the message, and this option is believed |
4048 |
|
|
to be unused nowadays. Putting the message IDs in |
4049 |
|
|
the body is strongly preferred over putting them |
4050 |
|
|
in the arguments because it lends itself much bet- |
4051 |
|
|
ter to large numbers of message IDs and avoids the |
4052 |
|
|
empty-body problem mentioned in section 4.3.1. |
4053 |
|
|
|
4054 |
|
|
The ihave message states that the named relayer has filed |
4055 |
|
|
articles with the specified message IDs, which may be of |
4056 |
|
|
interest to the relayer(s) receiving the ihave message. The |
4057 |
|
|
sendme message requests that the relayer receiving it send |
4058 |
|
|
the articles having the specified message IDs to the named |
4059 |
|
|
relayer. |
4060 |
|
|
|
4061 |
|
|
These control messages are normally sent essentially as |
4062 |
|
|
point-to-point messages, by using "to." newsgroups (see sec- |
4063 |
|
|
tion 5.5) that are sent only to the relayer the messages are |
4064 |
|
|
intended for. The two relayers MUST be neighbors, exchang- |
4065 |
|
|
ing news directly with each other. Each relayer advertises |
4066 |
|
|
its new arrivals to the other using ihave messages, and each |
4067 |
|
|
uses sendme messages to request the articles it lacks. |
4068 |
|
|
|
4069 |
|
|
NOTE: Arguably these point-to-point control mes- |
4070 |
|
|
sages should flow by some other protocol, e.g. |
4071 |
|
|
mail, but administrative and interfacing issues |
4072 |
|
|
are simplified if the news system doesn't need to |
4073 |
|
|
talk to the mail system. |
4074 |
|
|
|
4075 |
|
|
To reduce overhead, ihave and sendme messages SHOULD be sent |
4076 |
|
|
relatively infrequently and SHOULD contain substantial num- |
4077 |
|
|
bers of message IDs. If ihave and sendme are being used to |
4078 |
|
|
implement a backup feed, it may be desirable to insert a |
4079 |
|
|
delay between reception of an ihave and generation of a |
4080 |
|
|
sendme, so that a slightly slow primary feed will not cause |
4081 |
|
|
large numbers of articles to be requested unnecessarily via |
4082 |
|
|
sendme. |
4083 |
|
|
|
|
4099 |
|
|
7.3. newgroup |
4100 |
|
|
|
4101 |
|
|
The newgroup control message requests that a new newsgroup |
4102 |
|
|
be created: |
4103 |
|
|
|
4104 |
|
|
     newgroup-arguments = newsgroup-name [ space moderation ]
     moderation         = "moderated" / "unmoderated"
     newgroup-body      = body
                        / [ body ] descriptor [ body ]
     descriptor         = descriptor-tag eol description-line eol
     descriptor-tag     = "For your newsgroups file:"
     description-line   = newsgroup-name space description
     description        = nonblank-text [ " (Moderated)" ]
4112 |
|
|
|
4113 |
|
|
The first argument names the newsgroup to be created, and |
4114 |
|
|
the second one (if present) indicates whether it is moder- |
4115 |
|
|
ated. If there is no second argument, the default is |
4116 |
|
|
"unmoderated". |
4117 |
|
|
|
4118 |
|
|
NOTE: Implementors are warned that there is occa- |
4119 |
|
|
sional use of other forms in the second argument. |
4120 |
|
|
It is suggested that such violations of this |
4121 |
|
|
Draft, which are also violations of RFC 1036, |
4122 |
|
|
cause the newgroup message to be ignored. RFC |
4123 |
|
|
1036 was slightly vague about how second arguments |
4124 |
|
|
other than "moderated" were to be treated (specif- |
4125 |
|
|
ically, whether they were illegal or just |
4126 |
|
|
ignored), but it is thought that all existing |
4127 |
|
|
major implementations will handle "unmoderated" |
4128 |
|
|
correctly, and it appears desirable to tighten up |
4129 |
|
|
the specs to make it possible for other forms to |
4130 |
|
|
be used in future. |
4131 |
|
|
|
The body is a comment, which software MUST ignore, except
that if it contains a descriptor, the description line is
intended to be suitable for addition to a list of newsgroup
descriptions. The description cannot be continued onto
later lines, but is not constrained to any particular
length. Moderated newsgroups have descriptions that end
with the string " (Moderated)" (note that this string begins
with a blank).

     NOTE: It is unfortunate that the description line is
     part of the body, rather than being supplied in a
     header, but this is established practice. Newsgroup
     creators are cautioned that the descriptor tag must be
     reproduced exactly as given above, alone on a line, and
     is case-sensitive. (To reduce errors in this regard,
     posting agents might wish to question or reject newgroup
     messages which do not contain a descriptor.) Given the
     desire for short lines, description writers should avoid
     content-free phrases like "discussion of" and "news
     about", and stick to defining what the newsgroup is
     about.

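As an illustration only, the argument and descriptor rules above
might be applied roughly as follows. This is a sketch, not part of
the Draft: the names parse_newgroup and NewgroupRequest are
hypothetical, and "space" is assumed to mean a run of blanks or
tabs. A message whose second argument is anything other than
"moderated" or "unmoderated" is ignored, as the NOTE on the second
argument suggests.

```python
from typing import NamedTuple, Optional

DESCRIPTOR_TAG = "For your newsgroups file:"


class NewgroupRequest(NamedTuple):
    group: str
    moderated: bool
    description: Optional[str]  # None if the body has no usable descriptor


def parse_newgroup(arguments: str, body: str) -> Optional[NewgroupRequest]:
    """Parse newgroup arguments and body; return None to ignore the message."""
    args = arguments.split()
    if len(args) == 1:
        moderated = False  # no second argument: the default is "unmoderated"
    elif len(args) == 2 and args[1] in ("moderated", "unmoderated"):
        moderated = args[1] == "moderated"
    else:
        return None  # any other form: ignore the message

    group = args[0]
    description = None
    lines = body.splitlines()
    for i, line in enumerate(lines[:-1]):
        if line == DESCRIPTOR_TAG:  # exact, case-sensitive, alone on its line
            parts = lines[i + 1].split(None, 1)
            if len(parts) == 2 and parts[0] == group:
                description = parts[1]
            break
    return NewgroupRequest(group, moderated, description)
```

Everything in the body other than the description line is treated
as a comment and is not interpreted.
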
The remainder of the body SHOULD contain an explanation of
the purpose of the newsgroup and the decision to create it.

     NOTE: Criteria for newsgroup creation vary widely and
     are outside the scope of this Draft, but if formal
     procedures of one kind or another were followed in the
     decision, the body should mention this. Administrators
     often look for such information when deciding whether
     to comply with creation/deletion requests.

A newgroup message which lacks an Approved header MUST be
ignored.

     NOTE: It would also be desirable to ignore a newgroup
     message unless its Approved header names a person who
     is authorized (in some sense) to create such a
     newsgroup. A cooperating subnet with sufficiently
     strong coordination to maintain a correct and current
     list of authorized creators might wish to do so for its
     internal newsgroups. It also (or alternatively) might
     wish to ignore a newgroup message for an internal
     newsgroup that was posted (or cross-posted) to a
     non-internal newsgroup.

     NOTE: As mentioned in section 6.10, some form of
     (cryptographic?) authentication of Approved headers
     would be highly desirable, especially for control
     messages.

It would be desirable to provide some way of supplying a
moderator's address in a newgroup message for a moderated
newsgroup, but this will cause problems unless effective
authentication is available, so it is left for future work.

     NOTE: This leaves news administrators stuck with the
     annoying chore of arranging proper mailing of
     moderated-newsgroup submissions. On Usenet, this can be
     simplified by exploiting a forwarding facility that
     some major sites provide: they maintain forwarding
     addresses, each the name of a moderated newsgroup with
     all periods (".", ASCII 46) replaced by hyphens ("-",
     ASCII 45), which forward mail to the current newsgroup
     moderators. More advice on the subject of forwarding to
     moderators can be found in the document titled "How to
     Construct the Mailpaths File", posted regularly to the
     Usenet newsgroups news.lists, news.admin.misc, and
     news.answers.

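The period-to-hyphen convention for forwarding addresses can be
shown in a few lines. This is a minimal sketch: the function name
and the forwarding host "forwarder.example" are hypothetical, and
the Draft does not name any particular forwarding site.

```python
def moderator_forwarding_address(newsgroup: str, forwarder: str) -> str:
    """Build a submission address at a forwarding site for a moderated group."""
    # Every period in the newsgroup name becomes a hyphen in the local part.
    return newsgroup.replace(".", "-") + "@" + forwarder
```

For example, submissions to a group named comp.lang.foo would be
mailed to comp-lang-foo at the chosen forwarding site.
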
A newgroup message naming a newsgroup that already exists is
requesting a change in the moderation status or description
of the newsgroup. The same rules apply.


7.4. rmgroup

The rmgroup message requests that a newsgroup be deleted:

     rmgroup-arguments = newsgroup-name
     rmgroup-body      = body

The sole argument is the newsgroup name. The body is a
comment, which software MUST ignore; it SHOULD contain an
explanation of the decision to delete the newsgroup.

     NOTE: Criteria for newsgroup deletion vary widely and
     are outside the scope of this Draft, but if formal
     procedures of one kind or another were followed in the
     decision, the body should mention this. Administrators
     often look for such information when deciding whether
     to comply with creation/deletion requests.

A rmgroup message which lacks an Approved header MUST be
ignored.

     NOTE: It would also be desirable to ignore a rmgroup
     message unless its Approved header names a person who
     is authorized (in some sense) to delete such a
     newsgroup. A cooperating subnet with sufficiently
     strong coordination to maintain a correct and current
     list of authorized deleters might wish to do so for its
     internal newsgroups. It also (or alternatively) might
     wish to ignore a rmgroup message for an internal
     newsgroup that was posted (or cross-posted) to a
     non-internal newsgroup.

Unexpected deletion of a newsgroup being a disruptive
action, implementations are strongly advised to refer
rmgroup messages to an administrator by default, unless
perhaps the message can be determined to have originated
within a cooperating subnet whose members are considered
trustworthy. Abuses have occurred.

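The default handling advised for rmgroup can be summarized as a
small decision function. This is a sketch under assumptions: the
names Action and rmgroup_action are hypothetical, headers are
assumed to be available as a mapping keyed by header name, and
the from_trusted_subnet flag stands in for whatever local check
(if any) establishes that a message originated within a trusted
cooperating subnet. The presence of an Approved header alone
proves nothing about origin, since it is not authenticated.

```python
from enum import Enum
from typing import Mapping


class Action(Enum):
    IGNORE = "ignore"
    REFER_TO_ADMIN = "refer to administrator"
    DELETE_GROUP = "delete newsgroup"


def rmgroup_action(headers: Mapping[str, str],
                   from_trusted_subnet: bool = False) -> Action:
    """Choose what to do with an incoming rmgroup control message."""
    if not headers.get("Approved", "").strip():
        return Action.IGNORE  # no Approved header: MUST be ignored
    if from_trusted_subnet:
        return Action.DELETE_GROUP  # local policy chose to trust this origin
    return Action.REFER_TO_ADMIN  # the advised default
```
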
7.5. sendsys, version, whogets

The sendsys message requests that a description of the
relayer's news feeds to other relayers be mailed to the
article's reply address:

     sendsys-arguments = [ relayer-name ]
     sendsys-body      = body

If there is an argument, relayers other than the one named
by the argument MUST NOT respond. The body is a comment,
which software MUST ignore; it SHOULD contain an explanation
of the reason for the request.

The version message requests that the name and version of
the relayer software be mailed to the reply address:

     version-arguments =
     version-body      = body

There are no arguments. The body is a comment, which
software MUST ignore; it SHOULD contain an explanation of
the reason for the request.

The whogets message requests that a description of the
relayer and its news feeds to other relayers be mailed to
the article's reply address:

     whogets-arguments = newsgroup-name [ space relayer-name ]
     whogets-body      = body

The first argument is the name of the "target newsgroup",
specifying the newsgroup for which propagation information
is desired. This MUST be a complete newsgroup name, not the
name of a hierarchy or a portion of a newsgroup name that is
not itself the name of a newsgroup. If there is a second
argument, only the relayer named by that argument should
respond. The body is a comment, which software MUST ignore;
it SHOULD contain an explanation of the reason for the
request.

     NOTE: Whogets is intended as a replacement for sendsys
     (and version) with a precisely-specified reply format.
     Since the syntax for specifying what newsgroups get
     sent to what other relayers varies widely between
     different forms of relayer software, the only practical
     way to standardize the reply format is to indicate a
     specific newsgroup and ask where THAT newsgroup
     propagates. The requirement that it be a complete
     newsgroup name is intended to (largely) avoid the
     problem of having to answer "yes and no" in cases where
     not all newsgroups in a hierarchy are sent.

Any of these messages lacking an Approved header MUST be
ignored. Response to any of these messages SHOULD be
delayed for at least 24 hours, and no response should be
attempted if the message has been cancelled in that time.
Also, no response SHOULD be attempted unless the local part
of the destination address is "newsmap". News
administrators SHOULD arrange for mail to "newsmap" on their
systems to be discarded (without reply) unless legitimate
use is in progress.

     NOTE: Because these messages can cause many, many
     relayers to send mail to one person, such messages,
     specifying mailing to an innocent person's mailbox,
     have been forged as a half-witted practical joke. A
     delay gives administrators time to notice a fraudulent
     message and act (by cancelling the message, preparing
     to divert the flood of mail into the bit bucket, or
     both). Restriction of the destination address to
     "newsmap" reduces the appeal of fraud by making it
     impossible to use it to harass a normal user. (A site
     which does NOT discard mail to "newsmap", but rather
     bounces it back, may incur higher communications costs
     than if the mail had been accepted into a user's
     mailbox... but a malicious forger could accomplish this
     anyway, by using an address whose local part is very
     unlikely to be a legitimate mailbox name.)

     NOTE: RFC 1036 did not require the Approved header for
     these control messages. This has been added because of
     the possibility that cryptographic authentication of
     Approved headers will become available.

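The conditions on responding to sendsys, version, and whogets
messages can be gathered into one check. This is a sketch under
assumptions: the function name may_respond and its parameters are
hypothetical, headers are assumed to be a mapping keyed by header
name, the reply address is assumed to be in plain local@domain
form, and the caller is assumed to know when the message arrived
and whether it has since been cancelled.

```python
from datetime import datetime, timedelta
from typing import Mapping

MIN_DELAY = timedelta(hours=24)


def may_respond(headers: Mapping[str, str], reply_address: str,
                arrived: datetime, now: datetime, cancelled: bool) -> bool:
    """Decide whether a reply to a mapping control message may be mailed."""
    if "Approved" not in headers:
        return False  # such messages MUST be ignored
    if cancelled:
        return False  # cancelled during the waiting period
    if now - arrived < MIN_DELAY:
        return False  # still inside the 24-hour delay
    local_part = reply_address.rsplit("@", 1)[0]
    return local_part == "newsmap"
```

A relayer would typically queue the message on arrival and call
such a check when the delay expires, rather than responding
immediately.
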
The body of the reply to a sendsys message SHOULD be of the
form:

     sendsys-reply      = responder 1*sys-line
     responder          = "Responding-System:" space domain eol
     sys-line           = relayer-name ":" newsgroup-patterns
                          [ ":" text ] eol
     newsgroup-patterns = newsgroup-name *( "," newsgroup-name )

The first line identifies the responding system, using a
syntax resembling a header (but note that it is part of the
BODY). Remaining lines indicate what newsgroups are sent to
what other systems. The syntax of newsgroup patterns is not
well standardized; the form described is common (often with
newsgroup names only partially given, denoting all names
starting with a particular set of components) but not
universal. The whogets message provides a better-defined
alternative.

The reply to a version message is of somewhat ill-defined
form, with a body normally consisting of a single line of
text that somehow describes the version of the relayer
software. The whogets message provides a better-defined
alternative.

The body of the reply to a whogets message MUST be of the
form:

     whogets-reply     = responder-domain responder-relayer response-date
                         responding-to arrived-via responder-version
                         whogets-delimiter *pass-line
     responder-domain  = "Responding-System:" space domain eol
     responder-relayer = "Responding-Relayer:" space relayer-name eol
     response-date     = "Response-Date:" space date eol
     responding-to     = "Responding-To:" space message-id eol
     arrived-via       = "Arrived-Via:" path-list eol
     responder-version = "Responding-Version:" space nonblank-text eol
     whogets-delimiter = eol
     pass-line         = relayer-name [ space domain ] eol

The first six lines identify the responding relayer by its
Internet domain name (use of the ".uucp" and ".bitnet"
pseudo-domains is permissible, for registered hosts in them,
but discouraged) and its relayer name, specify the date when
the reply was generated and the message ID of the whogets
message being replied to, give the path list (from the Path
header) of the whogets message (which MAY, if absolutely
necessary, be truncated to a convenient length, but MUST
contain at least the leading three relayer names), and
indicate the version of relayer software responding. Note
that these lines are part of the BODY even though their
format resembles that of headers. Despite the
apparently-fixed order specified by the syntax above, they
can appear in any order, but there must be exactly one of
each.

After those preliminaries, and an empty line to
unambiguously define their end, the remaining lines are the
relayer names (which MAY be accompanied by the corresponding
domain names, if known) of systems which the responding
system passes the target newsgroup to. Only the names of
news relayers are to be included.

     NOTE: It is desirable for a reply to identify its
     source by both domain name and relayer name because
     news propagation is governed by the latter but location
     in a broader context is best determined by the former.
     The date and whogets message ID should, in principle,
     be present in the MAIL headers, but are included in the
     body for robustness in the presence of uncooperative
     mail systems. The reason for the path list is discussed
     below. Adding version information eliminates the need
     for a separate message to gather it.

     NOTE: The limitation of pass lines to contain only
     names of news relayers is meant to exclude names used
     within a single host (as identifiers for mail gateways,
     portions of ihave/sendme implementations, etc.), which
     do not actually refer to other hosts.

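A whogets reply body could be assembled along the following
lines. This is a sketch, not a reference implementation: the
function name and all sample values are hypothetical, the date is
assumed to arrive already formatted, and EOL is written as a
single newline. The sketch writes one blank after every colon,
Arrived-Via included; treat that as an assumption about the
intended syntax.

```python
from typing import Iterable, Optional, Tuple


def whogets_reply(domain: str, relayer: str, date: str, message_id: str,
                  path_list: str, version: str,
                  passes: Iterable[Tuple[str, Optional[str]]]) -> str:
    """Build the body of a whogets reply (these lines are BODY, not headers)."""
    preliminaries = [
        f"Responding-System: {domain}",
        f"Responding-Relayer: {relayer}",
        f"Response-Date: {date}",
        f"Responding-To: {message_id}",
        f"Arrived-Via: {path_list}",
        f"Responding-Version: {version}",
    ]
    # One pass line per relayer fed, with its domain name when known.
    pass_lines = [name if dom is None else f"{name} {dom}"
                  for name, dom in passes]
    # The empty line marks the end of the six preliminary lines.
    return "\n".join(preliminaries + [""] + pass_lines) + "\n"
```

Names used only within the responding host (mail gateways and
the like) would be filtered out by the caller before the pass
list is handed to such a function.
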
A relayer which is unaware of the existence of the target
newsgroup MUST NOT reply to a whogets message at all,
although this MUST NOT influence decisions on whether to
pass the article on to other relayers.

     NOTE: While this may result in discontinuous maps in
     cases where some hosts have not honored requests for
     creation of a newsgroup, it will also prevent a flood
     of useless responses in the event that a whogets
     message intended to map a small region "leaks" out to a
     larger one. The possibility of discontinuous
     recognition of a newsgroup does make it important that
     the whogets message itself continue to propagate (if
     other criteria permit). This is also the reason for the
     inclusion of the whogets message's path list, or at
     least the leading portion of it, in the reply: to
     permit reconstruction of at least small gaps in maps.

Different networks set different rules for the legitimacy of
these messages, given that they may reveal details of
organization-internal topology that are sometimes considered
proprietary.

     NOTE: On Usenet, in particular, willingness to respond
     to these messages is held to be a condition of network
     membership: the topology of Usenet is public
     information. Organizations wishing to belong to such
     networks while keeping their internal topology
     confidential might wish to organize their internal news
     software so that all articles reaching outsiders appear
     to be from a single "gatekeeper" system, with the
     details of internal topology hidden behind that system.

     UNRESOLVED ISSUE: It might be useful to have a way to
     set some sort of hop limit for these.

7.6. checkgroups

The checkgroups control message contains a supposedly
authoritative list of the valid newsgroups within some
subset of the newsgroup name space:

     checkgroups-arguments =
     checkgroups-body      = [ invalidation ] valid-groups
                           / invalidation
     invalidation          = "!" plain-component *( "," plain-component ) eol
     valid-groups          = 1*( description-line eol )

There are no arguments. The body lines (except possibly for
an initial invalidation) each contain a description line for
a newsgroup, as defined under the newgroup message (section
7.3).

     NOTE: Some other, ill-defined, forms of the checkgroups
     body were formerly used. See appendix A.

The checkgroups message applies to all hierarchies
containing any of the newsgroups listed in the body. The
checkgroups message asserts that the newsgroups it lists are
the only newsgroups in those hierarchies. If there is an
invalidation, it asserts that the hierarchies it names no
longer contain any newsgroups.

Processing a checkgroups message MAY cause a local list of
newsgroup descriptions to be updated. It SHOULD also cause
the local lists of newsgroups (and their moderation
statuses) in the mentioned hierarchies to be checked against
the message. The results of the check MAY be used for
automatic corrective action, or MAY be reported to the news
administrator in some way.

     NOTE: Automatically updating descriptions of existing
     newsgroups is relatively safe. In the case of newsgroup
     additions or deletions, simply notifying the
     administrator is generally the wisest action, unless
     perhaps the message can be determined to have
     originated within a cooperating subnet whose members
     are considered trustworthy.

     NOTE: There is a problem with the checkgroups concept:
     not all newsgroups in a hierarchy necessarily propagate
     to the same set of machines. (Notably, there is a set
     of newsgroups known as the "inet" newsgroups, which
     have relatively limited distribution but coexist in
     several hierarchies with more widely-distributed
     newsgroups.) The advice of checkgroups should always be
     taken with a grain of salt, and should never be
     followed blindly.

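The check described above can be sketched as a comparison that
only reports differences and takes no corrective action. The
function names are hypothetical. The sketch assumes that a
hierarchy is identified by the first component of a newsgroup
name, that "space" is a run of blanks or tabs, and that the local
list of newsgroups is available as a set of names.

```python
from typing import Dict, Iterable, List, Tuple


def parse_checkgroups(body: str) -> Tuple[List[str], Dict[str, str]]:
    """Return (invalidated hierarchies, {newsgroup name: description})."""
    lines = body.splitlines()
    invalidated: List[str] = []
    if lines and lines[0].startswith("!"):
        invalidated = lines[0][1:].split(",")
        lines = lines[1:]
    listed: Dict[str, str] = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) == 2:
            listed[parts[0]] = parts[1]
    return invalidated, listed


def check_against(local_groups: Iterable[str], invalidated: Iterable[str],
                  listed: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (groups listed but not local, local groups the message disowns)."""
    local = set(local_groups)
    hierarchies = {name.split(".")[0] for name in listed} | set(invalidated)
    in_scope = {g for g in local if g.split(".")[0] in hierarchies}
    missing = sorted(set(listed) - local)
    extra = sorted(in_scope - set(listed))
    return missing, extra
```

Both result lists would normally be reported to the news
administrator rather than acted upon automatically.
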
8. Transmission Formats

While this Draft does not specify transmission methods
except to place a few constraints on them, there are some
data formats used only for transmission that are unique to
news.


8.1. Batches

For efficient bulk transmission and processing of news
articles, it is often desirable to transmit a number of them
as a single block of data, a "batch". The format of a batch
is:

     batch        = 1*( batch-header article )
     batch-header = "#! rnews " article-size eol
     article-size = 1*digit

A batch is a sequence of articles, each prefixed by a header
line that includes its size. The article size is a decimal
count of the octets in the article, counting each EOL as one
octet regardless of how it is actually represented.

     NOTE: A relayer might wish to accept either a single
     article or a batch as input. Since "#" cannot appear in
     a header name, examination of the first octet of the
     input will reveal its nature.

     NOTE: In the header line, there is exactly one blank
     before "rnews", there is exactly one blank after
     "rnews", and the EOL immediately follows the article
     size. Beware that some software inserts non-standard
     trash after the size.

     NOTE: Despite the similarity of this format to the
     executable-script format used by some operating
     systems, it is EXTREMELY unwise to just feed incoming
     batches to a command interpreter in the anticipation
     that it will run a command named "rnews" to process the
     batch. Unless arrangements are made to very tightly
     restrict the range of commands that can be executed by
     this means, the security implications are disastrous.

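Splitting a batch into its articles might be sketched as below.
The function names are hypothetical. The sketch assumes the batch
is held in memory with each EOL represented as a single LF octet,
so that the stated sizes match octet counts directly, and it
rejects a header line carrying anything after the size instead of
trying to tolerate it.

```python
from typing import List

BATCH_PREFIX = b"#! rnews "


def looks_like_batch(data: bytes) -> bool:
    """A header name cannot contain "#", so the first octet settles it."""
    return data[:1] == b"#"


def split_batch(data: bytes) -> List[bytes]:
    """Split a batch into articles; raise ValueError if it is malformed."""
    articles = []
    pos = 0
    while pos < len(data):
        eol = data.index(b"\n", pos)  # ValueError if the header has no EOL
        header = data[pos:eol]
        size_text = header[len(BATCH_PREFIX):]
        if not header.startswith(BATCH_PREFIX) or not size_text.isdigit():
            raise ValueError("malformed batch header: %r" % header)
        start = eol + 1
        end = start + int(size_text)
        if end > len(data):
            raise ValueError("batch is truncated")
        articles.append(data[start:end])
        pos = end
    return articles
```

The batch is treated purely as data here; nothing in it is ever
handed to a command interpreter.
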
8.2. Encoded Batches

When transmitting news, especially over communications links
that are slow or are billed by the bit, it is often
desirable to batch news and apply data compression to the
batches. Transmission links sending compressed batches
SHOULD use out-of-band means of communication to specify the
compression algorithm being used. If there is no way to
send out-of-band information along with a batch, the
following encapsulation for a compressed batch MAY be used:

     ec-batch            = "#! " compression-keyword eol compressed-batch
     compression-keyword = "cunbatch"

A line containing a keyword indicating the type of
compression is followed by the compressed batch. The only
truly widespread compression keyword at present is
"cunbatch", indicating compression using the
widely-distributed "compress" program. Other compression
keywords MAY be used by mutual agreement between the hosts
involved.

     NOTE: An encapsulated compressed batch is NOT, in
     general, a text file, despite having an initial text
     line. This combination of text and non-text data is
     often awkward to handle; for example, standard
     decompression programs cannot be used without first
     stripping off the initial line, and that in turn is
     painful to do because many text-handling tools that are
     superficially suited to the job do not cope well with
     non-text data. Hence the recommendation that
     out-of-band communication be used instead when
     possible.

     NOTE: For UUCP transmission, where a batch is typically
     transmitted by invoking the remote command "rnews" with
     the batch as its input stream, a plausible out-of-band
     method for indicating a compression type would be to
     give a compression keyword in an option to "rnews",
     perhaps in the form:

          rnews -d decompressor

     where "decompressor" is the name of a decompression
     program (e.g. "uncompress" for a batch compressed with
     "compress" or "gunzip" for a batch compressed with
     "gzip"). How this decompression program is located and
     invoked by the receiving relayer is
     implementation-specific.

     NOTE: See the notes in section 8.1 on the
     inadvisability of feeding batches directly to command
     interpreters.

     NOTE: There is exactly one blank between "#!" and the
     compression keyword, and the EOL immediately follows
     the keyword.

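Separating the keyword line from the compressed data is the
awkward step mentioned above, and it is straightforward when the
input is handled as octets rather than text. In this sketch the
function name is hypothetical and EOL is assumed to be a single
LF octet. Decompression itself is not shown; the payload would be
handed to whatever decompressor the keyword implies.

```python
from typing import Tuple


def split_encoded_batch(data: bytes) -> Tuple[str, bytes]:
    """Return (compression keyword, compressed batch) from an ec-batch."""
    header, sep, payload = data.partition(b"\n")  # split at the first EOL only
    if not sep or not header.startswith(b"#! "):
        raise ValueError("not an encapsulated compressed batch")
    keyword = header[3:]
    # Exactly one blank after "#!", and the EOL immediately follows the
    # keyword, so the keyword itself contains no blank.
    if not keyword or b" " in keyword:
        raise ValueError("malformed compression keyword line")
    return keyword.decode("ascii"), payload
```

Because the keyword may not contain a blank, an ordinary batch
header such as "#! rnews 123" is rejected here rather than being
mistaken for a compression keyword.
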
8.3. News Within Mail

It is often desirable to transmit news as mail, either for
the convenience of a human recipient or because that is the
only type of transmission available on a restrictive
communication path.

Given the similarity between the news format and the MAIL
format, it is superficially attractive to just send the news
article as a mail message. This is typically a mistake:
mail-handling software often feels free to manipulate
various headers in undesirable ways (in some cases, such as
Sender, such manipulation is actually mandatory), and mail
transmission problems etc. MUST be reported to the
administrators responsible for the mail transmission rather
than to the article's author. In general, news sent as mail
should be encapsulated to separate the mail headers and the
news headers.

When the intended recipient is a human, any convenient form
of encapsulation may be used. Recommended practice is to
use MIME encapsulation with a content type of
"message/news", given that news articles have additional
semantics beyond what "message/rfc822" implies.

     NOTE: "message/news" was registered as a standard
     subtype by IANA 22 June 1993.

When mail is being used as a transmission path between two
relayers, however, a standard method is desirable.
Currently the standard method is to send the mail to an
address whose local part is "rnews", with whatever mail
headers are necessary for successful transmission. The news
article (including its headers) is sent as the body of the
mail message, with an "N" prepended to each line.

     NOTE: The "N" reduces the probability of an innocent
     line in a news article being taken as a magic command
     to mail software, and makes it easy for receiving
     software to strip off any lines added by mail software
     (e.g. the trailing empty line added by some UUCP mail
     software).

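The "N" convention can be sketched in a few lines. The function
names are hypothetical, and lines are assumed to end in a single
newline character. On receipt, any line that does not begin with
"N" is taken to have been added by mail software and is dropped.

```python
def encapsulate(article: str) -> str:
    """Prefix every line of an article (headers included) with "N"."""
    lines = article.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # the final newline ends the last line; it is not a line
    return "".join("N" + line + "\n" for line in lines)


def extract(mail_body: str) -> str:
    """Recover the article, discarding lines that lack the "N" prefix."""
    return "".join(line[1:] + "\n"
                   for line in mail_body.split("\n")
                   if line.startswith("N"))
```

An empty line inside the article travels as a line holding just
"N", so it survives even when mail software appends empty lines
of its own.
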
This method has its weaknesses. In particular, it assumes
that the mail transmission channel can transmit
nearly-arbitrary body text undamaged. When mail is being
used as a transmission path of last resort, however, the
mail system often has inconvenient preconceived notions
about the format of message bodies. Various ad-hoc encoding
schemes have been used to avoid such problems. The
recommended method is to send a news article or batch as the
body of a MIME mail message, using content type
"application/news-transmission" and MIME's "base64" encoding
(which is specifically designed to survive all known major
mail systems).

     NOTE: In the process, MIME conventions could be used to
     fragment and reassemble an article which is too large
     to be sent as a single mail message over a transmission
     path that restricts message length. In addition, the
     "conversions" parameter to the content type could be
     used to indicate what (if any) compression method has
     been used. And the Content-MD5 header [RFC 1544] can be
     used as a "checksum" to provide high confidence of
     detecting accidental damage to the contents.

     UNRESOLVED ISSUE: The "conversions" parameter no longer
     exists. What should be done about this, if anything?

     NOTE: It might look tempting to use a content type such
     as "message/X-netnews", but MIME bans non-trivial
     encodings of the entire body of messages with content
     type "message". The intent is to avoid obscuring nested
     structure underneath encodings. For inter-relayer news
     transmission, there is no nested structure of interest,
     and it is important that the entire article (including
     its headers, not just its body) be protected against
     the vagaries of intervening mail software. This
     situation appears to fit the MIME description of
     circumstances in which "application" is the proper
     content type.

     NOTE: "application/news-transmission", with a
     "conversions" parameter, was registered as a standard
     subtype by IANA 22 June 1993.

     UNRESOLVED ISSUE: The "conversions" parameter no longer
     exists in MIME. What should we do about this?

4847 |
|
|
|
4848 |
|
|
8.4. Partial Batches |
4849 |
|
|
|
4850 |
|
|
UNRESOLVED ISSUE: The existing batch conventions |
4851 |
|
|
assemble (potentially) many articles into one |
4852 |
|
|
batch.  Handling very large articles would be |
4853 |
|
|
substantially less troublesome if there were also a |
4854 |
|
|
fragmentation convention for splitting a large |
4855 |
|
|
article into several batches. Is this worth |
4856 |
|
|
defining at this time? |
4857 |
|
|
|
4858 |
|
|
|
4859 |
|
|
9. Propagation and Processing |
4860 |
|
|
|
4861 |
|
|
Most aspects of news propagation and processing are imple- |
4862 |
|
|
mentation-specific. The basic propagation algorithms, and |
4863 |
|
|
certain details of how they are implemented, nevertheless |
4864 |
|
|
need to be standard. |
4865 |
|
|
|
4866 |
|
|
There are two important principles that news implementors |
4867 |
|
|
(and administrators) need to keep in mind. The first is the |
4868 |
|
|
well-known Internet Robustness Principle: |
4869 |
|
|
|
4870 |
|
|
Be liberal in what you accept, and conservative in what you send. |
4871 |
|
|
|
4872 |
|
|
However, in the case of news there is an even more important |
4873 |
|
|
principle, derived from a much older code of practice, the |
4874 |
|
|
Hippocratic Oath (we will thus call this the Hippocratic |
4875 |
|
|
Principle): |
4876 |
|
|
|
4877 |
|
|
First, do no harm. |
4878 |
|
|
|
4891 |
|
|
It is VITAL to realize that decisions which might be merely |
4892 |
|
|
suboptimal in a smaller context can become devastating mis- |
4893 |
|
|
takes when amplified by the actions of thousands of hosts |
4894 |
|
|
within a few hours. |
4895 |
|
|
|
4896 |
|
|
|
4897 |
|
|
9.1. Relayer General Issues |
4898 |
|
|
|
4899 |
|
|
Relayers MUST not alter the content of articles unnecessar- |
4900 |
|
|
ily. Well-intentioned attempts to "improve" headers, in |
4901 |
|
|
particular, typically do more harm than good. It is neces- |
4902 |
|
|
sary for a relayer to prepend its own name to the Path con- |
4903 |
|
|
tent (see section 5.6) and permissible for it to rewrite or |
4904 |
|
|
delete the Xref header (see section 6.12). Relayers MAY |
4905 |
|
|
delete the thoroughly-obsolete headers described in appendix |
4906 |
|
|
A.3, although this behavior no longer seems useful enough to |
4907 |
|
|
encourage. Other alterations SHOULD be avoided at all |
4908 |
|
|
costs, as per the Hippocratic Principle. |
4909 |
|
|
|
4910 |
|
|
NOTE: As discussed in section 2.3, tidying up the |
4911 |
|
|
headers of a user-prepared article is the job of |
4912 |
|
|
the posting agent, not the relayer. The relayer's |
4913 |
|
|
purpose is to move already-compliant articles |
4914 |
|
|
around efficiently without damaging them. Note |
4915 |
|
|
that in existing implementations, specific pro- |
4916 |
|
|
grams may contain both posting-agent functions and |
4917 |
|
|
relayer functions. The distinction is that post- |
4918 |
|
|
ing-agent functions are invoked only on articles |
4919 |
|
|
posted by local posters, never on articles |
4920 |
|
|
received from other relayers. |
4921 |
|
|
|
4922 |
|
|
NOTE: A particular corollary of this rule is that |
4923 |
|
|
relayers should not add headers unless truly nec- |
4924 |
|
|
essary. In particular, this is not SMTP; do not |
4925 |
|
|
add Received headers. |
4926 |
|
|
|
4927 |
|
|
Relayers MUST not pass non-conforming articles on to other |
4928 |
|
|
relayers, except perhaps in a cooperating subnet that has |
4929 |
|
|
agreed to permit certain kinds of non-conforming behavior. |
4930 |
|
|
This is a direct consequence of the Internet Robustness |
4931 |
|
|
Principle. |
4932 |
|
|
|
4933 |
|
|
The two preceding paragraphs may appear to be in conflict. |
4934 |
|
|
What is to be done when a non-conforming article is |
4935 |
|
|
received? The Robustness Principle argues that it should be |
4936 |
|
|
accepted but must not be passed on to other relayers while |
4937 |
|
|
still non-conforming, and the Hippocratic Principle strongly |
4938 |
|
|
discourages attempts at repair. The conclusion that this |
4939 |
|
|
appears to lead to is correct: a non-conforming article MAY |
4940 |
|
|
be accepted for local filing and processing, or it MAY be |
4941 |
|
|
discarded entirely, but it MUST not be passed on to other |
4942 |
|
|
relayers. |
4943 |
|
|
|
4957 |
|
|
A relayer MUST not respond to the arrival of an article by |
4958 |
|
|
sending mail to any destination, other than a local adminis- |
4959 |
|
|
trator, except by explicit prearrangement with the recipi- |
4960 |
|
|
ent. Neither posting an article (other than certain types |
4961 |
|
|
of control message, see section 7.5) nor being the moderator |
4962 |
|
|
of a moderated newsgroup constitutes such prearrangement. |
4963 |
|
|
UNDER NO CIRCUMSTANCES WHATSOEVER may a relayer attempt to |
4964 |
|
|
send mail to either an article's originator or a moderator. |
4965 |
|
|
|
4966 |
|
|
NOTE: Reporting apparent errors in message compo- |
4967 |
|
|
sition is the job of a posting agent, not a |
4968 |
|
|
relayer. The same is true of mailing moderated- |
4969 |
|
|
newsgroup postings to moderators. In networks of |
4970 |
|
|
thousands of cooperating relayers, it is simply |
4971 |
|
|
unacceptable for there to be any circumstance |
4972 |
|
|
whatsoever that causes any significant fraction of |
4973 |
|
|
them to simultaneously send mail to the same des- |
4974 |
|
|
tination. (Some control messages are exceptions, |
4975 |
|
|
although perhaps ill-advised ones.) What might, |
4976 |
|
|
in a smaller network, be a useful notification or |
4977 |
|
|
forwarding becomes a deluge of near-identical mes- |
4978 |
|
|
sages that can bring mail software to its knees |
4979 |
|
|
and severely inconvenience recipients. Modera- |
4980 |
|
|
tors, in particular, historically have suffered |
4981 |
|
|
grievously from this. |
4982 |
|
|
|
4983 |
|
|
Notification of problems in incoming articles MAY go to |
4984 |
|
|
local administrators, or at most (by prearrangement!) to |
4985 |
|
|
the administrators of the neighboring relayer(s) that passed |
4986 |
|
|
on the problematic articles. |
4987 |
|
|
|
4988 |
|
|
NOTE: It would be desirable to notify the author |
4989 |
|
|
that his posting is not propagating as he expects. |
4990 |
|
|
However, there is no known method for doing this |
4991 |
|
|
that will scale up gracefully. (In particular, |
4992 |
|
|
"notify only if within N relayers of the origina- |
4993 |
|
|
tor" falls down in the presence of commercial news |
4994 |
|
|
services like UUNET: there may be hundreds or |
4995 |
|
|
thousands of relayers within a couple of hops of |
4996 |
|
|
the originator.) The best that can be done right |
4997 |
|
|
now is to notify neighbors, in hopes that the word |
4998 |
|
|
will eventually propagate up the line, or organize |
4999 |
|
|
regional monitoring at major hubs. |
5000 |
|
|
|
5001 |
|
|
If it is necessary to alter an article, e.g. translate it to |
5002 |
|
|
another character set or alter its EOL representation, |
5003 |
|
|
strenuous efforts should be made to ensure that such trans- |
5004 |
|
|
formations are reversible, and that relayers or other soft- |
5005 |
|
|
ware that might wish to reverse them know exactly how to do |
5006 |
|
|
so. |
5007 |
|
|
|
5008 |
|
|
NOTE: For example, a cooperating subnet that |
5009 |
|
|
exchanges articles using a non-ASCII character set |
5010 |
|
|
like EBCDIC should define a standard, reversible |
5023 |
|
|
ASCII-EBCDIC mapping and take pains to see that it |
5024 |
|
|
is used at all points where the subnet meets the |
5025 |
|
|
outside. If the only reason for using EBCDIC is |
5026 |
|
|
that the readers typically employ EBCDIC devices, |
5027 |
|
|
it would be more robust to employ ASCII as the |
5028 |
|
|
interchange format and do the transformation in |
5029 |
|
|
the reading and posting agents. |
5030 |
|
|
|
5031 |
|
|
|
5032 |
|
|
9.2. Article Acceptance And Propagation |
5033 |
|
|
|
5034 |
|
|
When a relayer first receives an article, it must decide |
5035 |
|
|
whether to accept it. (This applies regardless of whether |
5036 |
|
|
the article arrived by itself or as part of a batch, and in |
5037 |
|
|
principle regardless of whether it originated as a local |
5038 |
|
|
posting or as traffic from another relayer.) In a cooperat- |
5039 |
|
|
ing subnet with well-controlled propagation paths, some of |
5040 |
|
|
the tests specified here MAY be delegated to centrally- |
5041 |
|
|
located relayers; that is, relayers that can receive news |
5042 |
|
|
ONLY via one of the central relayers might simplify accep- |
5043 |
|
|
tance testing based on the assumption that incoming traffic |
5044 |
|
|
has already passed the full set of tests at a central |
5045 |
|
|
relayer. |
5046 |
|
|
|
5047 |
|
|
The wording that follows is based on a model in which arti- |
5048 |
|
|
cles arrive on a relayer's host before acceptance tests are |
5049 |
|
|
done. However, depending on the degree of integration of |
5050 |
|
|
the transport mechanisms and the relayer, some or all of |
5051 |
|
|
these tests MAY be done before the article is actually |
5052 |
|
|
transmitted, so that articles which definitely will not be |
5053 |
|
|
accepted need not be transmitted at all. |
5054 |
|
|
|
5055 |
|
|
The wording that follows also specifies a particular order |
5056 |
|
|
for the acceptance tests. While this order is the obvious |
5057 |
|
|
one, the tests MAY be done in any order. |
5058 |
|
|
|
5059 |
|
|
First, the relayer MUST verify that the article is a legal |
5060 |
|
|
news article, with all mandatory headers present with legal |
5061 |
|
|
contents. |
5062 |
|
|
|
5063 |
|
|
NOTE: This check in principle is done by the first |
5064 |
|
|
relayer to see an article, so an article received |
5065 |
|
|
from another relayer should always be legal, but |
5066 |
|
|
there is enough old software still operational |
5067 |
|
|
that this cannot be taken for granted; see the |
5068 |
|
|
discussion of the Internet Robustness Principle in |
5069 |
|
|
section 9.1. |
5070 |
|
|
|
5071 |
|
|
Second, the relayer MUST determine whether it has already |
5072 |
|
|
seen this article (identified by its message ID). This is |
5073 |
|
|
normally done by retaining a history of all article message |
5074 |
|
|
IDs seen in the last N days, where the value of N is decided |
5075 |
|
|
by the relayer's administrator but SHOULD be at least 7. |
5076 |
|
|
Since N cannot practically be infinite, articles whose Date |
5089 |
|
|
content indicates that they are older than N days are |
5090 |
|
|
declared "stale" and are deemed to have been seen already. |
5091 |
|
|
|
5092 |
|
|
NOTE: This check is important because news propa- |
5093 |
|
|
gation topology is typically redundant, often |
5094 |
|
|
highly so, and it is not at all uncommon for a |
5095 |
|
|
relayer to receive the same article from several |
5096 |
|
|
neighbors. The history of already-seen message |
5097 |
|
|
IDs can get quite large, hence the desire to limit |
5098 |
|
|
its length... but it is important that it be long |
5099 |
|
|
enough that slowly-propagating articles are not |
5100 |
|
|
classed as stale. News propagation within the |
5101 |
|
|
Internet is normally very rapid, but when UUCP |
5102 |
|
|
links are involved, end-to-end delays of several |
5103 |
|
|
days are not rare, so a week is not a particularly |
5104 |
|
|
generous minimum. |
5105 |
|
|
|
5106 |
|
|
NOTE: Despite generally more rapid propagation in |
5107 |
|
|
recent times, it is still not unheard-of for some |
5108 |
|
|
propagation paths to be very slow. This can |
5109 |
|
|
introduce the possibility of old articles arriving |
5110 |
|
|
again after they are gone from the history. Hence |
5111 |
|
|
the "stale" rule. |
5112 |
|
|
|
5113 |
|
|
Third, the relayer MUST determine whether any of the arti- |
5114 |
|
|
cle's newsgroups are "subscribed to" by the host, i.e. fit a |
5115 |
|
|
description of what hierarchies or newsgroups the site wants |
5116 |
|
|
to receive. |
5117 |
|
|
|
5118 |
|
|
NOTE: This check is significant because informa- |
5119 |
|
|
tion on what newsgroups a relayer wishes to |
5120 |
|
|
receive is often stored at its neighbors, who may |
5121 |
|
|
not have up-to-date information or may simplify |
5122 |
|
|
the rules for implementation reasons. As a hedge |
5123 |
|
|
against the possibility of missed or delayed new- |
5124 |
|
|
group control messages, relayers may wish to |
5125 |
|
|
observe a notion of a newsgroup subscription that |
5126 |
|
|
is independent of the list of newsgroups actually |
5127 |
|
|
known to the relayer. This would permit reception |
5128 |
|
|
and relaying of articles in newsgroups that the |
5129 |
|
|
relayer is not (yet) aware of, subject to more |
5130 |
|
|
general criteria indicating that they are likely |
5131 |
|
|
to be of interest. |
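The three acceptance tests described above might be sketched as follows. This is a minimal illustration, not a definitive implementation: the header dictionary, the in-memory history, the shell-style subscription patterns, and the function names are all assumptions of the sketch, and validation of header contents is reduced to a presence check on the mandatory headers of section 5.

```python
from datetime import datetime, timedelta
from fnmatch import fnmatchcase

MANDATORY = ("date", "from", "message-id", "subject", "newsgroups", "path")
HISTORY_DAYS = 7  # "N" in the text; chosen by the administrator, at least 7


def accept(headers: dict, article_date: datetime, history: dict,
           subscriptions: list, now: datetime):
    """Apply the three acceptance tests; return (accepted, reason).

    headers: lower-cased header names mapped to their contents.
    article_date: the already-parsed Date content.
    history: message ID -> time first seen (updated on acceptance).
    subscriptions: shell-style newsgroup patterns (an assumed syntax).
    """
    # First: mandatory headers present.  Full validation of their
    # contents is omitted from this sketch.
    missing = [name for name in MANDATORY if not headers.get(name)]
    if missing:
        return False, "illegal: missing " + ", ".join(missing)

    # Second: already seen, or too old for the history to vouch for.
    if headers["message-id"] in history:
        return False, "duplicate"
    if now - article_date > timedelta(days=HISTORY_DAYS):
        return False, "stale"

    # Third: at least one of the article's newsgroups is subscribed to.
    groups = [g.strip() for g in headers["newsgroups"].split(",")]
    if not any(fnmatchcase(g, p) for g in groups for p in subscriptions):
        return False, "unsubscribed"

    history[headers["message-id"]] = now
    return True, "accepted"


def expire_history(history: dict, now: datetime) -> None:
    """Forget message IDs first seen more than HISTORY_DAYS ago."""
    cutoff = now - timedelta(days=HISTORY_DAYS)
    for message_id in [m for m, seen in history.items() if seen < cutoff]:
        del history[message_id]
```

An article whose message ID has been expired from the history is necessarily older than N days by the time it could reappear, so the stale test catches it.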
5132 |
|
|
|
5133 |
|
|
Once an article has been accepted, it may be passed on to |
5134 |
|
|
other relayers. The fundamental news propagation rule is a |
5135 |
|
|
flooding algorithm: on receiving and accepting an article, |
5136 |
|
|
send it to all neighboring relayers not already in its path |
5137 |
|
|
list that are sent its newsgroup(s) and distribution(s). |
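The flooding rule might be sketched as follows. The "!" path delimiter, the shape of the feeds table, and the treatment of an article with no distribution as unrestricted are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def flood_targets(path, groups, distributions, feeds):
    """Return the names of the neighbors an accepted article goes to.

    path: the Path content, e.g. "b.example!a.example!user".
    feeds: neighbor name -> (newsgroup patterns, set of distributions).
    """
    already_visited = set(path.split("!"))
    targets = []
    for name, (patterns, wanted) in feeds.items():
        if name in already_visited:
            continue  # never send an article back along its own path
        if not any(fnmatchcase(g, p) for g in groups for p in patterns):
            continue  # neighbor is not sent any of these newsgroups
        # An article with no distribution is assumed to be unrestricted.
        if distributions and not any(d in wanted for d in distributions):
            continue
        targets.append(name)
    return targets


def outgoing_path(own_name, path):
    """Prepend this relayer's name to the path list before passing on."""
    return own_name + "!" + path
```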
5138 |
|
|
|
5139 |
|
|
NOTE: The path list's role in loop prevention may |
5140 |
|
|
appear relatively unimportant, given that looping |
5141 |
|
|
articles would typically be rejected as duplicates |
5142 |
|
|
anyway. However, the path list's role in |
5155 |
|
|
preventing superfluous transmissions is not triv- |
5156 |
|
|
ial. In particular, the path list is the only |
5157 |
|
|
thing that prevents relayer X, on receiving an |
5158 |
|
|
article from relayer Y, from sending it back to Y |
5159 |
|
|
again. (Indeed, the usual symptom of confusion |
5160 |
|
|
about relayer names is that incoming news loops |
5161 |
|
|
back in this manner.) The looping articles would |
5162 |
|
|
be rejected as duplicates, but doubling the commu- |
5163 |
|
|
nications load on every news transmission path is |
5164 |
|
|
not to be taken lightly! |
5165 |
|
|
|
5166 |
|
|
In general, relayers SHOULD not make propagation decisions |
5167 |
|
|
by "anticipation": relayer X, noting that the article's path |
5168 |
|
|
list already contains relayer Y, decides not to send it to |
5169 |
|
|
relayer Z because X anticipates that Z will get the article |
5170 |
|
|
by a better path. If that is generally true, then why is |
5171 |
|
|
there a news feed from X to Z at all? In fact, the "better |
5172 |
|
|
path" may be running slowly or may be down. News propaga- |
5173 |
|
|
tion is very robust precisely because some redundant trans- |
5174 |
|
|
mission is done "just in case". If it is imperative to |
5175 |
|
|
limit unnecessary traffic on a path, use of NNTP [rrr] or |
5176 |
|
|
ihave/sendme (see section 7.2) to pass articles only when |
5177 |
|
|
necessary is better than arbitrary decisions not to pass |
5178 |
|
|
articles at all. |
5179 |
|
|
|
5180 |
|
|
Anticipation is occasionally justified in special cases. |
5181 |
|
|
Such cases should involve both (1) a cooperating subnet |
5182 |
|
|
whose propagation paths are well-understood and well- |
5183 |
|
|
monitored, with failures and slowdowns noticed and dealt |
5184 |
|
|
with promptly, and (2) a persistent pattern of heavy unnec- |
5185 |
|
|
essary traffic on a path that is either slow or costly. In |
5186 |
|
|
addition, there should be some reason why neither NNTP nor |
5187 |
|
|
ihave/sendme is suitable as a solution to the problem. |
5188 |
|
|
|
5189 |
|
|
|
5190 |
|
|
9.3. Administrator Contact |
5191 |
|
|
|
5192 |
|
|
It is desirable to have a standardized contact address for a |
5193 |
|
|
relayer's administrators, in the spirit of the "postmaster" |
5194 |
|
|
address for mail administrators. Mail addressed to "news- |
5195 |
|
|
master" on a relayer's host MUST go to the administrator(s) |
5196 |
|
|
of that relayer. Mail addressed to "usenet" on the |
5197 |
|
|
relayer's host SHOULD be handled likewise. Mail addressed |
5198 |
|
|
to either address on other hosts using the same news |
5199 |
|
|
database SHOULD be handled likewise. |
5200 |
|
|
|
5201 |
|
|
NOTE: These addresses are case-sensitive, although |
5202 |
|
|
it would be desirable for sequences equivalent to |
5203 |
|
|
them using case-insensitive comparison to be han- |
5204 |
|
|
dled likewise. While "newsmaster" seems the pre- |
5205 |
|
|
ferred network-independent address, by analogy to |
5206 |
|
|
"postmaster", there is an existing practice of |
5207 |
|
|
using "usenet" for this purpose, and so "usenet" |
5208 |
|
|
should be supported if at all possible (especially |
5221 |
|
|
on hosts belonging to Usenet!). The address |
5222 |
|
|
"news" is also sometimes used for purposes like |
5223 |
|
|
this, but less consistently. |
5224 |
|
|
|
5225 |
|
|
|
5226 |
|
|
10. Gatewaying |
5227 |
|
|
|
5228 |
|
|
Gatewaying of traffic between news networks using this Draft |
5229 |
|
|
and those using other exchange mechanisms can be useful, but |
5230 |
|
|
must be done cautiously. Gateway administrators are taking |
5231 |
|
|
on significant responsibilities, and must recognize that the |
5232 |
|
|
consequences of error can be quite serious. |
5233 |
|
|
|
5234 |
|
|
|
5235 |
|
|
10.1. General Gatewaying Issues |
5236 |
|
|
|
5237 |
|
|
This section will primarily address the problems of gateway- |
5238 |
|
|
ing traffic INTO news networks. Little can be said about |
5239 |
|
|
the other direction without some specific knowledge of the |
5240 |
|
|
network(s) involved. However, the two issues are not |
5241 |
|
|
entirely independent: if a non-news network is gatewayed |
5242 |
|
|
into a news network at more than one point, traffic injected |
5243 |
|
|
into the non-news network by one gateway may appear at |
5244 |
|
|
another as a candidate for injection back into the news net- |
5245 |
|
|
work. |
5246 |
|
|
|
5247 |
|
|
This raises a more general principle, the single most impor- |
5248 |
|
|
tant issue for gatewaying: |
5249 |
|
|
|
5250 |
|
|
Above all, prevent loops. |
5251 |
|
|
|
5252 |
|
|
The normal loop prevention of news transmission is vitally |
5253 |
|
|
dependent on the Message-ID header. Any gateway which finds |
5254 |
|
|
it necessary to remove this header, alter it, or supersede |
5255 |
|
|
it (by moving it into the body), MUST take equally effective |
5256 |
|
|
precautions against looping. |
5257 |
|
|
|
5258 |
|
|
NOTE: There are few things more effective at turn- |
5259 |
|
|
ing news readers into a lynch mob than a malfunc- |
5260 |
|
|
tioning gateway, or pair of gateways, that takes |
5261 |
|
|
in news articles, mangles them just enough to pre- |
5262 |
|
|
vent news relayers from recognizing them as dupli- |
5263 |
|
|
cates, and regurgitates them back into the news |
5264 |
|
|
stream. This happens rather too often. |
5265 |
|
|
|
5266 |
|
|
Gateway implementors should realize that gateways have all |
5267 |
|
|
the responsibilities of relayers, plus the added complica- |
5268 |
|
|
tions introduced by transformations between different infor- |
5269 |
|
|
mation formats. Much of section 9's discussion of relayer |
5270 |
|
|
issues is relevant to gateways as well. In particular, |
5271 |
|
|
gateways SHOULD keep a history of recently-seen articles, as |
5272 |
|
|
described in section 9.2, and not assume that articles will |
5273 |
|
|
never reappear. This is particularly important for networks |
5274 |
|
|
that have their own concept analogous to message IDs: a |
5287 |
|
|
gateway should keep a history of traffic seen from BOTH |
5288 |
|
|
directions. |
5289 |
|
|
|
5290 |
|
|
If at all possible, articles entering the non-news network |
5291 |
|
|
SHOULD be marked in some way so that they will NOT be re- |
5292 |
|
|
gatewayed back into news. Multiple gateways obviously must |
5293 |
|
|
agree on the marking method used; if it is done by having |
5294 |
|
|
them know each others' names, name changes MUST be coordi- |
5295 |
|
|
nated with great care. If marking cannot be done, all |
5296 |
|
|
transformations MUST be reversible so that a re-gatewayed |
5297 |
|
|
article is identical to the original (except perhaps for a |
5298 |
|
|
longer Path header). |
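One way of combining marking with a history of traffic seen from both directions is sketched below. The marker header name "X-Gatewayed-By", the class, and the in-memory set are hypothetical; a real gateway would use whatever marking the non-news network permits and would keep its history on stable storage.

```python
MARK = "X-Gatewayed-By"  # hypothetical marker header, not defined by this Draft


class Gateway:
    """Loop guard for one gateway between news and a non-news network."""

    def __init__(self, name, peers=()):
        self.name = name
        # Names of every gateway serving the same non-news network.
        self.known_gateways = {name, *peers}
        self.seen = set()  # message IDs seen from EITHER direction

    def outbound(self, headers):
        """News -> non-news.  Return a marked copy, or None to drop."""
        message_id = headers["Message-ID"]
        if message_id in self.seen:
            return None
        self.seen.add(message_id)
        marked = dict(headers)
        marked[MARK] = self.name
        return marked

    def inbound(self, headers):
        """Non-news -> news.  Return headers to inject, or None to drop."""
        if headers.get(MARK) in self.known_gateways:
            return None  # came from news originally; do not re-gateway
        message_id = headers.get("Message-ID")
        if message_id is not None:
            if message_id in self.seen:
                return None  # marker was lost, but the history remembers
            self.seen.add(message_id)
        return dict(headers)
```

The history is the second line of defence: it still stops a loop when some intermediate host has stripped the marker.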
5299 |
|
|
|
5300 |
|
|
Gateways MUST not pass control messages (articles containing |
5301 |
|
|
Control, Also-Control, or Supersedes headers) without remov- |
5302 |
|
|
ing the headers that make them control messages, unless |
5303 |
|
|
there are compelling reasons to believe that they are rele- |
5304 |
|
|
vant to both sides and that conventions are compatible. If |
5305 |
|
|
it is truly desirable to pass them unaltered, suitable pre- |
5306 |
|
|
cautions MUST be taken to ensure that there is NO POSSIBIL- |
5307 |
|
|
ITY of a looping control message. |
5308 |
|
|
|
5309 |
|
|
NOTE: The damage done by looping articles is mul- |
5310 |
|
|
tiplied a thousandfold if one of the affected |
5311 |
|
|
articles is something like a sendsys message (see |
5312 |
|
|
section 7.3) that requests multiple automatic |
5313 |
|
|
replies. Most gateways simply should not pass |
5314 |
|
|
control messages at all. If some unusual reason |
5315 |
|
|
dictates doing so, gateway implementors and admin- |
5316 |
|
|
istrators are urged to consider bulletproof rate- |
5317 |
|
|
limiting measures for the more destructive ones |
5318 |
|
|
like sendsys, e.g. passing only one per hour no |
5319 |
|
|
matter how many are offered. |
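The two precautions above, removing the headers that make an article a control message and rate-limiting any control messages that are passed, might be sketched as follows. The names are illustrative, and the one-hour default merely follows the example in the preceding NOTE.

```python
CONTROL_HEADERS = {"control", "also-control", "supersedes"}


def strip_control_headers(headers):
    """Remove the headers that make an article a control message.

    headers: list of (name, content) pairs; all other headers are kept.
    """
    return [(n, c) for n, c in headers if n.lower() not in CONTROL_HEADERS]


class ControlRateLimiter:
    """Pass at most one control message of each kind per interval."""

    def __init__(self, interval_seconds=3600.0):
        self.interval = interval_seconds
        self._last_passed = {}  # command name -> time last passed

    def allow(self, command, now):
        """Return True if a message of this kind may be passed at time now."""
        last = self._last_passed.get(command)
        if last is not None and now - last < self.interval:
            return False  # over the limit: drop, however many are offered
        self._last_passed[command] = now
        return True
```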
5320 |
|
|
|
5321 |
|
|
Gateways, like relayers, SHOULD make determined efforts to |
5322 |
|
|
avoid mangling articles unnecessarily. In the case of gate- |
5323 |
|
|
ways, some transformations may be inevitable, but keeping |
5324 |
|
|
them to a minimum and ensuring that they are reversible is |
5325 |
|
|
still highly desirable. |
5326 |
|
|
|
5327 |
|
|
Gateways MUST avoid destroying information. In particular, |
5328 |
|
|
the restrictions of section 4.2.2 are best taken with a |
5329 |
|
|
grain of salt in the context of gateways. Information that |
5330 |
|
|
does not translate directly into news headers SHOULD be |
5331 |
|
|
retained, perhaps in "X-" headers, both because it may be of |
5332 |
|
|
interest to sophisticated readers and because it may be cru- |
5333 |
|
|
cial to tracing propagation problems. |
5334 |
|
|
|
5335 |
|
|
Gateway implementors should take particular note of the dis- |
5336 |
|
|
cussion of mailed replies, or more precisely the ban on |
5337 |
|
|
same, in section 9.1. Gateway problems MUST be reported to |
5338 |
|
|
the local administration, not to the innocent originator of |
5339 |
|
|
traffic. "Gateway problems" here includes all forms of |
5340 |
|
|
propagation anomaly on the non-news side of the gateway, |
5353 |
|
|
e.g. unreachable addresses on a mailing list. Note that |
5354 |
|
|
this requires consideration of possible misbehavior of |
5355 |
|
|
"downstream" hosts, not just the gateway host. |
5356 |
|
|
|
5357 |
|
|
|
5358 |
|
|
10.2. Header Synthesis |
5359 |
|
|
|
5360 |
|
|
News articles prepared by gateways MUST be legal news arti- |
5361 |
|
|
cles. In particular, they MUST include all of the mandatory |
5362 |
|
|
headers (see section 5) and MUST fully conform to the |
5363 |
|
|
restrictions on said headers. This often requires that a |
5364 |
|
|
gateway function not only as a relayer, but also partly as a |
5365 |
|
|
posting agent, aiding in the synthesis of a conforming arti- |
5366 |
|
|
cle from non-conforming input. |
5367 |
|
|
|
5368 |
|
|
NOTE: The full-conformance requirement needs par- |
5369 |
|
|
ticularly careful attention when gatewaying mail- |
5370 |
|
|
ing lists to news, because a number of constructs |
5371 |
|
|
that are legal in MAIL headers are NOT permissible |
5372 |
|
|
in news headers. (Note also that not all mail |
5373 |
|
|
traffic fully conforms to even the MAIL specifica- |
5374 |
|
|
tion.) The rest of this section will be phrased |
5375 |
|
|
in terms of mail-to-news gatewaying, but most of |
5376 |
|
|
it is more generally applicable. |
5377 |
|
|
|
5378 |
|
|
The mandatory headers generally present few problems. |
5379 |
|
|
|
5380 |
|
|
If no date information is available, the gateway should sup- |
5381 |
|
|
ply a Date header with the gateway's current date. If only |
5382 |
|
|
partial information is available (e.g. date but not time), |
5383 |
|
|
this should be fleshed out to a full Date header by adding |
5384 |
|
|
default values, not by mixing in parts of the gateway's cur- |
5385 |
|
|
rent date. (Defaults should be chosen so that fleshed-out |
5386 |
|
|
dates will not be in the future!) It may be necessary to |
5387 |
|
|
map timezone information to the restricted forms permitted |
5388 |
|
|
in the news Date header. See section 5.1. |
5389 |
|
|
|
5390 |
|
|
NOTE: The prohibition of mixing dates is on the |
5391 |
|
|
theory that it is better to admit ignorance than |
5392 |
|
|
to lie. |
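The rule for fleshing out partial date information might be sketched as follows. Midnight as the default time and UTC as the default zone are assumptions of this sketch, and the output format is that of the Python standard library, which may still need mapping to the forms permitted by section 5.1.

```python
from datetime import date, datetime, time, timezone
from email.utils import format_datetime
from typing import Optional


def synthesize_date(known_date: Optional[date],
                    known_time: Optional[time],
                    now: datetime) -> str:
    """Build Date content from whatever date information survives.

    known_date: the day, or None if nothing at all is known.
    known_time: a naive time of day, or None if only the day is known.
    now: the gateway's current time (timezone-aware), used only when
         no date information is available.
    """
    if known_date is None:
        stamp = now
    else:
        # Default the missing time to midnight rather than borrowing the
        # gateway's clock; midnight is the earliest time on that day.
        # Treating the result as UTC is a further assumption.
        clock = known_time if known_time is not None else time(0, 0, 0)
        stamp = datetime.combine(known_date, clock, tzinfo=timezone.utc)
    return format_datetime(stamp)
```

A deployment whose traffic originates well east of UTC might prefer a different default zone, since midnight UTC can then fall slightly after the true time of writing.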
5393 |
|
|
|
5394 |
|
|
If the author's address as supplied in the original message |
5395 |
|
|
is not suitable for inclusion in a From header, the gateway |
5396 |
|
|
MUST transform it so it is, e.g. by use of the "% hack" and |
5397 |
|
|
the domain address of the gateway. The desire to preserve |
5398 |
|
|
information is NOT an excuse for violating the rules. If |
5399 |
|
|
the transformation is drastic enough that there is reason to |
5400 |
|
|
suspect loss of information, it may be desirable to include |
5401 |
|
|
the original form in an X- header, but the From header's |
5402 |
|
|
contents MUST be as specified in section 5.2. |
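The "% hack" transformation might be sketched as follows. The header name "X-Original-From" and both function names are hypothetical, and the sketch does not check the result against the full syntax of section 5.2.

```python
def percent_hack(original, gateway_domain):
    """Route a foreign address through the gateway using the "% hack"."""
    # Every "@" in the original becomes "%", and the gateway's own
    # domain becomes the only domain visible to news software.
    return original.replace("@", "%") + "@" + gateway_domain


def from_headers(original, gateway_domain):
    """Return the From content plus a copy of the original form."""
    return [
        ("From", percent_hack(original, gateway_domain)),
        ("X-Original-From", original),  # hypothetical header name
    ]
```

For example, "user@node.bitnet" gatewayed at "gw.example.com" becomes "user%node.bitnet@gw.example.com".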
5403 |
|
|
|
5404 |
|
|
If the message contains a Message-ID header, the contents |
5405 |
|
|
should be dealt with as discussed in section 10.3. If there |
5406 |
|
|
is no message ID present, it will be necessary to synthesize |
5419 |
|
|
one, following the news rules (see section 5.3). |
5420 |
|
|
|
5421 |
|
|
Every effort should be made to produce a meaningful Subject |
5422 |
|
|
header; see section 5.4. Many news readers select articles |
5423 |
|
|
to read based on Subject headers, and inserting a place- |
5424 |
|
|
holder like "<no subject available>" is considered highly |
5425 |
|
|
objectionable. Even synthesizing a Subject header by pick- |
5426 |
|
|
ing out the first half-dozen nouns and adjectives in the |
5427 |
|
|
article body is better than using a placeholder, since it |
5428 |
|
|
offers SOME indication of what the article might contain. |
5429 |
|
|
|
5430 |
|
|
The contents of the Newsgroups header (section 5.5) are usu- |
5431 |
|
|
ally predetermined by gateway configuration, but a gateway |
5432 |
|
|
to a network that has its own concept of newsgroups or dis- |
5433 |
|
|
cussions might have to make transformations. Such transfor- |
5434 |
|
|
mations should be reversible; otherwise confusion is likely |
5435 |
|
|
on both sides. |
5436 |
|
|
|
5437 |
|
|
It will rarely be possible for gateways to provide a Path |
5438 |
|
|
header that is both an accurate history of the relayers the |
5439 |
|
|
article has passed through AS NEWS and a usable reply |
5440 |
|
|
address. The history function MUST be given priority; see |
5441 |
|
|
the discussion in section 5.6. It will usually be necessary |
5442 |
|
|
for a gateway to supply an empty path list, abandoning the |
5443 |
|
|
reply function. |
5444 |
|
|
|
5445 |
|
|
It is desirable for gatewayed articles to convey as much |
5446 |
|
|
useful information as possible, e.g. by use of optional news |
5447 |
|
|
headers (see section 6) when the relevant information is |
5448 |
|
|
available. Synthesis of optional headers can generally fol- |
5449 |
|
|
low similar rules. |
5450 |
|
|
|
5451 |
|
|
Software synthesizing References headers should note the |
5452 |
|
|
discussion in section 6.5 concerning the incompatibility |
5453 |
|
|
between MAIL and news. Also of interest is the possibility |
5454 |
|
|
of incorporating information from In-Reply-To headers and |
5455 |
|
|
from attribution lines in the body; an incomplete or some- |
5456 |
|
|
what conjectural References header is much better than none |
5457 |
|
|
at all, and reading agents already have to cope with incom- |
5458 |
|
|
plete or slightly erroneous References lists. |
5459 |
|
|
|
5460 |
|
|
|
5461 |
|
|
10.3. Message ID Mapping |
5462 |
|
|
|
5463 |
|
|
This section, like the previous one, is phrased in terms of |
5464 |
|
|
mail being gatewayed into news, but most of the discussion |
5465 |
|
|
should be more generally applicable. |
5466 |
|
|
|
5467 |
|
|
A particularly sticky problem of gatewaying mail into news |
5468 |
|
|
is supplying legal news message IDs. Note, in particular, |
5469 |
|
|
that not all MAIL message IDs are legal in news; the news |
5470 |
|
|
syntax (specified in section 5.3, with related material in |
5471 |
|
|
5.2) is more restrictive. Generating a fully-conforming |
5472 |
|
|
news article from a mail message may require transforming |
5485 |
|
|
the message ID somewhat. |
5486 |
|
|
|
5487 |
|
|
Generation and transformation of message IDs assumes partic- |
5488 |
|
|
ular importance if a given mailing list (or whatever) is |
5489 |
|
|
being handled by more than one gateway. It is highly desir- |
5490 |
|
|
able that the same article contents not appear twice in the |
5491 |
|
|
same newsgroup, which requires that they receive the same |
5492 |
|
|
message ID from all gateways. Gateways SHOULD use the fol- |
5493 |
|
|
lowing algorithm (possibly modified by the later discussion |
5494 |
|
|
of gatewaying into more than one newsgroup) unless local |
5495 |
|
|
considerations dictate another: |
5496 |
|
|
|
5497 |
|
|
1. Separate message ID from surroundings, if necessary. |
5498 |
|
|
A plausible method for this is to start at the first |
5499 |
|
|
"<", end at the next ">", and reject the message if |
5500 |
|
|
no ">" is found or a second "<" is seen before the |
5501 |
|
|
">". Also reject the message if the message ID con- |
5502 |
|
|
tains no "@" or more than one "@", or if it contains |
5503 |
|
|
no ".". Also reject the message if the message ID |
5504 |
|
|
contains non-ASCII characters, ASCII control charac- |
5505 |
|
|
ters, or white space. |
5506 |
|
|
|
5507 |
|
|
NOTE: Any legitimate domain will include at |
5508 |
|
|
least one ".". RFC 822 section 6.2.2 forbids |
5509 |
|
|
white space in this context when passing mail |
5510 |
|
|
on to non-MAIL software. |
5511 |
|
|
|
5512 |
|
|
2. Delete the leading "<" and trailing ">". Separate |
5513 |
|
|
message ID into local part and domain at the "@". |
5514 |
|
|
|
5515 |
|
|
3. In both components, transliterate leading dots |
5516 |
|
|
(".", ASCII 46), trailing dots, and dots after the |
5517 |
|
|
first in sequences of two or more consecutive |
5518 |
|
|
dots, into underscores (ASCII 95). |
5519 |
|
|
|
5520 |
|
|
4. In both components, transliterate disallowed char- |
5521 |
|
|
acters other than dots (see the definition of |
5522 |
|
|
<unquoted-char> in section 5.2) to underscores |
5523 |
|
|
(ASCII 95). |
5524 |
|
|
|
5525 |
|
|
5. Form the message ID as |
5526 |
|
|
|
5527 |
|
|
"<" local-part "@" domain ">" |
5528 |
|
|
|
5529 |
|
|
|
5530 |
|
|
NOTE: This algorithm is approximately that of Rich |
5531 |
|
|
Salz's successful gatewaying package. |
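
The five steps above can be sketched in Python as follows.  This is an
illustrative sketch only, not the code of any existing gateway package.
The names map_message_id, extract_message_id, fix_component, and ALLOWED
are hypothetical, and the ALLOWED set is an assumption standing in for
the <unquoted-char> definition of section 5.2, which should be consulted
for the real rule.  Runs of dots are handled by one plausible reading of
step 3: the first dot of an interior run is kept and the rest become
underscores.

```python
import string

# ASSUMPTION: a conservative stand-in for <unquoted-char> (section 5.2).
ALLOWED = set(string.ascii_letters + string.digits + "#$%&'*+-/=?^_`{|}~.")


def extract_message_id(field):
    """Step 1: isolate "<...>" and apply the rejection tests (None = reject)."""
    start = field.find("<")
    if start < 0:
        return None
    end = field.find(">", start)
    if end < 0 or "<" in field[start + 1:end]:
        return None                     # no ">" or a second "<" before it
    msgid = field[start:end + 1]
    if msgid.count("@") != 1 or "." not in msgid:
        return None
    # Reject non-ASCII characters, ASCII control characters, and white space.
    if any(ord(c) <= 32 or ord(c) >= 127 for c in msgid):
        return None
    return msgid


def fix_component(part):
    """Steps 3 and 4, applied to the local part or to the domain."""
    chars = list(part)
    for i, c in enumerate(chars):
        if c == ".":
            # Leading dot, trailing dot, or a dot that follows another dot.
            if i == 0 or i == len(chars) - 1 or part[i - 1] == ".":
                chars[i] = "_"
        elif c not in ALLOWED:
            chars[i] = "_"              # any other disallowed character
    return "".join(chars)


def map_message_id(field):
    """Return a news message ID for a mail Message-ID field, or None."""
    msgid = extract_message_id(field)
    if msgid is None:
        return None
    local, domain = msgid[1:-1].split("@")          # step 2
    return "<" + fix_component(local) + "@" + fix_component(domain) + ">"  # step 5


print(map_message_id("Message-ID: <foo..bar(x)@host.example.com>"))
```

Because the mapping depends only on the incoming message ID, two
gateways applying the same rules to the same mail message arrive at the
same news message ID, which is the property the algorithm is meant to
provide.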
5532 |
|
|
|
5533 |
|
|
Despite the desire to keep message IDs consistent across |
5534 |
|
|
multiple gateways, there is also a more subtle issue that |
5535 |
|
|
can require a different approach. If the same articles are |
5536 |
|
|
being gatewayed into more than one newsgroup, and it is not |
5537 |
|
|
possible to arrange that all gateways gateway them to the |
5538 |
|
|
same cross-posted set of newsgroups, then the message IDs in |
5551 |
|
|
the different newsgroups MUST be DIFFERENT. |
5552 |
|
|
|
5553 |
|
|
NOTE: Otherwise, arrival of an article in one |
5554 |
|
|
newsgroup will prevent it from appearing in |
5555 |
|
|
another, and which newsgroup a particular article |
5556 |
|
|
appears in will be an accident of which direction |
5557 |
|
|
it arrives from first. It is very difficult to |
5558 |
|
|
maintain a coherent discussion when each partici- |
5559 |
|
|
pant sees a randomly-selected 50% of the traffic. |
5560 |
|
|
The fundamental problem here is that the basic |
5561 |
|
|
assumption behind message IDs is being violated: |
5562 |
|
|
the gateways are assigning the same message ID to |
5563 |
|
|
articles that differ in an important respect |
5564 |
|
|
(Newsgroups header). |
5565 |
|
|
|
5566 |
|
|
In such cases, it is suggested that the newsgroup name, or |
5567 |
|
|
an agreed-on abbreviation thereof, be prepended to the local |
5568 |
|
|
part of the message ID (with a separating ".") by the gate- |
5569 |
|
|
way. This will ensure that multiple gateways generate the |
5570 |
|
|
same message ID, while also ensuring that different news- |
5571 |
|
|
groups can be read independently. |
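
As a minimal sketch of the suggestion above, assuming the message ID has
already been made legal for news, the prefixing might look like this in
Python.  The function name prefix_with_newsgroup is hypothetical, and
any abbreviation scheme for newsgroup names is left to agreement among
the gateways concerned.

```python
def prefix_with_newsgroup(message_id, newsgroup):
    """Prepend the newsgroup name and a "." to the local part of message_id."""
    # Strip "<" and ">", then split at the "@" into local part and domain.
    local, domain = message_id[1:-1].split("@", 1)
    return "<" + newsgroup + "." + local + "@" + domain + ">"


print(prefix_with_newsgroup("<123@host.example.com>", "comp.lang.c"))
```

All gateways feeding a given newsgroup must use the same prefix for the
resulting message IDs to agree.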
5572 |
|
|
|
5573 |
|
|
NOTE: It is preferable to have the gateway(s) |
5574 |
|
|
cross-post the article, avoiding the issue alto- |
5575 |
|
|
gether, but this may not be feasible, especially |
5576 |
|
|
if one newsgroup is widespread and the other is |
5577 |
|
|
purely local. |
5578 |
|
|
|
5579 |
|
|
|
5580 |
|
|
10.4. Mail to and from News |
5581 |
|
|
|
5582 |
|
|
Gatewaying mail to news, and vice-versa, is the most obvious |
5583 |
|
|
form of news gatewaying. It is common to set up gateways |
5584 |
|
|
between news and mail rather too casually. |
5585 |
|
|
|
5586 |
|
|
It is hard to go very wrong in gatewaying news into a mail- |
5587 |
|
|
ing list, except for the non-trivial matter of making sure |
5588 |
|
|
that error reports go to the local administration rather |
5589 |
|
|
than to the authors of news articles. (This requires atten- |
5590 |
|
|
tion to the "envelope address" as well as to the message |
5591 |
|
|
headers.) Doing the reverse connection correctly is much |
5592 |
|
|
harder than it looks. |
5593 |
|
|
|
5594 |
|
|
NOTE: In particular, just feeding the mail message |
5595 |
|
|
to "inews -h" or the equivalent is NOT, repeat |
5596 |
|
|
NOT, adequate to gateway mail to news. Signifi- |
5597 |
|
|
cant gatewaying software is necessary to do it |
5598 |
|
|
right. Not all headers of mail messages conform |
5599 |
|
|
to even the MAIL specifications, never mind the |
5600 |
|
|
stricter rules for news. |
5601 |
|
|
|
5602 |
|
|
It is useful to distinguish between two different forms of |
5603 |
|
|
mail-to-news gatewaying: gatewaying a mailing list into a |
5604 |
|
|
newsgroup, and operating a "post-by-mail" service in which |
5617 |
|
|
individual articles can be posted to a newsgroup by mailing |
5618 |
|
|
them to a specific address. In the first case, the message |
5619 |
|
|
is already being "broadcast", and the situation can be |
5620 |
|
|
viewed as gatewaying one form of news into another. The |
5621 |
|
|
second case is closer to that of a moderator posting submis- |
5622 |
|
|
sions to a moderated newsgroup. |
5623 |
|
|
|
5624 |
|
|
In either case, the discussions in the preceding two sec- |
5625 |
|
|
tions are relevant, as is the Hippocratic Principle of sec- |
5626 |
|
|
tion 9. However, some additional considerations are spe- |
5627 |
|
|
cific to mail-to-news gatewaying. |
5628 |
|
|
|
5629 |
|
|
As mentioned in section 6, point-to-point headers like To |
5630 |
|
|
and Cc SHOULD NOT appear as such in news, although it is |
5631 |
|
|
suggested that they be transformed to "X-" headers, e.g. X- |
5632 |
|
|
To and X-Cc, to preserve their information content for pos- |
5633 |
|
|
sible use by readers or troubleshooters. The Received |
5634 |
|
|
header is entirely specific to MAIL and SHOULD be deleted |
5635 |
|
|
completely during gatewaying, except perhaps for the |
5636 |
|
|
Received header supplied by the gateway host itself. |
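
The header handling just described might be sketched as follows.  This
is an illustration under simplifying assumptions: headers are taken to
be already parsed into (name, value) pairs, the function name
gateway_headers is hypothetical, and every Received header is dropped,
whereas a real gateway might choose to keep the one added by its own
host.

```python
def gateway_headers(headers):
    """Return the mail headers as they might be passed on to news."""
    out = []
    for name, value in headers:
        lower = name.lower()
        if lower in ("to", "cc"):
            # Point-to-point headers survive only as "X-" headers.
            out.append(("X-" + name, value))
        elif lower == "received":
            # Specific to MAIL; dropped entirely in this sketch.
            continue
        else:
            out.append((name, value))
    return out


mail = [("To", "list@example.org"), ("Cc", "someone@example.com"),
        ("Received", "from relay.example.net"), ("Subject", "hello")]
print(gateway_headers(mail))
```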
5637 |
|
|
|
5638 |
|
|
The Sender header is a tricky case, one where mailing-list |
5639 |
|
|
and post-by-mail practice should differ. For gatewaying |
5640 |
|
|
mailing lists, the mailing-list host should be considered a |
5641 |
|
|
relayer, and the From and Sender headers supplied in its |
5642 |
|
|
transmissions left strictly untouched. For post-by-mail, as |
5643 |
|
|
for a moderator posting a mailed submission, the Sender |
5644 |
|
|
header should reflect the poster rather than the author. If |
5645 |
|
|
a post-by-mail gateway receives a message with its own |
5646 |
|
|
Sender header, it might wish to preserve the content in an |
5647 |
|
|
X-Sender header. |
5648 |
|
|
|
5649 |
|
|
It will generally be necessary to transform between mail's |
5650 |
|
|
In-Reply-To/References convention and news's References/See- |
5651 |
|
|
Also convention, to preserve correct semantics of cross ref- |
5652 |
|
|
erences. This also requires attention when going the other |
5653 |
|
|
way, from news to mail. See the discussion of the differ- |
5654 |
|
|
ence in section 6.5. |
5655 |
|
|
|
5656 |
|
|
|
5657 |
|
|
10.5. Gateway Administration |
5658 |
|
|
|
5659 |
|
|
Any news system will benefit from an attentive administra- |
5660 |
|
|
tor, preferably assisted by automated monitoring for anoma- |
5661 |
|
|
lies. This is particularly true of gateways. Gateway soft- |
5662 |
|
|
ware SHOULD be instrumented so that unusual occurrences, |
5663 |
|
|
such as sudden massive surges in traffic, are reported |
5664 |
|
|
promptly. It is desirable, in fact, to go further: gateway |
5665 |
|
|
software SHOULD endeavour to limit damage in the event that |
5666 |
|
|
the administrator does not respond promptly. |
5667 |
|
|
|
5668 |
|
|
NOTE: For example, software might limit the gate- |
5669 |
|
|
waying rate by queueing incoming traffic and emp- |
5670 |
|
|
tying the queue at a finite maximum rate (well |
5683 |
|
|
below the maximum that the host is capable of!) |
5684 |
|
|
which is set by the administrator and is not |
5685 |
|
|
raised automatically. |
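
One way to picture the queueing described in this note is the sketch
below.  The class name ThrottledQueue and its methods are hypothetical;
the sketch assumes that something outside it (a timer or a periodic job)
calls drain once per interval, and that the limit is a value chosen by
the administrator.

```python
from collections import deque


class ThrottledQueue:
    """Hold incoming articles and release them at a bounded rate."""

    def __init__(self, max_per_interval):
        # Set by the administrator; nothing in this class ever raises it.
        self.max_per_interval = max_per_interval
        self.pending = deque()

    def enqueue(self, article):
        self.pending.append(article)

    def drain(self):
        """Release at most max_per_interval articles, oldest first."""
        batch = []
        while self.pending and len(batch) < self.max_per_interval:
            batch.append(self.pending.popleft())
        return batch


queue = ThrottledQueue(max_per_interval=2)
for n in range(5):
    queue.enqueue("article-%d" % n)
print(queue.drain())    # the first two articles only
print(len(queue.pending))
```

A surge therefore lengthens the queue rather than the outgoing traffic,
and the length of the queue is itself a useful thing to monitor.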
5686 |
|
|
|
5687 |
|
|
Traffic gatewayed into a news network SHOULD include a suit- |
5688 |
|
|
able header, perhaps X-Gateway-Administrator, giving an |
5689 |
|
|
electronic address that can be used to report problems. |
5690 |
|
|
This SHOULD be an address that goes direct to a human, not |
5691 |
|
|
to a "routine administrative issues" mailbox that is exam- |
5692 |
|
|
ined only occasionally, since the point is to be able to |
5693 |
|
|
reach the administrator quickly in an emergency. Gateway |
5694 |
|
|
administrators SHOULD arrange substitutes to cover gateway |
5695 |
|
|
operation (with suitable redirection of mail) when they are |
5696 |
|
|
on vacation etc. |
5697 |
|
|
|
5698 |
|
|
|
5699 |
|
|
11. Security And Related Issues |
5700 |
|
|
|
5701 |
|
|
Although the interchange format itself raises no significant |
5702 |
|
|
security issues, the wider context does. |
5703 |
|
|
|
5704 |
|
|
|
5705 |
|
|
11.1. Leakage |
5706 |
|
|
|
5707 |
|
|
The most obvious form of security problem with news is |
5708 |
|
|
"leakage" of articles which are intended to have only |
5709 |
|
|
restricted circulation. The flooding algorithm is EXTREMELY |
5710 |
|
|
good at finding any path by which articles can leave a sub- |
5711 |
|
|
net with supposedly-restrictive boundaries. Substantial |
5712 |
|
|
administrative effort is required to ensure that local news- |
5713 |
|
|
groups remain local, unless connections to the outside world |
5714 |
|
|
are tightly restricted. |
5715 |
|
|
|
5716 |
|
|
A related problem is that the sendme control message can be |
5717 |
|
|
used to ask for any article by its message ID. The useful- |
5718 |
|
|
ness of this has declined as message-ID generation algo- |
5719 |
|
|
rithms have become less predictable, but it remains a poten- |
5720 |
|
|
tial problem for "secure" newsgroups. Hosts with such news- |
5721 |
|
|
groups may wish to disable the sendme control message |
5722 |
|
|
entirely. |
5723 |
|
|
|
5724 |
|
|
The sendsys, version, and whogets control messages also |
5725 |
|
|
allow "outsiders" to request information from "inside", |
5726 |
|
|
which may reveal details of internal topology (etc.) that |
5727 |
|
|
are considered confidential. (Note that at least limited |
5728 |
|
|
openness about such matters may be a condition of membership |
5729 |
|
|
in such networks, e.g. Usenet.) |
5730 |
|
|
|
5731 |
|
|
Organizations wishing to control these forms of leakage are |
5732 |
|
|
strongly advised to designate a small number of "official |
5733 |
|
|
gateway" hosts to handle all news exchange with the outside |
5734 |
|
|
world, so that a bounded amount of administrative effort is |
5735 |
|
|
needed to control propagation and eliminate problems. |
5736 |
|
|
Attempts to keep news out entirely, by refusing to support |
5749 |
|
|
an official gateway, typically result in large numbers of |
5750 |
|
|
unofficial partial gateways appearing over time. Such a |
5751 |
|
|
configuration is much more difficult to troubleshoot. |
5752 |
|
|
|
5753 |
|
|
A somewhat-related problem is the possibility of proprietary |
5754 |
|
|
material being disclosed unintentionally by a poster who |
5755 |
|
|
does not realize how far his words will propagate, either |
5756 |
|
|
from sheer misunderstanding or because of errors made (by |
5757 |
|
|
human or software) in followup preparation. There is little |
5758 |
|
|
that can be done about this except education. |
5759 |
|
|
|
5760 |
|
|
|
5761 |
|
|
11.2. Attacks |
5762 |
|
|
|
5763 |
|
|
Although the limitations of the medium restrict what can be |
5764 |
|
|
done to attack a host via news, some possibilities exist, |
5765 |
|
|
most of them problems news shares with mail. |
5766 |
|
|
|
5767 |
|
|
If reading agents are careless about transmitting non- |
5768 |
|
|
printable characters to output devices, malicious posters |
5769 |
|
|
may post articles containing control sequences ("letter- |
5770 |
|
|
bombs") meant to have various destructive effects on output |
5771 |
|
|
devices. Possible effects depend on the device, but they |
5772 |
|
|
can include hardware damage (e.g. by repeated writing of |
5773 |
|
|
values into configuration memories that can tolerate only a |
5774 |
|
|
limited number of write cycles) and security violation (e.g. |
5775 |
|
|
by reprogramming function keys potentially used by privi- |
5776 |
|
|
leged readers). |
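
A reading agent can guard against this by filtering text before it
reaches the output device.  The sketch below is one simple approach, not
a complete defence: it replaces control characters other than tab and
newline with a visible placeholder and leaves all other characters
alone.  The function name make_displayable and the choice of "?" as the
placeholder are assumptions made for illustration.

```python
def make_displayable(text, replacement="?"):
    """Replace control characters, other than tab and newline, in text."""
    safe = []
    for c in text:
        code = ord(c)
        # ASCII controls, DEL, and the 8-bit control range 128-159.
        is_control = code < 32 or code == 127 or 128 <= code <= 159
        if is_control and c not in "\t\n":
            safe.append(replacement)
        else:
            safe.append(c)
    return "".join(safe)


# The escape character that would start a terminal control sequence is
# replaced, so the sequence is shown as text and not acted on.
print(make_displayable("ok\x1b[2Jdone"))
```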
5777 |
|
|
|
5778 |
|
|
A more sophisticated variation on the letterbomb is inclu- |
5779 |
|
|
sion of "Trojan horses" in programs. Obviously, readers |
5780 |
|
|
must be cautious about using software found in news, but |
5781 |
|
|
more subtly, reading agents must also exercise care. MIME |
5782 |
|
|
messages can include material that is executable in some |
5783 |
|
|
sense, such as PostScript documents (which are programs!), |
5784 |
|
|
and letterbombs may be introduced into such material. |
5785 |
|
|
|
5786 |
|
|
Given the presence of finite resources and other software |
5787 |
|
|
limitations, some degree of system disruption can be |
5788 |
|
|
achieved by posting otherwise-innocent material in great |
5789 |
|
|
volume, either in single huge articles (see section 4.6) or |
5790 |
|
|
in a stream of modest-sized articles. (Some would say that |
5791 |
|
|
the steady growth of Usenet volume constitutes a subtle and |
5792 |
|
|
unintentional attack of the latter type; certainly it can |
5793 |
|
|
have disruptive effects if administrators are inattentive.) |
5794 |
|
|
Systems need some ability to cope with surges, because sin- |
5795 |
|
|
gle huge articles occur occasionally as the result of soft- |
5796 |
|
|
ware error, innocent misunderstanding, or deliberate malice, |
5797 |
|
|
and downtime at upstream hosts can cause droughts, followed |
5798 |
|
|
by floods, of legitimate articles. (There is also a certain |
5799 |
|
|
amount of normal variation; for example, Usenet traffic is |
5800 |
|
|
noticeably lighter on weekends and during Christmas holi- |
5801 |
|
|
days, and rises noticeably at the start of the school term |
5802 |
|
|
of North American universities.) However, a site that |
5815 |
|
|
normally receives little traffic may be quite vulnerable to |
5816 |
|
|
"swamping" attack if its software is insufficiently careful. |
5817 |
|
|
|
5818 |
|
|
In general, careless implementation may open doors that are |
5819 |
|
|
not intrinsic to news. In particular, implementation of |
5820 |
|
|
control messages (see sections 6.6 and 7) and unbatchers |
5821 |
|
|
(see sections 8.1 and 8.2) via a command interpreter requires |
5822 |
|
|
substantial precautions to ensure that only the intended |
5823 |
|
|
capabilities are available. Care must also be taken that |
5824 |
|
|
article-supplied text is not fed to programs that have |
5825 |
|
|
escapes to command interpreters. |
5826 |
|
|
|
5827 |
|
|
Finally, there is considerable potential for malice in the |
5828 |
|
|
sendsys, version, and whogets control messages. They are |
5829 |
|
|
not harmful to the hosts receiving them as news, but they |
5830 |
|
|
can be used to enlist those hosts (by the thousands) as |
5831 |
|
|
unwitting allies in a mail-swamping attack on a victim who |
5832 |
|
|
may not even receive news. The precautions discussed in |
5833 |
|
|
section 7.5 can reduce the potential for such attacks con- |
5834 |
|
|
siderably, but the hazard cannot be eliminated as long as |
5835 |
|
|
these control messages exist. |
5836 |
|
|
|
5837 |
|
|
|
5838 |
|
|
11.3. Anarchy |
5839 |
|
|
|
5840 |
|
|
The highly distributed nature of news propagation, and the |
5841 |
|
|
lack of adequate authentication protocols (especially for |
5842 |
|
|
use over the less-interactive transport mechanisms such as |
5843 |
|
|
UUCP), make article forgery relatively straightforward. It |
5844 |
|
|
may be possible to at least track a forgery to its source, |
5845 |
|
|
once it is recognized as such, but clever forgers can make |
5846 |
|
|
even that relatively difficult. The assumption that forg- |
5847 |
|
|
eries will be recognized as such is also not to be taken for |
5848 |
|
|
granted; readers are notoriously prone to blindly assuming |
5849 |
|
|
authenticity. If a forged article's initial path list |
5850 |
|
|
includes the relayer name of the supposed poster's host, the |
5851 |
|
|
article will never be sent to that host, and the alleged |
5852 |
|
|
author may learn about the forgery secondhand or not at all. |
5853 |
|
|
|
5854 |
|
|
A particularly noxious form of forgery is the forged "can- |
5855 |
|
|
cel" control message. Notably, it is relatively straight- |
5856 |
|
|
forward to write software that will automatically send out a |
5857 |
|
|
(forged) cancel message for any article meeting some crite- |
5858 |
|
|
rion, e.g. written by a specific author. The authentication |
5859 |
|
|
problems discussed in section 7.1 make it difficult to solve |
5860 |
|
|
this without crippling cancel's important functionality. |
5861 |
|
|
|
5862 |
|
|
A related problem is the possibility of disagreements over |
5863 |
|
|
newsgroup creation, on networks where such things are not |
5864 |
|
|
decided by central authorities. There have been cases of |
5865 |
|
|
"rmgroup wars", where one poster persistently sends out new- |
5866 |
|
|
group messages to create a newsgroup and another, equally |
5867 |
|
|
persistently, sends out rmgroup messages asking that it be |
5868 |
|
|
removed. This is not particularly damaging, if relayers are |
5881 |
|
|
configured to be cautious, but can cause serious confusion |
5882 |
|
|
among innocent third parties who just want to know whether |
5883 |
|
|
they can use the newsgroup for communication or not. |
5884 |
|
|
|
5885 |
|
|
|
5886 |
|
|
11.4. Liability |
5887 |
|
|
|
5888 |
|
|
News shares the legal uncertainty surrounding other forms of |
5889 |
|
|
electronic communication: what rules apply to this new |
5890 |
|
|
medium of information exchange? News is a particularly |
5891 |
|
|
problematic case because it is a broadcast medium rather |
5892 |
|
|
than a point-to-point one like mail, and analogies to older |
5893 |
|
|
forms of communication are particularly weak. |
5894 |
|
|
|
5895 |
|
|
Are news-carrying hosts common carriers, like the phone com- |
5896 |
|
|
panies, providing communications paths without having either |
5897 |
|
|
authority over or responsibility for content? Or are they |
5898 |
|
|
publishers, responsible for the content regardless of |
5899 |
|
|
whether they are aware of it or not? Or something in |
5900 |
|
|
between? Such questions are particularly significant when |
5901 |
|
|
the content is technically criminal, e.g. some types of sex- |
5902 |
|
|
ually-oriented material in some jurisdictions, in which case |
5903 |
|
|
ignorance of its presence may not be an adequate defence. |
5904 |
|
|
|
5905 |
|
|
Even in milder situations such as libel or copyright viola- |
5906 |
|
|
tion, the responsibilities of the poster, his host, and |
5907 |
|
|
other hosts carrying the traffic are unclear. Note, in par- |
5908 |
|
|
ticular, the problems arising when the article is a forgery, |
5909 |
|
|
or when the alleged author claims it is a forgery but cannot |
5910 |
|
|
prove this. |
5911 |
|
|
|
5912 |
|
|
|
5913 |
|
|
A. Archeological Notes |
5914 |
|
|
|
5915 |
|
|
|
5916 |
|
|
A.1. A-News Article Format |
5917 |
|
|
|
5918 |
|
|
The obsolete "A News" article format consisted of exactly |
5919 |
|
|
five lines of header information, followed by the body. For |
5920 |
|
|
example: |
5921 |
|
|
|
5922 |
|
|
Aeagle.642 |
5923 |
|
|
news.misc |
5924 |
|
|
cbosgd!mhuxj!mhuxt!eagle!jerry |
5925 |
|
|
Fri Nov 19 16:14:55 1982 |
5926 |
|
|
Usenet Etiquette - Please Read |
5927 |
|
|
body |
5928 |
|
|
body |
5929 |
|
|
body |
5930 |
|
|
|
5931 |
|
|
The first line consisted of an "A" followed by an article ID |
5932 |
|
|
(analogous to a message ID and used for similar purposes). |
5933 |
|
|
The second line was the list of newsgroups. The third line |
5934 |
|
|
was the path. The fourth was the date, in the format above |
5947 |
|
|
(all fields fixed width), resembling an Internet date but |
5948 |
|
|
not quite the same. The fifth was the subject. |
5949 |
|
|
|
5950 |
|
|
This format is documented for archeological purposes only. |
5951 |
|
|
Do not generate articles in this format. |
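
For anyone who needs to read old archives, the five-line layout
described above is simple to take apart.  The sketch below is for
reading only and makes no attempt to validate the fields; the function
name parse_a_news and the dictionary keys are hypothetical, and the
newsgroups line is returned as found because its separator is not
described here.

```python
def parse_a_news(text):
    """Split an A News article into its five header fields and its body."""
    lines = text.split("\n")
    if len(lines) < 5 or not lines[0].startswith("A"):
        raise ValueError("not an A News article")
    return {
        "article_id": lines[0][1:],     # drop the leading "A"
        "newsgroups": lines[1],
        "path": lines[2],
        "date": lines[3],
        "subject": lines[4],
        "body": "\n".join(lines[5:]),
    }


sample = "\n".join([
    "Aeagle.642",
    "news.misc",
    "cbosgd!mhuxj!mhuxt!eagle!jerry",
    "Fri Nov 19 16:14:55 1982",
    "Usenet Etiquette - Please Read",
    "body",
])
print(parse_a_news(sample)["article_id"])
```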
5952 |
|
|
|
5953 |
|
|
|
5954 |
|
|
A.2. Early B-News Article Format |
5955 |
|
|
|
5956 |
|
|
The obsolete pseudo-Internet article format, used briefly |
5957 |
|
|
during the transition between the A News format and the mod- |
5958 |
|
|
ern format, followed the general outline of a MAIL message |
5959 |
|
|
but with some non-standard headers. For example: |
5960 |
|
|
|
5961 |
|
|
From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz) |
5962 |
|
|
Newsgroups: news.misc |
5963 |
|
|
Title: Usenet Etiquette -- Please Read |
5964 |
|
|
Article-I.D.: eagle.642 |
5965 |
|
|
Posted: Fri Nov 19 16:14:55 1982 |
5966 |
|
|
Received: Fri Nov 19 16:59:30 1982 |
5967 |
|
|
Expires: Mon Jan 1 00:00:00 1990 |
5968 |
|
|
|
5969 |
|
|
body |
5970 |
|
|
body |
5971 |
|
|
body |
5972 |
|
|
|
5973 |
|
|
The From header contained the information now found in the |
5974 |
|
|
Path header, plus possibly the full name now typically found |
5975 |
|
|
in the From header. The Title header contained what is now |
5976 |
|
|
the Subject content. The Posted header contained what is |
5977 |
|
|
now the Date content. The Article-I.D. header contained an |
5978 |
|
|
article ID, analogous to a message ID and used for similar |
5979 |
|
|
purposes. The Newsgroups and Expires headers were approxi- |
5980 |
|
|
mately as now. The Received header contained the date when |
5981 |
|
|
the latest relayer to process the article first saw it. All |
5982 |
|
|
dates were in the above format, with all fields fixed width, |
5983 |
|
|
resembling an Internet date but not quite the same. |
5984 |
|
|
|
5985 |
|
|
This format is documented for archeological purposes only. |
5986 |
|
|
Do not generate articles in this format. |
5987 |
|
|
|
5988 |
|
|
|
5989 |
|
|
A.3. Obsolete Headers |
5990 |
|
|
|
5991 |
|
|
Early versions of news software following the modern format |
5992 |
|
|
sometimes generated headers like the following: |
5993 |
|
|
|
5994 |
|
|
Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP |
5995 |
|
|
Posting-Version: version B 2.10 2/13/83; site eagle.UUCP |
5996 |
|
|
Date-Received: Friday, 19-Nov-82 16:59:30 EST |
5997 |
|
|
|
5998 |
|
|
Relay-Version contained version information about the |
5999 |
|
|
relayer that last processed the article. Posting-Version |
6000 |
|
|
contained version information about the posting agent that |
6013 |
|
|
posted the article. Date-Received contained the date when |
6014 |
|
|
the last relayer to process the article first saw it (in a |
6015 |
|
|
slightly nonstandard format). |
6016 |
|
|
|
6017 |
|
|
These headers are documented for archeological purposes |
6018 |
|
|
only. Do not generate articles using them. |
6019 |
|
|
|
6020 |
|
|
|
6021 |
|
|
A.4. Obsolete Control Messages |
6022 |
|
|
|
6023 |
|
|
There once was a senduuname control message, resembling |
6024 |
|
|
sendsys but requesting transmission of the list of hosts |
6025 |
|
|
that the receiving host had UUCP connections to. This |
6026 |
|
|
rapidly ceased to be of much use, and many organizations |
6027 |
|
|
consider information about their internal connectivity to be |
6028 |
|
|
confidential. |
6029 |
|
|
|
6030 |
|
|
Historically, a checkgroups body consisting of one or two |
6031 |
|
|
lines, the first of the form "-n newsgroup", caused check- |
6032 |
|
|
groups to apply to only that single newsgroup. This form is |
6033 |
|
|
documented for archeological purposes only; do not use it. |
6034 |
|
|
|
6035 |
|
|
Historically, an article posted to a newsgroup whose name |
6036 |
|
|
had exactly three components of which the third was "ctl" |
6037 |
|
|
signified that article was to be taken as a control message. |
6038 |
|
|
The Subject header specified the actions, in the same way |
6039 |
|
|
the Control header does now. This form is documented for |
6040 |
|
|
archeological purposes only; do not use it; do not implement |
6041 |
|
|
it. |
6042 |
|
|
|
6043 |
|
|
|
6044 |
|
|
B. A Quick Tour Of MIME |
6045 |
|
|
|
6046 |
|
|
(The editor wishes to thank Luc Rooijakkers; most of this |
6047 |
|
|
appendix is a lightly-edited version of a summary he kindly |
6048 |
|
|
supplied.) |
6049 |
|
|
|
6050 |
|
|
MIME (Multipurpose Internet Mail Extensions) is an upward- |
6051 |
|
|
compatible set of extensions to RFC 822, currently docu- |
6052 |
|
|
mented in RFCs 1341 and 1342. This appendix summarizes |
6053 |
|
|
these documents. See the MIME RFCs for more information; |
6054 |
|
|
they are very readable. |
6055 |
|
|
|
6056 |
|
|
UNRESOLVED ISSUE: These RFC numbers (here and |
6057 |
|
|
elsewhere in this Draft) need updating when the |
6058 |
|
|
new MIME RFCs come out. |
6059 |
|
|
|
6060 |
|
|
MIME defines the following new headers: |
6061 |
|
|
|
6079 |
|
|
MIME-Version |
6080 |
|
|
Content-Type |
6081 |
|
|
Content-Transfer-Encoding |
6082 |
|
|
Content-ID |
6083 |
|
|
Content-Description |
6084 |
|
|
|
6085 |
|
|
|
6086 |
|
|
The MIME-Version header is mandatory for all messages con- |
6087 |
|
|
forming to the MIME specification and carries the version |
6088 |
|
|
number of the MIME specification. Example: |
6089 |
|
|
|
6090 |
|
|
MIME-Version: 1.0 |
6091 |
|
|
|
6092 |
|
|
|
6093 |
|
|
The Content-Type header indicates the content type of the |
6094 |
|
|
message. Content types are split into a top-level type and |
6095 |
|
|
a subtype, separated by a slash. Auxiliary information can |
6096 |
|
|
also be supplied, using an attribute-value notation. Exam- |
6097 |
|
|
ple: |
6098 |
|
|
|
6099 |
|
|
Content-Type: text/plain; charset=us-ascii |
6100 |
|
|
|
6101 |
|
|
(In the absence of a Content-Type header this is in fact the |
6102 |
|
|
default content type.) |
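
As an illustration of the type, subtype, and attribute-value structure,
the sketch below uses the email package of the Python standard library,
a present-day library that is not part of this Draft, to take a
Content-Type header apart.  The function name describe_content_type is
hypothetical.  Note that the library supplies text/plain when the header
is absent, matching the default described above, but reports no charset
in that case.

```python
from email import message_from_string


def describe_content_type(raw_article):
    """Return (type, subtype, charset) for an article given as a string."""
    msg = message_from_string(raw_article)
    return (
        msg.get_content_maintype(),     # top-level type, before the slash
        msg.get_content_subtype(),      # subtype, after the slash
        msg.get_param("charset"),       # None if no charset attribute
    )


article = "Content-Type: text/plain; charset=us-ascii\n\nbody\n"
print(describe_content_type(article))
```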
6103 |
|
|
|
6104 |
|
|
Important type/subtype combinations are |
6105 |
|
|
|
6106 |
|
|
text/plain Plain text, possibly in a non- |
6107 |
|
|
ASCII character set. |
6108 |
|
|
|
6109 |
|
|
text/enriched A very simple wordprocessor-like |
6110 |
|
|
language supporting character |
6111 |
|
|
attributes (e.g., underlining), |
6112 |
|
|
justification control, and multi- |
6113 |
|
|
ple character sets. (This pro- |
6114 |
|
|
posal has gone through several |
6115 |
|
|
iterations and has recently split |
6116 |
|
|
off from the main MIME RFCs into a |
6117 |
|
|
separate document.) |
6118 |
|
|
|
6119 |
|
|
message/rfc822 A mail message conforming to a |
6120 |
|
|
slightly-relaxed version of RFC |
6121 |
|
|
822. |
6122 |
|
|
|
6123 |
|
|
message/partial Part of a message (supporting the |
6124 |
|
|
transparent splitting and joining |
6125 |
|
|
of messages when they are too |
6126 |
|
|
large to be handled by some trans- |
6127 |
|
|
port agent). |
6128 |
|
|
|
6129 |
|
|
message/external-body A message whose body is external. |
6130 |
|
|
Possible access methods include |
6131 |
|
|
via mail, FTP, local file, etc. |
6132 |
|
|
|
6145 |
|
|
multipart/mixed A message whose body consists of |
6146 |
|
|
multiple parts, possibly of dif- |
6147 |
|
|
ferent types, intended to be |
6148 |
|
|
viewed in serial order. Each part |
6149 |
|
|
looks like an RFC 822 message, |
6150 |
|
|
consisting of headers and a body. |
6151 |
|
|
Most of the RFC 822 headers have |
6152 |
|
|
no defined semantics for body |
6153 |
|
|
parts. |
6154 |
|
|
|
6155 |
|
|
multipart/parallel Likewise, except that the parts |
6156 |
|
|
are intended to be viewed in par- |
6157 |
|
|
allel (on user agents that support |
6158 |
|
|
it). |
6159 |
|
|
|
6160 |
|
|
multipart/alternative Likewise, except that the parts |
6161 |
|
|
are intended to be semantically |
6162 |
|
|
equivalent such that the part that |
6163 |
|
|
best matches the capabilities of |
6164 |
|
|
the environment should be dis- |
6165 |
|
|
played. For example, a message |
6166 |
|
|
may include plain-text, enriched- |
6167 |
|
|
text, and PostScript versions of |
6168 |
|
|
some document. |
6169 |
|
|
|
6170 |
|
|
multipart/digest A variant of multipart/mixed espe- |
6171 |
|
|
cially intended for message |
6172 |
|
|
digests (the default type of the |
6173 |
|
|
parts is message/rfc822 instead of |
6174 |
|
|
text/plain, saving on the number |
6175 |
|
|
of headers for the parts). |
6176 |
|
|
|
6177 |
|
|
application/postscript A PostScript document. |
6178 |
|
|
(PostScript is a trademark of |
6179 |
|
|
Adobe.) |
6180 |
|
|
|
6181 |
|
|
Other top-level types exist for still images, audio, and |
6182 |
|
|
video samples. |
6183 |
|
|
|
6184 |
|
|
Some of the above types require the ability to transport |
6185 |
|
|
binary data. Since the existing message systems usually do |
6186 |
|
|
not support this, MIME provides a Content-Transfer-Encoding |
6187 |
|
|
header to indicate the kind of encoding used. The possible |
6188 |
|
|
encodings are: |
6189 |
|
|
|
6190 |
|
|
7bit No encoding; the data consists of short |
6191 |
|
|
(less than 1000 characters) lines of |
6192 |
|
|
7-bit ASCII data, delimited by EOL |
6193 |
|
|
sequences. This is the default encod- |
6194 |
|
|
ing. |
6195 |
|
|
|
6196 |
|
|
8bit Like 7bit, except that bytes with the |
6197 |
|
|
high-order bit set may be present. |
6198 |
|
|
Many transmission paths are incapable |
6199 |
|
|
|
6200 |
|
|
|
6201 |
|
|
|
6202 |
|
|
2 June 1994 - 94 - expires 15 July 1994 |
6203 |
|
|
|
6204 |
|
|
|
6205 |
|
|
|
6206 |
|
|
|
6207 |
|
|
|
6208 |
|
|
INTERNET DRAFT to be NEWS sec. B |
6209 |
|
|
|
6210 |
|
|
|
6211 |
|
|
of carrying messages which use this |
6212 |
|
|
encoding. |
6213 |
|
|
|
6214 |
|
|
binary No encoding; any sequence of bytes may |
6215 |
|
|
be present. Many transmission paths |
6216 |
|
|
are incapable of carrying messages |
6217 |
|
|
which use this encoding. |
6218 |
|
|
|
6219 |
|
|
base64 The data is encoded by representing |
6220 |
|
|
every group of 3 bytes as 4 characters |
6221 |
|
|
from the alphabet "A-Za-z0-9+/", which |
6222 |
|
|
was chosen for its high robustness |
6223 |
|
|
through mail gateways (the alphabet |
6224 |
|
|
used by uuencode does not survive |
6225 |
|
|
ASCII-EBCDIC-ASCII translations). In |
6226 |
|
|
the final group of 4 characters, "=" is |
6227 |
|
|
used for those characters not repre- |
6228 |
|
|
senting data bytes. Line length is |
6229 |
|
|
limited and EOLs in the encoded form |
6230 |
|
|
are ignored. |
6231 |
|
|
|
6232 |
|
|
quoted-printable Any byte can be represented by a three |
6233 |
|
|
character "=XX" sequence where the X's |
6234 |
|
|
are upper case hexadecimal digits. |
6235 |
|
|
Bytes representing printable 7-bit US- |
6236 |
|
|
ASCII characters except "=" may be rep- |
6237 |
|
|
resented literally. Tabs and blanks |
6238 |
|
|
may be represented literally if not at |
6239 |
|
|
the end of a line. Line length is lim- |
6240 |
|
|
ited, and an EOL preceded by "=" was |
6241 |
|
|
inserted for this purpose and is not |
6242 |
|
|
present in the original. |
6243 |
|
|
|
6244 |
|
|
The base64 and quoted-printable encodings are applied to |
6245 |
|
|
data in Internet canonical form, which means that any EOL |
6246 |
|
|
encoded as anything but EOL must be an Internet canonical |
6247 |
|
|
EOL: CR followed by LF. |
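
As an informal illustration of the base64 and quoted-printable
encodings and of canonical form, the following Python sketch uses
the standard "base64" and "quopri" modules. The helper names
(to_canonical, encode_base64, and so on) are hypothetical and chosen
only for this example, and the choice of ISO-8859-1 as the character
set is an assumption. The 76-character line wrapping is what the
Python library happens to do, not a limit stated in this Draft.

```python
import base64
import quopri


def to_canonical(text: str) -> bytes:
    """Convert text to Internet canonical form (CR LF line ends).

    Assumes the local EOL is a bare LF and that the text fits in
    ISO-8859-1; both are assumptions made for this sketch.
    """
    normalized = text.replace("\r\n", "\n")
    return normalized.replace("\n", "\r\n").encode("iso-8859-1")


def encode_base64(data: bytes) -> bytes:
    """Represent every 3 bytes as 4 characters, wrapped into lines."""
    return base64.encodebytes(data)


def decode_base64(encoded: bytes) -> bytes:
    """Decode base64; EOLs in the encoded form are ignored."""
    return base64.decodebytes(encoded)


def encode_qp(data: bytes) -> bytes:
    """Encode as quoted-printable ("=XX" for non-printable bytes)."""
    return quopri.encodestring(data)


def decode_qp(encoded: bytes) -> bytes:
    """Decode quoted-printable back to the original bytes."""
    return quopri.decodestring(encoded)
```

A posting agent following this sketch would call to_canonical()
first and only then apply encode_base64() or encode_qp(), so that
line ends in the original text survive as CR LF after decoding.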

The Content-Description header allows further description of a body
part, analogous to the use of Subject for messages.

Finally, the Content-ID header can be used to assign an
identification to body parts, analogous to the assignment of
identifications to messages by Message-ID.

Note that most of these headers are structured header fields, as
defined in RFC 822. Consequently, comments are allowed in their
values. The following is a legal MIME header:

     Content-Type: (a comment) text (yeah) /
       plain (and now some params:) ; charset= (guess what)
       iso-8859-1 (we don't have iso-10646 yet, pity)
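
To make the consequence of allowing comments concrete, here is a
minimal Python sketch that removes RFC 822 comments from a header
value and then picks out the type, subtype, and parameters. It is
an illustration only, not a full RFC 822 parser: the function names
are hypothetical, and it assumes that no parameter value contains a
";" inside a quoted string.

```python
def strip_comments(value: str) -> str:
    """Remove parenthesized comments from a structured header value.

    Comments may nest. Quoted strings and backslash-quoted characters
    outside comments are copied through unchanged.
    """
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a "quoted string"?
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # Quoted pair: keep it only when outside any comment.
            if depth == 0:
                out.append(value[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_content_type(value: str):
    """Return (type, subtype, params) for a Content-Type value."""
    cleaned = strip_comments(value)
    main, _, rest = cleaned.partition(";")
    mtype, _, subtype = main.partition("/")
    params = {}
    for item in rest.split(";"):
        name, sep, val = item.partition("=")
        if sep:
            params[name.strip().lower()] = val.strip().strip('"')
    return mtype.strip().lower(), subtype.strip().lower(), params
```

Applied to the header above, parse_content_type() yields the type
"text", the subtype "plain", and a single parameter, charset, with
the value "iso-8859-1".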

     NOTE: Although the MIME specification was developed for
     mail, there is nothing precluding its use for news as well.
     While it might simplify implementation to restrict the MIME
     headers somewhat, in the same way that other news headers
     (e.g. From) are restricted subsets of the RFC-822 originals,
     this would add yet another divergence between two formats
     that ought to be as compatible as possible. In the case of
     the MIME headers, there is no body of existing code posing
     compatibility concerns. A full-featured MIME reading agent
     needs a full RFC-822 parser anyway, to properly handle body
     parts of types like message/rfc822, so there is little gain
     from restricting MIME headers. Adopting the MIME
     specification unchanged seems best. However, article-level
     MIME headers must still comply with the overall news header
     syntax given in section 4, so that news software which is
     NOT interested in MIME need not contain a full RFC-822
     parser.

The second part of MIME, RFC 1342 (Representation of Non-ASCII Text
in Internet Message Headers), addresses the problem of non-ASCII
characters in headers. An example of a header using the RFC 1342
mechanism is

     From: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>

Such encodings are allowed in selected headers, subject to the
restrictions listed in RFC 1342.
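
The shape of such an encoded word can be shown with a short Python
sketch that decodes the "Q" and "B" forms in a header value. The
names used here (ENCODED_WORD, decode_word, decode_header_value) are
hypothetical. The sketch deliberately ignores the RFC's rules on
where encoded words may appear and on white space between adjacent
encoded words, so it should be read as an illustration of the syntax
rather than a conforming decoder.

```python
import base64
import re

# =?charset?encoding?encoded-text?=  (encoding is Q or B, any case)
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=")
HEX_PAIR = re.compile(r"[0-9A-Fa-f]{2}")


def _decode_q(text: str) -> bytes:
    """Decode the "Q" form: "_" is a space, "=XX" is one byte."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "_":
            out.append(0x20)
            i += 1
        elif ch == "=" and HEX_PAIR.fullmatch(text[i + 1:i + 3]):
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.extend(ch.encode("ascii", "replace"))
            i += 1
    return bytes(out)


def decode_word(match) -> str:
    """Decode one encoded word; leave it untouched on any error."""
    charset, encoding, payload = match.groups()
    try:
        if encoding.upper() == "Q":
            raw = _decode_q(payload)
        else:
            raw = base64.b64decode(payload)
        return raw.decode(charset)
    except (LookupError, ValueError):
        # Unknown charset, bad base64, or undecodable bytes.
        return match.group(0)


def decode_header_value(value: str) -> str:
    """Replace every encoded word in value by its decoded text."""
    return ENCODED_WORD.sub(decode_word, value)
```

For the From header shown above, the encoded word decodes to "André"
followed by a space (from the "_"); the remainder of the header is
plain ASCII and passes through unchanged.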

The MIME effort has also produced an RFC defining a Content-MD5
header [rrr 1544], containing an MD5-based "checksum" of the
contents of an article or body part, giving high confidence of
detecting accidental modifications to the contents.
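
As a rough sketch of how such a checksum might be produced and
checked, the following Python fragment computes an MD5 digest of the
contents and renders it in base64, which is the representation used
by RFC 1544 (that detail comes from RFC 1544, not from this Draft).
The function names are hypothetical, and the caller is assumed to
have already put the contents into canonical form.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the base64 form of the MD5 digest of body.

    Assumes body is already in canonical form (CR LF line ends).
    """
    digest = hashlib.md5(body).digest()  # 16 raw bytes
    return base64.b64encode(digest).decode("ascii")


def verify_content_md5(body: bytes, header_value: str) -> bool:
    """Check body against the value of a Content-MD5 header."""
    return content_md5(body) == header_value.strip()
```

A mismatch reported by verify_content_md5() indicates accidental
modification in transit. As the text notes, this is a check against
accidents and offers no protection against deliberate tampering.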

The "metamail" software package [rrr] helps provide MIME support
with minimal changes to mailers, and may also be relevant to news
reading agents.

The PEM (Privacy Enhanced Mail) effort is pursuing analogous
facilities to offer stronger guarantees against malicious
modifications, unauthorized eavesdropping, and forgery. This work
too may be applicable to news, once it is reconciled with MIME (by
efforts now underway).


C. Summary of Changes Since RFC 1036

This Draft is much longer than RFC 1036, so there is obviously much
change in content. Much of this is just increased precision and
rigor. Noteworthy changes and additions include:

     + section 4.3's restrictions on article bodies

     + all references to MIME facilities

     + size limits on articles

     + precise specification of Date-content syntax

     + message IDs must never be re-used, ever

     + "!" is the only Path delimiter

     + multiple moderators in the Approved header

     + rules on References trimming, and the _-_ mechanism

     + generalization of the Xref rules

     + multiple message IDs in Cancel and Supersedes

     + Also-Control

     + See-Also

     + Article-Names

     + Article-Replacing

     + more precise rules for cancellation

     + cancellation authorization based on From, not Sender

     + "unmoderated" and descriptors in newgroup messages

     + restrictive rules on handling of sendsys and version messages

     + the whogets control message

     + precise specification of checkgroups messages

     + compression type preferably specified out-of-band

     + rules for encapsulating news in MIME mail

     + tighter specification of relayer functioning (section 9.1)

     + the "newsmaster" contact address

     + rules for gatewaying (section 10)

     + discussion of security issues (section 11)


D. Summary of Completely New Features

Most of this Draft merely documents existing practice, but there
are a few attempts to extend it. These are:

     TBW


E. Summary of Differences From RFC 822+1123

The following are noteworthy differences between this Draft's
articles and MAIL messages:

     + generally less-permissive header syntax

     + notably, limited From syntax

     + MAIL header comments allowed in only a few contexts

     + slightly more restricted message-ID syntax

     + several more mandatory headers

     + duplicate headers forbidden

     + References/See-Also versus In-Reply-To/References
       (section 6.5)

     + case sensitivity in some contexts

     + point-to-point headers, e.g. To and Cc, forbidden
       (section 6)

     + several new headers


References

[Sanderson] "Smileys", David Sanderson, O'Reilly & Associates
            Ltd., 1993.

TBW


Security Considerations

Section 11 discusses security considerations in detail.


Author's Address

Henry Spencer
henry@zoo.toronto.edu

SP Systems
Box 280 Stn. A
Toronto, Ont. M5W1B2 Canada