Character entity reference-like strings in 2ch threads
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This directory contains the results of a quick survey on the usage of
character entity reference-like strings (/&[0-9A-Za-z]+;?/) in a
*biased* collection of 28,085 dat files, as of August 2007, containing
11,332,194 res, from 2ch and similar BBS Web sites.

* Files

result-all.txt
  List of entities sorted by occurrence.

result-res.txt
  List of entities sorted by occurrence, counting more than one
  occurrence of an entity in a res as one.

result-dat.txt
  List of entities sorted by occurrence, counting more than one
  occurrence of an entity in a thread as one.

all-result.txt
  Source for the result files above, in Perl Data::Dumper output
  format.  It is a Perl array reference representing:
    [number_of_res,
     {entity => occurrence_in_res_number},
     number_of_threads,
     {entity => occurrence_in_thread_number},
     {entity => occurrence}].

* Glossary

Dat file
  A file representing a thread, which consists of a number of "res".
  Formatted HTML documents provided for Web browsers are generated
  from dat files.  Dat files might contain some HTML markup, including
  character entity references.

Thread
  A sequential collection of messages in 2ch and similar BBS,
  discussing a topic.  A thread is part of a board.

Res
  A message posted by a user to 2ch or a similar BBS.  A res belongs
  to a thread.
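
* Example

The three counting modes listed under Files (every occurrence, at most one per res, at most one per thread) can be sketched as follows. This is only an illustrative Python sketch, not the script that produced the result files. The function name count_entities and the input shape (a list of threads, each a list of res strings) are assumptions made for the example, as is the choice to count "&amp" and "&amp;" as distinct strings. The returned list mirrors the element order described for all-result.txt.

```python
import re
from collections import Counter

# Pattern quoted in the survey description: "&", one or more ASCII
# alphanumerics, and an optional trailing ";".
ENTITY_LIKE = re.compile(r"&[0-9A-Za-z]+;?")


def count_entities(threads):
    """Count entity-like strings in `threads`.

    `threads` is a list of threads, each thread being a list of res
    strings. This input shape is hypothetical; real dat files would
    first have to be split into res.
    """
    total = Counter()       # every occurrence
    per_res = Counter()     # at most one count per res
    per_thread = Counter()  # at most one count per thread
    n_res = 0
    for thread in threads:
        seen_in_thread = set()
        for res in thread:
            n_res += 1
            found = ENTITY_LIKE.findall(res)
            total.update(found)
            per_res.update(set(found))
            seen_in_thread.update(found)
        per_thread.update(seen_in_thread)
    # Same element order as the array described for all-result.txt.
    return [n_res, per_res, len(threads), per_thread, total]


# Made-up sample data: two threads, three res in total.
sample = [["a &gt; b &gt; c &amp x", "&gt;"], ["&amp; &amp;"]]
result = count_entities(sample)
```

With this sample, "&gt;" is counted three times overall, twice per res, and once per thread, which is the distinction between result-all.txt, result-res.txt, and result-dat.txt.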