Tue Jul 1 01:13:49 2003 UTC (15 years, 8 months ago) by straxus

Checked in source and documentation files that were included in the first
release of the junk mail tool (the one that was released to the
newsgroups).

README for Mozilla Bayesian Filter Training File Analyzer ver. 0.1

/****************\
|* INTRODUCTION *|
\****************/

This is the first release of this tool, so don't expect a high
level of refinement (although I have done my best). At this point,
I'll turn the doc over to the disclaimer found at the top of all
of my source files:

 * The terms for using this software are as follows:
 * 
 * USE AT YOUR OWN RISK - if this program goes insane and takes
 * out several bystanders, don't come knocking on my door with
 * lawyers.
 * 
 * If you want to extend or use this software for some sort of
 * commercial (read: money-making) software, tell me about it
 * first. I probably won't ask for a cut because the software
 * isn't that complicated, but I do want to know where my little
 * baby heads after it leaves my machine.
 * 
 * If you have any questions about this program, feel free to
 * email me at straxus@baynet.net. I'd love to hear how this
 * program worked for you, or any suggestions or bugfixes that
 * you believe this software should use. I believe that software
 * should evolve and become better, so there's an extremely good
 * chance your suggestion will make it into the next version.
 * 
 * Oh, and for those of you curious about the author's (my) name,
 * just email and ask. :)

/****************\
|* REQUIREMENTS *|
\****************/

* Java 2 Standard Edition 1.4.1. This version is required because
importing data from XML needs an XML parser. If you remove the one
method that does XML importing in Analyzer.java, the program will
then only need Java 1.3.1.
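
To illustrate the kind of XML parsing that ties the tool to Java 1.4,
here is a minimal sketch using the javax.xml.parsers API that ships
with the JDK from 1.4 onward. It is not the code from Analyzer.java.
The class name, the <token> element and its name/good/bad attributes
are all assumptions made for this example; the tool's real XML layout
is not described in this README. The sketch is written for a modern
JDK (Java 8 or later), not for 1.4.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: the XML layout below is invented for illustration.
final class XmlTokenReader {

    /** Maps each token name to {goodCount, badCount} read from the XML text. */
    static Map<String, int[]> readTokens(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("token");
            Map<String, int[]> tokens = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                int good = Integer.parseInt(e.getAttribute("good"));
                int bad = Integer.parseInt(e.getAttribute("bad"));
                tokens.put(e.getAttribute("name"), new int[] { good, bad });
            }
            return tokens;
        } catch (Exception ex) {
            // Wrap parser and I/O failures so callers see a single error type.
            throw new IllegalArgumentException("Could not parse token XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<tokens>"
                + "<token name=\"lottery\" good=\"0\" bad=\"12\"/>"
                + "<token name=\"meeting\" good=\"30\" bad=\"1\"/>"
                + "</tokens>";
        for (Map.Entry<String, int[]> e : readTokens(xml).entrySet()) {
            System.out.println(e.getKey() + " good=" + e.getValue()[0]
                    + " bad=" + e.getValue()[1]);
        }
    }
}
```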

/************\
|* FEATURES *|
\************/

* Viewing of data contained within Mozilla's training.dat
* Exporting of data as HTML, XML, plain text, or well-formed .dat
(you can take an exported .dat and drop it in the Mozilla folder,
and it should work perfectly)
* GUI which allows adding new tokens, removing tokens, and editing
the counts associated with each token.
* Sorting of data on any column in the GUI. This allows you to see,
for example, the most frequently encountered good and bad tokens in
email.
* Importing of data from an existing training.dat or XML file and
merging with an existing training.dat. I believe this last feature
is important as it will allow a new user to get up and running very
quickly by importing a well-known XML file containing useful values
for spam tokens, thus greatly reducing the training period for
Mozilla's mail filters.
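
As a rough illustration of the merge feature, here is one way imported
token counts could be combined with existing ones. This is a sketch
under assumptions, not the tool's actual code: it assumes each token
carries a good count and a bad count, and that merging means adding
the counts of tokens that appear in both sets. The class and method
names are made up for the example, and it targets Java 8 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Analyzer source.
final class TokenCountMerge {

    /** Good and bad message counts recorded for one token. */
    static final class Counts {
        final long good;
        final long bad;

        Counts(long good, long bad) {
            this.good = good;
            this.bad = bad;
        }

        /** Returns a new Counts holding the sums of both operands. */
        Counts plus(Counts other) {
            return new Counts(good + other.good, bad + other.bad);
        }
    }

    /**
     * Returns a new map with every token from both inputs. Tokens found
     * in both maps get summed counts (an assumed merge policy).
     */
    static Map<String, Counts> merge(Map<String, Counts> existing,
                                     Map<String, Counts> imported) {
        Map<String, Counts> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Counts> e : imported.entrySet()) {
            result.merge(e.getKey(), e.getValue(), Counts::plus);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Counts> mine = new LinkedHashMap<>();
        mine.put("lottery", new Counts(0, 12));
        mine.put("meeting", new Counts(30, 1));

        Map<String, Counts> shared = new LinkedHashMap<>();
        shared.put("lottery", new Counts(0, 400));
        shared.put("refinance", new Counts(2, 250));

        for (Map.Entry<String, Counts> e : merge(mine, shared).entrySet()) {
            System.out.println(e.getKey() + " good=" + e.getValue().good
                    + " bad=" + e.getValue().bad);
        }
    }
}
```

Summing is only one possible policy. A merge could just as well keep
the larger count or prefer the local value, and this README does not
say which the tool uses.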

Valid command-line arguments for this program are:

-q, --quiet                                  == silent execution of program
-g, --gui                                    == start up GUI version of program
-h, -?, --help                               == display program usage (this message)
-o, --outputfile [filename]                  == path to program output file
-f, --format [ xml | html | text | data ]    == program output format
-i, --inputfile [filename]                   == path to Mozilla training.dat

Please note that the input file argument must be the full path to
the file, including the training.dat filename itself and not just
its directory, e.g. [path-to-profile]/xxxxxxxx.slt/training.dat
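
For readers who want to see how the options listed above might be
handled in code, here is a minimal sketch of a parser for them. It
recognizes only the flags documented in this README. The class, field
and method names are invented for the example, and the behaviour on
bad input (throwing IllegalArgumentException) is an assumption, not a
description of what Analyzer itself does. It targets Java 8 or later.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not taken from Analyzer.java.
final class AnalyzerOptions {
    static final List<String> FORMATS = Arrays.asList("xml", "html", "text", "data");

    boolean quiet;
    boolean gui;
    boolean help;
    String inputFile;
    String outputFile;
    String format;

    /** Parses the documented flags, rejecting anything else. */
    static AnalyzerOptions parse(String[] args) {
        AnalyzerOptions o = new AnalyzerOptions();
        for (int i = 0; i < args.length; i++) {
            String a = args[i];
            switch (a) {
                case "-q": case "--quiet":
                    o.quiet = true;
                    break;
                case "-g": case "--gui":
                    o.gui = true;
                    break;
                case "-h": case "-?": case "--help":
                    o.help = true;
                    break;
                case "-o": case "--outputfile":
                    o.outputFile = valueAt(args, ++i, a);
                    break;
                case "-i": case "--inputfile":
                    o.inputFile = valueAt(args, ++i, a);
                    break;
                case "-f": case "--format":
                    o.format = valueAt(args, ++i, a);
                    if (!FORMATS.contains(o.format)) {
                        throw new IllegalArgumentException("Unknown format: " + o.format);
                    }
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + a);
            }
        }
        return o;
    }

    /** Returns the value following an option, or fails if it is missing. */
    private static String valueAt(String[] args, int index, String option) {
        if (index >= args.length) {
            throw new IllegalArgumentException("Missing value for " + option);
        }
        return args[index];
    }

    public static void main(String[] args) {
        AnalyzerOptions o = parse(new String[] {
                "-q", "--inputfile", "training.dat", "-f", "xml", "-o", "tokens.xml" });
        System.out.println("quiet=" + o.quiet + " input=" + o.inputFile
                + " format=" + o.format + " output=" + o.outputFile);
    }
}
```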

/*************\
|* EXECUTION *|
\*************/

To build the program, type the following in the installation
directory of the program:

javac -d . mozilla_training_analyzer\*.java

The backslash above is the Windows directory separator. Adjust it
as required for your platform (for example, use a forward slash on
Unix-like systems).

After compilation, to run the program, type:

java -cp . mozilla_training_analyzer.Analyzer [Analyzer options]

/***********\
|* HISTORY *|
\***********/

June 23rd, 2003 - Release 0.1
* First release, baybee.
