File:  [mozdev] / bayesjunktool / www / installation.html
Revision 1.15
Wed Apr 6 06:27:01 2005 UTC by brantgurga
Branches: MAIN
CVS tags: HEAD
version number fix

<!-- MAIN CONTENT -->
<h5 class="page-header"><a id="content" name="content">Download</a></h5>

<table class="downloadTable" cellspacing="0">
	<tr>
		<td class="downloadDescriptor" colspan="4">Bayes Junk Tool</td>
	</tr>

	<tr>
		<td class="downloadHeader">Description</td>
		<td class="downloadHeader">Version</td>
		<td class="downloadHeader">Size</td>
		<td class="downloadHeader">Requirements</td>
	</tr>
	
	<tr>
                <td><a href="http://downloads.mozdev.org/bayesjunktool/bayesjunktool-0.2.1.jar">Binary-only maintenance release 0.2.1</a></td>
                <td class="center">0.2.1</td>
                <td class="center">39k</td>
                <td>
			<a href="http://java.sun.com/j2se/1.4.2/download.html">Java Runtime Environment, version 1.4 or higher</a>
		</td>
	</tr>

	<tr>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/bayesjunktool-src-0.2.zip">Source-only download</a></td>
		<td class="center">0.2</td>
		<td class="center">489k</td>
		<td>
			<a href="http://java.sun.com/j2se/1.4.2/download.html">Java Development Kit, version 1.4 or higher</a>
		</td>
	</tr>

	<tr>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/bayesjunktool-bin-0.2.zip">Source and binary (Java Class) download</a></td>
		<td class="center">0.2</td>
		<td class="center">533k</td>
		<td>
			<a href="http://java.sun.com/j2se/1.4.2/download.html">Java Runtime Environment, version 1.4 or higher</a>
		</td>
	</tr>

	<tr>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/bayesjunktool-src-0.1.zip">Source-only download</a></td>
		<td class="center">0.1</td>
		<td class="center">34k</td>
		<td>
			<a href="http://java.sun.com/j2se/1.4.2/download.html">Java Development Kit, version 1.4 or higher</a>
		</td>
	</tr>

	<tr>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/bayesjunktool-bin-0.1.zip">Source and binary (Java Class) download</a></td>
		<td class="center">0.1</td>
		<td class="center">98k</td>
		<td>
			<a href="http://java.sun.com/j2se/1.4.2/download.html">Java Runtime Environment, version 1.4 or higher</a>
		</td>
	</tr>

	<tr>
		<td class="downloadDescriptor" colspan="4">Sample XML and DAT Token Files</td>
	</tr>

	<tr>
		<td class="downloadHeader" colspan="2">Description</td>
		<td class="downloadHeader">Size</td>
		<td class="downloadHeader">File name</td>
	</tr>

	<tr>
		<td colspan="2" rowspan="2">This is a copy of the tokens from my training.dat. Only tokens which had at least 20 good or bad hits were included. I seem to get a high ratio of Spanish and Portuguese spam, so it may more readily filter those out than other types of spam (for example, Chinese).</td>
		<td class="center">210k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/straxus.xml">straxus.xml</a></td>
	</tr>

	<tr>
		<!-- First 2 cells are taken by previous TD of rowspan 2 -->
		<td class="center">66k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/straxus.dat">straxus.dat</a></td>
	</tr>

	<tr>
		<td colspan="2" rowspan="2">This is a copy of the tokens from Rob Stow's training.dat. Only tokens which had at least 5 good or bad hits were included. It is optimized for "get rich quick" schemes, pr0n, and some email viruses.</td>
		<td class="center">195k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/robstow.xml">robstow.xml</a></td>
	</tr>

	<tr>
		<!-- First 2 cells are taken by previous TD of rowspan 2 -->
		<td class="center">55k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/robstow.dat">robstow.dat</a></td>
	</tr>

	<tr>
		<td colspan="2" rowspan="2">This is a copy of the tokens from Morten Hansen's training.dat. Only tokens which had at least 20 good or bad hits were included. It is not yet known what kinds of spam this token file is optimized for.</td>
		<td class="center">1032k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/mhansen.xml">mhansen.xml</a></td>
	</tr>

	<tr>
		<!-- First 2 cells are taken by previous TD of rowspan 2 -->
		<td class="center">311k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/mhansen.dat">mhansen.dat</a></td>
	</tr>

	<tr>
		<td colspan="2" rowspan="2">This is a copy of the tokens from Dmitry Diskin's training.dat. Only tokens which had at least 5 good or bad hits were included. It is optimized for Russian spam.</td>
		<td class="center">129k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/ddiskin.xml">ddiskin.xml</a></td>
	</tr>

	<tr>
		<!-- First 2 cells are taken by previous TD of rowspan 2 -->
		<td class="center">41k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/ddiskin.dat">ddiskin.dat</a></td>
	</tr>

	<tr>
		<td colspan="2" rowspan="2">This is a copy of the tokens from Christian Hamacher's training.dat. Only tokens which had at least 20 good or bad hits were included. It is optimized for allowing German emails and English emails with technical terms while eliminating most HTML spam.</td>
		<td class="center">276k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/chamacher.xml">chamacher.xml</a></td>
	</tr>

	<tr>
		<!-- First 2 cells are taken by previous TD of rowspan 2 -->
		<td class="center">90k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/chamacher.dat">chamacher.dat</a></td>
	</tr>

	<tr>
		<td colspan="2" rowspan="2">This is a copy of the tokens from Jan Gundtofte-Bruun's training.dat. Only tokens which had at least 5 good or bad hits were included. It is optimized for mostly English spam (mortgages, pills, loans, etc.).</td>
		<td class="center">175k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/jangb.xml">jangb.xml</a></td>
	</tr>

	<tr>
		<!-- First 2 cells are taken by previous TD of rowspan 2 -->
		<td class="center">50k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/jangb.dat">jangb.dat</a></td>
	</tr>

	<tr>
		<td colspan="2" rowspan="2">This is a copy of the tokens from Oliver Putz's training.dat. Only tokens which had at least 20 good or bad hits were included. It is optimized for allowing German emails.</td>
		<td class="center">1038k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/oputz.xml">oputz.xml</a></td>
	</tr>

	<tr>
		<!-- First 2 cells are taken by previous TD of rowspan 2 -->
		<td class="center">316k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/oputz.dat">oputz.dat</a></td>
	</tr>

	<tr>
		<td colspan="2" rowspan="2">This is a copy of the tokens from Will Smith's training.dat. Only tokens which had at least 20 good or bad hits were included. It is optimized for allowing emails that a busy webmaster would receive (such as cron job output, statistics, security notices, Wikipedia changes, and emails relating to open source software) as well as eBay auction notices while rejecting most English and Chinese spam.</td>
		<td class="center">894k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/wsmith.xml">wsmith.xml</a></td>
	</tr>

	<tr>
		<!-- First 2 cells are taken by previous TD of rowspan 2 -->
		<td class="center">275k</td>
		<td><a href="http://downloads.mozdev.org/bayesjunktool/wsmith.dat">wsmith.dat</a></td>
	</tr>
</table>
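
<p>
	Each sample file above keeps only tokens with at least 5 or 20 good or bad hits. The sketch below illustrates that kind of threshold filter. It is not the tool's actual code: the function name, the in-memory layout (a token mapped to a pair of good and bad counts), and the reading of "good or bad hits" as "either count reaches the threshold" are all assumptions made for illustration.
</p>

```python
def filter_tokens(tokens, min_hits=20):
    """Keep tokens whose good count or bad count reaches min_hits.

    `tokens` maps a token string to a (good_hits, bad_hits) pair.
    This layout is hypothetical; it is not the training.dat format.
    """
    return {
        token: (good, bad)
        for token, (good, bad) in tokens.items()
        if good >= min_hits or bad >= min_hits
    }


# Made-up counts, purely for illustration.
sample = {
    "viagra": (0, 42),   # seen often in junk mail
    "meeting": (31, 1),  # seen often in good mail
    "hello": (7, 9),     # too rare either way to be kept at 20
}

kept = filter_tokens(sample, min_hits=20)
print(sorted(kept))  # ['meeting', 'viagra']
```

<p>
	A lower threshold such as 5 keeps more tokens, which is one reason files built from similar mail volumes can differ in size.
</p>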

<p>
	To merge one of the sample token files with your own training.dat, please do the following:
</p>

<ol>
	<li>Start up the Bayes Junk Tool in GUI mode (-g command-line switch).</li>
	<li>Under the File menu, select "Import and Merge..." (or press Ctrl-I).</li>
	<li>Select the XML or DAT file which was downloaded, and press OK. This will merge the selected file into the existing set of tokens. Please be patient with XML files; importing them may take a little while (see <a href="http://mozdev.org/bugs/show_bug.cgi?id=3947">bug 3947</a>).</li>
	<li>Select "Save As..." from the File menu (or press Ctrl-S) and save as a Data file. Name the file training.dat.</li>
	<li>Close Mozilla completely (including QuickLaunch), then copy the saved training.dat over the existing training.dat in your Mozilla profile folder. Back up the existing file first, as is wise before overwriting any profile file.</li>
</ol>
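
<p>
	The last step (back up, then overwrite) can also be scripted. The following is a minimal sketch, not part of the Bayes Junk Tool: the function name, the ".bak" suffix, and the example paths are hypothetical, and the location of your profile folder depends on your platform. Run it only while Mozilla is fully closed.
</p>

```python
import shutil
from pathlib import Path


def replace_training_file(merged_file, profile_dir):
    """Back up the profile's training.dat, then copy the merged file over it.

    Returns the path of the backup file (training.dat.bak, a hypothetical
    naming choice). If the profile has no training.dat yet, no backup is
    written and the merged file is simply copied into place.
    """
    target = Path(profile_dir) / "training.dat"
    backup = target.with_name(target.name + ".bak")

    if target.exists():
        # copy2 also preserves timestamps, which helps when comparing backups.
        shutil.copy2(target, backup)

    shutil.copy2(merged_file, target)
    return backup


# Example call (paths are placeholders, adjust them to your system):
# replace_training_file("merged/training.dat", "/path/to/mozilla/profile")
```

<p>
	To undo the change, copy training.dat.bak back over training.dat while Mozilla is closed.
</p>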

<p>
	You should notice an immediate increase in your Junk Mail filter's effectiveness.
</p>

<p>
	If you would like to upload either your training.dat or your exported XML training file so that others can benefit from it, please email it to me at <a href="mailto:did_you_know_that_straxus_dislikes_spam@baynet.net">str<!-- Spammers do it, why not me? -->ax<!-- and another one -->us@<!-- and yet another one -->bay<!--  and one last one for good measure -->net.net</a>. Please note when using this link that you will need to remove everything except "straxus" from the beginning of the email address. I will add instructions about how to create a "nice" XML token file later, but for now feel free to send me what you have.
</p>
