<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>How to store SpamAssassin Bayes data in SQL</title>
<link href="../css/markdown.css" rel="stylesheet">
</head>

<body>
<h1 id="how-to-store-spamassassin-bayes-in-sql">How to store SpamAssassin Bayes data in SQL</h1>
<p><strong>THIS ARTICLE IS STILL A DRAFT. DO NOT APPLY IT ON A PRODUCTION SERVER.</strong></p>
<h2 id="summary">Summary</h2>
<p>This article explains how to configure the related components to store
SpamAssassin Bayes data in an SQL server and to let webmail users report spam
with one click.</p>
<p>Tested with:</p>
<ul>
<li>iRedMail-0.8.0, iRedMail-0.8.7</li>
<li>CentOS 6.2 (x86_64)</li>
<li>SpamAssassin-3.3.1</li>
<li>Amavisd-new-2.6.6</li>
<li>MySQL-5.1.61</li>
<li>Roundcubemail-0.7.2</li>
</ul>
<p>Notes:</p>
<ul>
<li>This article should work with all iRedMail releases. We use iRedMail-0.8.0 as the example.</li>
<li>This article should work with all backends: OpenLDAP, MySQL, MariaDB, PostgreSQL. We use the MySQL backend as the example.</li>
<li>This article should work with Amavisd-new-2.6.0 and later versions.</li>
</ul>
<p><strong>IMPORTANT NOTE</strong>:</p>
<ul>
<li>The Bayesian classifier only scores new messages once it has learned at least 200
known spam messages and 200 known ham (non-spam) messages.</li>
<li>If SpamAssassin fails to identify a spam message, teach it so it can do better next
time, e.g. mark the message as spam in Roundcube webmail.</li>
<li>Read the <code>References</code> section at the end of this article before asking questions.</li>
</ul>
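<p>Once Bayes is set up, the 200/200 rule can be checked from the command line. The sketch below is an illustration only: <code>bayes_ready</code> is a hypothetical helper name, and it assumes that <code>sa-learn --dump magic</code> prints the learned-message counters as lines ending in <code>nspam</code> and <code>nham</code>, with the count in the third column. Compare that assumption with the output of your own SpamAssassin version before relying on it.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SpamAssassin): reads `sa-learn --dump magic`
# output on stdin and checks the learned-message counters against a minimum.
# Assumption: counter lines end in "nspam" / "nham" and carry the count in
# the third column.
bayes_ready() {
    local min="${1:-200}"   # 200 is the threshold quoted in the note above
    awk -v min="$min" '
        $NF == "nspam" { nspam = $3 }
        $NF == "nham"  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            # Exit status 0 only when both counters reach the minimum.
            exit ((nspam + 0 >= min + 0 && nham + 0 >= min + 0) ? 0 : 1)
        }'
}

# Usage on a configured server (not run here):
#   sa-learn --dump magic --username=vmail | bayes_ready
```

<p>The helper prints both counters and returns a non-zero exit status while either one is below the minimum, so it can be used in a shell condition.</p>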
<h2 id="create-required-sql-database-used-to-store-bayes-data">Create required SQL database used to store bayes data</h2>
<p>We need to create an SQL database and the necessary tables to store SpamAssassin
Bayes data. The RPM package installed on CentOS 6 doesn't ship the SQL template
for the Bayes database, so we have to download it from the Apache web site. We're
running SpamAssassin-3.3.1, so we need this SQL template file:
<a href="http://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_release_3_3_1/sql/bayes_mysql.sql">http://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_release_3_3_1/sql/bayes_mysql.sql</a>.
If you're running a different version, find the matching SQL file here:
<a href="http://svn.apache.org/repos/asf/spamassassin/tags/">http://svn.apache.org/repos/asf/spamassassin/tags/</a>.</p>
<pre>
# cd /root/
# wget http://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_release_3_3_1/sql/bayes_mysql.sql
</pre>

<p>Create the MySQL database and import the SQL template file:</p>
<pre>
# mysql -uroot -p
mysql> CREATE DATABASE sa_bayes;
mysql> USE sa_bayes;
mysql> SOURCE /root/bayes_mysql.sql;
</pre>

<p>Create a new MySQL user (with password <code>sa_user_password</code>) and grant
permissions. <strong>IMPORTANT NOTE</strong>: Replace the password <code>sa_user_password</code>
with your own password.</p>
<pre>
mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON sa_bayes.* TO sa_user@localhost IDENTIFIED BY 'sa_user_password';
mysql> FLUSH PRIVILEGES;
</pre>
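<p>If you have no password at hand, one can be generated on the server. The following is a minimal sketch, not part of iRedMail or MySQL: <code>gen_password</code> and <code>grant_statement</code> are hypothetical helper names. It restricts the password to letters and digits so that it needs no escaping inside the quoted SQL string, and prints the same <code>GRANT</code> statement as above with the generated password filled in.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; the names are not from any package.

# Print a random password made of letters and digits only, so it needs no
# escaping inside the single-quoted SQL string below.
gen_password() {
    local length="${1:-24}"
    local raw
    # 512 random bytes leave far more than 24 alphanumeric characters.
    raw="$(head -c 512 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9')"
    printf '%s\n' "${raw:0:length}"
}

# Print the GRANT statement from this article for the given password.
grant_statement() {
    local password="$1"
    printf "GRANT SELECT, INSERT, UPDATE, DELETE ON sa_bayes.* TO sa_user@localhost IDENTIFIED BY '%s';\n" "$password"
}

password="$(gen_password)"
grant_statement "$password"
```

<p>Use the same password later for <code>bayes_sql_password</code> in the SpamAssassin config file.</p>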

<h2 id="enable-bayes-modules-in-spamassassin">Enable Bayes modules in SpamAssassin</h2>
<p>Edit <code>/etc/mail/spamassassin/local.cf</code> and add (or modify) the settings below:</p>
<pre>
use_bayes 1
bayes_auto_learn 1
bayes_auto_expire 1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn DBI:mysql:sa_bayes:127.0.0.1:3306

# Store bayesian data in PostgreSQL
#bayes_store_module Mail::SpamAssassin::BayesStore::PgSQL
#bayes_sql_dsn DBI:Pg:dbname=sa_bayes;host=127.0.0.1;port=5432

bayes_sql_username sa_user
bayes_sql_password sa_user_password

# Override the username used for storing
# data in the database. This could be used to group users together to
# share bayesian filter data. You can also use this config option to
# trick sa-learn to learn data as a specific user.
bayes_sql_override_username vmail
</pre>
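<p>The MySQL value of <code>bayes_sql_dsn</code> above packs the driver, database name, host and port into one colon-separated string. As a rough sketch of how the parts map to a manual connection test, the hypothetical helper <code>dsn_to_mysql_cmd</code> below splits such a string and prints a matching <code>mysql</code> client command. It only understands the colon-separated MySQL form, and the defaults it falls back to are assumptions.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a colon-style MySQL DSN as used in
# bayes_sql_dsn above and print a matching mysql client command.
dsn_to_mysql_cmd() {
    local dsn="$1" user="$2"
    local scheme driver db host port
    IFS=: read -r scheme driver db host port <<< "$dsn"
    if [ "$driver" != "mysql" ] || [ -z "$db" ]; then
        echo "unsupported DSN: $dsn" >&2
        return 1
    fi
    # Fall back to common defaults when host or port are omitted.
    echo "mysql -h ${host:-localhost} -P ${port:-3306} -u $user -p $db"
}

dsn_to_mysql_cmd "DBI:mysql:sa_bayes:127.0.0.1:3306" "sa_user"
```

<p>Running the printed command and entering the password from <code>bayes_sql_password</code> is a quick way to confirm that the account created earlier can reach the <code>sa_bayes</code> database.</p>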

<p>Make sure SpamAssassin loads the Bayes modules:</p>
<pre>
# /etc/init.d/amavisd stop
# amavisd -c /etc/amavisd/amavisd.conf debug 2>&1 | grep -i 'bayes'
May 16 09:59:33 ... SpamAssassin loaded plugins: ..., Bayes, ...
May 16 10:27:38 ... extra modules loaded after daemonizing/chrooting:
Mail/SpamAssassin/BayesStore/MySQL.pm, Mail/SpamAssassin/BayesStore/SQL.pm, ...
</pre>

<p>Looks fine. Now press <code>Ctrl-C</code> to terminate the command above.</p>
<p>Start the Amavisd service again:</p>
<pre>
# /etc/init.d/amavisd restart
</pre>

<p>We must initialize the database by learning at least one message. We use the
sample spam email shipped in the RPM package provided by CentOS 6:</p>
<pre>
# rpm -ql spamassassin | grep 'sample-spam'
/usr/share/doc/spamassassin-3.3.1/sample-spam.txt

# sa-learn --spam --username=vmail /usr/share/doc/spamassassin-3.3.1/sample-spam.txt
Learned tokens from 1 message(s) (1 message(s) examined)
</pre>
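<p>One sample message only initializes the database; the classifier still needs both spam and ham before it starts scoring. The sketch below wraps the same <code>sa-learn</code> call so that spam and ham are always learned under the <code>vmail</code> user. <code>learn_messages</code> and the <code>DRY_RUN</code> switch are hypothetical names introduced for illustration, and the example only prints the command instead of running it.</p>

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the sa-learn call shown above.
# With DRY_RUN set, the command is printed instead of executed.
learn_messages() {
    local kind="$1"; shift
    case "$kind" in
        spam|ham) ;;
        *) echo "usage: learn_messages spam|ham PATH..." >&2; return 2 ;;
    esac
    [ "$#" -gt 0 ] || { echo "no message paths given" >&2; return 2; }
    # --username=vmail matches bayes_sql_override_username in local.cf.
    local cmd=(sa-learn "--$kind" --username=vmail "$@")
    if [ -n "${DRY_RUN:-}" ]; then
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# Preview only: print the command instead of running sa-learn.
DRY_RUN=1 learn_messages spam /usr/share/doc/spamassassin-3.3.1/sample-spam.txt
```

<p>Without <code>DRY_RUN</code>, the function runs <code>sa-learn</code> directly, for example <code>learn_messages ham</code> followed by the path of a known-good message.</p>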

<h2 id="enable-roundcube-plugin-markasjunk2">Enable Roundcube plugin: markasjunk2</h2>
<ul>
<li>
<p>We need a third-party Roundcube plugin to let webmail users report spam:
<code>Mark as Junk 2</code>. You can download it here:
<a href="https://github.com/JohnDoh/Roundcube-Plugin-Mark-as-Junk-2/releases">https://github.com/JohnDoh/Roundcube-Plugin-Mark-as-Junk-2/releases</a></p>
</li>
<li>
<p>After downloading, uncompress it and copy it to the Roundcube plugins
directory: <code>/var/www/roundcubemail/plugins/</code>. This creates a new directory:
<code>/var/www/roundcubemail/plugins/markasjunk2/</code>.</p>
</li>
<li>
<p>Enter the directory <code>/var/www/roundcubemail/plugins/markasjunk2/</code> and generate the
config file by copying the sample config file:</p>
</li>
</ul>
<pre>
# cd /var/www/roundcubemail/plugins/markasjunk2/
# cp config.inc.php.dist config.inc.php
</pre>

<ul>
<li>Edit <code>/var/www/roundcubemail/plugins/markasjunk2/config.inc.php</code> and update the settings below:</li>
</ul>
<pre>
$rcmail_config['markasjunk2_learning_driver'] = 'cmd_learn';
$rcmail_config['markasjunk2_read_spam'] = true;
$rcmail_config['markasjunk2_unread_ham'] = false;
$rcmail_config['markasjunk2_move_spam'] = true;
$rcmail_config['markasjunk2_move_ham'] = true;
$rcmail_config['markasjunk2_mb_toolbar'] = true;

$rcmail_config['markasjunk2_spam_cmd'] = 'sa-learn --spam --username=vmail %f';
$rcmail_config['markasjunk2_ham_cmd'] = 'sa-learn --ham --username=vmail %f';
</pre>

<ul>
<li>Enable this plugin in the Roundcube config file
<code>/var/www/roundcubemail/config/main.inc.php</code> by appending <code>markasjunk2</code>
to the plugin list:</li>
</ul>
<pre>
$rcmail_config['plugins'] = array(..., "markasjunk2");
</pre>

<ul>
<li>The learning driver <code>cmd_learn</code> requires the PHP function <code>exec</code>, so we have to
remove it from the <code>disable_functions</code> parameter in the PHP config file <code>/etc/php.ini</code>:</li>
</ul>
<pre>
; OLD SETTING
; disable_functions =show_source,system,shell_exec,passthru,exec,phpinfo,proc_open ;

; NEW SETTING. exec is removed.
disable_functions =show_source,system,shell_exec,passthru,phpinfo,proc_open ;
</pre>
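<p>Editing this line by hand is easy to get wrong, because <code>shell_exec</code> contains the same text as <code>exec</code> and must stay disabled. The sketch below shows one way to drop only the exact entry <code>exec</code>. <code>remove_exec</code> is a hypothetical helper name; it writes the edited content to standard output and leaves the input untouched, so the result can be reviewed before it replaces <code>/etc/php.ini</code>.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: print php.ini content with the exact entry "exec"
# removed from disable_functions. Reads the given file, or stdin if none.
remove_exec() {
    awk '
        /^[ \t]*disable_functions[ \t]*=/ {
            value = $0
            sub(/^[^=]*=/, "", value)   # keep the text after "="
            sub(/;.*$/, "", value)      # drop a trailing ";" comment
            n = split(value, names, ",")
            out = ""
            for (i = 1; i <= n; i++) {
                gsub(/^[ \t]+|[ \t]+$/, "", names[i])
                # Compare whole names, so shell_exec is kept.
                if (names[i] != "" && names[i] != "exec")
                    out = out (out == "" ? "" : ",") names[i]
            }
            print "disable_functions = " out
            next
        }
        { print }
    ' "$@"
}

# Demo on the old setting from this article.
printf '%s\n' 'disable_functions =show_source,system,shell_exec,passthru,exec,phpinfo,proc_open ;' | remove_exec
```

<p>On a real server the call would be <code>remove_exec /etc/php.ini</code>, redirected to a temporary file that is compared with the original before it is copied into place.</p>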

<ul>
<li>Restart the Apache web server.</li>
</ul>
<p>You will see a new toolbar button after logging into Roundcube webmail:</p>
<p><img alt="" src="../images/Markasjunk2_toolbar_button.png" /></p>
<p>Check the SQL database <code>sa_bayes</code> before testing this plugin:</p>
<pre>
# mysql -uroot -p
mysql> USE sa_bayes;
mysql> SELECT COUNT(*) FROM bayes_token;
+----------+
| count(*) |
+----------+
|       65 |
+----------+
</pre>

<p>Back in Roundcube webmail, select a spam email (or a test email) and click the
<code>Mark as Junk</code> button. The email is then passed to the <code>sa-learn</code> command.
Check the database <code>sa_bayes</code> again to make sure it's working:</p>
<pre>
# mysql -uroot -p
mysql> USE sa_bayes;
mysql> SELECT COUNT(*) FROM bayes_token;
+----------+
| count(*) |
+----------+
|      143 |
+----------+
</pre>

<p>Note: Your token counts may differ from the ones shown above.</p>
<p>So far so good. That's all we need to do.</p>
<h2 id="references">References</h2>
<ul>
<li><a href="http://wiki.apache.org/spamassassin/BayesInSpamAssassin">Bayes Introduction</a>. Please do read the section <code>Things to remember</code>.</li>
<li><a href="http://wiki.apache.org/spamassassin/BayesFaq">SpamAssassin Bayes Frequently Asked Questions</a></li>
</ul></body></html>