# Store SpamAssassin bayes in SQL

[TOC]

__THIS ARTICLE IS STILL A DRAFT. DO NOT APPLY IT ON A PRODUCTION SERVER.__

## Summary

This article shows how to configure the related components to store
SpamAssassin Bayes data in a SQL server, and to allow webmail users to report
spam with one click.

Tested with:

* iRedMail-0.8.0, iRedMail-0.8.7.
* CentOS 6.2 (x86_64)
* SpamAssassin-3.3.1
* Amavisd-new-2.6.6
* MySQL-5.1.61
* Roundcubemail-0.7.2

Notes:

* This article should work with all iRedMail releases. We use iRedMail-0.8.0 as the example.
* This article should work with all backends: OpenLDAP, MySQL, MariaDB, PostgreSQL. We use the MySQL backend as the example.
* This article should work with Amavisd-new-2.6.0 and later versions.

__IMPORTANT NOTE__:

* The Bayesian classifier can only score new messages if it already has 200
known spams and 200 known hams.
* If SpamAssassin fails to identify a spam, teach it so it can do better next
time, e.g. mark it as spam in Roundcube webmail.
* Read the `References` section at the end of this article before asking questions.

## Create required SQL database used to store bayes data

We need to create a SQL database and the necessary tables to store SpamAssassin
bayes data. The RPM package installed on CentOS 6 doesn't ship the SQL template
for the bayes database, so we have to download it from the Apache web site. We're
running SpamAssassin-3.3.1, so what we need is this SQL template file:
[http://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_release_3_3_1/sql/bayes_mysql.sql](http://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_release_3_3_1/sql/bayes_mysql.sql)

If you're running a different version, please find the proper SQL file here:
[http://svn.apache.org/repos/asf/spamassassin/tags/](http://svn.apache.org/repos/asf/spamassassin/tags/).

```
# cd /root/
# wget http://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_release_3_3_1/sql/bayes_mysql.sql
```

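For other SpamAssassin versions, the download URL can be derived from the version number. The sketch below assumes that every release tag follows the same naming pattern as the 3.3.1 tag above (dots replaced by underscores). The helper name `sa_template_url` is made up for this example, so check the tags directory if the download fails.

```shell
# Print the URL of bayes_mysql.sql for the SpamAssassin version given as $1.
# Assumption: all release tags are named like the 3.3.1 tag,
# i.e. spamassassin_release_<major>_<minor>_<patch>.
sa_template_url() {
    version="$1"
    tag="spamassassin_release_$(printf '%s' "$version" | tr '.' '_')"
    printf 'http://svn.apache.org/repos/asf/spamassassin/tags/%s/sql/bayes_mysql.sql\n' "$tag"
}

# Example:
#   wget "$(sa_template_url 3.3.1)"
```
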
Create the MySQL database and import the SQL template file:

```
# mysql -uroot -p
mysql> CREATE DATABASE sa_bayes;
mysql> USE sa_bayes;
mysql> SOURCE /root/bayes_mysql.sql;
```

Create a new MySQL user (with password `sa_user_password`) and grant
permissions. __IMPORTANT NOTE__: Please replace the password `sa_user_password`
with your own password.

```
mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON sa_bayes.* TO sa_user@localhost IDENTIFIED BY 'sa_user_password';
mysql> FLUSH PRIVILEGES;
```

## Enable Bayes modules in SpamAssassin

Edit `/etc/mail/spamassassin/local.cf` and add (or modify) the settings below:

```
use_bayes 1
bayes_auto_learn 1
bayes_auto_expire 1

# Store bayesian data in MySQL
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn DBI:mysql:sa_bayes:127.0.0.1:3306

# Store bayesian data in PostgreSQL
#bayes_store_module Mail::SpamAssassin::BayesStore::PgSQL
#bayes_sql_dsn DBI:Pg:dbname=sa_bayes;host=127.0.0.1;port=5432

bayes_sql_username sa_user
bayes_sql_password sa_user_password

# Override the username used for storing data in the database.
# This could be used to group users together to share bayesian filter data.
# You can also use this config option to trick sa-learn to learn data as a
# specific user.
#
# In iRedMail, SpamAssassin is called by Amavisd, so we must set it to be
# the same as the Amavisd daemon user:
# - on Linux, it's user `amavis`.
# - on FreeBSD, it's user `vscan`.
# - on OpenBSD, it's user `_vscan`.
bayes_sql_override_username amavis
```

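The value of `bayes_sql_override_username` depends on the operating system, as listed in the comments above. A minimal sketch of that mapping follows. The function name `amavisd_user` is hypothetical, and the kernel name is taken from `uname -s` unless one is passed as an argument.

```shell
# Print the Amavisd daemon user for the given (or current) operating system,
# following the list in the local.cf comments above.
amavisd_user() {
    os="${1:-$(uname -s)}"
    case "$os" in
        Linux)   echo 'amavis' ;;
        FreeBSD) echo 'vscan' ;;
        OpenBSD) echo '_vscan' ;;
        *)       echo "unknown platform: $os" >&2; return 1 ;;
    esac
}

# Example: print the matching local.cf line for this host.
#   echo "bayes_sql_override_username $(amavisd_user)"
```
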
Make sure SpamAssassin loads the bayes modules:

```
# /etc/init.d/amavisd stop
# amavisd -c /etc/amavisd/amavisd.conf debug 2>&1 | grep -i 'bayes'
May 16 09:59:33 ... SpamAssassin loaded plugins: ..., Bayes, ...
May 16 10:27:38 ... extra modules loaded after daemonizing/chrooting:
Mail/SpamAssassin/BayesStore/MySQL.pm, Mail/SpamAssassin/BayesStore/SQL.pm, ...
```

If the output looks like the above, press `Ctrl-C` to terminate the command,
then start the Amavisd service again normally:

```
# /etc/init.d/amavisd restart
```

The database must be initialized by learning a message. We use the
sample spam email shipped in the RPM package provided by CentOS 6:

```
# rpm -ql spamassassin | grep 'sample-spam'
/usr/share/doc/spamassassin-3.3.1/sample-spam.txt

# sa-learn --spam --username=vmail /usr/share/doc/spamassassin-3.3.1/sample-spam.txt
Learned tokens from 1 message(s) (1 message(s) examined)
```

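As stated in the note at the top of this article, Bayes scoring only starts once 200 spams and 200 hams have been learned. `sa-learn --dump magic` prints the current counters as `nspam` and `nham`, and the sketch below parses that output from standard input. The function name `bayes_ready` is hypothetical, and it assumes the counter sits in the third column, as in the stock output format.

```shell
# Read `sa-learn --dump magic` output on stdin, print the learned message
# counts, and succeed only if both reach the 200-message threshold.
bayes_ready() {
    awk '
        $NF == "nspam" { nspam = $3 + 0 }
        $NF == "nham"  { nham  = $3 + 0 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            exit !(nspam >= 200 && nham >= 200)
        }
    '
}

# Example:
#   sa-learn --dump magic | bayes_ready && echo 'Bayes scoring is active'
```
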
## Enable Roundcube plugin: markasjunk2

* We need a third-party Roundcube plugin to allow webmail users to report spam:
`Mark as Junk 2`. You can download it here:
[https://github.com/JohnDoh/Roundcube-Plugin-Mark-as-Junk-2/releases](https://github.com/JohnDoh/Roundcube-Plugin-Mark-as-Junk-2/releases)

* After downloading, uncompress it and copy it to the Roundcube plugins
directory: `/var/www/roundcubemail/plugins/`. This gives us a new directory:
`/var/www/roundcubemail/plugins/markasjunk2/`.

* Enter the directory `/var/www/roundcubemail/plugins/markasjunk2/` and generate
the config file by copying its sample config file:

```
# cd /var/www/roundcubemail/plugins/markasjunk2/
# cp config.inc.php.dist config.inc.php
```

* Edit `/var/www/roundcubemail/plugins/markasjunk2/config.inc.php` and update the settings below:

```
$rcmail_config['markasjunk2_learning_driver'] = 'cmd_learn';
$rcmail_config['markasjunk2_read_spam'] = true;
$rcmail_config['markasjunk2_unread_ham'] = false;
$rcmail_config['markasjunk2_move_spam'] = true;
$rcmail_config['markasjunk2_move_ham'] = true;
$rcmail_config['markasjunk2_mb_toolbar'] = true;

$rcmail_config['markasjunk2_spam_cmd'] = 'sa-learn --spam --username=vmail %f';
$rcmail_config['markasjunk2_ham_cmd'] = 'sa-learn --ham --username=vmail %f';
```

* Enable this plugin in the Roundcube config file
`/var/www/roundcubemail/config/main.inc.php` by appending `markasjunk2`
to the plugin list:

```
$rcmail_config['plugins'] = array(..., "markasjunk2");
```

* Learning driver `cmd_learn` requires the PHP function `exec`, so we have to
remove it from the `disable_functions` parameter in the PHP config file `/etc/php.ini`:

```
; OLD SETTING
; disable_functions =show_source,system,shell_exec,passthru,exec,phpinfo,proc_open ;

; NEW SETTING. exec is removed.
disable_functions =show_source,system,shell_exec,passthru,phpinfo,proc_open ;
```

* Restart the Apache web server.

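One way to confirm that `exec` is no longer blocked is to inspect the effective `disable_functions` value. The sketch below only does the string check. `exec_is_disabled` is a hypothetical helper, and the example call assumes the command-line `php` binary reads the same `/etc/php.ini` as Apache.

```shell
# Succeed (exit status 0) if `exec` is listed in the given disable_functions
# value. Matching whole names keeps `shell_exec` from counting as `exec`.
exec_is_disabled() {
    printf '%s\n' "$1" | tr -d ' ;' | tr ',' '\n' | grep -qx 'exec'
}

# Example (assumption: CLI php uses the same php.ini as the web server):
#   exec_is_disabled "$(php -r 'echo ini_get("disable_functions");')" \
#       && echo 'exec is still disabled'
```
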
You will see a new toolbar button after logging into Roundcube webmail:

![](./images/markasjunk2_toolbar_button.png)

Check the SQL database `sa_bayes` before testing this plugin:

```
# mysql -uroot -p
mysql> USE sa_bayes;
mysql> SELECT COUNT(*) FROM bayes_token;
+----------+
| count(*) |
+----------+
|       65 |
+----------+
```

Back in Roundcube webmail, select a spam email (or a test email) and click the
`Mark as Junk` button. The email will then be processed by the `sa-learn` command.
Check the database `sa_bayes` again to make sure it's working:

```
# mysql -uroot -p
mysql> USE sa_bayes;
mysql> SELECT COUNT(*) FROM bayes_token;
+----------+
| count(*) |
+----------+
|      143 |
+----------+
```

Note: Your numbers may differ from those shown above.

So far so good. That's all we need to do.

## References

* [Bayes Introduction](http://wiki.apache.org/spamassassin/BayesInSpamAssassin). Please do read the section `Things to remember`.
* [SpamAssassin Bayes Frequently Asked Questions](http://wiki.apache.org/spamassassin/BayesFaq)