This is an alpha release of the new version of the Three Strikes Spam Filter plugin for WordPress. It includes various bug fixes and a hidden gem that I have wanted to work with for some time.
This version adds a naive Bayesian filter to the checks that the Three Strikes plugin already performed. Once installed, the plugin learns from all the comments that are posted on your blog. It then waits for spam: once a comment is determined to be spam, the Bayes filter learns from that comment as well. Kitten’s Spam Words has been added to the interface to simplify adding spam words, and the install instructions are included in the download zip. I suggest using my dynamic list of spam words as a starting point for the plugin.
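For readers curious how this kind of learning works, here is a minimal sketch of a naive Bayes comment classifier. It is written in Python for brevity and is not the plugin's PHP code. The class name, method names, tokenizer, and add-one smoothing are all assumptions made for illustration, and the real plugin may count and score words differently.

```python
import math
import re
from collections import Counter


class NaiveBayesCommentFilter:
    """Toy two-class naive Bayes classifier (hypothetical, for illustration)."""

    def __init__(self):
        # Per-class word frequencies and number of comments seen.
        self.word_counts = {"ham": Counter(), "spam": Counter()}
        self.doc_counts = {"ham": 0, "spam": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase and split into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        # label must be "ham" (legitimate comment) or "spam".
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def _log_score(self, tokens, label):
        total_docs = sum(self.doc_counts.values())
        vocab = set(self.word_counts["ham"]) | set(self.word_counts["spam"])
        total_words = sum(self.word_counts[label].values())
        # Smoothed log prior, so an untrained class never yields log(0).
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        for tok in tokens:
            count = self.word_counts[label][tok]
            # Add-one smoothing: unseen words get a small nonzero probability.
            score += math.log((count + 1) / (total_words + len(vocab) + 1))
        return score

    def is_spam(self, text):
        tokens = self.tokenize(text)
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")


# Learn from existing comments first, then from comments flagged as spam.
spam_filter = NaiveBayesCommentFilter()
spam_filter.train("Great post, thanks for sharing the plugin", "ham")
spam_filter.train("I had trouble with the install instructions", "ham")
spam_filter.train("cheap pills online casino buy now", "spam")
spam_filter.train("buy cheap casino chips online now", "spam")

print(spam_filter.is_spam("buy cheap pills now"))
print(spam_filter.is_spam("thanks for the plugin"))
```

With so little training data the scores are fragile, which is the same reason the filter on a real blog needs time and a steady supply of comments before it becomes reliable.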
This is termed alpha software for a reason. It is development code and is barely tested. There will be bugs and some things will break. Also, the Bayes filter is only as good as the amount it has learned from your blog and its comments, so it will get much better with time, but it will need that time. I have included a TODO inside the plugin for those interested in the code, and I welcome suggestions and bug fixes. This is released under the GPL.
Link to generate dynamic list of Common Spam Words.
EDIT: The download now includes trackback checking.
Download link isn’t working.
Sorry about that, try it now.
A couple of problems: the newly created tables do not have the $table_prefix in front of them. Also, in trainfiltergood.php this line is hardcoded and shouldn’t be: $query = "select * from 12_comments order by comment_ID desc";
Good catches. I have fixed the table prefix in trainfiltergood.php.
The naive bayesian link needs fixing.
Neato work!
Mark: Since you included Kitten’s SpamWords, I’ve found that I have to deactivate her plugin to get Three Strikes to work. I assume that this is a bug, not a feature. 🙂
Yes, it is a bug. I will try to get that into the readme tonight. Thanks for the catch, Geoff!