Trac Spam Filtering
A plugin is being developed that will allow different ways to reject contributions that contain spam. This plugin requires Trac release 0.10. It should also work with 0.11dev now that #4611 is fixed.
Supported Filtering Strategies
Prior to version 0.2.x, the SpamFilter plugin would reject a submission if any single filter strategy said it was spam. Since 0.2, the individual strategies assign scores (“karma”) to submitted content, and the total karma determines whether a submission is rejected or not.
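The karma model can be illustrated with a small sketch. This is not the plugin's actual code: the two strategy functions, their scores, and the `min_karma` threshold are assumptions invented for the example.

```python
# Hypothetical sketch of karma-based scoring; names and values are illustrative only.

def total_karma(content, strategies):
    """Sum the karma each strategy assigns to the content."""
    return sum(strategy(content) for strategy in strategies)


def is_rejected(content, strategies, min_karma=0):
    """Reject when the combined karma falls below the (assumed) threshold."""
    return total_karma(content, strategies) < min_karma


# Two made-up strategies: negative karma means "looks like spam".
def link_count_strategy(content):
    return -3 if content.count("http://") > 2 else 0


def length_strategy(content):
    return 1 if len(content) > 20 else 0
```

With this approach a single suspicious signal no longer forces a rejection; positive karma from one strategy can offset negative karma from another.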
Regular Expressions
The regex filter reads a list of regular expressions from a wiki page named “BadContent”, each regular expression being on a separate line inside the first code block on the page, using the Python syntax for regular expressions.
If any of those regular expressions matches the submitted content, the submission will be rejected.
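As a rough sketch of the idea (not the plugin's implementation), the following reads patterns from the first `{{{ ... }}}` code block of some wiki text and checks content against them. The function names are hypothetical, and using `search` rather than `match` is an assumption.

```python
import re

# Matches the first {{{ ... }}} code block in Trac wiki text.
CODE_BLOCK = re.compile(r"\{\{\{(.*?)\}\}\}", re.DOTALL)


def load_patterns(page_text):
    """Compile one regex per non-empty line of the first code block."""
    match = CODE_BLOCK.search(page_text)
    if match is None:
        return []
    lines = (line.strip() for line in match.group(1).splitlines())
    return [re.compile(line) for line in lines if line]


def matches_bad_content(content, patterns):
    """Return True if any pattern matches anywhere in the content."""
    return any(p.search(content) for p in patterns)
```

Only the first code block is read, so later blocks on the page can hold notes or examples without affecting filtering.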
IP Blacklisting
The ip_blacklist filter uses the third-party Python library dnspython to make DNS requests to a configurable list of IP blacklist servers.
Note: For the SpamFilter to detect dnspython it needs to be installed via "setuptools". Install "setuptools" based on the Trac plugin requirements, then you can run "easy_install dnspython" to automatically download and install the package.
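For readers unfamiliar with DNS blacklists, the lookup convention can be sketched as follows: the octets of the IPv4 address are reversed and prepended to the blacklist zone, and a successful A-record lookup means the address is listed. The plugin itself uses dnspython; this sketch uses the standard library's `socket` module so that it runs without extra dependencies, and the function names and server names are placeholders.

```python
import socket


def dnsbl_query_name(ip, server):
    """Build the DNSBL lookup name: reversed IPv4 octets + blacklist zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address: %r" % ip)
    return ".".join(reversed(octets)) + "." + server


def is_blacklisted(ip, servers, resolve=socket.gethostbyname):
    """Return True if any server has an A record for the reversed IP."""
    for server in servers:
        try:
            resolve(dnsbl_query_name(ip, server))
        except socket.gaierror:
            continue  # no record: this server does not list the address
        return True
    return False
```

The `resolve` parameter is injectable so the logic can be exercised without network access.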
IP Throttling
The ip_throttle filter limits the number of posts per hour allowed from a single IP.
The maximum number of posts per hour is configured in trac.ini:
[spam-filter]
max_posts_by_ip = 5
When this limit is exceeded, the filter starts giving submissions negative karma, as specified by the ip_throttle_karma option (since version 0.2).
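A minimal sketch of hourly throttling, assuming an in-memory record of post times per IP. The class name, the in-memory storage, and the default karma value of 3 are assumptions for illustration; only the option names `max_posts_by_ip` and `ip_throttle_karma` come from the text above.

```python
import time
from collections import defaultdict, deque


class IPThrottle:
    """Track post times per IP and penalise IPs that exceed the hourly limit."""

    def __init__(self, max_posts_by_ip=5, ip_throttle_karma=3, window=3600):
        self.max_posts = max_posts_by_ip
        self.karma = ip_throttle_karma
        self.window = window  # seconds; one hour
        self.posts = defaultdict(deque)

    def score(self, ip, now=None):
        """Record a post and return the karma adjustment (0 or negative)."""
        now = time.time() if now is None else now
        times = self.posts[ip]
        # Drop posts that fell out of the one-hour window.
        while times and now - times[0] >= self.window:
            times.popleft()
        times.append(now)
        return -self.karma if len(times) > self.max_posts else 0
```

Passing `now` explicitly makes the behaviour easy to check without waiting an hour.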
Akismet
The akismet filter uses the Akismet web service to check content for possible spam.
The use of this filter requires a WordPress API key. The API key is configured in trac.ini; the section and option name depend on the plugin version.
For version 0.1.x of the plugin:
[akismet]
api_key = 1234567890
For version 0.2.x:
[spam-filter]
akismet_api_key = 1234567890
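If an external script needs to read the key from a trac.ini written for either plugin version, something like the following could work. The helper name is hypothetical and the plugin itself reads its settings through Trac's own configuration API, so treat this only as a sketch of the two layouts.

```python
from configparser import ConfigParser


def akismet_api_key(config):
    """Return the Akismet key from either documented location, or None."""
    # 0.2.x layout: [spam-filter] akismet_api_key
    if config.has_option("spam-filter", "akismet_api_key"):
        return config.get("spam-filter", "akismet_api_key")
    # 0.1.x layout: [akismet] api_key
    if config.has_option("akismet", "api_key"):
        return config.get("akismet", "api_key")
    return None
```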
Bayes
TODO
The code in svn uses SpamBayes, which is a logical choice. It would make sense, however, to use a custom tokenizer rather than the email-centric one that is included with SpamBayes. The bigger issue is that some form of training is required. For example, the API could be extended so that (optionally) authenticated users and the other filters could report contributions as spam, with automatic training assuming that everything else is ham; however, this is a complex change. An alternative would be a script, executed periodically, that trains all existing contributions as ham and gathers spam from an appropriate source. If you decide to continue with this in the future, please don't hesitate to ask spambayes-dev for help.
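To make the custom tokenizer idea concrete, here is one possible shape for a wiki-oriented tokenizer. It is purely illustrative: it does not use or mirror the SpamBayes API, and the token format (lowercased words plus a `url:<host>` token per link) is an assumption.

```python
import re

URL = re.compile(r"https?://([^/\s]+)\S*")
WORD = re.compile(r"[A-Za-z0-9']{3,}")


def tokenize(text):
    """Yield lowercase word tokens plus a 'url:<host>' token per link."""
    for match in URL.finditer(text):
        yield "url:" + match.group(1).lower()
    # Remove URLs so their fragments are not also counted as words.
    for word in WORD.findall(URL.sub(" ", text)):
        yield word.lower()
```

Treating link hosts as their own tokens is a design choice that suits wiki and ticket spam, where the advertised domain is often the strongest signal.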
WebAdmin Integration
Since version 0.2, the SpamFilter plugin provides integration with WebAdmin for configuration, monitoring, and training. For monitoring and training purposes, it optionally logs all activity to a table in the database. Upgrading the environment is necessary to install the database table required for this logging.
Get the Plugin
See the Trac plugin requirements for instructions on installing setuptools. Setuptools includes the easy_install application, which you can use to install the SpamFilter:
easy_install TracSpamFilter
You can also obtain the code from the Trac Subversion repository:
svn co http://svn.edgewall.com/repos/trac/sandbox/spam-filter
See TracPlugins for instructions on building and installing plugins.
You can browse the source in Trac.
Recommended versions:
Trac | Spam Filter |
0.10 | latest |
0.11dev | latest |
This is a link for setuptools to find the SVN download
Enabling the Plugin
If you install the plugin globally (as described here), you'll also need to enable it in trac.ini as follows:
[components]
tracspamfilter.* = enabled
Further Reading
- More info about SpamFilter (and screenshots): Managing Trac Spam
- An alternate solution based on mod_security: Fighting Trac Spam.
Comments
Why not approach spam filtering by requiring anonymous users to compute a simple arithmetic riddle or to echo the text of a random string written into a dynamically generated gif?
—anonymous
… which is precisely what the spam-filter-captcha variant of the plugin does (but I haven't tried myself, so I can't comment further - your take).
—cboos
See also: TracPlugins, PluginList