= Trac Spam Filtering =
[[PageOutline(2-3)]]

A plugin is being developed that provides several ways to reject contributions that contain spam.

This plugin requires Trac release [milestone:0.10]. It should also work with [milestone:0.11]dev now that #4611 is fixed.

== Supported Filtering Strategies ==

Prior to version 0.2, the SpamFilter plugin rejected a submission if any single filter strategy said it was spam. Since 0.2, the individual strategies assign scores (“karma”) to submitted content, and the total karma determines whether a submission is rejected or not.

=== Regular Expressions ===

The [source:sandbox/spam-filter/tracspamfilter/filters/regex.py regex] filter reads a list of regular expressions from a wiki page named “BadContent”. Each regular expression is on a separate line inside the first code block on the page and uses the [http://docs.python.org/lib/re-syntax.html Python syntax] for regular expressions. If any of those regular expressions matches the submitted content, the submission will be rejected.

=== IP Blacklisting ===

The [source:sandbox/spam-filter/tracspamfilter/filters/ip_blacklist.py ip_blacklist] filter uses the third-party Python library [http://www.dnspython.org/ dnspython] to make DNS requests to a configurable list of IP blacklist servers.

'''Note:''' For the SpamFilter to detect [http://www.dnspython.org/ dnspython], it needs to be installed via "setuptools". Install "setuptools" following the [wiki:TracPlugins#Requirements Trac plugin requirements]; you can then run "easy_install dnspython" to automatically download and install the package.

=== IP Throttling ===

The [source:sandbox/spam-filter/tracspamfilter/filters/ip_throttle.py ip_throttle] filter limits the number of posts per hour allowed from a single IP.
The maximum number of posts per hour is configured in [wiki:TracIni trac.ini]:
{{{
[spam-filter]
max_posts_by_ip = 5
}}}

When this limit is exceeded, the filter starts giving submissions negative karma as specified by the `ip_throttle_karma` option. ''(since version 0.2)''

=== Akismet ===

The [source:sandbox/spam-filter/tracspamfilter/filters/akismet.py akismet] filter uses the [http://akismet.com/ Akismet] web service to check content for possible spam. The use of this filter requires a [http://www.wordpress.com WordPress] API key. The API key is configured in [wiki:TracIni trac.ini].

For version 0.1.x of the plugin, it goes in a separate section:
{{{
[akismet]
api_key = 1234567890
}}}

For version 0.2.x:
{{{
[spam-filter]
akismet_api_key = 1234567890
}}}

=== Bayes ===

''TODO''

> The code in svn uses [http://spambayes.org SpamBayes], which is a logical choice. It would make sense to use a custom tokenizer, however, rather than the email-centric one that is included with [http://spambayes.org SpamBayes]. The bigger issue is that some form of training is required. For example, the API could be extended so that (optionally) authenticated users and the other filters could report contributions as spam, with automatic training assuming that everything else is ham; however, this is a complex change. An alternative would be a script, executed periodically, that trains all existing contributions as ham and gathers spam from an appropriate source. If you decide to continue with this in the future, please don't hesitate to ask [mailto:spambayes-dev@python.org spambayes-dev] for help.

== WebAdmin Integration ==

Since version 0.2, the SpamFilter plugin provides integration with WebAdmin for configuration, monitoring, and training. For monitoring and training purposes, it optionally logs all activity to a table in the database. Upgrading the environment is necessary to install the database table required for this logging.

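As a rough illustration of the karma model described under Supported Filtering Strategies, the following Python sketch shows how two strategies might each contribute a score that is then summed. This is not the plugin's actual code: the function and class names, the `min_karma` threshold, the default karma values, and the one-hour sliding window are all assumptions made for the example. Only the option names `max_posts_by_ip` and `ip_throttle_karma` come from the text above.

```python
import re
import time


def regex_karma(content, patterns, karma=5):
    """Hypothetical regex strategy: return -karma if any pattern matches."""
    for pattern in patterns:
        if re.search(pattern, content):
            return -karma
    return 0


class IPThrottle:
    """Hypothetical throttle strategy that counts posts per IP in a time window."""

    def __init__(self, max_posts_by_ip=5, ip_throttle_karma=3, window=3600):
        self.max_posts_by_ip = max_posts_by_ip
        self.ip_throttle_karma = ip_throttle_karma
        self.window = window  # seconds; one hour is an assumption
        self._posts = {}      # ip -> timestamps of recent posts

    def karma(self, ip, now=None):
        """Record a post from `ip` and return its karma contribution."""
        now = time.time() if now is None else now
        # Keep only posts that are still inside the window, then add this one.
        recent = [t for t in self._posts.get(ip, []) if now - t < self.window]
        recent.append(now)
        self._posts[ip] = recent
        if len(recent) > self.max_posts_by_ip:
            return -self.ip_throttle_karma
        return 0


def is_rejected(scores, min_karma=0):
    """Reject when the total karma falls below the (assumed) threshold."""
    return sum(scores) < min_karma
```

With these assumptions, a submission whose content matches a “BadContent” pattern would receive negative karma from `regex_karma`, and `is_rejected` would reject it unless other strategies contributed enough positive karma to offset it.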
== Get the Plugin ==

See the [wiki:TracPlugins#Requirements Trac plugin requirements] for instructions on installing `setuptools`.

`Setuptools` includes the `easy_install` application which you can use to install the SpamFilter:
{{{
easy_install TracSpamFilter
}}}

You can also obtain the code from the Trac Subversion repository:
{{{
svn co http://svn.edgewall.com/repos/trac/sandbox/spam-filter
}}}

See TracPlugins for instructions on building and installing plugins.

You can [source:sandbox/spam-filter browse the source in Trac].

Recommended versions:
|| Trac || Spam Filter ||
|| [milestone:0.10] || latest ||
|| [milestone:0.11]dev || latest ||

''[http://svn.edgewall.com/repos/trac/sandbox/spam-filter#egg=TracSpamFilter-dev This is a link for setuptools to find the SVN download]''

== Enabling the Plugin ==

If you install the plugin globally (as described [wiki:TracPlugins#ForAllProjects here]), you'll also need to enable it in [wiki:TracIni trac.ini] as follows:
{{{
[components]
tracspamfilter.* = enabled
}}}

== Further Reading ==

 * More info about SpamFilter (and screenshots): [http://www.cmlenz.net/blog/2006/11/managing_trac_s.html Managing Trac Spam]
 * An alternate solution based on mod_security: [http://madwifi.org/wiki/FightingTracSpam Fighting Trac Spam].

== Comments ==

''Why not approach spam filtering by requiring anonymous users to compute a simple arithmetic riddle or to echo the text of a random string written into a dynamically generated gif?'' [[br]]--anonymous

... which is precisely what the [source:sandbox/spam-filter-captcha spam-filter-captcha] variant of the plugin does (but I haven't tried myself, so I can't comment further - your take). [[br]]--cboos

''Will there be a way to use reCAPTCHA instead of akismet?'' [[br]]--other anonymous

----
See also: TracPlugins, PluginList