= Trac Spam Filtering =

[[PageOutline(2-3)]]

This plugin provides several ways to reject contributions that contain spam.

This plugin requires Trac release [milestone:0.11] or [milestone:0.12].

The spamfilter plugin has many settings, but most of them are optional. Installing the plugin is enough to get basic spam protection. The following steps can improve it further (in order of importance):
 * Train the Bayes database (using the entries of the log) to activate that filter and reach good performance
 * Set up a !BadContent page containing regular expressions to filter
 * Get API keys for Akismet, !TypePad and/or HTTP:BL to use external services
 * Activate the captcha rejection handler to improve user treatment
 * Fine-tune the karma settings and parameters for your system (e.g. you may increase the karma for a well-trained Bayes filter or stop trusting registered users)

== Supported Filtering Strategies ==

The individual strategies assign scores (“karma”) to submitted content, and the total karma determines whether a submission is rejected or not.

=== Regular Expressions ===

The [source:plugins/0.12/spam-filter-captcha/tracspamfilter/filters/regex.py regex] filter reads a list of regular expressions from a wiki page named “BadContent”. Each regular expression is on a separate line inside the first code block on the page and uses the [http://docs.python.org/lib/re-syntax.html Python syntax] for regular expressions. If any of those regular expressions matches the submitted content, the submission will be rejected.

=== IP Blacklisting ===

The [source:plugins/0.12/spam-filter-captcha/tracspamfilter/filters/ip_blacklist.py ip_blacklist] filter uses the third-party Python library [http://www.dnspython.org/ dnspython] to make DNS requests to a configurable list of IP blacklist servers.

=== IP Throttling ===

The [source:plugins/0.12/spam-filter-captcha/tracspamfilter/filters/ip_throttle.py ip_throttle] filter limits the number of posts per hour allowed from a single IP.
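
As a rough illustration of per-IP throttling, the following sketch counts submissions from each IP within a sliding one-hour window and returns a negative karma value once the allowed number is exceeded. It is not the plugin's actual code: the class name `IPThrottle`, its parameters, and the penalty value are hypothetical and chosen only for this example.
{{{
#!python
import time
from collections import defaultdict, deque


class IPThrottle:
    """Toy per-IP throttle (illustrative only, not the plugin's implementation)."""

    def __init__(self, max_posts_per_hour=5, karma_penalty=3, window=3600.0):
        # All three names are assumptions made for this sketch.
        self.max_posts = max_posts_per_hour
        self.karma_penalty = karma_penalty
        self.window = window
        self._posts = defaultdict(deque)  # ip -> timestamps of recent posts

    def check(self, ip, now=None):
        """Record one submission from `ip` and return its karma adjustment."""
        if now is None:
            now = time.time()
        posts = self._posts[ip]
        # Forget submissions that have fallen out of the time window.
        cutoff = now - self.window
        while posts and posts[0] <= cutoff:
            posts.popleft()
        posts.append(now)
        # Penalize only once the limit has been exceeded.
        if len(posts) > self.max_posts:
            return -self.karma_penalty
        return 0


throttle = IPThrottle(max_posts_per_hour=2)
for second in range(4):
    # Prints 0, 0, -3, -3: the third and fourth posts exceed the limit of 2.
    print(throttle.check("192.0.2.1", now=float(second)))
}}}
Returning a karma adjustment instead of a hard rejection matches the scoring model described above, where the total karma of all strategies decides the outcome.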

The maximum number of posts per hour is configured in [wiki:TracIni trac.ini]:
{{{
[spam-filter]
max_posts_by_ip = 5
}}}

When this limit is exceeded, the filter starts giving submissions negative karma as specified by the `ip_throttle_karma` option.

=== Akismet ===

The [source:plugins/0.12/spam-filter-captcha/tracspamfilter/filters/akismet.py akismet] filter uses the [http://akismet.com/ Akismet] web service to check content for possible spam.

The use of this filter requires a [http://www.wordpress.com WordPress] API key. The API key is configured in the `[spam-filter]` section of [wiki:TracIni trac.ini]:
{{{
[spam-filter]
akismet_api_key = 1234567890
}}}

=== !TypePad ===

The [source:plugins/0.12/spam-filter-captcha/tracspamfilter/filters/typepad.py TypePad AntiSpam] filter uses the [http://antispam.typepad.com/ TypePad] web service to check content for possible spam.

The use of this filter requires an API key. The API key is configured in the `[spam-filter]` section of [wiki:TracIni trac.ini]:
{{{
[spam-filter]
typepad_api_key = 1234567890
}}}

=== HTTP:BL ===

The [source:plugins/0.12/spam-filter-captcha/tracspamfilter/filters/httpbl.py HTTP:BL] filter uses the [http://www.projecthoneypot.org/httpbl.php Project HoneyPot HTTP:BL] web service to check content for possible spam.

The use of this filter requires an [http://www.projecthoneypot.org/httpbl_configure.php HTTP:BL] API key. The API key is configured in the `[spam-filter]` section of [wiki:TracIni trac.ini]:
{{{
[spam-filter]
httpbl_api_key = abcdefghijkl
}}}

=== Bayes ===

''TODO''

> (The code in svn uses [http://spambayes.org SpamBayes], which is a logical choice. It would make sense to use a custom tokenizer, however, rather than the email-centric one that is included with [http://spambayes.org SpamBayes]. The bigger issue is that some form of training is required (e.g.
the API could be extended so that (optionally) authenticated users (and the other filters) could report contributions as spam (using automatic training to assume that everything else is ham); however, this is a complex change). An alternative would be a script, executed periodically, that trains all existing contributions as ham and gathers spam from an appropriate source. If you decide to continue with this in the future, please don't hesitate to ask [mailto:spambayes-dev@python.org spambayes-dev] for help.)

== WebAdmin Integration ==

The SpamFilter plugin provides integration with WebAdmin for configuration, monitoring, and training.

For monitoring and training purposes, it optionally logs all activity to a table in the database. Upgrading the environment is necessary to install the database table required for this logging.

== Get the Plugin ==

See the [wiki:TracPlugins#Requirements Trac plugin requirements] for instructions on installing `setuptools`. `setuptools` includes the `easy_install` application, which you can use to install the SpamFilter:
{{{
easy_install TracSpamFilter
}}}

You can also obtain the code from the Trac Subversion repository:
{{{
svn co http://svn.edgewall.com/repos/trac/plugins/0.12/spam-filter-captcha
}}}
or download the [http://trac.edgewall.org/changeset/latest/plugins/0.12/spam-filter-captcha?old_path=/&format=zip zipped source]. See TracPlugins for instructions on building and installing plugins.

You can [source:plugins/0.12/spam-filter-captcha browse the source in Trac].

Recommended versions:
|| Trac || Spam Filter ||
|| [milestone:0.11] || latest (in 0.12 tree) ||
|| [milestone:0.12] || latest ||

''[http://svn.edgewall.com/repos/trac/plugins/0.12/spam-filter-captcha/#egg=TracSpamFilter-dev This is a link for setuptools to find the SVN download]''

== Enabling the Plugin ==

If you install the plugin globally (as described [wiki:TracPlugins#ForAllProjects here]), you'll also need to enable it in [wiki:TracIni trac.ini] as follows:
{{{
[components]
tracspamfilter.* = enabled
}}}

= Captcha support for the SpamFilter =

Since version 0.3.1, the plugin supports CAPTCHA-style "human" verification. To use it, add the following to your [wiki:TracIni trac.ini]:
{{{
[spam-filter]
...
reject_handler = CaptchaSystem
# captcha = ExpressionCaptcha # (default, doesn't need to be specified explicitly)
# captcha = ImageCaptcha # (needs to be activated and needs PIL to be installed)
# captcha = RecaptchaCaptcha # (uses Google's reCAPTCHA service)
# captcha = RandomCaptcha # (chooses one of the others randomly)
}}}

== Further Reading ==

 * More info about SpamFilter (and screenshots): [http://www.cmlenz.net/blog/2006/11/managing_trac_s.html Managing Trac Spam]
 * An alternate solution based on mod_security: [http://scallywhack.org ScallyWhack].

== Known Issues ==

[[TicketQuery(component=plugin/spamfilter,status=!closed)]]

== Requirements ==

 * The modules for IP blacklisting and HTTP:BL need [http://www.dnspython.org/ dnspython] installed. Install `setuptools` based on the [wiki:TracPlugins#Requirements Trac plugin requirements], then run `easy_install dnspython` to automatically download and install the package.
   * '''Attention''': The 1.7 series of dnspython causes a massive slowdown of the whole Trac. Use 1.6.x or 1.8.x.
 * The !ImageCaptcha requires python-imaging to work.

== Comments ==

----
See also: TracPlugins, PluginList