Opened 8 years ago
Closed 8 years ago
#12715 closed defect (fixed)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 82-83: ordinal not in range(128)
| Reported by: | Ryan J Ollos | Owned by: | Ryan J Ollos |
|---|---|---|---|
| Priority: | normal | Milestone: | plugin - spam-filter |
| Component: | plugin/spamfilter | Version: | |
| Severity: | normal | Keywords: | |
| Cc: | Dirk Stöcker | Branch: | |
| Release Notes: | Fixed | | |
| API Changes: | | | |
| Internal Changes: | | | |
Description
How to Reproduce
While doing a POST operation on /admin/spamfilter/monitor, Trac issued an internal error.
(please provide additional details here)
Request parameters:
{u'__FORM_TOKEN': u'08455ce73d6144f5f1320720', 'cat_id': u'spamfilter', u'markspamdel': u'Delete selected as Spam', u'num': u'50', u'page': u'1', 'panel_id': u'monitor', 'path_info': None, u'sel': u'200783'}
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
System Information
| Trac | 1.3.2.dev0 |
|---|---|
| Babel | 2.3.4 |
| dnspython | 1.12.0 |
| Docutils | 0.12 |
| Genshi | 0.7 (with speedups) |
| GIT | 2.1.4 |
| Jinja2 | 2.9.5 |
| Mercurial | 3.1.2 |
| mod_wsgi | 4.5.13 (WSGIProcessGroup trac WSGIApplicationGroup %{GLOBAL}) |
| Pillow | 2.6.1 |
| PostgreSQL | server: 9.4.10, client: 9.4.10 |
| psycopg2 | 2.5.4 |
| Pygments | 2.0.1 |
| Python | 2.7.9 (default, Jun 29 2016, 13:11:10) [GCC 4.9.2] |
| pytz | 2012c |
| setuptools | 18.2 |
| SpamBayes | 1.1b1 |
| Subversion | 1.8.10 (r1615264) |
| jQuery | 1.11.3 |
| jQuery UI | 1.11.4 |
| jQuery Timepicker | 1.5.5 |
Enabled Plugins
| help-guide-version-notice | N/A |
|---|---|
| milestone-to-version | r15098 |
| StatusFixer | r6326 |
| TracMercurial | 1.0.0.7.dev0 |
| TracSpamFilter | 1.3.0.dev0 |
| TracVote | 0.6.0.dev0 |
| TracWikiExtras | 1.3.1.dev0 |
| TranslatedPagesMacro | 0.5 |
Interface Customization
| shared-htdocs | |
|---|---|
| shared-templates | |
| site-htdocs | |
| site-templates | site.html, site_footer.html, site_head.html, site_header.html, site_leftbox.html |
Python Traceback
```
Traceback (most recent call last):
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/web/main.py", line 630, in _dispatch_request
    dispatcher.dispatch(req)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/web/main.py", line 252, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/admin/web_ui.py", line 96, in process_request
    resp = provider.render_admin_panel(req, cat_id, panel_id, path_info)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/TracSpamFilter-1.3.0.dev0-py2.7.egg/tracspamfilter/admin.py", line 89, in render_admin_panel
    if self._process_monitoring_panel(req):
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/TracSpamFilter-1.3.0.dev0-py2.7.egg/tracspamfilter/admin.py", line 285, in _process_monitoring_panel
    filtersys.train(req, entries, spam=spam, delete=delete)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/TracSpamFilter-1.3.0.dev0-py2.7.egg/tracspamfilter/filtersystem.py", line 347, in train
    entry.content, entry.ipnr, spam=spam)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/TracSpamFilter-1.3.0.dev0-py2.7.egg/tracspamfilter/filters/akismet.py", line 87, in train
    self._post(url, req, author, content, ip)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/TracSpamFilter-1.3.0.dev0-py2.7.egg/tracspamfilter/filters/akismet.py", line 160, in _post
    urlreq = urllib2.Request(url, urlencode(params),
  File "/usr/lib/python2.7/urllib.py", line 1338, in urlencode
    v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 82-83: ordinal not in range(128)
```
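The positions in the error match the two trailing non-ASCII characters (U+00A3 U+00A9) of the spider's User-Agent header quoted later in this ticket. A minimal, hypothetical reconstruction of the failure (shown in Python 3; Python 2's `urlencode` effectively performs the same implicit ASCII encode via `str()`):

```python
# Hypothetical minimal reconstruction, not code from the ticket:
# the User-Agent ends with the non-ASCII characters U+00A3 U+00A9,
# which sit at string positions 82-83.
ua = (u'Mozilla/5.0 (compatible; Baiduspider/2.0; '
      u'+http://www.baidu.com/search/spider.html\xa3\xa9')

try:
    ua.encode('ascii')  # what the implicit str()/ASCII conversion attempts
except UnicodeEncodeError as e:
    print(e)  # 'ascii' codec can't encode characters in position 82-83: ...
```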
Attachments (1)
Change History (7)
comment:1 by , 8 years ago
comment:2 by , 8 years ago
I haven't tested the change yet, but it looks like we just need to encode the User-Agent string to utf-8: plugins/1.2/spam-filter/tracspamfilter/filters/akismet.py@15254:149#L136.
```diff
--- tracspamfilter/filters/akismet.py
+++ tracspamfilter/filters/akismet.py
@@ -146,7 +146,7 @@
         params = {
             'blog': req.base_url,
             'user_ip': ip,
-            'user_agent': req.get_header('User-Agent'),
+            'user_agent': req.get_header('User-Agent').encode('utf-8'),
             'referrer': req.get_header('Referer') or 'unknown',
             'comment_author': author_name,
             'comment_type': 'trac',
```
Possibly for blogspam.py as well:

```diff
--- tracspamfilter/filters/blogspam.py
+++ tracspamfilter/filters/blogspam.py
@@ -123,7 +123,7 @@
             'ip': ip,
             'name': author_name,
             'comment': content.encode('utf-8'),
-            'agent': req.get_header('User-Agent'),
+            'agent': req.get_header('User-Agent').encode('utf-8'),
             'site': req.base_url,
             'version': user_agent
         }
```
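Why pre-encoding helps: `quote_plus` over a UTF-8 byte string percent-escapes the non-ASCII bytes instead of tripping the implicit ASCII codec. A quick sketch of the quoting mechanics (Python 3 shown; the hypothetical `user_agent` value stands in for `req.get_header('User-Agent')`):

```python
from urllib.parse import quote_plus

# Hypothetical stand-in for req.get_header('User-Agent');
# '\xa3\xa9' are the two characters the ASCII codec chokes on.
user_agent = u'spider.html\xa3\xa9'

# Encoding to UTF-8 first yields bytes, which quote_plus escapes
# safely: U+00A3 -> %C2%A3 and U+00A9 -> %C2%A9 in UTF-8.
quoted = quote_plus(user_agent.encode('utf-8'))
print(quoted)  # spider.html%C2%A3%C2%A9
```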
comment:3 by , 8 years ago
I think req.get_header() returns str, but req.base_url is unicode. Rather than adding .encode('utf-8') to each entry, we could use unicode_urlencode from trac.util.text (untested):
```diff
--- tracspamfilter/filters/akismet.py
+++ tracspamfilter/filters/akismet.py
@@ -25,6 +25,7 @@
 from trac.config import IntOption, Option
 from trac.core import Component, implements
 from trac.mimeview.api import is_binary
+from trac.util.text import unicode_urlencode
 from tracspamfilter.api import IFilterStrategy, N_
 
 
@@ -155,14 +156,14 @@
             'referrer': req.get_header('Referer') or 'unknown',
             'comment_author': author_name,
             'comment_type': 'trac',
-            'comment_content': content.encode('utf-8')
+            'comment_content': content,
         }
         if author_email:
             params['comment_author_email'] = author_email
         for k, v in req.environ.items():
             if k.startswith('HTTP_') and k not in self.noheaders:
-                params[k] = v.encode('utf-8')
+                params[k] = v
-        urlreq = urllib2.Request(url, urlencode(params),
+        urlreq = urllib2.Request(url, unicode_urlencode(params),
                                  {'User-Agent': self.user_agent})
 
         resp = urllib2.urlopen(urlreq)
```
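The idea behind trac.util.text.unicode_urlencode is to UTF-8-encode every key and value itself, so callers can pass params that freely mix str and unicode without per-entry .encode() calls. A simplified sketch of that idea (hypothetical, not Trac's actual implementation; items are sorted here only to make the output deterministic):

```python
from urllib.parse import quote_plus

def unicode_urlencode(params):
    """Sketch of the idea behind trac.util.text.unicode_urlencode:
    UTF-8-encode each key and value before percent-quoting, so the
    caller never hits the implicit ASCII codec. This is a hypothetical
    simplification, not Trac's real implementation."""
    items = sorted(params.items()) if isinstance(params, dict) else params
    return '&'.join(
        quote_plus(str(k).encode('utf-8')) + '=' +
        quote_plus(str(v).encode('utf-8'))
        for k, v in items)

# Mixed ASCII and non-ASCII values encode without error:
body = unicode_urlencode({
    'blog': u'http://example.org/',
    'user_agent': u'spider.html\xa3\xa9',
})
print(body)  # blog=http%3A%2F%2Fexample.org%2F&user_agent=spider.html%C2%A3%C2%A9
```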
by , 8 years ago
| Attachment: | t12715.diff added |
|---|---|
comment:5 by , 8 years ago
comment:6 by , 8 years ago
| Cc: | added |
|---|---|
| Release Notes: | modified (diff) |
| Resolution: | → fixed |
| Status: | assigned → closed |
The user_agent string is u'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html\xa3\xa9'.