Opened 6 years ago
Closed 6 years ago
#13200 closed defect (fixed)
ValueError: A string literal cannot contain NUL (0x00) characters.
| Reported by: | Ryan J Ollos | Owned by: | Ryan J Ollos |
|---|---|---|---|
| Priority: | normal | Milestone: | plugin - spam-filter |
| Component: | plugin/spamfilter | Version: | 1.4 |
| Severity: | normal | Keywords: | |
| Cc: | | Branch: | |
| Release Notes: | Replace nulls with spaces before training data with Bayesian strategy. | | |
| API Changes: | | | |
| Internal Changes: | | | |
Description (last modified by )
How to Reproduce
While doing a POST operation on /admin/spamfilter/monitor, Trac issued an internal error.
Spam entry that causes the issue:
Request parameters:
{u'__FORM_TOKEN': u'3fe406664047354be332d4a9',
'cat_id': u'spamfilter',
u'markspamdel': u'Delete selected as Spam',
u'num': u'50',
u'page': u'1',
'panel_id': u'monitor',
'path_info': None,
u'sel': u'232668',
u'toggle_group': u'on'}
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36
System Information
| Trac | 1.4 |
|---|---|
| Babel | 2.7.0 |
| dnspython | 1.15.0 |
| Docutils | 0.14 |
| Genshi | 0.7.1 (with speedups) |
| GIT | 2.11.0 |
| Jinja2 | 2.10.1 |
| Mercurial | 4.8.2 |
| mod_wsgi | 4.5.13 (WSGIProcessGroup trac WSGIApplicationGroup %{GLOBAL}) |
| Pillow | 6.1.0 |
| PostgreSQL | server: 9.6.15, client: 9.6.15 |
| psycopg2 | 2.8.3 |
| Pygments | 2.3.1 |
| Python | 2.7.13 (default, Sep 26 2018, 18:42:22) [GCC 6.3.0 20170516] |
| pytz | 2018.9 |
| setuptools | 41.0.1 |
| SpamBayes | 1.1b3 |
| Subversion | 1.9.5 (r1770682) |
| jQuery | 1.12.4 |
| jQuery UI | 1.12.1 |
| jQuery Timepicker | 1.6.3 |
Enabled Plugins
| conditional-clear-milestone-operation | N/A |
|---|---|
| help-guide-version-notice | N/A |
| milestone-to-version | r15098 |
| StatusFixer | r6326 |
| trac-releases | N/A |
| TracMercurial | 1.0.0.9.dev0 |
| TracSpamFilter | 1.3.0.dev0 |
| TracVote | 0.7.0.dev0 |
| TracWikiExtras | 1.3.1.dev0 |
| TranslatedPages | 1.1.0 |
Interface Customization
| shared-htdocs | |
| shared-templates | |
| site-htdocs | |
| site-templates | site.html, site_footer.html, site_head.html, site_header.html, site_leftbox.html |
Python Traceback
Traceback (most recent call last):
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/web/main.py", line 639, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/web/main.py", line 250, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/admin/web_ui.py", line 103, in process_request
    resp = provider.render_admin_panel(req, cat_id, panel_id, path_info)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/admin.py", line 87, in render_admin_panel
    if self._process_monitoring_panel(req):
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/admin.py", line 288, in _process_monitoring_panel
    filtersys.train(req, entries, spam=spam, delete=delete)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/filtersystem.py", line 393, in train
    spam=spam)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/filters/bayes.py", line 90, in train
    hammie.train(testcontent.encode('utf-8', 'ignore'), spam)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/spambayes/hammie.py", line 164, in train
    self.bayes.learn(tokenize(msg), is_spam)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/spambayes/classifier.py", line 252, in learn
    self._add_msg(wordstream, is_spam)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/spambayes/classifier.py", line 354, in _add_msg
    record = self._wordinfoget(word)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/filters/bayes.py", line 211, in _wordinfoget
    row = self._get_row(word)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/filters/bayes.py", line 168, in _get_row
    """, (word,)):
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/db/api.py", line 50, in execute
    return db.execute(query, params)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/db/util.py", line 129, in execute
    cursor.execute(query, params if params is not None else [])
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/db/util.py", line 73, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
ValueError: A string literal cannot contain NUL (0x00) characters.
Attachments (1)
Change History (9)
by , 6 years ago
| Attachment: | Screen Shot 2019-09-01 at 18.39.41.jpg added |
|---|
comment:1 by , 6 years ago
| Description: | modified (diff) |
|---|
comment:2 by , 6 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:3 by , 6 years ago
comment:4 by , 6 years ago
An additional workaround is to replace nulls with spaces before passing the content to hammie.train().
tracspamfilter/filters/bayes.py
                       ("%3.2f" % (score * 100)))
 
     def train(self, req, author, content, ip, spam=True):
+        # Split tokens at null characters by tokenizer in spambayes.hammie
+        content = content.replace('\x00', ' ')
         if author is not None:
             testcontent = author + '\n' + content
         else:
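The patched lines can be tried in isolation with a minimal sketch like the one below. The helper name build_training_text is hypothetical and not part of tracspamfilter, and the else branch is an assumption because the patch context does not show it. The sample string is the spam content quoted in this ticket, written here as a unicode literal.

```python
# -*- coding: utf-8 -*-

def build_training_text(author, content):
    """Hypothetical helper mirroring the patched lines of train()."""
    # Replace NUL characters with spaces so the text can be split there
    # and no NUL ever reaches the database layer.
    content = content.replace(u'\x00', u' ')
    if author is not None:
        testcontent = author + u'\n' + content
    else:
        # Assumption: the patch context does not show this branch.
        testcontent = content
    # Same encoding call as the one shown in the traceback.
    return testcontent.encode('utf-8', 'ignore')


sample = u'\x7fwg\x03\xc3\xbf\x00'
encoded = build_training_text(u'spammer', sample)
print(repr(encoded))
```

With the replacement in place, the encoded bytes handed to hammie.train() no longer contain 0x00, so no word containing a NUL should reach the SQL query in _get_row.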
follow-up: 6 comment:5 by , 6 years ago
Okay, I'll apply comment:4 instead if you think it's better to strip out the nulls further up the call stack. Or perhaps even in FilterSystem.train?
comment:6 by , 6 years ago
Replying to Ryan J Ollos:
Okay, I'll apply comment:4 instead if you think it's better to strip out the nulls further up the call stack.
Yes. Removing null characters would merge the surrounding text into one longer word, e.g. aaa\x00bbb → aaabbb. I think it would be better to avoid that.
Or perhaps even in FilterSystem.train?
I don't think that is a good idea. If a filter used such null characters to detect spam, that filter would stop detecting it.
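The difference between removing and replacing nulls can be illustrated with a short sketch. It uses plain whitespace splitting as a stand-in for the SpamBayes tokenizer, which is an assumption made only for illustration; the real tokenizer applies more rules than this.

```python
# -*- coding: utf-8 -*-

text = u'aaa\x00bbb'

# Option 1: strip the NUL entirely. The two halves fuse into one word.
removed = text.replace(u'\x00', u'')
removed_tokens = removed.split()

# Option 2: replace the NUL with a space. The halves stay separate words.
spaced = text.replace(u'\x00', u' ')
spaced_tokens = spaced.split()

print(removed_tokens)
print(spaced_tokens)
```

Under this simplified splitting, stripping yields the single token aaabbb, while replacing with a space yields aaa and bbb, which is the behaviour argued for above.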
comment:8 by , 6 years ago
| Release Notes: | modified (diff) |
|---|---|
| Resolution: | → fixed |
| Status: | assigned → closed |




The content is \x7fwg\x03\xc3\xbf\x00, so maybe we just need to remove null characters from the string.
tracspamfilter/filters/bayes.py