Opened 5 years ago
Closed 5 years ago
#13200 closed defect (fixed)
ValueError: A string literal cannot contain NUL (0x00) characters.
Reported by: | Ryan J Ollos | Owned by: | Ryan J Ollos |
---|---|---|---|
Priority: | normal | Milestone: | plugin - spam-filter |
Component: | plugin/spamfilter | Version: | 1.4 |
Severity: | normal | Keywords: | |
Cc: | | Branch: | |
Release Notes: | Replace nulls with spaces before training data with Bayesian strategy. | | |
API Changes: | | | |
Internal Changes: | | | |
Description (last modified by )
How to Reproduce
While doing a POST operation on /admin/spamfilter/monitor, Trac issued an internal error.
Spam entry that causes the issue:
Request parameters:
{u'__FORM_TOKEN': u'3fe406664047354be332d4a9', 'cat_id': u'spamfilter', u'markspamdel': u'Delete selected as Spam', u'num': u'50', u'page': u'1', 'panel_id': u'monitor', 'path_info': None, u'sel': u'232668', u'toggle_group': u'on'}
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36
System Information
Trac | 1.4 |
---|---|
Babel | 2.7.0 |
dnspython | 1.15.0 |
Docutils | 0.14 |
Genshi | 0.7.1 (with speedups) |
GIT | 2.11.0 |
Jinja2 | 2.10.1 |
Mercurial | 4.8.2 |
mod_wsgi | 4.5.13 (WSGIProcessGroup trac WSGIApplicationGroup %{GLOBAL}) |
Pillow | 6.1.0 |
PostgreSQL | server: 9.6.15, client: 9.6.15 |
psycopg2 | 2.8.3 |
Pygments | 2.3.1 |
Python | 2.7.13 (default, Sep 26 2018, 18:42:22) [GCC 6.3.0 20170516] |
pytz | 2018.9 |
setuptools | 41.0.1 |
SpamBayes | 1.1b3 |
Subversion | 1.9.5 (r1770682) |
jQuery | 1.12.4 |
jQuery UI | 1.12.1 |
jQuery Timepicker | 1.6.3 |
Enabled Plugins
conditional-clear-milestone-operation | N/A |
---|---|
help-guide-version-notice | N/A |
milestone-to-version | r15098 |
StatusFixer | r6326 |
trac-releases | N/A |
TracMercurial | 1.0.0.9.dev0 |
TracSpamFilter | 1.3.0.dev0 |
TracVote | 0.7.0.dev0 |
TracWikiExtras | 1.3.1.dev0 |
TranslatedPages | 1.1.0 |
Interface Customization
shared-htdocs | |
shared-templates | |
site-htdocs | |
site-templates | site.html, site_footer.html, site_head.html, site_header.html, site_leftbox.html |
Python Traceback
Traceback (most recent call last):
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/web/main.py", line 639, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/web/main.py", line 250, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/admin/web_ui.py", line 103, in process_request
    resp = provider.render_admin_panel(req, cat_id, panel_id, path_info)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/admin.py", line 87, in render_admin_panel
    if self._process_monitoring_panel(req):
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/admin.py", line 288, in _process_monitoring_panel
    filtersys.train(req, entries, spam=spam, delete=delete)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/filtersystem.py", line 393, in train
    spam=spam)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/filters/bayes.py", line 90, in train
    hammie.train(testcontent.encode('utf-8', 'ignore'), spam)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/spambayes/hammie.py", line 164, in train
    self.bayes.learn(tokenize(msg), is_spam)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/spambayes/classifier.py", line 252, in learn
    self._add_msg(wordstream, is_spam)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/spambayes/classifier.py", line 354, in _add_msg
    record = self._wordinfoget(word)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/filters/bayes.py", line 211, in _wordinfoget
    row = self._get_row(word)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/tracspamfilter/filters/bayes.py", line 168, in _get_row
    """, (word,)):
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/db/api.py", line 50, in execute
    return db.execute(query, params)
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/db/util.py", line 129, in execute
    cursor.execute(query, params if params is not None else [])
  File "/usr/local/virtualenv/1.3dev/lib/python2.7/site-packages/trac/db/util.py", line 73, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
ValueError: A string literal cannot contain NUL (0x00) characters.
Attachments (1)
Change History (9)
by , 5 years ago
Attachment: | Screen Shot 2019-09-01 at 18.39.41.jpg added |
---|
comment:1 by , 5 years ago
Description: | modified (diff) |
---|
comment:2 by , 5 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:3 by , 5 years ago
comment:4 by , 5 years ago
An additional workaround is to replace nulls with spaces before passing the content to hammie.train().
tracspamfilter/filters/bayes.py
                 ("%3.2f" % (score * 100)))

     def train(self, req, author, content, ip, spam=True):
+        # Split tokens at null characters by tokenizer in spambayes.hammie
+        content = content.replace('\x00', ' ')
         if author is not None:
             testcontent = author + '\n' + content
         else:
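For anyone who wants to try the idea outside Trac, here is a minimal standalone sketch of what the patched preamble of train() does. The helper name build_training_text is hypothetical and not part of TracSpamFilter. It mirrors the patch in sanitizing only the content, not the author, and uses the byte sequence reported in this ticket as sample input.

```python
# -*- coding: utf-8 -*-
# Sketch only: build_training_text is a hypothetical helper, not plugin API.
NUL = u'\x00'


def build_training_text(author, content):
    """Return the UTF-8 bytes that would be handed to the trainer."""
    # Replace (rather than delete) NULs so a tokenizer still sees a
    # separator where the NUL used to be.
    content = content.replace(NUL, u' ')
    if author is not None:
        text = author + u'\n' + content
    else:
        text = content
    return text.encode('utf-8', 'ignore')


# The content reported in this ticket, decoded from its UTF-8 bytes.
sample = b'\x7fwg\x03\xc3\xbf\x00'.decode('utf-8')
encoded = build_training_text(None, sample)
```

With this input the trailing NUL becomes a space, so no token derived from the text can carry a NUL into a SQL parameter.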
follow-up: 6 comment:5 by , 5 years ago
Okay, I'll apply comment:4 instead if you think it's better to strip out the nulls further up the call stack. Or perhaps even in FilterSystem.train?
comment:6 by , 5 years ago
Replying to Ryan J Ollos:
Okay, I'll apply comment:4 instead if you think it's better to strip out the nulls further up the call stack.
Yes. Removing null characters would merge the surrounding text into one longer word, e.g. aaa\x00bbb → aaabbb. I think it would be better to avoid that.
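To illustrate the difference, here is a small sketch that compares the two options. It uses a plain whitespace split as a stand-in tokenizer, which is an assumption: the real SpamBayes tokenizer is more elaborate, but the effect on word boundaries is the same in spirit.

```python
# -*- coding: utf-8 -*-
def naive_tokens(text):
    # Stand-in tokenizer (assumption): split on whitespace only.
    return text.split()


raw = u'aaa\x00bbb'

# Option 1: delete the NUL, which glues the neighbours together.
removed = naive_tokens(raw.replace(u'\x00', u''))

# Option 2: replace the NUL with a space, which keeps two separate words.
replaced = naive_tokens(raw.replace(u'\x00', u' '))
```

Here removed holds the single merged word aaabbb, while replaced holds aaa and bbb as separate tokens.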
Or perhaps even in FilterSystem.train?
I don't think that is a good idea. If a filter relied on such null characters to detect spam, it would no longer be able to detect it.
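A rough sketch of that reasoning, using two hypothetical filter classes (neither exists in TracSpamFilter, and train_all is only a stand-in for the dispatching done by FilterSystem.train): when each filter sanitizes its own input, a filter that inspects NULs still receives the raw content.

```python
# -*- coding: utf-8 -*-
class NulCountingFilter(object):
    """Hypothetical filter that treats NUL characters as a spam signal."""

    def train(self, content):
        self.seen_nuls = content.count(u'\x00')


class BayesLikeFilter(object):
    """Hypothetical filter that must not pass NULs on to its storage."""

    def train(self, content):
        # Sanitize locally, so other filters are unaffected.
        self.trained_on = content.replace(u'\x00', u' ')


def train_all(filters, content):
    # Stand-in dispatcher: hands the unmodified content to every filter.
    for f in filters:
        f.train(content)


nul_filter = NulCountingFilter()
bayes_filter = BayesLikeFilter()
train_all([nul_filter, bayes_filter], u'aaa\x00bbb')
```

If the replacement happened in the dispatcher instead, nul_filter would never see the NUL it is looking for.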
comment:8 by , 5 years ago
Release Notes: | modified (diff) |
---|---|
Resolution: | → fixed |
Status: | assigned → closed |
The content is \x7fwg\x03\xc3\xbf\x00, so maybe we just need to remove null characters from the string.
tracspamfilter/filters/bayes.py