Opened 19 years ago
Closed 19 years ago
#1821 closed defect (fixed)
Accented characters prevent Trac from rendering the ticket
Reported by: | Emmanuel Blot | Owned by: | Christian Boos |
---|---|---|---|
Priority: | normal | Milestone: | 0.9 |
Component: | ticket system | Version: | devel |
Severity: | major | Keywords: | |
Cc: | | Branch: | |
Release Notes: | | | |
API Changes: | | | |
Internal Changes: | | | |
Description
For example, if the 'a' character with an accent (I won't submit the actual character here ;-)) is specified in a comment, Trac accepts the character and stores it in the SQL database, but then fails to render it:
Trac detected an internal error: 'ascii' codec can't encode character u'\xe2' in position 0: ordinal not in range(128)
Python traceback
Traceback (most recent call last):
  File "/local/engine/trac/trac/web/modpython_frontend.py", line 274, in handler
    dispatch_request(mpr.path_info, mpr, env)
  File "/local/engine/trac/trac/web/main.py", line 425, in dispatch_request
    dispatcher.dispatch(req)
  File "/local/engine/trac/trac/web/main.py", line 285, in dispatch
    resp = chosen_handler.process_request(req)
  File "/local/engine/trac/trac/ticket/web_ui.py", line 194, in process_request
    self._insert_ticket_data(req, db, ticket, reporter_id)
  File "/local/engine/trac/trac/ticket/web_ui.py", line 379, in _insert_ticket_data
    changes[-1]['comment'] = wiki_to_html(new, self.env, req, db)
  File "/local/engine/trac/trac/wiki/formatter.py", line 668, in wiki_to_html
    Formatter(env, req, absurls, db).format(wikitext, out, escape_newlines)
  File "/local/engine/trac/trac/wiki/formatter.py", line 557, in format
    line = util.escape(line, False)
  File "/local/engine/trac/trac/util.py", line 55, in escape
    text = str(text).replace('&', '&amp;') \
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe2' in position 0: ordinal not in range(128)
This means that once someone has submitted such a character, the ticket cannot be rendered anymore.
I do not know whether this issue also affects other components (it seems to occur whenever Wiki text is rendered).
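The failure can be reproduced outside Trac. Below is a minimal sketch in modern Python 3, where the implicit `str(text)` conversion of Python 2 has to be spelled out as an explicit ASCII encode. The function names `encode_like_py2_str` and `try_encode` are made up for this illustration and are not part of Trac.

```python
def encode_like_py2_str(text):
    """Emulate Python 2's str(unicode_obj) under an 'ascii' default encoding."""
    return text.encode("ascii")


def try_encode(text):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return encode_like_py2_str(text)
    except UnicodeEncodeError as exc:
        return str(exc)


plain = try_encode("a")        # pure ASCII: encodes without trouble
accented = try_encode("\xe2")  # U+00E2, the character named in the traceback
print(plain)
print(accented)
```

The second call reports the same "ordinal not in range(128)" condition as the traceback above, because U+00E2 has no ASCII representation.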
Attachments (0)
Change History (12)
comment:1 by , 19 years ago
comment:2 by , 19 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Bizarre, ça marche pour moâ ;-) (in English: "Odd, it works for me"; trac r2013)
Maybe a problem with your default encoding?
comment:3 by , 19 years ago
No, this is a problem with PySQLite 2.x returning unicode objects instead of UTF-8 strings. PySQLite 2 support is still experimental at this point, and the encoding issue is the reason. This problem can also be encountered in the search module.
comment:4 by , 19 years ago
It shouldn't be the default encoding (mine is ascii, so if that were the problem, I'd have it too).
Are you using pysqlite2?
comment:5 by , 19 years ago
To be more specific, this is a problem with PySQLite/SQLite losing column type information in SQL UNIONs. See PySQLite ticket #102. We register a converter for text columns that encodes unicode objects to utf-8 strings (technically, it just returns the original bytestring), but that converter doesn't get triggered in queries with unions, as used by the search module and, as in this case, for generating the ticket changelog.
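As an illustration of the converter mechanism described here, this is a rough sketch using the `sqlite3` module of modern Python 3 rather than PySQLite 2. The table and column names are hypothetical, and whether the converter fires for the UNION query depends on whether the driver can still see the declared column type, so the sketch only prints those types instead of assuming an outcome.

```python
import sqlite3

# Converters receive the raw bytes stored in SQLite. Returning them unchanged
# yields UTF-8 byte strings for every column whose declared type is "text".
sqlite3.register_converter("text", bytes)

conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
# Hypothetical schema, loosely modelled on a ticket and its changelog.
conn.execute("CREATE TABLE ticket_change (newvalue text)")
conn.execute("CREATE TABLE ticket (description text)")
conn.execute("INSERT INTO ticket_change VALUES (?)", ("\xe2",))
conn.execute("INSERT INTO ticket VALUES (?)", ("plain",))

# Plain query: the declared type is visible, so the converter is applied.
(direct,) = conn.execute("SELECT newvalue FROM ticket_change").fetchone()
print(type(direct), direct)

# UNION query: the result column is no longer a plain table column, which is
# the situation in which the ticket reports the converter being skipped.
union_rows = conn.execute(
    "SELECT newvalue FROM ticket_change UNION SELECT description FROM ticket"
).fetchall()
print([type(value) for (value,) in union_rows])
```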
cboos, I think it's good style to only accept a ticket when you think you have understood the cause of the problem, and hopefully know a way to fix it.
comment:6 by , 19 years ago
I initially thought it was a simple matter of default encoding, and progressively realized it has to do with pysqlite2…
Nevertheless, this problem has something to do with the default encoding, as the str(text) call raises the error only because the default encoding is ascii.
If the default encoding were utf8, things would work as expected.
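To make the dependence on the default encoding concrete, here is a small sketch in Python 3. Python 3 has no implicit `str()` encoding step, so the helper `py2_style_str` (a name invented for this example) takes the would-be default encoding as an explicit parameter.

```python
def py2_style_str(text, default_encoding):
    """Emulate Python 2's str(unicode_obj), which used the default encoding."""
    return text.encode(default_encoding)


text = "\xe2"  # U+00E2, the character from the error message

try:
    py2_style_str(text, "ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# The same character encodes without error once the codec is UTF-8.
utf8_bytes = py2_style_str(text, "utf-8")
print(ascii_ok, utf8_bytes)
```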
comment:7 by , 19 years ago
installed pysqlite2, reproduced the problem, quick hack… yes, it works.
Index: trac/db.py
===================================================================
--- trac/db.py (revision 2013)
+++ trac/db.py (working copy)
@@ -214,7 +214,15 @@
             # we need two converters
             sqlite.register_converter('text', str)
             sqlite.register_converter('TEXT', str)
-
+            # ...but this is not enough, as type information is lost
+            # in queries using UNION.
+            # Changing the default encoding makes `str(x)` succeed
+            # when `x` is an unicode string.
+            import sys
+            reload(sys)
+            sys.setdefaultencoding('utf8')
+            del sys.setdefaultencoding
+
             cnx = sqlite.connect(path, detect_types=sqlite.PARSE_DECLTYPES,
                                  check_same_thread=False, timeout=timeout)
         else:
Disclaimer: I didn't invent the ugly reload/del thing :)
An alternative would be to use type annotations for the affected queries, but that will affect portability, I guess.
OTOH, there seem to be no adverse side effects with the above hack. What do you think?
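For reference, the type-annotation alternative presumably means naming the type in the column alias so that the driver can pick a converter from the column name. A minimal sketch with the `sqlite3` module of modern Python 3 follows, under that assumption. The schema and the alias `value` are hypothetical, and the `[text]` suffix is specific to this driver.

```python
import sqlite3

# Return the stored bytes unchanged, i.e. UTF-8 byte strings.
sqlite3.register_converter("text", bytes)

# PARSE_COLNAMES selects the converter from a "[type]" suffix in the column name.
conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_COLNAMES)
conn.execute("CREATE TABLE ticket_change (newvalue text)")
conn.execute("CREATE TABLE ticket (description text)")
conn.execute("INSERT INTO ticket_change VALUES (?)", ("\xe2",))
conn.execute("INSERT INTO ticket VALUES (?)", ("plain",))

# The alias on the leftmost SELECT names the result column of the whole UNION.
rows = conn.execute(
    'SELECT newvalue AS "value [text]" FROM ticket_change '
    "UNION SELECT description FROM ticket"
).fetchall()
values = sorted(value for (value,) in rows)
print(values)
```

Every affected query has to carry the annotation, which is the maintenance and portability cost mentioned above.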
comment:8 by , 19 years ago
pysqlite2 sometimes returns strings as unicode objects instead of the utf-8 encoded "plain" strings used everywhere else in Trac, and that causes things to break. The sys.setdefaultencoding hack might work, but it is not the correct way to fix this. Please don't apply this patch.
comment:9 by , 19 years ago
I've added a workaround in [2023] that makes sure pysqlite2 never returns unicode strings. If this works and does not introduce any unacceptable performance penalty, we can probably close this ticket.
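The actual change is in [2023]. Purely as an illustration of the general idea, and not a reproduction of that changeset, a cursor that normalizes every text value at the database boundary could look like the following Python 3 sketch. `Utf8Cursor` and `to_utf8` are hypothetical names.

```python
import sqlite3


def to_utf8(value):
    """Encode text to UTF-8 bytes; pass every other value through unchanged."""
    return value.encode("utf-8") if isinstance(value, str) else value


class Utf8Cursor(sqlite3.Cursor):
    """Hypothetical cursor that never hands text objects to its caller.

    Only fetchone() and fetchall() are covered in this sketch; iteration and
    fetchmany() would need the same treatment.
    """

    def fetchone(self):
        row = super().fetchone()
        return None if row is None else tuple(to_utf8(v) for v in row)

    def fetchall(self):
        return [tuple(to_utf8(v) for v in row) for row in super().fetchall()]


conn = sqlite3.connect(":memory:")
cur = conn.cursor(factory=Utf8Cursor)
# A UNION query, the case where per-column converters were reported to fail.
cur.execute("SELECT ? UNION SELECT ?", ("\xe2", 42))
rows = cur.fetchall()
print(rows)
```

Because the conversion inspects each returned value rather than the column's declared type, it does not depend on type information surviving the UNION. The per-value check is also where a performance penalty could come from.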
comment:10 by , 19 years ago
I'm not sure I understand the proper way to fix this issue: do I need to change the default encoding after all (and where is the default setting defined)?
Thanks.
comment:11 by , 19 years ago
Normally upgrading to the latest revision (r2027) should be enough. If it works afterwards, feel free to close this ticket.
comment:12 by , 19 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
I'm closing this ticket, since the fix seems to work gracefully.
Test performed w/ [2033].
I forgot to specify the Trac version: pre-0.9, [2011]