Opened 13 years ago

Closed 13 years ago

#1821 closed defect (fixed)

Accentuated characters prevent Trac from rendering the ticket

Reported by: Emmanuel Blot
Owned by: Christian Boos
Priority: normal
Milestone: 0.9
Component: ticket system
Version: devel
Severity: major
Keywords:
Cc:
Release Notes:
API Changes:

Description

For example, if the character 'a' with an accent (I won't submit the actual character here ;-)) is entered in a comment, Trac accepts the character and stores it in the SQL database, but then fails to render it:

Trac detected an internal error:

'ascii' codec can't encode character u'\xe2' in position 0: ordinal not in range(128)

Python traceback

Traceback (most recent call last):
  File "/local/engine/trac/trac/web/modpython_frontend.py", line 274, in handler
    dispatch_request(mpr.path_info, mpr, env)
  File "/local/engine/trac/trac/web/main.py", line 425, in dispatch_request
    dispatcher.dispatch(req)
  File "/local/engine/trac/trac/web/main.py", line 285, in dispatch
    resp = chosen_handler.process_request(req)
  File "/local/engine/trac/trac/ticket/web_ui.py", line 194, in process_request
    self._insert_ticket_data(req, db, ticket, reporter_id)
  File "/local/engine/trac/trac/ticket/web_ui.py", line 379, in _insert_ticket_data
    changes[-1]['comment'] = wiki_to_html(new, self.env, req, db)
  File "/local/engine/trac/trac/wiki/formatter.py", line 668, in wiki_to_html
    Formatter(env, req, absurls, db).format(wikitext, out, escape_newlines)
  File "/local/engine/trac/trac/wiki/formatter.py", line 557, in format
    line = util.escape(line, False)
  File "/local/engine/trac/trac/util.py", line 55, in escape
    text = str(text).replace('&', '&amp;') \
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe2' in position 0: ordinal not in range(128)

This means that once someone has submitted such a character, the ticket can no longer be rendered.

I do not know whether this issue also affects other components, since it seems to occur whenever wiki text is rendered.
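To illustrate the failure mode, here is a minimal sketch written for Python 3 (the ticket itself concerns Python 2). In Python 2, `str(text)` on a unicode object implicitly encoded it with the default `ascii` codec; the sketch makes that encoding step explicit. The variable names are illustrative only.

```python
# Python 3 sketch: the explicit encode() stands in for the implicit
# ASCII encoding that Python 2's str(unicode_object) performed.
text = "\xe2"  # the character from the error message (a with circumflex)

try:
    text.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

# The ASCII codec rejects the character...
print(error)

# ...while an explicit UTF-8 encoding succeeds.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)
```

The point is only that the character itself is valid; the error comes from the codec chosen at conversion time.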

Attachments (0)

Change History (12)

comment:1 Changed 13 years ago by Emmanuel Blot

I forgot to specify the Trac version: pre-0.9, [2011]

comment:2 Changed 13 years ago by Christian Boos

Owner: changed from Jonas Borgström to Christian Boos
Status: changed from new to assigned

Weird, it works for me ;-) (trac r2013)

Maybe a problem with your default encoding?

comment:3 Changed 13 years ago by Christopher Lenz

No, this is a problem with PySQLite 2.x returning unicode objects instead of UTF-8 strings. PySQLite 2 support is still experimental at this point, and the encoding issue is the reason. This problem can also be encountered in the search module.

comment:4 Changed 13 years ago by Christian Boos

It shouldn't be the default encoding (mine is ascii, so if that were the problem, I'd have it too).

Are you using pysqlite2?

comment:5 Changed 13 years ago by Christopher Lenz

To be more specific, this is a problem with PySQLite/SQLite losing column type information in SQL UNIONs. See PySQLite ticket #102. We register a converter for text columns that encodes unicode objects to utf-8 strings (technically, it just returns the original bytestring), but that converter doesn't get triggered in queries with unions, as used by the search module and, as in this case, for generating the ticket changelog.
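The converter mechanism described above can be sketched with Python 3's standard `sqlite3` module, a descendant of pysqlite. This is an illustration under assumptions, not Trac's code: the table names `comments` and `notes` are hypothetical, and whether the UNION query still loses the declared type depends on the SQLite version in use, so the sketch prints the resulting types rather than asserting them.

```python
import sqlite3

# Converter for columns declared as "text": keep the raw UTF-8 bytes
# instead of letting the driver decode them to a unicode string.
sqlite3.register_converter("text", lambda raw: raw)

# PARSE_DECLTYPES selects the converter from the declared column type.
conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE comments (body text)")  # hypothetical table
conn.execute("CREATE TABLE notes (body text)")     # hypothetical table
conn.execute("INSERT INTO comments VALUES (?)", ("\xe2",))
conn.execute("INSERT INTO notes VALUES (?)", ("plain",))

# Plain column select: the declared type is known, the converter runs.
direct = conn.execute("SELECT body FROM comments").fetchone()[0]

# UNION query: the declared type may not be reported for the result
# column, in which case the converter is skipped.
union_rows = conn.execute(
    "SELECT body FROM comments UNION ALL SELECT body FROM notes"
).fetchall()

print(type(direct).__name__)
print([type(row[0]).__name__ for row in union_rows])
```

If the second line of output shows `str` where the first shows `bytes`, the converter was bypassed for the UNION result, which is the situation this comment describes.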

cboos, I think it's good style to only accept a ticket when you think you have understood the cause of the problem, and hopefully know a way to fix it.

comment:6 Changed 13 years ago by Christian Boos

I initially thought it was a simple matter of default encoding, and progressively realized it has to do with pysqlite2…

Nevertheless, this problem does have something to do with the default encoding: str(text) raises the error only because the default encoding is ascii. If the default encoding were utf8, things would work as expected.

comment:7 Changed 13 years ago by Christian Boos

installed pysqlite2, reproduced the problem, quick hack… yes, it works.

Index: trac/db.py
===================================================================
--- trac/db.py  (revision 2013)
+++ trac/db.py  (working copy)
@@ -214,7 +214,15 @@
             # we need two converters
             sqlite.register_converter('text', str)
             sqlite.register_converter('TEXT', str)
-
+            # ...but this is not enough, as type information is lost
+            # in queries using UNION.
+            # Changing the default encoding makes `str(x)` succeed
+            # when `x` is an unicode string.
+            import sys
+            reload(sys)
+            sys.setdefaultencoding('utf8')
+            del sys.setdefaultencoding
+
             cnx = sqlite.connect(path, detect_types=sqlite.PARSE_DECLTYPES,
                                  check_same_thread=False, timeout=timeout)
         else:

Disclaimer: I didn't invent the ugly reload/del thing :)

An alternative would be to use type annotations for the affected queries, but that will affect portability, I guess.

OTOH, there seem to be no adverse side effects with the above hack. What do you think?
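For reference, the type-annotation alternative mentioned in this comment might look roughly like the following in Python 3's `sqlite3` module, which supports a `"name [type]"` column alias when `PARSE_COLNAMES` is enabled. This is a sketch, not a proposed patch; the table names are hypothetical.

```python
import sqlite3

# Converter that keeps the raw UTF-8 bytes for values annotated as "text".
sqlite3.register_converter("text", lambda raw: raw)

# PARSE_COLNAMES selects the converter from a "[type]" suffix in the
# column name rather than from the declared column type.
conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_COLNAMES)
conn.execute("CREATE TABLE comments (body text)")  # hypothetical table
conn.execute("CREATE TABLE notes (body text)")     # hypothetical table
conn.execute("INSERT INTO comments VALUES (?)", ("\xe2",))
conn.execute("INSERT INTO notes VALUES (?)", ("plain",))

# The annotation sits in the leftmost SELECT, which names the result column.
rows = conn.execute(
    'SELECT body AS "body [text]" FROM comments '
    "UNION ALL SELECT body FROM notes"
).fetchall()
values = [row[0] for row in rows]
print(values)
```

The `[text]` alias is specific to this driver, which is the portability concern raised above: other database backends would treat it as an ordinary column name.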

comment:8 Changed 13 years ago by Jonas Borgström

pysqlite2 sometimes returns strings as unicode objects instead of the utf-8 encoded "plain" strings used everywhere else in Trac, which causes things to break. The sys.setdefaultencoding hack might work, but it is not the correct way to fix this. Please don't apply this patch.

comment:9 Changed 13 years ago by Jonas Borgström

I've added a workaround in [2023] that makes sure pysqlite2 never returns unicode strings. If this works and does not introduce any unacceptable performance penalty, we can probably close this ticket.
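The content of [2023] is not shown in this ticket, so the following is not that changeset. It is only a sketch of one way to get the same effect (byte strings for every text value, regardless of the query shape) using the `text_factory` attribute of Python 3's `sqlite3` module. The table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Return every TEXT value as raw UTF-8 bytes, independent of column
# type information, so UNION queries behave like plain ones.
conn.text_factory = bytes

conn.execute("CREATE TABLE comments (body text)")  # hypothetical table
conn.execute("CREATE TABLE notes (body text)")     # hypothetical table
conn.execute("INSERT INTO comments VALUES (?)", ("\xe2",))
conn.execute("INSERT INTO notes VALUES (?)", ("plain",))

rows = conn.execute(
    "SELECT body FROM comments UNION ALL SELECT body FROM notes"
).fetchall()
values = [row[0] for row in rows]
print(values)
```

Because the setting applies per connection rather than per column, it does not rely on type information surviving the UNION.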

comment:10 Changed 13 years ago by Emmanuel Blot

I'm not sure I understand the proper way to fix this issue: do I need to change the default encoding after all (and where is the default setting defined)?

Thanks.

comment:11 Changed 13 years ago by Christian Boos

Normally upgrading to the latest revision (r2027) should be enough. If it works afterwards, feel free to close this ticket.

comment:12 Changed 13 years ago by Emmanuel Blot

Resolution: fixed
Status: changed from assigned to closed

I'm closing this ticket, since the fix seems to work gracefully.
Test performed w/ [2033].
