Edgewall Software

Opened 18 years ago

Closed 16 years ago

Last modified 6 years ago

#3723 closed defect (worksforme)

Problems with database encoding

Reported by: szymon
Owned by: Christian Boos
Priority: low
Milestone:
Component: general
Version: 0.10.4
Severity: blocker
Keywords: postgresql, mysql, trac, wiki, utf8, sqlite, unicode, needinfo
Cc: frederic.duarte@…, benjamin.azan@…, joerg.wendland@…, evantdster@…
Branch:
Release Notes:
API Changes:
Internal Changes:

Description

I am using Trac v.0.10-b1 with MySQL and I get the following error:

Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/trac/web/main.py", line 335, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.4/site-packages/trac/web/main.py", line 220, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/lib/python2.4/site-packages/trac/wiki/web_ui.py", line 97, in process_request
    page = WikiPage(self.env, pagename, version, db)
  File "/usr/lib/python2.4/site-packages/trac/wiki/model.py", line 32, in __init__
    self._fetch(name, version, db)
  File "/usr/lib/python2.4/site-packages/trac/wiki/model.py", line 53, in _fetch
    (name,))
  File "/usr/lib/python2.4/site-packages/trac/db/util.py", line 47, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
  File "/usr/lib/python2.4/site-packages/trac/db/util.py", line 47, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
  File "/usr/lib/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 919-922: invalid data

Any help?

Attachments (0)

Change History (61)

comment:1 by anonymous, 18 years ago

Keywords: trac wiki utf8 added
Priority: high → highest
Severity: major → blocker

I'm trying hard to fix this, but without success. Can anyone help, as this bug renders the whole wiki unusable? Thanks in advance…

comment:2 by movex@…, 18 years ago

Hi, I'm also using Trac 0.10b1, with Python 2.3.5 and MySQL 4.1.11, and I get the same error. The MySQL database runs in a UTF-8 configuration. After restarting Apache the error is gone for some hours, then the same thing happens again. This also affects the timeline and tickets.

comment:3 by Christian Boos, 18 years ago

Keywords: mysql added
Milestone: 0.10.1

comment:4 by anonymous, 18 years ago

Keywords: sqlite added

Seeing the same sort of problem since upgrade to 0.10 with sqlite 3.2.1 & python 2.4. It only seems to affect searches on tickets, not on the wiki.

Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/trac/web/main.py", line 356, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.4/site-packages/trac/web/main.py", line 224, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/lib/python2.4/site-packages/trac/Search.py", line 181, in process_request
    results += list(source.get_search_results(req, terms, filters))
  File "/usr/lib/python2.4/site-packages/trac/ticket/api.py", line 265, in get_search_results
    for summary, desc, author, keywords, tid, date, status in cursor:
  File "/usr/lib/python2.4/site-packages/trac/db/util.py", line 40, in __iter__
    row = self.cursor.fetchone()
  File "/usr/lib/python2.4/site-packages/trac/db/sqlite_backend.py", line 73, in fetchone
    return row and self._convert_row(row) or None
  File "/usr/lib/python2.4/site-packages/trac/db/sqlite_backend.py", line 69, in _convert_row
    return tuple([(isinstance(v, str) and [v.decode('utf-8')] or [v])[0]
  File "/usr/lib/python2.4/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1784-1785: invalid data

in reply to:  4 ; comment:5 by Christian Boos, 18 years ago

Replying to anonymous:

Seeing the same sort of problem since upgrade to 0.10 with sqlite 3.2.1 & python 2.4. It only seems to affect searches on tickets, not on the wiki.

return tuple([(isinstance(v, str) and [v.decode('utf-8')] or [v])[0]

Looks like you're using pysqlite 1.x. You could (should) upgrade to 2.3.2. Check the PySqlite page.

in reply to:  5 ; comment:6 by anonymous, 18 years ago

Hi,

pysqlite 2.3.2 caused apache to segfault, so I tried version 2.0.7, which worked ok, but I still get an error (although a slightly different one)

Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/trac/web/main.py", line 356, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.4/site-packages/trac/web/main.py", line 224, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/lib/python2.4/site-packages/trac/Search.py", line 181, in process_request
    results += list(source.get_search_results(req, terms, filters))
  File "/usr/lib/python2.4/site-packages/trac/ticket/api.py", line 267, in get_search_results
    if status == 'closed':
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1784-1785: invalid data

Again this only happens on ticket searches.

in reply to:  6 ; comment:7 by Christian Boos, 18 years ago

Replying to anonymous:

Hi,

pysqlite 2.3.2 caused apache to segfault,

Probably the same issue as pysqlite:ticket:174 (fixed in pysqlite's trunk).

so I tried version 2.0.7, which worked ok, but I still get an error (although a slightly different one)


if status == 'closed':

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1784-1785: invalid data

I can't see why the status would have such a length … can you insert a print tid, repr(status) just before line 267?
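
To narrow down which column actually carries the bad bytes, something along these lines could be used instead of a bare print. It is only a sketch in Python 3 syntax (the tracebacks above come from Python 2.3/2.4), and `find_undecodable` is a hypothetical helper, not part of Trac or pysqlite; the sample row is invented.

```python
def find_undecodable(row, encoding="utf-8"):
    """Return (column_index, byte_offset, repr_of_value) for every
    bytes value in `row` that cannot be decoded with `encoding`."""
    problems = []
    for index, value in enumerate(row):
        if not isinstance(value, bytes):
            continue  # ints, None, already-decoded text: nothing to check
        try:
            value.decode(encoding)
        except UnicodeDecodeError as exc:
            problems.append((index, exc.start, repr(value)))
    return problems


# Hypothetical row: summary, ticket id, status. The summary holds a
# Latin-1 encoded "e acute" (byte 0xE9), which is not valid UTF-8 here.
row = (b"caf\xe9 menu", 1234, b"closed")
for index, offset, shown in find_undecodable(row):
    print("column %d, offset %d: %s" % (index, offset, shown))
```

Using `repr()` for the report keeps the output pure ASCII, so writing the diagnostic cannot itself raise an encoding error.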

comment:8 by frederic.duarte@…, 18 years ago

Hi gentlemen,

Same problem as Szymon (except for the byte positions: UnicodeDecodeError: 'utf8' codec can't decode bytes in position 2-4: invalid data) with trac 0.10 (tag), apache 2.0.52, mysql 4.1.20 using UTF8, Python 2.3.4.

As Movex said, the strange thing is that it disappears for some time after restarting apache. And when I come back to work the day after, it is broken again.

I didn't find any clue in the apache logs :(

Tell me if/how I could help.

in reply to:  7 ; comment:9 by anonymous, 18 years ago

Hopefully of some help…

I tried writing the values out to a file. If I do f.write(tid) I get the same Unicode error for the write. However if I do f.write(repr(tid)) the write works fine and just prints a four digit ticket number. I'm learning python as I go on this one so I'm not sure if that's what you'd expect or not :o)

jon at the hyphen mill dot com

Replying to cboos:

Replying to anonymous:

Hi,

pysqlite 2.3.2 caused apache to segfault,

Probably the same issue as pysqlite:ticket:174 (fixed in pysqlite's trunk).

so I tried version 2.0.7, which worked ok, but I still get an error (although a slightly different one)


if status == 'closed':

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1784-1785: invalid data

I can't see why the status would have such a length … can you insert a print tid, repr(status) just before line 267?

in reply to:  9 ; comment:10 by Christian Boos, 18 years ago

Keywords: unicode added

Replying to anonymous:

I tried writing the values out to a file. If I do f.write(tid) I get the same Unicode error for the write. However if I do f.write(repr(tid)) the write works fine and just prints a four digit ticket number.

That's very surprising… Please try: f.write(tid.encode('unicode_internal')), and also:

from locale import getlocale
f.write(repr(getlocale()))

comment:11 by frederic.duarte, 18 years ago

Component: wiki → mod_python frontend
Owner: changed from Jonas Borgström to Christopher Lenz

I suggest changing the component, as the error occurs on every page of any Trac project. Please correct me if I'm wrong.

in reply to:  10 comment:12 by anonymous, 18 years ago

Ok, hopefully some progress! :o)

f.write(tid.encode('unicode_internal')) gives the error: 'int' object has no attribute 'encode'.

However, if I just put the

from locale import getlocale

line in above line 267 and nothing else, the problem goes away and the search displays correctly. If I put the import line at the top of api.py I still get the error (not sure if that's expected or not)

If I write out the locale I get: ('en_GB', 'ISO8859-1')

Replying to cboos:

Please try: f.write(tid.encode('unicode_internal')), and also:

from locale import getlocale
f.write(repr(getlocale()))

in reply to:  11 comment:13 by anonymous, 18 years ago

Replying to frederic.duarte:

I suggest changing the component, as the error occurs on every page of any Trac project. Please correct me if I'm wrong.

you are right, it occurs now everywhere - in ticket reports, wiki, timeline…

comment:14 by batman, 18 years ago

I am trying trac for the first time. My config is trac 0.10, python 2.4, mysql 5.0.22, and tracd. I see the behavior as previously reported. Even the macros are not working. Restart the server, and all problems go away for a while.

comment:15 by Christian Boos, 18 years ago

Keywords: weird added

comment:16 by jannek, 18 years ago

I got trac 0.10, apache 2.0.54, MySQL 4.1.20, mod_python 3.1.4, python 2.4.3, MySQL-python 1.2.0

And I am experiencing the same problem, with the time-dependent error that occurs the next day, on almost any page. Reloading a few times might make the page render properly, but restarting apache (and thus probably reloading mod_python) solves the problem for another day.

I'm wondering if it's a bug in mod_python, but haven't had time to revert or upgrade it yet.

comment:17 by frederic.duarte@…, 18 years ago

Cc: frederic.duarte@… added

Hello,

Today I had this problem, but as a transient one: it occurs about one time out of two when I click on Timeline…

  • click on Timeline : it fails ("crashes")
  • click on Roadmap : it works
  • click on Timeline : it works
  • click on Roadmap : it works
  • click on Timeline : it fails
  • click on Roadmap : it works
  • click on Timeline : it fails
  • click on Roadmap : it works
  • click on Timeline : it works
  • etc …

Same test with other pages, user not logged in, only for available menu items (Roadmap page tuned for time tracking) :

  • wiki sometimes fails
  • timeline sometimes fails
  • roadmap never fails
  • browse sources never fails
  • view tickets sometimes fails
  • search never fails

Same test with user (admin) logged in :

  • every page fails around 80% of the time, including the newly accessible ones: New Ticket and Admin

Here is an extract of my trac.log file, trying to view the timeline :

2006-11-30 10:34:44,943 Trac[api] DEBUG: Updating wiki page index
2006-11-30 10:34:44,967 Trac[svn_fs] DEBUG: Opening subversion file-system at /blabla with scope /bla/
2006-11-30 10:34:44,967 Trac[cache] DEBUG: Checking whether sync with repository is needed
2006-11-30 10:34:45,049 Trac[svn_fs] DEBUG: Closing subversion file-system at /blabla
2006-11-30 10:34:45,082 Trac[main] ERROR: 'utf8' codec can't decode bytes in position 2-4: invalid data
Traceback (most recent call last):
  File "/usr/lib/python2.3/site-packages/trac/web/main.py", line 356, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.3/site-packages/trac/web/main.py", line 224, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/lib/python2.3/site-packages/trac/Timeline.py", line 158, in process_request
    for username, name, email in self.env.get_known_users():
  File "/usr/lib/python2.3/site-packages/trac/env.py", line 284, in get_known_users
    cursor.execute("SELECT DISTINCT s.sid, n.value, e.value "
  File "/usr/lib/python2.3/site-packages/trac/db/util.py", line 48, in execute
    return self.cursor.execute(sql)
  File "/usr/lib/python2.3/site-packages/trac/db/util.py", line 48, in execute
    return self.cursor.execute(sql)
  File "/usr/lib/python2.3/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/python2.3/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 2-4: invalid data
2006-11-30 10:34:46,760 Trac[api] DEBUG: Updating wiki page index
2006-11-30 10:34:47,756 Trac[api] DEBUG: Updating wiki page index

Killed httpd -HUP and everything went back to 100% operational.

Yes, weird …

I don't know if it will be helpful, I just hope :)

comment:18 by anonymous, 18 years ago

Cc: benjamin.azan@… added

i'm having the same problem using trac 0.10. the web server is tracd and the database backend is MySQL

comment:19 by Christian Boos, 18 years ago

Keywords: needinfo added

Please make sure you're using utf8 for the database charset and utf8_general_ci for the database collation parameter, as explained in #3884.

comment:20 by frederic.duarte, 18 years ago

According to PhpMyAdmin, our database is UTF-8 Unicode (utf8), but the mysql> status shows everything being latin1. I think I've been misled. I'll try to change that and keep you posted…

comment:21 by anonymous, 18 years ago

We changed the database settings of my.cnf and now the database is full utf-8 :

[mysqld]
...
character-set-server=utf8
collation-server=utf8_general_ci

[client]
default-character-set=utf8

Gives now :

mysql> status
[...]
Server version:         4.1.20
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    utf8
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8
[...]

We also dumped the database and converted the Latin1 field types and contents to their utf-8 equivalents, but we ran into the index size limit problem, see: #3676, #3673 and so on …

So we had to modify some maximum index sizes for MySQL to accept the queries, but we are not aware of their optimal size. All that is ugly.

Anyway, the expected result is still not here : the system still sometimes crashes with the same kind of error, but … now only for the pages with diacritics.

As usual, after reloading the Apache process, the problem is gone for a random period of time between 0.5 and 3 days.

Could it be a problem of connection character set between Apache/Trac and the MySQL server that wouldn't be consistent after a period of time ? Non persistent timeout or so ?
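
One way to see why only pages with diacritics would still fail: pure ASCII text is byte-for-byte identical in Latin-1 and UTF-8, whereas accented characters are not. The sketch below (Python 3 syntax, sample strings invented for illustration) assumes the connection delivers Latin-1 bytes while the client decodes them as UTF-8; whether that is what happens here is exactly the open question.

```python
def survives_charset_mismatch(text):
    """True if `text`, sent as Latin-1 bytes, still decodes as UTF-8."""
    wire_bytes = text.encode("latin-1")   # what a latin1 connection would send
    try:
        return wire_bytes.decode("utf-8") == text
    except UnicodeDecodeError:
        return False


# "Fr\u00e9d\u00e9ric" contains two "e acute" characters.
for sample in ("TracInstall", "Fr\u00e9d\u00e9ric"):
    print(sample, survives_charset_mismatch(sample))
```

Under that assumption, an ASCII-only page would render fine on a mis-configured connection, which matches the observation that only some pages break.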

comment:22 by frederic.duarte, 18 years ago

Sorry, my connection timed out and I forgot to put my username again. The previous comment is mine. Cheers.

comment:23 by Christian Boos, 18 years ago

#4879 closed as duplicate of this ticket.

comment:24 by Jörg Wendland, 18 years ago

Cc: joerg.wendland@… added

we see the same problem every day, on all pages, with

  • apache 2.2.4
  • trac-0.10.3-win32
  • mySQL 5.0.27
  • Python 2.4

the only workaround still is a restart of apache.

as I understood so far, the problem appears (at least)

  • with trac 0.10b1, 0.10.3
  • either with apache 2.0.52, 2.0.54, 2.2.4 or tracd
  • either with SQLite 3.2.1 or mySQL 4.1.11, 4.1.20, 5.0.22, 5.0.27
  • with one of Python 2.3.4, 2.3.5, 2.4, 2.4.3
  • with mod_python 3.1.4, 3.2.10

all of this leads me to believe that it is a problem of neither the database nor the web server.

in reply to:  24 ; comment:25 by Christian Boos, 18 years ago

Replying to Jörg Wendland:

  • either with SQLite 3.2.1

Really? Also with sqlite? If so, can you please attach a backtrace for this specific case?

For the issues with MySQL, you didn't mention one quite important piece of the puzzle, the MySqlDb python bindings for MySQL. Check on that wiki page for additional hints about which version you could use.

in reply to:  25 comment:26 by anonymous, 18 years ago

Replying to cboos:

Replying to Jörg Wendland:

  • either with SQLite 3.2.1

Really? Also with sqlite? If so, can you please attach a backtrace for this specific case?

Oops, you are right! No, not with SQLite! I just searched all these comments for configurations but didn't notice the different error message.

For the issues with MySQL, you didn't mention one quite important piece of the puzzle, the MySqlDb python bindings for MySQL. Check on that wiki page for additional hints about which version you could use.

Our bindings are MySQL/python 1.2.1_p2.

That leads to a slightly corrected list of configurations with this error:

  • trac 0.10b1, 0.10.3
  • apache 2.0.52, 2.0.54, 2.2.4 or tracd
  • mySQL 4.1.11, 4.1.20, 5.0.22, 5.0.27
  • Python 2.3.4, 2.3.5, 2.4, 2.4.3
  • mod_python 3.1.4, 3.2.10
  • MySQLdb 1.2.0, 1.2.1_p2

comment:27 by Jörg Wendland, 18 years ago

sorry, last comment was mine…

and the last line should read

  • mysql-python 1.2.0, 1.2.1_p2

comment:28 by Christian Boos, 18 years ago

Well, MySQLdb is the name of the package and mysql-python the name of the sourceforge project responsible for this package.

Would it be possible for you to test their latest version? (1.2.2b3)

comment:29 by Christian Boos, 18 years ago

Just looked at their download area: 1.2.2 was released a few days ago (2007-03-03 07:44).

See https://sourceforge.net/project/showfiles.php?group_id=22307.

comment:30 by Jörg Wendland, 18 years ago

hmm… somewhat difficult..

I haven't any compiler on our server, so I can't compile _mysql.c. And I haven't found a binary package for 1.2.2. any ideas?

in reply to:  30 ; comment:31 by Christian Boos, 18 years ago

Replying to Jörg Wendland:

I haven't any compiler on our server, so I can't compile _mysql.c. And I haven't found a binary package for 1.2.2. any ideas?

I've built MySQLdb 1.2.2 for Python 2.4, win32 build (warning: built against MySQL 5.0.24a, I don't know if it will work with other versions).

Please tell where I should send or upload the file (860M).

in reply to:  31 comment:32 by Jörg Wendland, 18 years ago

I've built MySQLdb 1.2.2 for Python 2.4, win32 build (warning: built against MySQL 5.0.24a, I don't know if it will work with other versions).

Please tell where I should send or upload the file (860M).

you could send it to my email address from the cc field. i assume you didn't mean 860M, did you?

comment:33 by evantdster@…, 18 years ago

Cc: evantdster@… added
Component: mod_python frontend → general
Owner: changed from Christopher Lenz to Jonas Borgström

Me too. I'm using the tracd webserver with the following versions:

  • trac 0.11dev-r4936
  • mysql 4.1.15
  • Python 2.5
  • MySQL-python 1.2.1_p2

Which results in a full list like this:

  • trac 0.10b1, 0.10.3, 0.11dev-r4936
  • apache 2.0.52, 2.0.54, 2.2.4 or even none when running under tracd
  • mySQL 4.1.11, 4.1.15, 4.1.20, 5.0.22, 5.0.27
  • Python 2.3.4, 2.3.5, 2.4, 2.4.3, 2.5
  • mod_python 3.1.4, 3.2.10, or even none when running under tracd
  • MySQL-python 1.2.0, 1.2.1_p2

To summarize the problem: For several hours after restarting the webserver, everything is fine. Then you start getting errors like:

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 2-4: invalid data

being thrown from:

site-packages/MySQLdb/connections.py

You may see these errors in any number of places, including the wiki. The default TracInstall page seems to reliably reproduce it for me. I personally also see a mysql error about mixing collations in a union when I view a ticket. Unfortunately, I just restarted tracd and finally enabled logging, so I don't have an exact error message. When it happens again, I'll provide exact error messages and stack traces.

I have changed the component to general since this seems to be a problem in the DB access layer (maybe in MySQL-python) rather than the frontend (and there's no DB access layer component).

comment:34 by Christian Boos, 18 years ago

Can you try this patch?

Index: trac/db/mysql_backend.py
===================================================================
--- trac/db/mysql_backend.py	(revision 4994)
+++ trac/db/mysql_backend.py	(working copy)
@@ -182,6 +182,7 @@
 
     def rollback(self):
         if self.cnx.ping():
+            self._set_character_set(self.cnx, 'utf8')
             self.cnx.rollback()
         else:
             self._is_closed = True

Warning: it only applies revisions ≥ r4962, as this changeset modified that method.

If this works, then it should make it in time for 0.10.4, of course.

comment:35 by evantdster@…, 18 years ago

I have applied a variation of that patch. Also, I had MyISAM tables and converted them to InnoDB. I'll report tomorrow if it's successful. Can you explain why the _set_character_set would need to be called again before a rollback? I figured it would only need to be called once at the start. Do you have any idea how the charset for the connection is getting reset (since that *seems* to be the problem)? Or am I misunderstanding?

in reply to:  35 comment:36 by Christian Boos, 18 years ago

Replying to evantdster@gmail.com:

Can you explain why the _set_character_set would need to be called again before a rollback?

It's rather that it should perhaps be called after a ping. The ping not only tests if a connection is still alive, but it will reconnect if it has become stalled (see #3645). It could well be that after this reconnection, the charset setting is lost, and that would explain the behavior described in this ticket (i.e. it initially works, and only after some hours or a day it fails, see comments 2, 8, 14, 16, 24 and your own report).
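
The suspected failure mode can be modelled without a database. In the sketch below (Python 3 syntax), `FakeConnection` is an invented stand-in, not the MySQLdb API: its `ping()` silently replaces a stalled session with a fresh one that starts out with the server default charset. `CharsetKeepingConnection` illustrates the idea behind the patch above, namely re-applying the charset after every ping.

```python
class FakeConnection:
    """Invented stand-in for a DB connection; NOT the MySQLdb API."""
    SERVER_DEFAULT = "latin1"   # assumed server-side default

    def __init__(self, charset):
        self.charset = charset
        self.stalled = False

    def ping(self):
        # Like an auto-reconnecting ping: a stalled connection is replaced
        # by a fresh session, which starts with the server default charset.
        if self.stalled:
            self.charset = self.SERVER_DEFAULT
            self.stalled = False
        return True

    def set_character_set(self, charset):
        self.charset = charset


class CharsetKeepingConnection(FakeConnection):
    def __init__(self, charset):
        super().__init__(charset)
        self.wanted_charset = charset

    def ping(self):
        alive = super().ping()
        # Re-apply unconditionally: cheap, and correct after a reconnect.
        self.set_character_set(self.wanted_charset)
        return alive


plain = FakeConnection("utf8")
fixed = CharsetKeepingConnection("utf8")
for cnx in (plain, fixed):
    cnx.stalled = True    # simulate the server dropping an idle connection
    cnx.ping()
print(plain.charset, fixed.charset)
```

In this model the plain connection ends up on the server default after the simulated timeout, while the wrapped one keeps utf8, which would be consistent with errors that only appear after hours of idle time.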

comment:37 by evantdster@…, 18 years ago

This was pretty much the first thing I read this morning, and it totally makes my day. cboos, you are awesome! Thanks very much for the explanation. It makes perfect sense. Even better, it seems to be right. After being left alone all night, my trac instance is still running excellently.

This seems like a bug in MySQL-python, since it supports specifying the connection encoding when you create it, but doesn't preserve it when it automatically recreates it for you. Looking into the MySQL-python code, it looks like it could be a little complicated to implement since the charset stuff is all in the Python layer, while ping is all in the C layer (though I've never done any C Python extensions, so for all I know, it could be easy).

This definitely seems to fix the problem for me, so I'd love to see it committed!

comment:38 by Christian Boos, 18 years ago

Keywords: weird removed
Milestone: 0.10.5 → 0.10.4
Owner: changed from Jonas Borgström to Christian Boos
Status: new → assigned

Ok, thanks a lot for the feedback.

comment:39 by Christian Boos, 18 years ago

Resolution: fixed
Status: assigned → closed

I applied that patch in r5022 (trunk) and r5024 (0.10-stable).

If for some people the problem persists with 0.10.4, don't hesitate to reopen this ticket!

comment:40 by evantdster@…, 18 years ago

I'm leaving this closed, but I thought I'd note that I filed a bug against MySQL-python about this. See http://sourceforge.net/tracker/index.php?func=detail&aid=1680931&group_id=22307&atid=374932

comment:41 by Jörg Wendland, 18 years ago

today I just started working and didn't notice my trac not to work as usual, because it worked! I didn't have to restart apache.

what did I do?

yesterday I installed MySQL-python-1.2.2-win32 for Python 2.4 - thanks to cboos!

as of now it seems to work, and to fix the problem with the bindings discussed above!

comment:42 by Christian Boos, 18 years ago

You're welcome ;-)

By the way, I noticed that there are .eggs for windows available in the Cheese shop: http://cheeseshop.python.org/pypi/MySQL-python/1.2.2

comment:43 by evantdster@…, 18 years ago

Resolution: fixed
Status: closed → reopened

Ah nuts! I have recurrence with Trac 0.11dev-r5022.

I get this when viewing wiki/TracInstall:

2007-03-15 11:19:49,743 Trac[main] ERROR: 'utf8' codec can't decode byte 0x93 in position 9798: unexpected code byte
Traceback (most recent call last):
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/web/main.py", line 429, in dispatch_request
    dispatcher.dispatch(req)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/web/main.py", line 217, in dispatch
    resp = chosen_handler.process_request(req)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/wiki/web_ui.py", line 104, in process_request
    page = WikiPage(self.env, pagename, version, db)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/wiki/model.py", line 36, in __init__
    self._fetch(name, version, db)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/wiki/model.py", line 57, in _fetch
    (name,))
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/db/util.py", line 50, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/db/util.py", line 50, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
UnicodeDecodeError: 'utf8' codec can't decode byte 0x93 in position 9798: unexpected code byte

And I get this when viewing any ticket:

2007-03-15 11:19:27,982 Trac[main] ERROR: (1267, "Illegal mix of collations (utf8_general_ci,IMPLICIT) and (latin1_swedish_ci,COERCIBLE) for operation 'UNION'")
Traceback (most recent call last):
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/web/main.py", line 429, in dispatch_request
    dispatcher.dispatch(req)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/web/main.py", line 217, in dispatch
    resp = chosen_handler.process_request(req)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/ticket/web_ui.py", line 118, in process_request
    return self._process_ticket_request(req)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/ticket/web_ui.py", line 372, in _process_ticket_request
    self._insert_ticket_data(context, data, get_reporter_id(req, 'author'))
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/ticket/web_ui.py", line 825, in _insert_ticket_data
    for change in self.grouped_changelog_entries(ticket, context.db):
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/ticket/web_ui.py", line 893, in grouped_changelog_entries
    changelog = ticket.get_changelog(when=when, db=db)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/ticket/model.py", line 300, in get_changelog
    "ORDER BY time", (self.id,  str(self.id), self.id))
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/db/util.py", line 50, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/Trac-0.11dev_r5022-py2.5.egg/trac/db/util.py", line 50, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/MySQLdb/cursors.py", line 163, in execute
    self.errorhandler(self, exc, value)
  File "/workplace/spectre-svk/edower/Traction/trunk/build/Traction/lib/python2.5/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
OperationalError: (1267, "Illegal mix of collations (utf8_general_ci,IMPLICIT) and (latin1_swedish_ci,COERCIBLE) for operation 'UNION'")

It definitely seems like my DB connection is getting recreated without setting the charset again.
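
The byte 0x93 in the first error is itself a hint. A small check (Python 3 syntax, sample text invented) shows that 0x93 is the left curly double quote in Windows-1252, which is the encoding MySQL documents its latin1 charset as (cp1252 West European), and that the same byte can never start a UTF-8 sequence. This does not prove where the bytes came from; it only shows they look like latin1/cp1252 data read over a connection expected to deliver utf8.

```python
import unicodedata

raw = b"see \x93Installing Trac\x94"   # invented sample, cp1252 curly quotes

as_cp1252 = raw.decode("cp1252")
print(unicodedata.name(as_cp1252[4]))   # name of the character behind 0x93

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("utf-8 fails at offset", exc.start, "on byte", hex(raw[exc.start]))
```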

comment:44 by evantdster@…, 18 years ago

I added a method to override ping in MySQL-python and I'll let you know if it works for me:

    def ping(self):
        """Ping the DB, reconnecting if necessary. Also set_character_set
        because it will have been forgotten if we had to reconnect."""
        _mysql.connection.ping(self)
        self.set_character_set(self.string_decoder.charset)

in reply to:  43 comment:45 by anonymous, 18 years ago

Replying to evantdster@gmail.com:

Ah nuts! I have recurrence with Trac 0.11dev-r5022.

What about trying with MySQL-python 1.2.2?

comment:46 by evantdster@…, 18 years ago

In an attempt to make this easier to reproduce, I ran:

SET GLOBAL wait_timeout=5, interactive_timeout=5;

as root on my mysql instance and then restarted tracd. The idea is that this should make my connection expire in 5 seconds, the new connection (implicitly created by ping) would not use the utf8 charset, and I would see the errors again.

First I wanted to make sure that I could reproduce the problem, then I would verify the solution(s). In order to verify that I could reproduce it, I commented out my added MySQL-python 'ping' override method and also commented out the self._set_character_set(self.cnx, 'utf8') line in the rollback method in mysql_backend.py. When I use the mysql client, I can verify that my connection times out very quickly. And yet I'm unable to reproduce the errors in Trac.

I also verified using the mysql client that my charset info does get lost (and not automatically reset for me) when the connection times out. Obviously one workaround would be to run the mysql server with utf8 as its default for everything for people who have that option. And I also verified with the mysql client (using 'show processlist;') that trac's connection disappears after 5 seconds.

And I'll try MySQL-python 1.2.2 as well, but I would certainly feel better about a fix if I could consistently reproduce the problem.

comment:47 by evantdster@…, 18 years ago

Upgrading to MySQL-python 1.2.2 did not fix the problem for me. Has anyone been able to reproduce the problem without waiting several hours? This is frustrating to debug largely because I can only try one change per day (and then wait a day to see if the error shows up again), so if anyone has advice on how I can accelerate it, I would greatly appreciate it.

comment:48 by techtonik <techtonik@…>, 18 years ago

In my case the problems you have here were caused by wrong collation used in DB, in tables and in columns. The errors about undecoded symbols appeared on localized pages and even though DB was altered to use utf8 collation, tables and columns still had windows-… in their definition.

If you want international symbols to be present in your Trac you need to be sure that DB, tables and columns all have unicode collation or default to unicode.

I tried to convert tables from one of the windows-… encodings to utf8, but found out that Trac is not going to work with MySQL 4.1.x, because after conversion some of the indexes in utf8 tables become wider than the maximum size allowed in 4.1.x.

So the only solution was to upgrade to 5.0.x and make sure that DB and all its tables and columns have utf8 generic collation.

comment:49 by Emmanuel Blot, 18 years ago

#5018 marked as a duplicate

comment:50 by evantdster@…, 18 years ago

Resolution: fixed
Status: reopened → closed

FWIW, I can no longer reproduce this, even if I let it marinate over the weekend, so I'm going to close this and assume that if someone else still has the problem they'll reopen it.

comment:51 by techtonik <techtonik@…>, 18 years ago

In case this bug is reopened (though it seems better to create a new one), we should collect all the necessary version information concerning:

Trac 0.10.x
Python 2.x.x
DataBase MySQL x.x.x
Server Apache 2.x.x
mod_python x.x.x
MySQL-python x.x.x

For MySQL, also include the result of the following queries to check the database charset:

USE trac;
SHOW VARIABLES LIKE '%character%';

And attach a dump of the database structure to check the table collations:

mysqldump.exe -d trac > trac.sql
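Once the structure dump exists, it can be checked mechanically instead of by eye. The following is a minimal sketch that scans the dump text for table-level `DEFAULT CHARSET=` and column-level `CHARACTER SET` declarations and reports any that are not utf8. The function name `non_utf8_charsets` and the sample dump are hypothetical, and the regular expressions assume the usual mysqldump layout with one declaration per line.

```python
import re

TABLE_RE = re.compile(r'CREATE TABLE `(?P<table>[^`]+)`')
CHARSET_RE = re.compile(
    r'(?:DEFAULT CHARSET|CHARACTER SET)[= ](?P<charset>\w+)', re.IGNORECASE)


def non_utf8_charsets(dump_text):
    """Return (table, charset) pairs for every non-utf8 charset declaration."""
    findings = []
    table = None
    for line in dump_text.splitlines():
        match = TABLE_RE.search(line)
        if match:
            table = match.group('table')
        for decl in CHARSET_RE.finditer(line):
            charset = decl.group('charset').lower()
            if not charset.startswith('utf8'):
                findings.append((table, charset))
    return findings


# Hypothetical dump fragment, for illustration only.
sample = """
CREATE TABLE `wiki` (
  `name` text NOT NULL,
  `comment` text character set latin1,
  PRIMARY KEY  (`name`(166),`version`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE `ticket` (
  `summary` text
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
"""
print(non_utf8_charsets(sample))
```

In real use the dump would be read from the `trac.sql` file produced above. A column-level override inside a utf8 table is reported as well, which is the case that is easy to miss.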

comment:52 by anupam@…, 18 years ago

Resolution: fixed
Status: closed → reopened

I'm getting this error

Traceback (most recent call last):
  File "/usr/local/lib/python2.4/site-packages/trac/web/main.py", line 406, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/local/lib/python2.4/site-packages/trac/web/main.py", line 237, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/local/lib/python2.4/site-packages/trac/wiki/web_ui.py", line 103, in process_request
    latest_version = WikiPage(self.env, pagename, None, db).version
  File "/usr/local/lib/python2.4/site-packages/trac/wiki/model.py", line 32, in __init__
    self._fetch(name, version, db)
  File "/usr/local/lib/python2.4/site-packages/trac/wiki/model.py", line 53, in _fetch
    (name,))
  File "/usr/local/lib/python2.4/site-packages/trac/db/util.py", line 50, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
  File "/usr/local/lib/python2.4/site-packages/trac/db/util.py", line 50, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
  File "build/bdist.linux-i686/egg/MySQLdb/cursors.py", line 166, in execute
  File "build/bdist.linux-i686/egg/MySQLdb/connections.py", line 35, in defaulterrorhandler
UnicodeDecodeError: 'utf8' codec can't decode byte 0x93 in position 1804: unexpected code byte

It started happening when accessing a wiki page I had created, after I added 3 GIF attachments to it, and it doesn't go away even after I rolled back my edit.
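Byte 0x93 cannot start a UTF-8 sequence, but it is the left curly double quote in Windows-1252. That suggests text pasted from a Windows application was stored without conversion. This is an assumption, not something confirmed in this ticket. The sketch below uses a hypothetical sample string to show why the utf8 decoder rejects such bytes and how they read once the right source encoding is named.

```python
# Hypothetical sample: curly quotes as a Windows-1252 client would send them.
raw = b'He said \x93hello\x94'

try:
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    # exc.start is the offset of the first offending byte.
    print('utf-8 decoding failed at byte offset', exc.start)

# Decoding with the encoding the bytes were actually written in succeeds.
text = raw.decode('cp1252')
print(repr(text))

# Re-encoding gives valid UTF-8 that can be stored and read back safely.
clean = text.encode('utf-8')
print(clean.decode('utf-8') == text)
```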

I'm using
Trac 0.10.4
Python 2.4.3
MySQL 4.1.21-standard
Server tracd
MySQL-python 1.2.2

mysql> SHOW VARIABLES LIKE '%character%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
7 rows in set (0.01 sec)

DROP TABLE IF EXISTS `wiki`;
CREATE TABLE `wiki` (
  `name` text NOT NULL,
  `version` int(11) NOT NULL default '0',
  `time` int(11) default NULL,
  `author` text,
  `ipnr` text,
  `text` text,
  `comment` text,
  `readonly` int(11) default NULL,
  PRIMARY KEY  (`name`(166),`version`),
  KEY `wiki_time_idx` (`time`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

comment:53 by anonymous, 17 years ago

You could try to set the default encoding for python to utf8 (instead of ascii). Go to the python directory (/usr/lib/python2.4), edit file site.py and change the default in the function setencoding.

in reply to:  52 comment:54 by sid, 17 years ago

Resolution: fixed
Status: reopened → closed

Replying to anupam@bunchball.com:

I'm getting this error UnicodeDecodeError: 'utf8' codec can't decode byte 0x93 in position 1804: unexpected code byte

As pointed out in comment:48, you need to:

I tried to convert tables from one of the windows-… encodings to utf8, but found that Trac is not going to work with MySQL 4.1.x, because after conversion some of the indexes in the utf8 tables become wider than the maximum size allowed in 4.1.x.

So the only solution was to upgrade to 5.0.x and make sure that DB and all its tables and columns have utf8 generic collation.

Please reopen if you've done that and you're still having the problem. Thanks!

comment:55 by hagnat, 17 years ago

OK, as I pointed out in #6495, I was having the utf8 problem with my local Trac. But soon after I changed my MySQL tables to use utf8 and utf8_general_ci and restarted tracd, Trac was working fine and we were all happy here.

But when I tried to access our local Trac today, the same bug was there. The tables were fine, and simply restarting tracd was enough to make Trac work as before.

character_set_database is utf8, character_set_server is latin1

I am a little confused now: what could be wrong with my Trac? (Besides running on a Windows 2003 server with IIS and MySQL 4.x.)

comment:56 by greubel@…, 17 years ago

Priority: highest → low
Resolution: fixed
Status: closed → reopened
Version: 0.10b1 → 0.10.4

This problem still exists on a system with following configurations/packages:

+--------------------------+-----------------------------------+
| Variable_name            | Value                             |
+--------------------------+-----------------------------------+
| character_set_client     | latin1                            |
| character_set_connection | latin1                            |
| character_set_database   | utf8                              |
| character_set_filesystem | binary                            |
| character_set_results    | latin1                            |
| character_set_server     | latin1                            |
| character_set_system     | utf8                              |
| character_sets_dir       | /opt/mysql5/share/mysql/charsets/ |
+--------------------------+-----------------------------------+

The tables themselves use the utf8_unicode_ci collation. The MySQL version is 5.0.37.

This error seems to occur when German umlauts are used in the comment field for the wiki.

I already tried the hint to modify the default encoding in /usr/lib/python2.4/site.py, but the behaviour did not change. For now I am recommending that my colleagues not use the comment field.

Thanks for hints, kind regards and very nice christmas to all.

Maik

comment:57 by anonymous, 17 years ago

Owner: changed from Christian Boos to mondher
Status: reopened → new

comment:58 by anonymous, 17 years ago

Owner: changed from mondher to Christian Boos

comment:59 by jd@…, 16 years ago

Keywords: postgresql added

I can duplicate this with postgresql as well

comment:60 by Christian Boos, 16 years ago

Milestone: 0.10.4
Resolution: worksforme
Status: new → closed

Since 0.11.1 MySQL support has much improved, so one should be at least using that version before reporting any (new) issue about encoding and MySQL.

See the MySqlDb page for details.

Similar encoding problems with PostgreSQL should, if needed, be reported in another ticket, but the same warning applies: be sure to use a recent Trac (0.11.1 or above) and a recent version of the PsycoPg2 bindings (2.0.6 or above).

comment:61 by Ryan J Ollos, 10 years ago

Keywords: postgresql mysql trac wiki utf8 sqlite unicode needinfo → postgresql, mysql, trac, wiki, utf8, sqlite, unicode, needinfo
