bugzilla2trac.py chokes on Unicode (UTF-8) or localized data
|Reported by:||Owned by:|
|Severity:||major||Keywords:||patch, windows, unicode, pysqlite, CJK|
Environment: Windows 7, Python 2.7, Trac 0.12, MySQL-python-1.2.3, MySQL Essential 5.0.90, bugzilla2trac.py
I am migrating from a Bugzilla instance whose bugs are written in Chinese and whose MySQL database is encoded in UTF-8. The Trac instance uses an SQLite database.
At first, bugzilla2trac finished very quickly without importing any bugs, and it showed the product names as question marks. After I added charset='utf8' to the MySQL connection parameters, the product names were displayed correctly.
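For anyone wanting to try the same change, here is a minimal sketch of passing charset='utf8' to MySQLdb. The helper names bugzilla_connect_kwargs and connect_to_bugzilla are hypothetical and are not taken from bugzilla2trac.py; the script's real connect call may pass other arguments as well. As far as I know, MySQL-python treats charset as implying use_unicode=True, but the sketch sets it explicitly.

```python
def bugzilla_connect_kwargs(host, user, passwd, db):
    """Build keyword arguments for MySQLdb.connect() (hypothetical helper)."""
    return {
        'host': host,
        'user': user,
        'passwd': passwd,
        'db': db,
        'charset': 'utf8',    # ask the server to send results as UTF-8
        'use_unicode': True,  # return text columns as unicode objects
    }


def connect_to_bugzilla(host, user, passwd, db):
    """Open the Bugzilla database with UTF-8 enabled (hypothetical helper)."""
    # MySQL-python is imported lazily so the helper above works without it.
    import MySQLdb
    return MySQLdb.connect(**bugzilla_connect_kwargs(host, user, passwd, db))
```

The host, user, password, and database name are whatever the Bugzilla installation uses; none of them are given in this ticket.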
Then, at the beginning of '7. Import bugs and bug activity…', bugzilla2trac failed with the following traceback.
Traceback (most recent call last):
  …
  File "C:\Users\jackqq\Desktop\bugzilla2trac.py", line 301, in setComponentList
    comp['owner'].encode('utf-8')))
  …
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.
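I then removed all the calls to encode('utf-8') from bugzilla2trac.py, and the import started to run successfully. Presumably the values read from MySQL were already Unicode strings once charset='utf8' was set, so encoding them produced the 8-bit bytestrings that sqlite3 rejects.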
Hope this tip helps those with CJK chars in their Bugzilla DBs.
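The workaround amounts to handing SQLite text (Unicode) strings rather than UTF-8 encoded bytestrings. A rough sketch of that idea follows; the component table here is a simplified stand-in for Trac's schema, and to_text and insert_component are hypothetical helpers, not functions from bugzilla2trac.py.

```python
# -*- coding: utf-8 -*-
import sqlite3


def to_text(value, encoding='utf-8'):
    """Return value as a text (Unicode) string, decoding bytes if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def insert_component(conn, name, owner):
    """Insert a row, always passing text strings to SQLite."""
    conn.execute("INSERT INTO component (name, owner) VALUES (?, ?)",
                 (to_text(name), to_text(owner)))


conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE component (name TEXT, owner TEXT)")

# Unicode input is passed through unchanged.
insert_component(conn, u'\u6d4b\u8bd5', u'\u5f20\u4e09')
# UTF-8 encoded bytes are decoded back to text before the insert.
insert_component(conn, u'\u6d4b\u8bd5'.encode('utf-8'), u'owner')

rows = conn.execute(
    "SELECT name, owner, typeof(name) FROM component").fetchall()
```

Decoding in one place instead of deleting every encode() call is only one possible design; it keeps the script tolerant of both kinds of input.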