Opened 18 years ago
Closed 17 years ago
#5169 closed defect (wontfix)
UTF8 Conversion Error
Reported by: | Owned by: | Jonas Borgström | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | search system | Version: | 0.10.3.1 |
Severity: | normal | Keywords: | needinfo |
Cc: | Branch: | ||
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Description (last modified by )
We updated our Trac to 0.10.3.1 and moved it to a new machine, but since then we have the following error if we search for some word like "version":
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/trac/web/main.py", line 387, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.4/site-packages/trac/web/main.py", line 237, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/lib/python2.4/site-packages/trac/Search.py", line 181, in process_request
    results += list(source.get_search_results(req, terms, filters))
  File "/usr/lib/python2.4/site-packages/trac/ticket/api.py", line 267, in get_search_results
    for summary, desc, author, keywords, tid, date, status in cursor:
  File "/usr/lib/python2.4/site-packages/trac/db/util.py", line 40, in __iter__
    row = self.cursor.fetchone()
  File "/usr/lib/python2.4/site-packages/trac/db/sqlite_backend.py", line 73, in fetchone
    return row and self._convert_row(row) or None
  File "/usr/lib/python2.4/site-packages/trac/db/sqlite_backend.py", line 69, in _convert_row
    return tuple([(isinstance(v, str) and [v.decode('utf-8')] or [v])[0]
  File "/usr/lib/python2.4/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa7 in position 174: unexpected code byte
Attachments (0)
Change History (6)
comment:1 by , 18 years ago
Description: | modified (diff) |
---|
follow-up: 3 comment:2 by , 18 years ago
We updated from 0.8.2 (I think, not sure atm).
I resynced the Database. It was advised here: http://trac.edgewall.org/wiki/TracUpgrade
How do I sanitize the DB?
comment:3 by , 18 years ago
Replying to bodo.tasche@lexisnexis.de:
I resynced the Database. It was advised here: http://trac.edgewall.org/wiki/TracUpgrade
There are three different actions:
- mandatory: trac-admin upgrade, to upgrade the database schema. A new version of Trac would not run without such an upgrade.
- optional: trac-admin wiki upgrade, to upgrade the default wiki pages. This is useful to get up-to-date documentation in your wiki pages.
- optional: trac-admin resync, to rebuild the content of the repository cache, which is stored in the DB.
I was referring to trac-admin resync: this would rebuild the cache, which may help fix the issue if the source of the non-UTF8 data is one of the SVN log messages of your repository.
You need to find the source of the non-UTF8 characters: it could be in a log message, in a ticket, or in a wiki page. One way to track down the issue is to use or create a user that is given permissions for only a subset of the Trac features, and search for a term: Trac only searches where the user has access. For example, if you remove the CHANGESET_VIEW permission for a user, Trac won't search in the SVN log messages.
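As an alternative to narrowing things down through permissions, the database file can be scanned directly. The following is only a sketch, written for a current Python 3 rather than the Python 2.4 shown in the traceback; the function name find_non_utf8 is made up for illustration, and it assumes every table has an implicit rowid. It makes no assumption about Trac's schema and simply walks every table it finds. BLOB values are also returned as bytes, so they are checked too.

```python
import sqlite3


def find_non_utf8(db_path):
    """Return (table, column, rowid) for each stored value that is not valid UTF-8.

    Hypothetical helper: not part of Trac.
    """
    conn = sqlite3.connect(db_path)
    # Hand back raw bytes so invalid sequences do not raise while fetching.
    conn.text_factory = bytes
    hits = []
    try:
        tables = [row[0].decode("utf-8") for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        for table in tables:
            # Quote the identifier; double any embedded double quotes.
            quoted = '"' + table.replace('"', '""') + '"'
            columns = [row[1].decode("utf-8") for row in conn.execute(
                "PRAGMA table_info(%s)" % quoted)]
            for row in conn.execute("SELECT rowid, * FROM %s" % quoted):
                rowid, values = row[0], row[1:]
                for column, value in zip(columns, values):
                    if not isinstance(value, bytes):
                        continue  # integers, floats, NULL
                    try:
                        value.decode("utf-8")
                    except UnicodeDecodeError:
                        hits.append((table, column, rowid))
    finally:
        conn.close()
    return hits
```

Run it against a copy of the database file rather than the live one; each hit names the table and column to inspect.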
How do I sanitize the DB?
I'm afraid you'll have to dump the SQLite DB to a file, search for non-UTF8 characters, replace them with their UTF-8 counterparts and reload the SQLite DB.
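The replacement step could be scripted roughly as follows. This is a sketch under assumptions, not a tested recovery tool: it assumes a text dump already exists (for instance one produced with the sqlite3 shell's .dump command), that the bad bytes are Windows-1252 (the actual source encoding has to be confirmed by looking at the data), and the names repair_line and sanitize_dump are invented for the example. It targets Python 3.

```python
def repair_line(raw, fallback="cp1252"):
    """Return raw unchanged if it is valid UTF-8, else re-encode it from fallback."""
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Bytes undefined in the fallback code page become U+FFFD.
        return raw.decode(fallback, errors="replace").encode("utf-8")


def sanitize_dump(src_path, dst_path, fallback="cp1252"):
    """Copy a dump file line by line, repairing invalid lines; return how many changed."""
    changed = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for raw in src:
            fixed = repair_line(raw, fallback)
            if fixed != raw:
                changed += 1
            dst.write(fixed)
    return changed
```

A line that mixes valid UTF-8 with stray Windows-1252 bytes is decoded entirely with the fallback, which garbles its UTF-8 parts, so it is worth reviewing the changed lines before reloading the dump into a fresh database.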
comment:4 by , 17 years ago
Keywords: | needinfo added |
---|
Did eblot's suggestions help you resolve the problem?
comment:5 by , 17 years ago
One thing you can try is to use wget to crawl your site and then look for which file contains a traceback. I imported a bunch of information from CVSTrac when we migrated and found that pasted emails and Word document content were problematic because they were encoded with the standard Windows-1252 code page. In that code page, 0xA7 is the section sign (like interlinked S characters). You might see what you can find that way.
Did you import data into Trac?
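To illustrate the point about 0xA7, here is a small Python 3 sketch. The sample byte string is made up; only the byte value comes from the traceback. The same byte that the UTF-8 codec rejects decodes cleanly as Windows-1252.

```python
raw = b"see \xa7 4.2"  # hypothetical text containing the byte from the traceback

error = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0xA7 can only appear as a continuation byte in UTF-8, never on its own.
    error = exc

# Windows-1252 maps 0xA7 to U+00A7, the section sign.
text = raw.decode("cp1252")
print(error.start, repr(text))
```

The start attribute of the exception gives the offset of the offending byte, which corresponds to the "position 174" reported in the original error.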
comment:6 by , 17 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Probably a problem with data imported into Trac. See above for the recovery procedure.
It looks like the original DB contains non-UTF8 characters; you'll probably have to sanitize the DB and re-import it into Trac.
Have you tried to 'resync' the repository cache?
Which version did you upgrade from?