Ticket #2972 (closed defect: fixed)
Opened 6 years ago
Last modified 6 years ago
Error with non-ASCII project description
| Reported by: | eblot | Owned by: | cboos |
|---|---|---|---|
| Priority: | high | Milestone: | 0.10 |
| Component: | general | Version: | devel |
| Severity: | major | Keywords: | unicode |
| Cc: | | | |
| Release Notes: | | | |
| API Changes: | | | |
Description
If the project description in the configuration file (conf\trac.ini) contains non-ASCII characters, such as:
[project]
; ...
descr = Tést
Trac [3094] fails with the following error:
Python traceback
Traceback (most recent call last):
  File "trac\web\main.py", line 299, in dispatch_request
    dispatcher.dispatch(req)
  File "trac\web\main.py", line 153, in dispatch
    populate_hdf(req.hdf, self.env, req)
  File "trac\web\main.py", line 92, in populate_hdf
    hdf['project'] = {
  File "trac\web\clearsilver.py", line 194, in __setitem__
    self.set_value(name, value, True)
  File "trac\web\clearsilver.py", line 236, in set_value
    add_value(name, value)
  File "trac\web\clearsilver.py", line 228, in add_value
    add_value('%s.%s' % (prefix, k), value[k])
  File "trac\web\clearsilver.py", line 218, in add_value
    self.hdf.setValue(prefix, markup.escape(value))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
It seems the trac.ini file encoding is irrelevant. At least, the same trouble occurs with iso-8859-1 and utf-8 encodings.
The trouble occurs with the project description, but it probably occurs with other fields as well.
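The traceback suggests that the description reached the template layer as a Unicode string and was then pushed through the strict ASCII codec. Here is a minimal sketch of that failure mode, written for Python 3 (the ticket itself ran on Python 2, where the conversion happened implicitly). The helper name `encode_ascii_strict` is hypothetical and is not part of Trac.

```python
# Python 3 sketch of the failure mode; not Trac code.
def encode_ascii_strict(text: str) -> bytes:
    """Hypothetical helper: encode text with the strict ASCII codec."""
    return text.encode("ascii")


description = "T\u00e9st"  # "Tést", as in the trac.ini example

try:
    encode_ascii_strict(description)
    error = None
except UnicodeEncodeError as exc:
    # The exception records the codec and the offending position.
    error = exc

if error is not None:
    print(error.encoding, error.start, repr(error.object[error.start]))
```

The character at index 1 is the one the ASCII codec rejects, which matches the "position 1" reported in the ticket.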
Attachments
Change History
comment:1 Changed 6 years ago by cboos
- Milestone set to 0.10
- Owner changed from jonas to cboos
- Priority changed from normal to high
- Status changed from new to assigned
comment:2 Changed 6 years ago by eblot
Is this option still necessary? (I mean, can't UTF-8 always be used?)
comment:3 Changed 6 years ago by cboos
default_charset is used when there's no other clue to
determine what charset is used by some external content
(like an attachment file, a source file in the repository,
etc.) so I think using it for the TracIni file itself is
a good fit.
Note: using iso-8859-15 or similar here rather than utf-8
has the nice property of making a file always decodable,
as this is a fixed-width, single-byte encoding.
If you use utf-8 for that config option, you'll likely run into
trouble whenever you handle anything but ASCII or UTF-8.
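The difference between the two charsets can be illustrated with a short Python 3 sketch (an illustration only, not Trac code): every possible byte value decodes under iso-8859-15, whereas strict UTF-8 decoding rejects byte sequences that are not valid UTF-8.

```python
# Python 3 sketch; variable names are illustrative.

# Every one of the 256 byte values maps to a character in iso-8859-15,
# so decoding arbitrary bytes with it never raises.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("iso-8859-15")

# The same text encoded as iso-8859-15 is not valid UTF-8.
latin_bytes = "T\u00e9st".encode("iso-8859-15")  # b'T\xe9st'
try:
    latin_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

Decoding without an error does not mean the result is the intended text; it only means the operation cannot fail.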
Changed 6 years ago by cboos
- Attachment text_to_unicode_r3109.diff added
Possible fix for the issue: treat trac.ini as a UTF-8 encoded file, or else as a file encoded using the system encoding
comment:4 Changed 6 years ago by cboos
Well, finally in the attachment:text_to_unicode_r3109.diff
I took the approach of defaulting to UTF-8 and, if that doesn't
work, falling back to the system encoding, after a suggestion from cmlenz.
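The approach described here (try UTF-8 first, then fall back to the system encoding) could look roughly like the following Python 3 sketch. This is not the code from the attached patch; the function name `to_text` and its `fallback` parameter are assumptions made for illustration.

```python
import locale
from typing import Optional


def to_text(raw: bytes, fallback: Optional[str] = None) -> str:
    """Hypothetical helper: decode as UTF-8, else use a fallback charset.

    If no fallback is given, the system's preferred encoding is used.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        encoding = fallback or locale.getpreferredencoding(False)
        # errors="replace" keeps the call from raising on undecodable bytes.
        return raw.decode(encoding, errors="replace")


utf8_value = to_text("T\u00e9st".encode("utf-8"))
latin_value = to_text(b"T\xe9st", fallback="iso-8859-1")
```

Trying UTF-8 first is reasonably safe because non-ASCII text in a legacy single-byte encoding rarely happens to form valid UTF-8 sequences, so a successful strict decode is a strong hint that the file really is UTF-8.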
comment:5 Changed 6 years ago by cboos
- Resolution set to fixed
- Status changed from assigned to closed
This should now be fixed in r3118.



Thanks for having tested that. I suspected problems like this
from looking at some of the (various) backtraces in #2905.
I think it should be OK to require that the trac.ini file
should have the same encoding as the one specified in