
Opened 13 years ago

Closed 13 years ago

#2972 closed defect (fixed)

Error with non-ASCII project description

Reported by: Emmanuel Blot
Owned by: Christian Boos
Priority: high
Milestone: 0.10
Component: general
Version: devel
Severity: major
Keywords: unicode
Cc: Branch:
Release Notes:
API Changes:

Description

If the project description in the configuration file (conf\trac.ini) contains non-ASCII characters, such as:

[project]
; ...
descr = Tést

Trac [3094] fails with the following error:

Python traceback

Traceback (most recent call last):
  File "trac\web\main.py", line 299, in dispatch_request
    dispatcher.dispatch(req)
  File "trac\web\main.py", line 153, in dispatch
    populate_hdf(req.hdf, self.env, req)
  File "trac\web\main.py", line 92, in populate_hdf
    hdf['project'] = {
  File "trac\web\clearsilver.py", line 194, in __setitem__
    self.set_value(name, value, True)
  File "trac\web\clearsilver.py", line 236, in set_value
    add_value(name, value)
  File "trac\web\clearsilver.py", line 228, in add_value
    add_value('%s.%s' % (prefix, k), value[k])
  File "trac\web\clearsilver.py", line 218, in add_value
    self.hdf.setValue(prefix, markup.escape(value))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)

The encoding of the trac.ini file does not seem to matter: the same error occurs whether the file is saved as iso-8859-1 or as utf-8.

The trouble occurs with the project description, but it probably occurs with other fields as well.
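
The last step of the traceback can be reproduced outside Trac. This is a minimal sketch, assuming Python 3 (the traceback above comes from Python 2, where the conversion to ASCII happened implicitly); `describe_ascii_failure` is a hypothetical helper and not part of Trac:

```python
def describe_ascii_failure(text):
    """Return the error message raised when text is encoded as ASCII, or None."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


# "Tést" as it would be stored on disk in each of the two encodings tried.
from_latin1 = b"T\xe9st".decode("iso-8859-1")
from_utf8 = b"T\xc3\xa9st".decode("utf-8")

# Once decoded correctly, both files yield the same non-ASCII text,
# so both fail in the same way when forced through the ASCII codec.
print(from_latin1 == from_utf8)
print(describe_ascii_failure(from_utf8))
print(describe_ascii_failure("Test"))
```

This would explain why the on-disk encoding makes no difference here: the failure occurs after decoding, when the text is pushed through the ASCII codec.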

Attachments (1)

text_to_unicode_r3109.diff (3.4 KB ) - added by Christian Boos 13 years ago.
Possible fix for the issue: treat trac.ini as a UTF-8 encoded file, falling back to the system encoding


Change History (6)

comment:1 by Christian Boos, 13 years ago

Milestone: 0.10
Owner: changed from Jonas Borgström to Christian Boos
Priority: changed from normal to high
Status: changed from new to assigned

Thanks for having tested that. I suspected problems like this from looking at some of the (various) backtraces in #2905.

I think it should be OK to require that the trac.ini file use the same encoding as the one specified in

[trac]
default_charset = ...

comment:2 by Emmanuel Blot, 13 years ago

Is this option still necessary? (I mean, can't UTF-8 always be used?)

comment:3 by Christian Boos, 13 years ago

default_charset is used when there's no other clue to determine what charset is used by some external content (like an attachment file, a source file in the repository, etc.) so I think using it for the TracIni file itself is a good fit.

Note: using iso-8859-15 or a similar encoding here rather than utf-8 has the nice property that any file can always be decoded, since it is a single-byte encoding in which every byte maps to a character. If you use utf-8 for that config option, you will likely run into trouble whenever you handle content that is neither ASCII nor UTF-8.
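
The difference between the two kinds of encoding can be illustrated with a short sketch, assuming Python 3; `try_decode` is a hypothetical helper and not part of Trac:

```python
def try_decode(raw, encoding):
    """Return the decoded text, or None if raw is not valid in encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


latin_bytes = "Tést".encode("iso-8859-15")  # b'T\xe9st'
utf8_bytes = "Tést".encode("utf-8")         # b'T\xc3\xa9st'

# UTF-8 rejects byte sequences that are not well-formed UTF-8.
print(try_decode(latin_bytes, "utf-8"))

# A single-byte codec accepts the UTF-8 bytes, but the result is
# mojibake rather than the original text.
print(try_decode(utf8_bytes, "iso-8859-15"))
```

Note the trade-off: the single-byte codec never raises an error, but it also cannot tell when its guess is wrong.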

by Christian Boos, 13 years ago

Attachment: text_to_unicode_r3109.diff added

Possible fix for the issue: treat trac.ini as a UTF-8 encoded file, falling back to the system encoding

comment:4 by Christian Boos, 13 years ago

In the end, in attachment:text_to_unicode_r3109.diff, I took the approach of trying UTF-8 first and falling back to the system encoding if that fails, following a suggestion from cmlenz.
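
The approach described above can be sketched as follows. This is not the code from the attached diff; it is a minimal illustration assuming Python 3, and both `decode_config_bytes` and its `fallback` parameter are hypothetical names:

```python
import locale


def decode_config_bytes(raw, fallback=None):
    """Decode raw bytes as UTF-8, falling back to another encoding on failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        if fallback is None:
            # Assumption: "system encoding" means the locale's preferred encoding.
            fallback = locale.getpreferredencoding(False)
        # errors="replace" avoids a second exception if the fallback guess is wrong too.
        return raw.decode(fallback, errors="replace")


# The same description stored as UTF-8 and as iso-8859-1.
print(decode_config_bytes(b"T\xc3\xa9st"))
print(decode_config_bytes(b"T\xe9st", fallback="iso-8859-1"))
```

Trying UTF-8 first is reasonably safe because text in a legacy single-byte encoding rarely happens to be well-formed UTF-8, so a successful UTF-8 decode is a strong hint that the guess was right.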

comment:5 by Christian Boos, 13 years ago

Resolution: fixed
Status: changed from assigned to closed

This should now be fixed in r3118.
