Edgewall Software

Ticket #2972 (closed defect: fixed)

Opened 2 years ago

Last modified 2 years ago

Error with non-ASCII project description

Reported by: eblot Owned by: cboos
Priority: high Milestone: 0.10
Component: general Version: devel
Severity: major Keywords: unicode
Cc:

Description

If the project description in the configuration file (conf\trac.ini) contains non-ASCII characters, such as:

[project]
; ...
descr = Tést

Trac [3094] fails with the following error:

Python traceback

Traceback (most recent call last):
  File "trac\web\main.py", line 299, in dispatch_request
    dispatcher.dispatch(req)
  File "trac\web\main.py", line 153, in dispatch
    populate_hdf(req.hdf, self.env, req)
  File "trac\web\main.py", line 92, in populate_hdf
    hdf['project'] = {
  File "trac\web\clearsilver.py", line 194, in __setitem__
    self.set_value(name, value, True)
  File "trac\web\clearsilver.py", line 236, in set_value
    add_value(name, value)
  File "trac\web\clearsilver.py", line 228, in add_value
    add_value('%s.%s' % (prefix, k), value[k])
  File "trac\web\clearsilver.py", line 218, in add_value
    self.hdf.setValue(prefix, markup.escape(value))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)

It seems the trac.ini file encoding is irrelevant. At least, the same trouble occurs with iso-8859-1 and utf-8 encodings.

The trouble occurs with the project description, but it probably occurs with other fields as well.
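The error in the traceback can be reproduced outside Trac. The following is only a sketch, written for present-day Python 3 rather than the Python 2 of this report. It assumes the description reaches the template layer as a text string that is then converted implicitly with the ASCII codec. The helper name `encode_as_ascii` is hypothetical and is not part of Trac.

```python
def encode_as_ascii(text):
    """Mimic an implicit ASCII conversion of a text value (hypothetical helper)."""
    return text.encode("ascii")


description = "T\u00e9st"  # the value from the [project] section, written with an escape

try:
    encode_as_ascii(description)
    error = None
except UnicodeEncodeError as exc:
    # Keep a reference so the details can be inspected after the handler.
    error = exc
    print(exc.encoding, exc.start, repr(exc.object[exc.start]))
```

The failing position is the same as in the traceback above: the second character, é, has no ASCII representation, whatever encoding the file on disk used.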

Attachments

text_to_unicode_r3109.diff (3.4 kB) - added by cboos 2 years ago.
Possible fix for the issue: treat trac.ini as a UTF-8 encoded file, or as a file encoded using the system encoding

Change History

Changed 2 years ago by cboos

  • owner changed from jonas to cboos
  • priority changed from normal to high
  • status changed from new to assigned
  • milestone set to 0.10

Thanks for having tested that. I suspected problems like this from looking at some of the (various) backtraces in #2905.

I think it should be OK to require that the trac.ini file use the same encoding as the one specified in

[trac]
default_charset = ...

Changed 2 years ago by eblot

Is this option still necessary? (I mean, can't UTF-8 always be used?)

Changed 2 years ago by cboos

default_charset is used when there's no other clue to determine which charset is used by some external content (an attachment file, a source file in the repository, etc.), so I think using it for the TracIni file itself is a good fit.

Note: using iso-8859-15 or similar here rather than utf-8 has the nice property that a file can always be decoded, as this is a fixed-width, single-byte encoding. If you use utf-8 for that config option, you'll likely run into trouble whenever you handle anything but ASCII or UTF-8 content.
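The difference between the two kinds of encoding can be illustrated as follows. This is a sketch in present-day Python 3, not Trac code, and `decodes_as_utf8` is a hypothetical helper name.

```python
raw = bytes(range(256))  # every possible byte value, once each

# ISO-8859-15 is a single-byte encoding: each byte maps to exactly one
# character, so decoding arbitrary bytes does not raise.
text = raw.decode("iso-8859-15")


def decodes_as_utf8(data):
    """Return True if the bytes form valid UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


latin1_bytes = "T\u00e9st".encode("iso-8859-1")  # b'T\xe9st'
utf8_bytes = "T\u00e9st".encode("utf-8")         # b'T\xc3\xa9st'
```

Here `decodes_as_utf8(latin1_bytes)` is false while `decodes_as_utf8(utf8_bytes)` is true. Note that a successful single-byte decode only means no error is raised; if the content was really written in another encoding, the resulting characters can still be wrong.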

Changed 2 years ago by cboos

Possible fix for the issue: treat trac.ini as a UTF-8 encoded file, or as a file encoded using the system encoding

Changed 2 years ago by cboos

Well, in attachment:text_to_unicode_r3109.diff I finally took the approach of defaulting to UTF-8 and, if that doesn't work, falling back to the system encoding, following a suggestion from cmlenz.
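The approach described here, UTF-8 first and the system encoding as a fallback, might look roughly like this. It is a sketch in present-day Python 3 and does not reproduce the contents of the attached diff. The function name `decode_config_value`, its `fallback` parameter, and the use of `locale.getpreferredencoding` to stand in for "the system encoding" are all assumptions made for illustration.

```python
import locale


def decode_config_value(raw, fallback=None):
    """Decode bytes as UTF-8, else with a fallback encoding (hypothetical helper)."""
    if fallback is None:
        # Assumption: the locale's preferred encoding plays the role of
        # "the system encoding" mentioned in the comment above.
        fallback = locale.getpreferredencoding(False)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Replace undecodable bytes rather than fail on a config value.
        return raw.decode(fallback, errors="replace")


from_utf8 = decode_config_value("T\u00e9st".encode("utf-8"))
from_latin1 = decode_config_value(b"T\xe9st", fallback="iso-8859-1")
```

Trying UTF-8 first is reasonably safe because text written in a legacy single-byte encoding rarely happens to be valid UTF-8, so a successful UTF-8 decode is a strong hint that the file really is UTF-8.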

Changed 2 years ago by cboos

  • status changed from assigned to closed
  • resolution set to fixed

This should now be fixed in r3118.
