Opened 17 years ago

Closed 16 years ago

Last modified 15 years ago

#5607 closed defect (worksforme)

RestructuredText preview doesn't handle utf-8

Reported by: Dave Abrahams <dave@…>
Owned by: Christian Boos
Priority: normal
Milestone:
Component: wiki system
Version: devel
Severity: normal
Keywords:
Cc:
Branch:
Release Notes:
API Changes:
Internal Changes:

Description

Check a utf-8 ReST document containing Unicode curly quotes into svn, look at the document in the browser, and you see garbage characters. Isn't there some way to detect the encoding automatically? Emacs does it most of the time.

Attachments (1)

utf8.rst (14 bytes) - added by Dave Abrahams <dave@…> 16 years ago.
ReST file with utf-8 curly quotes

Change History (7)

comment:1 by Remy Blank, 16 years ago

Keywords: needinfo added

Is this still an issue with the current 0.11.1 version and Pygments? If yes, could you please attach a sample ReST file that shows the problem?

by Dave Abrahams <dave@…>, 16 years ago

Attachment: utf8.rst added

ReST file with utf-8 curly quotes

comment:2 by Dave Abrahams <dave@…>, 16 years ago

Resolution: → fixed
Status: new → closed

Appears to be fixed as the attachment shows.

comment:3 by Remy Blank, 16 years ago

Keywords: needinfo removed

Actually, this is a configuration issue. When no charset information is available to display a text file, Trac uses the [trac] default_charset configuration option to convert the file to utf-8. This site is most probably configured with default_charset=utf-8, hence the attachment is displayed properly. Changing the setting to default_charset=iso-8859-15 (the default) will show the problem you describe.
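
To make the effect concrete, here is a minimal Python sketch of what goes wrong when utf-8 bytes are decoded with the wrong fallback charset. It is not Trac's actual code; the helper name `to_text` is hypothetical and only stands in for "decode the file using the configured default charset".

```python
# The utf-8 bytes for the word "hi" wrapped in curly quotes (U+201C, U+201D).
raw = "\u201chi\u201d".encode("utf-8")


def to_text(data, default_charset):
    """Hypothetical helper: decode bytes with a fallback charset,
    as a viewer without any charset information would have to do."""
    return data.decode(default_charset, errors="replace")


good = to_text(raw, "utf-8")           # the curly quotes survive
garbled = to_text(raw, "iso-8859-15")  # each quote becomes three junk characters

print(repr(good))
print(repr(garbled))
```

Each curly quote is three bytes in utf-8, so decoding with a single-byte charset such as iso-8859-15 turns it into three unrelated characters, which is the "garbage" described in the ticket.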

The ticket description mentions files checked into SVN, though. If for some reason you can't set default_charset=utf-8 on your site, you can add an svn:mime-type property to your files and specify the charset. For example, a ReST file would have the following MIME type:

text/x-rst;charset=utf-8

This will override the default_charset setting.
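
As a rough illustration of that precedence, the following Python sketch picks the charset from a MIME type string and falls back to a default when none is given. The function `charset_from_mime_type` is a hypothetical name for illustration, not part of Trac's API, and the parsing is deliberately simplistic.

```python
def charset_from_mime_type(mime_type, default_charset="iso-8859-15"):
    """Hypothetical helper: return the charset parameter of a MIME type
    string such as 'text/x-rst;charset=utf-8', or the default if absent."""
    if not mime_type:
        return default_charset
    # Parameters follow the media type and are separated by ';'.
    for param in mime_type.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            return value.strip().strip('"')
    return default_charset


print(charset_from_mime_type("text/x-rst;charset=utf-8"))  # explicit charset wins
print(charset_from_mime_type("text/x-rst"))                # falls back to the default
```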

There is currently no way of doing the same for attachments, although it has been requested in #7724.

comment:4 by Dave Abrahams <dave@…>, 16 years ago

Awesome; that worked! Thanks for the explanation.

Two follow-up questions:

  1. Would utf-8 be a superior default?
  2. Is this information documented somewhere?

comment:5 by Remy Blank, 16 years ago (in reply to comment:4)

Replying to Dave Abrahams <dave@…>:

  1. Would utf-8 be a superior default?

It depends on what encoding most of your files use. Setting the default to that encoding leaves you fewer files to "tag" with an svn:mime-type property.

Personally, I don't understand why everybody isn't using utf-8 already. I can't see a downside.

  2. Is this information documented somewhere?

default_charset is documented in TracIni. Using svn:mime-type with a charset was discussed on the SVN developer mailing list some time ago, but I couldn't find any mention of it in the documentation.

And so you are warned: you'll not be able to set the charset in the [auto-props] section of your SVN configuration, as ';' is used to separate properties in that file (see this post). You'll have to set the property manually with svn pset. One more reason to set a sensible default_charset.
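
The problem can be sketched in Python under the assumption that an auto-props value is simply split on ';'. The function `split_auto_props` is a hypothetical, simplified model for illustration, not Subversion's actual parser.

```python
def split_auto_props(value):
    """Hypothetical, naive model of auto-props parsing:
    split the value on ';' and treat each piece as name=value."""
    props = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, val = item.partition("=")
        props[name.strip()] = val.strip()
    return props


# The intended single property ends up as two separate ones.
parsed = split_auto_props("svn:mime-type=text/x-rst;charset=utf-8")
print(parsed)
```

Under this model the charset never reaches the svn:mime-type value; it is read as a separate property named charset, which is why the property has to be set by hand.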

I'll add a section to TracBrowser about svn:mime-type.

comment:6 by Christian Boos, 15 years ago

Resolution: fixed → worksforme
