Ticket #7799 (closed defect: duplicate)

Opened 3 years ago

Last modified 2 years ago

listing files with utf8 filenames raises UnicodeDecodeError with certain characters

Reported by: shockwave107@…
Owned by: cboos
Priority: normal
Milestone:
Component: plugin/mercurial
Version: 0.12dev
Severity: normal
Keywords: unicode
Cc: igor.dejanovic@…
Release Notes:
API Changes:

Description

How to Reproduce

  1. Requires: a filesystem with UTF-8 filename support
  2. Create a Mercurial repository
  3. Create a file whose name contains a UTF-8 encoded character such as "ö"
     ("ö" = 0xc3 0xb6, or any other character whose UTF-8 byte sequence contains bytes outside range(128))
  4. Add, commit, and push, then navigate with the repository browser to the containing path

The same procedure with Subversion works.

Original Query and Trace

While doing a GET operation on /browser/WuS/uebung, Trac issued an internal error.

Contained File

WuS/uebung/Lös4.doc

Request parameters:

{'path': u'/WuS/uebung'}

User Agent was: Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.9 (like Gecko) (Gentoo)

Trac 0.11.1
Python 2.5.2 (r252, Oct 27 2008, 23:33:32)
[GCC 4.1.2 (Gentoo 4.1.2 p1.1)]
setuptools 0.6c8
psycopg2 2.0.2
Genshi 0.5.1
mod_python 3.3.1
Pygments 0.10
Mercurial 1.0.2
jQuery: 1.2.6
Traceback (most recent call last):
  File "/usr/lib64/python2.5/site-packages/trac/web/main.py", line 423, in _dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib64/python2.5/site-packages/trac/web/main.py", line 197, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/lib64/python2.5/site-packages/trac/versioncontrol/web_ui/browser.py", line 361, in process_request
    'dir': node.isdir and self._render_dir(req, repos, node, rev),
  File "/usr/lib64/python2.5/site-packages/trac/versioncontrol/web_ui/browser.py", line 407, in _render_dir
    entries = [entry(n) for n in node.get_entries()]
  File "build/bdist.linux-x86_64/egg/tracext/hg/backend.py", line 652, in get_entries
    entry = posixpath.join(self.path, entry)
  File "/usr/lib64/python2.5/posixpath.py", line 65, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)
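
The traceback ends in posixpath.join, where a unicode path is concatenated with a byte-string entry name and the implicit ASCII decoding fails on byte 0xc3. A minimal sketch of the usual remedy, decoding the entry explicitly before joining, is shown below. It is written for a current Python rather than the Python 2.5 in the report; the helper name to_text and the UTF-8 assumption are illustrative and not taken from the plugin.

```python
import posixpath


def to_text(name, encoding="utf-8"):
    """Return `name` as text, decoding byte strings explicitly.

    Hypothetical helper: the encoding is an assumption, not something
    the repository guarantees.
    """
    if isinstance(name, bytes):
        return name.decode(encoding)
    return name


# Entry name as raw bytes, the way a lower layer might hand it back.
raw_entry = b"L\xc3\xb6s4.doc"
parent = u"/WuS/uebung"

# Both arguments are text by the time they reach posixpath.join,
# so no implicit decoding takes place.
full_path = posixpath.join(parent, to_text(raw_entry))
```

Decoding at the boundary where names enter the code keeps the later path handling free of mixed types.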


Additional

TracMercurial from svn

http://svn.edgewall.com/repos/trac/sandbox/mercurial-plugin-0.11 
Revision: 7656

PS: same with Trac 0.11.0/0.10.5 and TracMercurial 0.10 svn version

Change History

comment:1 Changed 3 years ago by rblank

  • Milestone set to not applicable

comment:2 Changed 3 years ago by cboos

#8018 closed as duplicate.

Non-ascii file names are only one part of the problem. See also #7160.

comment:3 Changed 3 years ago by Joseph Tate <jtate+trac@…>

  • Version changed from 0.11.1 to 0.12dev

This still happens in 0.12 svn:

Most recent call last:
File "/usr/lib/python2.4/site-packages/Trac-0.12multirepos_r0-py2.4.egg/trac/web/main.py", line 467, in _dispatch_request
  dispatcher.dispatch(req)
File "/usr/lib/python2.4/site-packages/Trac-0.12multirepos_r0-py2.4.egg/trac/web/main.py", line 212, in dispatch
  resp = chosen_handler.process_request(req)
File "/usr/lib/python2.4/site-packages/Trac-0.12multirepos_r0-py2.4.egg/trac/versioncontrol/web_ui/browser.py", line 376, in process_request
  dir_data = self._render_dir(
File "/usr/lib/python2.4/site-packages/Trac-0.12multirepos_r0-py2.4.egg/trac/versioncontrol/web_ui/browser.py", line 493, in _render_dir
  entries = [entry(n) for n in node.get_entries()]
File "/home/admin/trac_stuff/mercurial-plugin-0.12/tracext/hg/backend.py", line 754, in get_entries
  dirnodes = self.findnode((n_rev > node_rev) and node_rev or n_rev, dirnames)
File "/home/admin/trac_stuff/mercurial-plugin-0.12/tracext/hg/backend.py", line 661, in findnode
  if f.startswith(d):

d=u'assets/help/'
f='v_images/affiliate_0/affiliate_logos/BJ\x92s_Wholesale_Club_logo.gif'

Notice that f is a str. I tried wrapping that and a few other things in to_unicode(), but it ended up being a rat hole. I'm not sure where that wrapping needs to occur, but it appears to be partially applied at the moment.
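
The byte \x92 in f is not valid UTF-8 (it looks like a Windows-1252 apostrophe), so a plain UTF-8 decode would fail here as well. The following is a sketch of a decode-with-fallback helper applied to the two values above before comparing them. The name decode_name and the list of candidate encodings are assumptions for illustration; this is not the behaviour of Trac's to_unicode().

```python
def decode_name(raw, encodings=("utf-8", "cp1252")):
    """Decode a byte-string file name, trying several encodings in turn.

    Hypothetical helper; the candidate encodings are a guess.
    """
    if not isinstance(raw, bytes):
        return raw
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a code point, so this cannot fail.
    return raw.decode("latin-1")


d = u"assets/help/"
f = b"v_images/affiliate_0/affiliate_logos/BJ\x92s_Wholesale_Club_logo.gif"

# Compare text with text instead of text with bytes.
name = decode_name(f)
matches = name.startswith(d)
```

With both operands as text, the startswith() comparison no longer depends on an implicit decode.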

comment:4 Changed 2 years ago by Igor Dejanović <igor.dejanovic@…>

  • Cc igor.dejanovic@… added

Same issue here.
Putting

import sys
sys.setdefaultencoding('utf-8')

in sitecustomize.py seems to fix it.
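
In Python 2, changing the default encoding makes implicit str-to-unicode conversions use UTF-8 instead of ASCII, process-wide. The sketch below spells out the equivalent explicit decode on a current Python, using the two file names from this ticket, to show what that setting would and would not cover. The variable names are illustrative only.

```python
# File name from the original report (valid UTF-8).
utf8_name = b"L\xc3\xb6s4.doc"
# File name fragment from comment:3 (contains \x92, not valid UTF-8).
other_name = b"BJ\x92s_Wholesale_Club_logo.gif"

# A UTF-8 default handles the first name...
decoded = utf8_name.decode("utf-8")

# ...but the second one would still raise UnicodeDecodeError.
try:
    other_name.decode("utf-8")
    still_fails = False
except UnicodeDecodeError:
    still_fails = True
```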

comment:5 Changed 2 years ago by cboos

  • Keywords unicode added
  • Milestone changed from not applicable to mercurial-plugin

comment:6 Changed 2 years ago by cboos

  • Milestone mercurial-plugin deleted
  • Resolution set to duplicate
  • Status changed from new to closed

See #8538, which has the beginning of a patch...
