#10121 closed defect (fixed)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xdc' in position 5: ordinal not in range(128)
Reported by: | Owned by: | Christian Boos | |
---|---|---|---|
Priority: | normal | Milestone: | 0.12.3 |
Component: | general | Version: | 0.12.2 |
Severity: | major | Keywords: | unicode error, localization, german umlaut |
Cc: | Branch: | ||
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Description
How to Reproduce
While doing a GET operation on /timeline
, Trac issued an internal error.
UnicodeEncodeError: 'ascii' codec can't encode character u'\xdc' in position 5: ordinal not in range(128)
This occured after checking in a new changeset containting a german Umlaut 'Ü' into Subversion. I tried to deactivate all active plugins, but this problem seems to be located in Trac itself. The exception is thrown at the timeline module. The same message ("added c# implementation") seems to make no problems when displayed in the source code browser.
Request parameters:
{}
User agent: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16
System Information
Trac | 0.12.2
|
Babel | 0.9.5
|
Genshi | 0.6
|
mod_wsgi | 2.5 (WSGIProcessGroup moin WSGIApplicationGroup trac.urpg-engine.net|)
|
pysqlite | 2.3.2
|
Python | 2.5.2 (r252:60911, Jan 24 2010, 18:02:01) [GCC 4.3.2]
|
setuptools | 0.6c11
|
SQLite | 3.5.9
|
Subversion | 1.5.1 (r32289)
|
jQuery | 1.4.2
|
Enabled Plugins
TracMenusPlugin | 0.1
|
Python Traceback
Traceback (most recent call last): File "build/bdist.linux-x86_64/egg/trac/web/main.py", line 511, in _dispatch_request dispatcher.dispatch(req) File "build/bdist.linux-x86_64/egg/trac/web/main.py", line 237, in dispatch resp = chosen_handler.process_request(req) File "build/bdist.linux-x86_64/egg/trac/timeline/web_ui.py", line 213, in process_request format='rss') File "build/bdist.linux-x86_64/egg/trac/web/href.py", line 181, in <lambda> self._derived[name] = lambda *args, **kw: self(name, *args, **kw) File "build/bdist.linux-x86_64/egg/trac/web/href.py", line 175, in __call__ href += '?' + unicode_urlencode(params, self.query_safe) File "build/bdist.linux-x86_64/egg/trac/util/text.py", line 118, in unicode_urlencode k = quote_plus(str(k), safe) UnicodeEncodeError: 'ascii' codec can't encode character u'\xdc' in position 5: ordinal not in range(128)
Attachments (1)
Change History (18)
follow-up: 2 comment:1 by , 14 years ago
follow-up: 4 comment:2 by , 14 years ago
Replying to cboos:
And anyway, why do we end up with that Ü as a parameter in the timeline RSS link? I have no idea how it could slip from the commit message to the
filters
…
The name of a repository?
comment:3 by , 14 years ago
Milestone: | → 0.12.3 |
---|---|
Severity: | blocker → major |
Ugh, right ;-) Ok, so let's fix unicode_urlencode
and see what happens in such a scenario.
comment:4 by , 14 years ago
Replying to rblank:
Replying to cboos:
And anyway, why do we end up with that Ü as a parameter in the timeline RSS link? I have no idea how it could slip from the commit message to the
filters
…The name of a repository?
Yes, the letter is also inside the repository name. i totally forgot to add that, since the error first ocurred after the initial svn commit :\ That means i can do a workaround by just changing the rep. name in the admin control panel to get the timeline working again. I first feared i needed to revert the changeset or something..Thanks for the info! :)
comment:5 by , 14 years ago
Owner: | set to |
---|
follow-ups: 9 15 comment:6 by , 14 years ago
Status: | new → assigned |
---|
comment:7 by , 13 years ago
comment:8 by , 13 years ago
A couple of Javascript errors left for repositories with non-ascii names (comment:6). Nothing critical but annoying.
follow-up: 10 comment:9 by , 13 years ago
Replying to cboos:
- the checkbox for a repository with non-ascii characters doesn't get updated properly
This is because req.args
contain UTF-8 str
keys.
I think we should to_unicode()
that in timeline.py for 0.12.x:
-
trac/timeline/web_ui.py
diff --git a/trac/timeline/web_ui.py b/trac/timeline/web_ui.py index 81eec11..e6a8a2d 100644
a b class TimelineModule(Component): 142 142 available_filters += event_provider.get_timeline_filters(req) or [] 143 143 144 144 # check the request or session for enabled filters, or use default 145 filters = [f[0] for f in available_filters if f[0] in req.args] 145 keys = [to_unicode(k) for k in req.args] 146 filters = [f[0] for f in available_filters if f[0] in keys] 146 147 if not filters and format != 'rss': 147 148 filters = [f[0] for f in available_filters 148 149 if req.session.get('timeline.filter.' + f[0]) == '1'] … … class TimelineModule(Component): 153 154 if 'update' in req.args: 154 155 for filter in available_filters: 155 156 key = 'timeline.filter.%s' % filter[0] 156 if filter[0] in req.args:157 if filter[0] in keys: 157 158 req.session[key] = '1' 158 159 elif key in req.session: 159 160 del req.session[key]
… but rather ensure that the whole req.args
content is unicode
for 0.13:
-
trac/web/api.py
diff --git a/trac/web/api.py b/trac/web/api.py index 1dea028..69d3a6e 100644
a b class Request(object): 571 571 name = value.name 572 572 if not value.filename: 573 573 value = unicode(value.value, 'utf-8') 574 args.append(( name, value))574 args.append((to_unicode(name), value)) 575 575 return args 576 576 577 577 def _parse_cookies(self):
Or maybe directly make the second change on 0.12-stable, as no test gets harmed in the process and I don't think this could break things ;-) Thoughts?
- the autoexpand feature in browser navigation is broken for such a repository (see also r10609)
Can't reproduce that one.
follow-up: 11 comment:10 by , 13 years ago
Replying to cboos:
… but rather ensure that the whole
req.args
content isunicode
for 0.13:
Yes, that makes the most sense to me.
Or maybe directly make the second change on 0.12-stable, as no test gets harmed in the process and I don't think this could break things ;-) Thoughts?
I'm split between the desire to do the right thing, and the superstition that any major change on 0.12-stable will come back and bite us :)
/me tosses coin…
Ok, if no tests break, let's do it. The only possible issue I can see is if a plugin has encountered the issue, and is converting the keys to UTF-8 itself. But that should be rare.
Can't reproduce that one.
Weird, I can't either. Forget about it, then ;)
by , 13 years ago
Attachment: | t10121-ie7-r10871.png added |
---|
follow-ups: 12 13 comment:11 by , 13 years ago
Replying to rblank:
Replying to cboos:
… but rather ensure that the whole
req.args
content isunicode
for 0.13:Yes, that makes the most sense to me.
parse_arg_list
in trac.web.api
has the same problem. I think we should change.
-
trac/web/api.py
diff -r 1cb9ae3fba1c trac/web/api.py
a b 116 116 else: 117 117 (name, value) = (nv[0], empty) 118 118 name = unquote(name.replace('+', ' ')) 119 if isinstance(name, unicode):120 name = name.encode('utf-8')119 if not isinstance(name, unicode): 120 name = unicode(name, 'utf-8') 121 121 value = unquote(value.replace('+', ' ')) 122 122 if not isinstance(value, unicode): 123 123 value = unicode(value, 'utf-8')
Ok, if no tests break, let's do it. The only possible issue I can see is if a plugin has encountered the issue, and is converting the keys to UTF-8 itself. But that should be rare.
+1.
Can't reproduce that one.
Weird, I can't either. Forget about it, then ;)
The autoexpand feature is broken with IE7 and Chrome 15 for a repository which has non ASCII name. It works fine with Firefox 7.
follow-up: 14 comment:12 by , 13 years ago
Replying to jomae:
parse_arg_list
intrac.web.api
has the same problem. I think we should change […]
Right, that fix is even better.
Maybe also convert the if not isinstance(..., unicode)
calls to if isinstance(..., str)
, but that's just me ;-)
I would also have used to_unicode()
as it's more robust, but seeing that we never had a problem with the conversion done for values, that's probably not needed.
Can't reproduce that one.
Weird, I can't either. Forget about it, then ;)
The autoexpand feature is broken with IE7 and Chrome 15 for a repository which has non ASCII name.
Ah yes, the autoexpand, not the expand ;-) i.e. expand something then click on a filename then "Back". Yes, it's broken for me as well (Chrome 16.0.912.21).
It's a simple decodeURI()
missing:
-
trac/htdocs/js/expand_dir.js
diff --git a/trac/htdocs/js/expand_dir.js b/trac/htdocs/js/expand_dir.js index 09d24eb..c25ce30 100644
a b 24 24 else { // rows are toplevel rows, this is the initial call 25 25 var anchor = window.location.hash.substr(1); 26 26 if (anchor) 27 autoexpand = anchor.split("/");27 autoexpand = decodeURI(anchor).split("/"); 28 28 } 29 29 30 30 var autoexpand_expander = null;
comment:13 by , 13 years ago
And btw, Replying to jomae:
… It works fine with Firefox 7.
This is because FF normalizes the URL, i.e. when you paste in:
http://localhost:8000/devel-0.12/browser#tract%C3%AAst/t%C3%AAte/v2
it shows in the address bar:
http://localhost:8000/devel-0.12/browser#tractêst/tête/v2
(Opera 11.52 and Safari 5.1.1 do the same, and these ones even let you copy that normalized value back ;-) )
comment:14 by , 13 years ago
Ah yes, the autoexpand, not the expand ;-) i.e. expand something then click on a filename then "Back". Yes, it's broken for me as well (Chrome 16.0.912.21). […]
The patch works fine with IE. Thanks.
And, found the another issue with IE 6 and 7. When second level item (e.g. repo-name/dir or dir1/dir2 in default repository) is expanded, location.hash
is wrong. See the location bar in t10121-ie7-r10871.png.
It seems that the value of href attribute is absolute URL with IE 6 and 7.
-
trac/htdocs/js/expand_dir.js
diff --git a/trac/htdocs/js/expand_dir.js b/trac/htdocs/js/expand_dir.js
a b 76 76 var a = expander.next("a"); 77 77 if ( !autoexpand ) { 78 78 var pathname = window.location.pathname; 79 var entry_href = a.attr("href"); 79 // strip scheme and hostname, href is absolute with IE 6 and 7. 80 var entry_href = a.attr("href").replace(/^[^:]+:\/\/[^\/]+/, ''); 80 81 // normalize (PATH_INFO possibly has squashed "/") 81 82 pathname = pathname.replace(/\/+/g, '/').replace(/\/$/, ''); 82 83 entry_href = entry_href.replace(/\/+/g, '/');
follow-up: 16 comment:15 by , 13 years ago
Ok, so to summarize the changes regarding the opened points mentioned in comment:6:
- the
decodeURI
fix mentioned in comment:12 was committed in r10873 - I reproduced the problem with IE[67] in comment:14 and tested the fix proposed by jomae with IE7 with success; please commit!
- the other fix from Jun in comment:11 (with possibly the
if isinstance(..., str)
change I suggested in comment:12) was not yet committed; I think it should be OK to commit it on 0.12-stable as well, after all
Did I forget something?
comment:16 by , 13 years ago
Replying to cboos:
- I reproduced the problem with IE[67] in comment:14 and tested the fix proposed by jomae with IE7 with success; please commit!
- the other fix from Jun in comment:11 (with possibly the
if isinstance(..., str)
change I suggested in comment:12) was not yet committed; I think it should be OK to commit it on 0.12-stable as well, after all
Thanks for the summarizing! Committed in r10912 and r10913 with unit tests.
Did I forget something?
That's all that's left.
comment:17 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Thanks Jun! Closing then (work on this ticket was 50/50, so let's say the next one like that will be for you ;-) ).
Well, the Ü here is used as a part of a key, i.e. a parameter name, something we didn't anticipate.
We can probably fix this specific problem in
unicode_urlencode
, but I'm curious to see if we're then able to properly handle the resulting GET…And anyway, why do we end up with that Ü as a parameter in the timeline RSS link? I have no idea how it could slip from the commit message to the
filters
…You should add a couple extra logging statements there to give us more clues, e.g.