Edgewall Software
Modify

Opened 13 years ago

Closed 12 years ago

Last modified 18 months ago

#10121 closed defect (fixed)

UnicodeEncodeError: 'ascii' codec can't encode character u'\xdc' in position 5: ordinal not in range(128)

Reported by: florianessl@… Owned by: Christian Boos
Priority: normal Milestone: 0.12.3
Component: general Version: 0.12.2
Severity: major Keywords: unicode error, localization, german umlaut
Cc: Branch:
Release Notes:
API Changes:
Internal Changes:

Description

How to Reproduce

While doing a GET operation on /timeline, Trac issued an internal error.

UnicodeEncodeError: 'ascii' codec can't encode character u'\xdc' in position 5: ordinal not in range(128)

This occured after checking in a new changeset containting a german Umlaut 'Ü' into Subversion. I tried to deactivate all active plugins, but this problem seems to be located in Trac itself. The exception is thrown at the timeline module. The same message ("added c# implementation") seems to make no problems when displayed in the source code browser.

Request parameters:

{}

User agent: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16

System Information

Trac 0.12.2
Babel 0.9.5
Genshi 0.6
mod_wsgi 2.5 (WSGIProcessGroup moin WSGIApplicationGroup trac.urpg-engine.net|)
pysqlite 2.3.2
Python 2.5.2 (r252:60911, Jan 24 2010, 18:02:01)
[GCC 4.3.2]
setuptools 0.6c11
SQLite 3.5.9
Subversion 1.5.1 (r32289)
jQuery 1.4.2

Enabled Plugins

TracMenusPlugin 0.1

Python Traceback

Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/trac/web/main.py", line 511, in _dispatch_request
    dispatcher.dispatch(req)
  File "build/bdist.linux-x86_64/egg/trac/web/main.py", line 237, in dispatch
    resp = chosen_handler.process_request(req)
  File "build/bdist.linux-x86_64/egg/trac/timeline/web_ui.py", line 213, in process_request
    format='rss')
  File "build/bdist.linux-x86_64/egg/trac/web/href.py", line 181, in <lambda>
    self._derived[name] = lambda *args, **kw: self(name, *args, **kw)
  File "build/bdist.linux-x86_64/egg/trac/web/href.py", line 175, in __call__
    href += '?' + unicode_urlencode(params, self.query_safe)
  File "build/bdist.linux-x86_64/egg/trac/util/text.py", line 118, in unicode_urlencode
    k = quote_plus(str(k), safe)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xdc' in position 5: ordinal not in range(128)

Attachments (1)

t10121-ie7-r10871.png (40.7 KB ) - added by Jun Omae 12 years ago.

Download all attachments as: .zip

Change History (18)

comment:1 by Christian Boos, 13 years ago

Well, the Ü here is used as a part of a key, i.e. a parameter name, something we didn't anticipate.

We can probably fix this specific problem in unicode_urlencode, but I'm curious to see if we're then able to properly handle the resulting GET…

And anyway, why do we end up with that Ü as a parameter in the timeline RSS link? I have no idea how it could slip from the commit message to the filters

You should add a couple extra logging statements there to give us more clues, e.g.

  self.log.debug("authors: %r, filters: %r", authors, filters)

in reply to:  1 ; comment:2 by Remy Blank, 13 years ago

Replying to cboos:

And anyway, why do we end up with that Ü as a parameter in the timeline RSS link? I have no idea how it could slip from the commit message to the filters

The name of a repository?

comment:3 by Christian Boos, 13 years ago

Milestone: 0.12.3
Severity: blockermajor

Ugh, right ;-) Ok, so let's fix unicode_urlencode and see what happens in such a scenario.

in reply to:  2 comment:4 by florianessl@…, 13 years ago

Replying to rblank:

Replying to cboos:

And anyway, why do we end up with that Ü as a parameter in the timeline RSS link? I have no idea how it could slip from the commit message to the filters

The name of a repository?

Yes, the letter is also inside the repository name. i totally forgot to add that, since the error first ocurred after the initial svn commit :\ That means i can do a workaround by just changing the rep. name in the admin control panel to get the timeline working again. I first feared i needed to revert the changeset or something..Thanks for the info! :)

comment:5 by Christian Boos, 13 years ago

Owner: set to Christian Boos

comment:6 by Christian Boos, 13 years ago

Status: newassigned

[10672] fixes the internal error, but there are a couple of other issues left:

  • the checkbox for a repository with non-ascii characters doesn't get updated properly
  • the autoexpand feature in browser navigation is broken for such a repository (see also r10609)

comment:7 by Christian Boos, 13 years ago

#10295 was closed as duplicate. We really should release 0.12.3 one of these days…

comment:8 by Christian Boos, 13 years ago

A couple of Javascript errors left for repositories with non-ascii names (comment:6). Nothing critical but annoying.

in reply to:  6 ; comment:9 by Christian Boos, 12 years ago

Replying to cboos:

  • the checkbox for a repository with non-ascii characters doesn't get updated properly

This is because req.args contain UTF-8 str keys. I think we should to_unicode() that in timeline.py for 0.12.x:

  • trac/timeline/web_ui.py

    diff --git a/trac/timeline/web_ui.py b/trac/timeline/web_ui.py
    index 81eec11..e6a8a2d 100644
    a b class TimelineModule(Component):  
    142142            available_filters += event_provider.get_timeline_filters(req) or []
    143143
    144144        # check the request or session for enabled filters, or use default
    145         filters = [f[0] for f in available_filters if f[0] in req.args]
     145        keys = [to_unicode(k) for k in req.args]
     146        filters = [f[0] for f in available_filters if f[0] in keys]
    146147        if not filters and format != 'rss':
    147148            filters = [f[0] for f in available_filters
    148149                       if req.session.get('timeline.filter.' + f[0]) == '1']
    class TimelineModule(Component):  
    153154        if 'update' in req.args:
    154155            for filter in available_filters:
    155156                key = 'timeline.filter.%s' % filter[0]
    156                 if filter[0] in req.args:
     157                if filter[0] in keys:
    157158                    req.session[key] = '1'
    158159                elif key in req.session:
    159160                    del req.session[key]

… but rather ensure that the whole req.args content is unicode for 0.13:

  • trac/web/api.py

    diff --git a/trac/web/api.py b/trac/web/api.py
    index 1dea028..69d3a6e 100644
    a b class Request(object):  
    571571            name = value.name
    572572            if not value.filename:
    573573                value = unicode(value.value, 'utf-8')
    574             args.append((name, value))
     574            args.append((to_unicode(name), value))
    575575        return args
    576576
    577577    def _parse_cookies(self):

Or maybe directly make the second change on 0.12-stable, as no test gets harmed in the process and I don't think this could break things ;-) Thoughts?

  • the autoexpand feature in browser navigation is broken for such a repository (see also r10609)

Can't reproduce that one.

in reply to:  9 ; comment:10 by Remy Blank, 12 years ago

Replying to cboos:

… but rather ensure that the whole req.args content is unicode for 0.13:

Yes, that makes the most sense to me.

Or maybe directly make the second change on 0.12-stable, as no test gets harmed in the process and I don't think this could break things ;-) Thoughts?

I'm split between the desire to do the right thing, and the superstition that any major change on 0.12-stable will come back and bite us :)

/me tosses coin…

Ok, if no tests break, let's do it. The only possible issue I can see is if a plugin has encountered the issue, and is converting the keys to UTF-8 itself. But that should be rare.

Can't reproduce that one.

Weird, I can't either. Forget about it, then ;)

by Jun Omae, 12 years ago

Attachment: t10121-ie7-r10871.png added

in reply to:  10 ; comment:11 by Jun Omae, 12 years ago

Replying to rblank:

Replying to cboos:

… but rather ensure that the whole req.args content is unicode for 0.13:

Yes, that makes the most sense to me.

parse_arg_list in trac.web.api has the same problem. I think we should change.

  • trac/web/api.py

    diff -r 1cb9ae3fba1c trac/web/api.py
    a b  
    116116        else:
    117117            (name, value) = (nv[0], empty)
    118118        name = unquote(name.replace('+', ' '))
    119         if isinstance(name, unicode):
    120             name = name.encode('utf-8')
     119        if not isinstance(name, unicode):
     120            name = unicode(name, 'utf-8')
    121121        value = unquote(value.replace('+', ' '))
    122122        if not isinstance(value, unicode):
    123123            value = unicode(value, 'utf-8')

Ok, if no tests break, let's do it. The only possible issue I can see is if a plugin has encountered the issue, and is converting the keys to UTF-8 itself. But that should be rare.

+1.

Can't reproduce that one.

Weird, I can't either. Forget about it, then ;)

The autoexpand feature is broken with IE7 and Chrome 15 for a repository which has non ASCII name. It works fine with Firefox 7.

After clicking résumé/summary/résumé with IE7:

in reply to:  11 ; comment:12 by Christian Boos, 12 years ago

Replying to jomae:

parse_arg_list in trac.web.api has the same problem. I think we should change […]

Right, that fix is even better.

Maybe also convert the if not isinstance(..., unicode) calls to if isinstance(..., str), but that's just me ;-)

I would also have used to_unicode() as it's more robust, but seeing that we never had a problem with the conversion done for values, that's probably not needed.

Can't reproduce that one.

Weird, I can't either. Forget about it, then ;)

The autoexpand feature is broken with IE7 and Chrome 15 for a repository which has non ASCII name.

Ah yes, the autoexpand, not the expand ;-) i.e. expand something then click on a filename then "Back". Yes, it's broken for me as well (Chrome 16.0.912.21).

It's a simple decodeURI() missing:

  • trac/htdocs/js/expand_dir.js

    diff --git a/trac/htdocs/js/expand_dir.js b/trac/htdocs/js/expand_dir.js
    index 09d24eb..c25ce30 100644
    a b  
    2424    else { // rows are toplevel rows, this is the initial call
    2525      var anchor = window.location.hash.substr(1);
    2626      if (anchor)
    27         autoexpand = anchor.split("/");
     27        autoexpand = decodeURI(anchor).split("/");
    2828    }
    2929
    3030    var autoexpand_expander = null;

in reply to:  11 comment:13 by Christian Boos, 12 years ago

And btw, Replying to jomae:

… It works fine with Firefox 7.

This is because FF normalizes the URL, i.e. when you paste in:

http://localhost:8000/devel-0.12/browser#tract%C3%AAst/t%C3%AAte/v2

it shows in the address bar:

http://localhost:8000/devel-0.12/browser#tractêst/tête/v2

(Opera 11.52 and Safari 5.1.1 do the same, and these ones even let you copy that normalized value back ;-) )

in reply to:  12 comment:14 by Jun Omae, 12 years ago

Ah yes, the autoexpand, not the expand ;-) i.e. expand something then click on a filename then "Back". Yes, it's broken for me as well (Chrome 16.0.912.21). […]

The patch works fine with IE. Thanks.

And, found the another issue with IE 6 and 7. When second level item (e.g. repo-name/dir or dir1/dir2 in default repository) is expanded, location.hash is wrong. See the location bar in t10121-ie7-r10871.png.

It seems that the value of href attribute is absolute URL with IE 6 and 7.

  • trac/htdocs/js/expand_dir.js

    diff --git a/trac/htdocs/js/expand_dir.js b/trac/htdocs/js/expand_dir.js
    a b  
    7676    var a = expander.next("a");
    7777    if ( !autoexpand ) {
    7878      var pathname = window.location.pathname;
    79       var entry_href = a.attr("href");
     79      // strip scheme and hostname, href is absolute with IE 6 and 7.
     80      var entry_href = a.attr("href").replace(/^[^:]+:\/\/[^\/]+/, '');
    8081      // normalize (PATH_INFO possibly has squashed "/")
    8182      pathname = pathname.replace(/\/+/g, '/').replace(/\/$/, '');
    8283      entry_href = entry_href.replace(/\/+/g, '/');

in reply to:  6 ; comment:15 by Christian Boos, 12 years ago

Ok, so to summarize the changes regarding the opened points mentioned in comment:6:

  • the decodeURI fix mentioned in comment:12 was committed in r10873
  • I reproduced the problem with IE[67] in comment:14 and tested the fix proposed by jomae with IE7 with success; please commit!
  • the other fix from Jun in comment:11 (with possibly the if isinstance(..., str) change I suggested in comment:12) was not yet committed; I think it should be OK to commit it on 0.12-stable as well, after all

Did I forget something?

Last edited 12 years ago by Christian Boos (previous) (diff)

in reply to:  15 comment:16 by Jun Omae, 12 years ago

Replying to cboos:

  • I reproduced the problem with IE[67] in comment:14 and tested the fix proposed by jomae with IE7 with success; please commit!
  • the other fix from Jun in comment:11 (with possibly the if isinstance(..., str) change I suggested in comment:12) was not yet committed; I think it should be OK to commit it on 0.12-stable as well, after all

Thanks for the summarizing! Committed in r10912 and r10913 with unit tests.

Did I forget something?

That's all that's left.

comment:17 by Christian Boos, 12 years ago

Resolution: fixed
Status: assignedclosed

Thanks Jun! Closing then (work on this ticket was 50/50, so let's say the next one like that will be for you ;-) ).

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Christian Boos.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Christian Boos to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.