Edgewall Software
Modify

Opened 8 years ago

Closed 7 years ago

Last modified 7 years ago

#4875 closed defect (wontfix)

Unicode error in notification

Reported by: luis@… Owned by: eblot
Priority: high Milestone:
Component: notification Version: 0.10.3
Severity: major Keywords:
Cc: blackhex@…
Release Notes:
API Changes:

Description

This backtrace refers to a module (Discussion) but the same happens with the ticket system… Basically, anything that uses the notification API.

When sending email with accented chars (á, for example) in the subject , this exception is raised. On the file notification.py, inside the function format_header(key, value), the Exception raised is "UnicodeDecodeError", but the code expects "UnicodeEncodeError"… Just switching the name of the expected exception solves the problem.

  File "/usr/local//lib/python2.3/site-packages/trac/web/main.py", line 387, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/local//lib/python2.3/site-packages/trac/web/main.py", line 237, in dispatch 
    resp = chosen_handler.process_request(req)
  File "build/bdist.linux-i686/egg/tracdiscussion/core.py", line 72, in process_request
  File "build/bdist.linux-i686/egg/tracdiscussion/api.py", line 33, in render_discussion 
  File "build/bdist.linux-i686/egg/tracdiscussion/api.py", line 597, in _do_action
  File "build/bdist.linux-i686/egg/tracdiscussion/notification.py", line 91, in notify
  File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 216, in notify 
    Notify.notify(self, resid)
  File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 115, in notify
    self.send(torcpts, ccrcpts)
  File "build/bdist.linux-i686/egg/tracdiscussion/notification.py", line 130, in send 
  File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 360, in send
    self.add_headers(msg, headers);
  File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 236, in add_headers 
    msg[h] = self.encode_header(h, headers[h])
  File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 275, in encode_header
    return self.format_header(key, value)
  File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 225, in format_header 
    tmp = name.encode('ascii')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 43: ordinal not in range(128)

Attachments (0)

Change History (20)

comment:1 Changed 8 years ago by eblot

  • Keywords notification added; error removed
  • Owner changed from jonas to eblot
  • Status changed from new to assigned

comment:2 follow-up: Changed 8 years ago by cboos

I was about to close this ticket, saying:

This is a problem with the TracHacks:DiscussionPlugin, which doesn't give unicode objects as input to the notification API.

You should report this issue on TracHacks.

But I let you decide if the Notify API should be "tolerant" to invalid input.

comment:3 in reply to: ↑ 2 ; follow-up: Changed 8 years ago by luis@…

As I said in the bug report: "the same happens with the ticket system".

I don't know enough of python to argue with you, but again, as said in the report "the Exception raised is "UnicodeDecodeError", but the code expects "UnicodeEncodeError"…". This seems more like a typo then a feature… And if it is so, there's a bug in the notification API, am I correct?

The bug reflects on anything that sends email's with non ASCII chars in the subject…

Replying to cboos:

I was about to close this ticket, saying:

This is a problem with the TracHacks:DiscussionPlugin, which doesn't give unicode objects as input to the notification API.

You should report this issue on TracHacks.

But I let you decide if the Notify API should be "tolerant" to invalid input.

comment:4 in reply to: ↑ 3 ; follow-ups: Changed 8 years ago by cboos

Replying to luis@rosilva.ath.cx:

As I said in the bug report: "the same happens with the ticket system".

Ok, that's a bug then. If you can provide a backtrace for that one, all the better.

I don't know enough of python to argue with you, but again, as said in the report "the Exception raised is "UnicodeDecodeError", but the code expects "UnicodeEncodeError"…". This seems more like a typo then a feature…

No, the exception handler is there to intercept failures to encode an unicode object to a str using the ASCII encoding, when doing name.encode('ascii'). The actual exception comes from the fact that name is not an unicode instance as expected, so there's first a str to unicode encoding taking place, using the default charset (ascii as well, usually) and this is what is failing.

Compare:

>>> u'2342ereeêù'.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode characters in position 8-9: ordinal not in range(128)

with:

>>> '2342ereeêù'.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xea in position 8: ordinal not in range(128)

And if it is so, there's a bug in the notification API, am I correct?

Well, this being a direct Trac API (not a plugin interface), one could argue that strict conformance to the internal usage of unicode should be respected. As I said above, this is an open question.

comment:5 in reply to: ↑ 4 ; follow-up: Changed 8 years ago by anonymous

Replying to cboos:

Ok, that's a bug then. If you can provide a backtrace for that one, all the better.

Ooops… I can't seem to reproduce the bug now… I guess I was wrong after all… I'll report this to the Discussion Plugin in the trac-hacks trac…

Well, this being a direct Trac API (not a plugin interface), one could argue that strict conformance to the internal usage of unicode should be respected. As I said above, this is an open question.

Fair enough, now I understood the problem and I leave up guys to you what to do… Sorry for taking your time.

comment:6 in reply to: ↑ 5 Changed 8 years ago by eblot

Replying to anonymous:

Ooops… I can't seem to reproduce the bug now… I guess I was wrong after all… I'll report this to the Discussion Plugin in the trac-hacks trac…

I'm pretty sure it is possible to send emails with non-ascii characters in the subject or the sender (I think I wrote some unit tests to validate this feature).

An important point is to check that the encoding is either 'base 64' or 'quoted printable' in the [notification] configuration.

If you can reproduce the issue without a plugin, it would be great.

Replying to cboos:

Agreed. I don't think the code should be able to cope with non-unicode strings. Unicode is the internal format in Trac, so plugins should conform to it.

comment:7 in reply to: ↑ 4 Changed 8 years ago by cboos

Correction to comment:4

… so there's first a str to unicode encoding taking place …

I meant to say decoding here, of course.

comment:8 Changed 8 years ago by Blackhex

  • Cc blackhex@… added

I took a look on this error and I don't see any differences between discussion notification code and ticket/notification.py in send method except two explicit str() conversions of integer values. I've changed them to unicode() or removed too but error still occurs. I've tried explicit conversion of all header fields too but this is the same. I've added debug print to line of error in Trac's notification.py and that gives me with explicit conversion to unicode this output:

[u'Test']
[u'blackhex@post.cz']
['0.10.3']
['Trac 0.10.3, by Edgewall Software']
[u'blackhex@seznam.cz']
[u'Test']
['Sat, 03 Mar 2007 11:45:16 -0000']
[u'']
[u'http://example.org/']

As you can see there are all custom fields in unicode but default one provided by NotifyEmail? class are not. Do you think that this may couse DiscussionPlugin? and if it is so how?

comment:9 Changed 8 years ago by luis£rosilva.ath.cx

Tested again… Created a ticket with accented chars in the subject (á, é, ç) and the email was sent ok. But in what relates to the discussion plugin, I always get a backtrace like in the first post.

Already checked that the encoding was either 'base 64' or 'quoted printable' in the [notification] configuration. It is 'base 64'.

comment:10 Changed 8 years ago by luis@…

New results: Now everything works. I'm using Debian Sarge and there's a /etc/python2.3/site.py that can be used among other things to define the default locale. The system locale is en_GB.UTF-8

The file contains among other things:

(...)
encoding = "ascii" # Default value set by _PyUnicode_Init()

if 0:
    # Enable to support locale aware default string encodings.
    import locale
    loc = locale.getdefaultlocale()
    if loc[1]:
        encoding = loc[1]

if 0:
    # Enable to switch off string to Unicode coercion and implicit
    # Unicode to string conversion.
    encoding = "undefined"

if encoding != "ascii":
    # On Non-Unicode builds this will raise an AttributeError...
    sys.setdefaultencoding(encoding) # Needs Python Unicode build !
(...)

If I change encoding = "ascii" for encoding = "utf-8" everything works correctly. Could trac be expecting this configuration or is it supposed to work even when default encoding is defined as ascii?

This is the trac section of the trac.ini:

[trac]
authz_file =
authz_module_name =
base_url =
check_auth_ip = true
database = sqlite:db/trac.db
default_charset = utf-8
default_handler = WikiModule
htdocs_location =
ignore_auth_case = false
mainnav = wiki,timeline,roadmap,browser,tickets,newticket,discussion,search,admin
metanav = login,logout,settings,help,about
permission_store = DefaultPermissionStore
repository_dir =
repository_type = svn
templates_dir = /usr/local/share/trac/templates
timeout = 20

And here is the notification section of the same file:

[notification]
always_notify_owner = true
always_notify_reporter = true
always_notify_updater = true
mime_encoding = base64
smtp_always_bcc =
smtp_always_cc =
smtp_default_domain = rosilva.ath.cx
smtp_enabled = true
smtp_from = trac@rosilva.ath.cx
smtp_password =
smtp_port = 25
smtp_replyto = trac@rosilva.ath.cx
smtp_server = mail.rosilva.ath.cx
smtp_subject_prefix = __default__
smtp_user =
use_public_cc = false
use_short_addr = true
use_tls = false

comment:11 follow-up: Changed 8 years ago by cboos

  • Milestone changed from 0.10.4 to 0.10.5

Post-poning a bit, and keeping it opened because of comment:8

comment:12 in reply to: ↑ 11 Changed 7 years ago by eblot

  • Component changed from general to notification
  • Keywords needinfo added; notification unicode removed

Replying to cboos:

Post-poning a bit, and keeping it opened because of comment:8

No sure to understand the exact issue: as long as the notification API is used w/ Unicode, Trac behaves as expected.
cboos, can you elaborate about what is to be done? Thanks.

comment:13 follow-up: Changed 7 years ago by cboos

I've gone through the whole ticket and the code once again (as it was also not obvious to me what was being left to do…) and I believe the issue boils down in "normalizing" self.subject to unicode. It's likely that some plugins like the discussion plugin assigns and utf-8 str object to it, which is not valid in this case.

Either we consider that NotifyEmail.subject is part of the internal API that strictly requires the use of unicode for non-ascii content, or we consider it to be part of the external API for which we offer some robustness against mixed unicode/utf-8 str input (e.g. like for wiki macros).

I'd tend to the first choice and close this as wontfix.

comment:14 in reply to: ↑ 13 Changed 7 years ago by eblot

  • Keywords needinfo removed
  • Milestone 0.10.5 deleted
  • Resolution set to wontfix
  • Status changed from assigned to closed

Replying to cboos:

I'd tend to the first choice and close this as wontfix.

So would I: Trac now uses unicode internally, it is up to the plugins to comply with this requirement. Thanks for the feedback.

comment:15 follow-up: Changed 7 years ago by luis@…

  • Resolution wontfix deleted
  • Status changed from closed to reopened

According to comment 8 the problem doesn't seems to be in the plugin itself but on trac.

Also, as discussed in comment 3 and comment 4 the same happens with the ticketing system. So, as cboos already realized in one of those comments, it should be a bug, am I wrong?

In comment comment 10, I show one thing that changes behavior of the python interpreter in what relates to this bug. Should that be the key to solve it, even if this bug is closed as won't fix, I think that this fact should be referred in trac's documentation.

Sorry for reopening the ticket, but it seemed to me that in the beginning of the ticket it was recognized this was a bug in trac and not the plugin itself and if so, closing it is going in the wrong way…

comment:16 in reply to: ↑ 15 Changed 7 years ago by eblot

Replying to luis@rosilva.ath.cx:

According to comment 8 the problem doesn't seems to be in the plugin itself but on trac.

Well, if you can reproduce the error with a Unicode string, there's a bug in Trac.
If you inject anything but ASCII or Unicode in the notification subsystem, it's a bug in the client (that is, the plugin).

If you have a code snippet that triggers an error with a valid Unicode string, it would help to discriminate the root cause.

comment:17 follow-up: Changed 7 years ago by cboos

(just to second eblot which was as usual faster than me for replying ;-) )

In the absence of a stack trace involving the ticket subsystem, I don't believe this can be a problem with Trac. You're right that comment:10 shows a few non-unicode strings, but none of those can be problematic, as they have guaranteed ascii content (I initially had a doubt about the date, but I don't think it can be a problem anymore).

I'm willing to reconsider the above if you're able to provide a backtrace involving the ticket system or a recipe for reproducing the issue (but that has already been asked for in comment:4).

comment:18 in reply to: ↑ 17 ; follow-up: Changed 7 years ago by anonymous

I'm sorry, I must have made some confusion… I missed my own post about not being able to reproduce with the ticketing system… As this was open a while ago I completely forgot it. The problem only appears in the discussion plug in.

So probably the bug is in the Discussion Plugin? And what about comment 8? I leave it up to you guys. Sorry for wasting your time with my distraction.

comment:19 in reply to: ↑ 18 Changed 7 years ago by cboos

  • Resolution set to wontfix
  • Status changed from reopened to closed

Ok I hope everything is clear now ;-)

Replying to anonymous:

So probably the bug is in the Discussion Plugin?

Yes, as we decided in comment:14.

And what about comment 8?

I should have written in comment:17:

You're right that comment:8 shows a few non-unicode strings, but none of those can be problematic, as they have guaranteed ascii content (I initially had a doubt about the date, but I don't think it can be a problem anymore).

…instead of comment:10, sorry for the mistake.

comment:20 Changed 7 years ago by eblot

Replying to anonymous:

So probably the bug is in the Discussion Plugin?

Previous releases of Trac were using UTF-8 internally, so at some point, the notification code may have supported UTF-8 as input. (I'm not sure if by that time the plugin would have triggered an error or not.)

Trac now uses unicode internally, so the plugin may need to be updated to fulfill this requirement, yes. You might want to submit a ticket on trac-hacks.org (check for duplicate first ;-))

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed The owner will remain eblot.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from eblot to the specified user.
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.