#4875 closed defect (wontfix)
Unicode error in notification
Reported by: | Owned by: | Emmanuel Blot | |
---|---|---|---|
Priority: | high | Milestone: | |
Component: | notification | Version: | 0.10.3 |
Severity: | major | Keywords: | |
Cc: | blackhex@… | Branch: | |
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Description
This backtrace refers to a module (Discussion) but the same happens with the ticket system… Basically, anything that uses the notification API.
When sending email with accented chars (á, for example) in the subject , this exception is raised. On the file notification.py, inside the function format_header(key, value), the Exception raised is "UnicodeDecodeError", but the code expects "UnicodeEncodeError"… Just switching the name of the expected exception solves the problem.
File "/usr/local//lib/python2.3/site-packages/trac/web/main.py", line 387, in dispatch_request dispatcher.dispatch(req) File "/usr/local//lib/python2.3/site-packages/trac/web/main.py", line 237, in dispatch resp = chosen_handler.process_request(req) File "build/bdist.linux-i686/egg/tracdiscussion/core.py", line 72, in process_request File "build/bdist.linux-i686/egg/tracdiscussion/api.py", line 33, in render_discussion File "build/bdist.linux-i686/egg/tracdiscussion/api.py", line 597, in _do_action File "build/bdist.linux-i686/egg/tracdiscussion/notification.py", line 91, in notify File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 216, in notify Notify.notify(self, resid) File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 115, in notify self.send(torcpts, ccrcpts) File "build/bdist.linux-i686/egg/tracdiscussion/notification.py", line 130, in send File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 360, in send self.add_headers(msg, headers); File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 236, in add_headers msg[h] = self.encode_header(h, headers[h]) File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 275, in encode_header return self.format_header(key, value) File "/usr/local//lib/python2.3/site-packages/trac/notification.py", line 225, in format_header tmp = name.encode('ascii') UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 43: ordinal not in range(128)
Attachments (0)
Change History (20)
comment:1 by , 18 years ago
Keywords: | notification added; error removed |
---|---|
Owner: | changed from | to
Status: | new → assigned |
follow-up: 3 comment:2 by , 18 years ago
follow-up: 4 comment:3 by , 18 years ago
As I said in the bug report: "the same happens with the ticket system".
I don't know enough of python to argue with you, but again, as said in the report "the Exception raised is "UnicodeDecodeError", but the code expects "UnicodeEncodeError"…". This seems more like a typo then a feature… And if it is so, there's a bug in the notification API, am I correct?
The bug reflects on anything that sends email's with non ASCII chars in the subject…
Replying to cboos:
I was about to close this ticket, saying:
This is a problem with the TracHacks:DiscussionPlugin, which doesn't give
unicode
objects as input to the notification API.You should report this issue on TracHacks.
But I let you decide if the Notify API should be "tolerant" to invalid input.
follow-ups: 5 7 comment:4 by , 18 years ago
Replying to luis@rosilva.ath.cx:
As I said in the bug report: "the same happens with the ticket system".
Ok, that's a bug then. If you can provide a backtrace for that one, all the better.
I don't know enough of python to argue with you, but again, as said in the report "the Exception raised is "UnicodeDecodeError", but the code expects "UnicodeEncodeError"…". This seems more like a typo then a feature…
No, the exception handler is there to intercept failures to encode an unicode
object to a str
using the ASCII encoding, when doing name.encode('ascii')
. The actual exception comes from the fact that name
is not an unicode
instance as expected, so there's first a str
to unicode
encoding taking place, using the default charset (ascii as well, usually) and this is what is failing.
Compare:
>>> u'2342ereeêù'.encode('ascii') Traceback (most recent call last): File "<stdin>", line 1, in ? UnicodeEncodeError: 'ascii' codec can't encode characters in position 8-9: ordinal not in range(128)
with:
>>> '2342ereeêù'.encode('ascii') Traceback (most recent call last): File "<stdin>", line 1, in ? UnicodeDecodeError: 'ascii' codec can't decode byte 0xea in position 8: ordinal not in range(128)
And if it is so, there's a bug in the notification API, am I correct?
Well, this being a direct Trac API (not a plugin interface), one could argue that strict conformance to the internal usage of unicode
should be respected.
As I said above, this is an open question.
follow-up: 6 comment:5 by , 18 years ago
Replying to cboos:
Ok, that's a bug then. If you can provide a backtrace for that one, all the better.
Ooops… I can't seem to reproduce the bug now… I guess I was wrong after all… I'll report this to the Discussion Plugin in the trac-hacks trac…
Well, this being a direct Trac API (not a plugin interface), one could argue that strict conformance to the internal usage of
unicode
should be respected. As I said above, this is an open question.
Fair enough, now I understood the problem and I leave up guys to you what to do… Sorry for taking your time.
comment:6 by , 18 years ago
Replying to anonymous:
Ooops… I can't seem to reproduce the bug now… I guess I was wrong after all… I'll report this to the Discussion Plugin in the trac-hacks trac…
I'm pretty sure it is possible to send emails with non-ascii characters in the subject or the sender (I think I wrote some unit tests to validate this feature).
An important point is to check that the encoding is either 'base 64' or 'quoted printable' in the [notification]
configuration.
If you can reproduce the issue without a plugin, it would be great.
Replying to cboos:
Agreed. I don't think the code should be able to cope with non-unicode strings. Unicode is the internal format in Trac, so plugins should conform to it.
comment:7 by , 18 years ago
Correction to comment:4
… so there's first a
str
tounicode
encoding taking place …
I meant to say decoding here, of course.
comment:8 by , 18 years ago
Cc: | added |
---|
I took a look on this error and I don't see any differences between discussion notification code and ticket/notification.py in send method except two explicit str() conversions of integer values. I've changed them to unicode() or removed too but error still occurs. I've tried explicit conversion of all header fields too but this is the same. I've added debug print to line of error in Trac's notification.py and that gives me with explicit conversion to unicode this output:
[u'Test'] [u'blackhex@post.cz'] ['0.10.3'] ['Trac 0.10.3, by Edgewall Software'] [u'blackhex@seznam.cz'] [u'Test'] ['Sat, 03 Mar 2007 11:45:16 -0000'] [u''] [u'http://example.org/']
As you can see there are all custom fields in unicode but default one provided by NotifyEmail class are not. Do you think that this may couse DiscussionPlugin and if it is so how?
comment:9 by , 18 years ago
Tested again… Created a ticket with accented chars in the subject (á, é, ç) and the email was sent ok. But in what relates to the discussion plugin, I always get a backtrace like in the first post.
Already checked that the encoding was either 'base 64' or 'quoted printable' in the [notification] configuration. It is 'base 64'.
comment:10 by , 18 years ago
New results: Now everything works. I'm using Debian Sarge and there's a /etc/python2.3/site.py that can be used among other things to define the default locale. The system locale is en_GB.UTF-8
The file contains among other things:
(...) encoding = "ascii" # Default value set by _PyUnicode_Init() if 0: # Enable to support locale aware default string encodings. import locale loc = locale.getdefaultlocale() if loc[1]: encoding = loc[1] if 0: # Enable to switch off string to Unicode coercion and implicit # Unicode to string conversion. encoding = "undefined" if encoding != "ascii": # On Non-Unicode builds this will raise an AttributeError... sys.setdefaultencoding(encoding) # Needs Python Unicode build ! (...)
If I change encoding = "ascii" for encoding = "utf-8" everything works correctly. Could trac be expecting this configuration or is it supposed to work even when default encoding is defined as ascii?
This is the trac section of the trac.ini:
[trac] authz_file = authz_module_name = base_url = check_auth_ip = true database = sqlite:db/trac.db default_charset = utf-8 default_handler = WikiModule htdocs_location = ignore_auth_case = false mainnav = wiki,timeline,roadmap,browser,tickets,newticket,discussion,search,admin metanav = login,logout,settings,help,about permission_store = DefaultPermissionStore repository_dir = repository_type = svn templates_dir = /usr/local/share/trac/templates timeout = 20
And here is the notification section of the same file:
[notification] always_notify_owner = true always_notify_reporter = true always_notify_updater = true mime_encoding = base64 smtp_always_bcc = smtp_always_cc = smtp_default_domain = rosilva.ath.cx smtp_enabled = true smtp_from = trac@rosilva.ath.cx smtp_password = smtp_port = 25 smtp_replyto = trac@rosilva.ath.cx smtp_server = mail.rosilva.ath.cx smtp_subject_prefix = __default__ smtp_user = use_public_cc = false use_short_addr = true use_tls = false
follow-up: 12 comment:11 by , 18 years ago
Milestone: | 0.10.4 → 0.10.5 |
---|
Post-poning a bit, and keeping it opened because of comment:8
comment:12 by , 18 years ago
Component: | general → notification |
---|---|
Keywords: | needinfo added; notification unicode removed |
follow-up: 14 comment:13 by , 18 years ago
I've gone through the whole ticket and the code once again (as it was also not obvious to me what was being left to do…) and I believe the issue boils down in "normalizing" self.subject to unicode. It's likely that some plugins like the discussion plugin assigns and utf-8 str object to it, which is not valid in this case.
Either we consider that NotifyEmail.subject
is part of the internal API that strictly requires the use of unicode
for non-ascii content, or we consider it to be part of the external API for which we offer some robustness against mixed unicode/utf-8 str input (e.g. like for wiki macros).
I'd tend to the first choice and close this as wontfix.
comment:14 by , 18 years ago
Keywords: | needinfo removed |
---|---|
Milestone: | 0.10.5 |
Resolution: | → wontfix |
Status: | assigned → closed |
Replying to cboos:
I'd tend to the first choice and close this as wontfix.
So would I: Trac now uses unicode internally, it is up to the plugins to comply with this requirement. Thanks for the feedback.
follow-up: 16 comment:15 by , 18 years ago
Resolution: | wontfix |
---|---|
Status: | closed → reopened |
According to comment 8 the problem doesn't seems to be in the plugin itself but on trac.
Also, as discussed in comment 3 and comment 4 the same happens with the ticketing system. So, as cboos already realized in one of those comments, it should be a bug, am I wrong?
In comment comment 10, I show one thing that changes behavior of the python interpreter in what relates to this bug. Should that be the key to solve it, even if this bug is closed as won't fix, I think that this fact should be referred in trac's documentation.
Sorry for reopening the ticket, but it seemed to me that in the beginning of the ticket it was recognized this was a bug in trac and not the plugin itself and if so, closing it is going in the wrong way…
comment:16 by , 18 years ago
Replying to luis@rosilva.ath.cx:
According to comment 8 the problem doesn't seems to be in the plugin itself but on trac.
Well, if you can reproduce the error with a Unicode string, there's a bug in Trac.
If you inject anything but ASCII or Unicode in the notification subsystem, it's a bug in the client (that is, the plugin).
If you have a code snippet that triggers an error with a valid Unicode string, it would help to discriminate the root cause.
follow-up: 18 comment:17 by , 18 years ago
(just to second eblot which was as usual faster than me for replying ;-) )
In the absence of a stack trace involving the ticket subsystem, I don't believe this can be a problem with Trac. You're right that comment:10 shows a few non-unicode strings, but none of those can be problematic, as they have guaranteed ascii content (I initially had a doubt about the date, but I don't think it can be a problem anymore).
I'm willing to reconsider the above if you're able to provide a backtrace involving the ticket system or a recipe for reproducing the issue (but that has already been asked for in comment:4).
follow-up: 19 comment:18 by , 18 years ago
I'm sorry, I must have made some confusion… I missed my own post about not being able to reproduce with the ticketing system… As this was open a while ago I completely forgot it. The problem only appears in the discussion plug in.
So probably the bug is in the Discussion Plugin? And what about comment 8? I leave it up to you guys. Sorry for wasting your time with my distraction.
comment:19 by , 18 years ago
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
Ok I hope everything is clear now ;-)
Replying to anonymous:
So probably the bug is in the Discussion Plugin?
Yes, as we decided in comment:14.
And what about comment 8?
I should have written in comment:17:
You're right that comment:8 shows a few non-unicode strings, but none of those can be problematic, as they have guaranteed ascii content (I initially had a doubt about the date, but I don't think it can be a problem anymore).
…instead of comment:10, sorry for the mistake.
comment:20 by , 18 years ago
Replying to anonymous:
So probably the bug is in the Discussion Plugin?
Previous releases of Trac were using UTF-8 internally, so at some point, the notification code may have supported UTF-8 as input. (I'm not sure if by that time the plugin would have triggered an error or not.)
Trac now uses unicode internally, so the plugin may need to be updated to fulfill this requirement, yes. You might want to submit a ticket on trac-hacks.org (check for duplicate first ;-))
I was about to close this ticket, saying:
But I let you decide if the Notify API should be "tolerant" to invalid input.