Edgewall Software
Modify

Opened 17 years ago

Closed 13 years ago

Last modified 13 years ago

#4717 closed defect (fixed)

wrong table cell width with multibyte in ticket notify

Reported by: xuefer@… Owned by: Jun Omae
Priority: high Milestone: 0.12.2
Component: general Version: 0.10.3
Severity: minor Keywords: notification unicode patch needfixup
Cc: shunichi.goto@…, jun66j5@… Branch:
Release Notes:

Improved e-mail notifications for CJK users and new [notification] ambiguous_char_width = single/double setting.

API Changes:
Internal Changes:

Description

Expected:

Priority:  major                                 |    Component:  Others
Keywords:  tester xxxxxxxxxxxxx                  |     Username:  abc

Unxpected:

Priority:  major                                 |    Component:  Others
Keywords:  tester xxxxxxxxxxxxx                        |     Username:  abc

where xxxxxxxxxxxxx is 7 Chinese characters, in 14 'w' width

Attachments (5)

4717.diff (4.3 KB ) - added by Jun Omae 14 years ago.
broken-table-cell-width.png (15.5 KB ) - added by Jun Omae 14 years ago.
fixed-table-cell-width.png (15.5 KB ) - added by Jun Omae 14 years ago.
t4717-using-east-asian-width-r10051.diff (4.6 KB ) - added by Jun Omae 13 years ago.
Refreshed t4717.diff to [10051/branches/0.12-stable]
t4717-unittest.diff (12.0 KB ) - added by Jun Omae 13 years ago.
unit-test for #4717 and #9494

Download all attachments as: .zip

Change History (24)

comment:1 by Matthew Good, 17 years ago

Milestone: 0.10.4
Resolution: wontfix
Status: newclosed

So, you're saying that the characters in question are displayed in a font such that each character is wider than an ascii character. For plaintext mail formatting we expect the emails to be displayed in a fixed-width font so that all characters are the same width. If you can, configure your mail client to use a font where the characters will all be the same width. If you can't do that, I'm afraid there's no way to work around that for the plaintext mails. HTML email formatting has been requested, and would enable better formatting with variable-width fonts. See #2625.

comment:2 by xuefer@…, 17 years ago

Resolution: wontfix
Status: closedreopened

i'm sure using fixed-width font, and each Chinese character is in the same width as 2 ascii, which IS expected, even u use fixed-width font in html. i agree that using html table can solve the problem, because browser calc the width correctly, but not trac. that's why i ask tach to fix itself, calc the text width, not character count (count of code point)

if u know php, there is mb_strwidth to do the job.

i'm not familar with python, but it seems there's:

east_asian_width( unichr) 

Returns the east asian width assigned to the Unicode character unichr as string. New in version 2.4. 

One shall never suppose 1 Chinese char is same with as 1 ascii char, in fixed-width font.

comment:3 by Christian Boos, 17 years ago

Component: ticket systemgeneral
Keywords: notification unicode added
Milestone: 0.12
Severity: normalminor

east_asian_width(unichr) - New in version 2.4

Maybe we can bump the Python requirement in the next version of Trac?

(btw, mgood, tee from itertools is also new in 2.4)

comment:4 by Christian Boos, 15 years ago

Owner: changed from Jonas Borgström to Christian Boos
Status: reopenednew

Note for self: reuse this code

    from unicodedata import east_asian_width as eaw
    _colwidth = lambda s: sum([eaw(c) in 'WF' and 2 or 1 for c in s])

(from Shun-ichi Goto on Mercurial-devel, used with permission)

where xxxxxxxxxxxxx is 7 Chinese characters, in 14 'w' width

maybe a real example would help…

in reply to:  4 comment:5 by Shun-ichi Goto <shunichi.goto@…>, 15 years ago

Cc: shunichi.goto@… added

Replying to cboos:

where xxxxxxxxxxxxx is 7 Chinese characters, in 14 'w' width

maybe a real example would help…

For example in Japanese, the text "日本語" (means "Japanese") has 3 characters and occupies 6 column width.

>>> len(u'日本語')
3
>>> _colwidth(u'日本語')
6

Expected (left:

Priority:  major                                 |    Component:  Others
Keywords:  tester 日本語                         |     Username:  abc

Unwanted (extra 3 spaces):

Priority:  major                                 |    Component:  Others
Keywords:  tester 日本語                            |     Username:  abc

comment:6 by Christian Boos, 15 years ago

Note that it doesn't look like browsers are respecting this 3 asian characters → 6 column width convention, at least when using the default fixed font, as can be seen in the above example. Same goes for Thunderbird when displaying the notification mail for comment:5.

in reply to:  6 comment:7 by Shun-ichi Goto <shunichi.goto@…>, 15 years ago

Replying to cboos:

Note that it doesn't look like browsers are respecting this 3 asian characters → 6 column width convention, at least when using the default fixed font, as can be seen in the above example.

I guess it is issue of renderer of application on using multiple fonts (for ASCII and Japanese). For example, if you use MSゴシック (msgothic) font on windows, you will see good looking because it is a fixed-width font having both ASCII and Japanese glyphs.

by Jun Omae, 14 years ago

Attachment: 4717.diff added

by Jun Omae, 14 years ago

Attachment: broken-table-cell-width.png added

by Jun Omae, 14 years ago

Attachment: fixed-table-cell-width.png added

comment:8 by Jun Omae, 14 years ago

Cc: jun66j5@… added

I created a patch 4717.diff for the issue using unicodedata.east_asian_width.

New [notification] ambiguous_char_width option is like ambiwidth option of Vim. CJK (Chinese, Japanese and Korean) users need this option and specify 'double'. Default of the option is 'single', the most users don't need to specify the option.

If you want to know about Ambiguous Characters, visit http://www.unicode.org/reports/tr11/.

Trac 0.12dev-r9679:

Trac 0.12dev-r9679 with the patch:

comment:9 by Christian Boos, 14 years ago

Milestone: next-major-0.1X0.12.1

Thanks for the patch, it looks good except for a few details:

  • from unicodedata import east_asian_width should be among the first block of imports, as it's a module coming from stock Python
  • I didn't count, but it looks like you have some lines > 79 (single width ;-) ) chars; please check our TracDev/CodingStyle

Finally, I think the setting makes sense in a locally deployed Trac instance, where most users are either CJK ones or not. But ultimately this should become a user setting, in the notification preferences (#4056).

comment:10 by Christian Boos, 14 years ago

Keywords: patch added
Milestone: 0.12.1next-minor-0.12.x
Priority: normalhigh

The notification code has changed in this area (r10051), the patch 4717.diff needs to be refreshed.

comment:11 by Christian Boos, 13 years ago

Keywords: needfixup added

by Jun Omae, 13 years ago

Refreshed t4717.diff to [10051/branches/0.12-stable]

in reply to:  10 comment:12 by Jun Omae, 13 years ago

Replying to cboos:

The notification code has changed in this area (r10051), the patch 4717.diff needs to be refreshed.

Reworked the issue and t4717-using-east-asian-width-r10051.diff is refreshed.

comment:13 by Andrew C Martin <andrew.c.martin@…>, 13 years ago

per cboos' request in ticket:2378#comment:15, I reviewed and tested attachment:t4717-using-east-asian-width-r10051.diff. The patch produced no regressions in my testing and the code changes looked fine. I didn't test with any locales other than "en_US.UTF-8".

Nice touch incorporating the obfuscated email length in the table width calculations!

comment:14 by Christian Boos, 13 years ago

Milestone: next-minor-0.12.x0.12.2
Owner: changed from Christian Boos to Jun Omae
Release Notes: modified (diff)

Thanks Jun and Andrew.

Patch committed in r10393.

comment:15 by Christian Boos, 13 years ago

Resolution: fixed
Status: newclosed

And minor tweaks done in r10394 (to_unicode(t) is a no-op if t is already unicode).

A unit-test would have been nice, feel free to contribute one later, Jun ;-)

comment:16 by Thijs Triemstra, 13 years ago

would be good to have a "Since 0.12.2" note in the doc(string).

comment:17 by Thijs Triemstra, 13 years ago

sorry, missed r10394, nevermind :)

Last edited 13 years ago by Thijs Triemstra (previous) (diff)

by Jun Omae, 13 years ago

Attachment: t4717-unittest.diff added

unit-test for #4717 and #9494

in reply to:  15 comment:18 by Jun Omae, 13 years ago

Replying to cboos:

A unit-test would have been nice, feel free to contribute one later, Jun ;-)

I create t4717-unittest.diff for #4717 and #9494. Please apply if you can :)

comment:19 by Remy Blank, 13 years ago

(Perfect) patch applied in [10413]. Thanks!

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Jun Omae.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Jun Omae to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.