Opened 16 years ago

Last modified 13 years ago

#6930 closed defect

showworkflow script can't handle accented characters in workflow states — at Version 1

Reported by: abli@… Owned by: Jonas Borgström
Priority: normal Milestone: 0.12.3
Component: ticket system Version: 0.11b1
Severity: minor Keywords: workflow unicode
Cc: Branch:
Release Notes:
API Changes:
Internal Changes:

Description (last modified by Tim Hatch)

Trac appears to handle accented characters in state names and transition names. The showworkflow script, however, fails with the following exception when run on a .ini file that Trac itself accepts:

Traceback (most recent call last):
  File "./workflow_parser.py", line 109, in ?
    main(args[0], show_ops, show_perms)
  File "./workflow_parser.py", line 76, in main
    sys.stdout.write(''.join(digraph_lines))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 189: ordinal not in range(128)
Failed to parse "inventory-workflow.ini", exiting.

The bug is actually in workflow_parser.py (and in Python's handling of sys.stdout): showworkflow runs workflow_parser.py and redirects its output. Because sys.stdout is redirected, its encoding is set to None, which means that the ascii codec is used. That codec can't handle accented characters, resulting in the exception.
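
A minimal sketch of the underlying failure, using only the standard library: encoding a string that contains U+00E1 with the ascii codec raises the same kind of UnicodeEncodeError as in the traceback above. The helper name ascii_encode_error and the sample state name are made up for illustration; they are not part of workflow_parser.py.

```python
def ascii_encode_error(text):
    """Return the UnicodeEncodeError raised by ASCII-encoding text, or None."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError as exc:
        return exc
    return None

# Hypothetical workflow state name containing an accented character.
err = ascii_encode_error(u'lez\xe1rt')
print(err.reason)  # ordinal not in range(128)
```

A plain ASCII name such as u'closed' makes the helper return None, which is why the problem only shows up with accented state or transition names.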

A possible fix is to set the encoding of sys.stdout in workflow_parser.py, by replacing

sys.stdout.write(''.join(digraph_lines))

with

    import locale, codecs
    sys.stdout = codecs.getwriter(locale.getpreferredencoding())(sys.stdout)
    sys.stdout.write(''.join(digraph_lines))
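
The wrapper approach can be tried in isolation on an in-memory byte stream. The sketch below assumes an explicit 'utf-8' encoding instead of locale.getpreferredencoding(), so that the result does not depend on the machine's locale; write_encoded and the sample digraph lines are hypothetical names and data, not taken from workflow_parser.py.

```python
import codecs
import io

def write_encoded(lines, stream, encoding):
    """Write text lines to a byte stream through a codecs stream writer."""
    # codecs.getwriter returns a StreamWriter class; instantiating it
    # wraps the byte stream so that text is encoded on write.
    writer = codecs.getwriter(encoding)(stream)
    writer.write(u''.join(lines))

buf = io.BytesIO()
write_encoded([u'digraph G {\n', u'  "lez\xe1rt";\n', u'}\n'], buf, 'utf-8')
```

After the call, buf holds the encoded bytes, with U+00E1 written as the two-byte UTF-8 sequence rather than raising an exception.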

(see, for example, http://wiki.python.org/moin/PrintFails and http://drj11.wordpress.com/2007/05/14/python-how-is-sysstdoutencoding-chosen/)

After this, the showworkflow script runs and produces correct .png output. The .ps output, however, will be wrong, as graphviz doesn't appear to be able to handle non-Latin-1 characters in PostScript output (see "More generally, how do I use non-ASCII character sets?" in http://www.graphviz.org/doc/FAQ.html)

.pdf output, however, appears to work, so I think that instead of using ps2pdf, the .pdf should be generated separately with

dot -T pdf -o ...filenames...
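
As a rough sketch of how a script could assemble that invocation, the helper below builds the argument list for rendering a .dot file directly to PDF. The function name dot_pdf_command and the file names are hypothetical; only the dot options -T (output format) and -o (output file) are assumed.

```python
def dot_pdf_command(dot_file, pdf_file):
    """Build the argument list for rendering a .dot file directly to PDF."""
    # -Tpdf selects the output format, -o names the output file.
    return ['dot', '-Tpdf', '-o', pdf_file, dot_file]

# Hypothetical file names, for illustration only.
cmd = dot_pdf_command('workflow.dot', 'workflow.pdf')
```

With Graphviz installed, the list could be passed to subprocess.call(cmd); nothing is executed in the sketch itself.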

Change History (1)

comment:1 by Tim Hatch, 16 years ago

Description: modified (diff)

First, make sure your $LANG is set correctly, and that Python is picking it up for stdout.

>>> import sys
>>> sys.stdout.encoding
'UTF-8'

Then, if you change the ''.join to u''.join, does it work correctly? I've never had to resort to codecs.getwriter just to print Unicode.
