Opened 18 years ago

Closed 13 years ago

#2971 closed defect (fixed)

Unicode encoding error w/ Windows & defined locale

Reported by: Emmanuel Blot
Owned by: Christian Boos
Priority: normal
Milestone: 0.10
Component: general
Version: devel
Severity: major
Keywords: unicode windows python23
Cc: lists@…

Description

The following piece of code:

encoding = locale.getlocale(locale.LC_TIME)[1] or \
           locale.getpreferredencoding()

in /trac/util/__init__.py breaks w/ Windows & ActiveState Python 2.3:

ActivePython 2.3.5 Build 236 (ActiveState Corp.) based on
Python 2.3.5 (#62, Feb  9 2005, 16:17:08) [MSC v.1200 32 bit (Intel)] on win32

with the following Python stack trace

Traceback (most recent call last):
  File "trac\web\main.py", line 308, in dispatch_request
    dispatcher.dispatch(req)
  File "trac\web\main.py", line 153, in dispatch
    populate_hdf(req.hdf, self.env, req)
  File "trac\web\main.py", line 69, in populate_hdf
    hdf['trac'] = {
  File "trac\util\__init__.py", line 198, in format_datetime
    return unicode(text, encoding, 'replace')
LookupError: unknown encoding: 1252

when locale is defined.

The trouble seems to come from the encoding name that Python reports for the locale:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'French_France')
'French_France.1252'
>>> locale.setlocale(locale.LC_ALL, 'English_United-Kingdom')
'English_United Kingdom.1252'
>>> locale.getlocale(locale.LC_TIME)[1]
'1252'

The expected code page was cp1252, not 1252:

>>> locale.getpreferredencoding()
'cp1252'
>>> unicode('test', 'cp1252', 'replace')
u'test'
>>> unicode('test', '1252', 'replace')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
LookupError: unknown encoding: 1252
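
One way to guard against this mismatch is to check the reported name against Python's codec registry and, for bare numeric names, retry with a "cp" prefix. The following is only a sketch in Python 3 (the ticket itself targets Python 2); the helper name resolve_codec_name and the UTF-8 fallback are assumptions made for illustration and are not part of Trac.

```python
import codecs


def resolve_codec_name(name, fallback="utf-8"):
    """Return a codec name that Python can look up, or `fallback`.

    `resolve_codec_name` is a hypothetical helper, not a Trac function.
    """
    candidates = [name]
    # Windows locales may report a bare code page number such as '1252';
    # the matching Python codec is conventionally named 'cp1252'.
    if name and name.isdigit():
        candidates.append("cp" + name)
    for candidate in candidates:
        if not candidate:
            continue
        try:
            # codecs.lookup() raises LookupError for unknown names.
            return codecs.lookup(candidate).name
        except LookupError:
            continue
    return fallback


print(resolve_codec_name("1252"))
print(resolve_codec_name("874"))
```

Resolving the name up front keeps the later decoding call free of LookupError handling, at the cost of silently using the fallback when the locale reports something Python does not know.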

Attachments (0)

Change History (12)

comment:1 by Christian Boos, 18 years ago

Milestone: 0.10
Owner: changed from Jonas Borgström to Christian Boos

Using only locale.getpreferredencoding() would work on Windows, but not on Linux, where setting the locale doesn't seem to affect it. For example, after a locale.setlocale(locale.LC_ALL, 'French'), my locale.getpreferredencoding() was still 'UTF-8', whereas locale.getlocale(locale.LC_TIME)[1] gave 'ISO8859-1', which was consistent with the encoding used by strftime.

So I guess some platform dependent code is in order here…
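
To compare the two values on a given platform, a small diagnostic along these lines may help. It is a Python 3 sketch; the function name describe_time_encodings is made up for this example, and the values it reports depend entirely on the platform and the locale currently set.

```python
import locale


def describe_time_encodings():
    """Collect the two encoding names compared in this ticket.

    `describe_time_encodings` is a hypothetical helper for illustration.
    """
    try:
        lc_time = locale.getlocale(locale.LC_TIME)[1]
    except ValueError:
        # getlocale() can fail to parse some platform-specific locale names.
        lc_time = None
    return {
        "lc_time": lc_time,
        # False: do not call setlocale() as a side effect.
        "preferred": locale.getpreferredencoding(False),
    }


print(describe_time_encodings())
```

Running it before and after a locale.setlocale() call shows whether the preferred encoding follows the locale on that system.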

comment:2 by Christian Boos, 18 years ago

Emmanuel, this fix worked for me, could you try it out?

Index: trac/util/__init__.py
===================================================================
--- trac/util/__init__.py       (revision 3118)
+++ trac/util/__init__.py       (working copy)
@@ -208,8 +208,8 @@
             t = time.localtime(int(t))

     text = time.strftime(format, t)
-    encoding = locale.getlocale(locale.LC_TIME)[1] or \
-               locale.getpreferredencoding()
+    lc_time_encoding = sys.platform != 'win32' and locale.getlocale(locale.LC_TIME)[1]
+    encoding = lc_time_encoding or locale.getpreferredencoding()
     return unicode(text, encoding, 'replace')

 def format_date(t=None, format='%x', gmt=False):

comment:3 by Christian Boos, 18 years ago

Resolution: fixed
Status: new → closed

Issue fixed in r3141.

comment:4 by Christian Boos, 17 years ago

Keywords: python23 added

More precisely, that was a win32 issue with Python 2.3. When using 2.4 or 2.5, the original code would have worked just fine ('1252' is a known encoding alias). See follow-up change r6113.

comment:5 by sakesun, 16 years ago

Resolution: fixed
Status: closed → reopened

This won't work for Thai on Python 2.5

>>> 'abc'.encode('cp874')
'abc'
>>> 'abc'.encode('874')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: 874

comment:6 by sakesun, 16 years ago

I cannot find information to confirm the behaviour for every Python version across every platform.

I use this fix for my own Trac:

def format_datetime(t=None, format='%x %X', tzinfo=None):
    """Format the `datetime` object `t` into a `unicode` string

    If `t` is None, the current time will be used.

    The formatting will be done using the given `format`, which consists
    of conventional `strftime` keys. In addition the format can be 'iso8601'
    to specify the international date format.

    `tzinfo` will default to the local timezone if left to `None`.
    """
    t = to_datetime(t, tzinfo).astimezone(tzinfo or localtz)
    if format.lower() == 'iso8601':
        format = '%Y-%m-%dT%H:%M:%SZ%z'
    text = t.strftime(format)
    encoding1 = locale.getpreferredencoding() or sys.getdefaultencoding()
    encoding2 = locale.getlocale(locale.LC_TIME)[1] or encoding1
    try:
        return unicode(text, encoding2, 'replace')
    except LookupError:
        return unicode(text, encoding1, 'replace')
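
The same try-then-fall-back idea can be written for Python 3, where the text to decode is a bytes object. This is a sketch only: decode_with_fallback and its final UTF-8 default are assumptions for illustration, not the code used in Trac.

```python
def decode_with_fallback(raw, encodings, errors="replace"):
    """Decode `raw` bytes with the first encoding name Python recognises.

    `decode_with_fallback` is a hypothetical helper; names that are empty
    or unknown to Python are skipped.
    """
    for encoding in encodings:
        if not encoding:
            continue
        try:
            return raw.decode(encoding, errors)
        except LookupError:
            # Unknown codec name: try the next candidate.
            continue
    # Assumed last resort when no candidate is usable.
    return raw.decode("utf-8", errors)


# The first name is unknown to Python, so the second one is used.
print(decode_with_fallback(b"abc", ["no-such-codec-xyz", "cp874"]))
```

Passing the LC_TIME encoding first and the preferred encoding second mirrors the order of encoding2 and encoding1 in the function above.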

comment:7 by anonymous, 16 years ago

Milestone: 0.10 → 0.11.1

comment:8 by Emmanuel Blot, 16 years ago

Locale name definition is platform-specific.

comment:9 by Christian Boos, 16 years ago

Milestone: 0.11.2 → 0.11.3

comment:10 by anonymous, 14 years ago

I just copied Python's cp874.py to 874.py. It kinda works.

comment:11 by Thijs Triemstra <lists@…>, 13 years ago

Cc: lists@… added
Milestone: next-minor-0.12.x → 0.13

Python 2.3, is that still supported? py2.4 will be dropped in Trac 0.13, let's close this?

comment:12 by Remy Blank, 13 years ago

Milestone: 0.13 → 0.10
Resolution: fixed
Status: reopened → closed

Right, and something has been fixed in 0.10 (comment:3).
