
Opened 12 years ago

Closed 7 years ago

#2971 closed defect (fixed)

Unicode encoding error w/ Windows & defined locale

Reported by: Emmanuel Blot
Owned by: Christian Boos
Priority: normal
Milestone: 0.10
Component: general
Version: devel
Severity: major
Keywords: unicode windows python23
Cc: lists@…
Release Notes:
API Changes:

Description

The following piece of code:

encoding = locale.getlocale(locale.LC_TIME)[1] or \
           locale.getpreferredencoding()

in /trac/util/__init__.py breaks w/ Windows & ActiveState Python 2.3:

ActivePython 2.3.5 Build 236 (ActiveState Corp.) based on
Python 2.3.5 (#62, Feb  9 2005, 16:17:08) [MSC v.1200 32 bit (Intel)] on win32

with the following Python stack trace:

Traceback (most recent call last):
  File "trac\web\main.py", line 308, in dispatch_request
    dispatcher.dispatch(req)
  File "trac\web\main.py", line 153, in dispatch
    populate_hdf(req.hdf, self.env, req)
  File "trac\web\main.py", line 69, in populate_hdf
    hdf['trac'] = {
  File "trac\util\__init__.py", line 198, in format_datetime
    return unicode(text, encoding, 'replace')
LookupError: unknown encoding: 1252

when locale is defined.

It seems the trouble comes from the Python encoding:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'French_France')
'French_France.1252'
>>> locale.setlocale(locale.LC_ALL, 'English_United-Kingdom')
'English_United Kingdom.1252'
>>> locale.getlocale(locale.LC_TIME)[1]
'1252'

The expected code page was cp1252, not 1252:

>>> locale.getpreferredencoding()
'cp1252'
>>> unicode('test', 'cp1252', 'replace')
u'test'
>>> unicode('test', '1252', 'replace')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
LookupError: unknown encoding: 1252
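
One possible guard, sketched here under assumptions: prefix a purely numeric encoding name with `cp` and confirm the result with `codecs.lookup` before decoding. The helper name `normalize_encoding` and its `fallback` parameter are hypothetical and not part of Trac.

```python
import codecs


def normalize_encoding(name, fallback='utf-8'):
    """Return a codec name Python can look up, or `fallback`.

    Hypothetical helper, not part of Trac.
    """
    candidates = [name]
    if name and name.isdigit():
        # Windows locales may report a bare code page number such as
        # '1252'; the corresponding Python codec is named 'cp1252'.
        candidates.insert(0, 'cp' + name)
    for candidate in candidates:
        if not candidate:
            continue
        try:
            # codecs.lookup raises LookupError for unknown encodings.
            return codecs.lookup(candidate).name
        except LookupError:
            pass
    return fallback


print(normalize_encoding('1252'))  # cp1252
print(normalize_encoding(None))    # utf-8
```

The lookup is done up front, so an unknown name is caught before it reaches the decoding call.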

Attachments (0)

Change History (12)

comment:1 Changed 12 years ago by Christian Boos

Milestone: 0.10
Owner: changed from Jonas Borgström to Christian Boos

Using only locale.getpreferredencoding() would work on Windows, but not on Linux, where setting the locale doesn't seem to affect it. For example, after locale.setlocale(locale.LC_ALL, 'French'), my locale.getpreferredencoding() was still 'UTF-8', whereas locale.getlocale(locale.LC_TIME)[1] gave 'ISO8859-1', which was consistent with the encoding used by strftime.

So I guess some platform dependent code is in order here…

comment:2 Changed 12 years ago by Christian Boos

Emmanuel, this fix worked for me, could you try it out?

Index: trac/util/__init__.py
===================================================================
--- trac/util/__init__.py       (revision 3118)
+++ trac/util/__init__.py       (working copy)
@@ -208,8 +208,8 @@
             t = time.localtime(int(t))

     text = time.strftime(format, t)
-    encoding = locale.getlocale(locale.LC_TIME)[1] or \
-               locale.getpreferredencoding()
+    lc_time_encoding = sys.platform != 'win32' and locale.getlocale(locale.LC_TIME)[1]
+    encoding = lc_time_encoding or locale.getpreferredencoding()
     return unicode(text, encoding, 'replace')

 def format_date(t=None, format='%x', gmt=False):

comment:3 Changed 12 years ago by Christian Boos

Resolution: fixed
Status: changed from new to closed

Issue fixed in r3141.

comment:4 Changed 10 years ago by Christian Boos

Keywords: python23 added

More precisely, that was a win32 issue with Python 2.3. When using 2.4 or 2.5, the original code would have worked just fine ('1252' is a known encoding alias). See follow-up change r6113.

comment:5 Changed 9 years ago by sakesun

Resolution: fixed deleted
Status: changed from closed to reopened

This won't work for Thai on Python 2.5:

>>> 'abc'.encode('cp874')
'abc'
>>> 'abc'.encode('874')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: 874

comment:6 Changed 9 years ago by sakesun

I cannot find information confirming the behaviour for every Python version on every platform.

I use this fix for my own Trac:

def format_datetime(t=None, format='%x %X', tzinfo=None):
    """Format the `datetime` object `t` into a `unicode` string

    If `t` is None, the current time will be used.

    The formatting will be done using the given `format`, which consists
    of conventional `strftime` keys. In addition the format can be 'iso8601'
    to specify the international date format.

    `tzinfo` will default to the local timezone if left to `None`.
    """
    t = to_datetime(t, tzinfo).astimezone(tzinfo or localtz)
    if format.lower() == 'iso8601':
        format = '%Y-%m-%dT%H:%M:%SZ%z'
    text = t.strftime(format)
    encoding1 = locale.getpreferredencoding() or sys.getdefaultencoding()
    encoding2 = locale.getlocale(locale.LC_TIME)[1] or encoding1
    try:
        return unicode(text, encoding2, 'replace')
    except LookupError:
        return unicode(text, encoding1, 'replace')

comment:7 Changed 9 years ago by anonymous

Milestone: changed from 0.10 to 0.11.1

comment:8 Changed 9 years ago by Emmanuel Blot

Locale name definition is platform-specific.

comment:9 Changed 9 years ago by Christian Boos

Milestone: changed from 0.11.2 to 0.11.3

comment:10 Changed 7 years ago by anonymous

I just copied Python's cp874.py to 874.py. It kind of works.
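
An alternative to copying codec files, sketched under assumptions: register a codec search function that resolves a bare code page number to the matching `cp` codec. `codecs.register` and `codecs.lookup` are standard library calls; the function name `numeric_codepage_search` is hypothetical and not part of Trac.

```python
import codecs


def numeric_codepage_search(name):
    """Resolve a purely numeric encoding name such as '874' to 'cp874'.

    Hypothetical search function. It returns None for anything it does
    not handle, so other registered search functions still get a chance.
    """
    if name.isdigit():
        try:
            # 'cp874' is not numeric, so this nested lookup cannot
            # recurse back into the numeric branch.
            return codecs.lookup('cp' + name)
        except LookupError:
            return None
    return None


# Register once, e.g. at application start-up.
codecs.register(numeric_codepage_search)

print(codecs.lookup('874').name)  # cp874
```

This leaves the Python installation untouched and covers any code page number for which a `cp` codec exists, not just 874.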

comment:11 Changed 7 years ago by Thijs Triemstra <lists@…>

Cc: lists@… added
Milestone: changed from next-minor-0.12.x to 0.13

Python 2.3, is that still supported? Python 2.4 will be dropped in Trac 0.13, so let's close this?

comment:12 Changed 7 years ago by Remy Blank

Milestone: changed from 0.13 to 0.10
Resolution: fixed
Status: changed from reopened to closed

Right, and something has been fixed in 0.10 (comment:3).
