#3058 closed defect (fixed)

Bug of the wiki compiler

Reported by:	m.petretta@…	Owned by:	Christian Boos
Priority:	highest	Milestone:	0.9.6
Component:	general	Version:	0.9.5
Severity:	normal	Keywords:	unicode

Description (last modified by Christian Boos)

When compiling a wiki page in one of my projects, the compiler crashes for no apparent reason. After several tests I isolated the problem: it happens when I add the following string to the page: " === Indice di Priorità ==="

The Python traceback follows:

Traceback (most recent call last):
  File "C:\Python24\lib\site-packages\trac\web\standalone.py", line 303, in _do_trac_req
    dispatch_request(path_info, req, env)
  File "C:\Python24\lib\site-packages\trac\web\main.py", line 139, in dispatch_request
    dispatcher.dispatch(req)
  File "C:\Python24\lib\site-packages\trac\web\main.py", line 107, in dispatch
    resp = chosen_handler.process_request(req)
  File "C:\Python24\lib\site-packages\trac\wiki\web_ui.py", line 92, in process_request
    self._render_editor(req, db, page, preview=True)
  File "C:\Python24\lib\site-packages\trac\wiki\web_ui.py", line 311, in _render_editor
    info['page_html'] = wiki_to_html(page.text, self.env, req, db)
  File "C:\Python24\lib\site-packages\trac\wiki\formatter.py", line 744, in wiki_to_html
    Formatter(env, req, absurls, db).format(wikitext, out, escape_newlines)
  File "C:\Python24\lib\site-packages\trac\wiki\formatter.py", line 599, in format
    result = re.sub(self.rules, self.replace, line)
  File "C:\Python24\lib\sre.py", line 142, in sub
    return _compile(pattern, 0).sub(repl, string, count)
  File "C:\Python24\lib\site-packages\trac\wiki\formatter.py", line 221, in replace
    return getattr(self, '_' + itype + '_formatter')(match, fullmatch)
  File "C:\Python24\lib\site-packages\trac\wiki\formatter.py", line 389, in _heading_formatter
    anchor = self._anchor_re.sub('', sans_markup.decode('utf-8'))
  File "C:\Python24\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 17: unexpected end of data
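
The final error can be reproduced in isolation. Byte 0xc3 is the first byte of the two-byte UTF-8 encoding of "à" (0xc3 0xa0), so a string that ends right after it is a truncated sequence. The following is a minimal sketch in Python 3 (the ticket itself ran on Python 2.4); `try_decode` is a hypothetical helper introduced only for this illustration, not part of Trac.

```python
# Python 3 sketch; the original report was against Python 2.4.
heading = "Indice di Priorità"
encoded = heading.encode("utf-8")   # ends with the two bytes c3 a0
truncated = encoded[:-1]            # drop the trailing 0xa0 byte


def try_decode(data):
    """Return (True, text) on success or (False, (position, reason)) on failure."""
    try:
        return True, data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return False, (exc.start, exc.reason)


full_result = try_decode(encoded)
truncated_result = try_decode(truncated)
print(full_result)
print(truncated_result)
```

The truncated input fails at byte offset 17, the same position reported in the traceback above.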

Change History (5)

comment:1 by Christian Boos, 19 years ago

Description:	modified (diff)
Keywords:	unicode added
Milestone:	→ 0.9.6
Owner:	changed from Jonas Borgström to Christian Boos
Priority:	normal → highest

Right, I can reproduce this.

comment:2 by Christian Boos, 19 years ago

Does anybody know why I get this on the command line:

>>> s = 'Indice di Priorit\xc3\xa0'
>>> s.strip()
'Indice di Priorit\xc3\xa0'

while in Trac, the same strip() operation on the same input returns 'Indice di Priorit\xc3'?

That is, this debugging change:

formatter.py

         self.out = out
         self._open_tags = []
+        print 'oneliner', type(text), `text`
+        print 'oneliner.strip', `text.strip()`
         # Simplify code blocks
         in_code_block = 0

shows:

oneliner <type 'str'> 'Indice di Priorit\xc3\xa0'
oneliner.strip 'Indice di Priorit\xc3'

comment:3 by Christian Boos, 19 years ago

Answering my own question: the locale setting changes what strip() treats as whitespace.

>>> s = 'Indice di Priorit\xc3\xa0'
>>> s.strip()
'Indice di Priorit\xc3\xa0'
>>> import locale
>>> locale.getlocale()
(None, None)
>>> locale.setlocale(locale.LC_ALL, 'en')
'English_United States.1252'
>>> s.strip()
'Indice di Priorit\xc3'
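
The session above depends on Python 2 byte strings and a Windows locale. As a rough sketch of the same effect in Python 3, where `bytes.strip()` only removes ASCII whitespace, the locale-dependent behaviour can be emulated by hand. `locale_like_rstrip` is a hypothetical helper written for this illustration, and the use of cp1252 as the codec is an assumption taken from the locale name shown above.

```python
# Python 3 sketch: emulate a strip() that classifies each byte
# according to a single-byte code page (assumed here: cp1252).
import unicodedata


def locale_like_rstrip(data, codec="cp1252"):
    """Strip trailing bytes that count as whitespace when read as `codec`."""
    end = len(data)
    while end > 0 and data[end - 1:end].decode(codec, "replace").isspace():
        end -= 1
    return data[:end]


s = "Indice di Priorità".encode("utf-8")   # ends with the two bytes c3 a0
nbsp = b"\xa0".decode("cp1252")            # what byte 0xa0 means in cp1252

print(unicodedata.name(nbsp))
print(s.strip())                 # ASCII-only strip leaves both bytes in place
print(locale_like_rstrip(s))     # code-page-aware strip cuts the sequence in half
```

Stripping byte by byte under a single-byte code page removes 0xa0 and leaves a dangling 0xc3, which is the input that later fails to decode as UTF-8.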

comment:4 by Christian Boos, 19 years ago

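… and in cp1252, byte 0xA0 is U+00A0 (NO-BREAK SPACE). Under that locale strip() treats it as whitespace, so it removes the second byte of the UTF-8 sequence for "à" (0xC3 0xA0) and leaves an invalid string.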

Yet another example of why using unicode internally is so important (see 0.10).

In the meantime, for this issue, a temporary conversion to unicode could do the trick:

Index: trac/wiki/formatter.py
===================================================================
--- trac/wiki/formatter.py	(revision 3213)
+++ trac/wiki/formatter.py	(working copy)
@@ -21,6 +21,7 @@
 import re
 import os
 import urllib
+import StringIO as pyStringIO
 
 try:
     from cStringIO import StringIO
@@ -660,7 +661,9 @@
         # Simplify code blocks
         in_code_block = 0
         processor = None
-        buf = StringIO()
+        buf = pyStringIO.StringIO()
+        text = unicode(text, 'utf-8', 'replace')
+
         for line in text.strip().splitlines():
             if line.strip() == '{{{':
                 in_code_block += 1
@@ -678,6 +681,7 @@
             else:
                 print>>buf, line
         result = buf.getvalue()[:-1]
+        result = result.encode('utf-8')
 
         if shorten:
             result = util.shorten_line(result)

Opinions?
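
The idea behind the patch, decoding to unicode before stripping and encoding back afterwards, can be sketched on its own. This is an illustrative Python 3 version, not the code committed to Trac; `strip_utf8` is a hypothetical name, and the `'replace'` error handler mirrors the one used in the diff above.

```python
# Python 3 sketch of "decode, strip, encode" for UTF-8 byte strings.
def strip_utf8(raw):
    """Strip surrounding whitespace without splitting multi-byte characters."""
    text = raw.decode("utf-8", "replace")   # work on characters, not bytes
    return text.strip().encode("utf-8")     # convert back for byte-based callers


raw = "  === Indice di Priorità".encode("utf-8")
cleaned = strip_utf8(raw)
print(cleaned)
print(cleaned.decode("utf-8"))
```

Because the strip happens on decoded text, "à" is a single character and its trailing 0xa0 byte can no longer be mistaken for whitespace.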

comment:5 by Christian Boos, 19 years ago

Resolution:	→ fixed
Status:	new → closed

Fixed in r3236.
