Edgewall Software
Modify

Opened 10 years ago

Closed 8 years ago

Last modified 5 years ago

#12046 closed defect (fixed)

Replace StringIO and cStringIO with io.StringIO and io.BytesIO

Reported by: Ryan J Ollos Owned by: Ryan J Ollos
Priority: normal Milestone: 1.3.1
Component: general Version:
Severity: normal Keywords: python3
Cc: timograham@… Branch:
Release Notes:
API Changes:
Internal Changes:

Replaced StringIO.StringIO and cStringIO.StringIO with io.StringIO and io.BytesIO.

Description

The io module is available since Python 2.6. We can therefore replace StringIO.StringIO and cStringIO.StringIO with io.StringIO and io.BytesIO.

io.StringIO requires a unicode string. io.BytesIO requires a bytes string. StringIO.StringIO allows either unicode or bytes string. cStringIO.StringIO requires a string that is encoded as a bytes string.

Attachments (1)

Screen Shot 2016-08-09 at 16.01.39.png (236.8 KB ) - added by Ryan J Ollos 8 years ago.

Download all attachments as: .zip

Change History (19)

comment:1 by Ryan J Ollos, 10 years ago

Based on what I've read, it would seem that the replacement cStringIO.StringIOio.BytesIO can be made everywhere. We'll have to be more careful when replacing StringIO.StringIO though.

comment:2 by Ryan J Ollos, 9 years ago

Milestone: next-major-releases1.3.1

Proposed changes in log:rjollos.git:t12046_stringio_replacement. I'll do more extensive testing before committing early in milestone:1.3.1. Errors in the proposed changes may point to the need for additional test coverage as well, with the exception of cases that I modified the test case incorrectly. I'm least certain of the changes in the formatter module, and need to test changes to bugzilla2trac.py (unsure whether the attachment content will be utf-8 encoded or unicode). All tests pass on OSX for SQLite, MySQL and PostgreSQL.

comment:3 by Ryan J Ollos, 9 years ago

Owner: set to Ryan J Ollos
Status: newassigned

comment:4 by Jun Omae, 9 years ago

Two things:

  1. Workflow macro with unicode string raises TypeError: 'unicode' does not have the buffer interface.
  2. I think io.BytesIO('...') should be io.BytesIO(b'...')
Last edited 9 years ago by Ryan J Ollos (previous) (diff)

comment:5 by Tim Graham <timograham@…>, 9 years ago

Cc: timograham@… added

comment:6 by Ryan J Ollos, 8 years ago

comment:7 by Jun Omae, 8 years ago

Unit and functional tests pass with the branch on Windows 10 (python 2.7, 32 bit).

However, the following changes in [28fc0d3b/rjollos.git], I think we shouldn't use unicode(wikidom).

@@ -1532,5 +1532,5 @@
         if isinstance(wikidom, basestring):
             wikidom = WikiParser(env).parse(wikidom)
-        self.wikidom = wikidom
+        self.wikidom = unicode(wikidom)
 
     def generate(self, escape_newlines=False):

The wikidom variable would be passed to text argument of Formatter.parse. The text argument accepts a basestring and a iterable instance. See trunk/trac/wiki/formatter.py@14969:1277-1278,1280#L1275.

Instead, how about the following changes which convert to a unicode if it is a str instance in WikiParser.parse?

  • trac/wiki/formatter.py

    diff --git a/trac/wiki/formatter.py b/trac/wiki/formatter.py
    index 852cabb0f..db1f47ed6 100644
    a b class HtmlFormatter(object):  
    15311531        self.context = context
    15321532        if isinstance(wikidom, basestring):
    15331533            wikidom = WikiParser(env).parse(wikidom)
    1534         self.wikidom = unicode(wikidom)
     1534        self.wikidom = wikidom
    15351535
    15361536    def generate(self, escape_newlines=False):
    15371537        """Generate HTML elements.
    class InlineHtmlFormatter(object):  
    15581558        self.context = context
    15591559        if isinstance(wikidom, basestring):
    15601560            wikidom = WikiParser(env).parse(wikidom)
    1561         self.wikidom = unicode(wikidom)
     1561        self.wikidom = wikidom
    15621562
    15631563    def generate(self, shorten=False):
    15641564        """Generate HTML inline elements.
  • trac/wiki/parser.py

    diff --git a/trac/wiki/parser.py b/trac/wiki/parser.py
    index 598d600aa..71b285720 100644
    a b class WikiParser(Component):  
    223223    def parse(self, wikitext):
    224224        """Parse `wikitext` and produce a WikiDOM tree."""
    225225        # obviously still some work to do here ;)
     226        if isinstance(wikitext, str):
     227            wikitext = wikitext.decode('utf-8')
    226228        return wikitext
    227229
    228230

comment:8 by Ryan J Ollos, 8 years ago

That looks good to me, but I don't know that area of the code very well. Is it possible that wikidom could be an iterable of strings that need to be decoded?

in reply to:  8 comment:9 by Jun Omae, 8 years ago

Replying to Ryan J Ollos:

That looks good to me, but I don't know that area of the code very well. Is it possible that wikidom could be an iterable of strings that need to be decoded?

Sounds good.

  • trac/wiki/formatter.py

    diff --git a/trac/wiki/formatter.py b/trac/wiki/formatter.py
    index 852cabb0f..510234108 100644
    a b class Formatter(object):  
    12781278            text = text.splitlines()
    12791279
    12801280        for line in text:
     1281            if isinstance(line, str):
     1282                line = line.decode('utf-8')
    12811283            # Detect start of code block (new block or embedded block)
    12821284            block_start_match = None
    12831285            if WikiParser.ENDBLOCK not in line:
    class OneLinerFormatter(Formatter):  
    14041406        processor = None
    14051407        buf = io.StringIO()
    14061408        for line in text.strip().splitlines():
     1409            if isinstance(line, str):
     1410                line = line.decode('utf-8')
    14071411            if WikiParser.ENDBLOCK not in line and \
    14081412                   WikiParser._startblock_re.match(line):
    14091413                in_code_block += 1
    class HtmlFormatter(object):  
    15311535        self.context = context
    15321536        if isinstance(wikidom, basestring):
    15331537            wikidom = WikiParser(env).parse(wikidom)
    1534         self.wikidom = unicode(wikidom)
     1538        self.wikidom = wikidom
    15351539
    15361540    def generate(self, escape_newlines=False):
    15371541        """Generate HTML elements.
    class InlineHtmlFormatter(object):  
    15581562        self.context = context
    15591563        if isinstance(wikidom, basestring):
    15601564            wikidom = WikiParser(env).parse(wikidom)
    1561         self.wikidom = unicode(wikidom)
     1565        self.wikidom = wikidom
    15621566
    15631567    def generate(self, shorten=False):
    15641568        """Generate HTML inline elements.

comment:10 by Ryan J Ollos, 8 years ago

Release Notes: modified (diff)
Resolution: fixed
Status: assignedclosed

Thanks, committed to trunk in r15010, r15011.

comment:11 by Ryan J Ollos, 8 years ago

Spotted an error:

  • trac/mimeview/api.py

    diff --git a/trac/mimeview/api.py b/trac/mimeview/api.py
    index 8f38e9e..15246a3 100644
    a b class Content(object):  
    598598        if size == 0:
    599599            return ''
    600600        if self.content is None:
    601             self.content = io.StringIO(self.input.read(self.max_size))
     601            self.content = io.BytesIO(self.input.read(self.max_size))
    602602        return self.content.read(size)
    603603
    604604    def reset(self):

I'll add a test case.

by Ryan J Ollos, 8 years ago

comment:12 by Ryan J Ollos, 8 years ago

Fix with simple test case committed in r15069.

comment:13 by Ryan J Ollos, 8 years ago

Possible bug in Pygments (using latest, 2.2.0). I tried replacing StringIO.StringIO with io.StringIO and was getting an error with test_python_hello_mimeview: TypeError: unicode argument expected, got 'str'.

The issue appears to be associated with the newline at the start of the content, and can be reproduced with this minimal test case:

import io

from pygments.formatters.html import HtmlFormatter
from pygments.lexers import get_lexer_by_name


lexer_options = {'stripnl': False}
lexer_name = 'ipython2'

content = """
"""

out = io.StringIO()
lexer = get_lexer_by_name(lexer_name, **lexer_options)
formatter = HtmlFormatter(nowrap=True)
formatter.format(lexer.get_tokens(content), out)

assert '\n' == out.getvalue()

Workaround I've found is to set the HtmlFormatter lineseparator option:

- formatter = HtmlFormatter(nowrap=True)
+ formatter = HtmlFormatter(nowrap=True, lineseparator=u'\n')

The issue is also not seen if adding a single whitespace before the newline. In html.py, the code branches to if line is there is a single whitespace and newline, but branches to else: yield 1, lsep if there's only a newline.

comment:14 by Ryan J Ollos, 8 years ago

We've accumulated some instances of StringIO.StringIO, which will be incompatible with Python 3. Proposed changes in log:rjollos.git:t12046_stringio_replacement.3, including the workaround described in comment:13. I haven't yet tested the deploy_trac.fcgi change.

in reply to:  13 comment:15 by Ryan J Ollos, 8 years ago

Replying to Ryan J Ollos:

Possible bug in Pygments (using latest, 2.2.0). I tried replacing StringIO.StringIO with io.StringIO and was getting an error with test_python_hello_mimeview: TypeError: unicode argument expected, got 'str'.

I've reported the issue to the Pygments project: issue 1349.

in reply to:  14 comment:16 by Ryan J Ollos, 7 years ago

Replying to Ryan J Ollos:

I haven't yet tested the deploy_trac.fcgi change.

From testing, it seems the type of the traceback is bytes. The change from [15010#file29] was correct but was overwritten in [15424]. Corrected in proposed changes: [01221e2e/rjollos.git].

comment:17 by Ryan J Ollos, 7 years ago

comment:16 changes committed in r15904.

comment:18 by Ryan J Ollos, 5 years ago

Internal Changes: modified (diff)
Release Notes: modified (diff)

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Ryan J Ollos.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Ryan J Ollos to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.