#12046 closed defect (fixed)

Replace StringIO and cStringIO with io.StringIO and io.BytesIO

Reported by:	Ryan J Ollos	Owned by:	Ryan J Ollos
Priority:	normal	Milestone:	1.3.1
Component:	general	Version:
Severity:	normal	Keywords:	python3
Cc:	timograham@…	Branch:
Release Notes:
API Changes:
Internal Changes:	Replaced `StringIO.StringIO` and `cStringIO.StringIO` with `io.StringIO` and `io.BytesIO`.

Description

The io module has been available since Python 2.6. We can therefore replace StringIO.StringIO and cStringIO.StringIO with io.StringIO and io.BytesIO.

io.StringIO requires a unicode string. io.BytesIO requires a bytes string. StringIO.StringIO accepts either a unicode or a bytes string. cStringIO.StringIO requires a string that is encoded as a bytes string.
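The type requirements described above can be checked with a short sketch. The `accepts` helper is hypothetical and exists only for this illustration; the `u''` and `b''` prefixes keep the snippet valid on both Python 2.7 and Python 3.

```python
import io

# io.StringIO holds text; io.BytesIO holds bytes.
text_buf = io.StringIO(u'caf\xe9')
bytes_buf = io.BytesIO(u'caf\xe9'.encode('utf-8'))


def accepts(factory, value):
    """Return True if factory(value) succeeds, False if it raises TypeError.

    Hypothetical helper, used only to probe the constructors.
    """
    try:
        factory(value)
    except TypeError:
        return False
    return True


text_ok = accepts(io.StringIO, u'abc')              # text into StringIO
bytes_ok = accepts(io.BytesIO, b'abc')              # bytes into BytesIO
text_rejects_bytes = not accepts(io.StringIO, b'abc')
bytes_rejects_text = not accepts(io.BytesIO, u'abc')
```

Each mismatched combination raises TypeError at construction time, which is why a replacement of StringIO.StringIO needs a decision per call site about whether the data is text or bytes.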

Attachments (1)

Screen Shot 2016-08-09 at 16.01.39.png (236.8 KB ) - added by Ryan J Ollos 9 years ago.


Change History (19)

comment:1 by Ryan J Ollos, 10 years ago

Based on what I've read, it would seem that the replacement cStringIO.StringIO → io.BytesIO can be made everywhere. We'll have to be more careful when replacing StringIO.StringIO though.

comment:2 by Ryan J Ollos, 10 years ago

Milestone:	next-major-releases → 1.3.1

Proposed changes in log:rjollos.git:t12046_stringio_replacement. I'll do more extensive testing before committing early in milestone:1.3.1. Errors in the proposed changes may also point to the need for additional test coverage, except in cases where I modified the test case incorrectly. I'm least certain of the changes in the formatter module, and I need to test the changes to bugzilla2trac.py (I'm unsure whether the attachment content will be utf-8 encoded or unicode). All tests pass on OSX for SQLite, MySQL and PostgreSQL.

comment:3 by Ryan J Ollos, 10 years ago

Owner:	set to Ryan J Ollos
Status:	new → assigned

comment:4 by Jun Omae, 10 years ago

Two things:

1. Workflow macro with unicode string raises TypeError: 'unicode' does not have the buffer interface.
2. I think io.BytesIO('...') should be io.BytesIO(b'...')
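One way to avoid the TypeError from the first point is to encode text before wrapping it in io.BytesIO. The following is a minimal sketch under that assumption, not the change made in the branch; `bytes_buffer` is a hypothetical helper name and UTF-8 is an assumed encoding.

```python
import io


def bytes_buffer(data, encoding='utf-8'):
    """Wrap data in io.BytesIO, encoding it first if it is text.

    Hypothetical helper: the name and the default encoding are assumptions.
    """
    if not isinstance(data, bytes):
        data = data.encode(encoding)
    return io.BytesIO(data)


buf = bytes_buffer(u'r\xe9sum\xe9')   # text is encoded to UTF-8 bytes
raw = bytes_buffer(b'abc')            # bytes pass through unchanged
```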

Last edited 10 years ago by Ryan J Ollos

comment:5 by Tim Graham <timograham@…>, 10 years ago

Cc:	timograham@… added

comment:6 by Ryan J Ollos, 9 years ago

I've addressed the comment:4 issues in log:rjollos.git:t12046_stringio_replacement.2.

comment:7 by Jun Omae, 9 years ago

Unit and functional tests pass with the branch on Windows 10 (python 2.7, 32 bit).

However, regarding the following change in [28fc0d3b/rjollos.git], I think we shouldn't use unicode(wikidom).

@@ -1532,5 +1532,5 @@
         if isinstance(wikidom, basestring):
             wikidom = WikiParser(env).parse(wikidom)
-        self.wikidom = wikidom
+        self.wikidom = unicode(wikidom)
 
     def generate(self, escape_newlines=False):

The wikidom variable would be passed to the text argument of Formatter.parse. The text argument accepts either a basestring or an iterable. See trunk/trac/wiki/formatter.py@14969:1277-1278,1280#L1275 .

Instead, how about the following changes, which convert wikitext to unicode in WikiParser.parse if it is a str instance?

trac/wiki/formatter.py

diff --git a/trac/wiki/formatter.py b/trac/wiki/formatter.py
index 852cabb0f..db1f47ed6 100644
@@ … @@ class HtmlFormatter(object):
         self.context = context
         if isinstance(wikidom, basestring):
             wikidom = WikiParser(env).parse(wikidom)
-        self.wikidom = unicode(wikidom)
+        self.wikidom = wikidom
 
     def generate(self, escape_newlines=False):
         """Generate HTML elements.
@@ … @@ class InlineHtmlFormatter(object):
         self.context = context
         if isinstance(wikidom, basestring):
             wikidom = WikiParser(env).parse(wikidom)
-        self.wikidom = unicode(wikidom)
+        self.wikidom = wikidom
 
     def generate(self, shorten=False):
         """Generate HTML inline elements.

trac/wiki/parser.py

diff --git a/trac/wiki/parser.py b/trac/wiki/parser.py
index 598d600aa..71b285720 100644
@@ … @@ class WikiParser(Component):
     def parse(self, wikitext):
         """Parse `wikitext` and produce a WikiDOM tree."""
         # obviously still some work to do here ;)
+        if isinstance(wikitext, str):
+            wikitext = wikitext.decode('utf-8')
         return wikitext

follow-up: 9 comment:8 by Ryan J Ollos, 9 years ago

That looks good to me, but I don't know that area of the code very well. Is it possible that wikidom could be an iterable of strings that need to be decoded?

in reply to: 8 comment:9 by Jun Omae, 9 years ago

Replying to Ryan J Ollos:

That looks good to me, but I don't know that area of the code very well. Is it possible that wikidom could be an iterable of strings that need to be decoded?

Sounds good.

trac/wiki/formatter.py

diff --git a/trac/wiki/formatter.py b/trac/wiki/formatter.py
index 852cabb0f..510234108 100644
@@ … @@ class Formatter(object):
             text = text.splitlines()
 
         for line in text:
+            if isinstance(line, str):
+                line = line.decode('utf-8')
             # Detect start of code block (new block or embedded block)
             block_start_match = None
             if WikiParser.ENDBLOCK not in line:
@@ … @@ class OneLinerFormatter(Formatter):
         processor = None
         buf = io.StringIO()
         for line in text.strip().splitlines():
+            if isinstance(line, str):
+                line = line.decode('utf-8')
             if WikiParser.ENDBLOCK not in line and \
                    WikiParser._startblock_re.match(line):
                 in_code_block += 1
@@ … @@ class HtmlFormatter(object):
         self.context = context
         if isinstance(wikidom, basestring):
             wikidom = WikiParser(env).parse(wikidom)
-        self.wikidom = unicode(wikidom)
+        self.wikidom = wikidom
 
     def generate(self, escape_newlines=False):
         """Generate HTML elements.
@@ … @@ class InlineHtmlFormatter(object):
         self.context = context
         if isinstance(wikidom, basestring):
             wikidom = WikiParser(env).parse(wikidom)
-        self.wikidom = unicode(wikidom)
+        self.wikidom = wikidom
 
     def generate(self, shorten=False):
         """Generate HTML inline elements.
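The per-line decoding in the Formatter hunks above can be sketched as a standalone generator. This is an illustration only, not Trac code: `iter_text_lines` is a hypothetical name, UTF-8 is assumed, and unlike the patch it decodes a whole byte string before splitting it. It is written to run on both Python 2.7 and Python 3.

```python
def iter_text_lines(text, encoding='utf-8'):
    """Yield each line of text as a unicode string.

    text may be a single string or an iterable of lines; any byte
    strings are decoded with the given encoding (assumed UTF-8).
    """
    if isinstance(text, bytes):
        text = text.decode(encoding)
    if isinstance(text, type(u'')):
        # A single string is split into lines, as Formatter does.
        text = text.splitlines()
    for line in text:
        if isinstance(line, bytes):
            line = line.decode(encoding)
        yield line


# An iterable mixing byte and unicode lines comes out as all unicode.
lines = list(iter_text_lines([b'caf\xc3\xa9', u'plain']))
```

Decoding inside the loop is what covers the case raised in comment:8, where the input is an iterable whose items still need decoding.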

comment:10 by Ryan J Ollos, 9 years ago

Release Notes:	modified (diff)
Resolution:	→ fixed
Status:	assigned → closed

Thanks, committed to trunk in r15010, r15011.

comment:11 by Ryan J Ollos, 9 years ago

Spotted an error:

trac/mimeview/api.py

diff --git a/trac/mimeview/api.py b/trac/mimeview/api.py
index 8f38e9e..15246a3 100644
@@ … @@ class Content(object):
         if size == 0:
             return ''
         if self.content is None:
-            self.content = io.StringIO(self.input.read(self.max_size))
+            self.content = io.BytesIO(self.input.read(self.max_size))
         return self.content.read(size)
 
     def reset(self):

I'll add a test case.
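The reason io.BytesIO is the right cache here can be illustrated with a simplified stand-in. `BufferedContent` below is hypothetical and only loosely modelled on the hunk above; in particular, the body of `reset` is an assumption, since the diff shows only its signature.

```python
import io


class BufferedContent(object):
    """Hypothetical stand-in for a class that caches a binary input stream."""

    def __init__(self, input, max_size):
        self.input = input
        self.max_size = max_size
        self.content = None

    def read(self, size=-1):
        if size == 0:
            return b''
        if self.content is None:
            # The input yields bytes, so the cache must be io.BytesIO;
            # io.StringIO would raise TypeError on this data.
            self.content = io.BytesIO(self.input.read(self.max_size))
        return self.content.read(size)

    def reset(self):
        # Assumed behaviour: rewind the cached buffer.
        if self.content is not None:
            self.content.seek(0)


source = io.BytesIO(b'\x89PNG binary data')
content = BufferedContent(source, max_size=8)
first = content.read(4)
content.reset()
again = content.read()
```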

by Ryan J Ollos, 9 years ago

Attachment:	Screen Shot 2016-08-09 at 16.01.39.png added

comment:12 by Ryan J Ollos, 9 years ago

Fix with simple test case committed in r15069.

follow-up: 15 comment:13 by Ryan J Ollos, 8 years ago

Possible bug in Pygments (using latest, 2.2.0). I tried replacing StringIO.StringIO with io.StringIO and was getting an error with test_python_hello_mimeview : TypeError: unicode argument expected, got 'str'.

The issue appears to be associated with the newline at the start of the content, and can be reproduced with this minimal test case:

import io

from pygments.formatters.html import HtmlFormatter
from pygments.lexers import get_lexer_by_name


lexer_options = {'stripnl': False}
lexer_name = 'ipython2'

content = """
"""

out = io.StringIO()
lexer = get_lexer_by_name(lexer_name, **lexer_options)
formatter = HtmlFormatter(nowrap=True)
formatter.format(lexer.get_tokens(content), out)

assert '\n' == out.getvalue()

Workaround I've found is to set the HtmlFormatter lineseparator option:

- formatter = HtmlFormatter(nowrap=True)
+ formatter = HtmlFormatter(nowrap=True, lineseparator=u'\n')

The issue is also not seen when a single whitespace character precedes the newline. In html.py, the code takes the if line: branch when the line contains a whitespace character and a newline, but takes the else: yield 1, lsep branch when there's only a newline.
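The failure mode can be reproduced without Pygments. The sketch below is not Pygments code: `format_lines` and `render` are hypothetical, and it assumes the error comes from a non-unicode line separator reaching io.StringIO, which is what the workaround suggests.

```python
import io


def format_lines(lines, lineseparator):
    """Yield each non-empty line, or the bare separator for an empty one."""
    for line in lines:
        if line:
            yield line + u'\n'
        else:
            # Only this branch emits the separator unchanged.
            yield lineseparator


def render(lines, lineseparator):
    """Write the formatted pieces to an io.StringIO and return its value."""
    out = io.StringIO()
    for piece in format_lines(lines, lineseparator):
        out.write(piece)
    return out.getvalue()


# A unicode separator is accepted by io.StringIO.
rendered = render([u''], u'\n')

# A byte-string separator is rejected with TypeError.
try:
    render([u''], b'\n')
except TypeError:
    bytes_separator_failed = True
else:
    bytes_separator_failed = False
```

Because only the empty-line branch emits the separator as-is, content with any other character on the line never exposes the separator's type.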

follow-up: 16 comment:14 by Ryan J Ollos, 8 years ago

We've accumulated some instances of StringIO.StringIO, which will be incompatible with Python 3. Proposed changes in log:rjollos.git:t12046_stringio_replacement.3, including the workaround described in comment:13. I haven't yet tested the deploy_trac.fcgi change.

in reply to: 13 comment:15 by Ryan J Ollos, 8 years ago

Replying to Ryan J Ollos:

Possible bug in Pygments (using latest, 2.2.0). I tried replacing StringIO.StringIO with io.StringIO and was getting an error with test_python_hello_mimeview : TypeError: unicode argument expected, got 'str'.

I've reported the issue to the Pygments project: issue 1349.

in reply to: 14 comment:16 by Ryan J Ollos, 8 years ago

Replying to Ryan J Ollos:

I haven't yet tested the deploy_trac.fcgi change.

From testing, it seems the type of the traceback is bytes. The change from [15010#file29] was correct but was overwritten in [15424]. Corrected in proposed changes: [01221e2e/rjollos.git].

comment:17 by Ryan J Ollos, 8 years ago

comment:16 changes committed in r15904.

comment:18 by Ryan J Ollos, 5 years ago

Internal Changes:	modified (diff)
Release Notes:	modified (diff)
