#12046 closed defect (fixed)
Replace StringIO and cStringIO with io.StringIO and io.BytesIO
Reported by: | Ryan J Ollos | Owned by: | Ryan J Ollos |
---|---|---|---|
Priority: | normal | Milestone: | 1.3.1 |
Component: | general | Version: | |
Severity: | normal | Keywords: | python3 |
Cc: | timograham@… | Branch: | |
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Replaced |
Description
The io module is available since Python 2.6. We can therefore replace StringIO.StringIO and cStringIO.StringIO with io.StringIO and io.BytesIO.
io.StringIO requires a unicode string. io.BytesIO requires a bytes string. StringIO.StringIO allows either a unicode or a bytes string. cStringIO.StringIO requires a string that is encoded as a bytes string.
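The type requirements described above can be illustrated with a minimal sketch. It is written to run on both Python 2.6+ and Python 3, and the sample strings are made up for illustration only.

```python
import io

# io.StringIO holds text (unicode on Python 2, str on Python 3).
text_buf = io.StringIO()
text_buf.write(u"caf\u00e9")
assert text_buf.getvalue() == u"caf\u00e9"

# io.BytesIO holds bytes (str on Python 2, bytes on Python 3).
bytes_buf = io.BytesIO()
bytes_buf.write(u"caf\u00e9".encode("utf-8"))
assert bytes_buf.getvalue() == b"caf\xc3\xa9"

# io.StringIO rejects a byte string with TypeError on both
# Python 2 and Python 3.
try:
    text_buf.write(b"bytes")
    rejected = False
except TypeError:
    rejected = True
```

The old StringIO.StringIO would have accepted both writes, which is why each call site has to be checked for the type it actually handles.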
Attachments (1)
Change History (19)
comment:1 by , 10 years ago
comment:2 by , 9 years ago
Milestone: | next-major-releases → 1.3.1 |
---|
Proposed changes in log:rjollos.git:t12046_stringio_replacement. I'll do more extensive testing before committing early in milestone:1.3.1. Errors in the proposed changes may also point to the need for additional test coverage, except in cases where I modified a test case incorrectly. I'm least certain of the changes in the formatter module, and I need to test the changes to bugzilla2trac.py (I'm unsure whether the attachment content will be utf-8 encoded or unicode). All tests pass on OSX for SQLite, MySQL and PostgreSQL.
comment:3 by , 9 years ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:4 by , 9 years ago
Two things:
- The Workflow macro with a unicode string raises TypeError: 'unicode' does not have the buffer interface.
- I think io.BytesIO('...') should be io.BytesIO(b'...').
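The second point can be checked with a small sketch (the payload is illustrative). io.BytesIO rejects a text string, so the b prefix keeps the literal unambiguous on both Python 2 and Python 3.

```python
import io

# A b'' literal is a byte string on both Python 2 and Python 3.
buf = io.BytesIO(b"content")
assert buf.read() == b"content"

# A text (unicode) string is rejected: io.BytesIO needs a bytes-like object.
try:
    io.BytesIO(u"content")
    accepted = True
except TypeError:
    accepted = False
```

On Python 2 the TypeError carries the same "does not have the buffer interface" message quoted in the first point.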
comment:5 by , 9 years ago
Cc: | added |
---|
comment:6 by , 8 years ago
I've addressed the comment:4 issues in log:rjollos.git:t12046_stringio_replacement.2.
comment:7 by , 8 years ago
Unit and functional tests pass with the branch on Windows 10 (python 2.7, 32 bit).
However, regarding the following change in [28fc0d3b/rjollos.git], I think we shouldn't use unicode(wikidom).
    @@ -1532,5 +1532,5 @@
             if isinstance(wikidom, basestring):
                 wikidom = WikiParser(env).parse(wikidom)
    -        self.wikidom = wikidom
    +        self.wikidom = unicode(wikidom)
     
         def generate(self, escape_newlines=False):
The wikidom variable would be passed to the text argument of Formatter.parse. The text argument accepts either a basestring or an iterable instance. See trunk/trac/wiki/formatter.py@14969:1277-1278,1280#L1275.
Instead, how about the following changes, which convert wikitext to unicode in WikiParser.parse if it is a str instance?
    diff --git a/trac/wiki/formatter.py b/trac/wiki/formatter.py
    index 852cabb0f..db1f47ed6 100644
    --- a/trac/wiki/formatter.py
    +++ b/trac/wiki/formatter.py
    @@ -1531,7 +1531,7 @@ class HtmlFormatter(object):
             self.context = context
             if isinstance(wikidom, basestring):
                 wikidom = WikiParser(env).parse(wikidom)
    -        self.wikidom = unicode(wikidom)
    +        self.wikidom = wikidom
     
         def generate(self, escape_newlines=False):
             """Generate HTML elements.
    @@ -1558,7 +1558,7 @@ class InlineHtmlFormatter(object):
             self.context = context
             if isinstance(wikidom, basestring):
                 wikidom = WikiParser(env).parse(wikidom)
    -        self.wikidom = unicode(wikidom)
    +        self.wikidom = wikidom
     
         def generate(self, shorten=False):
             """Generate HTML inline elements.
    diff --git a/trac/wiki/parser.py b/trac/wiki/parser.py
    index 598d600aa..71b285720 100644
    --- a/trac/wiki/parser.py
    +++ b/trac/wiki/parser.py
    @@ -223,6 +223,8 @@ class WikiParser(Component):
         def parse(self, wikitext):
             """Parse `wikitext` and produce a WikiDOM tree."""
             # obviously still some work to do here ;)
    +        if isinstance(wikitext, str):
    +            wikitext = wikitext.decode('utf-8')
             return wikitext
     
     
follow-up: 9 comment:8 by , 8 years ago
That looks good to me, but I don't know that area of the code very well. Is it possible that wikidom could be an iterable of strings that need to be decoded?
comment:9 by , 8 years ago
Replying to Ryan J Ollos:
> That looks good to me, but I don't know that area of the code very well. Is it possible that wikidom could be an iterable of strings that need to be decoded?

Sounds good.
    diff --git a/trac/wiki/formatter.py b/trac/wiki/formatter.py
    index 852cabb0f..510234108 100644
    --- a/trac/wiki/formatter.py
    +++ b/trac/wiki/formatter.py
    @@ -1278,6 +1278,8 @@ class Formatter(object):
                 text = text.splitlines()
     
             for line in text:
    +            if isinstance(line, str):
    +                line = line.decode('utf-8')
                 # Detect start of code block (new block or embedded block)
                 block_start_match = None
                 if WikiParser.ENDBLOCK not in line:
    @@ -1404,6 +1406,8 @@ class OneLinerFormatter(Formatter):
             processor = None
             buf = io.StringIO()
             for line in text.strip().splitlines():
    +            if isinstance(line, str):
    +                line = line.decode('utf-8')
                 if WikiParser.ENDBLOCK not in line and \
                         WikiParser._startblock_re.match(line):
                     in_code_block += 1
    @@ -1531,7 +1535,7 @@ class HtmlFormatter(object):
             self.context = context
             if isinstance(wikidom, basestring):
                 wikidom = WikiParser(env).parse(wikidom)
    -        self.wikidom = unicode(wikidom)
    +        self.wikidom = wikidom
     
         def generate(self, escape_newlines=False):
             """Generate HTML elements.
    @@ -1558,7 +1562,7 @@ class InlineHtmlFormatter(object):
             self.context = context
             if isinstance(wikidom, basestring):
                 wikidom = WikiParser(env).parse(wikidom)
    -        self.wikidom = unicode(wikidom)
    +        self.wikidom = wikidom
     
         def generate(self, shorten=False):
             """Generate HTML inline elements.
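The per-line decoding in the diff above can be sketched as a standalone helper. This is only an illustration: iter_text_lines is a hypothetical name, not a Trac function, and it tests against bytes rather than str so that the same code also runs on Python 3 (on Python 2, bytes is an alias for str).

```python
def iter_text_lines(text, encoding="utf-8"):
    """Yield each line of `text` as a text (unicode) string.

    `text` may be a single string or an iterable of strings; any byte
    string encountered is decoded with `encoding`.
    """
    if isinstance(text, (bytes, type(u""))):
        if isinstance(text, bytes):
            text = text.decode(encoding)
        text = text.splitlines()
    for line in text:
        if isinstance(line, bytes):
            line = line.decode(encoding)
        yield line


# An iterable mixing byte strings and text is normalised to text.
lines = list(iter_text_lines([b"caf\xc3\xa9", u"plain"]))
```

Decoding at the point where each line is consumed covers both the single-string case and the iterable case raised in comment:8.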
comment:10 by , 8 years ago
Release Notes: | modified (diff) |
---|---|
Resolution: | → fixed |
Status: | assigned → closed |
comment:11 by , 8 years ago
Spotted an error:
    diff --git a/trac/mimeview/api.py b/trac/mimeview/api.py
    index 8f38e9e..15246a3 100644
    --- a/trac/mimeview/api.py
    +++ b/trac/mimeview/api.py
    @@ -598,7 +598,7 @@ class Content(object):
             if size == 0:
                 return ''
             if self.content is None:
    -            self.content = io.StringIO(self.input.read(self.max_size))
    +            self.content = io.BytesIO(self.input.read(self.max_size))
             return self.content.read(size)
     
         def reset(self):
I'll add a test case.
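A minimal sketch of why the buffer has to be io.BytesIO, assuming the input object returns bytes from read(). BufferedContent is a hypothetical, simplified stand-in and not Trac's Content class; in particular, the body of reset() is an assumption, since the diff does not show it.

```python
import io


class BufferedContent(object):
    """Hypothetical stand-in for a size-limited, re-readable content reader."""

    def __init__(self, fileobj, max_size):
        self.input = fileobj
        self.max_size = max_size
        self.content = None

    def read(self, size=-1):
        if size == 0:
            return b''
        if self.content is None:
            # input.read() yields bytes, so the buffer must be io.BytesIO;
            # io.StringIO would raise TypeError for a byte string.
            self.content = io.BytesIO(self.input.read(self.max_size))
        return self.content.read(size)

    def reset(self):
        # Assumed behaviour: rewind the buffered copy so it can be re-read.
        if self.content is not None:
            self.content.seek(0)


source = io.BytesIO(b"\x89PNG binary payload")
content = BufferedContent(source, max_size=8)
first = content.read(4)
content.reset()
again = content.read()
```

Only the first max_size bytes are buffered, so the second read returns at most eight bytes here.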
by , 8 years ago
Attachment: | Screen Shot 2016-08-09 at 16.01.39.png added |
---|
follow-up: 15 comment:13 by , 8 years ago
Possible bug in Pygments (using latest, 2.2.0). I tried replacing StringIO.StringIO with io.StringIO and was getting an error with test_python_hello_mimeview: TypeError: unicode argument expected, got 'str'.
The issue appears to be associated with the newline at the start of the content, and can be reproduced with this minimal test case:
    import io
    from pygments.formatters.html import HtmlFormatter
    from pygments.lexers import get_lexer_by_name

    lexer_options = {'stripnl': False}
    lexer_name = 'ipython2'
    content = """
    """
    out = io.StringIO()
    lexer = get_lexer_by_name(lexer_name, **lexer_options)
    formatter = HtmlFormatter(nowrap=True)
    formatter.format(lexer.get_tokens(content), out)
    assert '\n' == out.getvalue()
The workaround I've found is to set the HtmlFormatter lineseparator option:

    - formatter = HtmlFormatter(nowrap=True)
    + formatter = HtmlFormatter(nowrap=True, lineseparator=u'\n')
The issue is also not seen when a single whitespace character is added before the newline. In html.py, the code takes the "if line" branch if there is a single whitespace character and a newline, but takes the "else: yield 1, lsep" branch if there's only a newline.
follow-up: 16 comment:14 by , 8 years ago
We've accumulated some instances of StringIO.StringIO, which will be incompatible with Python 3. Proposed changes in log:rjollos.git:t12046_stringio_replacement.3, including the workaround described in comment:13. I haven't yet tested the deploy_trac.fcgi change.
comment:15 by , 8 years ago
Replying to Ryan J Ollos:
> Possible bug in Pygments (using latest, 2.2.0). I tried replacing StringIO.StringIO with io.StringIO and was getting an error with test_python_hello_mimeview: TypeError: unicode argument expected, got 'str'.

I've reported the issue to the Pygments project: issue 1349.
comment:16 by , 8 years ago
Replying to Ryan J Ollos:
> I haven't yet tested the deploy_trac.fcgi change.

From testing, it seems the type of the traceback is bytes. The change from [15010#file29] was correct but was overwritten in [15424]. Corrected in proposed changes: [01221e2e/rjollos.git].
Based on what I've read, it would seem that the replacement cStringIO.StringIO → io.BytesIO can be made everywhere. We'll have to be more careful when replacing StringIO.StringIO though.
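The extra care needed for StringIO.StringIO comes from the fact that the right replacement depends on the type of data at each call site. As a rough sketch of that decision, open_buffer below is a hypothetical helper, not part of Trac or of the proposed changes; in real code the choice would normally be fixed per call site rather than made at run time.

```python
import io


def open_buffer(initial):
    """Return io.BytesIO for byte strings and io.StringIO for text."""
    if isinstance(initial, bytes):
        return io.BytesIO(initial)
    return io.StringIO(initial)


# cStringIO.StringIO call sites always hold bytes, so they map to BytesIO.
byte_buf = open_buffer(b"raw bytes")
# StringIO.StringIO call sites may hold either type and must be inspected.
text_buf = open_buffer(u"unicode text")
```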