Edgewall Software
Modify

Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#13327 closed defect (fixed)

TypeError: StringIO() argument 1 must be string or buffer, not None when running Trac with gevent

Reported by: Clemens Lang <cal@…> Owned by: Clemens Lang <cal@…>
Priority: normal Milestone: 1.4.3
Component: plugin/git Version: 1.2.2
Severity: normal Keywords: patch
Cc: cal@… Branch:
Release Notes:

PyGIT.cat_file() handles short reads from a pipe.

API Changes:
Internal Changes:

Description

How to Reproduce

The MacPorts Trac installation (currently still 1.2.x) runs using gunicorn with gevent workers and runs the monkey patching code for gevent on startup:

from gevent import monkey
monkey.patch_all()

This replaces a number builtin methods with reimplementations that support cooperative multitasking. This, however, has an effect on the trac browser when showing a file from a Git repository that's larger than 65535 bytes (at least it's that specific number on our system, this may depend on what the system buffers in various places), where instead of showing the file, you get a TypeError described below.

This happens because tracopt.versioncontrol.git.PyGIT.Storage.cat_file reads from self.__cat_file_pipe.stdout using read(), assuming that this read will always return the full number of requested bytes. When running with gevent, this is not the case, and a read can return fewer bytes than requested. I'm not completely sure whether that is a bug in gevent and it shouldn't behave like this on read(), or whether it's a bug in Trac assuming that the read will always return the full number of bytes.

I am attaching a patch against trunk that wraps the call to read in a loop to check, detect, and counteract this situation rather than silently ignoring it, because a partial read can mess up processing the results of the next command sent to the git cat-file --batch process.

My patch also contains a test for this, but adding the monkey patching code above to the test file is required for the problem to show up without the fix.

While doing a GET operation on /browser/macports-base/src/port1.0/portconfigure.tcl, Trac issued an internal error.

Request parameters:

{'path': u'/macports-base/src/port1.0/portconfigure.tcl'}

User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Firefox/78.0

System Information

Trac 1.2.2
Babel 2.6.0
dnspython 1.16.0
Docutils 0.16
Genshi 0.7.1 (with speedups)
GIT 2.20.1
Pillow 5.4.1
psycopg2 2.7.7
Pygments 2.3.1
Python 2.7.16 (default, Oct 10 2019, 22:02:15)
[GCC 8.3.0]
pytz 2019.1
setuptools 39.0.1
Subversion 1.10.4 (r1850624)
jQuery 1.11.3
jQuery UI 1.11.4
jQuery Timepicker 1.5.5

Enabled Plugins

CcMe 1.2
MacPortsTicketsModule 1.0
NavAdd 0.1.1
TicketReporterPermissionsPolicy N/A
trac-configurable-ctu 1.0
trac-github 2.4.dev0
TracAutocompleteUsersPlugin 0.4.4.dev0
TracSpamFilter 1.2.4.dev0
TracUserMigration 0.1
TracUsernameDecoratePlugin 0.12.0.1
TracWikiNotification 0.4.3

Interface Customization

shared-htdocs
shared-templates
site-htdocs .gitignore, MacPorts.png, macports.ico, style.css
site-templates site.html

Python Traceback

Traceback (most recent call last):
  File "/var/www/trac/virtualenv/local/lib/python2.7/site-packages/trac/web/main.py", line 623, in _dispatch_request
    dispatcher.dispatch(req)
  File "/var/www/trac/virtualenv/local/lib/python2.7/site-packages/trac/web/main.py", line 239, in dispatch
    resp = chosen_handler.process_request(req)
  File "/var/www/trac/virtualenv/local/lib/python2.7/site-packages/trac/versioncontrol/web_ui/browser.py", line 407, in process_request
    file_data = self._render_file(req, context, repos, node, rev)
  File "/var/www/trac/virtualenv/local/lib/python2.7/site-packages/trac/versioncontrol/web_ui/browser.py", line 728, in _render_file
    node.get_processed_content(),
  File "/var/www/trac/virtualenv/local/lib/python2.7/site-packages/trac/versioncontrol/api.py", line 1165, in get_processed_content
    return self.get_content()
  File "/var/www/trac/virtualenv/local/lib/python2.7/site-packages/tracopt/versioncontrol/git/git_fs.py", line 720, in get_content
    return self.repos.git.get_file(self.fs_sha)
  File "/var/www/trac/virtualenv/local/lib/python2.7/site-packages/tracopt/versioncontrol/git/PyGIT.py", line 838, in get_file
    return cStringIO.StringIO(self.cat_file('blob', str(sha)))
TypeError: StringIO() argument 1 must be string or buffer, not None

Attachments (1)

0001-1.5.2dev-PyGIT.Storage.cat_file-Handle-short-reads.patch (4.6 KB ) - added by Clemens Lang <cal@…> 3 years ago.

Download all attachments as: .zip

Change History (6)

comment:1 by Jun Omae, 3 years ago

Component: version control/browserplugin/git
Milestone: 1.4.3
Owner: set to Jun Omae
Status: newassigned
Version: 1.2.2

Thanks for the reporting.

I'll test the patch and investigate behavior and document of read(). I may push it to 1.0-stable and 1.2-stable branches if possible.

comment:2 by Jun Omae, 3 years ago

read(2) with a pipe can return smaller than the given n parameter. The proposed patch is correct with/without gevent.monkey.patch_*(). However, I think io.BytesIO or tempfile.SpooledTemporaryFile should be used rather than concatenating a bytes instance repeatedly.


read(2) - Linux manual page says:

On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal.

read - The Open Group Base Specifications Issue 7, 2018 edition says:

Upon successful completion, where nbyte is greater than 0, read() shall mark for update the last data access timestamp of the file, and shall return the number of bytes read. This number shall never be greater than nbyte. The value returned may be less than nbyte if the number of bytes left in the file is less than nbyte, if the read() request was interrupted by a signal, or if the file is a pipe or FIFO or special file and has fewer than nbyte bytes immediately available for reading. For example, a read() from a file associated with a terminal may return one typed line of data.

comment:3 by Jun Omae, 3 years ago

Release Notes: modified (diff)
Resolution: fixed
Status: assignedclosed

Revised the proposed patch and committed in [17519:17523]. Thanks.

comment:4 by Jun Omae, 3 years ago

Owner: changed from Jun Omae to Clemens Lang <cal@…>

comment:5 by Jun Omae, 3 years ago

I missed test failing on Windows…. Fixed in [17524:17526].

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Clemens Lang <cal@…>.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Clemens Lang <cal@…> to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.