Context Navigation

Modify ↓

#5697 closed defect (worksforme)

Ephemeral wsgi.input stream makes it nearly impossible for plugins to change URL structure

Reported by:	Dave Abrahams <dave@…>	Owned by:	Jonas Borgström
Priority:	normal	Milestone:
Component:	general	Version:	devel
Severity:	normal	Keywords:
Cc:	daniel@…	Branch:
Release Notes:
API Changes:
Internal Changes:

Description

See, for example, the TracForge project list.

Some information is lost as req.args is constructed, and all of that information needs to be reassembled exactly as-is for, e.g. file upload (which uses a multipart-formdata request), even though the plugin may not even be interested in touching it (as in the case of TracForge). Python gives us no facility for undoing the parsing of cgi.FieldStorage, so anyone attempting this sort of hook will have to repeat an extremely complicated series of steps; it would be better to capture the request in a string that can be reused.

The following simple patch remedies the problem:

api.py

-              old
+              new
         args = _RequestArgs()
         fp = self.environ['wsgi.input']
+        # We need to be able to read the data more than once.
+        from StringIO import StringIO
+        data = fp.read()
+        fp = StringIO(data)
+        self.environ['wsgi.input'] = StringIO(data)
         # Avoid letting cgi.FieldStorage consume the input stream when the
         # request does not contain form data

Attachments (0)

Change History (12)

follow-up: 2 comment:1 by Graham.Dumpleton@…, 18 years ago

I don't know the problem you are trying to solve, but a few comments.

First note that the result of calling read() on 'wsgi.input' with no argument will have an undefined result. For some WSGI adapters it will return all data from the request, but on others it will hang indefinitely if HTTP/1.1 keep alive is enabled.

When calling read() you should always be using the CONTENT_LENGTH from the WSGI environment. Ie.,

clen = int(environ.get('CONTENT_LENGTH', '0'))
data = fp.read(clen)

That said, one shouldn't be doing what you are wanting to do anyway as it will have the result when doing POST with very large input files of reading the complete files into memory. If you are trying to upload files which are GBs in size that would be really bad.

in reply to: 1 comment:2 by Dave Abrahams <dave@…>, 18 years ago

Cc:	daniel@… added

Replying to Graham.Dumpleton@gmail.com:

I don't know the problem you are trying to solve,

The problem of being able to, in a plugin, modify dispatching of URLs without completely taking it over. In other words, I need to be able to change some details of the Request and send it back to Trac for processing.

but a few comments.

First note that the result of calling read() on 'wsgi.input' with no argument will have an undefined result. For some WSGI adapters it will return all data from the request, but on others it will hang indefinitely if HTTP/1.1 keep alive is enabled.

Yes, I discovered as much, the hard way, a few days ago. The patch was actually produced by a colleague of mine and I overlooked that problem.

When calling read() you should always be using the CONTENT_LENGTH from the WSGI environment. Ie.,
clen = int(environ.get('CONTENT_LENGTH', '0'))
data = fp.read(clen)

Right.

That said, one shouldn't be doing what you are wanting to do anyway

One shouldn't want to write plugins that alter the URL dispatching structure? I think you should take that up with Noah Kantrowitz, then. TracForge is his plugin; I'm just trying to figure out how to make it work with 0.11.

as it will have the result when doing POST with very large input files of reading the complete files into memory. If you are trying to upload files which are GBs in size that would be really bad.

That's true; that patch is not the best approach. One alternative approach builds a new compound "File-like" stream object that is the lazy concattenation of many smaller streams, including the ones held in any FieldStorage objects that remain in the stream.

comment:3 by Dave Abrahams <dave@…>, 18 years ago

Here's an even simpler and better approach:

trac/web/api.py

             '_inheaders': Request._parse_headers
+        }
+        if 'parsed_args' in self.environ:
+            self.args = self.environ['parsed_args']
         self.base_url = self.environ.get('trac.base_url')
         if not self.base_url:
             self.base_url = self._reconstruct_url()
 …
                     else:
                         args[name] = value
+        self.environ['parsed_args'] = args
         return args
     def _parse_cookies(self):

follow-up: 8 comment:4 by Christian Boos, 18 years ago

Does the parsed_args name correspond to some sort of standard? If not, we should probably name this WSGI environment variable trac.parsed_args or something. Also, adding a comment before the if about the use cases for the variable would be helpful.

follow-up: 6 comment:5 by Noah Kantrowitz, 18 years ago

cmlenz and I agreed that this should be worked around in TracForge, not Trac. I believe I have a viable mechanism to do it, just need to test it.

in reply to: 5 ; follow-up: 7 comment:6 by Dave Abrahams <dave@…>, 18 years ago

Replying to nkantrowitz:

cmlenz and I agreed that this should be worked around in TracForge, not Trac.

The problem AFAICT is that the current design of Trac makes it nearly impossible to do that in a plugin. The only reasons I can see for not changing Trac in this case are:

It's not only possible, but actually easy to do in a plugin
This sort of thing is considered evil and Trac does not want to support plugins doing it.

I believe I have a viable mechanism to do it, just need to test it.

Well then, I'm eager to see the details, because we tried lots of approaches to doing it in TracForge, none of which ultimately worked. And the solution ought to be prominently documented (unless of course Trac doesn't want to support plugins doing this) because of its non-obviousness.

in reply to: 6 comment:7 by Noah Kantrowitz, 18 years ago

Replying to Dave Abrahams <dave@boost-consulting.com>:

Replying to nkantrowitz:

cmlenz and I agreed that this should be worked around in TracForge, not Trac.

The problem AFAICT is that the current design of Trac makes it nearly impossible to do that in a plugin. The only reasons I can see for not changing Trac in this case are:

It's not only possible, but actually easy to do in a plugin
This sort of thing is considered evil and Trac does not want to support plugins doing it.

I believe I have a viable mechanism to do it, just need to test it.

Well then, I'm eager to see the details, because we tried lots of approaches to doing it in TracForge, none of which ultimately worked. And the solution ought to be prominently documented (unless of course Trac doesn't want to support plugins doing this) because of its non-obviousness.

Its not so much not wanting to support it as that this kind of WSGI re-serving is clearly a special case. The workaround I devised is to do the secondary WSGI request in a pre-request filter. req.args is evaluated lazily, and as long as nothing touches it you should be fine.

in reply to: 4 comment:8 by Dave Abrahams <dave@…>, 18 years ago

Replying to cboos:

Does the parsed_args name correspond to some sort of standard?

Not that we know of :-)

If not, we should probably name this WSGI environment variable trac.parsed_args or something.

Fine with us.

Also, adding a comment before the if about the use cases for the variable would be helpful.

If nkantrowitz' idea doesn't pan out, we'll be happy to submit a new patch, but right now it sounds like you guys want to use his approach, whatever that is.

comment:9 by Dave Abrahams <dave@…>, 18 years ago

Resolution:	→ fixed
Status:	new → closed

CodeRanger says this is now handled properly in TracForge. It was broken "because the sub-request was being processed much later in the cycle."

comment:10 by Emmanuel Blot, 18 years ago

Resolution:	fixed
Status:	closed → reopened

comment:11 by Emmanuel Blot, 18 years ago

Resolution:	→ worksforme
Status:	reopened → closed

comment:12 by Noah Kantrowitz, 18 years ago

For the record, nothing was changed specifically for this. Just running the sub-request in the match_request was enough.

Modify Ticket

Change Properties

Summary:
Description:	See, for example, the [trachacks:browser:tracforgeplugin/branches/bewst/0.11-clearsilver/tracforge/admin/dispatch.py#L57 TracForge project list]. Some information is lost [browser:trunk/trac/web/api.py:r5646#L447 as req.args is constructed], and all of that information needs to be reassembled exactly as-is for, e.g. file upload (which uses a multipart-formdata request), even though the plugin may not even be interested in touching it (as in the case of !TracForge). Python gives us no facility for undoing the parsing of `cgi.FieldStorage`, so anyone attempting this sort of hook will have to repeat an extremely complicated series of steps; it would be better to capture the request in a string that can be reused. The following simple patch remedies the problem: {{{ #!diff --- api.py 2007-06-14 20:52:51.627746101 -0400 +++ /tmp/api.py 2007-07-12 09:41:56.888371624 -0400 @@ -434,6 +434,11 @@ args = _RequestArgs() fp = self.environ['wsgi.input'] + # We need to be able to read the data more than once. + from StringIO import StringIO + data = fp.read() + fp = StringIO(data) + self.environ['wsgi.input'] = StringIO(data) # Avoid letting cgi.FieldStorage consume the input stream when the # request does not contain form data }}} You may use WikiFormatting here.
Type:		Priority:
Milestone:		Component:
Version:		Severity:
Keywords:		Cc:	Set your email in Preferences
Branch:
Release Notes:
API Changes:
Internal Changes:

Action

leave as closed The owner will remain Jonas Borgström.

reopen The resolution will be deleted. Next status will be 'reopened'.

change ownership to The owner will be changed from Jonas Borgström to the specified user.

Add Comment

Your email or username:

E-mail address and name can be saved in the Preferences .

You may use WikiFormatting here.

Attachments ↑ Description ↑

Note: See TracTickets for help on using tickets.

Download in other formats: