#10048 closed defect (fixed)
mod_wsgi and PATH_INFO mangling
| Reported by: | Christian Boos | Owned by: | Christian Boos | 
|---|---|---|---|
| Priority: | normal | Milestone: | 0.12.3 | 
| Component: | web frontend | Version: | 0.13dev | 
| Severity: | normal | Keywords: | mod_wsgi apache22 | 
| Cc: | Branch: | ||
| Release Notes: | |||
| API Changes: | |||
| Internal Changes: | |||
Description
I've upgraded one of my servers from mod_python (3.3.1) to mod_wsgi (3.3), and this broke the browsing of the default repository.
The breakage is due to mod_wsgi squashing repeated slashes in the PATH_INFO, and we have precisely this situation when browsing the default repository (see the URL for e.g. source:trunk).
This squashing doesn't play nicely with a number of our assumptions.
The first one is when building the SCRIPT_NAME from the SCRIPT_URL (r4718:4719). If this breaks, the consequences are severe as all the generated links end up broken but fortunately this seems easy to fix (patch follows).
The other problem is the autoexpand feature (#7074), where we need a similar fix.
An alternative approach would be to ensure we have no repeated "/" in our generated links, which would be a bit of a shame, as it seems to be perfectly valid to have a semantic associated with empty segments (web-sig, rfc:3875#section-4.1.5). The argument that Apache 2.x does this normalization is surprising, as this was not an issue with mod_python.
Attachments (2)
Change History (15)
by , 15 years ago
| Attachment: | t10048-fixup-mod_wsgi-collapse-slashes-r10603.patch added | 
|---|
comment:1 by , 15 years ago
| Summary: | mod_wsgi issue PATH_INFO mangling → mod_wsgi and PATH_INFO mangling | 
|---|
The patch was developed and tested on trunk, but also applies on 0.12-stable cleanly.
comment:2 by , 15 years ago
Replying to self:
The argument that Apache 2.x does this normalization is surprising, as this was not an issue with mod_python.
Ok, http://code.google.com/p/modwsgi/source/detail?r=601 explains it better: 1.3 and 2.0 have the same behavior (I still use 2.0.59), the squashing happened in 2.2. So mod_python on Apache 2.2 (if that even works) will also have the problems described in this ticket.
comment:3 by , 15 years ago
| Keywords: | apache22 added | 
|---|
I've committed the patch in r10609.
Maybe we should nevertheless make sure that we won't generate links with empty segments, in Href, to prevent further glitches to happen:
- 
      
trac/web/href.py
diff -r 8746a86b3432 trac/web/href.py
a b class Href(object): 162 162 163 163 # build the path 164 164 path = '/'.join(unicode_quote(unicode(arg).strip('/'), self.path_safe) 165 for arg in args if arg is not None)165 for arg in args if arg) 166 166 if path: 167 167 href += '/' + path 168 168 elif not href:  
?
follow-up: 12 comment:4 by , 14 years ago
Don't we use an empty segment in diff links? Ah, no, only in the trac link, not in the final URL:
I get 4 test failures with the patch in comment:3. Two of them test the possibility to have empty segments, so those could be dropped. The other two are due to an empty trailing segment, leading to a trailing /. This could still be useful to have. So I think the change is a bit too invasive, as the current behavior could be relied upon by plugins.
comment:5 by , 14 years ago
Ok, so something like:
- 
      
trac/web/href.py
diff -r 8746a86b3432 trac/web/href.py
a b class Href(object): 162 162 163 163 # build the path 164 164 path = '/'.join(unicode_quote(unicode(arg).strip('/'), self.path_safe) 165 165 for arg in args if arg is not None) 166 166 if path: 167 href += '/' + path 167 if '//' in path: 168 path = re.sub('/+', '/', path) 169 href += '/' + path.lstrip('/') 168 170 elif not href:  
(+ probably caching that regexp … and verifying what's after the elif, but you get the idea ;-) )
by , 14 years ago
| Attachment: | 10048-no-empty-segments-r10850.patch added | 
|---|
Remove empty path segements from generated URLs.
comment:6 by , 14 years ago
10048-no-empty-segments-r10850.patch is the complete patch, and seems to work fine. So the only question is whether this will break any plugins that rely on the old behavior. I guess the answer is that these plugins are broken anyway by mod_wsgi.
Ok to apply?
follow-up: 9 comment:8 by , 14 years ago
From a class that should really not now anything about other internals, this seems to be very much the correct behavior according to what the standards actually support:
>>> from trac.web.href import Href >>> h = Href('a') >>> h(None, 'b') 'a/b' >>> h('', 'b') 'a//b'
I support just leaving this code as-is.
comment:9 by , 14 years ago
Replying to osimons:
From a class that should really not now anything about other internals, this seems to be very much the correct behavior according to what the standards actually support:
You'll be able to generate a link which is standards compliant but which is prone to introduce subtle issues as that link might get normalized and come back to you in a different form (a/b). By enforcing this limitation in Href, we drastically reduce the likeliness of such problems.
That would avoid having to deal with this issue when we compare a link as seen in the browser and a generated link (see [10609/branches/0.12-stable/trac/htdocs/js/expand_dir.js]).
So it's not about messing with other internals, it's about generating links that are guaranteed to come back to us in the same form (almost guaranteed - I think we still have a problem with unicode links, see ticket:10121#comment:6, to be verified).
comment:10 by , 14 years ago
| Resolution: | → fixed | 
|---|---|
| Status: | new → closed | 
The final word on this seems to be Remy's proposal to make the restriction effective for 0.13 only (googlegroups:trac-dev:65acf0c667f3b681).
I keep that ticket for 0.12.3 though, as the initial issue (mod_wsgi (3.3), breaking the browsing of the default repository) will be fixed for this release.



  
web: mod_wsgi collapses multiple "/" in PATH_INFO