Edgewall Software
Modify

Opened 23 months ago

Closed 23 months ago

Last modified 23 months ago

#10840 closed defect (fixed)

Zip source download fails for large directories

Reported by: mrelbe Owned by: cboos
Priority: high Milestone: 1.0.1
Component: version control/browser Version: 1.0
Severity: major Keywords:
Cc:
Release Notes:

Fix zip source download for large directories in Subversion repositories.

API Changes:

Description

Zip source download fails for large directories, e.g. zip download tags/trac-1.0 yields the following error:

SubversionException: ("Can't open file '/var/svn/trac/db/revprops/10/10019': Too many open files", 24)

How to Reproduce

While doing a GET operation on /changeset/11317/tags/trac-1.0, Trac issued an internal error.

Request parameters:

{'format': u'zip',
 'new': u'11317',
 'new_path': u'/tags/trac-1.0',
 'old_path': u'/'}

User agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0.1

System Information

Trac at t.e.o.

Enabled Plugins

Trac at t.e.o.

Python Traceback

Traceback (most recent call last):
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/web/main.py", line 497, in _dispatch_request
    dispatcher.dispatch(req)
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/web/main.py", line 214, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/versioncontrol/web_ui/changeset.py", line 352, in process_request
    self._render_zip(req, filename, repos, data)
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/versioncontrol/web_ui/changeset.py", line 781, in _render_zip
    zipinfo.date_time = new_node.last_modified.utctimetuple()[:6]
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/versioncontrol/api.py", line 1073, in <lambda>
    last_modified = property(lambda self: self.get_last_modified())
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/tracopt/versioncontrol/svn/svn_fs.py", line 864, in get_last_modified
    core.SVN_PROP_REVISION_DATE, self.pool())
  File "/usr/lib/pymodules/python2.6/libsvn/fs.py", line 682, in svn_fs_revision_prop
    return _fs.svn_fs_revision_prop(*args)
SubversionException: ("Can't open file '/var/svn/trac/db/revprops/10/10019': Too many open files", 24)

Attachments (1)

t10840-get-changes.patch (2.6 KB) - added by cboos 23 months ago.
only call get_node() in the outer iteration

Download all attachments as: .zip

Change History (16)

comment:1 Changed 23 months ago by cboos

  • Milestone set to 1.0.1
  • Owner set to cboos
  • Priority changed from normal to high
  • Severity changed from normal to major

#10842 was closed as duplicate, and #10831 as well (somehow for that one I failed to realize it was a Zip download).

comment:2 Changed 23 months ago by cboos

It's a problem we introduced in the .rc1 (r11237). The iterator was there for a reason after all…

Changed 23 months ago by cboos

only call get_node() in the outer iteration

comment:3 Changed 23 months ago by cboos

Fix seems to work fine (applied locally). Rémy, OK to commit?

comment:4 Changed 23 months ago by rblank

If it fixes the issue, sure. But I still don't understand what's going on. Or whether the patch actually fixes the issue, or only papers over it. Can you please explain?

It looks like each SubversionNode object keeps a file open, but I can't find any hint in the implementation that would confirm that. And if it's the case, I would argue that it's a bad design of SubversionNode and we should rather fix that (e.g. opening the file lazily) rather than all users of get_node().

comment:5 Changed 23 months ago by rblank

Or actually, fix both. It can't be good to have too many SubversionNode objects live at the same time, considering all that's done in the constructor.

comment:6 follow-up: Changed 23 months ago by cboos

Elsewhere we also try to minimize the lifetime of nodes, even going as far as building "fake" nodes (class entry in BrowserModule._render_dir()) as its costly to keep them around. The cost stems from the pool which is attached to each node. And in the present case, it looks like the closing of the file happens at the disposal of the node's pool as well, as in SubversionNode.get_content() the fs.file_contents uses that pool (though I have not looked in the svn code to verify).

For fixing the problem at the level of SubversionNode.get_content(), we should be able to use a subpool there. Or even get rid of the explicit pool management, as the bindings are now supposed to do that by themselves in a smart way.

comment:7 in reply to: ↑ 6 ; follow-up: Changed 23 months ago by cboos

Replying to cboos:

For fixing the problem at the level of SubversionNode.get_content(), we should be able to use a subpool there.

Right, this patch alone also fixes the issue:

  • tracopt/versioncontrol/svn/svn_fs.py

     
    753753        """Retrieve raw content as a "read()"able object.""" 
    754754        if self.isdir: 
    755755            return None 
     756        pool = Pool(self.pool) 
    756757        s = core.Stream(fs.file_contents(self.root, self._scoped_path_utf8, 
    757                                          self.pool())) 
     758                                         pool())) 
    758759        # The stream object needs to reference the pool to make sure the pool 
    759760        # is not destroyed before the former. 
    760         s._pool = self.pool 
     761        s._pool = pool 
    761762        return s 
    762763 
    763764    def get_entries(self): 

Or even get rid of the explicit pool management, as the bindings are now supposed to do that by themselves in a smart way.

And this one as well:

  • tracopt/versioncontrol/svn/svn_fs.py

     
    753753        """Retrieve raw content as a "read()"able object.""" 
    754754        if self.isdir: 
    755755            return None 
    756         s = core.Stream(fs.file_contents(self.root, self._scoped_path_utf8, 
    757                                          self.pool())) 
    758         # The stream object needs to reference the pool to make sure the pool 
    759         # is not destroyed before the former. 
    760         s._pool = self.pool 
    761         return s 
     756        return core.Stream(fs.file_contents(self.root, self._scoped_path_utf8)) 
    762757 
    763758    def get_entries(self): 
    764759        """Yield `SubversionNode` corresponding to entries in this directory. 

Let's experiment with one or the other here, in order to detect any issues under load (going with the second).

(and it looks like we have another regression, with the PatchRenderer which doesn't get used)

Last edited 23 months ago by cboos (previous) (diff)

comment:8 Changed 23 months ago by rblank

The patch issue looks familiar. The logs say:

Trac[patch] WARNING: Invalid unified diff content

comment:9 follow-up: Changed 23 months ago by rblank

Also, it works if I copy/paste the patch above. CR/LF issue?

  • tracopt/versioncontrol/svn/svn_fs.py

     
    753753        """Retrieve raw content as a "read()"able object.""" 
    754754        if self.isdir: 
    755755            return None 
    756         s = core.Stream(fs.file_contents(self.root, self._scoped_path_utf8, 
    757                                          self.pool())) 
    758         # The stream object needs to reference the pool to make sure the pool 
    759         # is not destroyed before the former. 
    760         s._pool = self.pool 
    761         return s 
     756        return core.Stream(fs.file_contents(self.root, self._scoped_path_utf8)) 
    762757 
    763758    def get_entries(self): 
    764759        """Yield `SubversionNode` corresponding to entries in this directory. 

comment:10 in reply to: ↑ 9 ; follow-up: Changed 23 months ago by cboos

Replying to rblank:

Also, it works if I copy/paste the patch above. CR/LF issue?

No, #!diff vs. #!patch

comment:11 in reply to: ↑ 10 ; follow-up: Changed 23 months ago by rblank

Replying to cboos:

No, #!diff vs. #!patch

Oh, yes, I see. So #!diff is now handled by Pygments?

comment:12 in reply to: ↑ 11 ; follow-up: Changed 23 months ago by cboos

Replying to rblank:

Replying to cboos:

No, #!diff vs. #!patch

Oh, yes, I see. So #!diff is now handled by Pygments?

Yes, it seems it now takes precedence. When I disable the PygmentsRenderer, the PatchRenderer takes over. It must be a side-effect of #5533, as 'diff' is now mapped to something different than text/x-diff (the only one recognized by PatchRenderer).

Ah, funny:

MIME TypesWikiProcessors
text/x-diffpatch

MIME TypesWikiProcessors
text/x-patchdiff udiff

The second being added by PygmentsRenderer.get_extra_mimetypes().

comment:13 in reply to: ↑ 12 Changed 23 months ago by cboos

Replying to cboos:

(the only one recognized by PatchRenderer).

Fixed in r11319.

comment:14 in reply to: ↑ 7 Changed 23 months ago by cboos

  • Resolution set to fixed
  • Status changed from new to closed

For fixing the problem at the level of SubversionNode.get_content(), we should be able to use a subpool there.

Right, this patch alone also fixes the issue:

That less disruptive change was applied to 1.0-stable, in r11344.

Or even get rid of the explicit pool management, as the bindings are now supposed to do that by themselves in a smart way.

And this one as well:

And that slightly more risky approach was applied to trunk, in r11345.

comment:15 Changed 23 months ago by cboos

  • Release Notes modified (diff)

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed The owner will remain cboos.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from cboos to the specified user.
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.