Edgewall Software
Modify

Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#10840 closed defect (fixed)

Zip source download fails for large directories

Reported by: Mikael Relbe Owned by: Christian Boos
Priority: high Milestone: 1.0.1
Component: version control/browser Version: 1.0
Severity: major Keywords:
Cc: Branch:
Release Notes:

Fix zip source download for large directories in Subversion repositories.

API Changes:
Internal Changes:

Description

Zip source download fails for large directories, e.g. zip download tags/trac-1.0 yields the following error:

SubversionException: ("Can't open file '/var/svn/trac/db/revprops/10/10019': Too many open files", 24)

How to Reproduce

While doing a GET operation on /changeset/11317/tags/trac-1.0, Trac issued an internal error.

Request parameters:

{'format': u'zip',
 'new': u'11317',
 'new_path': u'/tags/trac-1.0',
 'old_path': u'/'}

User agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0.1

System Information

Trac at t.e.o.

Enabled Plugins

Trac at t.e.o.

Python Traceback

Traceback (most recent call last):
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/web/main.py", line 497, in _dispatch_request
    dispatcher.dispatch(req)
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/web/main.py", line 214, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/versioncontrol/web_ui/changeset.py", line 352, in process_request
    self._render_zip(req, filename, repos, data)
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/versioncontrol/web_ui/changeset.py", line 781, in _render_zip
    zipinfo.date_time = new_node.last_modified.utctimetuple()[:6]
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/versioncontrol/api.py", line 1073, in <lambda>
    last_modified = property(lambda self: self.get_last_modified())
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/tracopt/versioncontrol/svn/svn_fs.py", line 864, in get_last_modified
    core.SVN_PROP_REVISION_DATE, self.pool())
  File "/usr/lib/pymodules/python2.6/libsvn/fs.py", line 682, in svn_fs_revision_prop
    return _fs.svn_fs_revision_prop(*args)
SubversionException: ("Can't open file '/var/svn/trac/db/revprops/10/10019': Too many open files", 24)

Attachments (1)

t10840-get-changes.patch (2.6 KB ) - added by Christian Boos 12 years ago.
only call get_node() in the outer iteration

Download all attachments as: .zip

Change History (16)

comment:1 by Christian Boos, 12 years ago

Milestone: 1.0.1
Owner: set to Christian Boos
Priority: normalhigh
Severity: normalmajor

#10842 was closed as duplicate, and #10831 as well (somehow for that one I failed to realize it was a Zip download).

comment:2 by Christian Boos, 12 years ago

It's a problem we introduced in the .rc1 (r11237). The iterator was there for a reason after all…

by Christian Boos, 12 years ago

Attachment: t10840-get-changes.patch added

only call get_node() in the outer iteration

comment:3 by Christian Boos, 12 years ago

Fix seems to work fine (applied locally). Rémy, OK to commit?

comment:4 by Remy Blank, 12 years ago

If it fixes the issue, sure. But I still don't understand what's going on. Or whether the patch actually fixes the issue, or only papers over it. Can you please explain?

It looks like each SubversionNode object keeps a file open, but I can't find any hint in the implementation that would confirm that. And if it's the case, I would argue that it's a bad design of SubversionNode and we should rather fix that (e.g. opening the file lazily) rather than all users of get_node().

comment:5 by Remy Blank, 12 years ago

Or actually, fix both. It can't be good to have too many SubversionNode objects live at the same time, considering all that's done in the constructor.

comment:6 by Christian Boos, 12 years ago

Elsewhere we also try to minimize the lifetime of nodes, even going as far as building "fake" nodes (class entry in BrowserModule._render_dir()) as its costly to keep them around. The cost stems from the pool which is attached to each node. And in the present case, it looks like the closing of the file happens at the disposal of the node's pool as well, as in SubversionNode.get_content() the fs.file_contents uses that pool (though I have not looked in the svn code to verify).

For fixing the problem at the level of SubversionNode.get_content(), we should be able to use a subpool there. Or even get rid of the explicit pool management, as the bindings are now supposed to do that by themselves in a smart way.

in reply to:  6 ; comment:7 by Christian Boos, 12 years ago

Replying to cboos:

For fixing the problem at the level of SubversionNode.get_content(), we should be able to use a subpool there.

Right, this patch alone also fixes the issue:

  • tracopt/versioncontrol/svn/svn_fs.py

     
    753753        """Retrieve raw content as a "read()"able object."""
    754754        if self.isdir:
    755755            return None
     756        pool = Pool(self.pool)
    756757        s = core.Stream(fs.file_contents(self.root, self._scoped_path_utf8,
    757                                          self.pool()))
     758                                         pool()))
    758759        # The stream object needs to reference the pool to make sure the pool
    759760        # is not destroyed before the former.
    760         s._pool = self.pool
     761        s._pool = pool
    761762        return s
    762763
    763764    def get_entries(self):

Or even get rid of the explicit pool management, as the bindings are now supposed to do that by themselves in a smart way.

And this one as well:

  • tracopt/versioncontrol/svn/svn_fs.py

     
    753753        """Retrieve raw content as a "read()"able object."""
    754754        if self.isdir:
    755755            return None
    756         s = core.Stream(fs.file_contents(self.root, self._scoped_path_utf8,
    757                                          self.pool()))
    758         # The stream object needs to reference the pool to make sure the pool
    759         # is not destroyed before the former.
    760         s._pool = self.pool
    761         return s
     756        return core.Stream(fs.file_contents(self.root, self._scoped_path_utf8))
    762757
    763758    def get_entries(self):
    764759        """Yield `SubversionNode` corresponding to entries in this directory.

Let's experiment with one or the other here, in order to detect any issues under load (going with the second).

(and it looks like we have another regression, with the PatchRenderer which doesn't get used)

Last edited 12 years ago by Christian Boos (previous) (diff)

comment:8 by Remy Blank, 12 years ago

The patch issue looks familiar. The logs say:

Trac[patch] WARNING: Invalid unified diff content

comment:9 by Remy Blank, 12 years ago

Also, it works if I copy/paste the patch above. CR/LF issue?

  • tracopt/versioncontrol/svn/svn_fs.py

     
    753753        """Retrieve raw content as a "read()"able object."""
    754754        if self.isdir:
    755755            return None
    756         s = core.Stream(fs.file_contents(self.root, self._scoped_path_utf8,
    757                                          self.pool()))
    758         # The stream object needs to reference the pool to make sure the pool
    759         # is not destroyed before the former.
    760         s._pool = self.pool
    761         return s
     756        return core.Stream(fs.file_contents(self.root, self._scoped_path_utf8))
    762757
    763758    def get_entries(self):
    764759        """Yield `SubversionNode` corresponding to entries in this directory.

in reply to:  9 ; comment:10 by Christian Boos, 12 years ago

Replying to rblank:

Also, it works if I copy/paste the patch above. CR/LF issue?

No, #!diff vs. #!patch

in reply to:  10 ; comment:11 by Remy Blank, 12 years ago

Replying to cboos:

No, #!diff vs. #!patch

Oh, yes, I see. So #!diff is now handled by Pygments?

in reply to:  11 ; comment:12 by Christian Boos, 12 years ago

Replying to rblank:

Replying to cboos:

No, #!diff vs. #!patch

Oh, yes, I see. So #!diff is now handled by Pygments?

Yes, it seems it now takes precedence. When I disable the PygmentsRenderer, the PatchRenderer takes over. It must be a side-effect of #5533, as 'diff' is now mapped to something different than text/x-diff (the only one recognized by PatchRenderer).

Ah, funny:

MIME TypesWikiProcessors
text/x-diffpatch

MIME TypesWikiProcessors
text/x-patchdiff udiff

The second being added by PygmentsRenderer.get_extra_mimetypes().

in reply to:  12 comment:13 by Christian Boos, 12 years ago

Replying to cboos:

(the only one recognized by PatchRenderer).

Fixed in r11319.

in reply to:  7 comment:14 by Christian Boos, 12 years ago

Resolution: fixed
Status: newclosed

For fixing the problem at the level of SubversionNode.get_content(), we should be able to use a subpool there.

Right, this patch alone also fixes the issue:

That less disruptive change was applied to 1.0-stable, in r11344.

Or even get rid of the explicit pool management, as the bindings are now supposed to do that by themselves in a smart way.

And this one as well:

And that slightly more risky approach was applied to trunk, in r11345.

comment:15 by Christian Boos, 12 years ago

Release Notes: modified (diff)

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Christian Boos.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Christian Boos to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.