Edgewall Software
Modify

Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#10840 closed defect (fixed)

Zip source download fails for large directories

Reported by: mrelbe Owned by: cboos
Priority: high Milestone: 1.0.1
Component: version control/browser Version: 1.0
Severity: major Keywords:
Cc:
Release Notes:

Fix zip source download for large directories in Subversion repositories.

API Changes:

Description

Zip source download fails for large directories, e.g. zip download tags/trac-1.0 yields the following error:

SubversionException: ("Can't open file '/var/svn/trac/db/revprops/10/10019': Too many open files", 24)

How to Reproduce

While doing a GET operation on /changeset/11317/tags/trac-1.0, Trac issued an internal error.

Request parameters:

{'format': u'zip',
 'new': u'11317',
 'new_path': u'/tags/trac-1.0',
 'old_path': u'/'}

User agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0.1

System Information

Trac at t.e.o.

Enabled Plugins

Trac at t.e.o.

Python Traceback

Traceback (most recent call last):
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/web/main.py", line 497, in _dispatch_request
    dispatcher.dispatch(req)
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/web/main.py", line 214, in dispatch
    resp = chosen_handler.process_request(req)
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/versioncontrol/web_ui/changeset.py", line 352, in process_request
    self._render_zip(req, filename, repos, data)
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/versioncontrol/web_ui/changeset.py", line 781, in _render_zip
    zipinfo.date_time = new_node.last_modified.utctimetuple()[:6]
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/trac/versioncontrol/api.py", line 1073, in <lambda>
    last_modified = property(lambda self: self.get_last_modified())
  File "/usr/local/virtualenv/0.13-stable/lib/python2.6/site-packages/Trac-1.0-py2.6.egg/tracopt/versioncontrol/svn/svn_fs.py", line 864, in get_last_modified
    core.SVN_PROP_REVISION_DATE, self.pool())
  File "/usr/lib/pymodules/python2.6/libsvn/fs.py", line 682, in svn_fs_revision_prop
    return _fs.svn_fs_revision_prop(*args)
SubversionException: ("Can't open file '/var/svn/trac/db/revprops/10/10019': Too many open files", 24)

Attachments (1)

t10840-get-changes.patch (2.6 KB) - added by cboos 2 years ago.
only call get_node() in the outer iteration

Download all attachments as: .zip

Change History (16)

comment:1 Changed 2 years ago by cboos

  • Milestone set to 1.0.1
  • Owner set to cboos
  • Priority changed from normal to high
  • Severity changed from normal to major

#10842 was closed as duplicate, and #10831 as well (somehow for that one I failed to realize it was a Zip download).

comment:2 Changed 2 years ago by cboos

It's a problem we introduced in the .rc1 (r11237). The iterator was there for a reason after all…

Changed 2 years ago by cboos

only call get_node() in the outer iteration

comment:3 Changed 2 years ago by cboos

Fix seems to work fine (applied locally). Rémy, OK to commit?

comment:4 Changed 2 years ago by rblank

If it fixes the issue, sure. But I still don't understand what's going on. Or whether the patch actually fixes the issue, or only papers over it. Can you please explain?

It looks like each SubversionNode object keeps a file open, but I can't find any hint in the implementation that would confirm that. And if it's the case, I would argue that it's a bad design of SubversionNode and we should rather fix that (e.g. opening the file lazily) rather than all users of get_node().

comment:5 Changed 2 years ago by rblank

Or actually, fix both. It can't be good to have too many SubversionNode objects live at the same time, considering all that's done in the constructor.

comment:6 follow-up: Changed 2 years ago by cboos

Elsewhere we also try to minimize the lifetime of nodes, even going as far as building "fake" nodes (class entry in BrowserModule._render_dir()) as its costly to keep them around. The cost stems from the pool which is attached to each node. And in the present case, it looks like the closing of the file happens at the disposal of the node's pool as well, as in SubversionNode.get_content() the fs.file_contents uses that pool (though I have not looked in the svn code to verify).

For fixing the problem at the level of SubversionNode.get_content(), we should be able to use a subpool there. Or even get rid of the explicit pool management, as the bindings are now supposed to do that by themselves in a smart way.

comment:7 in reply to: ↑ 6 ; follow-up: Changed 2 years ago by cboos

Replying to cboos:

For fixing the problem at the level of SubversionNode.get_content(), we should be able to use a subpool there.

Right, this patch alone also fixes the issue:

  • tracopt/versioncontrol/svn/svn_fs.py

     
    753753        """Retrieve raw content as a "read()"able object."""
    754754        if self.isdir:
    755755            return None
     756        pool = Pool(self.pool)
    756757        s = core.Stream(fs.file_contents(self.root, self._scoped_path_utf8,
    757                                          self.pool()))
     758                                         pool()))
    758759        # The stream object needs to reference the pool to make sure the pool
    759760        # is not destroyed before the former.
    760         s._pool = self.pool
     761        s._pool = pool
    761762        return s
    762763
    763764    def get_entries(self):

Or even get rid of the explicit pool management, as the bindings are now supposed to do that by themselves in a smart way.

And this one as well:

  • tracopt/versioncontrol/svn/svn_fs.py

     
    753753        """Retrieve raw content as a "read()"able object."""
    754754        if self.isdir:
    755755            return None
    756         s = core.Stream(fs.file_contents(self.root, self._scoped_path_utf8,
    757                                          self.pool()))
    758         # The stream object needs to reference the pool to make sure the pool
    759         # is not destroyed before the former.
    760         s._pool = self.pool
    761         return s
     756        return core.Stream(fs.file_contents(self.root, self._scoped_path_utf8))
    762757
    763758    def get_entries(self):
    764759        """Yield `SubversionNode` corresponding to entries in this directory.

Let's experiment with one or the other here, in order to detect any issues under load (going with the second).

(and it looks like we have another regression, with the PatchRenderer which doesn't get used)

Last edited 2 years ago by cboos (previous) (diff)

comment:8 Changed 2 years ago by rblank

The patch issue looks familiar. The logs say:

Trac[patch] WARNING: Invalid unified diff content

comment:9 follow-up: Changed 2 years ago by rblank

Also, it works if I copy/paste the patch above. CR/LF issue?

  • tracopt/versioncontrol/svn/svn_fs.py

     
    753753        """Retrieve raw content as a "read()"able object."""
    754754        if self.isdir:
    755755            return None
    756         s = core.Stream(fs.file_contents(self.root, self._scoped_path_utf8,
    757                                          self.pool()))
    758         # The stream object needs to reference the pool to make sure the pool
    759         # is not destroyed before the former.
    760         s._pool = self.pool
    761         return s
     756        return core.Stream(fs.file_contents(self.root, self._scoped_path_utf8))
    762757
    763758    def get_entries(self):
    764759        """Yield `SubversionNode` corresponding to entries in this directory.

comment:10 in reply to: ↑ 9 ; follow-up: Changed 2 years ago by cboos

Replying to rblank:

Also, it works if I copy/paste the patch above. CR/LF issue?

No, #!diff vs. #!patch

comment:11 in reply to: ↑ 10 ; follow-up: Changed 2 years ago by rblank

Replying to cboos:

No, #!diff vs. #!patch

Oh, yes, I see. So #!diff is now handled by Pygments?

comment:12 in reply to: ↑ 11 ; follow-up: Changed 2 years ago by cboos

Replying to rblank:

Replying to cboos:

No, #!diff vs. #!patch

Oh, yes, I see. So #!diff is now handled by Pygments?

Yes, it seems it now takes precedence. When I disable the PygmentsRenderer, the PatchRenderer takes over. It must be a side-effect of #5533, as 'diff' is now mapped to something different than text/x-diff (the only one recognized by PatchRenderer).

Ah, funny:

MIME TypesWikiProcessors
text/x-diffpatch

MIME TypesWikiProcessors
text/x-patchdiff udiff

The second being added by PygmentsRenderer.get_extra_mimetypes().

comment:13 in reply to: ↑ 12 Changed 2 years ago by cboos

Replying to cboos:

(the only one recognized by PatchRenderer).

Fixed in r11319.

comment:14 in reply to: ↑ 7 Changed 2 years ago by cboos

  • Resolution set to fixed
  • Status changed from new to closed

For fixing the problem at the level of SubversionNode.get_content(), we should be able to use a subpool there.

Right, this patch alone also fixes the issue:

That less disruptive change was applied to 1.0-stable, in r11344.

Or even get rid of the explicit pool management, as the bindings are now supposed to do that by themselves in a smart way.

And this one as well:

And that slightly more risky approach was applied to trunk, in r11345.

comment:15 Changed 2 years ago by cboos

  • Release Notes modified (diff)

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain cboos.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from cboos to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.