Edgewall Software
Modify

Opened 9 years ago

Closed 9 years ago

#11905 closed defect (fixed)

Recursion depth in git_fs.py traverse - "trac-admin repository sync" fails

Reported by: anonymous Owned by: Jun Omae
Priority: normal Milestone: 1.0.3
Component: plugin/git Version: 1.0.2
Severity: normal Keywords:
Cc: Branch:
Release Notes:

Avoid "maximum recursion depth exceeded" error while executing repository sync command for cached git repository with many merge commits.

API Changes:
Internal Changes:

Description (last modified by Jun Omae)

trac-admin repository sync myrepo fails for me:

  File "/usr/lib/python2.7/dist-packages/tracopt/versioncontrol/git/git_fs.py", line 122, in traverse
    revs[idx:idx] = traverse(parent, seen)
  File "/usr/lib/python2.7/dist-packages/tracopt/versioncontrol/git/git_fs.py", line 122, in traverse
    revs[idx:idx] = traverse(parent, seen)
  File "/usr/lib/python2.7/dist-packages/tracopt/versioncontrol/git/git_fs.py", line 122, in traverse
    revs[idx:idx] = traverse(parent, seen)
  File "/usr/lib/python2.7/dist-packages/tracopt/versioncontrol/git/git_fs.py", line 110, in traverse
    if is_synced(rev):
  File "/usr/lib/python2.7/dist-packages/tracopt/versioncontrol/git/git_fs.py", line 99, in is_synced
    """, (self.id, rev)):
  File "/usr/lib/python2.7/dist-packages/trac/db/api.py", line 122, in execute
    with self as db:
  File "/usr/lib/python2.7/dist-packages/trac/db/api.py", line 175, in __enter__
    db = self.dbmgr.get_connection(readonly=True)
  File "/usr/lib/python2.7/dist-packages/trac/db/api.py", line 288, in get_connection
    db = self._cnx_pool.get_cnx(self.timeout or None)
  File "/usr/lib/python2.7/dist-packages/trac/db/pool.py", line 213, in get_cnx
    return _backend.get_cnx(self._connector, self._kwargs, timeout)
  File "/usr/lib/python2.7/dist-packages/trac/db/pool.py", line 91, in get_cnx
    self._active[(tid, key)] = (cnx, num)
  File "/usr/lib/python2.7/threading.py", line 289, in __exit__
    return self.__lock.__exit__(*args)
  File "/usr/lib/python2.7/threading.py", line 216, in __exit__
    self.release()
  File "/usr/lib/python2.7/threading.py", line 210, in release
    self._note("%s.release(): final release", self)
RuntimeError: maximum recursion depth exceeded

If you have a very complicated repository (with lots of merges, e.g. because of migration from other version control systems - probably something like 1000 merges is the limit of trac right now?), the "traverse" method in git_fs.py can fail with a recursion depth exception. The last line of traverse gets invoked when a revision has more than one parent.

May I suggest to use a list based approach that always uses a loop, not falls back to recursion when there is more than one parent?

Some logic (e.g. inserting at revs[idx:idx]) does not appear to be necessary, not even for topology ordering. It only changes the order of the parents backwards, something that can be easily achieved otherwise.

Two modifications in below code (still recursive, though!):

  1. filter seen parents out early
  2. collect into a single array of revisions, assuming that the order of parent_revs does not matter anyway.

With below patch, synchronisation seems to work for me - it does not fan out badly, but there are a lot of merges in my codebase. By filtering the seen revisions, the non-recursive codepath seems to be used more often.

  • tracopt/versioncontrol/git/git_fs.py

    old new  
    103103        def traverse(rev, seen, revs=None):
    104104            if revs is None:
    105105                revs = []
    106106            while True:
    107107                if rev in seen:
    108108                    return revs
    109109                seen.add(rev)
    110110                if is_synced(rev):
    111111                    return revs
    112112                revs.append(rev)
    113                 parent_revs = repos.parent_revs(rev)
     113                parent_revs = filter(lambda x: not x in seen, repos.parent_revs(rev))
    114114                if not parent_revs:
    115115                    return revs
    116116                if len(parent_revs) == 1:
    117117                    rev = parent_revs[0]
    118118                    continue
    119                 idx = len(revs)
    120119                traverse(parent_revs.pop(), seen, revs)
    121120                for parent in parent_revs:
    122                     revs[idx:idx] = traverse(parent, seen)
     121                    traverse(parent, seen, revs)

Attachments (0)

Change History (5)

comment:1 by Jun Omae, 9 years ago

Component: version controlplugin/git
Description: modified (diff)
Milestone: 1.0.3
Owner: set to Jun Omae
Status: newassigned
Version: 1.0.2

Thanks for the reporting! I'll reproduce it and confirm suggested fix.

comment:2 by Jun Omae, 9 years ago

I confirmed suggested fix for git repository with 2000 merge commits. The issue still exists. I just created patch in [55f4f3dd/jomae.git] (jomae.git@t11905).

Could you please try the patch on your environment?

comment:3 by Jun Omae, 9 years ago

Added unit tests for this issue in [61b3244ff/jomae.git]. I'll push proposed changes tonight.

comment:4 by Ryan J Ollos, 9 years ago

I'm running late in creating the release but plan to do that tomorrow, so pushing proposed changes tonight would be good timing.

comment:5 by Jun Omae, 9 years ago

Release Notes: modified (diff)
Resolution: fixed
Status: assignedclosed

Yeah. I just pushed in [13633] and merged to trunk in [13634].

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Jun Omae.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Jun Omae to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.