
Opened 7 years ago

Last modified 7 years ago

#12694 closed defect

Browser page with large git repository is pretty slow since git 2.9 — at Initial Version

Reported by: Jun Omae
Owned by:
Priority: normal
Milestone: 1.0.14
Component: plugin/git
Version: 1.0.11
Severity: normal
Keywords: performance
Cc:
Branch:
Release Notes:
API Changes:
Internal Changes:

Description

After upgrading to git 2.11.1, the browser page for a large git repository became much slower in our production environment.

The browser page calls GitNode.get_entries() for the git repository. get_entries() internally runs a git log ... command at tags/trac-1.0.13/tracopt/versioncontrol/git/PyGIT.py@:914-915#L906. Since git 2.9, that command is very slow because git log detects renames in each commit by default (diff.renames now defaults to true).

$ time /tmp/git/2.8.4/bin/git --git-dir /path/to/git/host/reponame log --pretty=format:%n%H --name-status master -- . >/dev/null

real    0m1.596s
user    0m0.613s
sys     0m0.103s

$ time /tmp/git/2.9.3/bin/git --git-dir /path/to/git/host/reponame log --pretty=format:%n%H --name-status master -- . >/dev/null
warning: inexact rename detection was skipped due to too many files.
warning: you may want to set your diff.renameLimit variable to at least 3652 and retry the command.

real    0m42.107s
user    0m17.918s
sys     0m4.408s

Workaround: add the --no-renames option to the git log ... command.

$ time /tmp/git/2.8.4/bin/git --git-dir /path/to/git/host/reponame log --pretty=format:%n%H --no-renames --name-status master -- . >/dev/null

real    0m0.288s
user    0m0.203s
sys     0m0.020s

$ time /tmp/git/2.9.3/bin/git --git-dir /path/to/git/host/reponame log --pretty=format:%n%H --no-renames --name-status master -- . >/dev/null

real    0m0.527s
user    0m0.300s
sys     0m0.044s
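
The comparison above can also be scripted. The following is a minimal sketch, not Trac code: the helper names build_log_args and time_git_log are hypothetical, and it assumes a git binary on PATH (or passed as git_bin) and a readable repository.

```python
import os
import subprocess
import time


def build_log_args(git_dir, rev, path=".", no_renames=True, git_bin="git"):
    """Build the argument list for the `git log` call measured in this ticket."""
    args = [git_bin, "--git-dir", git_dir, "log", "--pretty=format:%n%H"]
    if no_renames:
        # Turn rename detection off explicitly, whatever the git default is.
        args.append("--no-renames")
    args += ["--name-status", rev, "--", path]
    return args


def time_git_log(git_dir, rev, **kwargs):
    """Run the command, discard its stdout, and return the elapsed seconds."""
    args = build_log_args(git_dir, rev, **kwargs)
    start = time.time()
    with open(os.devnull, "wb") as devnull:
        subprocess.check_call(args, stdout=devnull)
    return time.time() - start
```

Calling time_git_log('/path/to/git/host/reponame', 'master', no_renames=False) and then the same call with no_renames=True should reproduce the two measurements for whichever git binary is selected.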

Large git repository:

$ du -sh /path/to/git/host/reponame
3.3G    /path/to/git/host/reponame
$ git --git-dir /path/to/git/host/reponame rev-list --all | wc -l
857

Test script:

from time import time

from trac.env import Environment
from trac.versioncontrol.api import RepositoryManager
from tracopt.versioncontrol.git.git_fs import GitConnector

env = Environment('/path/to/trac/host/env')
# Version string of the git binary that Trac is configured to use
version = GitConnector(env)._version['v_str']
rm = RepositoryManager(env)
repos = rm.get_repository('reponame')
node = repos.get_node('/')

# Time a full listing of the repository root, as the browser page does
t = time()
try:
    list(node.get_entries())
finally:
    print('%s - %.3fs' % (version, time() - t))

Patch (unit tests pass with git 1.5.6.5 through 2.11.1):

  • trac/tracopt/versioncontrol/git/PyGIT.py

    diff --git a/trac/tracopt/versioncontrol/git/PyGIT.py b/trac/tracopt/versioncontrol/git/PyGIT.py
    index fc61319e..c08fbb8a 100644
    --- a/trac/tracopt/versioncontrol/git/PyGIT.py
    +++ b/trac/tracopt/versioncontrol/git/PyGIT.py
    @@ -902,7 +902,7 @@ class Storage(object):
             base_path = self._fs_from_unicode(base_path)
     
             def name_status_gen():
    -            p[:] = [self.repo.log_pipe('--pretty=format:%n%H',
    +            p[:] = [self.repo.log_pipe('--pretty=format:%n%H', '--no-renames',
                                            '--name-status', sha, '--', base_path)]
                 f = p[0].stdout
                 for l in f:
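
To illustrate what the patched command emits, here is a rough sketch of a parser for `--pretty=format:%n%H --name-status` output. It is not the parsing code in PyGIT.py; the function name parse_name_status and the sample data are made up for illustration. With rename detection off, each status line carries a single path, and a rename shows up as a delete plus an add.

```python
import re

SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def parse_name_status(lines):
    """Yield (sha, [(status, path), ...]) for each commit in the log output."""
    sha, entries = None, []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # blank separator lines produced by %n
        if SHA_RE.match(line):
            # A new commit starts; emit the previous one first.
            if sha is not None:
                yield sha, entries
            sha, entries = line, []
        elif "\t" in line and sha is not None:
            status, path = line.split("\t", 1)
            entries.append((status, path))
    if sha is not None:
        yield sha, entries


# Hypothetical sample output, not taken from a real repository.
sample = [
    "\n",
    "a" * 40 + "\n",
    "M\ttrac/env.py\n",
    "A\tREADME\n",
    "\n",
    "b" * 40 + "\n",
    "D\told.txt\n",
]
commits = list(parse_name_status(sample))
```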

Timing of GitNode.get_entries():

Version   w/o patch   w/ patch
1.7.4.1    0.353s     0.317s
2.3.10     0.707s     0.325s
2.4.11     0.318s     0.317s
2.5.5      0.308s     0.303s
2.7.4      0.317s     0.305s
2.8.4      0.323s     0.304s
2.9.3     46.758s     0.306s
2.10.2    47.422s     0.370s
2.11.1    49.967s     0.310s
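
As a quick check of the numbers, the speedup per git version is the unpatched time divided by the patched time. The sketch below copies a few rows from the table; the variable and function names are illustrative only.

```python
# Timings in seconds, copied from the table above: (without patch, with patch).
timings = {
    "2.8.4": (0.323, 0.304),
    "2.9.3": (46.758, 0.306),
    "2.10.2": (47.422, 0.370),
    "2.11.1": (49.967, 0.310),
}


def speedup(before, after):
    """Return how many times faster the patched call is."""
    return before / after


speedups = {v: speedup(b, a) for v, (b, a) in timings.items()}

# Print in version order (numeric sort, so 2.10.2 comes after 2.9.3).
for version in sorted(timings, key=lambda v: [int(x) for x in v.split(".")]):
    print("%s: %.1fx" % (version, speedups[version]))
```

For git 2.8.4 the patch makes almost no difference, while for 2.9.3 and later the patched call is well over 100 times faster.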
