Edgewall Software
Modify

Opened 2 years ago

Closed 21 months ago

#12545 closed enhancement (fixed)

trac is slow when rendering the changeset page of git with many files

Reported by: walty <walty8@…> Owned by: Jun Omae
Priority: normal Milestone: 1.0.14
Component: version control/changeset view Version:
Severity: normal Keywords: git performance
Cc: walty8@…
Release Notes:

Improve performance of GitRepository.get_changes() when changes between two revisions have many files.

API Changes:

Description

If a git changeset contains a lot of files, the rendering will be very slow. Currently the system would try to do two API calls for each file in the changeset, one for the old revision and one for the new revision. For example, if the changeset contains 100 files, then 200 git API calls will be triggered.

The trigger point should be the following lines of tracopt/versioncontrol/git/git_fs.py:

                    if change != Changeset.ADD:
                        old_node = self.get_node(path, old_rev, old_historian)
                    if change != Changeset.DELETE:
                        new_node = self.get_node(path, new_rev, new_historian)

Attachments (1)

t12545-gitrepos-get-changes.diff (4.0 KB ) - added by Jun Omae 2 years ago.

Download all attachments as: .zip

Change History (15)

comment:1 Changed 2 years ago by Jun Omae

Component: generalversion control/changeset view
Keywords: git added

Changed 2 years ago by Jun Omae

comment:2 Changed 2 years ago by Jun Omae

I posted t12545-gitrepos-get-changes.diff. In the patch, git ls-tree -r REV would be called only once for each revision instead of calling git ls-tree REV -- PATH for each revision and path.

After the patch, changeset view between two revisions with git (e.g. diff:jomae.git@8a59602e:3fd4da50) would work 2 times as fast as the previous (2059 ms ⇒ 1147 ms).

However, changeset view about a revision (e.g. [3fd4da50/jomae.git]) is still slow.

comment:3 in reply to:  2 Changed 2 years ago by walty <walty8@…>

Replying to Jun Omae:

I posted t12545-gitrepos-get-changes.diff. In the patch, git ls-tree -r REV would be called only once for each revision instead of calling git ls-tree REV -- PATH for each revision and path.

I tried to apply the patch, and the improvement is significant.

For the corresponding comparison page in the Trac regarding to this github diff page, the total time elapsed for class ChangesetModuledef render_html() is greatly reduced in my low-profile PC:

  • before: 21.565263 seconds
  • after: 2.853279 seconds

comment:4 Changed 2 years ago by walty <walty8@…>

Cc: walty8@… added

comment:5 Changed 2 years ago by Jun Omae

Milestone: next-stable-1.0.x
Owner: set to Jun Omae
Status: newassigned

I'll add unit tests and try to improve changeset view with single revision.

comment:6 Changed 2 years ago by Jun Omae

Updated jomae.git@t12545 that improves changeset view with single revision (5119 ms → 1162 ms for [3fd4da50/jomae.git]).

comment:7 Changed 2 years ago by figaro

Keywords: performance added

comment:8 Changed 2 years ago by Ryan J Ollos

Milestone: next-stable-1.0.xnext-stable-1.2.x

Moved ticket assigned to next-stable-1.0.x since maintenance of 1.0.x is coming to a close. Please move the ticket back if it's critical to fix on 1.0.x.

comment:9 Changed 21 months ago by Jun Omae

Milestone: next-stable-1.2.x1.0.14

I'd like to apply [3c7ad7cf2/jomae.git] to 1.0-stable. I've encountered the same issue on production environment with Trac 1.0.x.

comment:10 Changed 21 months ago by Christian Boos

As we're talking performance here, do we really need a get_ls_tree function? Looks like it's only called on old_rev and new_rev. So instead of a tree_map, we could have old_tree and new_tree (plus a check if new_rev == old_rev: new_tree = old_tree).

comment:11 Changed 21 months ago by Jun Omae

GitNode() needs mode, sha1, filesize and filename. git diff-tree returns a list of (src-mode, dst-mode, src-sha1, dst-sha1, status, src-path, dst-path). However, the filesize cannot be retrieved from git diff-tree. Any trees cannot be solved.

Therefore, call of git ls-tree is needed.

comment:12 Changed 21 months ago by Christian Boos

Sorry, I wasn't clear, I meant that instead of having a get_ls_tree function here, we could inline it, e.g.

  • tracopt/versioncontrol/git/git_fs.py

    diff --git a/tracopt/versioncontrol/git/git_fs.py b/tracopt/versioncontrol/git/git_fs.py
    index c9f49f3f4..51bda0651 100644
    a b class GitRepository(Repository):  
    573573        if old_path != new_path:
    574574            raise TracError(_("Not supported in git_fs"))
    575575
    576         ls_tree_map = {}
    577576        old_rev = self.normalize_rev(old_rev)
    578577        new_rev = self.normalize_rev(new_rev)
    579         target_path = old_path.strip('/')
     578        if old_rev == new_rev:
     579            return []
    580580
    581         def get_ls_tree(rev, path):
    582             if rev not in ls_tree_map:
    583                 ls_tree = self.git.ls_tree(rev, target_path, recursive=True)
    584                 ls_tree_map[rev] = dict((info[4], info) for info in ls_tree)
    585             return ls_tree_map[rev].get(path, False)
     581        target_path = old_path.strip('/')
     582        old_tree = dict((info[4], info) for info in
     583                        self.git.ls_tree(old_rev, target_path, recursive=True))
     584        new_tree = dict((info[4], info) for info in
     585                        self.git.ls_tree(new_rev, target_path, recursive=True))
    586586
    587587        with self.git.get_historian(old_rev, target_path) as old_historian:
    588588            with self.git.get_historian(new_rev, target_path) as new_historian:
    class GitRepository(Repository):  
    600600
    601601                    if change != Changeset.ADD:
    602602                        old_node = self._get_node(path, old_rev,
    603                                                   get_ls_tree(old_rev, path),
     603                                                  old_tree.get(path, False),
    604604                                                  old_historian)
    605605                    if change != Changeset.DELETE:
    606606                        new_node = self._get_node(path, new_rev,
    607                                                   get_ls_tree(new_rev, path),
     607                                                  new_tree.get(path, False),
    608608                                                  new_historian)
    609609
    610610                    yield old_node, new_node, kind, change

I didn't test it though, I got an unrelated issue with that branch:

  File "D:\Trac\repos\1.2-stable\trac\notification\mail.py", line 44, in <module>
ImportError: cannot import name get_session_attribute

comment:13 Changed 21 months ago by Jun Omae

Ah, okay. That's simple. I'll push with your suggestion. Thanks.

ImportError: cannot import name get_session_attribute

Hmm. The get_session_attribute is existing in trac.web.session since r15670 at branches/1.2-stable/trac/web/session.py@15670#/get_session_attribute. Please retry after make clean.

comment:14 Changed 21 months ago by Jun Omae

Release Notes: modified (diff)
Resolution: fixed
Status: assignedclosed

Committed in [15701] and merged in [15702,15704].

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Jun Omae.
The resolution will be deleted.
to The owner will be changed from Jun Omae to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.