
Opened 12 years ago

Closed 11 years ago

#10978 closed defect (fixed)

SVN `get_changes` performance issue

Reported by: michal.hankiewicz
Owned by: Jun Omae
Priority: normal
Milestone: 0.12.6
Component: version control
Version: 0.12-stable
Severity: normal
Keywords: svn memory performance
Cc:
Branch:
Release Notes:

Reduce memory usage on SubversionRepository.get_changes()

API Changes:
Internal Changes:

Description (last modified by Ryan J Ollos <ryan.j.ollos@…>)

The changes made in [11244] to svn_fs.py are a performance killer.

Our plugin calls get_changes, and when dealing with bigger changesets Trac starts to consume a huge amount of memory and get_changes takes ages to complete.

According to the log message, the change was made to fix failing tests, not because something was broken. We think that reverting this change and fixing the tests is a better idea than killing performance.

Attachments (0)

Change History (5)

comment:1 by Remy Blank, 12 years ago

Well, there was another effect that wasn't mentioned in that other ticket: the change in SVN 1.7.x messed up the sorting of the file list in the changeset view. The fix restores a consistent order.

But we could also move the sorting to the UI code instead of keeping it in get_changes(). Would you like to try and implement that?

comment:2 by Ryan J Ollos <ryan.j.ollos@…>, 12 years ago

API Changes: modified (diff)
Description: modified (diff)
Summary: Performance issue → SVN `get_changes` performance issue

comment:3 by Christian Boos, 12 years ago

Keywords: svn memory performance added
Milestone: unscheduled

comment:4 by Jun Omae, 11 years ago

Milestone: unscheduled → 0.12.6
Owner: set to Jun Omae
Status: new → assigned

Proposed changes can be found in log:jomae.git@t10978.

The huge amount of memory is consumed by SubversionRepository.get_node() and SubversionNode. The proposed changes therefore sort the deltas before creating any SubversionNode instances.
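
The idea of sorting lightweight records before building expensive objects can be sketched as follows. This is a minimal illustration, not the code in the branch: `HeavyNode`, `changes_sorting_nodes` and `changes_sorting_deltas` are hypothetical names, and the `(path, change)` tuples only stand in for whatever the real deltas contain.

```python
class HeavyNode(object):
    """Hypothetical stand-in for an expensive object such as SubversionNode."""

    def __init__(self, path):
        self.path = path
        # Simulate a noticeable per-node memory cost.
        self.payload = bytearray(10000)


def changes_sorting_nodes(deltas):
    """Build every node first, then sort: all nodes are alive at once."""
    items = [(HeavyNode(path), change) for path, change in deltas]
    items.sort(key=lambda item: item[0].path)
    for item in items:
        yield item


def changes_sorting_deltas(deltas):
    """Sort the cheap (path, change) tuples, then build one node at a time."""
    for path, change in sorted(deltas, key=lambda delta: delta[0]):
        # The caller can drop each node before the next one is created.
        yield HeavyNode(path), change


if __name__ == '__main__':
    deltas = [('trunk/b.py', 'edit'), ('trunk/a.py', 'add'), ('tags/1.0', 'copy')]
    for node, change in changes_sorting_deltas(deltas):
        print('%s %s' % (node.path, change))
```

Both generators yield entries in the same path order; the second one only keeps the small tuples in memory while sorting, so peak usage no longer grows with the number of nodes.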

Unit and functional tests pass on:

  • Python 2.4 - 2.7 with Subversion 1.6.17
  • Python 2.4 - 2.6 with Subversion 1.7.6

Results

Version             Memory usage
jomae.git@t10978    39.5 MiB
Trac 0.12.5         150.3 MiB
Trac 0.12.3         36.4 MiB

Test

  1. Set up a mirror of the Subversion repository at trac-hacks.org.
  2. Install pypi:memory_profiler and pypi:psutil.
  3. Call SubversionRepository.get_changes('/', '10272', '/', '10273') for the repository using the following script (revision 10273 has 6755 node_change records):
    $ PYTHONPATH=. python t10978.py /path/to/tracenv trac-hacks.org / 10272 / 10273
    

t10978.py:

from memory_profiler import profile

from trac.env import Environment

@profile
def main(env_path, reponame, old_path, old_rev, new_path, new_rev):
    env = Environment(env_path)
    repos = env.get_repository(reponame)
    count = 0
    for entry in repos.get_changes(old_path, old_rev, new_path, new_rev):
        count += 1

if __name__ == '__main__':
    import sys
    main(*sys.argv[1:])

@t10978 with Subversion 1.7.6

Line #    Mem usage    Increment   Line Contents
================================================
     8     11.4 MiB      0.0 MiB   @profile
     9                             def main(env_path, reponame, old_path, old_rev, new_path, new_rev):
    10     18.5 MiB      7.2 MiB       env = Environment(env_path)
    11     27.2 MiB      8.7 MiB       repos = env.get_repository(reponame)
    12     27.2 MiB      0.0 MiB       count = 0
    13     39.5 MiB     12.3 MiB       for entry in repos.get_changes(old_path, old_rev, new_path, new_rev):
    14     39.5 MiB      0.0 MiB           count += 1

0.12.5 with Subversion 1.7.6

Line #    Mem usage    Increment   Line Contents
================================================
     8     11.4 MiB      0.0 MiB   @profile
     9                             def main(env_path, reponame, old_path, old_rev, new_path, new_rev):
    10     18.5 MiB      7.1 MiB       env = Environment(env_path)
    11     26.0 MiB      7.5 MiB       repos = env.get_repository(reponame)
    12     26.0 MiB      0.0 MiB       count = 0
    13    150.3 MiB    124.3 MiB       for entry in repos.get_changes(old_path, old_rev, new_path, new_rev):
    14    150.3 MiB      0.0 MiB           count += 1

0.12.3 with Subversion 1.7.6

Line #    Mem usage    Increment   Line Contents
================================================
     8     11.1 MiB      0.0 MiB   @profile
     9                             def main(env_path, reponame, old_path, old_rev, new_path, new_rev):
    10     17.4 MiB      6.3 MiB       env = Environment(env_path)
    11     26.0 MiB      8.6 MiB       repos = env.get_repository(reponame)
    12     26.0 MiB      0.0 MiB       count = 0
    13     36.4 MiB     10.4 MiB       for entry in repos.get_changes(old_path, old_rev, new_path, new_rev):
    14     36.4 MiB      0.0 MiB           count += 1
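
To isolate the cost of get_changes() from environment and repository setup, the increments reported on the loop line of each profile can be compared directly. The following is a small sketch that uses only the numbers shown in the three profiles; the dictionary and the `ratio_to` helper are illustrative names.

```python
# Increments (MiB) on the get_changes() loop line, copied from the profiles.
increments = {
    'jomae.git@t10978': 12.3,
    'Trac 0.12.5': 124.3,
    'Trac 0.12.3': 10.4,
}


def ratio_to(baseline, label):
    """Return how many times larger the increment is than the baseline's."""
    return increments[label] / increments[baseline]


if __name__ == '__main__':
    for label in sorted(increments):
        print('%-18s %6.1f MiB  %.1fx of 0.12.3'
              % (label, increments[label], ratio_to('Trac 0.12.3', label)))
```

By this measure the 0.12.5 loop allocates roughly twelve times what 0.12.3 does, while the branch comes back to within about 20% of the 0.12.3 figure.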
Last edited 11 years ago by Jun Omae (previous) (diff)

comment:5 by Jun Omae, 11 years ago

Release Notes: modified (diff)
Resolution: fixed
Status: assigned → closed

Committed in [12574], merged to 1.0-stable in [12575] and to trunk in [12577].
