Edgewall Software

Opened 5 years ago

Last modified 5 years ago

#13112 closed defect

repository sync with large Subversion repository has memory leaks — at Initial Version

Reported by: Jun Omae
Owned by:
Priority: normal
Milestone: 1.0.18
Component: version control
Version: 1.0.15
Severity: normal
Keywords: svn memory
Cc:
Branch:
Release Notes:
API Changes:
Internal Changes:

Description

In a production environment, the repository sync command consumes a huge amount of memory when run against a large Subversion repository:

  • Subversion 1.9.7
  • 14 GB
  • 48665 revisions
  • Memory usage: 9 GB+

While investigating tracopt.versioncontrol.svn, I noticed that repos.RevisionChangeCollector leaks memory. According to the comments in the collector, it is deprecated and ChangeCollector should be used instead. Switching to repos.ChangeCollector reduces the memory usage, but other memory leaks still exist.

  • tracopt/versioncontrol/svn/svn_fs.py

    diff --git a/tracopt/versioncontrol/svn/svn_fs.py b/tracopt/versioncontrol/svn/svn_fs.py
    index ff82045fb..13cbc1021 100644
    --- a/tracopt/versioncontrol/svn/svn_fs.py
    +++ b/tracopt/versioncontrol/svn/svn_fs.py
    @@ -1017,7 +1017,11 @@ class SubversionChangeset(Changeset):
             pool = Pool(self.pool)
             tmp = Pool(pool)
             root = fs.revision_root(self.fs_ptr, self.rev, pool())
    -        editor = repos.RevisionChangeCollector(self.fs_ptr, self.rev, pool())
    +        try:
    +            editor = repos.ChangeCollector(self.fs_ptr, root, pool())
    +        except AttributeError:
    +            editor = repos.RevisionChangeCollector(self.fs_ptr, self.rev,
    +                                                   pool())
             e_ptr, e_baton = delta.make_editor(editor, pool())
             repos.svn_repos_replay(root, e_ptr, e_baton, pool())
     
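
The fallback in the patch can be tried in isolation. The following is only a sketch: `make_collector` is a hypothetical helper that is not part of Trac, and the two `SimpleNamespace` objects are stand-ins for the `svn.repos` module with and without `ChangeCollector`, so that the snippet runs without the Subversion bindings installed.

```python
from types import SimpleNamespace


def make_collector(repos, fs_ptr, rev, root, pool):
    """Prefer ChangeCollector, fall back to RevisionChangeCollector.

    Hypothetical helper mirroring the try/except in the patch; `repos`
    is whatever object plays the role of the svn.repos module.
    """
    try:
        return repos.ChangeCollector(fs_ptr, root, pool)
    except AttributeError:
        # Bindings without ChangeCollector: use the deprecated class.
        return repos.RevisionChangeCollector(fs_ptr, rev, pool)


# Stand-ins for the bindings (assumptions, not the real svn.repos API).
new_bindings = SimpleNamespace(
    ChangeCollector=lambda fs_ptr, root, pool: ("ChangeCollector", root),
    RevisionChangeCollector=lambda fs_ptr, rev, pool:
        ("RevisionChangeCollector", rev),
)
old_bindings = SimpleNamespace(
    RevisionChangeCollector=lambda fs_ptr, rev, pool:
        ("RevisionChangeCollector", rev),
)

with_new = make_collector(new_bindings, None, 5, "root", None)
with_old = make_collector(old_bindings, None, 5, "root", None)
```

One caveat of this pattern: `except AttributeError` also catches an `AttributeError` raised inside the `ChangeCollector` constructor itself, in which case the code silently falls back to the deprecated collector.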

RSS of repository sync on Trac 1.0-stable with a mirror of svn.edgewall.org/repos/trac:

Rev      RSS before patch (bytes)   RSS after patch (bytes)   after/before
1                      62,726,144                62,713,856         99.98%
1000                  101,126,144                84,533,248         83.59%
2000                  128,217,088                95,047,680         74.13%
3000                  160,608,256               110,559,232         68.84%
4000                  191,225,856               124,428,288         65.07%
5000                  218,836,992               134,868,992         61.63%
6000                  247,250,944               146,202,624         59.13%
7000                  277,291,008               159,305,728         57.45%
8000                  305,885,184               170,926,080         55.88%
9000                  332,189,696               180,367,360         54.30%
10000                 356,847,616               188,334,080         52.78%
11000                 382,246,912               196,726,784         51.47%
12000                 409,014,272               206,376,960         50.46%
13000                 434,835,456               215,494,656         49.56%
14000                 459,898,880               223,096,832         48.51%
15000                 485,351,424               231,702,528         47.74%
16000                 512,176,128               241,451,008         47.14%
16827                 534,540,288               249,696,256         46.71%
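
As a quick check of the table, the after/before column and the average growth per revision can be recomputed from the first and last rows. This is only a sketch; the helper names `ratio_percent` and `bytes_per_revision` are made up for illustration.

```python
# (revision, RSS before patch, RSS after patch), values in bytes,
# copied from the first and last rows of the table.
FIRST = (1, 62726144, 62713856)
LAST = (16827, 534540288, 249696256)


def ratio_percent(row):
    """after/before as a percentage, rounded like the table."""
    _, before, after = row
    return round(100.0 * after / before, 2)


def bytes_per_revision(first, last, column):
    """Average RSS growth per synced revision between two rows."""
    return (last[column] - first[column]) / (last[0] - first[0])


first_ratio = ratio_percent(FIRST)
last_ratio = ratio_percent(LAST)
growth_before = bytes_per_revision(FIRST, LAST, 1)
growth_after = bytes_per_revision(FIRST, LAST, 2)
```

With these rows the average growth comes to roughly 28 KB per revision before the patch and roughly 11 KB per revision after it. RSS still grows with every revision after the patch, which matches the remark that other leaks remain.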

Memory profile (before the patch); columns are line number, memory usage, increment and source line

   191    509.6 MiB      1.4 MiB                       cset = self.repos.get_changeset(next_youngest)
   192    509.6 MiB      0.0 MiB                       try:
   193                                                     # steps 1. and 2.
   194    509.6 MiB    384.9 MiB                           self._insert_changeset(db, next_youngest, cset)
   195                                                 except Exception, e: # *another* 1.1. resync attempt won
   196                                                     if isinstance(e, self.env.db_exc.IntegrityError):
   197                                                         self.log.warning("Revision %s in '%s' already "
   198                                                                          "cached: %r", next_youngest,
   199                                                                          _norm_reponame(self), e)

Memory profile (after the patch); columns are line number, memory usage, increment and source line

   191    238.3 MiB      3.4 MiB                       cset = self.repos.get_changeset(next_youngest)
   192    238.3 MiB      0.0 MiB                       try:
   193                                                     # steps 1. and 2.
   194    238.3 MiB    121.6 MiB                           self._insert_changeset(db, next_youngest, cset)
   195                                                 except Exception, e: # *another* 1.1. resync attempt won
   196                                                     if isinstance(e, self.env.db_exc.IntegrityError):
   197                                                         self.log.warning("Revision %s in '%s' already "
   198                                                                          "cached: %r", next_youngest,
   199                                                                          _norm_reponame(self), e)
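
The ticket does not say how the RSS figures were sampled. One possible way to log memory while a sync-like loop runs is sketched below, assuming a Unix platform and Python 3. `sync_with_rss_log` and its `process` callback are hypothetical and merely stand in for the per-revision work; note that `ru_maxrss` is the peak RSS so far, not the current RSS.

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of this process, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports kilobytes, macOS reports bytes.
    return peak if sys.platform == "darwin" else peak * 1024


def sync_with_rss_log(revisions, process, every=1000, log=print):
    """Run process(rev) for each revision and log RSS periodically.

    Logs at revision 1 and then at every multiple of `every`.
    """
    for rev in revisions:
        process(rev)
        if rev == 1 or rev % every == 0:
            log("%d %d" % (rev, peak_rss_bytes()))


# Usage with a no-op in place of the real per-revision sync step.
lines = []
sync_with_rss_log(range(1, 2001), lambda rev: None, log=lines.append)
```

Because the peak never decreases, this measurement cannot show memory being returned to the system; for a process whose usage grows steadily, as in the table, peak and current RSS should be close.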

Change History (1)

by Jun Omae, 5 years ago

Attachment: memory.png added