Edgewall Software

Opened 12 months ago

Closed 11 months ago

Last modified 11 months ago

#13112 closed defect (fixed)

repository sync with large Subversion repository has memory leaks

Reported by: Jun Omae
Owned by: Jun Omae
Priority: normal
Milestone: 1.0.18
Component: version control
Version: 1.0.15
Severity: normal
Keywords: svn memory
Cc:
Branch:
Release Notes:

Reduce memory usage of the repository sync command with Subversion repositories.

API Changes:

Description (last modified by Jun Omae)

I found that the repository sync command consumes a huge amount of memory with a large Subversion repository in a production environment.

  • Subversion 1.9.7
  • 14 GB
  • 48,665 revision records, 394,921 node_change records
  • Memory usage: 9 GB+

While investigating tracopt.versioncontrol.svn, I noticed that repos.RevisionChangeCollector has a memory leak. According to the comments in the collector's source, it is deprecated and ChangeCollector should be used instead. Using repos.ChangeCollector reduces the memory usage, but other memory leaks still exist.

  • tracopt/versioncontrol/svn/svn_fs.py

    diff --git a/tracopt/versioncontrol/svn/svn_fs.py b/tracopt/versioncontrol/svn/svn_fs.py
    index ff82045fb..13cbc1021 100644
    --- a/tracopt/versioncontrol/svn/svn_fs.py
    +++ b/tracopt/versioncontrol/svn/svn_fs.py
    @@ -1017,7 +1017,11 @@ class SubversionChangeset(Changeset):
             pool = Pool(self.pool)
             tmp = Pool(pool)
             root = fs.revision_root(self.fs_ptr, self.rev, pool())
    -        editor = repos.RevisionChangeCollector(self.fs_ptr, self.rev, pool())
    +        try:
    +            editor = repos.ChangeCollector(self.fs_ptr, root, pool())
    +        except AttributeError:
    +            editor = repos.RevisionChangeCollector(self.fs_ptr, self.rev,
    +                                                   pool())
             e_ptr, e_baton = delta.make_editor(editor, pool())
             repos.svn_repos_replay(root, e_ptr, e_baton, pool())
     
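
The patch is a feature-detection fallback: use ChangeCollector when the installed bindings provide it, otherwise keep the deprecated RevisionChangeCollector. A minimal sketch of that pattern follows. It uses hypothetical stand-in objects in place of the real svn.repos module, and make_collector, new_bindings and old_bindings are illustrative names only, not part of Trac or the Subversion bindings.

```python
from types import SimpleNamespace


def make_collector(repos, fs_ptr, root, rev, pool):
    """Prefer ChangeCollector; fall back to the deprecated collector."""
    try:
        # Newer bindings: the collector takes the revision root.
        return repos.ChangeCollector(fs_ptr, root, pool)
    except AttributeError:
        # Older bindings: only the deprecated class exists; it takes the rev.
        return repos.RevisionChangeCollector(fs_ptr, rev, pool)


# Hypothetical stand-ins for bindings with and without ChangeCollector.
new_bindings = SimpleNamespace(
    ChangeCollector=lambda fs_ptr, root, pool: ("ChangeCollector", root),
    RevisionChangeCollector=lambda fs_ptr, rev, pool: ("RevisionChangeCollector", rev),
)
old_bindings = SimpleNamespace(
    RevisionChangeCollector=lambda fs_ptr, rev, pool: ("RevisionChangeCollector", rev),
)

picked_new = make_collector(new_bindings, None, "root@42", 42, None)
picked_old = make_collector(old_bindings, None, "root@42", 42, None)
print(picked_new[0], picked_old[0])
```

One trade-off of this shape: the except clause also catches an AttributeError raised inside the constructor itself, so a getattr lookup on the module would be a narrower check if that distinction matters.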

RSS of repository sync on Trac 1.0-stable with a mirror of svn.edgewall.org/repos/trac:

  Rev  RSS before patch (bytes)  RSS after patch (bytes)  after/before
    1                62,726,144               62,713,856        99.98%
 1000               101,126,144               84,533,248        83.59%
 2000               128,217,088               95,047,680        74.13%
 3000               160,608,256              110,559,232        68.84%
 4000               191,225,856              124,428,288        65.07%
 5000               218,836,992              134,868,992        61.63%
 6000               247,250,944              146,202,624        59.13%
 7000               277,291,008              159,305,728        57.45%
 8000               305,885,184              170,926,080        55.88%
 9000               332,189,696              180,367,360        54.30%
10000               356,847,616              188,334,080        52.78%
11000               382,246,912              196,726,784        51.47%
12000               409,014,272              206,376,960        50.46%
13000               434,835,456              215,494,656        49.56%
14000               459,898,880              223,096,832        48.51%
15000               485,351,424              231,702,528        47.74%
16000               512,176,128              241,451,008        47.14%
16827               534,540,288              249,696,256        46.71%
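
The after/before column and the average growth per synced revision can be recomputed from the first and last rows. The sketch below assumes the RSS values are in bytes, which is consistent with the MiB figures in the memory profiles. The helper names are hypothetical.

```python
# (revision, RSS before patch, RSS after patch) from the first and last rows.
FIRST = (1, 62_726_144, 62_713_856)
LAST = (16827, 534_540_288, 249_696_256)


def ratio_percent(before, after):
    """RSS after the patch as a percentage of RSS before the patch."""
    return round(after / before * 100, 2)


def growth_per_revision(first, last, column):
    """Average RSS growth in bytes per revision between two rows."""
    revisions = last[0] - first[0]
    return (last[column] - first[column]) / revisions


final_ratio = ratio_percent(LAST[1], LAST[2])
before_rate = growth_per_revision(FIRST, LAST, 1)
after_rate = growth_per_revision(FIRST, LAST, 2)

print(f"after/before at r{LAST[0]}: {final_ratio}%")
print(f"growth before patch: {before_rate:,.0f} bytes/revision")
print(f"growth after patch:  {after_rate:,.0f} bytes/revision")
```

By this calculation RSS still grows by roughly 11,000 bytes per revision after the patch, compared with roughly 28,000 before, which matches the observation that other leaks remain.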

Memory profile (before the patch)

   191    509.6 MiB      1.4 MiB                       cset = self.repos.get_changeset(next_youngest)
   192    509.6 MiB      0.0 MiB                       try:
   193                                                     # steps 1. and 2.
   194    509.6 MiB    384.9 MiB                           self._insert_changeset(db, next_youngest, cset)
   195                                                 except Exception, e: # *another* 1.1. resync attempt won
   196                                                     if isinstance(e, self.env.db_exc.IntegrityError):
   197                                                         self.log.warning("Revision %s in '%s' already "
   198                                                                          "cached: %r", next_youngest,
   199                                                                          _norm_reponame(self), e)

Memory profile (after the patch)

   191    238.3 MiB      3.4 MiB                       cset = self.repos.get_changeset(next_youngest)
   192    238.3 MiB      0.0 MiB                       try:
   193                                                     # steps 1. and 2.
   194    238.3 MiB    121.6 MiB                           self._insert_changeset(db, next_youngest, cset)
   195                                                 except Exception, e: # *another* 1.1. resync attempt won
   196                                                     if isinstance(e, self.env.db_exc.IntegrityError):
   197                                                         self.log.warning("Revision %s in '%s' already "
   198                                                                          "cached: %r", next_youngest,
   199                                                                          _norm_reponame(self), e)

Attachments (2)

memory.png (31.2 KB ) - added by Jun Omae 12 months ago.
20181215-0940-memory.png (26.4 KB ) - added by Jun Omae 11 months ago.


Change History (6)

by Jun Omae, 12 months ago

Attachment: memory.png added

comment:1 by Jun Omae, 12 months ago

Description: modified (diff)

comment:2 by Jun Omae, 11 months ago

Owner: set to Jun Omae
Status: new → assigned

I'm going to push the changes.

by Jun Omae, 11 months ago

Attachment: 20181215-0940-memory.png added

comment:3 by Jun Omae, 11 months ago

Memory usage in the production environment is reduced after the patch (2018-12-15 09:07:08 - 2018-12-15 09:18:52).

comment:4 by Jun Omae, 11 months ago

Keywords: svn added; svn19 removed
Release Notes: modified (diff)
Resolution: fixed
Status: assigned → closed

Committed in [16833] and merged in [16834-16835].

Last edited 11 months ago by Jun Omae (previous) (diff)
