I found that the repository sync command consumes a huge amount of memory when run against a large Subversion repository in a production environment.
- Subversion 1.9.7
- 14 GB
- 48,665 revision records, 394,921 node_change records
- Memory usage: 9 GB+
Investigating tracopt.versioncontrol.svn, I noticed that repos.RevisionChangeCollector has a memory leak. According to the comments on the collector, it is deprecated and ChangeCollector should be used instead. Using repos.ChangeCollector reduces the memory usage, but other memory leaks still exist.
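To see how resident memory grows as revisions are processed, RSS can be sampled at fixed intervals. The following is only a minimal sketch, assuming Linux (it reads `/proc/self/statm`); `sample_rss` and `process_revision` are hypothetical names for illustration and are not part of Trac.

```python
import os


def current_rss_bytes():
    """Return the resident set size of this process in bytes (Linux only)."""
    with open("/proc/self/statm") as f:
        # Second field of statm is the number of resident pages.
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def sample_rss(revisions, process_revision, every=1000):
    """Call process_revision(rev) for each rev, recording RSS periodically.

    process_revision is a hypothetical stand-in for the real per-revision
    sync step; a sample is taken at revision 1 and at every multiple of
    `every`.
    """
    samples = []
    for rev in revisions:
        process_revision(rev)
        if rev == 1 or rev % every == 0:
            samples.append((rev, current_rss_bytes()))
    return samples


# Demo: a step that deliberately retains memory on every call.
retained = []
for rev, rss in sample_rss(range(1, 3001), lambda rev: retained.append(bytearray(1000))):
    print(f"{rev:>6} | {rss:,}")
```

Sampling the current RSS rather than the peak keeps the numbers comparable between runs, since a one-off spike does not mask the steady growth that indicates a leak.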
| Rev | RSS (before patch) | RSS (after patch) | after/before |
|---:|---:|---:|---:|
| 1 | 62,726,144 | 62,713,856 | 99.98% |
| 1000 | 101,126,144 | 84,533,248 | 83.59% |
| 2000 | 128,217,088 | 95,047,680 | 74.13% |
| 3000 | 160,608,256 | 110,559,232 | 68.84% |
| 4000 | 191,225,856 | 124,428,288 | 65.07% |
| 5000 | 218,836,992 | 134,868,992 | 61.63% |
| 6000 | 247,250,944 | 146,202,624 | 59.13% |
| 7000 | 277,291,008 | 159,305,728 | 57.45% |
| 8000 | 305,885,184 | 170,926,080 | 55.88% |
| 9000 | 332,189,696 | 180,367,360 | 54.30% |
| 10000 | 356,847,616 | 188,334,080 | 52.78% |
| 11000 | 382,246,912 | 196,726,784 | 51.47% |
| 12000 | 409,014,272 | 206,376,960 | 50.46% |
| 13000 | 434,835,456 | 215,494,656 | 49.56% |
| 14000 | 459,898,880 | 223,096,832 | 48.51% |
| 15000 | 485,351,424 | 231,702,528 | 47.74% |
| 16000 | 512,176,128 | 241,451,008 | 47.14% |
| 16827 | 534,540,288 | 249,696,256 | 46.71% |
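The after/before column and the average growth per revision can be recomputed from the first and last rows of the table. This is a small sketch of that arithmetic; the helper names `pct` and `growth_per_rev` are made up for illustration, and the RSS values are assumed to be bytes, which is consistent with the MiB figures in the memory profiles.

```python
def pct(after, before):
    """Format after/before as a percentage with two decimals."""
    return f"{100.0 * after / before:.2f}%"


def growth_per_rev(first, last):
    """Average RSS growth per revision between two (rev, rss) samples."""
    (rev0, rss0), (rev1, rss1) = first, last
    return (rss1 - rss0) / (rev1 - rev0)


# First and last rows of the table: rev -> RSS.
before = {1: 62_726_144, 16827: 534_540_288}
after = {1: 62_713_856, 16827: 249_696_256}

print(pct(after[16827], before[16827]))
print(round(growth_per_rev((1, before[1]), (16827, before[16827]))))
print(round(growth_per_rev((1, after[1]), (16827, after[16827]))))
```

Over these 16,826 revisions the average growth is about 28,000 bytes per revision before the patch and about 11,000 after it. RSS still climbs steadily with the patch applied, which matches the observation that other leaks remain.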
Memory profile (before the patch); the columns are line number, memory usage, increment, and source line:

    191  509.6 MiB    1.4 MiB   cset = self.repos.get_changeset(next_youngest)
    192  509.6 MiB    0.0 MiB   try:
    193                             # steps 1. and 2.
    194  509.6 MiB  384.9 MiB       self._insert_changeset(db, next_youngest, cset)
    195                         except Exception, e: # *another* 1.1. resync attempt won
    196                             if isinstance(e, self.env.db_exc.IntegrityError):
    197                                 self.log.warning("Revision %s in '%s' already "
    198                                                  "cached: %r", next_youngest,
    199                                                  _norm_reponame(self), e)
Memory profile (after the patch); the columns are line number, memory usage, increment, and source line:

    191  238.3 MiB    3.4 MiB   cset = self.repos.get_changeset(next_youngest)
    192  238.3 MiB    0.0 MiB   try:
    193                             # steps 1. and 2.
    194  238.3 MiB  121.6 MiB       self._insert_changeset(db, next_youngest, cset)
    195                         except Exception, e: # *another* 1.1. resync attempt won
    196                             if isinstance(e, self.env.db_exc.IntegrityError):
    197                                 self.log.warning("Revision %s in '%s' already "
    198                                                  "cached: %r", next_youngest,
    199                                                  _norm_reponame(self), e)
I'm going to push the changes.