Edgewall Software
Modify

Opened 16 years ago

Closed 10 years ago

Last modified 10 years ago

#8459 closed defect (fixed)

svn:mergeinfo rendering is far too slow

Reported by: Emmanuel Blot Owned by: Jun Omae
Priority: high Milestone: 0.12.6
Component: version control/browser Version: 0.12dev
Severity: major Keywords: svn svnmerge mergeinfo performance
Cc: jsceballos@…, fly1004@…, info@…, tim@…, rverchere@…, duncanphilipnorman@…, pweygand@…, mpotter@…, Jun Omae Branch:
Release Notes:

Reduce time of rendering svn:mergeinfo in browser view.

API Changes:

Added prefix_match and prefix_match_value methods to database backend classes (and abstract base class ConnectionBase that exists in release 1.1.2) for case-sensitive prefix-matching.

Internal Changes:

Description

This is a ticket follow up of #7715.

The new property rendered makes Trac browser deadly slow to render a directory that contains a lot of svn:mergeinfo values.

Typical slow down with [8352] is 40 times as slower as with the previous, dummy rendering on a Core 2 Duo 2.8Ghz machine.
It takes about 5 seconds to render our /trunk directory which only contains 7 entries, and about 450 mergeinfo sources.

Using 'defect' rather than 'enhancement', as it makes Trac browser hardly usable on such directories.

Attachments (2)

8459-optimize-eligible-r8352.patch (2.0 KB ) - added by Remy Blank 16 years ago.
Updated patch for "a faster way to compute eligible changesets" by cboos.
8459-optimize-eligible-2-r8352.patch (2.0 KB ) - added by Remy Blank 16 years ago.
Fixed a typo.

Download all attachments as: .zip

Change History (47)

comment:1 by Remy Blank, 16 years ago

Note that you can disable the property renderer on 0.11-stable if performance is bad (and I'll merge the change to trunk tonight), and you'll still get a more or less readable display of the properties.

I'll refresh the optimization patch from Christian in #7715 and attach it here.

in reply to:  1 ; comment:2 by Emmanuel Blot, 16 years ago

I wonder if there were a mean to ask the mergeinfo values directly from SVN rather than computing them in Python code.
It seems that the Python SWIG bindings do not offer such an interface though ;-(

by Remy Blank, 16 years ago

Updated patch for "a faster way to compute eligible changesets" by cboos.

comment:3 by Remy Blank, 16 years ago

Milestone: 0.11.6

The patch above is a refresh of attachment:7715-optimize-eligible-r8342.diff:ticket:7715 to current branches/0.11-stable@8352, and applies cleanly to trunk@8353 as well. I believe it also fixes an off-by-one error from the original patch.

in reply to:  3 ; comment:4 by Emmanuel Blot, 16 years ago

Replying to rblank:

The patch above is a refresh of attachment:7715-optimize-eligible-r8342.diff:ticket:7715 to current branches/0.11-stable@8352, and applies cleanly to trunk@8353 as well. I believe it also fixes an off-by-one error from the original patch.

I'm lost, again: now the rendering goes fast, but the rendering is totally different: neither "eligible" nor "merged" entries are shown, only the raw list of sn:mergeinfo properties, and the "toggle deleted branch" has been removed as well. Did I miss something?

in reply to:  4 comment:5 by Emmanuel Blot, 16 years ago

Replying to eblot:

Did I miss something?

When I remove the patch, everything goes back to normal (slow, but full mergeinfo rendering). Sorry I have no time to investigate this issue for now ;-(

in reply to:  4 ; comment:6 by Remy Blank, 16 years ago

Replying to eblot:

I'm lost, again: now the rendering goes fast, but the rendering is totally different: neither "eligible" nor "merged" entries are shown, only the raw list of sn:mergeinfo properties, and the "toggle deleted branch" has been removed as well. Did I miss something?

No, that means there's an exception during the rendering of the property, and the next rederer is used, which is the "plain" one. You should have a traceback in your log file, could you please post it here?

Also, would it be possible for you to attach the content of your svn:mergeinfo property (or to send it to me by e-mail if you prefer)? That should allow me to reproduce the issue.

in reply to:  6 comment:7 by Emmanuel Blot, 16 years ago

Replying to rblank:

Replying to eblot: Also, would it be possible for you to attach the content of your svn:mergeinfo property (or to send it to me by e-mail if you prefer)? That should allow me to reproduce the issue.

I've PMed you both files (trac.log.gz, svn-mergeinfo.gz). Let me know if you need more details.

comment:8 by Remy Blank, 16 years ago

Right, so the traceback was the following:

Traceback (most recent call last):
  File "/private/var/engine/trac-hg/trac/versioncontrol/web_ui/browser.py", line 560, in render_property
    rendered = renderer.render_property(name, mode, context, props)
  File "/private/var/engine/trac-hg/trac/versioncontrol/svn_prop.py", line 175, in render_property
    first_rev = node.get_branch_origin()
  File "/private/var/engine/trac-hg/trac/versioncontrol/svn_fs.py", line 792, in get_branch_origin
    return fs.revision_root_revision(root_path[0])
NameError: global name 'root_path' is not defined

I admit that I had only refreshed Christian's patch and not tested it more than superficially, so that part of the code hadn't been exercised at all. I'll attach an updated patch.

by Remy Blank, 16 years ago

Fixed a typo.

comment:9 by Javier Sanz <jsceballos@…>, 15 years ago

Cc: jsceballos@… added

comment:10 by fly1004@…, 15 years ago

Cc: fly1004@… added

comment:11 by Thijs Triemstra <info@…>, 15 years ago

Cc: info@… added

comment:12 by Christian Boos, 15 years ago

Cc: tim@… added
Priority: normalhigh

Another improvement would be to do range intersections and differences using directly the Ranges class, instead of creating potentially huge sets (think repositories with > 100k changesets).

I know Tim Hatch already implemented that once, but I managed to lose the patch he sent me at that time ;-)

comment:13 by anonymous, 15 years ago

WARNING: Rendering failed for property svnmerge-integrated with renderer SubversionMergePropertyRenderer:
  Traceback (most recent call last):
   File "/usr/lib/python2.4/site-packages/Trac-0.11.5-py2.4.egg/trac/versioncontrol/web_ui/browser.py", line 561, in render_property
     rendered = renderer.render_property(name, mode, context, props)
   File "/usr/lib/python2.4/site-packages/Trac-0.11.5-py2.4.egg/trac/versioncontrol/svn_prop.py", line 177, in render_property
     eligible -= set(Ranges(revs))
   File "/usr/lib/python2.4/site-packages/Trac-0.11.5-py2.4.egg/trac/util/__init__.py", line 406, in __init__
     self.appendrange(r)
   File "/usr/lib/python2.4/site-packages/Trac-0.11.5-py2.4.egg/trac/util/__init__.py", line 417, in appendrange
     a, b = int(x), int(x) ValueError: invalid literal for int(): 1-4505 /branches/bob:1-3041

svnmerge-integrated is:

/branches/alice:1-4505 /branches/bob:1-3041,3054 /branches/carroll:1-3751,3790-3884,3886-3990,3998 /branches/dave:1-1972,2067 ...

(this is obviously using svnmerge.py and subversion 1.4)

in reply to:  13 ; comment:14 by Christian Boos, 15 years ago

Keywords: svn svnmerge mergeinfo added

Replying to anonymous:

svnmerge-integrated is:

/branches/alice:1-4505 /branches/bob:1-3041,3054 /branches/carroll:1-3751,3790-3884,3886-3990,3998 /branches/dave:1-1972,2067 ...

(this is obviously using svnmerge.py and subversion 1.4)

So this is an issue with some(?) versions of svnmerge.py using space as delimiter between branches, instead of a newline. Wonder how they handle branches containing a space character…

Anyway, I guess this only applies to svnmerge- properties, not to svn:mergeinfo. Can you please try out the following patch?

  • trac/versioncontrol/svn_prop.py

     
    158158                if path not in branch_starts:
    159159                    branch_starts[path] = rev + 1
    160160        rows = []
    161         for line in props[name].splitlines():
     161        if name.startswith('svnmerge-'):
     162            sources = props[name].split()
     163        else:
     164            sources = props[name].splitlines()
     165        for line in sources:
    162166            path, revs = line.split(':', 1)
    163167            spath = _path_within_scope(repos.scope, path)
    164168            if spath is None:

in reply to:  14 ; comment:15 by anonymous, 15 years ago

Replying to cboos:

So this is an issue with some(?) versions of svnmerge.py using space as delimiter between branches, instead of a newline. Wonder how they handle branches containing a space character…

Anyway, I guess this only applies to svnmerge- properties, not to svn:mergeinfo. Can you please try out the following patch?

Correct - our svnmerge.py which we've been using for quite some time uses spaces as a separator. Your patch in previous comment 14 makes things work. Awesome!

According to a commit to svnmerge.py they deal with spaces in branch names by not allowing them, according to this check-in: http://svn.collab.net/viewvc/svn/trunk/contrib/client-side/svnmerge/svnmerge.py?view=log#rev29767

in reply to:  15 comment:16 by Christian Boos, 15 years ago

Replying to anonymous:

Your patch in previous comment 14 makes things work. Awesome!

Applied in r8368, thanks for testing!

According to a commit to svnmerge.py they deal with spaces in branch names by not allowing them, according to this check-in

I see… so this applies to all versions of svnmerge.py.

Btw, in the future, don't hesitate to create new tickets in such situations. If you create a duplicate by mistake, it's not a big deal ;-)

comment:17 by Christian Boos, 15 years ago

Owner: set to Christian Boos

comment:18 by Christian Boos, 15 years ago

In order to facilitate the testing, I've committed directly the changes:

So I'd like to get feedback about the performance of 0.11.6dev-r8403.

Next I'd like to not use sets but keep using Ranges (implement difference and intersection methods). That's probably not going to save much time but could save some memory for repositories with a long history.

in reply to:  18 comment:19 by Remy Blank, 15 years ago

Replying to cboos:

Next I'd like to not use sets but keep using Ranges (implement difference and intersection methods).

This is probably going to speed up cases where there are few disconnected ranges, and make the cases with lots of small ranges slower. What's the more frequent situation? The former, I guess.

That's probably not going to save much time but could save some memory for repositories with a long history.

If it's only for memory, I wouldn't event bother. A set with 100k integers is not going to consume more than 1-2 MB, and only temporarily. That sounds negligible. And creating such a set takes 23ms on my machine, so I would be surprised if it made much of a difference in speed, either.

in reply to:  2 ; comment:20 by rupert.thurner@…, 15 years ago

Replying to eblot:

I wonder if there were a mean to ask the mergeinfo values directly from SVN rather than computing them in Python code.
It seems that the Python SWIG bindings do not offer such an interface though ;-(

could you elaborate what you expect from svn here? the eligible merges, and how this would be defined, so we might even point subversion people to it, maybe they find it as well useful?

"modern" tools like git, mercurial do not expose "mergeinfo", and the only thing we miss in rare cases is a changeset which we do not want to merge (some equivalent to a svn "blocked revision"). for that reason i somehow like eblots proposal to display it only on request (i.e. click, or user preference) would be good for the performance and reflect best that one needs it only in rare cases.

in reply to:  20 comment:21 by Emmanuel Blot, 15 years ago

Replying to rupert.thurner@…:

Replying to eblot: could you elaborate what you expect from svn here? the eligible merges, and how this would be defined, so we might even point subversion people to it, maybe they find it as well useful?

An API to produce a result that would look like

svn mergeinfo --show-revs eligible
svn mergeinfo --show-revs merged

in reply to:  20 comment:22 by Christian Boos, 15 years ago

Replying to rupert.thurner@…:

"modern" tools like git, mercurial do not expose "mergeinfo"

When you merge a branch in those systems, you're effectively saying that you've merged all the changesets the branch contains. Whether those changes are actually in or not depend on the edits you've made before committing the merge (e.g. you could revert whole files or just the changeset(s) you don't want to actually merge by applying reverse diffs for example). But after that, those changesets are all "ancestors" and you'll never see them replayed in later merges. So it's quite like "svn:mergeinfo", except that you can not tweak the list of changesets for a merge, you always have to take all of them.

(I'm speaking primarily for Mercurial, though I believe this applies to git as well)

and the only thing we miss in rare cases is a changeset which we do not want to merge (some equivalent to a svn "blocked revision").

Precisely, there's no such thing, and some may even prefer the Subversion workflow for that.

for that reason i somehow like eblots proposal to display it only on request (i.e. click, or user preference) would be good for the performance and reflect best that one needs it only in rare cases.

I don't think it's a "rare" case, that information is certainly very useful and that's why we put so much effort in this lately… But I agree that you don't need it always. If this information comes for free for smaller repositories, like here on t.e.o I think it's better to have it shown directly, like it is now. But if it introduces a significant slow down (like for eblot's repository), then we'd better have an indirect way to get at it.

comment:23 by Christian Boos, 15 years ago

Status: newassigned

I did some performance testing on my own yesterday, and here are some numbers:

INFO: timing for get_branch_origin: 0.00124
INFO: timing for set(): 0.00075
INFO: timing for -= revs : 0.00056
INFO: timing for _get_blocked_revs: 0.00019
INFO: timing for _get_node_revs: 0.79794
INFO: timing for &=: 0.00075
INFO: timing for to_ranges: 0.00162
INFO: TIMING FOR branches/.......: 0.80423
...
INFO: timing for get_branch_origin: 0.00021
INFO: timing for set(): 0.00025
INFO: timing for -= revs : 0.00033
INFO: timing for _get_blocked_revs: 0.00018
INFO: timing for _get_node_revs: 0.84454
INFO: timing for &=: 0.00046
INFO: timing for to_ranges: 0.00085
INFO: TIMING FOR trunk: 0.8477

Which means a total of 1.65s. Before r8401, this was a bit more than 2s.

This means that there are some savings, but far from enough. And _get_node_revs is clearly the only place left to optimize. For 0.12, there's ticket:6654#comment:12, and this might be enough. For 0.11.6, I think we need to implement an indirect link.

comment:24 by Christian Boos, 15 years ago

Milestone: 0.11.6next-minor-0.12.x

For 0.11, what we have is good enough, if this is really too slow for some setups, then either:

  • upgrade to 0.12 once it's released,
  • or disable the trac.versioncontrol.svn_prop.SubversionMergePropertyRenderer component

in reply to:  24 ; comment:25 by kevin.winahradsky@…, 15 years ago

Replying to cboos:

For 0.11, what we have is good enough, if this is really too slow for some setups, then either:

  • upgrade to 0.12 once it's released,
  • or disable the trac.versioncontrol.svn_prop.SubversionMergePropertyRenderer component

I wanted to add that i also needed to disable trac.versioncontrol.svn_prop.SubversionMergePropertyDiffRenderer as rendering changesets for us was taking upwards of 30 minutes. The subversion browser rendering was also significantly long with pages taking 10-15 minutes. This was also causing many other errors like "database is locked" and "instance.dict not accessible in restricted mode".

After disabling these 2 items load times for the svn browser and changesets are back to less than a second for us.

Our svn repository has over 18000 revisions and close to 500 mergeinfo entries on the trunk. The box itself is unfortunately only a 2.4GHz celeron with 1G of ram. We started having the slowdown when we upgraded to ubuntu 9.10 and with it came trac 11.5 as we use the default packages for most things.

System Information
Trac: 0.11.5
Python: 2.6.4 (r264:75706, Dec 7 2009, 19:02:09) [GCC 4.4.1]
setuptools: 0.6c9
SQLite: 3.6.16
pysqlite: 2.4.1
Genshi: 0.5.1
mod_wsgi: 2.5
Pygments: 1.0
Subversion: 1.6.5 (r38866)
jQuery: 1.3.2

in reply to:  25 comment:26 by richard@…, 15 years ago

Replying to kevin.winahradsky@…:

Replying to cboos:

For 0.11, what we have is good enough, if this is really too slow for some setups, then either:

  • upgrade to 0.12 once it's released,
  • or disable the trac.versioncontrol.svn_prop.SubversionMergePropertyRenderer component

I wanted to add that i also needed to disable trac.versioncontrol.svn_prop.SubversionMergePropertyDiffRenderer as rendering changesets for us was taking upwards of 30 minutes. The subversion browser rendering was also significantly long with pages taking 10-15 minutes. This was also causing many other errors like "database is locked" and "instance.dict not accessible in restricted mode".

After disabling these 2 items load times for the svn browser and changesets are back to less than a second for us.

Just to let you know that we also experienced the same changeset rendering slowdown with 0.11.7.

I upgraded to 0.12b1 but found it just as slow.

Disabling SubversionMergePropertyRenderer and SubversionMergePropertyDiffRenderer plugins fixes the problem in 0.12b1 too.

This solution has been quite difficult to track down so I think you should add it to a "known issues" section on the Trac frontpage.

-RichardW.

comment:27 by Christian Boos, 15 years ago

Keywords: performance added
Milestone: next-minor-0.12.x0.12.1

It would be wonderful to actually use an age old technique for such things, the FAQ ;-) The TracFaq needs a bit of care.

We could also add a note in TracRepositoryAdmin and TracUpgrade (together with a word on optional components).

Moving to 0.12.1 to address the problem itself.

comment:28 by rverchere@…, 15 years ago

Cc: rverchere@… added

comment:29 by duncanphilipnorman@…, 14 years ago

Cc: duncanphilipnorman@… added

comment:30 by Christian Boos, 14 years ago

Sorry folks, I had no time to dig that issue further, new ideas and external contributions welcomed.

comment:31 by Christian Boos, 14 years ago

Milestone: 0.12.1next-minor-0.12.x

comment:32 by pweygand@…, 14 years ago

So We just migrated to svn 1.6 from 1.4 with trac 0.11.7. We experienced this same slow-down. It appears to be linear with respect to the amount of merges on the branch. I did a bit of basic profiling and each svn:mergeinfo entry adds aprox 0.5s for our trac installation. At about 2 mo. into our upgrade we started getting timeouts with 328 mergeinfo entries and a load time of 159 secs.

Until this issue is fixed, I would suggest disabling these options by default as this makes the trac browser unusable in a very short amount of time.

disabling the two options worked perfectly - our load times dropped back down to 1-3 sec.

[components]
trac.versioncontrol.svn_prop.subversionmergepropertydiffrenderer = disabled
trac.versioncontrol.svn_prop.subversionmergepropertyrenderer = disabled

If you would like ANY info to help you reproduce and debug this, feel free to contact me.

comment:33 by pweygand@…, 14 years ago

Cc: pweygand@… added

comment:34 by Christian Boos, 14 years ago

Note to self: usage of sets can be surprisingly slow… investigate use of dedicated algorithms working on ranges.

comment:35 by Christian Boos, 14 years ago

Milestone: next-minor-0.12.x0.12.3

comment:36 by Tim Hatch, 14 years ago

cboos: let me know if the range difference code is a bottleneck. Last I checked, it was still mostly svn calls, but there are a few things I could tweak.

comment:37 by Remy Blank, 13 years ago

I see several possible approaches here:

  • Speed up the algorithm. We have a pending patch above (8459-optimize-eligible-2-r8352.patch) that I unfortunately don't understand.
  • Delay the computation to the point where it's really needed. Currently, we always compute the svn:mergeinfo data for all branches when displaying the folder, even if the user doesn't need it, so it's a huge waste. Instead, we could link to a "computation" URL (e.g. /svn_mergeinfo/trunk?source=%2Fbranches%2F0.12-stable) that would compute the property data and redirect to the log or changeset view with the right revisions.
  • Don't calculate the merge info when rendering, and insert it through AJAX (or with an <IFRAME>, if that's possible).

The drawback of the delayed computation is that we always display a link, even if there are no eligible revisions. Currently, we display a greyed-out text in that case.

The delayed computation could be used in many more contexts, for example when going to the latest revision of a path (#8639), with the same drawback.

comment:38 by Christian Boos, 13 years ago

Milestone: 0.12.3next-minor-0.12.x

Delaying the computation is the way to go when there are lots of merge sources.

(for some future version)

comment:39 by Mark Potter <mpotter@…>, 11 years ago

Cc: mpotter@… added

comment:40 by Jun Omae, 11 years ago

Cc: Jun Omae added

comment:41 by Jun Omae, 10 years ago

I worked to reduce time of rendering svn:mergeinfo property in jomae.git@t8459_0.12.

Currently, _get_node_revs() is called for each entry of svn:mergeinfo. If number of its entries is much, the renderrer slows. In the branch, it retrieves revisions for multiple entries by single query in order to speed up.

Timing of SubversionMergePropertyRenderer on the following environment.

  • svn:mergeinfo has 25 entries.
  • node_change table has ~150,000 records.
  • revision table has ~25,000 records.
Before After
SQLite 0.460s 0.447s
PostgreSQL 1.521s 0.220s
MySQL 1.077s 0.264s

in reply to:  41 comment:42 by Jun Omae, 10 years ago

Milestone: next-minor-0.12.x0.12.6
Owner: changed from Christian Boos to Jun Omae

I worked to reduce time of rendering svn:mergeinfo property in jomae.git@t8459_0.12.

Added unit tests for Subversion property renderers in [12993-12995]. Also, prepared jomae.git@t8459_1.0 for 1.0-stable.

In [cb55f165/jomae.git], added .prefix_match() and .prefix_match_value() methods to database backend classes. The mehods are case-sensitive and index-friendly. The index is not used since .like() method is case-insensitive.

Thoughts?

comment:43 by Jun Omae, 10 years ago

Release Notes: modified (diff)

Comitted in [13015,13018] and merged in [13016-13017,13019-13020].

comment:44 by Jun Omae, 10 years ago

API Changes: modified (diff)
Resolution: fixed
Status: assignedclosed

comment:45 by Ryan J Ollos, 10 years ago

API Changes: modified (diff)

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Jun Omae.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Jun Omae to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.