Edgewall Software
Modify

Ticket #8459 (assigned defect)

Opened 3 years ago

Last modified 4 months ago

svn:mergeinfo rendering is far too slow

Reported by: eblot Owned by: cboos
Priority: high Milestone: next-minor-0.12.x
Component: version control/browser Version: 0.12dev
Severity: major Keywords: svn svnmerge mergeinfo performance
Cc: jsceballos@…, fly1004@…, info@…, tim@…, rverchere@…, duncanphilipnorman@…, pweygand@…
Release Notes:
API Changes:

Description

This is a ticket follow up of #7715.

The new property rendered makes Trac browser deadly slow to render a directory that contains a lot of svn:mergeinfo values.

Typical slow down with [8352] is 40 times as slower as with the previous, dummy rendering on a Core 2 Duo 2.8Ghz machine.
It takes about 5 seconds to render our /trunk directory which only contains 7 entries, and about 450 mergeinfo sources.

Using 'defect' rather than 'enhancement', as it makes Trac browser hardly usable on such directories.

Attachments

8459-optimize-eligible-r8352.patch (2.0 KB) - added by rblank 3 years ago.
Updated patch for "a faster way to compute eligible changesets" by cboos.
8459-optimize-eligible-2-r8352.patch (2.0 KB) - added by rblank 3 years ago.
Fixed a typo.

Download all attachments as: .zip

Change History

comment:1 follow-up: Changed 3 years ago by rblank

Note that you can disable the property renderer on 0.11-stable if performance is bad (and I'll merge the change to trunk tonight), and you'll still get a more or less readable display of the properties.

I'll refresh the optimization patch from Christian in #7715 and attach it here.

comment:2 in reply to: ↑ 1 ; follow-up: Changed 3 years ago by eblot

I wonder if there were a mean to ask the mergeinfo values directly from SVN rather than computing them in Python code.
It seems that the Python SWIG bindings do not offer such an interface though ;-(

Changed 3 years ago by rblank

Updated patch for "a faster way to compute eligible changesets" by cboos.

comment:3 follow-up: Changed 3 years ago by rblank

  • Milestone set to 0.11.6

The patch above is a refresh of attachment:7715-optimize-eligible-r8342.diff:ticket:7715 to current branches/0.11-stable@8352, and applies cleanly to trunk@8353 as well. I believe it also fixes an off-by-one error from the original patch.

comment:4 in reply to: ↑ 3 ; follow-ups: Changed 3 years ago by eblot

Replying to rblank:

The patch above is a refresh of attachment:7715-optimize-eligible-r8342.diff:ticket:7715 to current branches/0.11-stable@8352, and applies cleanly to trunk@8353 as well. I believe it also fixes an off-by-one error from the original patch.

I'm lost, again: now the rendering goes fast, but the rendering is totally different: neither "eligible" nor "merged" entries are shown, only the raw list of sn:mergeinfo properties, and the "toggle deleted branch" has been removed as well.
Did I miss something?

comment:5 in reply to: ↑ 4 Changed 3 years ago by eblot

Replying to eblot:

Did I miss something?

When I remove the patch, everything goes back to normal (slow, but full mergeinfo rendering). Sorry I have no time to investigate this issue for now ;-(

comment:6 in reply to: ↑ 4 ; follow-up: Changed 3 years ago by rblank

Replying to eblot:

I'm lost, again: now the rendering goes fast, but the rendering is totally different: neither "eligible" nor "merged" entries are shown, only the raw list of sn:mergeinfo properties, and the "toggle deleted branch" has been removed as well.
Did I miss something?

No, that means there's an exception during the rendering of the property, and the next rederer is used, which is the "plain" one. You should have a traceback in your log file, could you please post it here?

Also, would it be possible for you to attach the content of your svn:mergeinfo property (or to send it to me by e-mail if you prefer)? That should allow me to reproduce the issue.

comment:7 in reply to: ↑ 6 Changed 3 years ago by eblot

Replying to rblank:

Replying to eblot:
Also, would it be possible for you to attach the content of your svn:mergeinfo property (or to send it to me by e-mail if you prefer)? That should allow me to reproduce the issue.

I've PMed you both files (trac.log.gz, svn-mergeinfo.gz). Let me know if you need more details.

comment:8 Changed 3 years ago by rblank

Right, so the traceback was the following:

Traceback (most recent call last):
  File "/private/var/engine/trac-hg/trac/versioncontrol/web_ui/browser.py", line 560, in render_property
    rendered = renderer.render_property(name, mode, context, props)
  File "/private/var/engine/trac-hg/trac/versioncontrol/svn_prop.py", line 175, in render_property
    first_rev = node.get_branch_origin()
  File "/private/var/engine/trac-hg/trac/versioncontrol/svn_fs.py", line 792, in get_branch_origin
    return fs.revision_root_revision(root_path[0])
NameError: global name 'root_path' is not defined

I admit that I had only refreshed Christian's patch and not tested it more than superficially, so that part of the code hadn't been exercised at all. I'll attach an updated patch.

Changed 3 years ago by rblank

Fixed a typo.

comment:9 Changed 3 years ago by Javier Sanz <jsceballos@…>

  • Cc jsceballos@… added

comment:10 Changed 3 years ago by fly1004@…

  • Cc fly1004@… added

comment:11 Changed 3 years ago by Thijs Triemstra <info@…>

  • Cc info@… added

comment:12 Changed 3 years ago by cboos

  • Cc tim@… added
  • Priority changed from normal to high

Another improvement would be to do range intersections and differences using directly the Ranges class, instead of creating potentially huge sets (think repositories with > 100k changesets).

I know Tim Hatch already implemented that once, but I managed to lose the patch he sent me at that time ;-)

comment:13 follow-up: Changed 3 years ago by anonymous

WARNING: Rendering failed for property svnmerge-integrated with renderer SubversionMergePropertyRenderer:
  Traceback (most recent call last):
   File "/usr/lib/python2.4/site-packages/Trac-0.11.5-py2.4.egg/trac/versioncontrol/web_ui/browser.py", line 561, in render_property
     rendered = renderer.render_property(name, mode, context, props)
   File "/usr/lib/python2.4/site-packages/Trac-0.11.5-py2.4.egg/trac/versioncontrol/svn_prop.py", line 177, in render_property
     eligible -= set(Ranges(revs))
   File "/usr/lib/python2.4/site-packages/Trac-0.11.5-py2.4.egg/trac/util/__init__.py", line 406, in __init__
     self.appendrange(r)
   File "/usr/lib/python2.4/site-packages/Trac-0.11.5-py2.4.egg/trac/util/__init__.py", line 417, in appendrange
     a, b = int(x), int(x) ValueError: invalid literal for int(): 1-4505 /branches/bob:1-3041

svnmerge-integrated is:

/branches/alice:1-4505 /branches/bob:1-3041,3054 /branches/carroll:1-3751,3790-3884,3886-3990,3998 /branches/dave:1-1972,2067 ...

(this is obviously using svnmerge.py and subversion 1.4)

comment:14 in reply to: ↑ 13 ; follow-up: Changed 3 years ago by cboos

  • Keywords svn svnmerge mergeinfo added

Replying to anonymous:

svnmerge-integrated is:

/branches/alice:1-4505 /branches/bob:1-3041,3054 /branches/carroll:1-3751,3790-3884,3886-3990,3998 /branches/dave:1-1972,2067 ...

(this is obviously using svnmerge.py and subversion 1.4)

So this is an issue with some(?) versions of svnmerge.py using space as delimiter between branches, instead of a newline. Wonder how they handle branches containing a space character...

Anyway, I guess this only applies to svnmerge- properties, not to svn:mergeinfo. Can you please try out the following patch?

  • trac/versioncontrol/svn_prop.py

     
    158158                if path not in branch_starts: 
    159159                    branch_starts[path] = rev + 1 
    160160        rows = [] 
    161         for line in props[name].splitlines(): 
     161        if name.startswith('svnmerge-'): 
     162            sources = props[name].split() 
     163        else: 
     164            sources = props[name].splitlines() 
     165        for line in sources: 
    162166            path, revs = line.split(':', 1) 
    163167            spath = _path_within_scope(repos.scope, path) 
    164168            if spath is None: 

comment:15 in reply to: ↑ 14 ; follow-up: Changed 3 years ago by anonymous

Replying to cboos:

So this is an issue with some(?) versions of svnmerge.py using space as delimiter between branches, instead of a newline. Wonder how they handle branches containing a space character...

Anyway, I guess this only applies to svnmerge- properties, not to svn:mergeinfo. Can you please try out the following patch?

Correct - our svnmerge.py which we've been using for quite some time uses spaces as a separator. Your patch in previous comment 14 makes things work. Awesome!

According to a commit to svnmerge.py they deal with spaces in branch names by not allowing them, according to this check-in: http://svn.collab.net/viewvc/svn/trunk/contrib/client-side/svnmerge/svnmerge.py?view=log#rev29767

comment:16 in reply to: ↑ 15 Changed 3 years ago by cboos

Replying to anonymous:

Your patch in previous comment 14 makes things work. Awesome!

Applied in r8368, thanks for testing!

According to a commit to svnmerge.py they deal with spaces in branch names by not allowing them, according to this check-in

I see... so this applies to all versions of svnmerge.py.

Btw, in the future, don't hesitate to create new tickets in such situations. If you create a duplicate by mistake, it's not a big deal ;-)

comment:17 Changed 3 years ago by cboos

  • Owner set to cboos

comment:18 follow-up: Changed 3 years ago by cboos

In order to facilitate the testing, I've committed directly the changes:

So I'd like to get feedback about the performance of 0.11.6dev-r8403.

Next I'd like to not use sets but keep using Ranges (implement difference and intersection methods). That's probably not going to save much time but could save some memory for repositories with a long history.

comment:19 in reply to: ↑ 18 Changed 3 years ago by rblank

Replying to cboos:

Next I'd like to not use sets but keep using Ranges (implement difference and intersection methods).

This is probably going to speed up cases where there are few disconnected ranges, and make the cases with lots of small ranges slower. What's the more frequent situation? The former, I guess.

That's probably not going to save much time but could save some memory for repositories with a long history.

If it's only for memory, I wouldn't event bother. A set with 100k integers is not going to consume more than 1-2 MB, and only temporarily. That sounds negligible. And creating such a set takes 23ms on my machine, so I would be surprised if it made much of a difference in speed, either.

comment:20 in reply to: ↑ 2 ; follow-ups: Changed 3 years ago by rupert.thurner@…

Replying to eblot:

I wonder if there were a mean to ask the mergeinfo values directly from SVN rather than computing them in Python code.
It seems that the Python SWIG bindings do not offer such an interface though ;-(

could you elaborate what you expect from svn here? the eligible merges, and how this would be defined, so we might even point subversion people to it, maybe they find it as well useful?

"modern" tools like git, mercurial do not expose "mergeinfo", and the only thing we miss in rare cases is a changeset which we do not want to merge (some equivalent to a svn "blocked revision"). for that reason i somehow like eblots proposal to display it only on request (i.e. click, or user preference) would be good for the performance and reflect best that one needs it only in rare cases.

comment:21 in reply to: ↑ 20 Changed 3 years ago by eblot

Replying to rupert.thurner@…:

Replying to eblot:
could you elaborate what you expect from svn here? the eligible merges, and how this would be defined, so we might even point subversion people to it, maybe they find it as well useful?

An API to produce a result that would look like

svn mergeinfo --show-revs eligible
svn mergeinfo --show-revs merged

comment:22 in reply to: ↑ 20 Changed 3 years ago by cboos

Replying to rupert.thurner@…:

"modern" tools like git, mercurial do not expose "mergeinfo"

When you merge a branch in those systems, you're effectively saying that you've merged all the changesets the branch contains. Whether those changes are actually in or not depend on the edits you've made before committing the merge (e.g. you could revert whole files or just the changeset(s) you don't want to actually merge by applying reverse diffs for example). But after that, those changesets are all "ancestors" and you'll never see them replayed in later merges. So it's quite like "svn:mergeinfo", except that you can not tweak the list of changesets for a merge, you always have to take all of them.

(I'm speaking primarily for Mercurial, though I believe this applies to git as well)

and the only thing we miss in rare cases is a changeset which we do not want to merge (some equivalent to a svn "blocked revision").

Precisely, there's no such thing, and some may even prefer the Subversion workflow for that.

for that reason i somehow like eblots proposal to display it only on request (i.e. click, or user preference) would be good for the performance and reflect best that one needs it only in rare cases.

I don't think it's a "rare" case, that information is certainly very useful and that's why we put so much effort in this lately... But I agree that you don't need it always. If this information comes for free for smaller repositories, like here on t.e.o I think it's better to have it shown directly, like it is now. But if it introduces a significant slow down (like for eblot's repository), then we'd better have an indirect way to get at it.

comment:23 Changed 3 years ago by cboos

  • Status changed from new to assigned

I did some performance testing on my own yesterday, and here are some numbers:

INFO: timing for get_branch_origin: 0.00124
INFO: timing for set(): 0.00075
INFO: timing for -= revs : 0.00056
INFO: timing for _get_blocked_revs: 0.00019
INFO: timing for _get_node_revs: 0.79794
INFO: timing for &=: 0.00075
INFO: timing for to_ranges: 0.00162
INFO: TIMING FOR branches/.......: 0.80423
...
INFO: timing for get_branch_origin: 0.00021
INFO: timing for set(): 0.00025
INFO: timing for -= revs : 0.00033
INFO: timing for _get_blocked_revs: 0.00018
INFO: timing for _get_node_revs: 0.84454
INFO: timing for &=: 0.00046
INFO: timing for to_ranges: 0.00085
INFO: TIMING FOR trunk: 0.8477

Which means a total of 1.65s. Before r8401, this was a bit more than 2s.

This means that there are some savings, but far from enough. And _get_node_revs is clearly the only place left to optimize. For 0.12, there's ticket:6654#comment:12, and this might be enough. For 0.11.6, I think we need to implement an indirect link.

comment:24 follow-up: Changed 2 years ago by cboos

  • Milestone changed from 0.11.6 to next-minor-0.12.x

For 0.11, what we have is good enough, if this is really too slow for some setups, then either:

  • upgrade to 0.12 once it's released,
  • or disable the trac.versioncontrol.svn_prop.SubversionMergePropertyRenderer component

comment:25 in reply to: ↑ 24 ; follow-up: Changed 2 years ago by kevin.winahradsky@…

Replying to cboos:

For 0.11, what we have is good enough, if this is really too slow for some setups, then either:

  • upgrade to 0.12 once it's released,
  • or disable the trac.versioncontrol.svn_prop.SubversionMergePropertyRenderer component

I wanted to add that i also needed to disable trac.versioncontrol.svn_prop.SubversionMergePropertyDiffRenderer as rendering changesets for us was taking upwards of 30 minutes. The subversion browser rendering was also significantly long with pages taking 10-15 minutes. This was also causing many other errors like "database is locked" and "instance.dict not accessible in restricted mode".

After disabling these 2 items load times for the svn browser and changesets are back to less than a second for us.

Our svn repository has over 18000 revisions and close to 500 mergeinfo entries on the trunk. The box itself is unfortunately only a 2.4GHz celeron with 1G of ram. We started having the slowdown when we upgraded to ubuntu 9.10 and with it came trac 11.5 as we use the default packages for most things.

System Information
Trac: 0.11.5
Python: 2.6.4 (r264:75706, Dec 7 2009, 19:02:09) [GCC 4.4.1]
setuptools: 0.6c9
SQLite: 3.6.16
pysqlite: 2.4.1
Genshi: 0.5.1
mod_wsgi: 2.5
Pygments: 1.0
Subversion: 1.6.5 (r38866)
jQuery: 1.3.2

comment:26 in reply to: ↑ 25 Changed 21 months ago by richard@…

Replying to kevin.winahradsky@…:

Replying to cboos:

For 0.11, what we have is good enough, if this is really too slow for some setups, then either:

  • upgrade to 0.12 once it's released,
  • or disable the trac.versioncontrol.svn_prop.SubversionMergePropertyRenderer component

I wanted to add that i also needed to disable trac.versioncontrol.svn_prop.SubversionMergePropertyDiffRenderer as rendering changesets for us was taking upwards of 30 minutes. The subversion browser rendering was also significantly long with pages taking 10-15 minutes. This was also causing many other errors like "database is locked" and "instance.dict not accessible in restricted mode".

After disabling these 2 items load times for the svn browser and changesets are back to less than a second for us.

Just to let you know that we also experienced the same changeset rendering slowdown with 0.11.7.

I upgraded to 0.12b1 but found it just as slow.

Disabling SubversionMergePropertyRenderer? and SubversionMergePropertyDiffRenderer? plugins fixes the problem in 0.12b1 too.

This solution has been quite difficult to track down so I think you should add it to a "known issues" section on the Trac frontpage.

-RichardW.

comment:27 Changed 21 months ago by cboos

  • Keywords performance added
  • Milestone changed from next-minor-0.12.x to 0.12.1

It would be wonderful to actually use an age old technique for such things, the FAQ ;-)
The TracFaq needs a bit of care.

We could also add a note in TracRepositoryAdmin and TracUpgrade (together with a word on optional components).

Moving to 0.12.1 to address the problem itself.

comment:28 Changed 20 months ago by rverchere@…

  • Cc rverchere@… added

comment:29 Changed 19 months ago by duncanphilipnorman@…

  • Cc duncanphilipnorman@… added

comment:30 Changed 17 months ago by cboos

Sorry folks, I had no time to dig that issue further, new ideas and external contributions welcomed.

comment:31 Changed 17 months ago by cboos

  • Milestone changed from 0.12.1 to next-minor-0.12.x

comment:32 Changed 15 months ago by pweygand@…

So We just migrated to svn 1.6 from 1.4 with trac 0.11.7. We experienced this same slow-down. It appears to be linear with respect to the amount of merges on the branch.
I did a bit of basic profiling and each svn:mergeinfo entry adds aprox 0.5s for our trac installation.
At about 2 mo. into our upgrade we started getting timeouts with 328 mergeinfo entries and a load time of 159 secs.

Until this issue is fixed, I would suggest disabling these options by default as this makes the trac browser unusable in a very short amount of time.

disabling the two options worked perfectly - our load times dropped back down to 1-3 sec.

[components]
trac.versioncontrol.svn_prop.subversionmergepropertydiffrenderer = disabled
trac.versioncontrol.svn_prop.subversionmergepropertyrenderer = disabled

If you would like ANY info to help you reproduce and debug this, feel free to contact me.

comment:33 Changed 15 months ago by pweygand@…

  • Cc pweygand@… added

comment:34 Changed 12 months ago by cboos

Note to self: usage of sets can be surprisingly slow... investigate use of dedicated algorithms working on ranges.

comment:35 Changed 12 months ago by cboos

  • Milestone changed from next-minor-0.12.x to 0.12.3

comment:36 Changed 12 months ago by thatch

cboos: let me know if the range difference code is a bottleneck. Last I checked, it was still mostly svn calls, but there are a few things I could tweak.

comment:37 Changed 4 months ago by rblank

I see several possible approaches here:

  • Speed up the algorithm. We have a pending patch above (8459-optimize-eligible-2-r8352.patch) that I unfortunately don't understand.
  • Delay the computation to the point where it's really needed. Currently, we always compute the svn:mergeinfo data for all branches when displaying the folder, even if the user doesn't need it, so it's a huge waste. Instead, we could link to a "computation" URL (e.g. /svn_mergeinfo/trunk?source=%2Fbranches%2F0.12-stable) that would compute the property data and redirect to the log or changeset view with the right revisions.
  • Don't calculate the merge info when rendering, and insert it through AJAX (or with an <IFRAME>, if that's possible).

The drawback of the delayed computation is that we always display a link, even if there are no eligible revisions. Currently, we display a greyed-out text in that case.

The delayed computation could be used in many more contexts, for example when going to the latest revision of a path (#8639), with the same drawback.

comment:38 Changed 4 months ago by cboos

  • Milestone changed from 0.12.3 to next-minor-0.12.x

Delaying the computation is the way to go when there are lots of merge sources.

(for some future version)

View

Add a comment

Modify Ticket

Change Properties
<Author field>
Action
as assigned
as The resolution will be set. Next status will be 'closed'
to The owner will be changed from cboos. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.