Inefficient sql query
|Reported by:||Owned by:||Christian Boos|
The sql query in get_youngest_rev_in_cache is quite inefficient and does a full table scan in the revision table which scales quite bad. The test repository used for the following numbers is the repository from crystalspace3d.org which has around 25000 revisions and is 1.5gb big.
I benchmarked with apachebench with 3 times 20 requests. Everything one with a concurrency of 4 and without. As the concurrency didnt make any significant difference and as they vary much more over each series of requests, i didnt wrote the numbers down.
What i requested was the timeline with a history depth of 5 days. Hardware is a pentium 4 system, with a raid 1 disk system and 1 gig of ram.
The following numbers are for sqlite. With postgres its very much more visible, but i thought that sqlite is more wildly used.
Original query: 1.01 requests/s
SELECT MAX(CAST(rev as int)) FROM revision 1.2 requests/s
SELECT MAX(rev_int) FROM revision with rev_int beeing a column which copied rev as an int and an seperate index over it. The problem is, that sqlite doesnt seems to support indices over expressions which eg postgres does. 3.66 requests/s
The effect is also visible when using a history depth of 90 days but the differences arent _that_ big as before. I didnt really measure those as its sometimes below 0.11 requests/s …
The problem with this is, that the CAST() syntax is only supported with sqlite3 and not 2. Although not tested mysql seems to support that for some time.
And indices over expressions are not supported in sqlite at all. Which is a pity because else you could just add an index like CREATE INDEX revision_rev_asint ON revision ((CAST(rev as int)). The only even halfway practical solution i currently see is an extra column + trigger. But thats very hackish. Any other idea?
Change History (9)
follow-up: 3 comment:2 by , 16 years ago
|Component:||general → version control|