Opened 16 years ago

Last modified 8 years ago

#7739 new enhancement

trac & memcached

Reported by: antonbatenev@…
Owned by:
Priority: normal
Milestone: unscheduled
Component: general
Version:
Severity: normal
Keywords: memcached performance
Cc: shanec@…, zach@…
Branch:
Release Notes:
API Changes:
Internal Changes:

Description (last modified by Christian Boos)

I think it will be useful to add memcached support to Trac to improve performance on high-load projects.

Note also that Trac 0.12 introduced some facilities to handle per-component caches, with proper invalidation semantics. Check TracDev/CacheManager.

Attachments (0)

Change History (16)

comment:1 by Remy Blank, 16 years ago

memcached does look interesting. Do you have any experience in adding memcached support to a web application? I wonder how concurrency and locking issues are managed.

Also, we need some profiling beforehand to make sure Trac is actually slowed down by DB accesses in typical situations, as that's what memcached is built to optimize.

comment:2 by ebray, 16 years ago

I've looked into hacking memcached support into Trac before. At the time our primary bottleneck *was* in fact database lookups performed by our custom permission system. However, said permission system was originally written by someone else, and was horribly inefficient. Rather than just throwing more technology at the problem, I was able to optimize the permission system so that instead of dozens of DB queries per page view for permissions alone there are now only one or two.

So now DB access is not really a bottleneck. I also doubt that most Trac sites are high volume enough that this would be worthwhile, though I don't have any real data to support that. It just doesn't seem like something that needs to be in Trac's core, and would be better off as a plugin. Genshi optimization is where the effort needs to go.

comment:3 by Christian Boos, 16 years ago

Keywords: memcached added
Milestone: 2.0

Well, that can't possibly be done as a plugin. Memcached support involves adding a check in the cache before every query, and writing results in the cache after every effective query following a cache miss.

One not so intrusive way to achieve this would be to do something at the cursor wrapper level for queries, but we'd also need to handle cache invalidation after every relevant change written to the database.

Not sure if this is possible at large, but possibly in some dedicated areas?

Anyway, keeping the ticket open as a scratchpad, if someone wants to contribute further ideas or even patches.
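
The check-before-query and write-after-miss flow described above, together with invalidation after a write, could be sketched roughly as follows. This is not Trac code: `DictCache`, `cache_key`, `cached_query` and `write_and_invalidate` are hypothetical names, a plain in-memory dict stands in for a memcached client, and the caller is assumed to know which cached queries a write makes stale.

```python
import hashlib
import sqlite3


class DictCache:
    """In-memory stand-in for a memcached client (get/set/delete only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def cache_key(sql, params):
    # Hash the statement and its parameters into a short, fixed-size key.
    raw = repr((sql, tuple(params))).encode("utf-8")
    return "q:" + hashlib.sha1(raw).hexdigest()


def cached_query(conn, cache, sql, params=()):
    """Return rows for a SELECT, consulting the cache first."""
    key = cache_key(sql, params)
    rows = cache.get(key)
    if rows is None:                      # cache miss: run the real query
        rows = conn.execute(sql, params).fetchall()
        cache.set(key, rows)              # store the result for next time
    return rows


def write_and_invalidate(conn, cache, sql, params, stale_queries):
    """Run a write, then drop the cached SELECTs it is assumed to affect."""
    conn.execute(sql, params)
    conn.commit()
    for stale_sql, stale_params in stale_queries:
        cache.delete(cache_key(stale_sql, stale_params))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticket (id INTEGER PRIMARY KEY, summary TEXT)")
conn.execute("INSERT INTO ticket (summary) VALUES ('first')")
conn.commit()

cache = DictCache()
select = "SELECT id, summary FROM ticket ORDER BY id"
first = cached_query(conn, cache, select)    # miss, goes to the database
second = cached_query(conn, cache, select)   # hit, served from the cache
write_and_invalidate(conn, cache,
                     "INSERT INTO ticket (summary) VALUES (?)", ("second",),
                     [(select, ())])
third = cached_query(conn, cache, select)    # miss again after invalidation
```

The hard part is the last argument of `write_and_invalidate`: knowing which cached results a given write affects is exactly the invalidation problem raised above, and the sketch simply pushes it to the caller.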

comment:4 by Remy Blank, 16 years ago

I just found out that Noah had already started something in th:CacheSystemPlugin. He seems to have stopped at 0.10, though.

in reply to:  3 ; comment:5 by ebray, 16 years ago

Replying to cboos:

Well, that can't possibly be done as a plugin. Memcached support involves adding a check in the cache before every query, and writing results in the cache after every effective query following a cache miss.

When I suggested it could be done as a plugin, what I meant was that it could be implemented as a DB backend that wraps one of the other backends. It would mostly involve using a cursor wrapper, as you suggested. Perhaps cursor.execute() could even have a keyword argument added to it for whether or not the cache should be used on that particular query, allowing it to be used in just some areas, like you suggested.
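
A cursor wrapper with an opt-in keyword argument might look something like the sketch below. The `CachingCursor` class and the `use_cache` argument are hypothetical and not part of Trac's or the DB-API's interface; a dict is used as the store where a memcached client would otherwise sit, and invalidation is left out entirely.

```python
import sqlite3


class CachingCursor:
    """Wraps a DB-API cursor; execute() takes a hypothetical use_cache flag."""

    def __init__(self, cursor, cache):
        self._cursor = cursor
        self._cache = cache        # dict-like store (assumption, see above)
        self._rows = None          # rows served from the cache, if any
        self.hits = 0

    def execute(self, sql, params=(), use_cache=False):
        self._rows = None
        if not use_cache:
            # Default path: behave exactly like the wrapped cursor.
            self._cursor.execute(sql, params)
            return self
        key = (sql, tuple(params))
        if key in self._cache:
            self.hits += 1
        else:
            self._cursor.execute(sql, params)
            self._cache[key] = self._cursor.fetchall()
        self._rows = list(self._cache[key])
        return self

    def fetchall(self):
        if self._rows is not None:
            # Hand out the cached rows once, then behave as exhausted.
            rows, self._rows = self._rows, []
            return rows
        return self._cursor.fetchall()

    def __getattr__(self, name):
        # Delegate everything else (description, rowcount, ...) unchanged.
        return getattr(self._cursor, name)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE permission (username TEXT, action TEXT)")
conn.execute("INSERT INTO permission VALUES ('anonymous', 'WIKI_VIEW')")

cursor = CachingCursor(conn.cursor(), {})
sql = "SELECT action FROM permission WHERE username = ?"
a = cursor.execute(sql, ("anonymous",), use_cache=True).fetchall()  # miss
b = cursor.execute(sql, ("anonymous",), use_cache=True).fetchall()  # hit
c = cursor.execute(sql, ("anonymous",)).fetchall()                  # bypass
```

Making the cache opt-in per call keeps existing queries untouched, so only the areas that are known to be safe to cache would need to change.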

in reply to:  5 comment:6 by Christian Boos, 16 years ago

Replying to ebray:

Replying to cboos:

Well, that can't possibly be done as a plugin. Memcached support involves adding a check in the cache before every query, and writing results in the cache after every effective query following a cache miss.

When I suggested it could be done as a plugin, what I meant was that it could be implemented as a DB backend that wraps one of the other backends.

Mostly agreed. I just don't think it can be done at the DB backend only, i.e. without taking into account cache keys at a higher level. As hinted in comment:25:ticket:6436, a more general cache infrastructure would be a good thing. In that infrastructure, memcached support could be added as a plugin. But there could be other "backends":

  • Simple Cache
    This is mostly what we have now.
    • Each component can maintain its own "builtin" caches, such as the page index cache in WikiSystem, the ticket fields cache in TicketSystem, etc.
    • The cache validity checks will always fail for most of the usual SQL queries, meaning there will be no caching, but when checking for the information related to a builtin cache, we only fetch the information from the DB once.
    • If any other save action invalidates one of those caches, config.touch() is called. This will trigger a reload of the environment at the next request and therefore the builtin caches will be cleared and rebuilt as needed.
      Btw, using config.touch() after creation or deletion of pages, instead of the periodic reload of the page index cache would already be an improvement.
  • DB Cache
    Very much like the above, but with a finer granularity. Instead of doing a config.touch() to invalidate all the caches at once, we increment a counter for the specific key that has been invalidated (e.g. update cache set value = value + 1 where key = 'wiki_title_index'). At the beginning of each request, we get the keys and their counters, so the environment can be told which keys are no longer up-to-date, and the built-in caches can then be rebuilt when needed.

Well, that's just a rough sketch, feel free to expand on it or beat it down ;-)
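
The "DB Cache" idea from the list above could be sketched like this, using SQLite for illustration. The `cache` table layout follows the example statement in the list; the `GenerationCache` class, its method names and the `wiki` table are assumptions made for the sketch, and two instances on one connection merely simulate two server processes.

```python
import sqlite3


class GenerationCache:
    """Per-process builtin caches validated against counters kept in the DB."""

    def __init__(self, conn):
        self.conn = conn
        self._seen = {}      # key -> counter value when our copy was built
        self._values = {}    # key -> locally cached object
        self._current = {}   # key -> counter value read at request start

    def start_request(self):
        # One query per request fetches every counter.
        rows = self.conn.execute("SELECT key, value FROM cache").fetchall()
        self._current = dict(rows)

    def get(self, key, rebuild):
        generation = self._current.get(key, 0)
        if self._seen.get(key) != generation:
            self._values[key] = rebuild()      # stale or missing: rebuild
            self._seen[key] = generation
        return self._values[key]

    def invalidate(self, key):
        # Bump the counter so every process notices at its next request.
        cur = self.conn.execute(
            "UPDATE cache SET value = value + 1 WHERE key = ?", (key,))
        if cur.rowcount == 0:
            self.conn.execute(
                "INSERT INTO cache (key, value) VALUES (?, 1)", (key,))
        self.conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, value INTEGER)")
conn.execute("CREATE TABLE wiki (name TEXT)")
conn.execute("INSERT INTO wiki VALUES ('WikiStart')")
conn.commit()


def load_index():
    rows = conn.execute("SELECT name FROM wiki ORDER BY name")
    return [row[0] for row in rows]


proc_a, proc_b = GenerationCache(conn), GenerationCache(conn)
proc_a.start_request()
before = proc_a.get("wiki_title_index", load_index)

conn.execute("INSERT INTO wiki VALUES ('TracGuide')")
proc_b.invalidate("wiki_title_index")      # another process saved a page

same_request = proc_a.get("wiki_title_index", load_index)  # not yet noticed
proc_a.start_request()
after = proc_a.get("wiki_title_index", load_index)         # rebuilt
```

Compared with config.touch(), only the cache whose counter moved is rebuilt, at the cost of one extra SELECT per request.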

comment:7 by Christian Boos, 16 years ago

There's also another level where the cache could play a role, caching the full content of generated pages, using the ETag key computed in Request.check_modified as a key. In Request.send the content could be saved under the ETag key (if there's an ETag header).

Of course, that would probably not play well with fine-grained changes in the system, like the output of a TicketQuery macro, which can be present nearly anywhere and whose actual content depends on the state of every ticket.

There might be a trade-off to find here, possibly by adding a "generation" value that would increment whenever the resources are modified. That value could be integrated in the key, forcing a refresh of the cache whenever something changed. That would not be very effective, as a lot of unchanged content would probably be invalidated, but the presented pages would always be correct and the cache would at least ensure that no page would be generated twice with the exact same content. An intermediate approach would be to cache the generation value at the time the content for a given ETag was generated, and set an extra header when the original and current generation differ, leading to a JavaScript warning in the browser that the content might be out-of-date (we would need to ensure that we bypass the cache when the browser does a refresh on such "out-of-date" cached pages).
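
Both variants of the generation idea can be illustrated with a small sketch. Everything here is hypothetical (the `PageCache` class, its method names, the in-memory dicts); it only shows how the ETag and a global generation counter could combine, not how Request.send would actually be changed.

```python
class PageCache:
    """Caches rendered pages by ETag, using a global generation counter."""

    def __init__(self):
        self.generation = 0
        self._strict = {}    # "etag:generation" -> content
        self._lenient = {}   # etag -> (content, generation when rendered)

    def resource_changed(self):
        # Called whenever any resource is modified.
        self.generation += 1

    def send_strict(self, etag, render):
        # Generation is part of the key: any change forces a re-render.
        key = "%s:%d" % (etag, self.generation)
        if key not in self._strict:
            self._strict[key] = render()
        return self._strict[key]

    def send_lenient(self, etag, render, force_refresh=False):
        # Keep serving old content, but report that it may be out-of-date.
        entry = self._lenient.get(etag)
        if entry is None or force_refresh:
            entry = (render(), self.generation)
            self._lenient[etag] = entry
        content, built_at = entry
        return content, built_at != self.generation


calls = []


def render():
    calls.append(1)
    return "<html>render %d</html>" % len(calls)


cache = PageCache()
p1 = cache.send_strict('"abc"', render)     # rendered
p2 = cache.send_strict('"abc"', render)     # served from the cache
cache.resource_changed()
p3 = cache.send_strict('"abc"', render)     # new generation, re-rendered

l1, stale1 = cache.send_lenient('"abc"', render)
cache.resource_changed()
l2, stale2 = cache.send_lenient('"abc"', render)   # old content, flagged
l3, stale3 = cache.send_lenient('"abc"', render, force_refresh=True)
```

In the lenient variant the boolean would drive the extra header, and `force_refresh` corresponds to bypassing the cache when the browser reloads a page flagged as out-of-date.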

comment:8 by Shane Caraveo <shanec@…>, 15 years ago

Cc: shanec@… added

comment:9 by zach@…, 15 years ago

ebray - do you happen to have a patch of your permission system optimizations? We're getting thousands of permissions queries on ticket page views…

comment:10 by zach@…, 15 years ago

Cc: zach@… added

in reply to:  9 ; comment:11 by anonymous, 15 years ago

Replying to zach@…:

ebray - do you happen to have a patch of your permission system optimizations? We're getting thousands of permissions queries on ticket page views…

That would be very odd. Do you have plugins installed? The only time I saw anything similar to that was with a plugin. If so, disable them, re-enable one at a time until you find the culprit.

in reply to:  11 comment:12 by Zach Wily <zach@…>, 15 years ago

Replying to anonymous:

That would be very odd. Do you have plugins installed? The only time I saw anything similar to that was with a plugin. If so, disable them, re-enable one at a time until you find the culprit.

We don't have any plugins enabled, and the queries go away when turning off restrict_owner.

comment:13 by Christian Boos, 15 years ago

Description: modified (diff)

comment:14 by Christian Boos, 14 years ago

Milestone: 2.0 → unscheduled

Milestone 2.0 deleted

comment:15 by figaro, 9 years ago

Keywords: performance added

comment:16 by Archimedes Trajano, 8 years ago

Perhaps the TracDev/CacheManager can be ported to use memcache as the store rather than managing it as part of the database.
