Version 1 (modified by Christian Boos, 11 years ago)

Initial documentation taken from TracDev/Proposals/CacheInvalidation@21#Finalimplementation

Caching and cache invalidation

Trac uses various caches at the component level in order to speed up costly tasks. Examples include the ticket fields cache (#6436), the InterMapTxt cache, the user permission cache and, the oldest one, the Wiki page cache.

Those caches are held at the level of Component instances. For a given class, there is one such instance per environment in any given server process. The first thing to take into account here is that those caches must be safely accessed and modified by concurrent threads (in multi-threaded web front ends, that is).

But as concurrent access to the underlying database by multiple processes is always possible, those caches must also be kept consistent and up to date across all processes involved. Otherwise, you might make a change by way of one request, and the next request (even the GET following a redirect after your POST!) might be handled by a different server process which has a different "view" of the application state.

This doesn't even imply a multi-process server setup: all that is needed is, for example, a modification of the database done using trac-admin.

Past Situation

The solution to the above problem used to be a kind of global reset mechanism, which would not only invalidate the caches, but simply "throw away" all the Component instances of the environment being reset. That reset happened by way of a simulated change of the TracIni file, triggered by a call to self.config.touch() from a component instance. The next time an environment instance was retrieved, the old environment instance was found to be out of date and a new one was created (see trac.env.open_environment). Consequently, new Component instances were created as well, and the caches were repopulated as needed.

Limitations:

  • it's a bit costly - if this full reset happens too frequently, the benefits from the caches simply disappear.
  • it's all or nothing - the more we rely on this mechanism for different caches, the worse the above situation becomes. Ideally, invalidating one cache should not force all the other caches to be reset.

The Cache Manager

Starting with Trac 0.12 (more precisely r8071), we introduced a CacheManager component. That component is mostly transparent to the developer, who only has to deal with two decorators that can be used to create cached attributes.

  • Creating a cached attribute is done by defining a retrieval function and decorating it with the @cached_value decorator. For example, for the wiki page names:
    @cached_value
    def pages(self, db):
        """Return the names of all existing wiki pages."""
        cursor = db.cursor()
        cursor.execute("SELECT DISTINCT name FROM wiki")
        return [name for (name,) in cursor]
    
  • Invalidating a cached attribute is done by deleting the attribute:
    def wiki_page_added(self, page):
        if not self.has_page(page.name):
            del self.pages
    
  • If more control is needed, for example to invalidate an attribute within an existing transaction, the attribute should be decorated with the @cached decorator. Accessing the attribute then yields a proxy object with two methods, get() and invalidate(), taking an optional db argument. For example, this is used in the case of ticket fields to invalidate them in the same transaction as e.g. an enum modification.
  • The cache is consistent within a request. That is, a cached attribute will always have the same value during a given transaction. Obviously, cached values should be treated as immutable.
  • The CacheManager component contains all the logic for data retrieval, caching and invalidation. Cache invalidation across processes is done by incrementing a generation counter for the given attribute in the cache database table. The invalidation granularity is at the attribute level.
  • There are two cache levels:
    • A thread-local (per-request) cache is used to minimize locking and ensure that the cached data is consistent during a request. It is emptied at the beginning of every request.
    • A process cache keeps retrieved data as long as it has not been invalidated.
  • The cache table is read the first time a cached attribute is accessed during a request. This avoids slowing down requests that don't touch cached attributes, like requests for static content for example.
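To make the @cached proxy behavior concrete, here is a minimal, Trac-independent sketch of the pattern described above: a decorator whose attribute access yields a proxy object with get() and invalidate() methods. The class names (CacheProxy) and the toy WikiSystem component are illustrative assumptions, not Trac's actual implementation (which lives in trac.cache and coordinates with the database):

```python
class CacheProxy(object):
    """Illustrative proxy exposing get()/invalidate(), as described above.
    The optional db argument mirrors the one mentioned in the text; this
    sketch ignores it."""

    def __init__(self, retriever, owner):
        self._retriever = retriever
        self._owner = owner
        self._valid = False
        self._value = None

    def get(self, db=None):
        # Run the retrieval function on first access, then memoize.
        if not self._valid:
            self._value = self._retriever(self._owner, db)
            self._valid = True
        return self._value

    def invalidate(self, db=None):
        # Drop the cached value; the next get() re-runs the retriever.
        self._valid = False


class cached(object):
    """Toy stand-in for Trac's @cached decorator: accessing the
    decorated attribute yields a CacheProxy."""

    def __init__(self, retriever):
        self.retriever = retriever

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # One proxy per owning instance, stored on the instance itself.
        key = '_cache_' + self.retriever.__name__
        proxy = instance.__dict__.get(key)
        if proxy is None:
            proxy = instance.__dict__[key] = CacheProxy(self.retriever,
                                                        instance)
        return proxy


class WikiSystem(object):
    """Hypothetical component using the toy decorator."""
    retrievals = 0

    @cached
    def pages(self, db):
        WikiSystem.retrievals += 1
        return ['WikiStart', 'TracGuide']
```

With this sketch, ws.pages.get() retrieves the data once and caches it, and ws.pages.invalidate() forces the next get() to retrieve again, which is the control the text says is needed to invalidate an attribute within an existing transaction.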
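The generation-counter scheme for cross-process invalidation can also be sketched in a few lines. This is a hedged illustration, not Trac's code: the SHARED_TABLE dict stands in for the cache database table (in Trac the counter is bumped with an UPDATE inside the invalidating transaction), and all names here are made up for the example:

```python
# attribute id -> generation counter; stands in for the `cache` DB table
SHARED_TABLE = {'wiki.pages': 0}


class ProcessCache(object):
    """Per-process cache: data is kept only while the locally recorded
    generation matches the shared counter."""

    def __init__(self, key, retriever):
        self.key = key
        self.retriever = retriever
        self.generation = -1   # generation of the data we hold (none yet)
        self.data = None

    def get(self):
        current = SHARED_TABLE[self.key]   # read the "cache table"
        if current != self.generation:     # stale, or never retrieved
            self.data = self.retriever()
            self.generation = current
        return self.data


def invalidate(key):
    # Any process increments the counter; every other process notices
    # the mismatch on its next access and re-retrieves the data.
    SHARED_TABLE[key] += 1
```

Because each process only compares an integer on access, the granularity is per attribute, and unrelated caches are untouched by an invalidation, which is exactly the improvement over the old global-reset mechanism.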

See also TracDev/Proposals/CacheInvalidation for the history of the implementation details.
