#11164 closed defect (fixed)
Git persistent cache not refreshed on changeset added notification from another process
Reported by: | Owned by: | Jun Omae | |
---|---|---|---|
Priority: | normal | Milestone: | 1.0.2 |
Component: | plugin/git | Version: | 1.0.1 |
Severity: | normal | Keywords: | |
Cc: | Peter Suter, Jun Omae | Branch: | |
Release Notes: |
Fix be not refreshing Git repository with persistent cache if it has been changed. |
||
API Changes: | |||
Internal Changes: |
Description
With git persistent_cache enabled and repository sync_per_request empty, the persistent cache isn't refreshed after a changeset added notification is dispatched from another process. This is problematic because changeset added notifications are normally dispatched from a Git post-receive hook, which is obviously in a different process to the process(es) running Trac.
Str:
- Configure a Git repository, set repository sync_per_request to empty, enable git persistent_cache and disable git cached_repository.
- Start an instance of tracd.
- Open the Trac revision log.
- Result (correct): The latest revision is displayed at the top.
- Commit something to the Git repository.
- Open the Trac revision log.
- Expected: The latest revision should be displayed at the top.
- Actual: The revision displayed at the top is the same as in step 3.
Unless I'm missing something, this makes (unfortunate) sense looking at the code. Calling sync() on the repository (and thus the underlying PyGIT.Storage) would refresh the cache, but it has no way of knowing that sync() was called from another process in response to a changeset added notification.
This gets even more interesting if you enable git cached_repository. In that case, GitCachedRepository knows about the new revision - even invalidation of the cached metadata is handled correctly cross-process via the cache db table - but it runs into trouble when it gets to the underlying PyGIT.Storage instance.
I guess one way to solve this is to make GitRepository use a cached property somewhere so it knows when to resync the underlying PyGIT.Storage instance. I'm not quite sure on specifics, though. :)
Attachments (2)
Change History (20)
comment:1 by , 12 years ago
Milestone: | → next-stable-1.0.x |
---|
by , 12 years ago
Attachment: | 11164.patch added |
---|
Patch which refreshes the persistent cache if necessary.
comment:2 by , 12 years ago
This patch calls sync() on the PyGIT.Storage instance when retrieving it if it has been cached persistently. This will cause one call to git to get the youngest revision for each request, but it's still a lot faster than rebuilding the entire cache if the youngest revision hasn't changed.
It might still be nicer to use a cached property, perhaps for the youngest revision, to avoid even this one short call to git on every request. I'm not quite sure how to manage that yet, though.
comment:3 by , 12 years ago
Milestone: | next-stable-1.0.x → 1.0.2 |
---|---|
Owner: | set to |
Status: | new → assigned |
Thanks for the patch! Looks fine, AFAICT. Jun, any opinion on this?
follow-up: 6 comment:5 by , 12 years ago
Hmm, after the patch, Trac synchronizes all git repositories on every request. That is the same to specify the repositories in [trac] repository_sync_per_request
(workaround) and would make no sense to enable [git] persistent_cache
when especially it has many git repositories….
I think that we should use @cached
for the persistent_cache
feature like CachedRepository.metadata
and invalidate the caches.
comment:6 by , 12 years ago
Replying to jomae:
Hmm, after the patch, Trac synchronizes all git repositories on every request. That is the same to specify the repositories in
[trac] repository_sync_per_request
Oops. I guess it is at that. I figured it was still slightly better if cached_repository is enabled, as that way, sync() isn't being needlessly called on GitCachedRepository
. However, I guess that is practically a no-op anyway if it's already been synced.
I think that we should use
@cached
for thepersistent_cache
feature likeCachedRepository.metadata
and invalidate the caches.
The tricky part is what property to cache, what id to give it and how to trigger a resync when that property changes. Ideally, we just want to know when sync() has been called elsewhere, but the only way I can think of to do that with a property is pretty hacky (an arbitrary counter or timestamp which just keeps increasing). Another possibility is to cache the youngest rev and call sync() when fetching it, though I'm not sure if this might cause problems.
comment:7 by , 11 years ago
Milestone: | 1.0.2 → next-stable-1.0.x |
---|---|
Owner: | removed |
Status: | assigned → new |
Postponing, as no patch is ready.
by , 11 years ago
Attachment: | 11164.2.patch added |
---|
Patch which uses a cached property to ensure the persistent cache is only updated when Trac is notified of a repository change.
comment:8 by , 11 years ago
This version of the patch uses @cached to ensure that the repo is only synced if Trac was notified of a repo change; i.e. sync() was called explicitly elsewhere. I'm caching the youngest revision and calling sync() if the cache is invalidated. The cached property is only used if persistent_cache is enabled.
In order for arbitrary changesets to work where the youngest rev isn't fetched, I have to call the cached property in the constructor so the repo will be synced if the cache was invalidated. This is a bit ugly, but I can't come up with anything more elegant.
comment:9 by , 11 years ago
Milestone: | next-stable-1.0.x → 1.0.2 |
---|---|
Owner: | set to |
Status: | new → assigned |
Thanks for the patch! It seems good. I'll try it.
comment:10 by , 11 years ago
I've revised the patch and fixed another issue in log:jomae.git:ticket11164.
When a branch is removed or a non-latest commit is pushed, GitRepository.sync
does nothing cause it uses result of git rev-list --max-count=1 --topo-order --all
.
comment:11 by , 11 years ago
Thanks. Using the rev cache as the cached property is certainly a lot cleaner!
comment:12 by , 11 years ago
Release Notes: | modified (diff) |
---|---|
Resolution: | → fixed |
Status: | assigned → closed |
Committed in [12357-12358] into 1.0-stable and merged in [12359]. If you encounter it again, please feel free to reopen this issue. Thanks.
comment:13 by , 11 years ago
Cc: | added |
---|---|
Owner: | changed from | to
comment:14 by , 11 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Oops. My changes are broken. If persistent_cache = disable
and visiting browser page after something is committed, the page shows the revision before the commits. I'll rework or revert.
comment:15 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | reopened → assigned |
follow-up: 17 comment:16 by , 11 years ago
I just reworked. Proposed fix with unit tests is in log:jomae.git:ticket11164.1. Also, if persistent_cache
is turned off, GitRepository
would ignore new revisions. The branch includes fix for it.
I'll push later.
comment:17 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Committed in [12479:12480] and merged to trunk in [12481].
Sorry, not much experience here on cached git repositories. PatchWelcome.