#9060 closed defect (fixed)
"no results to fetch" after a ''changeset modified''
| Reported by: | Christian Boos | Owned by: | Christian Boos |
|---|---|---|---|
| Priority: | high | Milestone: | 0.12 |
| Component: | version control | Version: | 0.12dev |
| Severity: | normal | Keywords: | transaction |
| Cc: | Branch: | ||
| Release Notes: | |||
| API Changes: | |||
| Internal Changes: | |||
Description
Just got this after a svn:log edit (t.e.o running r9188):
Traceback (most recent call last):
File "build/bdist.linux-x86_64/egg/trac/admin/console.py", line 107, in onecmd
rv = cmd.Cmd.onecmd(self, line) or 0
File "/usr/lib/python2.5/cmd.py", line 218, in onecmd
return self.default(line)
File "build/bdist.linux-x86_64/egg/trac/admin/console.py", line 251, in default
return cmd_mgr.execute_command(*args)
File "build/bdist.linux-x86_64/egg/trac/admin/api.py", line 118, in execute_command
return f(*fargs)
File "build/bdist.linux-x86_64/egg/trac/versioncontrol/admin.py", line 95, in _do_changeset_modified
rm.notify('changeset_modified', reponame, revs)
File "build/bdist.linux-x86_64/egg/trac/versioncontrol/api.py", line 611, in notify
args.append(repos.sync_changeset(rev))
File "build/bdist.linux-x86_64/egg/trac/versioncontrol/cache.py", line 90, in sync_changeset
@with_transaction(self.env)
File "build/bdist.linux-x86_64/egg/trac/db/util.py", line 35, in transaction_wrapper
fn(dbtmp)
File "build/bdist.linux-x86_64/egg/trac/versioncontrol/cache.py", line 97, in do_sync
for time, author, message in cursor:
File "build/bdist.linux-x86_64/egg/trac/db/util.py", line 67, in __iter__
row = self.cursor.fetchone()
ProgrammingError: no results to fetch
From #8993, it seems there must have been a db.commit() after the self.cursor has been created from the db and before we call self.cursor.fetchone().
Attachments (5)
Change History (38)
follow-up: 2 comment:1 by , 16 years ago
follow-up: 3 comment:2 by , 16 years ago
Replying to rblank:
What happens in the following case:
db1 = self.env.get_db_cnx() db2 = self.env.get_db_cnx()Do we have
db1 is db2or not?
If we're going through the ConnectionPool, yes.
And for now this is always the case, even when 'poolable' is set to False.
(double negation omitted ;-) ) this would mean that doing
db2.commit()would invalidate all cursors created fromdb1, and therefore "nested" transactions would be an issue.
If the nested transaction recreates a db, yes. But this is not the intended usage pattern, as the nested with_transaction should be given the current db, and then will not manage the transaction and will not try to commit().
OTOH, in this particular case, it looks again like someone (ahem) forgot a
breakstatement at the end of the loop. Or rather, indented the secondcursor.execute()one level too many (see [9183/trunk/trac/versioncontrol/cache.py]).
Ah yes, simple indent error… oops ;-)
We really need a test for sync_changeset, I suppose.
comment:3 by , 16 years ago
Replying to cboos:
If the nested transaction recreates a
db, yes. But this is not the intended usage pattern, as the nestedwith_transactionshould be given the currentdb, and then will not manage the transaction and will not try tocommit().
This means that we must be careful always to pass the db argument, and be sure that any function called within a transaction block doesn't re-create a db. Would it be worth experimenting with some kind of (per-thread) "nesting counting" of transaction blocks, so that only the outermost block actually results in a commit or rollback?
follow-up: 6 comment:4 by , 16 years ago
I suppose this is an explicit vs. implicit thing, where your suggestion would be to add a safeguard against bad usage of the explicit way. But if we have such counting mechanism in place, then why not make the whole thing implicit, so we never have to pass the db to with_transaction?
comment:6 by , 16 years ago
Replying to cboos:
But if we have such counting mechanism in place, then why not make the whole thing implicit, so we never have to pass the
dbtowith_transaction?
Yes, that was my next thought, too. I suggested "experimenting" with such a mechanism, because I'm not sure if it will work in all cases, but maybe it's worth a shot.
comment:7 by , 16 years ago
Something along the lines of:
-
trac/db/util.py
diff --git a/trac/db/util.py b/trac/db/util.py
a b 15 15 # 16 16 # Author: Christopher Lenz <cmlenz@gmx.de> 17 17 18 try: 19 import threading 20 except ImportError: 21 import dummy_threading as threading 22 23 _with_transaction_local = threading.local() 24 18 25 def with_transaction(env, db=None): 19 26 """Transaction decorator for simple use-once transactions. 20 27 Will be replaced by a context manager once python 2.4 support is dropped. 21 28 22 29 >>> def api_method(p1, p2): 23 30 >>> result[0] = value1 24 >>> @with_transaction(env , db)31 >>> @with_transaction(env) 25 32 >>> def implementation_method(db): 26 33 >>> # implementation 27 34 >>> result[0] = value2 28 35 """ 29 36 def transaction_wrapper(fn): 30 if db:31 fn( db)37 if hasattr(_with_transaction_local, 'db'): 38 fn(_with_transaction_local.db) 32 39 else: 33 dbtmp= env.get_db_cnx()40 _with_transaction_local.db = db = env.get_db_cnx() 34 41 try: 35 fn(dbtmp) 36 dbtmp.commit() 42 fn(db) 43 db.commit() 44 del _with_transaction_local.db 37 45 except: 38 dbtmp.rollback() 46 db.rollback() 47 del _with_transaction_local.db 39 48 raise 40 49 return transaction_wrapper
All the tests still pass, which doesn't mean the above is correct as we probably need a specific transaction nesting test…
If we go for this, then we should remove the db=None and all the db parameters in the calls, of course.
comment:8 by , 16 years ago
Maybe we should think about it some more, but yeah, something like that.
comment:9 by , 16 years ago
| Resolution: | fixed |
|---|---|
| Status: | closed → reopened |
… so that we don't forget to make a decision on comment:7.
by , 16 years ago
| Attachment: | with_transaction-r9321.diff added |
|---|
with_transaction doesn't take a db parameter anymore, instead the outermost transaction puts the db connection in thread local storage and clears it once the transaction has been either committed or rollbacked.
follow-up: 11 comment:10 by , 16 years ago
Please review with_transaction-r9321.diff.
A consequence of this change is that we would no longer need to pass the db connection around.
Provided we go with the above patch, the question is whether we keep those db=None parameters for now and simply advertise this usage as being deprecated and unsupported. The problem is that this would give a false sense of security, the calls would still be syntactically correct but semantically wrong, as the called method would trigger a commit.
As this goes towards simplifying the API, I'm supportive of the change and I think we should make the change explicit by removing the db parameters.
follow-up: 12 comment:11 by , 16 years ago
Replying to cboos:
Please review with_transaction-r9321.diff.
Yes, the more I think about it, the more I like automatic transaction control. This should avoid all weird bugs due to "hidden" nested transactions.
As this goes towards simplifying the API, I'm supportive of the change and I think we should make the change explicit by removing the db parameters.
+1, better be explicit about it.
Another possible simplification could be to get a cursor automatically, as all transaction blocks start with the following lines anyway:
@with_transaction(self.env) def do_transaction(db): cursor = db.cursor() ...
So we could do the following instead:
@with_transaction(self.env) def do_transaction(db, cursor): ...
And when we switch to a context manager, it could return a tuple:
with transaction(self.env) as (db, cursor): ...
comment:12 by , 16 years ago
Replying to rblank:
Another possible simplification could be to get a cursor automatically, as all transaction blocks start with …
Good suggestion, I'll do it.
comment:13 by , 16 years ago
On second thought, both of my suggestions turned out to be bad ideas:
- There are many places in the code where a
dbis needed but nocursor, so it's a waste to always get a cursor inwith_transaction(). - Removing the
db=Nonearguments would break far too much code. Basically, any code using model objects would have to be converted towith_transaction(). I think this is too much breakage.
Fortunately, there is a simple way to make both variants (explicit db and implicit (thread-local)) work. So I think we should keep the db=None argument in with_transaction() but document that it is intended only for legacy code. Also, we should keep the db=None argument in model methods that existed in 0.11, but not in new methods, and document that it is deprecated in favor of with_transaction().
Patch follows.
by , 16 years ago
| Attachment: | 9060-with-transaction-r9348.patch added |
|---|
Use implicit nested transactions whenever possible.
follow-up: 15 comment:14 by , 16 years ago
The patch above adds implicit nested transactions, while still allowing explicit nesting through a db argument. The optional db=None arguments are left as-is in model methods for attachments, tickets and wiki pages, and the semantics remain correct.
It also adds a nice ThreadLocal class that allows defining a thread-local storage with default values. This avoids having to resort to hasattr() / getattr() magic due to non-existing attributes, and makes the code more readable. All the users of local have been changed to use ThreadLocal instead.
If we go this route, I have two more questions:
- Should we use
with_transaction()also for read-only DB accesses?
- The cache code could be greatly simplified by removing the
@cacheddecorator and theCacheProxyclass and only keeping@cached_value, as passing thedbattribute is not necessary anymore, and invalidation usingdelcan simply be nested in awith_transaction(). Should we do that?
follow-up: 16 comment:15 by , 16 years ago
Replying to rblank:
On second thought, both of my suggestions turned out to be bad ideas:
- There are many places in the code where a
dbis needed but nocursor, so it's a waste to always get a cursor inwith_transaction().
Not necessarily a bad idea, maybe we just need a different method for this.
What about having with_transaction() -> db, cursor and a new one with_db_for_write() -> db?
This would lead naturally to:
Replying to rblank:
- Should we use
with_transaction()also for read-only DB accesses?
with_query() -> db, cursor and with_db_for_read() -> db?
- The cache code could be greatly simplified by removing the
@cacheddecorator and theCacheProxyclass and only keeping@cached_value, as passing thedbattribute is not necessary anymore, and invalidation usingdelcan simply be nested in awith_transaction(). Should we do that?
Yes, I think so, and after the simplification rename @cached_value -> @cached?
The patch above adds implicit nested transactions, while still allowing explicit nesting through a
dbargument. The optionaldb=Nonearguments are left as-is in model methods for attachments, tickets and wiki pages, and the semantics remain correct.
The patch is nice.
It also adds a nice
ThreadLocalclass that allows defining a thread-local storage with default values. This avoids having to resort tohasattr()/getattr()magic due to non-existing attributes, and makes the code more readable. All the users oflocalhave been changed to useThreadLocalinstead.
I recalled a few gc related bugs with subclasses of thread-local, but looking again, it doesn't seem we could be affected here (PythonBug:6990).
comment:16 by , 16 years ago
Replying to cboos:
Not necessarily a bad idea, maybe we just need a different method for this. What about having
with_transaction() -> db, cursorand a new onewith_db_for_write() -> db?
I think I prefer keeping a single entry point, it makes for a simpler API. Also, having to write cursor = db.cursor() at the beginning is not so annoying.
Yes, I think so, and after the simplification rename
@cached_value -> @cached?
Exactly what I was going to suggest, too.
I have an even better implementation of with_transaction() that allows nesting uses with explicit db and implicit db arbitrarily, and to detect bad uses. Patch coming later tonight.
I recalled a few gc related bugs with subclasses of thread-local, but looking again, it doesn't seem we could be affected here (PythonBug:6990).
I didn't know that. I'll review that bug tonight as well.
by , 16 years ago
| Attachment: | 9060-with-transaction-r9351.patch added |
|---|
Allow arbitrary nesting of transactions. With unit tests.
comment:17 by , 16 years ago
The patch above has an improved implementation that allows arbitrary nesting of implicit and explicit transactions. It also detects invalid transaction nesting. A new test module adds unit tests for with_transaction().
The improvements to the cache module follow.
by , 16 years ago
| Attachment: | 9060-cache-simplification.patch added |
|---|
Simplification of the cache code, applies on top of 9060-with-transaction-r9351.patch.
comment:18 by , 16 years ago
The patch above simplifies the cache code as suggested in comment:14 and comment:15. Unfortunately, this required converting CachedRepository.sync() to use with_transaction(). I believe the semantics are nearly correct, except for a db.rollback() that had to be kept in there, and would abort an enclosing transaction.
I have added a helper function get_read_db() in trac.db.util to get a database connection for reading. It returns the current connection if a transaction is in progress, otherwise it gets a new connection from the environment.
I'm quite happy about the cache simplification. But I'm a bit wary of the sync() changes, so a review would be most welcome.
follow-up: 20 comment:19 by , 16 years ago
On second thought, maybe it would be simpler to just wrap the content of sync() in a transaction with an explicit db:
def sync(self, clean): @with_transaction(self.env, self.env.get_db_cnx()) def do_sync(db): ...
follow-up: 21 comment:20 by , 16 years ago
Replying to rblank:
On second thought, maybe it would be simpler to just wrap the content of
sync()in a transaction with an explicitdb:
@with_transaction(self.env, self.env.get_db_cnx())
I don't think this makes any difference in practice, because of the pool and the _active dictionary: env.get_db_cnx() gives you back the current connection for the thread, if any.
There's certainly some overlap here with what with_transaction does, so maybe we could unify both (e.g. make the pool use _transaction_local instead of the _active dict).
Also just imagine env.get_db_cnx() would not already do this, if you could get a different connection in the same thread this would be a good recipe for getting into a deadlock sooner or later…
This doesn't mean we shouldn't have tests which guard against that possibility of course, as we could one day introduce a regression in that area.
follow-up: 22 comment:21 by , 16 years ago
Replying to cboos:
I don't think this makes any difference in practice, because of the pool and the
_activedictionary:env.get_db_cnx()gives you back the current connection for the thread, if any.
Sure. But it means fewer changes to sync(), hence less probability to introduce a bug :)
There's certainly some overlap here with what
with_transactiondoes, so maybe we could unify both (e.g. make the pool use_transaction_localinstead of the_activedict).
I also wondered about that. Could we even drop the pool altogether, and just keep one connection around for each thread in _transaction_local? Or is that what you meant?
follow-up: 23 comment:22 by , 16 years ago
Replying to rblank:
Replying to cboos:
I don't think this makes any difference in practice, because of the pool and the
_activedictionary:env.get_db_cnx()gives you back the current connection for the thread, if any.Sure. But it means fewer changes to
sync(), hence less probability to introduce a bug :)
Also, doing a full sync in a single transaction would be too problematic (locking the db for a very long time, risk of getting the whole sync wasted in case of an interruption - but sure this could eventually fix #9129 ;-) ).
Or did you intend to keep the explicit commit in that case?
There's certainly some overlap here with what
with_transactiondoes, so maybe we could unify both (e.g. make the pool use_transaction_localinstead of the_activedict).I also wondered about that. Could we even drop the pool altogether, and just keep one connection around for each thread in
_transaction_local? Or is that what you meant?
No, I don't think it's a good idea to get rid of the cache and only keep a thread local connection, as we need a way to limit the overall number of opened connections and make it possible to reuse already opened connections. So I only meant to replace the _active dict inside the PooledConnection with the _transaction_local - but that could eventually be done in a second step. Let's focus on the first set of patches, I'll do a more thorough review later today.
comment:23 by , 16 years ago
Replying to cboos:
Or did you intend to keep the explicit
commitin that case?
I intended to keep the code unchanged, but to wrap it in a @with_transaction so that nested transactions (in the cache code) get the db reference via _transaction_local. The commits and rollback would be kept as-is.
No, I don't think it's a good idea to get rid of the cache and only keep a thread local connection, as we need a way to limit the overall number of opened connections and make it possible to reuse already opened connections.
Makes sense.
by , 16 years ago
| Attachment: | 9060-cache-simplification-2.patch added |
|---|
Same patch, but fewer changes in CachedRepository.sync().
comment:24 by , 16 years ago
This patch has the same functionality as the previous, but fewer changes in CachedRepository.sync(). The former should be better, as it ensures a db.rollback() is done in case of an error, but the latter is probably less "risky".
Let me know which one you prefer.
comment:25 by , 16 years ago
No forgotten, but I needed to get a few things "off my plate" first. I'll try to review and test ASAP.
follow-up: 30 comment:27 by , 16 years ago
All the tests pass for the 3 database backends.
I've again tried to imagine a way to do without the db argument to with_transaction but I guess there's really no possibility yet. The next opportunity for simplification will be the move to context managers, and at that time we may change the code more significantly, perhaps by removing the Environment.get_db_cnx() so that the only way to get a connection would be via with_transaction (and get_read_db).
By the way, as both of these functions need an Environment, we could make them Environment methods or add methods that would call these functions, in order to keep the bulk of the code in trac.db.util. That would save the extra import and be more consistent with the idea of deprecating the get_db_cnx method in favor of them.
But let's commit 9060-with-transaction-r9351.patch first.
For 9060-cache-simplification.patch, a small update is needed:
-
trac/wiki/model.py
diff --git a/trac/wiki/model.py b/trac/wiki/model.py
a b class WikiPage(object): 177 177 178 178 cursor.execute("UPDATE wiki SET name=%s WHERE name=%s", 179 179 (new_name, old_name)) 180 WikiSystem(self.env).pages.invalidate(db)180 del WikiSystem(self.env).pages 181 181 from trac.attachment import Attachment 182 182 Attachment.reparent_all(self.env, 'wiki', old_name, 183 183 'wiki', new_name, db)
We need to rethink the CachedRepository.sync method (#9129), but here again, let's do that after the above patch has been committed.
follow-up: 29 comment:28 by , 16 years ago
Ok, I'll commit 9060-with-transaction-r9351.patch, and 9060-cache-simplification.patch with your proposed change (good catch!).
follow-up: 31 comment:30 by , 16 years ago
Both patches committed in [9436].
Replying to cboos:
For 9060-cache-simplification.patch, a small update is needed:
Actually, I had that change already, together with the simplifications in the wiki rename part, in my sandbox. I forgot to post updated patches, sorry about that.
By the way, as both of these functions need an
Environment, we could make themEnvironmentmethods or add methods that would call these functions, in order to keep the bulk of the code intrac.db.util. That would save the extraimportand be more consistent with the idea of deprecating theget_db_cnxmethod in favor of them.
I thought along the same lines too, but finally preferred separating the database stuff from the environment as much as possible. The Environment class is already quite an "usine à gaz", so I'd rather remove stuff from there rather than pile up. I do understand the convenience aspect, though, especially for an often-used function like getting a database connection, so I'm ok with adding the two methods if you think it's worth it.
comment:31 by , 16 years ago
| Resolution: | → fixed |
|---|---|
| Status: | reopened → closed |
Replying to rblank:
Replying to cboos:
By the way, as both of these functions need an
Environment, we could make themEnvironmentmethods or add methods that would call these functions, in order to keep the bulk of the code intrac.db.util. That would save the extraimportand be more consistent with the idea of deprecating theget_db_cnxmethod in favor of them.I thought along the same lines too, but finally preferred separating the database stuff from the environment as much as possible. The
Environmentclass is already quite an "usine à gaz", so I'd rather remove stuff from there rather than pile up. I do understand the convenience aspect, though, especially for an often-used function like getting a database connection, so I'm ok with adding the two methods if you think it's worth it.
Yes, I think it's worth it. Added this in r9442, and a couple of other changes on this topic done in r9438 and r9439, which concludes this set of db related refactorings.



What happens in the following case:
Do we have
db1 is db2or not? If not, are the two connections independent or not? If not, this would mean that doingdb2.commit()would invalidate all cursors created fromdb1, and therefore "nested" transactions would be an issue.OTOH, in this particular case, it looks again like someone (ahem) forgot a
breakstatement at the end of the loop. Or rather, indented the secondcursor.execute()one level too many (see [9183/trunk/trac/versioncontrol/cache.py]).