#9060 closed defect (fixed)
"no results to fetch" after a ''changeset modified''
Reported by: | Christian Boos | Owned by: | Christian Boos |
---|---|---|---|
Priority: | high | Milestone: | 0.12 |
Component: | version control | Version: | 0.12dev |
Severity: | normal | Keywords: | transaction |
Cc: | Branch: | ||
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Description
Just got this after a svn:log edit (t.e.o running r9188):
Traceback (most recent call last): File "build/bdist.linux-x86_64/egg/trac/admin/console.py", line 107, in onecmd rv = cmd.Cmd.onecmd(self, line) or 0 File "/usr/lib/python2.5/cmd.py", line 218, in onecmd return self.default(line) File "build/bdist.linux-x86_64/egg/trac/admin/console.py", line 251, in default return cmd_mgr.execute_command(*args) File "build/bdist.linux-x86_64/egg/trac/admin/api.py", line 118, in execute_command return f(*fargs) File "build/bdist.linux-x86_64/egg/trac/versioncontrol/admin.py", line 95, in _do_changeset_modified rm.notify('changeset_modified', reponame, revs) File "build/bdist.linux-x86_64/egg/trac/versioncontrol/api.py", line 611, in notify args.append(repos.sync_changeset(rev)) File "build/bdist.linux-x86_64/egg/trac/versioncontrol/cache.py", line 90, in sync_changeset @with_transaction(self.env) File "build/bdist.linux-x86_64/egg/trac/db/util.py", line 35, in transaction_wrapper fn(dbtmp) File "build/bdist.linux-x86_64/egg/trac/versioncontrol/cache.py", line 97, in do_sync for time, author, message in cursor: File "build/bdist.linux-x86_64/egg/trac/db/util.py", line 67, in __iter__ row = self.cursor.fetchone() ProgrammingError: no results to fetch
From #8993, it seems there must have been a db.commit()
after the self.cursor
has been created from the db
and before we call self.cursor.fetchone()
.
Attachments (5)
Change History (38)
follow-up: 2 comment:1 by , 15 years ago
follow-up: 3 comment:2 by , 15 years ago
Replying to rblank:
What happens in the following case:
db1 = self.env.get_db_cnx() db2 = self.env.get_db_cnx()Do we have
db1 is db2
or not?
If we're going through the ConnectionPool, yes.
And for now this is always the case, even when 'poolable' is set to False
.
(double negation omitted ;-) ) this would mean that doing
db2.commit()
would invalidate all cursors created fromdb1
, and therefore "nested" transactions would be an issue.
If the nested transaction recreates a db
, yes. But this is not the intended usage pattern, as the nested with_transaction
should be given the current db
, and then will not manage the transaction and will not try to commit()
.
OTOH, in this particular case, it looks again like someone (ahem) forgot a
break
statement at the end of the loop. Or rather, indented the secondcursor.execute()
one level too many (see [9183/trunk/trac/versioncontrol/cache.py]).
Ah yes, simple indent error… oops ;-)
We really need a test for sync_changeset
, I suppose.
comment:3 by , 15 years ago
Replying to cboos:
If the nested transaction recreates a
db
, yes. But this is not the intended usage pattern, as the nestedwith_transaction
should be given the currentdb
, and then will not manage the transaction and will not try tocommit()
.
This means that we must be careful always to pass the db
argument, and be sure that any function called within a transaction block doesn't re-create a db
. Would it be worth experimenting with some kind of (per-thread) "nesting counting" of transaction blocks, so that only the outermost block actually results in a commit or rollback?
follow-up: 6 comment:4 by , 15 years ago
I suppose this is an explicit vs. implicit thing, where your suggestion would be to add a safeguard against bad usage of the explicit way. But if we have such counting mechanism in place, then why not make the whole thing implicit, so we never have to pass the db
to with_transaction
?
comment:6 by , 15 years ago
Replying to cboos:
But if we have such counting mechanism in place, then why not make the whole thing implicit, so we never have to pass the
db
towith_transaction
?
Yes, that was my next thought, too. I suggested "experimenting" with such a mechanism, because I'm not sure if it will work in all cases, but maybe it's worth a shot.
comment:7 by , 15 years ago
Something along the lines of:
-
trac/db/util.py
diff --git a/trac/db/util.py b/trac/db/util.py
a b 15 15 # 16 16 # Author: Christopher Lenz <cmlenz@gmx.de> 17 17 18 try: 19 import threading 20 except ImportError: 21 import dummy_threading as threading 22 23 _with_transaction_local = threading.local() 24 18 25 def with_transaction(env, db=None): 19 26 """Transaction decorator for simple use-once transactions. 20 27 Will be replaced by a context manager once python 2.4 support is dropped. 21 28 22 29 >>> def api_method(p1, p2): 23 30 >>> result[0] = value1 24 >>> @with_transaction(env , db)31 >>> @with_transaction(env) 25 32 >>> def implementation_method(db): 26 33 >>> # implementation 27 34 >>> result[0] = value2 28 35 """ 29 36 def transaction_wrapper(fn): 30 if db:31 fn( db)37 if hasattr(_with_transaction_local, 'db'): 38 fn(_with_transaction_local.db) 32 39 else: 33 dbtmp= env.get_db_cnx()40 _with_transaction_local.db = db = env.get_db_cnx() 34 41 try: 35 fn(dbtmp) 36 dbtmp.commit() 42 fn(db) 43 db.commit() 44 del _with_transaction_local.db 37 45 except: 38 dbtmp.rollback() 46 db.rollback() 47 del _with_transaction_local.db 39 48 raise 40 49 return transaction_wrapper
All the tests still pass, which doesn't mean the above is correct as we probably need a specific transaction nesting test…
If we go for this, then we should remove the db=None
and all the db
parameters in the calls, of course.
comment:8 by , 15 years ago
Maybe we should think about it some more, but yeah, something like that.
comment:9 by , 15 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
… so that we don't forget to make a decision on comment:7.
by , 15 years ago
Attachment: | with_transaction-r9321.diff added |
---|
with_transaction doesn't take a db parameter anymore, instead the outermost transaction puts the db connection in thread local storage and clears it once the transaction has been either committed or rollbacked.
follow-up: 11 comment:10 by , 15 years ago
Please review with_transaction-r9321.diff.
A consequence of this change is that we would no longer need to pass the db connection around.
Provided we go with the above patch, the question is whether we keep those db=None
parameters for now and simply advertise this usage as being deprecated and unsupported. The problem is that this would give a false sense of security, the calls would still be syntactically correct but semantically wrong, as the called method would trigger a commit.
As this goes towards simplifying the API, I'm supportive of the change and I think we should make the change explicit by removing the db parameters.
follow-up: 12 comment:11 by , 15 years ago
Replying to cboos:
Please review with_transaction-r9321.diff.
Yes, the more I think about it, the more I like automatic transaction control. This should avoid all weird bugs due to "hidden" nested transactions.
As this goes towards simplifying the API, I'm supportive of the change and I think we should make the change explicit by removing the db parameters.
+1, better be explicit about it.
Another possible simplification could be to get a cursor automatically, as all transaction blocks start with the following lines anyway:
@with_transaction(self.env) def do_transaction(db): cursor = db.cursor() ...
So we could do the following instead:
@with_transaction(self.env) def do_transaction(db, cursor): ...
And when we switch to a context manager, it could return a tuple:
with transaction(self.env) as (db, cursor): ...
comment:12 by , 15 years ago
Replying to rblank:
Another possible simplification could be to get a cursor automatically, as all transaction blocks start with …
Good suggestion, I'll do it.
comment:13 by , 15 years ago
On second thought, both of my suggestions turned out to be bad ideas:
- There are many places in the code where a
db
is needed but nocursor
, so it's a waste to always get a cursor inwith_transaction()
. - Removing the
db=None
arguments would break far too much code. Basically, any code using model objects would have to be converted towith_transaction()
. I think this is too much breakage.
Fortunately, there is a simple way to make both variants (explicit db
and implicit (thread-local)) work. So I think we should keep the db=None
argument in with_transaction()
but document that it is intended only for legacy code. Also, we should keep the db=None
argument in model methods that existed in 0.11, but not in new methods, and document that it is deprecated in favor of with_transaction()
.
Patch follows.
by , 15 years ago
Attachment: | 9060-with-transaction-r9348.patch added |
---|
Use implicit nested transactions whenever possible.
follow-up: 15 comment:14 by , 15 years ago
The patch above adds implicit nested transactions, while still allowing explicit nesting through a db
argument. The optional db=None
arguments are left as-is in model methods for attachments, tickets and wiki pages, and the semantics remain correct.
It also adds a nice ThreadLocal
class that allows defining a thread-local storage with default values. This avoids having to resort to hasattr()
/ getattr()
magic due to non-existing attributes, and makes the code more readable. All the users of local
have been changed to use ThreadLocal
instead.
If we go this route, I have two more questions:
- Should we use
with_transaction()
also for read-only DB accesses?
- The cache code could be greatly simplified by removing the
@cached
decorator and theCacheProxy
class and only keeping@cached_value
, as passing thedb
attribute is not necessary anymore, and invalidation usingdel
can simply be nested in awith_transaction()
. Should we do that?
follow-up: 16 comment:15 by , 15 years ago
Replying to rblank:
On second thought, both of my suggestions turned out to be bad ideas:
- There are many places in the code where a
db
is needed but nocursor
, so it's a waste to always get a cursor inwith_transaction()
.
Not necessarily a bad idea, maybe we just need a different method for this.
What about having with_transaction() -> db, cursor
and a new one with_db_for_write() -> db
?
This would lead naturally to:
Replying to rblank:
- Should we use
with_transaction()
also for read-only DB accesses?
with_query() -> db, cursor
and with_db_for_read() -> db
?
- The cache code could be greatly simplified by removing the
@cached
decorator and theCacheProxy
class and only keeping@cached_value
, as passing thedb
attribute is not necessary anymore, and invalidation usingdel
can simply be nested in awith_transaction()
. Should we do that?
Yes, I think so, and after the simplification rename @cached_value -> @cached
?
The patch above adds implicit nested transactions, while still allowing explicit nesting through a
db
argument. The optionaldb=None
arguments are left as-is in model methods for attachments, tickets and wiki pages, and the semantics remain correct.
The patch is nice.
It also adds a nice
ThreadLocal
class that allows defining a thread-local storage with default values. This avoids having to resort tohasattr()
/getattr()
magic due to non-existing attributes, and makes the code more readable. All the users oflocal
have been changed to useThreadLocal
instead.
I recalled a few gc related bugs with subclasses of thread-local, but looking again, it doesn't seem we could be affected here (PythonBug:6990).
comment:16 by , 15 years ago
Replying to cboos:
Not necessarily a bad idea, maybe we just need a different method for this. What about having
with_transaction() -> db, cursor
and a new onewith_db_for_write() -> db
?
I think I prefer keeping a single entry point, it makes for a simpler API. Also, having to write cursor = db.cursor()
at the beginning is not so annoying.
Yes, I think so, and after the simplification rename
@cached_value -> @cached
?
Exactly what I was going to suggest, too.
I have an even better implementation of with_transaction()
that allows nesting uses with explicit db
and implicit db
arbitrarily, and to detect bad uses. Patch coming later tonight.
I recalled a few gc related bugs with subclasses of thread-local, but looking again, it doesn't seem we could be affected here (PythonBug:6990).
I didn't know that. I'll review that bug tonight as well.
by , 15 years ago
Attachment: | 9060-with-transaction-r9351.patch added |
---|
Allow arbitrary nesting of transactions. With unit tests.
comment:17 by , 15 years ago
The patch above has an improved implementation that allows arbitrary nesting of implicit and explicit transactions. It also detects invalid transaction nesting. A new test module adds unit tests for with_transaction()
.
The improvements to the cache module follow.
by , 15 years ago
Attachment: | 9060-cache-simplification.patch added |
---|
Simplification of the cache code, applies on top of 9060-with-transaction-r9351.patch.
comment:18 by , 15 years ago
The patch above simplifies the cache code as suggested in comment:14 and comment:15. Unfortunately, this required converting CachedRepository.sync()
to use with_transaction()
. I believe the semantics are nearly correct, except for a db.rollback()
that had to be kept in there, and would abort an enclosing transaction.
I have added a helper function get_read_db()
in trac.db.util
to get a database connection for reading. It returns the current connection if a transaction is in progress, otherwise it gets a new connection from the environment.
I'm quite happy about the cache simplification. But I'm a bit wary of the sync()
changes, so a review would be most welcome.
follow-up: 20 comment:19 by , 15 years ago
On second thought, maybe it would be simpler to just wrap the content of sync()
in a transaction with an explicit db
:
def sync(self, clean): @with_transaction(self.env, self.env.get_db_cnx()) def do_sync(db): ...
follow-up: 21 comment:20 by , 15 years ago
Replying to rblank:
On second thought, maybe it would be simpler to just wrap the content of
sync()
in a transaction with an explicitdb
:
@with_transaction(self.env, self.env.get_db_cnx())
I don't think this makes any difference in practice, because of the pool and the _active
dictionary: env.get_db_cnx()
gives you back the current connection for the thread, if any.
There's certainly some overlap here with what with_transaction
does, so maybe we could unify both (e.g. make the pool use _transaction_local
instead of the _active
dict).
Also just imagine env.get_db_cnx()
would not already do this, if you could get a different connection in the same thread this would be a good recipe for getting into a deadlock sooner or later…
This doesn't mean we shouldn't have tests which guard against that possibility of course, as we could one day introduce a regression in that area.
follow-up: 22 comment:21 by , 15 years ago
Replying to cboos:
I don't think this makes any difference in practice, because of the pool and the
_active
dictionary:env.get_db_cnx()
gives you back the current connection for the thread, if any.
Sure. But it means fewer changes to sync()
, hence less probability to introduce a bug :)
There's certainly some overlap here with what
with_transaction
does, so maybe we could unify both (e.g. make the pool use_transaction_local
instead of the_active
dict).
I also wondered about that. Could we even drop the pool altogether, and just keep one connection around for each thread in _transaction_local
? Or is that what you meant?
follow-up: 23 comment:22 by , 15 years ago
Replying to rblank:
Replying to cboos:
I don't think this makes any difference in practice, because of the pool and the
_active
dictionary:env.get_db_cnx()
gives you back the current connection for the thread, if any.Sure. But it means fewer changes to
sync()
, hence less probability to introduce a bug :)
Also, doing a full sync in a single transaction would be too problematic (locking the db for a very long time, risk of getting the whole sync wasted in case of an interruption - but sure this could eventually fix #9129 ;-) ).
Or did you intend to keep the explicit commit
in that case?
There's certainly some overlap here with what
with_transaction
does, so maybe we could unify both (e.g. make the pool use_transaction_local
instead of the_active
dict).I also wondered about that. Could we even drop the pool altogether, and just keep one connection around for each thread in
_transaction_local
? Or is that what you meant?
No, I don't think it's a good idea to get rid of the cache and only keep a thread local connection, as we need a way to limit the overall number of opened connections and make it possible to reuse already opened connections. So I only meant to replace the _active
dict inside the PooledConnection with the _transaction_local
- but that could eventually be done in a second step. Let's focus on the first set of patches, I'll do a more thorough review later today.
comment:23 by , 15 years ago
Replying to cboos:
Or did you intend to keep the explicit
commit
in that case?
I intended to keep the code unchanged, but to wrap it in a @with_transaction
so that nested transactions (in the cache code) get the db
reference via _transaction_local
. The commits and rollback would be kept as-is.
No, I don't think it's a good idea to get rid of the cache and only keep a thread local connection, as we need a way to limit the overall number of opened connections and make it possible to reuse already opened connections.
Makes sense.
by , 15 years ago
Attachment: | 9060-cache-simplification-2.patch added |
---|
Same patch, but fewer changes in CachedRepository.sync()
.
comment:24 by , 15 years ago
This patch has the same functionality as the previous, but fewer changes in CachedRepository.sync()
. The former should be better, as it ensures a db.rollback()
is done in case of an error, but the latter is probably less "risky".
Let me know which one you prefer.
comment:25 by , 15 years ago
No forgotten, but I needed to get a few things "off my plate" first. I'll try to review and test ASAP.
follow-up: 30 comment:27 by , 15 years ago
All the tests pass for the 3 database backends.
I've again tried to imagine a way to do without the db argument to with_transaction
but I guess there's really no possibility yet. The next opportunity for simplification will be the move to context managers, and at that time we may change the code more significantly, perhaps by removing the Environment.get_db_cnx()
so that the only way to get a connection would be via with_transaction
(and get_read_db
).
By the way, as both of these functions need an Environment
, we could make them Environment
methods or add methods that would call these functions, in order to keep the bulk of the code in trac.db.util
. That would save the extra import
and be more consistent with the idea of deprecating the get_db_cnx
method in favor of them.
But let's commit 9060-with-transaction-r9351.patch first.
For 9060-cache-simplification.patch, a small update is needed:
-
trac/wiki/model.py
diff --git a/trac/wiki/model.py b/trac/wiki/model.py
a b class WikiPage(object): 177 177 178 178 cursor.execute("UPDATE wiki SET name=%s WHERE name=%s", 179 179 (new_name, old_name)) 180 WikiSystem(self.env).pages.invalidate(db)180 del WikiSystem(self.env).pages 181 181 from trac.attachment import Attachment 182 182 Attachment.reparent_all(self.env, 'wiki', old_name, 183 183 'wiki', new_name, db)
We need to rethink the CachedRepository.sync
method (#9129), but here again, let's do that after the above patch has been committed.
follow-up: 29 comment:28 by , 15 years ago
Ok, I'll commit 9060-with-transaction-r9351.patch, and 9060-cache-simplification.patch with your proposed change (good catch!).
follow-up: 31 comment:30 by , 15 years ago
Both patches committed in [9436].
Replying to cboos:
For 9060-cache-simplification.patch, a small update is needed:
Actually, I had that change already, together with the simplifications in the wiki rename part, in my sandbox. I forgot to post updated patches, sorry about that.
By the way, as both of these functions need an
Environment
, we could make themEnvironment
methods or add methods that would call these functions, in order to keep the bulk of the code intrac.db.util
. That would save the extraimport
and be more consistent with the idea of deprecating theget_db_cnx
method in favor of them.
I thought along the same lines too, but finally preferred separating the database stuff from the environment as much as possible. The Environment
class is already quite an "usine à gaz", so I'd rather remove stuff from there rather than pile up. I do understand the convenience aspect, though, especially for an often-used function like getting a database connection, so I'm ok with adding the two methods if you think it's worth it.
comment:31 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Replying to rblank:
Replying to cboos:
By the way, as both of these functions need an
Environment
, we could make themEnvironment
methods or add methods that would call these functions, in order to keep the bulk of the code intrac.db.util
. That would save the extraimport
and be more consistent with the idea of deprecating theget_db_cnx
method in favor of them.I thought along the same lines too, but finally preferred separating the database stuff from the environment as much as possible. The
Environment
class is already quite an "usine à gaz", so I'd rather remove stuff from there rather than pile up. I do understand the convenience aspect, though, especially for an often-used function like getting a database connection, so I'm ok with adding the two methods if you think it's worth it.
Yes, I think it's worth it. Added this in r9442, and a couple of other changes on this topic done in r9438 and r9439, which concludes this set of db related refactorings.
What happens in the following case:
Do we have
db1 is db2
or not? If not, are the two connections independent or not? If not, this would mean that doingdb2.commit()
would invalidate all cursors created fromdb1
, and therefore "nested" transactions would be an issue.OTOH, in this particular case, it looks again like someone (ahem) forgot a
break
statement at the end of the loop. Or rather, indented the secondcursor.execute()
one level too many (see [9183/trunk/trac/versioncontrol/cache.py]).