Edgewall Software
Modify

Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#9060 closed defect (fixed)

"no results to fetch" after a ''changeset modified''

Reported by: Christian Boos Owned by: Christian Boos
Priority: high Milestone: 0.12
Component: version control Version: 0.12dev
Severity: normal Keywords: transaction
Cc: Branch:
Release Notes:
API Changes:
Internal Changes:

Description

Just got this after a svn:log edit (t.e.o running r9188):

Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/trac/admin/console.py", line 107, in onecmd
    rv = cmd.Cmd.onecmd(self, line) or 0
  File "/usr/lib/python2.5/cmd.py", line 218, in onecmd
    return self.default(line)
  File "build/bdist.linux-x86_64/egg/trac/admin/console.py", line 251, in default
    return cmd_mgr.execute_command(*args)
  File "build/bdist.linux-x86_64/egg/trac/admin/api.py", line 118, in execute_command
    return f(*fargs)
  File "build/bdist.linux-x86_64/egg/trac/versioncontrol/admin.py", line 95, in _do_changeset_modified
    rm.notify('changeset_modified', reponame, revs)
  File "build/bdist.linux-x86_64/egg/trac/versioncontrol/api.py", line 611, in notify
    args.append(repos.sync_changeset(rev))
  File "build/bdist.linux-x86_64/egg/trac/versioncontrol/cache.py", line 90, in sync_changeset
    @with_transaction(self.env)
  File "build/bdist.linux-x86_64/egg/trac/db/util.py", line 35, in transaction_wrapper
    fn(dbtmp)
  File "build/bdist.linux-x86_64/egg/trac/versioncontrol/cache.py", line 97, in do_sync
    for time, author, message in cursor:
  File "build/bdist.linux-x86_64/egg/trac/db/util.py", line 67, in __iter__
    row = self.cursor.fetchone()
ProgrammingError: no results to fetch

From #8993, it seems there must have been a db.commit() after the self.cursor has been created from the db and before we call self.cursor.fetchone().

Attachments (5)

with_transaction-r9321.diff (10.8 KB ) - added by Christian Boos 12 years ago.
with_transaction doesn't take a db parameter anymore, instead the outermost transaction puts the db connection in thread local storage and clears it once the transaction has been either committed or rollbacked.
9060-with-transaction-r9348.patch (31.4 KB ) - added by Remy Blank 12 years ago.
Use implicit nested transactions whenever possible.
9060-with-transaction-r9351.patch (44.2 KB ) - added by Remy Blank 12 years ago.
Allow arbitrary nesting of transactions. With unit tests.
9060-cache-simplification.patch (27.0 KB ) - added by Remy Blank 12 years ago.
Simplification of the cache code, applies on top of 9060-with-transaction-r9351.patch.
9060-cache-simplification-2.patch (30.7 KB ) - added by Remy Blank 12 years ago.
Same patch, but fewer changes in CachedRepository.sync().

Download all attachments as: .zip

Change History (38)

comment:1 by Remy Blank, 12 years ago

What happens in the following case:

db1 = self.env.get_db_cnx()
db2 = self.env.get_db_cnx()

Do we have db1 is db2 or not? If not, are the two connections independent or not? If not, this would mean that doing db2.commit() would invalidate all cursors created from db1, and therefore "nested" transactions would be an issue.

OTOH, in this particular case, it looks again like someone (ahem) forgot a break statement at the end of the loop. Or rather, indented the second cursor.execute() one level too many (see [9183/trunk/trac/versioncontrol/cache.py]).

in reply to:  1 ; comment:2 by Christian Boos, 12 years ago

Replying to rblank:

What happens in the following case:

db1 = self.env.get_db_cnx()
db2 = self.env.get_db_cnx()

Do we have db1 is db2 or not?

If we're going through the ConnectionPool, yes. And for now this is always the case, even when 'poolable' is set to False.

(double negation omitted ;-) ) this would mean that doing db2.commit() would invalidate all cursors created from db1, and therefore "nested" transactions would be an issue.

If the nested transaction recreates a db, yes. But this is not the intended usage pattern, as the nested with_transaction should be given the current db, and then will not manage the transaction and will not try to commit().

OTOH, in this particular case, it looks again like someone (ahem) forgot a break statement at the end of the loop. Or rather, indented the second cursor.execute() one level too many (see [9183/trunk/trac/versioncontrol/cache.py]).

Ah yes, simple indent error… oops ;-)

We really need a test for sync_changeset, I suppose.

in reply to:  2 comment:3 by Remy Blank, 12 years ago

Replying to cboos:

If the nested transaction recreates a db, yes. But this is not the intended usage pattern, as the nested with_transaction should be given the current db, and then will not manage the transaction and will not try to commit().

This means that we must be careful always to pass the db argument, and be sure that any function called within a transaction block doesn't re-create a db. Would it be worth experimenting with some kind of (per-thread) "nesting counting" of transaction blocks, so that only the outermost block actually results in a commit or rollback?

comment:4 by Christian Boos, 12 years ago

I suppose this is an explicit vs. implicit thing, where your suggestion would be to add a safeguard against bad usage of the explicit way. But if we have such counting mechanism in place, then why not make the whole thing implicit, so we never have to pass the db to with_transaction?

comment:5 by Christian Boos, 12 years ago

Resolution: fixed
Status: newclosed

Fixed in r9191, with a test.

in reply to:  4 comment:6 by Remy Blank, 12 years ago

Replying to cboos:

But if we have such counting mechanism in place, then why not make the whole thing implicit, so we never have to pass the db to with_transaction?

Yes, that was my next thought, too. I suggested "experimenting" with such a mechanism, because I'm not sure if it will work in all cases, but maybe it's worth a shot.

comment:7 by Christian Boos, 12 years ago

Something along the lines of:

  • trac/db/util.py

    diff --git a/trac/db/util.py b/trac/db/util.py
    a b  
    1515#
    1616# Author: Christopher Lenz <cmlenz@gmx.de>
    1717
     18try:
     19    import threading
     20except ImportError:
     21    import dummy_threading as threading
     22
     23_with_transaction_local = threading.local()
     24
    1825def with_transaction(env, db=None):
    1926    """Transaction decorator for simple use-once transactions.
    2027    Will be replaced by a context manager once python 2.4 support is dropped.
    2128
    2229    >>> def api_method(p1, p2):
    2330    >>>     result[0] = value1
    24     >>>     @with_transaction(env, db)
     31    >>>     @with_transaction(env)
    2532    >>>     def implementation_method(db):
    2633    >>>         # implementation
    2734    >>>         result[0] = value2
    2835    """
    2936    def transaction_wrapper(fn):
    30         if db:
    31             fn(db)
     37        if hasattr(_with_transaction_local, 'db'):
     38            fn(_with_transaction_local.db)
    3239        else:
    33             dbtmp = env.get_db_cnx()
     40            _with_transaction_local.db = db = env.get_db_cnx()
    3441            try:
    35                 fn(dbtmp)
    36                 dbtmp.commit()
     42                fn(db)
     43                db.commit()
     44                del _with_transaction_local.db
    3745            except:
    38                 dbtmp.rollback()
     46                db.rollback()
     47                del _with_transaction_local.db
    3948                raise
    4049    return transaction_wrapper

All the tests still pass, which doesn't mean the above is correct as we probably need a specific transaction nesting test…

If we go for this, then we should remove the db=None and all the db parameters in the calls, of course.

comment:8 by Remy Blank, 12 years ago

Maybe we should think about it some more, but yeah, something like that.

comment:9 by Christian Boos, 12 years ago

Resolution: fixed
Status: closedreopened

… so that we don't forget to make a decision on comment:7.

by Christian Boos, 12 years ago

Attachment: with_transaction-r9321.diff added

with_transaction doesn't take a db parameter anymore, instead the outermost transaction puts the db connection in thread local storage and clears it once the transaction has been either committed or rollbacked.

comment:10 by Christian Boos, 12 years ago

Please review with_transaction-r9321.diff.

A consequence of this change is that we would no longer need to pass the db connection around.

Provided we go with the above patch, the question is whether we keep those db=None parameters for now and simply advertise this usage as being deprecated and unsupported. The problem is that this would give a false sense of security, the calls would still be syntactically correct but semantically wrong, as the called method would trigger a commit.

As this goes towards simplifying the API, I'm supportive of the change and I think we should make the change explicit by removing the db parameters.

in reply to:  10 ; comment:11 by Remy Blank, 12 years ago

Replying to cboos:

Please review with_transaction-r9321.diff.

Yes, the more I think about it, the more I like automatic transaction control. This should avoid all weird bugs due to "hidden" nested transactions.

As this goes towards simplifying the API, I'm supportive of the change and I think we should make the change explicit by removing the db parameters.

+1, better be explicit about it.

Another possible simplification could be to get a cursor automatically, as all transaction blocks start with the following lines anyway:

@with_transaction(self.env)
def do_transaction(db):
    cursor = db.cursor()
    ...

So we could do the following instead:

@with_transaction(self.env)
def do_transaction(db, cursor):
    ...

And when we switch to a context manager, it could return a tuple:

with transaction(self.env) as (db, cursor):
    ...

in reply to:  11 comment:12 by Christian Boos, 12 years ago

Replying to rblank:

Another possible simplification could be to get a cursor automatically, as all transaction blocks start with …

Good suggestion, I'll do it.

comment:13 by Remy Blank, 12 years ago

On second thought, both of my suggestions turned out to be bad ideas:

  • There are many places in the code where a db is needed but no cursor, so it's a waste to always get a cursor in with_transaction().
  • Removing the db=None arguments would break far too much code. Basically, any code using model objects would have to be converted to with_transaction(). I think this is too much breakage.

Fortunately, there is a simple way to make both variants (explicit db and implicit (thread-local)) work. So I think we should keep the db=None argument in with_transaction() but document that it is intended only for legacy code. Also, we should keep the db=None argument in model methods that existed in 0.11, but not in new methods, and document that it is deprecated in favor of with_transaction().

Patch follows.

by Remy Blank, 12 years ago

Use implicit nested transactions whenever possible.

comment:14 by Remy Blank, 12 years ago

The patch above adds implicit nested transactions, while still allowing explicit nesting through a db argument. The optional db=None arguments are left as-is in model methods for attachments, tickets and wiki pages, and the semantics remain correct.

It also adds a nice ThreadLocal class that allows defining a thread-local storage with default values. This avoids having to resort to hasattr() / getattr() magic due to non-existing attributes, and makes the code more readable. All the users of local have been changed to use ThreadLocal instead.

If we go this route, I have two more questions:

  • Should we use with_transaction() also for read-only DB accesses?
  • The cache code could be greatly simplified by removing the @cached decorator and the CacheProxy class and only keeping @cached_value, as passing the db attribute is not necessary anymore, and invalidation using del can simply be nested in a with_transaction(). Should we do that?

in reply to:  14 ; comment:15 by Christian Boos, 12 years ago

Replying to rblank:

On second thought, both of my suggestions turned out to be bad ideas:

  • There are many places in the code where a db is needed but no cursor, so it's a waste to always get a cursor in with_transaction().

Not necessarily a bad idea, maybe we just need a different method for this. What about having with_transaction() -> db, cursor and a new one with_db_for_write() -> db?

This would lead naturally to:

Replying to rblank:

  • Should we use with_transaction() also for read-only DB accesses?

with_query() -> db, cursor and with_db_for_read() -> db?

  • The cache code could be greatly simplified by removing the @cached decorator and the CacheProxy class and only keeping @cached_value, as passing the db attribute is not necessary anymore, and invalidation using del can simply be nested in a with_transaction(). Should we do that?

Yes, I think so, and after the simplification rename @cached_value -> @cached?

The patch above adds implicit nested transactions, while still allowing explicit nesting through a db argument. The optional db=None arguments are left as-is in model methods for attachments, tickets and wiki pages, and the semantics remain correct.

The patch is nice.

It also adds a nice ThreadLocal class that allows defining a thread-local storage with default values. This avoids having to resort to hasattr() / getattr() magic due to non-existing attributes, and makes the code more readable. All the users of local have been changed to use ThreadLocal instead.

I recalled a few gc related bugs with subclasses of thread-local, but looking again, it doesn't seem we could be affected here (PythonBug:6990).

in reply to:  15 comment:16 by Remy Blank, 12 years ago

Replying to cboos:

Not necessarily a bad idea, maybe we just need a different method for this. What about having with_transaction() -> db, cursor and a new one with_db_for_write() -> db?

I think I prefer keeping a single entry point, it makes for a simpler API. Also, having to write cursor = db.cursor() at the beginning is not so annoying.

Yes, I think so, and after the simplification rename @cached_value -> @cached?

Exactly what I was going to suggest, too.

I have an even better implementation of with_transaction() that allows nesting uses with explicit db and implicit db arbitrarily, and to detect bad uses. Patch coming later tonight.

I recalled a few gc related bugs with subclasses of thread-local, but looking again, it doesn't seem we could be affected here (PythonBug:6990).

I didn't know that. I'll review that bug tonight as well.

by Remy Blank, 12 years ago

Allow arbitrary nesting of transactions. With unit tests.

comment:17 by Remy Blank, 12 years ago

The patch above has an improved implementation that allows arbitrary nesting of implicit and explicit transactions. It also detects invalid transaction nesting. A new test module adds unit tests for with_transaction().

The improvements to the cache module follow.

by Remy Blank, 12 years ago

Simplification of the cache code, applies on top of 9060-with-transaction-r9351.patch.

comment:18 by Remy Blank, 12 years ago

The patch above simplifies the cache code as suggested in comment:14 and comment:15. Unfortunately, this required converting CachedRepository.sync() to use with_transaction(). I believe the semantics are nearly correct, except for a db.rollback() that had to be kept in there, and would abort an enclosing transaction.

I have added a helper function get_read_db() in trac.db.util to get a database connection for reading. It returns the current connection if a transaction is in progress, otherwise it gets a new connection from the environment.

I'm quite happy about the cache simplification. But I'm a bit wary of the sync() changes, so a review would be most welcome.

comment:19 by Remy Blank, 12 years ago

On second thought, maybe it would be simpler to just wrap the content of sync() in a transaction with an explicit db:

def sync(self, clean):
    @with_transaction(self.env, self.env.get_db_cnx())
    def do_sync(db):
        ...

in reply to:  19 ; comment:20 by Christian Boos, 12 years ago

Replying to rblank:

On second thought, maybe it would be simpler to just wrap the content of sync() in a transaction with an explicit db:
@with_transaction(self.env, self.env.get_db_cnx())

I don't think this makes any difference in practice, because of the pool and the _active dictionary: env.get_db_cnx() gives you back the current connection for the thread, if any.

There's certainly some overlap here with what with_transaction does, so maybe we could unify both (e.g. make the pool use _transaction_local instead of the _active dict).

Also just imagine env.get_db_cnx() would not already do this, if you could get a different connection in the same thread this would be a good recipe for getting into a deadlock sooner or later…

This doesn't mean we shouldn't have tests which guard against that possibility of course, as we could one day introduce a regression in that area.

in reply to:  20 ; comment:21 by Remy Blank, 12 years ago

Replying to cboos:

I don't think this makes any difference in practice, because of the pool and the _active dictionary: env.get_db_cnx() gives you back the current connection for the thread, if any.

Sure. But it means fewer changes to sync(), hence less probability to introduce a bug :)

There's certainly some overlap here with what with_transaction does, so maybe we could unify both (e.g. make the pool use _transaction_local instead of the _active dict).

I also wondered about that. Could we even drop the pool altogether, and just keep one connection around for each thread in _transaction_local? Or is that what you meant?

in reply to:  21 ; comment:22 by Christian Boos, 12 years ago

Replying to rblank:

Replying to cboos:

I don't think this makes any difference in practice, because of the pool and the _active dictionary: env.get_db_cnx() gives you back the current connection for the thread, if any.

Sure. But it means fewer changes to sync(), hence less probability to introduce a bug :)

Also, doing a full sync in a single transaction would be too problematic (locking the db for a very long time, risk of getting the whole sync wasted in case of an interruption - but sure this could eventually fix #9129 ;-) ).

Or did you intend to keep the explicit commit in that case?

There's certainly some overlap here with what with_transaction does, so maybe we could unify both (e.g. make the pool use _transaction_local instead of the _active dict).

I also wondered about that. Could we even drop the pool altogether, and just keep one connection around for each thread in _transaction_local? Or is that what you meant?

No, I don't think it's a good idea to get rid of the cache and only keep a thread local connection, as we need a way to limit the overall number of opened connections and make it possible to reuse already opened connections. So I only meant to replace the _active dict inside the PooledConnection with the _transaction_local - but that could eventually be done in a second step. Let's focus on the first set of patches, I'll do a more thorough review later today.

in reply to:  22 comment:23 by Remy Blank, 12 years ago

Replying to cboos:

Or did you intend to keep the explicit commit in that case?

I intended to keep the code unchanged, but to wrap it in a @with_transaction so that nested transactions (in the cache code) get the db reference via _transaction_local. The commits and rollback would be kept as-is.

No, I don't think it's a good idea to get rid of the cache and only keep a thread local connection, as we need a way to limit the overall number of opened connections and make it possible to reuse already opened connections.

Makes sense.

by Remy Blank, 12 years ago

Same patch, but fewer changes in CachedRepository.sync().

comment:24 by Remy Blank, 12 years ago

This patch has the same functionality as the previous, but fewer changes in CachedRepository.sync(). The former should be better, as it ensures a db.rollback() is done in case of an error, but the latter is probably less "risky".

Let me know which one you prefer.

comment:25 by Christian Boos, 12 years ago

No forgotten, but I needed to get a few things "off my plate" first. I'll try to review and test ASAP.

comment:26 by Remy Blank, 12 years ago

Any news on this?

comment:27 by Christian Boos, 12 years ago

All the tests pass for the 3 database backends.

I've again tried to imagine a way to do without the db argument to with_transaction but I guess there's really no possibility yet. The next opportunity for simplification will be the move to context managers, and at that time we may change the code more significantly, perhaps by removing the Environment.get_db_cnx() so that the only way to get a connection would be via with_transaction (and get_read_db).

By the way, as both of these functions need an Environment, we could make them Environment methods or add methods that would call these functions, in order to keep the bulk of the code in trac.db.util. That would save the extra import and be more consistent with the idea of deprecating the get_db_cnx method in favor of them.

But let's commit 9060-with-transaction-r9351.patch first.

For 9060-cache-simplification.patch, a small update is needed:

  • trac/wiki/model.py

    diff --git a/trac/wiki/model.py b/trac/wiki/model.py
    a b class WikiPage(object):  
    177177
    178178            cursor.execute("UPDATE wiki SET name=%s WHERE name=%s",
    179179                           (new_name, old_name))
    180             WikiSystem(self.env).pages.invalidate(db)
     180            del WikiSystem(self.env).pages
    181181            from trac.attachment import Attachment
    182182            Attachment.reparent_all(self.env, 'wiki', old_name,
    183183                                    'wiki', new_name, db)

We need to rethink the CachedRepository.sync method (#9129), but here again, let's do that after the above patch has been committed.

comment:28 by Remy Blank, 12 years ago

Ok, I'll commit 9060-with-transaction-r9351.patch, and 9060-cache-simplification.patch with your proposed change (good catch!).

in reply to:  28 comment:29 by Christian Boos, 12 years ago

Replying to rblank:

(good catch!)

Thanks to the tests ;-)

in reply to:  27 ; comment:30 by Remy Blank, 12 years ago

Both patches committed in [9436].

Replying to cboos:

For 9060-cache-simplification.patch, a small update is needed:

Actually, I had that change already, together with the simplifications in the wiki rename part, in my sandbox. I forgot to post updated patches, sorry about that.

By the way, as both of these functions need an Environment, we could make them Environment methods or add methods that would call these functions, in order to keep the bulk of the code in trac.db.util. That would save the extra import and be more consistent with the idea of deprecating the get_db_cnx method in favor of them.

I thought along the same lines too, but finally preferred separating the database stuff from the environment as much as possible. The Environment class is already quite an "usine à gaz", so I'd rather remove stuff from there rather than pile up. I do understand the convenience aspect, though, especially for an often-used function like getting a database connection, so I'm ok with adding the two methods if you think it's worth it.

in reply to:  30 comment:31 by Christian Boos, 12 years ago

Resolution: fixed
Status: reopenedclosed

Replying to rblank:

Replying to cboos:

By the way, as both of these functions need an Environment, we could make them Environment methods or add methods that would call these functions, in order to keep the bulk of the code in trac.db.util. That would save the extra import and be more consistent with the idea of deprecating the get_db_cnx method in favor of them.

I thought along the same lines too, but finally preferred separating the database stuff from the environment as much as possible. The Environment class is already quite an "usine à gaz", so I'd rather remove stuff from there rather than pile up. I do understand the convenience aspect, though, especially for an often-used function like getting a database connection, so I'm ok with adding the two methods if you think it's worth it.

Yes, I think it's worth it. Added this in r9442, and a couple of other changes on this topic done in r9438 and r9439, which concludes this set of db related refactorings.

comment:32 by Christian Boos, 12 years ago

Owner: set to Christian Boos

(rolling dice … me)

in reply to:  32 comment:33 by Remy Blank, 12 years ago

Replying to cboos:

(rolling dice … me)

Heh, I'm still ahead by a few tickets anyway :-)

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Christian Boos.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Christian Boos to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.