Edgewall Software

Opened 15 years ago

Last modified 9 years ago

#8443 closed defect

postgres and "IDLE in transaction" status — at Version 2

Reported by: mixedpuppy Owned by:
Priority: high Milestone:
Component: database backend Version: 0.11-stable
Severity: critical Keywords: postgresql
Cc: dfraser, pacopablo@… Branch:
Release Notes:
API Changes:
Internal Changes:

Description (last modified by Remy Blank)

I've noticed a lot of "IDLE in transaction" statuses on postgres connections from trac after a request is finished. The problem with this is that the connection essentially becomes unusable until a rollback or commit is executed, sometimes resulting in locked up requests. I've tracked it down, and have a hack of a fix for it, but first some background.

My feeling is that people may only see this depending on the installed plugins and how they are used, default trac may not exhibit the behaviour required to cause the problem though the bug is in trac. My reliable repro involves using the IncludePages plugin to include one wiki page inside another.

The database connection pool in trac is called with every env.get_db_cnx call. It has a list of "active" connections, which are connections currently in use, and then the pool. The typical pattern that happens when using a database connection looks like:

  • some function in trac is entered
  • get connection
  • if connection in active list for thread, use that
  • otherwise get one from pool (or created one)
  • a cursor is created
  • psycopg may execute a BEGIN statement, based on the default settings for the postgres server. The installed default for postgres causes psycopg to do that.
  • function in trac finishes
  • connection wrapper is garbage collected, causing the wrapper to "reinsert" the connection into the pool
  • if it is the last "reference" to the connection, trac attempts a rollback (which typically matches the begin above)
  • if it is NOT the last "reference" to the connection, and no explicit rollback was done (eg. after an insert, trac/plugins usually do a commit) then the pool also does not rollback (THIS IS THE PROBLEM). The result is that the begin statement above remains in effect, causing "IDLE in transaction".

Fixing this in the pool would be the wrong place since you can never know if a rollback is safe on an "active" connection. I'm also wondering if the rollback in there currently is due to bugs reported on this issue before.

An easy repro is to use the includepages plugin for the wiki and include one wiki page within another. At the end of a request for that wiki page you will have the IDLE in transaction state on the connection.

Without requiring a repro, this illustrates how it happens:

def do_something(env):
    db = env.get_db_cnx()
    c = db.cursor()
    c.execute("SELECT foo FROM bar")
    # since we're the second reference, no
    # rollback is called, but since it's
    # a new cursor, BEGIN was called

def main(env):
    db = env.get_db_cnx()
    c = db.cursor()
    c.execute("SELECT bat FROM fab")
    do_something(env)
    # once we're done, db is reinserted 
    # to the pool and rollback is called

With the above outline, two BEGIN statements were executed, but only on rollback.

The simple hack fix is to modify main.dispatch_request to get a handle on the database connection prior to handling the request, then doing a rollback after the request:

    req = Request(environ, start_response)
    try:
        db = env.get_db_cnx()
        return _dispatch_request(req, env, env_error)
    finally:
        db.rollback()
        if env and not run_once:
            env.shutdown(threading._get_ident())

I call it a hack fix because it seems ugly to me, but it's the only place that this can be handled reliably without the potential to step on a commit that is in process, and without adding commit/rollback all over the trac (and plugins) code.

The fix does not break current transaction area's (eg. after an update/insert) since psycopg will issue a new BEGIN on the next execute if commit or rollback was called. Updates, Inserts, etc. should remain explicit about calling commit, since a select will not get the new data unless that is done.

It also seems like some extra work is happening in the current db pooling, such as creating a PooledConnection class for each call to env.get_db_cnx even if the connection is in the "active" list.

Looking at other bugs regarding this issue, it seems that is would also be a good idea to force close the connection at the end of a request IF the status is "in transaction" still. Log or raise and error with it as well. This would at least prevent problems around replication and vacuuming, and let people know there is a bug (likely a missing commit in some plugin).

Change History (3)

comment:1 by mixedpuppy, 15 years ago

formatting, formatting, formatting. sorry.

comment:2 by Remy Blank, 15 years ago

Description: modified (diff)

Indeed ;-)

by mixedpuppy, 15 years ago

Attachment: postgres-idle.patch added

postgres idle fix

Note: See TracTickets for help on using tickets.