Edgewall Software
Modify

Opened 12 years ago

Closed 3 years ago

#4984 closed defect (fixed)

trac/mod_python doesn't reconnect to postgresql after db restart

Reported by: andrewdied@… Owned by: Jun Omae
Priority: normal Milestone: 1.0.12
Component: database backend Version: 0.10.3
Severity: normal Keywords: postgresql patch pool
Cc: amalaev@…, Jonas Borgström
Release Notes:

Reconnect to PostgreSQL server after restarting it.

API Changes:

Description

This may be a reopening of #3394. I am using trac 0.10.3, python 2.5, apache 2.2.3, Suse 10.2, mod_python 3.2.10, postgresql 8.1.5, Pypgsql 2.5.1.

If I restart postgres while apache/trac is running, trac does not automatically reconnect once the database is back up.

Steps to reproduce:

  1. Have a running apache/mod_python/trac environment.
  2. Restart postgres, like "sudo /etc/init.d/postgres restart"
  3. Try to run a report, like http://example.com/trac/instancename/report/3

Expected results:

  1. The page would take slightly longer to display as trac reconnected to the database. Alternately, See one error message telling the user to reload the page, then trac would function normally.

Actual results: The on-screen oops message is:

Python Traceback
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 387, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 183, in dispatch
    req.perm = PermissionCache(self.env, req.authname)
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 263, in __init__
    self.perms = PermissionSystem(env).get_user_permissions(username)
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 227, in get_user_permissions
    for perm in self.store.get_user_permissions(username):
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 110, in get_user_permissions
    cursor = db.cursor()
  File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
    return IterableCursor(self.cnx.cursor())
  File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
    return IterableCursor(self.cnx.cursor())
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2599, in cursor
    return Cursor(self, name, isRefCursor)
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2718, in __init__
    self.conn._Connection__setupTransaction()
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2510, in __setupTransaction
    self.conn.query("BEGIN WORK")
OperationalError: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.

In the trac log file:

2007-03-20 08:11:21,499 Trac[main] ERROR: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 387, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 183, in dispatch
    req.perm = PermissionCache(self.env, req.authname)
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 263, in __init__
    self.perms = PermissionSystem(env).get_user_permissions(username)
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 227, in get_user_permissions
    for perm in self.store.get_user_permissions(username):
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 110, in get_user_permissions
    cursor = db.cursor()
  File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
    return IterableCursor(self.cnx.cursor())
  File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
    return IterableCursor(self.cnx.cursor())
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2599, in cursor
    return Cursor(self, name, isRefCursor)
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2718, in __init__
    self.conn._Connection__setupTransaction()
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2510, in __setupTransaction
    self.conn.query("BEGIN WORK")
OperationalError: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

Attachments (1)

dont-add-failing-connections-to-the-pool.diff (2.8 KB ) - added by tlk@… 9 years ago.
do not add failing connections to the pool (0.11-stable)

Download all attachments as: .zip

Change History (25)

comment:1 Changed 12 years ago by Christian Boos

Keywords: postgresql added
Milestone: 0.10.5

Agreed, we should handle this kind of error more gracefully.

comment:2 Changed 10 years ago by Christian Boos

Milestone: 0.10.60.11.3

#7797 was closed as duplicate.

comment:3 Changed 10 years ago by amalaev@…

Cc: amalaev@… added

comment:4 Changed 10 years ago by anonymous

Has anyone had the time to look at this bug?

I can confirm that it is still happening in Debian Lenny.

comment:5 Changed 9 years ago by tlk

postgresql 8.3, psycopg2, trac 0.11.4 - running through tracd

Restarting postgresql, and then visiting trac results in an error message like the reporter described. In addition, tracd also outputs the following on a single line:

Exception psycopg2.InterfaceError: 
'connection already closed' in 
<bound method PooledConnection.__del__ of 
<trac.db.pool.PooledConnection object at 
0x2aaaabd09290>> ignored

However, if the page is reloaded it works and subsequent http requests also works as expected.

comment:6 Changed 9 years ago by tlk

Can reproduce running through mod_python. The error shows up at random times.

It looks like a problem with ConnectionPoolBackend._pool still holding the PostgreSQLConnection(s) that was closed during database restart.

Last edited 3 years ago by Ryan J Ollos (previous) (diff)

Changed 9 years ago by tlk@…

do not add failing connections to the pool (0.11-stable)

comment:7 Changed 9 years ago by tlk

The patch wont hide connection failures but it will prevent the pool from reusing connections already know to fail.

comment:8 Changed 9 years ago by Christian Boos

Milestone: 0.11.60.11.5

The patch looks good, thanks. I'd like jonas to have a look on this.

Also, I wonder if we couldn't try an auto-reconnect once before failing.

comment:9 Changed 9 years ago by anonymous

Agreed, it would be nice to be able to auto-reconnect once before failing. We could create a new connection in the cursor method if an exception is thrown, but currently the PooledConnection does not have any knowledge about what kind of database connection it holds, so it would be necessary to add information about that. Also, we need to make sure not messing up _pool in ConnectionPoolBackend.

Perhaps a cleaner solution is to add another class e.g. ReliableConnection(ConnectionWrapper) encapsulating a PooledConnection. Then everything related to database error handling can be kept inside the ReliableConnection class.

Last edited 3 years ago by Ryan J Ollos (previous) (diff)

comment:10 Changed 9 years ago by tlk

Another issue is how database errors are presented to the user. My suggestion is to introduce a DatabaseError(Exception) and handle these kind of exceptions in the user interface layers.

Last edited 3 years ago by Ryan J Ollos (previous) (diff)

comment:11 Changed 9 years ago by tlk

also see #6348 (Catch database exceptions in a backend neutral way)

comment:12 Changed 9 years ago by Christian Boos

Cc: Jonas Borgström added

I'd like to get an "ack" from jonas on this patch. The change itself might also interfere with #8443.

comment:13 Changed 9 years ago by Christian Boos

Owner: changed from Jonas Borgström to Christian Boos

Doing self.close() in PooledConnection seems a bit too much, we might recover from a failed query in the user code and retry another query The close should instead be done when we decide to not reuse the connection.

As mentioned in comment:11, the changes here are also a good step in direction of #6348.

comment:14 Changed 9 years ago by Christian Boos

Component: generaldatabase backend

comment:15 Changed 9 years ago by Christian Boos

Milestone: 0.11.6next-minor-0.12.x

Postponed.

comment:16 Changed 9 years ago by anonymous

This issue still happens on 11.6 version. We need a solution.

Regards.

comment:17 Changed 8 years ago by anonymous

This problem still exists in Trac 0.12.2 using mod_wsgi 3.3, and is incredibly annoying.

comment:18 Changed 4 years ago by Ryan J Ollos

Milestone: next-minor-0.12.xnext-stable-1.0.x

comment:19 Changed 4 years ago by Ryan J Ollos

Owner: Christian Boos deleted

comment:20 Changed 3 years ago by figaro

Keywords: patch added

comment:21 Changed 3 years ago by Jun Omae

Keywords: pool added

I think we could simply solve it by adding ping() to PostgreSQLConnection class like this:

  • trac/db/postgres_backend.py

    diff --git a/trac/db/postgres_backend.py b/trac/db/postgres_backend.py
    index 4f66292c7..833919221 100644
    a b class PostgreSQLConnection(ConnectionWrapper):  
    290290    def like_escape(self, text):
    291291        return _like_escape_re.sub(r'/\1', text)
    292292
     293    def ping(self):
     294        cursor = self.cnx.cursor()
     295        cursor.execute('SELECT 1')
     296
    293297    def prefix_match(self):
    294298        """Return a case sensitive prefix-matching operator."""
    295299        return "LIKE %s ESCAPE '/'"

The following is logged after restarting postgresql:

2016-05-24 19:30:38,591 Trac[pool] ERROR: Exception caught on ping
Traceback (most recent call last):
  File "/src/tracdev/git/trac/db/pool.py", line 101, in get_cnx
    cnx.ping()
  File "/src/tracdev/git/trac/db/postgres_backend.py", line 295, in ping
    cursor.execute('SELECT 1')
OperationalError: terminating connection due to administrator command
SSL connection has been closed unexpectedly

Last edited 3 years ago by Jun Omae (previous) (diff)

comment:22 Changed 3 years ago by Ryan J Ollos

Looks like a good idea as far as I can tell. I guess this would make PostgreSQLConnection consistent with MySQLConnection as MySQLdb must already implement ping: tags/trac-1.0.11/trac/db/mysql_backend.py@:383#L382.

comment:23 Changed 3 years ago by Jun Omae

Milestone: next-stable-1.0.x1.0.12
Owner: set to Jun Omae
Status: newassigned

Thanks. I'll push the change to 1.0-stable.

comment:24 Changed 3 years ago by Jun Omae

Release Notes: modified (diff)
Resolution: fixed
Status: assignedclosed

Committed in [14843] and merged to trunk in [14844].

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Jun Omae.
The resolution will be deleted.
to The owner will be changed from Jun Omae to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.