Ticket #4984 (new defect)

Opened 5 years ago

Last modified 11 months ago

trac/mod_python doesn't reconnect to postgresql after db restart

Reported by: andrewdied@…
Owned by: cboos
Priority: normal
Milestone: next-minor-0.12.x
Component: database backend
Version: 0.10.3
Severity: normal
Keywords: postgresql
Cc: amalaev@…, jonas
Release Notes:
API Changes:

Description

This may be a reopening of #3394. I am using Trac 0.10.3, Python 2.5, Apache 2.2.3, SUSE 10.2, mod_python 3.2.10, PostgreSQL 8.1.5, and pyPgSQL 2.5.1.

If I restart postgres while apache/trac is running, trac does not automatically reconnect once the database is back up.

Steps to reproduce:

  1. Have a running apache/mod_python/trac environment.
  2. Restart postgres, e.g. with "sudo /etc/init.d/postgres restart".
  3. Try to run a report, e.g. http://example.com/trac/instancename/report/3

Expected results:

  1. The page takes slightly longer to display while trac reconnects to the database. Alternatively, the user sees a single error message asking them to reload the page, after which trac functions normally.

Actual results:
The on-screen oops message is:

Python Traceback
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 387, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 183, in dispatch
    req.perm = PermissionCache(self.env, req.authname)
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 263, in __init__
    self.perms = PermissionSystem(env).get_user_permissions(username)
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 227, in get_user_permissions
    for perm in self.store.get_user_permissions(username):
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 110, in get_user_permissions
    cursor = db.cursor()
  File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
    return IterableCursor(self.cnx.cursor())
  File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
    return IterableCursor(self.cnx.cursor())
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2599, in cursor
    return Cursor(self, name, isRefCursor)
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2718, in __init__
    self.conn._Connection__setupTransaction()
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2510, in __setupTransaction
    self.conn.query("BEGIN WORK")
OperationalError: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.

In the trac log file:

2007-03-20 08:11:21,499 Trac[main] ERROR: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 387, in dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 183, in dispatch
    req.perm = PermissionCache(self.env, req.authname)
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 263, in __init__
    self.perms = PermissionSystem(env).get_user_permissions(username)
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 227, in get_user_permissions
    for perm in self.store.get_user_permissions(username):
  File "/usr/lib/python2.5/site-packages/trac/perm.py", line 110, in get_user_permissions
    cursor = db.cursor()
  File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
    return IterableCursor(self.cnx.cursor())
  File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
    return IterableCursor(self.cnx.cursor())
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2599, in cursor
    return Cursor(self, name, isRefCursor)
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2718, in __init__
    self.conn._Connection__setupTransaction()
  File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2510, in __setupTransaction
    self.conn.query("BEGIN WORK")
OperationalError: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

Attachments

dont-add-failing-connections-to-the-pool.diff (2.8 KB) - added by tlk@… 3 years ago.
do not add failing connections to the pool (0.11-stable)


Change History

comment:1 Changed 5 years ago by cboos

  • Keywords postgresql added
  • Milestone set to 0.10.5

Agreed, we should handle this kind of error more gracefully.

comment:2 Changed 3 years ago by cboos

  • Milestone changed from 0.10.6 to 0.11.3

#7797 was closed as duplicate.

comment:3 Changed 3 years ago by amalaev@…

  • Cc amalaev@… added

comment:4 Changed 3 years ago by anonymous

Has anyone had the time to look at this bug?

I can confirm that it is still happening in Debian Lenny.

comment:5 Changed 3 years ago by tlk

postgresql 8.3, psycopg2, trac 0.11.4 - running through tracd

Restarting postgresql, and then visiting trac results in an error message like the reporter described. In addition, tracd also outputs the following on a single line:

Exception psycopg2.InterfaceError: 'connection already closed' in <bound method PooledConnection.__del__ of <trac.db.pool.PooledConnection object at 0x2aaaabd09290>> ignored

However, if the page is reloaded it works, and subsequent HTTP requests also work as expected.

comment:6 Changed 3 years ago by tlk

Can reproduce running through mod_python. The error shows up at random times.

It looks like a problem with ConnectionPoolBackend._pool still holding the PostgreSQLConnection(s) that were closed during the database restart.

Changed 3 years ago by tlk@…

do not add failing connections to the pool (0.11-stable)

comment:7 Changed 3 years ago by tlk

The patch won't hide connection failures, but it will prevent the pool from reusing connections already known to fail.
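
The idea can be illustrated with a small standalone sketch. This is not the attached patch and not Trac's pool code: SimplePool, FakeConnection and ping() are hypothetical names made up for illustration, and the example is written for current Python rather than the Python 2.5 used in the report.

```python
class FakeConnection:
    """Stand-in for a database connection (hypothetical, not a real driver)."""

    def __init__(self):
        self.alive = True    # flipped to False to simulate a server restart
        self.closed = False

    def ping(self):
        # Hypothetical health check; a real backend might run "SELECT 1".
        if not self.alive:
            raise RuntimeError("server closed the connection unexpectedly")

    def close(self):
        self.closed = True


class SimplePool:
    """Toy pool that refuses to keep connections known to fail."""

    def __init__(self, connect):
        self._connect = connect   # factory that opens a new connection
        self._idle = []

    def acquire(self):
        # Reuse an idle connection if there is one, otherwise open a new one.
        return self._idle.pop() if self._idle else self._connect()

    def release(self, cnx):
        try:
            cnx.ping()
        except Exception:
            # The connection is broken: close it instead of pooling it.
            cnx.close()
            return
        self._idle.append(cnx)
```

As in the patch description, the first request after the restart still sees the failure; the sketch only ensures that the broken connection is not handed out again.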

comment:8 Changed 3 years ago by cboos

  • Milestone changed from 0.11.6 to 0.11.5

The patch looks good, thanks. I'd like jonas to have a look at this.

Also, I wonder if we couldn't try an auto-reconnect once before failing.

comment:9 Changed 3 years ago by anonymous

Agreed, it would be nice to be able to auto-reconnect once before failing. We could create a new connection in the cursor method if an exception is thrown, but currently the PooledConnection does not have any knowledge about what kind of database connection it holds, so it would be necessary to add that information. Also, we need to make sure we don't mess up _pool in ConnectionPoolBackend.

Perhaps a cleaner solution is to add another class, e.g. ReliableConnection(ConnectionWrapper), encapsulating a PooledConnection. Then everything related to database error handling can be kept inside the ReliableConnection class.
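
A rough sketch of what such a wrapper might look like, under stated assumptions: the ReliableConnection below is only an illustration of "reconnect once in cursor()", it does not derive from Trac's ConnectionWrapper, and FakeConnection and StaleError are hypothetical stand-ins for a real driver connection and its error type. It is written for current Python.

```python
class StaleError(Exception):
    """Stand-in for a driver error such as 'connection already closed'."""


class FakeConnection:
    """Hypothetical connection whose cursor() fails once it has gone stale."""

    def __init__(self):
        self.stale = False

    def cursor(self):
        if self.stale:
            raise StaleError("connection already closed")
        return "cursor"


class ReliableConnection:
    """Wraps a connection and reconnects once if cursor() fails."""

    def __init__(self, connect):
        self._connect = connect     # factory that opens a new connection
        self._cnx = connect()

    def cursor(self):
        try:
            return self._cnx.cursor()
        except StaleError:
            # One reconnect attempt; a second failure propagates to the caller.
            self._cnx = self._connect()
            return self._cnx.cursor()
```

The retry is limited to cursor creation, where no work has been done on the connection yet; retrying in the middle of a transaction would need more care.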

comment:10 Changed 3 years ago by tlk

Another issue is how database errors are presented to the user. My suggestion is to introduce a DatabaseError(Exception) and handle this kind of exception in the user interface layers.
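
To make the suggestion concrete, here is a minimal sketch, assuming a single backend-neutral exception type. None of these names exist in Trac as shown: DatabaseError is the proposed class, while DriverOperationalError, run_query and render_page are hypothetical helpers invented for the example (current Python).

```python
class DatabaseError(Exception):
    """Backend-neutral database error, as proposed above (hypothetical)."""


class DriverOperationalError(Exception):
    """Stand-in for a driver-specific error such as OperationalError."""


def run_query(execute):
    """Database layer: translate driver errors into DatabaseError."""
    try:
        return execute()
    except DriverOperationalError as exc:
        raise DatabaseError(str(exc)) from exc


def render_page(execute):
    """UI layer: show a short message instead of a raw traceback."""
    try:
        return "200 " + run_query(execute)
    except DatabaseError:
        return "503 The database is unavailable, please reload the page."
```

With this split, only the database layer needs to know which driver is in use, and the user interface catches one exception type.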

comment:11 Changed 3 years ago by tlk

Also see #6348 (Catch database exceptions in a backend neutral way).

comment:12 Changed 3 years ago by cboos

  • Cc jonas added

I'd like to get an "ack" from jonas on this patch. The change itself might also interfere with #8443.

comment:13 Changed 3 years ago by cboos

  • Owner changed from jonas to cboos

Doing self.close() in PooledConnection seems a bit too much: we might recover from a failed query in the user code and retry another query. The close should instead be done when we decide not to reuse the connection.

As mentioned in comment:11, the changes here are also a good step in the direction of #6348.

comment:14 Changed 3 years ago by cboos

  • Component changed from general to database backend

comment:15 Changed 2 years ago by cboos

  • Milestone changed from 0.11.6 to next-minor-0.12.x

Postponed.

comment:16 Changed 2 years ago by anonymous

This issue still happens in version 0.11.6. We need a solution.

Regards.

comment:17 Changed 11 months ago by anonymous

This problem still exists in Trac 0.12.2 using mod_wsgi 3.3, and is incredibly annoying.
