Ticket #4984 (new defect)
Opened 5 years ago
Last modified 11 months ago
trac/mod_python doesn't reconnect to postgresql after db restart
| Reported by: | andrewdied@… | Owned by: | cboos |
|---|---|---|---|
| Priority: | normal | Milestone: | next-minor-0.12.x |
| Component: | database backend | Version: | 0.10.3 |
| Severity: | normal | Keywords: | postgresql |
| Cc: | amalaev@…, jonas | ||
| Release Notes: | |||
| API Changes: | |||
Description
This may be a reopening of #3394. I am using trac 0.10.3, python 2.5, apache 2.2.3, Suse 10.2, mod_python 3.2.10, postgresql 8.1.5, Pypgsql 2.5.1.
If I restart postgres while apache/trac is running, trac does not automatically reconnect once the database is back up.
Steps to reproduce:
- Have a running apache/mod_python/trac environment.
- Restart postgres, like "sudo /etc/init.d/postgres restart"
- Try to run a report, like http://example.com/trac/instancename/report/3
Expected results:
- The page would take slightly longer to display as trac reconnected to the database. Alternately, See one error message telling the user to reload the page, then trac would function normally.
Actual results:
The on-screen oops message is:
Python Traceback
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 387, in dispatch_request
dispatcher.dispatch(req)
File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 183, in dispatch
req.perm = PermissionCache(self.env, req.authname)
File "/usr/lib/python2.5/site-packages/trac/perm.py", line 263, in __init__
self.perms = PermissionSystem(env).get_user_permissions(username)
File "/usr/lib/python2.5/site-packages/trac/perm.py", line 227, in get_user_permissions
for perm in self.store.get_user_permissions(username):
File "/usr/lib/python2.5/site-packages/trac/perm.py", line 110, in get_user_permissions
cursor = db.cursor()
File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
return IterableCursor(self.cnx.cursor())
File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
return IterableCursor(self.cnx.cursor())
File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2599, in cursor
return Cursor(self, name, isRefCursor)
File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2718, in __init__
self.conn._Connection__setupTransaction()
File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2510, in __setupTransaction
self.conn.query("BEGIN WORK")
OperationalError: FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
In the trac log file:
2007-03-20 08:11:21,499 Trac[main] ERROR: FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 387, in dispatch_request
dispatcher.dispatch(req)
File "/usr/lib/python2.5/site-packages/trac/web/main.py", line 183, in dispatch
req.perm = PermissionCache(self.env, req.authname)
File "/usr/lib/python2.5/site-packages/trac/perm.py", line 263, in __init__
self.perms = PermissionSystem(env).get_user_permissions(username)
File "/usr/lib/python2.5/site-packages/trac/perm.py", line 227, in get_user_permissions
for perm in self.store.get_user_permissions(username):
File "/usr/lib/python2.5/site-packages/trac/perm.py", line 110, in get_user_permissions
cursor = db.cursor()
File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
return IterableCursor(self.cnx.cursor())
File "/usr/lib/python2.5/site-packages/trac/db/util.py", line 78, in cursor
return IterableCursor(self.cnx.cursor())
File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2599, in cursor
return Cursor(self, name, isRefCursor)
File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2718, in __init__
self.conn._Connection__setupTransaction()
File "/usr/local/lib/python2.5/site-packages/pyPgSQL/PgSQL.py", line 2510, in __setupTransaction
self.conn.query("BEGIN WORK")
OperationalError: FATAL: terminating connection due to administrator command
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Attachments
Change History
comment:1 Changed 5 years ago by cboos
- Keywords postgresql added
- Milestone set to 0.10.5
comment:2 Changed 3 years ago by cboos
- Milestone changed from 0.10.6 to 0.11.3
#7797 was closed as duplicate.
comment:3 Changed 3 years ago by amalaev@…
- Cc amalaev@… added
comment:4 Changed 3 years ago by anonymous
Has anyone had the time to look at this bug?
I can confirm that it is still happening in Debian Lenny.
comment:5 Changed 3 years ago by tlk
postgresql 8.3, psycopg2, trac 0.11.4 - running through tracd
Restarting postgresql, and then visiting trac results in an error message like the reporter described. In addition, tracd also outputs the following on a single line:
Exception psycopg2.InterfaceError: 'connection already closed' in <bound method PooledConnection.__del__ of <trac.db.pool.PooledConnection object at 0x2aaaabd09290>> ignored
However, if the page is reloaded it works and subsequent http requests also works as expected.
comment:6 Changed 3 years ago by tlk
Can reproduce running through mod_python. The error shows up at random times.
It looks like a problem with ConnectionPoolBackend?._pool still holding the PostgreSQLConnection(s) that was closed during database restart.
Changed 3 years ago by tlk@…
- Attachment dont-add-failing-connections-to-the-pool.diff added
do not add failing connections to the pool (0.11-stable)
comment:7 Changed 3 years ago by tlk
The patch wont hide connection failures but it will prevent the pool from reusing connections already know to fail.
comment:8 Changed 3 years ago by cboos
- Milestone changed from 0.11.6 to 0.11.5
The patch looks good, thanks. I'd like jonas to have a look on this.
Also, I wonder if we couldn't try an auto-reconnect once before failing.
comment:9 Changed 3 years ago by anonymous
Agreed, it would be nice to be able to auto-reconnect once before failing. We could create a new connection in the cursor method if an exception is thrown, but currently the PooledConnection? does not have any knowledge about what kind of database connection it holds, so it would be necessary to add information about that. Also, we need to make sure not messing up _pool in ConnectionPoolBackend?.
Perhaps a cleaner solution is to add another class e.g. ReliableConnection?(ConnectionWrapper?) encapsulating a PooledConnection?. Then everything related to database error handling can be kept inside the ReliableConnection? class.
comment:10 Changed 3 years ago by tlk
Another issue is how database errors are presented to the user. My suggestion is to introduce a DatabaseError?(Exception) and handle these kind of exceptions in the user interface layers.
comment:11 Changed 3 years ago by tlk
also see #6348 (Catch database exceptions in a backend neutral way)
comment:12 Changed 3 years ago by cboos
- Cc jonas added
I'd like to get an "ack" from jonas on this patch. The change itself might also interfere with #8443.
comment:13 Changed 3 years ago by cboos
- Owner changed from jonas to cboos
Doing self.close() in PooledConnection seems a bit too much, we might recover from a failed query in the user code and retry another query
The close should instead be done when we decide to not reuse the connection.
As mentioned in comment:11, the changes here are also a good step in direction of #6348.
comment:14 Changed 3 years ago by cboos
- Component changed from general to database backend
comment:15 Changed 2 years ago by cboos
- Milestone changed from 0.11.6 to next-minor-0.12.x
Postponed.
comment:16 Changed 2 years ago by anonymous
This issue still happens on 11.6 version. We need a solution.
Regards.
comment:17 Changed 11 months ago by anonymous
This problem still exists in Trac 0.12.2 using mod_wsgi 3.3, and is incredibly annoying.



Agreed, we should handle this kind of error more gracefully.