[Patch] Trac crashing under MySQL
|Reported by:||Owned by:|
We experienced a strange bug in which trac 0.10.4 using the mysql backend would sometimes crash, with the following traceback:
Traceback (most recent call last): File "trac/web/main.py", line 406, in dispatch_request File "trac/web/main.py", line 182, in dispatch File "trac/web/main.py", line 146, in authenticate File "TracAccountManager-0.1.3dev_r2171-py2.4.egg/acct_mgr/web_ui.py", line 334, in wrap File "TracAccountManager-0.1.3dev_r2171-py2.4.egg/acct_mgr/web_ui.py", line 345, in authenticate File "trac/web/auth.py", line 69, in authenticate File "trac/web/auth.py", line 174, in _get_name_for_cookie File "trac/env.py", line 203, in get_db_cnx File "trac/db/api.py", line 76, in get_connection File "trac/db/pool.py", line 96, in get_cnx File "trac/db/pool.py", line 61, in try_rollback File "trac/db/mysql_backend.py", line 186, in close ProgrammingError: closing a closed connection
Sleuthing the root cause was real tricky; long story short, the attached patch solves the problem , at least for us.
Below are excerpts from a thread on our internal mailing list, giving information about what the problem was and why the patch solves it (the credit for tracking this down goes to our in-house systems god, Rick Warner):
Been doing a bit more research into this.
MySQL has a verb 'EXPLAIN' to show a bit about how it will evaluate a SELECT statement. Partly useful by itself, but more useful is that document page for EXPLAIN. Apparently one of the major factors when doing SELECTS using variables from multiple tables is what is in the INDEX. From the manual: "MySQL can use indexes on columns more efficiently if they are declared as the same type and size". For some reason the common variable in the trac tables session and session_attributes, 'sid', is type 'text'. 'text' seems like a strange choice; it is similar to 'varchar' but there is no trailing-space removal as there is for 'varchar'. More problematic for using 'text' in an index: "if a |TEXT| column is indexed, index entry comparisons are space-padded at the end."
Looking at our trac instance, the primary key in the session table is (sid(111), authenticated) and for the session_attributes table the primary key is (sid(116), authenticated, name(111)). So sid in the indexes for the two tables are two different length values, and since 'text' indexes are space padded there is a problem. By dropping the primary key on session_attribute and creating one with the same definition except that the length for sid matched that in the session table index I was able to drop the time for the problematic SELECT from 4.5 minutes to 4 minutes. Still not good enough.
Next step. Drop the primary key for both tables, change the column 'sid' to be VARCHAR, and then recreate the indices. Time for the same SELECT is now sub 1 sec.
Any reason why 'sid' is text?
The '1205' error comes from MySQL; it happens whenever a transaction cannot complete within the configured timeout period. This is normal transaction processing; MySQL leaves it to the application as to how it handles the situation with the best practice to retry some N number of times. I have not delved into the Trac source code, so not sure what it does in the situation.
Timeouts usually occur because table(s) are locked by some other operation. Since we seem to be having problems, the question arises: what is locking the table(s) for so long?
I just started some rudimentary logging on MySQL to see what is happening under the skin. Since I turned the logging on about an hour ago there have been a few slow 'queries'; most seem to be related to session logging. Several have taken around 30sec and are 'UPDATE session' statements. The longest two so far are 'DELETE FROM session_attribute' statements that have taken 270 and 280 secs. Yep, tied things up for over 4 minutes at a whack deleting stuff. Of course, the statements should not take that long, so we need to look at what indexes are set and see if we can optimize these.
The offending DELETE from statement was reworked by Mike and Aaron, and the reworked version change made on forge1. Let us know if you anyone sees the '1205' error (i.e., MySQL timeout).
For those interested, the structure of the statement was such that it processed 6.5M rows in a brute force fashion to delete records from tables with 4K and 16K rows.
Change History (10)
comment:5 by , 5 years ago
|Summary:||trac crashing under mysql, w/ fix → [Patch] Trac crashing under MySQL|