#13634 closed defect (fixed)
cache invalidation for Environment::get_known_users can cause DB temp space exhaustion with MySQL backend
Reported by: | | Owned by: | Jun Omae
---|---|---|---|
Priority: | normal | Milestone: | 1.6.1 |
Component: | database backend | Version: | 1.4.3 |
Severity: | normal | Keywords: | mysql temporary table known_users disk full |
Cc: | klaus.espenlaub@… | Branch: | |
Release Notes: | Use …
API Changes: |
Internal Changes: |
Description
On larger Trac deployments (ours has several backend servers, each with multiple worker processes) with frequent accesses, the cache invalidation logic essentially implements a trigger for a "thundering herd": when the generation counter is increased, all worker processes will rather quickly run the query in Environment::_known_users. Looks all normal and harmless at first sight, but:
MySQL creates a temporary table for each such SELECT DISTINCT ... ORDER BY ... query. What makes it worse is that the query has columns of type mediumtext, and this forces the temporary table to go straight to disk. As long as the DB server CPU, IO and memory load is low this doesn't cause noticeable issues. With our approx. 200000 known users this results in a temp table of about 8MB (the query takes about 2 seconds, which isn't too annoying because it is almost always cached). If this gets multiplied by ~35 (the number of Trac WSGI workers in our setup for a busy situation) it still works just fine, just with some slowdown due to IO. When there is other activity (such as viewing source with 'blame' etc.), the refreshing of the known users can lead to DB connections timing out.
That's the "point of no return": the query wasn't finished yet but the worker will issue another one. This growing avalanche of queries (all limited by IO performance and needing 8MB of temp table space each) will exhaust the disk space of the DB server at the maximum IO performance, using all CPU and memory resources on the way too. No wonder with about 1500 such queries running simultaneously at the end (they build up over 45 to 60 minutes in our case).
Once the disk space is exhausted the MySQL server goes into "offline mode" and needs a server restart (until that happens Trac is obviously down). After the restart everything works perfectly again: the temporary tables are automatically removed, so disk usage is back to normal.
Proposed solution: remove the DISTINCT qualifier from this query. Due to the design of the session and session_attribute tables I can't see how it could ever produce non-unique results.
Reference: MySQL 8.0 documentation about internal use of temporary tables
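For reference, the statement behind Environment::_known_users (as it appears in trac/env.py, see the patch in comment:7 below) is roughly the following; the proposal is to drop the DISTINCT:

```sql
-- Query issued by Environment._known_users; with ~200000 authenticated
-- sessions the DISTINCT ... ORDER BY combination forces an on-disk
-- temporary table on MySQL because the value columns are mediumtext.
SELECT DISTINCT s.sid, n.value, e.value
  FROM session AS s
  LEFT JOIN session_attribute AS n ON (n.sid=s.sid
       AND n.authenticated=1 AND n.name = 'name')
  LEFT JOIN session_attribute AS e ON (e.sid=s.sid
       AND e.authenticated=1 AND e.name = 'email')
 WHERE s.authenticated=1 ORDER BY s.sid;
```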
Attachments (4)
Change History (15)
comment:1 by , 12 months ago
Component: | general → database backend |
---|---|
Owner: | set to |
Status: | new → assigned |
by , 12 months ago
Attachment: | explain_analyze_select_distinct_order_by.txt added |
---|
EXPLAIN ANALYZE of the statement
by , 12 months ago
Attachment: | explain_analyze_select_order_by.txt added |
---|
EXPLAIN ANALYZE of the statement without DISTINCT
by , 12 months ago
Attachment: | explain_select_order_by.txt added |
---|
EXPLAIN of the statement without DISTINCT
comment:2 by , 12 months ago
Attached the requested information plus a bit more variation. This is with MySQL Database Service in the Oracle Cloud (not that I have any reason to believe it would be significantly different from running MySQL Community Server yourself), which reports itself as 8.0.32-u3-cloud.
comment:3 by , 12 months ago
Milestone: | → 1.6.1 |
---|
Thanks for the quick response. Your proposed fix seems to be good.
comment:4 by , 12 months ago
With one big catch: I thought that this change would get our Trac deployment stable again. It looked good for a while (I made the change in our production environment over a day ago), but it's not solving the issue fully. The rate of temp disk table creation has decreased, but not as much as I would've expected.
comment:5 by , 12 months ago
Could you please check the performance schema and share the queries which are using tmp disk tables, if the performance schema is enabled in your database?
```sql
SELECT schema_name, digest_text, count_star AS n,
       sum_created_tmp_disk_tables AS tmp_disk_tables,
       sum_created_tmp_tables AS tmp_tables
  FROM performance_schema.events_statements_summary_by_digest
 WHERE sum_created_tmp_disk_tables > 0 \G
```
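In case the performance schema status is unclear, it can be checked with a standard server variable (nothing Trac-specific is assumed here):

```sql
-- Returns ON when the performance schema is available; the variable is
-- read-only at runtime and can only be changed at server startup
-- (e.g. performance_schema=ON in my.cnf).
SHOW VARIABLES LIKE 'performance_schema';
```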
comment:6 by , 12 months ago
Ouch… guess we both saw the "using filesort" and didn't immediately realize what it means: the sorting is done on disk because it doesn't fit into the default sort buffer size of 256K. Thinking about what Trac really needs, I decided to remove the ORDER BY s.sid part of the query. I checked all uses of Environment::get_known_users and found nothing which depended on the result being sorted.
Bumping the sort buffer size is in principle also possible but would need a bit more effort (reconfiguring the DB server in the cloud, …). Something for when things are stable again and I can access this stuff myself.
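As a rough sketch of that alternative, assuming a plain MySQL 8.0 server where sort_buffer_size is still at its documented 256K default (managed cloud instances usually require changing this through their own configuration interface instead):

```sql
-- Inspect the current per-session sort buffer (default 262144 bytes).
SHOW VARIABLES LIKE 'sort_buffer_size';

-- Raise it for new connections; the buffer is allocated per sort and per
-- connection, so keep the value modest.
SET GLOBAL sort_buffer_size = 2 * 1024 * 1024;
```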
The remaining queries needing temp disk tables are used very rarely (less than once every 5 minutes). The most frequent one in our setup, which appears to be the one in CachedRepository::_get_changed_revs, has this digest:

```sql
SELECT DISTINCTROW `rev` , ( CASE WHEN PATH IN (...) THEN PATH
       WHEN PATH LIKE ? ESCAPE ? THEN ? END ) AS PATH
  FROM `node_change`
 WHERE `repos` = ? AND `rev` >= ? AND `rev` <= ?
   AND ( PATH IN (...) OR PATH LIKE ? ESCAPE ? )
```

No immediate idea how it's triggered (I couldn't spot any obvious callers), but it's the only candidate.
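To catch concrete occurrences of a digest like this one, something along these lines could work, assuming the events_statements_history_long consumer is enabled (it is off by default and can be switched on via performance_schema.setup_consumers):

```sql
-- Recent statements matching the node_change digest, newest first.
SELECT current_schema, sql_text, timer_wait/1e12 AS seconds
  FROM performance_schema.events_statements_history_long
 WHERE digest_text LIKE 'SELECT DISTINCTROW `rev`%'
 ORDER BY timer_start DESC;
```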
comment:7 by , 11 months ago
Removing the ORDER BY s.sid leads to an incompatibility with the docstring of Environment.get_known_users: the docstring says the result is ordered alpha-numerically by username, see source:/tags/trac-1.6/trac/env.py@:708#L701. Instead, we could use sorted() / list.sort(), like the following:
trac/env.py

```diff
diff --git a/trac/env.py b/trac/env.py
index 043ce7f5b..a76525e19 100644
@@ ... @@ class Environment(Component, ComponentManager):
 
     @cached
     def _known_users(self):
-        return self.db_query("""
-            SELECT DISTINCT s.sid, n.value, e.value
+        # Use sorted() instead of "ORDER BY s.sid" in order to avoid filesort
+        # caused by indexing only a prefix of column values on MySQL.
+        users = self.db_query("""
+            SELECT s.sid, n.value, e.value
             FROM session AS s
             LEFT JOIN session_attribute AS n ON (n.sid=s.sid
                 AND n.authenticated=1 AND n.name = 'name')
             LEFT JOIN session_attribute AS e ON (e.sid=s.sid
                 AND e.authenticated=1 AND e.name = 'email')
-            WHERE s.authenticated=1 ORDER BY s.sid
+            WHERE s.authenticated=1
             """)
+        return sorted(users, key=lambda u: u[0])
 
     @cached
     def _known_users_dict(self):
```

trac/web/session.py

```diff
diff --git a/trac/web/session.py b/trac/web/session.py
index 14dbde704..64b2d6466 100644
@@ ... @@ class SessionAdmin(Component):
         sids = {self._split_sid(sid)
                 for sid in sids
                 if sid not in ('anonymous', 'authenticated', '*')}
+        # Use sort() instead of "ORDER BY s.sid, s.authenticated" in order to
+        # avoid filesort caused by indexing only a prefix of column values on
+        # MySQL.
         rows = self.env.db_query("""
-            SELECT DISTINCT s.sid, s.authenticated, s.last_visit,
-                   n.value, e.value, h.value
+            SELECT s.sid, s.authenticated, s.last_visit, n.value, e.value,
+                   h.value
             FROM session AS s
             LEFT JOIN session_attribute AS n
                 ON (n.sid=s.sid AND n.authenticated=s.authenticated
@@ ... @@ class SessionAdmin(Component):
             LEFT JOIN session_attribute AS h
                 ON (h.sid=s.sid AND h.authenticated=s.authenticated
                     AND h.name='default_handler')
-            ORDER BY s.sid, s.authenticated
             """)
+        rows.sort(key=lambda u: (u[0], u[1]))
         for sid, authenticated, last_visit, name, email, handler in rows:
             if all_anon and not authenticated or all_auth and authenticated \
                     or (sid, authenticated) in sids:
```
comment:8 by , 11 months ago
Yes, might be the clean solution (normally I'd be all for letting the DB do the sorting, but if that ends up "on disk" it won't be fast when the filesystem cache can't cover it all). I went for the easy solution because my priority is getting our Trac environment stable again. So far it's running for 3 days. Next week
We had MySQL in offline state on average every 2 days for the last 2 weeks. The start of those outages correlates somewhat, but not really well, with switching from mod_wsgi to gunicorn for running Trac - that switch successfully addressed the WSGI/Apache worker thread depletion issue which has increasingly haunted us over the last 10 years in two installs (in the older one we fully automated killing suspect workers to keep things going). I suspect that it's simply the slowly growing user count which pushed us into this "DB explosion" situation related to temp table usage in combination with high load spikes.
comment:9 by , 11 months ago
Oops, sent before finishing my edit. "Next week… we can hopefully declare it stable again".
comment:10 by , 11 months ago
Release Notes: | modified (diff) |
---|---|
Resolution: | → fixed |
Status: | assigned → closed |
comment:11 by , 11 months ago
With our approx. 200000 known users results …
If the session and session_attribute tables have too many records, I'd suggest disabling the purging of old anonymous sessions by setting the [trac] anonymous_session_lifetime option to 0 and purging the anonymous sessions in a periodic job instead. See [trac] anonymous_session_lifetime and #12953.
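To gauge whether that situation applies, the table sizes can be checked directly; this assumes the standard Trac schema, where authenticated=0 marks anonymous sessions:

```sql
-- How many anonymous vs. authenticated sessions exist, plus the total
-- number of attribute rows that join against them.
SELECT authenticated, COUNT(*) AS sessions
  FROM session
 GROUP BY authenticated;

SELECT COUNT(*) AS attribute_rows FROM session_attribute;
```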
Please post results of explain analyze with the following queries on your database.