Opened 16 years ago
Closed 4 years ago
#7785 closed defect (worksforme)
Trac slow / unreachable objects / database locked
Reported by: | cmarschalek | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | general | Version: | 0.11.1 |
Severity: | normal | Keywords: | performance database locked svn apr_terminate |
Cc: | Emmanuel Blot, felix.schwarz@… | Branch: | |
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Description (last modified by )
Since I upgraded Trac to 0.11 (0.11.1 didn't fix this) it has become very slow.
If you open a few changesets in tabs, they sometimes take very long to get ready, but when they do, they all load at the same time. While they're loading, the Apache process takes up pretty much the whole CPU. The speed improves somewhat afterwards, but Trac as a whole has been a lot slower since the upgrade. Nothing else has changed on the server. Looking at my trac.log file I see lots of "unreachable objects" messages like:
2008-11-05 10:37:26,680 Trac[main] DEBUG: 880 unreachable objects found.
2008-11-05 10:37:29,217 Trac[main] DEBUG: Dispatching <Request "GET u'/changeset/12445'">
2008-11-05 10:37:29,233 Trac[session] DEBUG: Retrieving session for ID u'marschalek'
2008-11-05 10:37:29,250 Trac[chrome] DEBUG: Prepare chrome data for request
2008-11-05 10:37:36,715 Trac[main] DEBUG: 1533 unreachable objects found.
2008-11-05 10:37:37,171 Trac[main] DEBUG: Dispatching <Request "GET u'/changeset/12446'">
2008-11-05 10:37:37,195 Trac[session] DEBUG: Retrieving session for ID u'marschalek'
2008-11-05 10:37:37,211 Trac[chrome] DEBUG: Prepare chrome data for request
2008-11-05 10:37:39,828 Trac[main] DEBUG: 694 unreachable objects found.
Could this be the problem? At least the timestamps show a significant increase in time.
Occasionally I also get "database is locked" errors, which go away after a short while. Here's the last one I got:
How to Reproduce
While doing a GET operation on /timeline, Trac issued an internal error.
(please provide additional details here)
System Information
Trac | 0.11.1
Python | 2.3.3 (#1, Nov 2 2005, 16:31:23) [GCC 3.3.3 (SuSE Linux)]
setuptools | 0.6c8
SQLite | 2.8.12
pysqlite | 1.0.1
Genshi | 0.5
mod_python | < 3.2
Subversion | 1.3.0 (r17949)
Python Traceback
Traceback (most recent call last):
  File "/usr/lib/python2.3/site-packages/Trac-0.11.1-py2.3.egg/trac/web/main.py", line 423, in _dispatch_request
    dispatcher.dispatch(req)
  File "/usr/lib/python2.3/site-packages/Trac-0.11.1-py2.3.egg/trac/web/main.py", line 222, in dispatch
    req.session.save()
  File "/usr/lib/python2.3/site-packages/Trac-0.11.1-py2.3.egg/trac/web/session.py", line 97, in save
    (self.sid,))
  File "/usr/lib/python2.3/site-packages/Trac-0.11.1-py2.3.egg/trac/db/util.py", line 50, in execute
    return self.cursor.execute(sql_escape_percent(sql), args)
  File "/usr/lib/python2.3/site-packages/sqlite/main.py", line 237, in execute
    self.con._begin()
  File "/usr/lib/python2.3/site-packages/sqlite/main.py", line 503, in _begin
    self.db.execute("BEGIN")
OperationalError: database is locked
I don't know if these problems are related though.
Issue Summary
The first report (above) was about perceived slowness when rendering big changesets after an upgrade from 0.10 to 0.11. That's mainly Genshi; see #7975 for the ticket addressing that point specifically.
Then comment:12 described a specific bug involving the svn bindings and apr_terminate, which was later fixed in comment:16.
comment:18 reported another, similar error, this time involving apr_pool_destroy. The latter is quite obscure and hard to reproduce; the present ticket remains open for that problem.
Attachments (0)
Change History (44)
comment:1 by , 16 years ago
Version: | none → 0.11.1 |
---|
comment:2 by , 16 years ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
Except for Trac, you seem to have very outdated versions of everything :-)
Please upgrade to SQLite 3 and Pysqlite 2 (see the PySqlite page), for a start. Getting mod_python 3.3.1 or one of the recommended versions of mod_wsgi might help as well.
Python 2.5 or 2.6 would also presumably give you a boost (and have a reasonably recent version of SQLite/Pysqlite2 bundled).
Don't hesitate to reopen if you still have performance issues after those updates.
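For reference, the mod_wsgi route boils down to pointing a WSGIScriptAlias at a small Python entry script. A minimal sketch (the paths below are placeholders to adapt to your own installation):

import os

# Placeholder paths -- point these at your actual Trac environment and a
# directory the web server can write its egg cache to.
os.environ['TRAC_ENV'] = '/path/to/trac/env'
os.environ['PYTHON_EGG_CACHE'] = '/path/to/egg-cache'

import trac.web.main
# trac.web.main.dispatch_request is Trac's WSGI application callable.
application = trac.web.main.dispatch_request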
comment:3 by , 16 years ago
Keywords: | database added; databse unreachable objects removed |
---|
Some follow-ups, explaining why it's now slower than it used to be.
Replying to cmarschalek:
Since I've upgraded trac to 0.11 (0.11.1 didn't fix this) it became very slow.
If you open a few changesets in tabs, they sometimes take very long to get ready, but when they do, they all load at the same time.
The 0.10 → 0.11 upgrade also introduced the switch from ClearSilver to Genshi. While more flexible and feature-rich, this template engine is also much slower, and it especially shows for big pages like changeset pages can be (probably up to 10 times slower in this case).
Looking at my trac.log file I see lots of unreachable objects messages like:
Not indicative of a problem, this is just feedback from the garbage collector (those unreachable objects are collected).
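For reference, those DEBUG lines just report the return value of Python's collector, roughly:

import gc

# gc.collect() returns the number of unreachable objects it found (and
# freed); Trac logs that figure, hence messages like
# "Trac[main] DEBUG: 880 unreachable objects found."
found = gc.collect()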
follow-up: 5 comment:4 by , 16 years ago
Resolution: | worksforme |
---|---|
Status: | closed → reopened |
I have the exact same problem here after upgrading from Debian/Etch (trac 0.10.3) to Debian/Lenny (trac 0.11.1) with the default sqlite3 backend. Is there anything I can do? I've read that switching to MySQL would probably fix this database locked issue, but I've also read that the MySQL backend is actually slower than the sqlite3 backend …
comment:5 by , 16 years ago
Keywords: | needinfo added |
---|
What are the versions of the various libraries used? If it's really the "exact same problem" as originally reported here, then it's not sqlite3 that you're using but sqlite2 … (see description).
Besides, what kind of web front-end are you using? mod_python? mod_wsgi? Which versions? In any case, please report how tracd performs in comparison.
This is likely a duplicate of #7490, but please first answer on this ticket.
comment:6 by , 16 years ago
Resolution: | → worksforme |
---|---|
Status: | reopened → closed |
Well, if someone feels like reopening that ticket again, then please:
- subscribe to the ticket CC:
- be prepared to answer the questions asked (unless you provide all the relevant information yourself, of course, but that never happens the first time ;-) )
comment:7 by , 16 years ago
Resolution: | worksforme |
---|---|
Status: | closed → reopened |
follow-up: 9 comment:8 by , 16 years ago
So let's give it a try:
I took a day off on Friday, and our Trac installation just stopped working with the infamous "Database is locked" error message - again ;-)
Which kind of information would you need?
- Trac environment:
Trac: 0.12dev-r7811
Python: 2.5.2 (r252:60911, Jan 4 2009, 17:40:26) [GCC 4.3.2]
setuptools: 0.6c8
SQLite: 3.5.9
pysqlite: 2.4.1
Genshi: 0.6dev-r1052
Pygments: 0.10
Subversion: 1.5.6 (r36142)
jQuery: 1.2.6
- OS
Debian Lenny, with some "testing" or "unstable" packages, such as SVN 1.5.6
- Apache
Server Settings
Server Version: Apache/2.2.9 (Debian) DAV/2 SVN/1.5.6 mod_fastcgi/2.4.6
Server Built: Jan 20 2009 17:25:53
Module Magic Number: 20051115:15
Timeouts: connection: 300   keep-alive: 300
MPM Name: Worker
MPM Information: Max Daemons: 3 Threaded: yes Forked: yes
Server Architecture: 32-bit
Server Root: /etc/apache2
Config File: /etc/apache2/apache2.conf
Server Built With:
  -D APACHE_MPM_DIR="server/mpm/worker"
  -D APR_HAS_SENDFILE
  -D APR_HAS_MMAP
  -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
  -D APR_USE_SYSVSEM_SERIALIZE
  -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
  -D APR_HAS_OTHER_CHILD
  -D AP_HAVE_RELIABLE_PIPED_LOGS
  -D HTTPD_ROOT=""
  -D SUEXEC_BIN="/usr/lib/apache2/suexec"
  -D DEFAULT_PIDLOG="/var/run/apache2.pid"
  -D DEFAULT_ERRORLOG="logs/error_log"
  -D AP_TYPES_CONFIG_FILE="/etc/apache2/mime.types"
  -D SERVER_CONFIG_FILE="/etc/apache2/apache2.conf"
I guess one of the main causes of the "database is locked" issue is the Bitten plugin.
However, when I came back this morning, one of the trac.fcgi processes was eating 99.9% CPU time, and /etc/init.d/apache stop was not able to shut it down: I had to use kill -9 to knock it down.
follow-up: 10 comment:9 by , 16 years ago
Replying to eblot:
I guess one of the main cause of the "database is locked" issue is the Bitten plugin.
Does it keep connections open for a long time? Consider using fetchall() in places where it does a lot of processing while iterating over cursor results.
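To illustrate the suggestion with a generic sketch (this is not actual Bitten code; the query and the handle() function are made up):

def process_builds(db):
    """Sketch only: db is a DB-API connection, names are hypothetical."""
    cursor = db.cursor()
    cursor.execute("SELECT id, status FROM bitten_build")  # hypothetical query

    # Iterating the cursor directly keeps the SQLite read lock held while
    # the slow per-row work runs:
    #   for row in cursor:
    #       handle(row)

    # Materializing the result set first lets the cursor (and the lock) be
    # released before the slow processing starts:
    rows = cursor.fetchall()
    for row in rows:
        handle(row)  # hypothetical slow processing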
However, when I came back this morning, one of the trac.fcgi processes was eating 99.9% CPU time, and /etc/init.d/apache stop was not able to shut it down: I had to use kill -9 to knock it down.
Too bad… gdb python + attaching that process could have shed some light on this…
If you're already using trunk, then one option would be to switch to MultiRepos and use the new hooks (fresh from today ;-) ) to perform the sync()
offline instead of for each request. That would presumably put much less stress on the SQLite database. I don't know how Bitten operates, but maybe if it could be driven by the hooks (i.e. implement IChangesetListener), that would also help.
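As an illustration of that last idea, a component reacting to new changesets might look roughly like this (a hypothetical sketch; only the IChangesetListener interface itself is Trac's, and its exact signature on trunk may differ):

from trac.core import Component, implements
from trac.versioncontrol.api import IChangesetListener

class QueueBuildOnCommit(Component):
    """Hypothetical sketch: do the expensive work when a changeset is
    added via the repository hooks, instead of syncing on every request."""

    implements(IChangesetListener)

    def changeset_added(self, repos, changeset):
        # e.g. enqueue a build for changeset.rev here rather than doing
        # anything expensive inside a web request
        self.log.info("Changeset %s added", changeset.rev)

    def changeset_modified(self, repos, changeset, old_changeset):
        pass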
This is not to say that having such an fcgi process going crazy is not a bug … but it's hard to tell what really happens without a backtrace. Most likely a timeout, then some busy loop trying to recover the situation?
follow-ups: 11 12 14 comment:10 by , 16 years ago
Replying to cboos:
Replying to eblot:
I guess one of the main cause of the "database is locked" issue is the Bitten plugin.
Does it keep connections opened for a long time? Consider using
fetchall()
in places where it does a lot of processing while iterating over cursor results.
I have not been through this part of the code, but I guess so. I had to ditch mod_python in favor of mod_fastcgi, as with mod_python, "database is locked" errors used to occur several times an hour. Since I moved to mod_fastcgi and filtered out the SVN checkout messages, this kind of error occurs only a very few times a month.
Too bad… gdb python + attaching that process could have shed some light on this…
Yes, but people were waiting to commit their work. Next time, I'll try to run gdb.
If you're already using trunk, then one option would be to switch to MultiRepos and use the new hooks (fresh from today ;-) ) to perform the
sync()
offline instead of for each request.
I have not had a look at the hooks yet, however if these are commit hooks, I have some work to do first: we use our own hooks, and they cannot be removed. I'll have a look and keep you posted.
I'm not sure I understand how MultiRepos can help here (?).
Most likely a timeout, then some busy loop trying to recover the situation?
I can't tell, but next time I'll try to get the call stack.
comment:11 by , 16 years ago
Replying to eblot:
I have not had a look at the hooks yet, however if these are commit hooks, I have some work to do first: we use our own hooks, and they cannot be removed. I'll have a look and keep you posted.
Yes, this is actually a script to be called from the svn post-commit hook (source:sandbox/multirepos/contrib/trac-svn-post-commit-hook). Integrating it in an existing commit hook should always be possible (if not, I'm interested to know why).
I'm not sure to understand how the MultiRepos can help here (?).
Because there, we're going to get rid of the call to repos.sync()
at the beginning of each request. Well, it's not finalized yet (see discussion at the bottom of #7723) but for now simply remove or comment out the preprocess request call in RepositoryManager.
comment:12 by , 16 years ago
Keywords: | svn apr_terminate added; needinfo removed |
---|---|
Milestone: | → 0.11.5 |
Priority: | normal → high |
Replying to eblot:
(…) next time I'll try to get the call stack.
Here we go:
(gdb) bt
#0  0xb7f4c424 in __kernel_vsyscall ()
#1  0xb7f35bc9 in __lll_lock_wait () from /lib/i686/cmov/libpthread.so.0
#2  0xb7f30ff9 in _L_lock_89 () from /lib/i686/cmov/libpthread.so.0
#3  0xb7f30902 in pthread_mutex_lock () from /lib/i686/cmov/libpthread.so.0
#4  0xb53139e1 in apr_thread_mutex_lock () from /usr/lib/libapr-1.so.0
#5  0xb5314988 in apr_pool_destroy () from /usr/lib/libapr-1.so.0
#6  0xb5314935 in apr_pool_destroy () from /usr/lib/libapr-1.so.0
#7  0xb53145a7 in apr_pool_terminate () from /usr/lib/libapr-1.so.0
#8  0xb53174ec in apr_terminate () from /usr/lib/libapr-1.so.0
#9  0xb7dc8b59 in exit () from /lib/i686/cmov/libc.so.6
#10 0xb7db077d in __libc_start_main () from /lib/i686/cmov/libc.so.6
#11 0x08058661 in _start ()
apr_terminate, called by an atexit handler once the fcgi application has been told to exit (#9), hangs.
comment:13 by , 16 years ago
Cc: | added |
---|
It looks like the swig bindings for svn do an atexit(apr_terminate) since 1.3.0 or so (in swigutil_py.c).
Maybe we shouldn't call it ourselves explicitly, then. Please apply the following change on your server and see if the problem happens again (you could send a few SIGHUPs to the python process after having browsed the repository, to exercise the code).
trac/versioncontrol/svn_fs.py

@@ -76,7 +76,6 @@
                 core.svn_node_file: Node.FILE}
 # Protect svn.core methods from GC
 Pool.apr_pool_clear = staticmethod(core.apr_pool_clear)
-Pool.apr_terminate = staticmethod(core.apr_terminate)
 Pool.apr_pool_destroy = staticmethod(core.apr_pool_destroy)

 def _to_svn(*args):
@@ -199,6 +198,5 @@
         # Clear application pool and terminate APR if necessary
         if not self._parent_pool:
             application_pool = None
-            self.apr_terminate()

         self._mark_invalid()
follow-up: 15 comment:14 by , 16 years ago
Replying to eblot:
Replying to cboos:
Replying to eblot:
I guess one of the main cause of the "database is locked" issue is the Bitten plugin.
Does it keep connections opened for a long time? Consider using
fetchall()
in places where it does a lot of processing while iterating over cursor results.
I have not been through this part of the code, but I guess so. I had to ditch mod_python in favor of mod_fastcgi, as with mod_python, "database is locked" errors used to occur several times an hour. Since I moved to mod_fastcgi and filtered out the SVN checkout messages, this kind of error occurs only a very few times a month.
Did you ever try mod_wsgi as well? Since we switched to it we no longer have hanging Apache threads eating up the CPU, which makes it the same speed as mod_python, just more stable.
comment:15 by , 15 years ago
Replying to anonymous:
Did you ever try mod_wsgi as well? Since we switched to it we no longer have hanging Apache threads eating up the CPU, which makes it the same speed as mod_python, just more stable.
I moved away from mod_python, as it made Trac pretty unusable when the Bitten plugin was enabled ("Database is locked"). I never gave mod_wsgi a try, but does it offer better performance than mod_fastcgi?
follow-up: 32 comment:16 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
From eblot's feedback, it seems that the problem was fixed by the patch in comment:13.
Patch committed in [8281].
Of course, please reopen if the problem reappears (and if so, with a gdb backtrace please, like in comment:12! That's the only way we can eventually fix something)
comment:17 by , 15 years ago
Owner: | set to |
---|
comment:18 by , 15 years ago
Milestone: | 0.11.5 → 0.11.6 |
---|---|
Resolution: | fixed |
Status: | closed → reopened |
The recent troubles on t.e.o (slowness and all) might be related to this problem, which has not been completely fixed by r8281 (t.e.o is currently running r8354).
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9375 www-data  20   0 2051m 1.1g 4964 S  100 37.8 142:54.16 trac.fcgi
So 100% CPU, for more than two hours… That couldn't possibly be Genshi, right?
No, it's our good friend apr_pool_destroy:
(gdb) bt
#0  0x00007fcf03641b04 in __lll_lock_wait () from /lib/libpthread.so.0
#1  0x00007fcf0363d1a0 in _L_lock_102 () from /lib/libpthread.so.0
#2  0x00007fcf0363caae in pthread_mutex_lock () from /lib/libpthread.so.0
#3  0x00007fcef3c15b22 in apr_thread_mutex_lock () from /usr/lib/libapr-1.so.0
#4  0x00007fcef3c175bf in ?? () from /usr/lib/libapr-1.so.0
#5  0x00007fcef3c17d36 in apr_pool_destroy () from /usr/lib/libapr-1.so.0
#6  0x00007fcef3c17c4f in apr_pool_destroy () from /usr/lib/libapr-1.so.0
#7  0x00007fcef3c17c4f in apr_pool_destroy () from /usr/lib/libapr-1.so.0
#8  0x00007fcef3c17c4f in apr_pool_destroy () from /usr/lib/libapr-1.so.0
#9  0x00007fcef3c17c4f in apr_pool_destroy () from /usr/lib/libapr-1.so.0
#10 0x00007fcef325f3c4 in ?? () from /var/lib/python-support/python2.5/libsvn/_core.so
#11 0x0000000000491052 in PyEval_EvalFrameEx ()
#12 0x00000000004920e2 in PyEval_EvalFrameEx ()
#13 0x00000000004927cd in PyEval_EvalCodeEx ()
#14 0x00000000004dd5b9 in ?? ()
#15 0x0000000000418c33 in PyObject_Call ()
#16 0x000000000048f927 in PyEval_EvalFrameEx ()
#17 0x00000000004927cd in PyEval_EvalCodeEx ()
#18 0x00000000004dd4c2 in ?? ()
#19 0x0000000000418c33 in PyObject_Call ()
#20 0x000000000048a852 in PyEval_CallObjectWithKeywords ()
#21 0x00000000004b1e8e in Py_Finalize ()
#22 0x000000000041431b in Py_Main ()
#23 0x00007fcf02c54466 in __libc_start_main () from /lib/libc.so.6
#24 0x0000000000413ca9 in _start ()
But this backtrace is significantly different from the one in comment:12 (no apr_terminate).
Note that we're about to exit the server, in Py_Finalize. This must be some clean-up code within the SWIG bindings calling this, probably on the bindings' global pool.
comment:19 by , 15 years ago
See also the list of libraries.
I noticed there:

Loaded symbols for /usr/lib/libldap_r-2.4.so.2
...
Loaded symbols for /usr/lib/libsasl2.so.2

Hm, why ldap here? Probably because of libpq (the PostgreSQL client library can be configured with --with-ldap support), but libmysqlclient_r was also loaded.
There was a report of a similar hang on the Subversion-devel list, svnserve: lock up when using SASL/LDAP. Unfortunately no follow-up. It might be related, don't know yet, still looking for clues.
comment:20 by , 15 years ago
Besides, from subversion/bindings/swig/python/libsvn/core.py:
class apr_pool_t:
    """Proxy of C apr_pool_t struct"""
    #...
    def destroy(self):
        """Destroy embedded memory pool. If you do not destroy
        the memory pool manually, Python will destroy it
        automatically."""
        global application_pool

        self.assert_valid()

        is_application_pool = not self._parent_pool

        # Destroy pool
        self._apr_pool_destroy(self)

        # Clear application pool if necessary
        if is_application_pool:
            application_pool = None
            self._svn_swig_py_clear_application_pool()

        # Mark self as invalid
        if hasattr(self, "_parent_pool"):
            del self._parent_pool
        if hasattr(self, "_is_valid"):
            del self._is_valid

    def __del__(self):
        """Automatically destroy memory pools, if necessary"""
        if self.valid():
            self.destroy()
So we might want to invalidate that global pool before it gets destroyed automatically, apparently too late.
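Purely as an illustration of that idea, and based only on the proxy code quoted above (assuming valid() keys off the _is_valid attribute that destroy() deletes, and that svn.core exposes the application_pool global used by its atexit cleanup), something along these lines:

import atexit

def _invalidate_application_pool():
    # Hypothetical sketch: make the global pool look already-invalid so
    # that neither the bindings' own atexit cleanup nor apr_pool_t.__del__
    # ends up calling apr_pool_destroy() during interpreter shutdown.
    from svn import core
    pool = getattr(core, 'application_pool', None)
    if pool is not None and hasattr(pool, '_is_valid'):
        del pool._is_valid

atexit.register(_invalidate_application_pool)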
comment:21 by , 15 years ago
Cc: | added |
---|
comment:22 by , 15 years ago
jonas: about r8281, munin's IRQ stats show a similar peak back in March, so before that changeset. Though it's not possible to know whether this was the same issue we're seeing now, my feeling is that it is.
follow-up: 24 comment:23 by , 15 years ago
Don't know if it is any help, but since I reported this in the first place, and we haven't upgraded the server yet (not that easy…):
My issues seem to be related to opening rather large changesets, or when I open a few in separate tabs. But it's really hard to pinpoint… Here's what top showed while the tabs were loading:
top - 18:04:22 up 388 days, 12:58,  1 user,  load average: 4.86, 1.71, 0.61
Tasks:  68 total,   1 running,  67 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.0% us,  1.0% sy,  0.0% ni,  0.0% id, 96.7% wa,  0.3% hi,  0.0% si
Mem:    507792k total,   502728k used,     5064k free,     7456k buffers
Swap:  1020088k total,   103036k used,   917052k free,    73688k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6557 wwwrun    15   0  106m  66m 8880 D  0.7 13.5   2:31.24 httpd2-prefork
14225 wwwrun    15   0 89224  57m 8672 D  0.7 11.5   1:00.90 httpd2-prefork
14443 wwwrun    15   0  101m  66m 8672 D  0.3 13.3   1:16.92 httpd2-prefork
20251 wwwrun    15   0 87636  56m 8932 D  0.3 11.4   1:05.53 httpd2-prefork
27381 wwwrun    18   0 71996  43m 8648 D  0.3  8.7   0:08.97 httpd2-prefork
27382 wwwrun    18   0 70744  41m 8628 D  0.3  8.4   0:04.22 httpd2-prefork
What's interesting is the percentage of time the CPU spends waiting (96.7% wa)…
follow-up: 25 comment:24 by , 15 years ago
Milestone: | 0.11.6 → 0.11.7 |
---|
Replying to anonymous:
Don't know if it is any help, but since I reported this in the first place, and we haven't upgraded the server yet (not that easy…):
My issues seem to be related to opening rather large changesets,
This is a known problem with Genshi, see explanations in ticket:7029#comment:1.
You may want to update to Genshi's trunk, which should be a bit faster (and that's an "easy" update). Still, upgrading from whatever Trac 0.11.x you have to 0.11.6dev (or wait a bit for the 0.11.6 release proper) shouldn't be a problem, no? Or if it is a problem, then it's our problem and we're interested to know what we did that prevents you from upgrading from one minor version to the next ;-)
or when I open a few in separate tabs. But it's really hard to pinpoint… Here's what top showed while the tabs were loading: … What's interesting is the percentage of time the CPU spends waiting…
As usual, it would make sense to hook into those processes with gdb and see what's going on, otherwise it's a bit too hard to guess…
For the issue discussed here in comment:18, we've not seen that happen again on t.e.o, so there's no real way to troubleshoot that either.
On hold.
follow-up: 26 comment:25 by , 15 years ago
Replying to cboos:
You may want to update to Genshi's trunk, which should be a bit faster (and that's an "easy" update). Still, upgrading from whatever Trac 0.11.x you have to 0.11.6dev (or wait a bit for the 0.11.6 release proper) shouldn't be a problem, no? Or if it is a problem, then it's our problem and we're interested to know what we did that prevents you from upgrading from one minor version to the next ;-)
I meant the upgrade you suggested here: #comment:2
Our trac is always up to date, and upgrading works just fine ;)
As usual, it would make sense to hook into those processes with gdb and see what's going on, otherwise it's a bit too hard to guess…
I'll try doing that!
comment:26 by , 15 years ago
Replying to cmarschalek:
Replying to cboos: I meant the upgrade you suggested here: #comment:2
Our trac is always up to date, and upgrading works just fine ;)
Ah great, thanks for the clarification!
But still, don't expect the best performance if you stick to using SQLite 2… Note that we removed support for it in 0.12, and 0.11.6 will give you a deprecation warning.
The upgrade is really easy (PySqlite#UpgradingSQLitefrom2.xto3.x) and installing the new pysqlite bindings should be easy as well.
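A quick way to double-check which bindings a given Python interpreter will pick up (a small Python 2 sketch, assuming pysqlite2 is what you installed):

try:
    from pysqlite2 import dbapi2 as sqlite   # standalone pysqlite 2.x
except ImportError:
    import sqlite3 as sqlite                 # bundled with Python 2.5+

print "pysqlite:", sqlite.version
print "SQLite:  ", sqlite.sqlite_version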
mod_python < 3.2 is possibly what gives you the main performance trouble, after all.
Installing mod_wsgi is also really straightforward (./configure --with-apxs=/yourapacheinstall/bin/apxs; make install), and it can coexist with your legacy mod_python apps, if any.
As usual, it would make sense to hook into those processes with gdb and see what's going on, otherwise it's a bit too hard to guess…
I'll try doing that!
comment:27 by , 15 years ago
I always planned on upgrading the whole server to a newer OS version, but I'll try upgrading the components as you suggested. In the meantime, here's a backtrace from one of the httpd2-prefork processes. Due to a lack of debugging information the backtrace is a little sparse, but SQLite seems to be a problem (we have >17000 changesets, btw).
(gdb) bt
#0  0xffffe410 in ?? ()
#1  0xbfffcbb8 in ?? ()
#2  0x00000400 in ?? ()
#3  0x095e8400 in ?? ()
#4  0x40349e93 in __read_nocancel () from /lib/tls/libpthread.so.0
#5  0x4249aa63 in sqliteOsRead () from /usr/lib/libsqlite.so.0
#6  0x4249d1ac in sqlitepager_get () from /usr/lib/libsqlite.so.0
#7  0x4248666e in moveToChild () from /usr/lib/libsqlite.so.0
#8  0x42486799 in moveToLeftmost () from /usr/lib/libsqlite.so.0
#9  0x424b2898 in sqliteVdbeExec () from /usr/lib/libsqlite.so.0
#10 0x4248ef90 in sqliteExec () from /usr/lib/libsqlite.so.0
#11 0x424a0806 in sqliteParser () from /usr/lib/libsqlite.so.0
#12 0x424a9461 in sqliteRunParser () from /usr/lib/libsqlite.so.0
#13 0x42499d51 in sqliteMain () from /usr/lib/libsqlite.so.0
#14 0x424623c9 in _con_execute () from /usr/lib/python2.3/site-packages/_sqlite.so
#15 0x4061d293 in PyCFunction_Call () from /usr/lib/apache2/mod_python.so
#16 0x4064be34 in eval_frame () from /usr/lib/apache2/mod_python.so
#17 0x4064d6ae in PyEval_EvalCodeEx () from /usr/lib/apache2/mod_python.so
#18 0x4064bdce in eval_frame () from /usr/lib/apache2/mod_python.so
#19 0x4064d6ae in PyEval_EvalCodeEx () from /usr/lib/apache2/mod_python.so
#20 0x4064bdce in eval_frame () from /usr/lib/apache2/mod_python.so
#21 0x4064d6ae in PyEval_EvalCodeEx () from /usr/lib/apache2/mod_python.so
#22 0x4064bdce in eval_frame () from /usr/lib/apache2/mod_python.so
#23 0x4064d6ae in PyEval_EvalCodeEx () from /usr/lib/apache2/mod_python.so
#24 0x4064bdce in eval_frame () from /usr/lib/apache2/mod_python.so
#25 0x4064d6ae in PyEval_EvalCodeEx () from /usr/lib/apache2/mod_python.so
#26 0x4064bdce in eval_frame () from /usr/lib/apache2/mod_python.so
#27 0x4064d050 in eval_frame () from /usr/lib/apache2/mod_python.so
#28 0x4064d050 in eval_frame () from /usr/lib/apache2/mod_python.so
#29 0x4064d050 in eval_frame () from /usr/lib/apache2/mod_python.so
#30 0x4064d050 in eval_frame () from /usr/lib/apache2/mod_python.so
#31 0x4064d050 in eval_frame () from /usr/lib/apache2/mod_python.so
#32 0x4064d050 in eval_frame () from /usr/lib/apache2/mod_python.so
#33 0x4064d6ae in PyEval_EvalCodeEx () from /usr/lib/apache2/mod_python.so
#34 0x4068ca5a in function_call () from /usr/lib/apache2/mod_python.so
#35 0x40603077 in PyObject_Call () from /usr/lib/apache2/mod_python.so
#36 0x4060acab in instancemethod_call () from /usr/lib/apache2/mod_python.so
#37 0x40603077 in PyObject_Call () from /usr/lib/apache2/mod_python.so
#38 0x4060394f in PyObject_CallMethod () from /usr/lib/apache2/mod_python.so
#39 0x405ffaf4 in python_handler () from /usr/lib/apache2/mod_python.so
#40 0x08076ce8 in ap_run_handler ()
#41 0x080a11a0 in num_request_notes ()
#42 0x080bc690 in ?? ()
#43 0x080a24a8 in __JCR_LIST__ ()
#44 0x08c326c0 in ?? ()
#45 0x08c326c0 in ?? ()
#46 0xbfffeeb8 in ?? ()
#47 0x0807a537 in ap_invoke_handler ()
Previous frame inner to this frame (corrupt stack?)
(gdb)
comment:28 by , 15 years ago
I'm not sure the above backtrace is indicative of that problem; you may simply be experiencing the poor handling of concurrent queries that SQLite / PySqlite used to have.
Upgrading to a recent SQLite 3, Pysqlite 2.5.x and the "eager cursor" behavior of Trac 0.11.6dev (#3446) would certainly be a win for you.
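For what it's worth, the "eager cursor" idea boils down to a wrapper along these lines (a simplified sketch of the concept, not Trac's actual implementation):

class EagerCursor(object):
    """Fetch the whole result set as soon as the query runs, so the
    underlying SQLite cursor (and its shared lock) is released right away
    instead of being held while callers slowly iterate over the rows."""

    def __init__(self, cursor):
        self.cursor = cursor
        self.rows = []
        self.pos = 0

    def execute(self, sql, args=()):
        self.cursor.execute(sql, args)
        self.rows = self.cursor.fetchall()  # materialize everything up front
        self.pos = 0

    def fetchone(self):
        if self.pos >= len(self.rows):
            return None
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def fetchall(self):
        rest = self.rows[self.pos:]
        self.pos = len(self.rows)
        return rest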
comment:29 by , 15 years ago
Seems like upgrading sqlite and pysqlite has boosted performance a lot (and was quite easy). Now I wonder how much the "eager cursor" in 0.11.6 will help.
Thanks for now!
comment:30 by , 15 years ago
What's also interesting is that subsequent clicks on the same changeset load a lot faster than the initial click!
comment:31 by , 15 years ago
Description: | modified (diff) |
---|---|
Milestone: | 0.11.7 → not applicable |
Ok, it hasn't happened in a while, and as it's not something we can reproduce at will, I'll move that ticket out of sight…
comment:32 by , 15 years ago
Milestone: | not applicable → next-minor-0.12.x |
---|
Replying to cboos:
From eblot's feedback, it seems that the problem was fixed by the patch in comment:13.
… reopen if the problem reappears
Avoiding the call to apr_terminate was not enough. What actually happens is that there's a deadlock between two apr_pool_destroy calls.
(gdb) bt
#0  0x00007f6b21f54b04 in __lll_lock_wait () from /lib/libpthread.so.0
#1  0x00007f6b21f501a0 in _L_lock_102 () from /lib/libpthread.so.0
#2  0x00007f6b21f4faae in pthread_mutex_lock () from /lib/libpthread.so.0
#3  0x00007f6b168c9b22 in apr_thread_mutex_lock () from /usr/lib/libapr-1.so.0
#4  0x00007f6b168cbca6 in apr_pool_destroy () from /usr/lib/libapr-1.so.0
#5  0x00007f6b168cbc4f in apr_pool_destroy () from /usr/lib/libapr-1.so.0
#6  0x00007f6b168cb874 in apr_pool_terminate () from /usr/lib/libapr-1.so.0
#7  0x00007f6b168cd13c in apr_terminate () from /usr/lib/libapr-1.so.0
#8  0x00007f6b2157f6ad in exit () from /lib/libc.so.6
#9  0x00007f6b2156746d in __libc_start_main () from /lib/libc.so.6
#10 0x0000000000413ca9 in _start ()
(gdb) info threads
  5 Thread 0x43cc6950 (LWP 20364)  0x00007f6b21f542f1 in sem_wait () from /lib/libpthread.so.0
  4 Thread 0x434c5950 (LWP 20365)  0x00007f6b21f542f1 in sem_wait () from /lib/libpthread.so.0
  3 Thread 0x44cc8950 (LWP 20366)  0x00007f6b168cb635 in ?? () from /usr/lib/libapr-1.so.0
  2 Thread 0x454c9950 (LWP 24553)  0x00007f6b21f542f1 in sem_wait () from /lib/libpthread.so.0
  1 Thread 0x7f6b223776e0 (LWP 13309)  0x00007f6b21f54b04 in __lll_lock_wait () from /lib/libpthread.so.0
(gdb) thread apply 3 bt

Thread 3 (Thread 0x44cc8950 (LWP 20366)):
#0  0x00007f6b168cb635 in ?? () from /usr/lib/libapr-1.so.0
#1  0x00007f6b168cbd36 in apr_pool_destroy () from /usr/lib/libapr-1.so.0
#2  0x00007f6b15f133c4 in ?? () from /var/lib/python-support/python2.5/libsvn/_core.so
#3  0x0000000000491052 in PyEval_EvalFrameEx ()
#4  0x00000000004920e2 in PyEval_EvalFrameEx ()
#5  0x00000000004927cd in PyEval_EvalCodeEx ()
#6  0x00000000004dd4c2 in ?? ()
#7  0x0000000000418c33 in PyObject_Call ()
#8  0x000000000041fb08 in ?? ()
#9  0x0000000000418c33 in PyObject_Call ()
#10 0x000000000048a852 in PyEval_CallObjectWithKeywords ()
#11 0x0000000000422c3b in ?? ()
#12 0x0000000000446f1f in PyDict_DelItem ()
#13 0x0000000000420854 in ?? ()
#14 0x000000000044a719 in PyObject_SetAttr ()
#15 0x000000000048df2a in PyEval_EvalFrameEx ()
#16 0x00000000004920e2 in PyEval_EvalFrameEx ()
#17 0x00000000004927cd in PyEval_EvalCodeEx ()
#18 0x00000000004dd4c2 in ?? ()
#19 0x0000000000418c33 in PyObject_Call ()
#20 0x000000000041fb08 in ?? ()
#21 0x0000000000418c33 in PyObject_Call ()
#22 0x000000000048a852 in PyEval_CallObjectWithKeywords ()
#23 0x0000000000422c3b in ?? ()
#24 0x0000000000446d23 in ?? ()
#25 0x000000000045f47b in ?? ()
#26 0x0000000000446d23 in ?? ()
#27 0x000000000045f47b in ?? ()
#28 0x00000000004dbcd3 in ?? ()
#29 0x00000000004db02a in ?? ()
#30 0x000000000043b84e in ?? ()
#31 0x000000000043bd90 in ?? ()
#32 0x000000000046df6b in ?? ()
#33 0x0000000000418c33 in PyObject_Call ()
#34 0x000000000048f582 in PyEval_EvalFrameEx ()
#35 0x00000000004920e2 in PyEval_EvalFrameEx ()
#36 0x00000000004920e2 in PyEval_EvalFrameEx ()
#37 0x00000000004920e2 in PyEval_EvalFrameEx ()
#38 0x00000000004927cd in PyEval_EvalCodeEx ()
#39 0x0000000000490cf3 in PyEval_EvalFrameEx ()
#40 0x00000000004927cd in PyEval_EvalCodeEx ()
#41 0x0000000000490cf3 in PyEval_EvalFrameEx ()
#42 0x00000000004920e2 in PyEval_EvalFrameEx ()
#43 0x00000000004920e2 in PyEval_EvalFrameEx ()
#44 0x00000000004920e2 in PyEval_EvalFrameEx ()
#45 0x00000000004920e2 in PyEval_EvalFrameEx ()
#46 0x00000000004927cd in PyEval_EvalCodeEx ()
#47 0x00000000004dd4c2 in ?? ()
#48 0x0000000000418c33 in PyObject_Call ()
#49 0x000000000041fb08 in ?? ()
#50 0x0000000000418c33 in PyObject_Call ()
#51 0x000000000048a852 in PyEval_CallObjectWithKeywords ()
#52 0x00000000004be71d in ?? ()
#53 0x00007f6b21f4e3ea in start_thread () from /lib/libpthread.so.0
#54 0x00007f6b2162fcbd in clone () from /lib/libc.so.6
#55 0x0000000000000000 in ?? ()
comment:33 by , 15 years ago
Interesting, I just got the following backtrace + crash on Windows, when stopping tracd with CTRL-C:
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "C:\Dev\Python261\lib\atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "C:\Dev\Python261\lib\site-packages\svn\core.py", line 75, in _cleanup_application_pool
    application_pool.destroy()
  File "C:\Dev\Python261\lib\site-packages\libsvn\core.py", line 6119, in destroy
    self._svn_swig_py_clear_application_pool()
KeyboardInterrupt
Error in sys.exitfunc:
Traceback (most recent call last):
  File "C:\Dev\Python261\lib\atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "C:\Dev\Python261\lib\site-packages\svn\core.py", line 75, in _cleanup_application_pool
    application_pool.destroy()
  File "C:\Dev\Python261\lib\site-packages\libsvn\core.py", line 6119, in destroy
    self._svn_swig_py_clear_application_pool()
KeyboardInterrupt
That's the first time I've seen this, so it's a rare event given the number of times I've done it over the past couple of years, but maybe there's a connection with the first backtrace seen in comment:32 above.
comment:34 by , 15 years ago
Has nobody paid attention to the client side? Just try to load the page with "nc" (netcat).
In my case (I guess it is not the same as this bug, but it may be related on the client side) I noticed very slow page loads on all pages of Trac 0.12b1, with everything else (Python, SQLite, etc.) up to date. Very slow page loading in all sections, no matter where. I first assumed it was a slowdown in Trac.
I'm using trac.fcg on an Apache web server with mod_fastcgi. It was working perfectly on my old server (software-wise; the hardware is new: 4 GB RAM, dual channel, and so on). I just did a software upgrade (installed Ubuntu 9.10 server) and reconfigured the whole system for the new distro (I had been using LFS, a roughly 5-year-old installation).
I noticed that loading trac.fcg pages was very slow from my own computer!
I tried top, gdb, and a lot of other approaches to figure out what was happening. top showed 0% CPU load most of the time, and the global system load was near 0. So where is the problem? The system has plenty of power to serve the requests. Maybe a network problem… I tested the communication with the spawned fastcgi processes and everything was OK and fast enough.
So I decided to load that page with netcat:

nc servidor 80
GET /page/trac.fcg/proyect

What I got was the page I was expecting, but very fast! So the problem was my client browser: Firefox. I tested with Epiphany and yes, Epiphany loaded it much faster. Continuing the tests, I noticed Firefox halting on:
<script type="text/javascript">/*<![CDATA[*/
  jQuery(document).ready(function($) {
    var desc = $("form.newticket textarea");
    desc.text(desc.eq(0).text()
      .replace(/==== System Information ====\s+/m,
               "User Agent was: `" + navigator.userAgent + "`\n\n$&")
      .replace(/\|\|\s+==== Enabled Plugins ====/m,
               "||\n|| '''`jQuery`''' || `" + $().jquery + "` $&"));
  });
/*]]>*/</script>
Hmm, that may be the problem.
I will try updating jQuery, i.e. including an updated version in my own Trac installation. Why am I writing this here? Because you need to pay attention to the client side too. Some slowdowns may be caused by client-side scripts, so those scripts need to be fixed as well. And why not rewrite some portions to be more web-browser friendly.
comment:35 by , 15 years ago
I investigated my case on the client side, and it was a problem specific to Firefox (tested with Google Chrome, Epiphany, and netcat on the command line, all working fine). I haven't tested any more browsers, but I found the problem. In fact, it is client-side but also server-related: the combination of Apache's deflate module and Trac with fastcgi does not work properly in Firefox. I guess it is because Firefox is the only web browser I have that supports gzip-encoded data. I used Wireshark to discover that, and found that Firefox accepts gzip encoding and receives some files, such as CSS, gzip-encoded from the server instead of in plain-text form. Shrinking a file to about 4156 bytes instead of its actual 17K is a great choice, but:
I recommend testing the combination of mod_deflate with mod_fastcgi and Trac's fcgi frontend, because that was the cause of my slow Trac page loads in Firefox. As a workaround, avoid using that combination.
follow-up: 37 comment:36 by , 14 years ago
can someone help me… I have a problem with my Trac: the application becomes unreachable, so I need to restart it. Can you please help me troubleshoot this? When I checked the syslog, it shows "0 unreachable objects found".
comment:37 by , 14 years ago
Replying to anonymous:
can someone help me …
Hijacking a random ticket is definitely not the proper way to get some help. Please read the InstallationIssue page.
comment:38 by , 10 years ago
Milestone: | next-minor-0.12.x → next-stable-1.0.x |
---|
comment:39 by , 9 years ago
Owner: | removed |
---|---|
Status: | reopened → new |
comment:40 by , 9 years ago
Keywords: | performance added; slow removed |
---|
comment:41 by , 9 years ago
Priority: | high → normal |
---|
Issue not reported in over 5 years, workaround available: changed priority from high to normal.
comment:42 by , 8 years ago
Milestone: | next-stable-1.0.x → next-stable-1.2.x |
---|
Moved ticket assigned to next-stable-1.0.x since maintenance of 1.0.x is coming to a close. Please move the ticket back if it's critical to fix on 1.0.x.
comment:43 by , 5 years ago
Milestone: | next-stable-1.2.x → next-stable-1.4.x |
---|
comment:44 by , 4 years ago
Milestone: | next-stable-1.4.x |
---|---|
Resolution: | → worksforme |
Status: | new → closed |
I'm going to go ahead and close this one since most or all of the discussion is quite old. We can create new tickets if we have performance issues to address.
forgot to set the version