Edgewall Software

Ticket #4399 (new defect)

Opened 21 months ago

Last modified 3 days ago

trac.fcgi process memory usage increases with HTTP hits

Reported by: Pistos
Owned by: jonas
Priority: high
Milestone: not applicable
Component: general
Version: 0.10.3
Severity: critical
Keywords: memory
Cc: docwhat@…

Description

# ps aux | grep trac | grep -v grep
apache   28347  1.0  1.8  51164 13816 ?        S    22:39   0:01 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.0  1.8  51164 13816 ?        S    22:39   0:01 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.2  1.8  51752 14420 ?        S    22:39   0:01 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.4  1.9  51992 14600 ?        S    22:39   0:02 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.6  1.9  52104 14760 ?        S    22:39   0:02 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.7  1.9  52320 14848 ?        S    22:39   0:02 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.8  1.9  52320 14856 ?        S    22:39   0:02 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi
# ps aux | grep trac | grep -v grep
apache   28347  1.8  1.9  52320 14944 ?        S    22:39   0:03 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi

The output above shows the memory usage of the trac.fcgi process growing after every page refresh of a subtree in my svn repository. I only noticed this behaviour two or three weeks ago. /var/log/messages also contains several lines showing the Linux OOM killer killing trac.fcgi processes that I had not noticed in time to SIGKILL myself; these kernel kills happen roughly once every one or two days.
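To put a number on the growth in the ps output above, the samples can be parsed and compared. This is only a rough sketch: the helper names `parse_ps_line` and `rss_growth` are made up for illustration, and it assumes the standard `ps aux` column order (USER, PID, %CPU, %MEM, VSZ, RSS, ...), with VSZ and RSS reported in KB.

```python
# Sketch: quantify RSS growth from `ps aux` samples like those in this ticket.
# parse_ps_line and rss_growth are hypothetical helper names.

def parse_ps_line(line):
    """Return (pid, vsz_kb, rss_kb) from one `ps aux` output line."""
    fields = line.split(None, 10)  # the 11th field is the full command line
    return int(fields[1]), int(fields[4]), int(fields[5])

def rss_growth(lines):
    """Return (last RSS - first RSS, list of RSS samples) in KB."""
    samples = [parse_ps_line(l)[2] for l in lines if l.strip()]
    return samples[-1] - samples[0], samples

# First and last samples copied from the description above.
SAMPLES = [
    "apache   28347  1.0  1.8  51164 13816 ?        S    22:39   0:01 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi",
    "apache   28347  1.8  1.9  52320 14944 ?        S    22:39   0:03 /usr/bin/python /var/www/localhost/cgi-bin/trac.fcgi",
]

growth_kb, rss_samples = rss_growth(SAMPLES)
print("RSS grew by %d KB between the first and last sample" % growth_kb)
```

For the eight samples shown, RSS goes from 13816 KB to 14944 KB, i.e. about 1.1 MB over a handful of page refreshes.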

At first I thought this had to do with the svn problems that were fixed from 0.10.2 -> 0.10.3, but I just upgraded, and the behaviour remains. The memory also increases when refreshing a ticket list.

I periodically upgrade the packages on my system, so one of those upgrades might have caused this, e.g. a Python upgrade, a library upgrade, or some such.

This is on a Gentoo server (kernel version 2.6.17), running FastCGI (2.4.0) under Apache (2.0.58). Python version 2.4.3.

Change History

Changed 21 months ago by Pistos

  • summary changed from trac.fcgi process memory usage increases with repo browser hits to trac.fcgi process memory usage increases with HTTP hits

Changed 19 months ago by anonymous

Any thoughts on this issue? It remains a problem up to today, and I am having to SIGKILL trac processes at least once a day.

Changed 19 months ago by Pistos

Looks like ticket:4081 is related.

Changed 19 months ago by cboos

  • keywords memory added
  • severity changed from normal to major
  • milestone set to none

Is the PySqlite db backend involved in this case? If yes, what versions of the bindings and the sqlite library itself are in use?

Changed 19 months ago by Pistos

Yes, it's an SQLite backend. sqlite-3.3.5, pysqlite-2.3.1.

Changed 19 months ago by Pistos

It may also be worth mentioning that this was not always the case. I've run Trac in the past without this issue. I think that was with the 0.9 series.

Changed 19 months ago by Pistos

Further data: when I only SIGTERM the process instead of SIGKILLing it, its memory grows at an alarming rate, in the area of 1-2 MB per second.

Changed 18 months ago by Pistos

I've used an SQLite to PostgreSQL Trac converter script to change from SQLite to PostgreSQL as my backend.

It doesn't help the problem. I am still having to SIGKILL two to five times a day (or else trac.fcgi processes consume 15-40% of my RAM). This is very annoying and inconvenient...

Changed 17 months ago by Pistos

Switching to .cgi and/or re-emerging trac with the postgres USE flag (in Gentoo) seems to have made the problem go away. So, it looks like there may be a problem with FastCGI, my FastCGI settings, and/or Trac's usage of FastCGI.

Changed 17 months ago by mgood

Memory will never accumulate when using CGI since it starts a new process for each request, so the processes are very short-lived.
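The difference can be illustrated with a toy handler, as a sketch only. Everything here is hypothetical (the `HANDLER` source, `_cache`, `long_lived`, `per_request`) and none of it is Trac or FastCGI code; it merely assumes some module-level state that is never cleared. In one long-lived process that state keeps growing across requests, while a fresh process per request starts from empty state every time.

```python
# Sketch: why per-request processes (CGI) cannot accumulate state, while a
# long-lived process (FastCGI) can. All names below are hypothetical.
import subprocess
import sys

HANDLER = """
_cache = []  # stand-in for module-level state that is never cleared

def handle(n):
    _cache.append('x' * 1024)
    return len(_cache)
"""

def long_lived(requests):
    """Serve all requests inside one interpreter, like a FastCGI worker."""
    ns = {}
    exec(HANDLER, ns)
    return [ns["handle"](i) for i in range(requests)]

def per_request(requests):
    """Start a fresh interpreter for every request, like plain CGI."""
    results = []
    for i in range(requests):
        script = HANDLER + "\nprint(handle(%d))" % i
        out = subprocess.run([sys.executable, "-c", script],
                             capture_output=True, text=True, check=True)
        results.append(int(out.stdout))
    return results

if __name__ == "__main__":
    print("long-lived :", long_lived(3))   # cache size keeps growing
    print("per-request:", per_request(3))  # cache size resets each time
```

This is also why the switch to .cgi hides the problem rather than fixing it: whatever is being retained is simply thrown away with the process.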

Changed 4 months ago by esm-trac@…

Seeing the same behavior here as well, with Trac 0.10.4 running under lighttpd as a FastCGI process: a single refresh of any Trac page grows the RSS of the process by anywhere from 80 to 256 KB.

This is new behavior since switching to FastCGI; on previous hosting, I was using mod_python without any noticeable issues.

Changed 3 months ago by docwhat@…

  • cc docwhat@… added

Changed 3 months ago by Joschi

  • priority changed from normal to high
  • severity changed from major to critical

Yep, same problem here... sometimes the fcgi process goes crazy... will try to switch back to cgi...

Changed 6 weeks ago by kiniry@…

We are seeing similar problems on our five Tracs running on Apache 2 and OS X Leopard Server.

Additionally, approximately half a dozen FCGIs per day stop responding and consume as much CPU as they can. E.g.,

kind:BONc# peek fcgi
_www     89173  52.0  0.6   110604  27140   ??  R     4:28PM 360:08.64 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/csi-trac.fcgi
_www     11346  48.7  0.4    98716  16664   ??  R    10:41PM 353:11.91 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/csi-trac.fcgi
_www     11624  48.1  0.6   103268  23136   ??  R    11:00PM 329:11.37 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/mobius-trac.fcgi
_www     89431  45.0  0.6   101188  23080   ??  R     4:41PM 605:35.12 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/mobius-trac.fcgi
_www     18227  43.7  2.1   164836  88048   ??  R     6:42AM  79:45.42 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/mobius-trac.fcgi
_www     89167  41.8  0.8   113012  33124   ??  R     4:28PM 394:37.74 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/mobius-trac.fcgi
_www     11345  40.9  0.4    98912  16876   ??  R    10:41PM 356:56.74 /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python /Volumes/Data/web/CGI-Executables/csi-trac.fcgi

Tracing them indicates that they are all blocked on semaphores inside of Python.

I suspect that we are witnessing a concurrency issue (perhaps a livelock) related to other mod_python concurrency bugs found here. As we are running one or more FCGI processes per Trac, we were hoping that switching from mod_python to FCGI would avoid the concurrency problems that have bitten us in the past, but perhaps we were wrong.

Are requests shuttled through Apache to an FCGI application serialized in some fashion? How does Apache know when to spawn new FCGI processes for a given ScriptAlias? I am new to the whole FCGI API and am digging into it now, but some of these simple questions have not yet been answered in my initial reading.

Joe Kiniry

Changed 3 days ago by jon

I'm having this problem as well. Is there anything I can do to debug this and get it fixed? I'm not sure how other sites can handle using the fastcgi interface, with how the memory gets used...
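One generic way to start debugging a leak like this is to compare counts of live Python objects by type before and after a batch of requests, using the standard `gc` module. The following is a minimal sketch, not something taken from Trac: `object_counts`, `growth`, and the `Leaky` class are hypothetical names, where to call it inside a running trac.fcgi process is left open, and it is written for current Python (the Python 2.4/2.5 interpreters mentioned in this ticket lack `collections.Counter`, so it would need adapting there). It also only sees objects tracked by the garbage collector, so growth inside C extensions would not show up.

```python
# Sketch: find which object types are accumulating between two points in time.
# object_counts, growth and Leaky are hypothetical names for illustration.
import gc
from collections import Counter

def object_counts():
    """Count live gc-tracked objects by type name, after a full collection."""
    gc.collect()
    return Counter(type(o).__name__ for o in gc.get_objects())

def growth(before, after, top=10):
    """Return the types whose counts increased, largest increase first."""
    diff = after - before  # Counter subtraction keeps only positive counts
    return diff.most_common(top)

if __name__ == "__main__":
    class Leaky(object):
        """Stand-in for whatever object is being retained."""

    retained = []
    before = object_counts()
    retained.extend(Leaky() for _ in range(500))  # simulate a leak
    after = object_counts()
    for name, count in growth(before, after, top=5):
        print("%-20s +%d" % (name, count))
```

If one type keeps climbing with every request, `gc.get_referrers()` on a sample instance can then help show what is holding on to it.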
