Opened 14 years ago

Closed 14 years ago

Last modified 10 years ago

#2252 closed defect (worksforme)

FastCGI gets a timeout but renders correct pages afterwards

Reported by: thomas.jachmann@… Owned by: Matthew Good
Priority: normal Milestone:
Component: general Version: 0.9.4
Severity: normal Keywords:
Cc: thomas.jachmann@… Branch:
Release Notes:
API Changes:

Description

From time to time (every 5 to 10 requests), trac hangs. This is the output of apache's error log:

[Thu Oct 20 21:33:01 2005] [error] [client ...] FastCGI: comm with server "/usr/share/trac/cgi-bin/trac.fcgi" aborted: idle timeout (30 sec), referer: [...]
[Thu Oct 20 21:33:01 2005] [error] [client ...] FastCGI: server "/usr/share/trac/cgi-bin/trac.fcgi" stderr: Traceback (most recent call last):, referer: [...]
[Thu Oct 20 21:33:01 2005] [error] [client ...] FastCGI: server "/usr/share/trac/cgi-bin/trac.fcgi" stderr:   File "/usr/lib/python2.3/site-packages/trac/web/_fcgi.py", line 567, in run, referer: [...]
[Thu Oct 20 21:33:01 2005] [error] [client ...] FastCGI: server "/usr/share/trac/cgi-bin/trac.fcgi" stderr:     protocolStatus, appStatus = self.server.handler(self), referer: [...]
[Thu Oct 20 21:33:01 2005] [error] [client ...] FastCGI: server "/usr/share/trac/cgi-bin/trac.fcgi" stderr: TypeError: unpack non-sequence, referer: [...]

The first line appears only when the request hangs; the remaining four lines are written to the error log for every request, even those that do not hang.
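For readers unfamiliar with the message: `TypeError: unpack non-sequence` is what Python 2.3 reports when a tuple assignment such as `protocolStatus, appStatus = ...` receives a value that is not a sequence. The sketch below shows one way this could arise, assuming (the ticket does not confirm this) that the handler returned `None` instead of a status pair. `broken_handler`, `good_handler` and `run_request` are hypothetical names, not part of `_fcgi.py`, and newer Python versions word the error differently.

```python
def broken_handler(request):
    # Hypothetical handler that finishes without returning a status pair.
    return None


def good_handler(request):
    # Hypothetical handler returning a (protocolStatus, appStatus) pair.
    return (0, 0)


def run_request(handler, request=None):
    """Mimic the failing line: unpack the handler's return value."""
    try:
        protocol_status, app_status = handler(request)
    except TypeError as exc:
        # Unpacking None raises TypeError; report it instead of crashing.
        return "TypeError: %s" % exc
    return (protocol_status, app_status)


ok_result = run_request(good_handler)
bad_result = run_request(broken_handler)
```

Here `ok_result` is the status pair, while `bad_result` is a string describing the unpacking failure.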

This seems to be an error with trac.fcgi. Parts of the page are rendered, then it hangs for 30 seconds (the FastCGI timeout). Then, the rest of the page is passed to the browser and gets rendered. Nothing is broken. From the browser's view, it just looks like the server takes a break in the middle of the page. I just don't understand the timeout - this seems as if trac.fcgi execution gets cancelled by FastCGI, but still all content gets back to the browser in the end.

The following is my current apache configuration for the virtual host running trac. I run several projects off the root of the virtual host, avoiding the trac.fcgi script in the URL by using the ScriptAliasMatch directive. This is all taken from trac's documentation.

FastCgiConfig -initial-env TRAC_ENV_PARENT_DIR=/var/trac/ -idle-timeout 1
<VirtualHost *:80>
        ServerName [...]
        DocumentRoot /usr/share/trac/htdocs

        <Directory "/usr/share/trac/htdocs">
                Options Indexes MultiViews
                AllowOverride None
                Order allow,deny
                Allow from all
        </Directory>

        AliasMatch ^/[^/]+/chrome/common(.*) /usr/share/trac/htdocs$1
        ScriptAliasMatch ^(.*) /usr/share/trac/cgi-bin/trac.fcgi$1
</VirtualHost>

As you can see, I avoided the lag by just reducing the timeout of FastCGI to one second. This way, users don't notice that trac.fcgi gets a timeout. But with increasing load on the server, one second might not be sufficient.

I use:

  • Fedora Core 3
  • Apache 2.0.53
  • Python 2.3.4
  • Trac 0.9b2

Attachments (0)

Change History (17)

comment:1 by thomas.jachmann@…, 14 years ago

Version: 0.8.4 → 0.9b2

comment:2 by Jonas Borgström, 14 years ago

Resolution: duplicate
Status: new → closed

Duplicate of #2106.

comment:3 by Matthew Good, 14 years ago

Resolution: duplicate
Status: closed → reopened

No, this is a distinct issue from #2106, since [2425] should have already fixed that one.

I have this issue on my production server, but I've been unable to reproduce it elsewhere to do much testing. However, it seems like it must be an error within the fcgi module, not the Trac code that calls it, since I've verified that the Trac handler returns normally. I believe that somehow the buffers are not being flushed properly.

comment:4 by anonymous, 14 years ago

I think mgood's right, since the machine isn't experiencing any load during the lag. It just sits there waiting until the timeout occurs. This also matches the fact that the content is completely rendered after the timeout - a flush might be issued on timeout.
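The unflushed-buffer hypothesis can be illustrated in general terms. This is only a sketch of how buffered output behaves, using Python's standard `io` module; it is not the code in `_fcgi.py`, and `RecordingSink` is a hypothetical stand-in for the connection to the web server.

```python
import io


class RecordingSink(io.RawIOBase):
    """Hypothetical raw stream that records every byte actually sent."""

    def __init__(self):
        self.received = b""

    def writable(self):
        return True

    def write(self, data):
        self.received += bytes(data)
        return len(data)


sink = RecordingSink()
out = io.BufferedWriter(sink, buffer_size=8192)

# The page fragment is smaller than the buffer, so it stays in memory.
out.write(b"<html>...rest of page")
before_flush = sink.received

# Only an explicit flush (or closing the stream) pushes it to the sink.
out.flush()
after_flush = sink.received
```

Until `flush()` is called, the sink has received nothing, which matches the symptom of a page that stalls and then completes once something finally forces the remaining data out.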

comment:5 by Matthew Good, 14 years ago

#2139 has been marked as a duplicate of this ticket.

comment:6 by fago, 14 years ago

i think, i am concerned by the same issue.

i'm using debian sarge with its apache2 (2.0.54) and python 2.3.5

basically everything works fine, however every 5th to 10th page load results in the described behaviour: no output until the fastcgi process times out. it seems to me that it happens most often with ticket overview/view pages.

the server load is up to 99% idle during the timeout. error log:

[Tue Dec 13 12:06:16 2005] [error] [client 193.170.48.58] FastCGI: comm with server "/var/www/tracwrapper.fcgi" aborted: idle timeout (5 sec)

(i'm just setting the environment for trac in tracwrapper.fcgi) however i don't get the python traceback?

comment:7 by fago, 14 years ago

sry, forgot to mention my trac version, i am using the latest stable release: 0.9.2

comment:8 by anonymous, 14 years ago

Version: 0.9b2 → 0.9.2

After upgrading to 0.9.2, I still have the same problem, although I get fewer error messages in apache's error log than with 0.9b2:

[Thu Dec 22 12:29:12 2005] [error] [client ...] FastCGI: comm with server "/usr/share/trac/cgi-bin/trac.fcgi" aborted: idle timeout (2 sec)

I still can't figure out what's going wrong. I also tried the -flush parameter to FastCGI and configuring the script as static FastCgiServer instead of dynamic, but both without any effect.

This keeps me from setting up a public trac instance for one of our open source projects, since we've got quite some traffic on the project's current site. Unfortunately, I'm not too familiar with python and wasn't lucky looking around in trac's code trying to find the cause.

comment:9 by anonymous, 14 years ago

Cc: thomas.jachmann@… added

comment:10 by Matthew Good, 14 years ago

Owner: changed from Jonas Borgström to Matthew Good
Status: reopened → new

Well, I think that the problem lies outside of the Trac code. I encountered the problem when we switched the FastCGI module since it was not compatible with the change to a BSD license. I've tried some debugging and verified that it stalls after Trac's FastCGI handler has exited. Unfortunately I can't reproduce the problem on my own system and haven't wanted to interrupt my production server to debug it. However, I think that the site should be unused this weekend and I may be able to take it down so that I can look into it.

comment:11 by thomas.jachmann <thomas.jachmann@…>, 14 years ago

I don't know if this is related, but sometimes I also get errors where the content of apache's "Internal Server Error" page is displayed somewhere within the page trac is about to generate, usually at the same spot. In the error log, I have the following:

[Tue Jan 03 17:30:38 2006] [error] [client ...] (104)Connection reset by peer: FastCGI: comm with server "/usr/share/trac/cgi-bin/trac.fcgi" aborted: read failed

AFAIK, this is usually written to the log when the server could not send data back to the browser because the connection was cut, e.g. the browser was closed before the page had been fully delivered. But that didn't happen here. Maybe this helps in finding the problem?

comment:12 by james@…, 14 years ago

I get a similar thing:

[Tue Mar 21 14:23:50 2006] [error] [client xxx] FastCGI: comm with server "/usr/share/trac/cgi-bin/trac.fcgi" aborted: idle timeout (30 sec), referer: https://blah/foo/wiki/QuickTortoiseUsage

Occasionally too I get 'read failed', and on and off the page rendered will include Apache's error page, or it also may just be truncated. It's a lottery, really.

I have no idea how to fix this - should I use a different connection method altogether? I'm using Trac 0.9.4 and FastCGI 2.4.2 on Debian Sarge with Apache 2.

comment:13 by Thomas Jachmann <thomas.jachmann@…>, 14 years ago

Version: 0.9.2 → 0.9.4

comment:14 by otto.hilska@…, 14 years ago

I had the same problem with Apache 2, mod_fastcgi and Trac 0.9.4. However, switching mod_fastcgi to mod_fcgid helped, so I guess this really isn't a Trac problem.

comment:15 by Thomas Jachmann <thomas.jachmann@…>, 14 years ago

Resolution: worksforme
Status: new → closed

OK, I also got this working, so I'd better close the ticket. Thanks Otto for the hint!

If anyone else is interested:

  1. download mod_fcgid from http://fastcgi.coremail.cn/download.htm and do make/make install
  2. on Fedora, put this into /etc/httpd/conf.d/fcgid.conf:
    LoadModule fcgid_module modules/mod_fcgid.so
    <IfModule mod_fcgid.c>
        AddHandler fcgid-script .fcgid
        SocketPath /tmp/fcgid/sock
        IPCCommTimeout 60
    </IfModule>
    
  3. start apache
  4. chmod -R 777 /tmp/fcgid/
  5. restart apache

The IPCCommTimeout is necessary since some reports and changeset views can run quite long. See the following URLs for any configuration hints:

comment:16 by anonymous, 10 years ago

"FastCgiConfig -initial-env TRAC_ENV_PARENT_DIR=/var/trac/ -idle-timeout 1"

I think you are doing the opposite of what you want to achieve. As I understand it, if the application doesn't produce any output within the idle timeout period then it will be aborted producing an error similar to:

[Sun Jun 28 20:07:54 2009] [error] [client x.x.x.x] FastCGI: comm with server "/var/www/fcgi-bin/test.fcgi" aborted: idle timeout (30 sec), referer: http://www.example.com/

From the documentation:

http://www.fastcgi.com/mod_fastcgi/docs/mod_fastcgi.html#FastCgiServer
http://www.fastcgi.com/mod_fastcgi/docs/mod_fastcgi.html#FastCgiConfig

-idle-timeout n (30 seconds)

The number of seconds of FastCGI application inactivity allowed before the request is aborted and the event is logged (at the error LogLevel). The inactivity timer applies only as long as a connection is pending with the FastCGI application. If a request is queued to an application, but the application doesn't respond (by writing and flushing) within this period, the request will be aborted. If communication is complete with the application but incomplete with the client (the response is buffered), the timeout does not apply.
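To make the quoted rule concrete, here is a deliberately simplified model of the inactivity timer. It is not mod_fastcgi's implementation: `first_abort_time` is a hypothetical helper that only looks at the gaps before each flushed write, and the timestamps are made-up example values.

```python
def first_abort_time(output_times, idle_timeout, start=0.0):
    """Return the time (seconds) at which the request would be aborted.

    output_times: sorted times at which the application wrote and flushed
    output. Returns None if no gap ever exceeds idle_timeout.
    """
    last_activity = start
    for t in output_times:
        if t - last_activity > idle_timeout:
            # The timer expired before this write arrived.
            return last_activity + idle_timeout
        last_activity = t
    return None


# Made-up request: output flushed at 0.5 s, 2.0 s and 2.5 s.
writes = [0.5, 2.0, 2.5]
with_default = first_abort_time(writes, idle_timeout=30)
with_one_second = first_abort_time(writes, idle_timeout=1)
```

Under this model the request survives a 30 second timeout but is aborted with `-idle-timeout 1`, because the 1.5 second pause between the first two writes exceeds the limit. A shorter timeout therefore shortens the visible stall but makes aborts of slow requests more likely.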

comment:17 by anonymous, 10 years ago

I just experienced something that sounds a lot like what is described in this ticket. The only difference I saw was no log messages (or very occasional fastcgi timeout messages).

I am on Ubuntu 9.04, with trac 0.11, apache2.

The problem for me was the way Ubuntu (and debian, don't know about fedora) arranges the apache configuration when installing using the package manager. The apache2.conf imports all module configuration first by loading, then by configuring (mods-enabled/*.load, then mods-enabled/*.conf), which ends with the opposite of what is described in TracFastCgi.

The key to fixing this for me was to switch the contents of mods-available/fastcgi.load and mods-available/fastcgi.conf, thus getting the desired load order. I guess that makes this a problem with individual distro packages? Restarting apache gets rid of the hang until timeout for me.
