Edgewall Software
Modify

Opened 17 years ago

Closed 14 years ago

Last modified 6 years ago

#3957 closed enhancement (fixed)

unhandled exception - socket

Reported by: james@… Owned by: Christian Boos
Priority: high Milestone: 0.11.6
Component: web frontend Version: devel
Severity: normal Keywords: fastcgi fcgi
Cc: wkornewald Branch:
Release Notes:
API Changes:
Internal Changes:

Description

can we handle this appropriately?

Unhandled exception in thread started by <bound method Connection.run of <trac.web._fcgi.Connection object at 0xbe0757ec>>
Traceback (most recent call last):
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 654, in run
    self.process_input()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 690, in process_input
    self._do_params(rec)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 789, in _do_params
    self._start_request(req)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 773, in _start_request
    req.run()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 582, in run
    self._flush()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 589, in _flush
    self.stdout.close()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 348, in close
    self._conn.writeRecord(rec)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 705, in writeRecord
    rec.write(self._sock)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 519, in _sendall
    sent = sock.send(data)
socket.error: (32, 'Broken pipe')

Attachments (0)

Change History (32)

comment:1 by Jonas Borgström, 17 years ago

Keywords: needinfo added

Can you give some brief instructions on how to reproduce this?

comment:2 by wkornewald, 17 years ago

Cc: wkornewald added

comment:3 by wkornewald, 17 years ago

Priority: highhighest

From what I've read this bug can be caused by a timed out MySQL connection. Maybe we could catch this exception and reconnect the DB when it happens? At least, I saw in a mailing list post that Rails' ActiveRecord is handling the exception that way.

It seems that this bug is related to #3369 and #3868. We're experiencing it on a shared host with Apache, FastCGI, and MySQL. Switching to sqlite is not a real option because our user login data is stored in a MySQL DB and I'm missing nice tools like phpmyadmin.

Could you please have a look at this bug? It's really annoying because sometimes pages aren't displayed, at all. Sometimes even reloading after that error fails. It also happens that I get an HTTP 500 error.

I'll also contact my hosting provider and ask whether they could update FastCGI (in case they are using an old version). Maybe that helps…

comment:4 by wkornewald <wkornewald@…>, 17 years ago

Interesting articles: http://chris.gaskett.com/articles/rails-keeping-fastcgi-alive/ http://work.alexyoung.org/archives/102/dreamhost-and-rails-500-errors

More info from DreamHost (Rails-specific, but maybe useful):

If you are using fcgi with Ruby on Rails, you might get what appear to be random 500 errors occasionally. Alex Young's workaround (http://work.alexyoung.org/archives/102/dreamhost-and-rails-500-errors), adapted from Gary Lin's workaround (http://thread.gmane.org/gmane.comp.lang.ruby.rails/33057) seems to resolve this. Basically, add the following code to your dispatch.fcgi, right before RailsFCGIHandler.process!

class RailsFCGIHandler private

def frao_handler(signal)

dispatcher_log :info, "asked to terminate immediately" dispatcher_log :info, "frao handler working its magic!" restart_handler(signal)

end alias_method :exit_now_handler, :frao_handler

end

comment:5 by Christian Boos, 17 years ago

Reformatting the Ruby code snippet from above:

class RailsFCGIHandler
  private
    def frao_handler(signal)
      dispatcher_log :info, "asked to terminate immediately"
      dispatcher_log :info, "frao handler working its magic!"
      restart_handler(signal)
    end
    alias_method :exit_now_handler, :frao_handler
end

comment:6 by wkornewald, 17 years ago

Maybe this Apache log is helpful (esp. the two errors):

[Wed Nov 29 00:29:35 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 16594) termination signaled
[Wed Nov 29 00:29:36 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 16594) terminated by calling exit with status '0'
[Wed Nov 29 00:30:36 2006] [warn] FastCGI: scheduled the restart of the last (dynamic) server "/home/user/public_html/trac" process: reached dynamicMaxClassProcs (5)
[Wed Nov 29 00:30:36 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (uid 32005, gid 32008) restarted (pid 19715)
[Wed Nov 29 00:34:38 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 19715) termination signaled
[Wed Nov 29 00:34:38 2006] [error] [client 217.12.250.67] (104)Connection reset by peer: FastCGI: comm with server "/home/user/public_html/trac" aborted: read failed
[Wed Nov 29 00:34:38 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 19715) terminated by calling exit with status '0'
[Wed Nov 29 00:34:38 2006] [error] [client 217.12.250.67] FastCGI: incomplete headers (0 bytes) received from server "/home/user/public_html/trac"
[Wed Nov 29 00:36:00 2006] [warn] FastCGI: scheduled the restart of the last (dynamic) server "/home/user/public_html/trac" process: reached dynamicMaxClassProcs (5)

We're on a shared host (with MySQL and Trac 0.10.3dev) and we are only allowed to run five fcgi processes. For some reason, Trac is running about 8-12 processes the whole time. Sometimes this prevents me from doing anything at the shell (-bash: fork: Resource temporarily unavailable).

Could it be possible that Trac is not handling the exit signal correctly?

comment:7 by Daniel Schiffner, 17 years ago

at me it looks like this:

Unhandled exception in thread started by <bound method Connection.run of <trac.web._fcgi.Connection object at 0xf577d06c>>
Traceback (most recent call last):
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 654, in run
    self.process_input()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 690, in process_input
    self._do_params(rec)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 789, in _do_params
    self._start_request(req)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 773, in _start_request
    req.run()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 582, in run
    self._flush()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 589, in _flush
    self.stdout.close()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 348, in close
    self._conn.writeRecord(rec)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 705, in writeRecord
    rec.write(self._sock)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 519, in _sendall
    sent = sock.send(data)
socket.error: (32, 'Daten\xfcbergabe unterbrochen (broken pipe)')

comment:8 by anonymous, 17 years ago

Resolution: fixed
Status: newclosed

comment:9 by anonymous, 17 years ago

Resolution: fixed
Status: closedreopened

comment:10 by TWCore.org, 17 years ago

I have the exact same problem, getting the following errors in the error log:

[Tue Feb 27 04:18:03 2007] [error] [client 66.249.65.201] FastCGI: comm with (dynamic) server "/home/twcore/twcore.org/index.fcgi" aborted: (first read) idle timeout (60 sec)
[Tue Feb 27 04:18:03 2007] [error] [client 66.249.65.201] FastCGI: incomplete headers (0 bytes) received from server "/home/twcore/twcore.org/index.fcgi"
[Tue Feb 27 04:18:03 2007] [error] [client 66.249.65.201] File does not exist: /home/twcore/twcore.org/internal_error.html

Does anyone know how this can be solved?

comment:11 by Christian Boos, 17 years ago

Keywords: fastcgi added; fcgi removed

(moving to 0.10.x along with other fastcgi tickets)

comment:12 by Christian Boos, 17 years ago

Milestone: 0.110.10.5

comment:13 by anonymous, 16 years ago

As far as I know, this isn't an issue in flup 1.0. As in, it at the very least doesn't print out that error, which in my experience happens when an HTTP request is interrupted. I don't know why it happens with Trac.

But why was flup copied and pasted into Trac anyway? Can't you just declare it as a dependency in setup.py and import it?

comment:14 by Emmanuel Blot, 16 years ago

#4976 marked has a duplicate

comment:15 by erose, 16 years ago

I get this in my FastCGI/SQLite setup, FWIW.

comment:16 by anonymous, 16 years ago

Milestone: 0.10.60.11.1
Type: defectenhancement

comment:17 by Jonas Borgström, 16 years ago

Milestone: 0.11.10.11.2
Priority: highesthigh

Moving this to 0.11.2 since this "only" results in excessive logging and does no failures.

in reply to:  13 comment:18 by Christian Boos, 15 years ago

Component: generalweb frontend
Keywords: fcgi added; web needinfo removed
Milestone: 0.11-retriage0.12

Replying to anonymous:

As far as I know, this isn't an issue in flup 1.0. … But why was flup copied and pasted into Trac anyway?

No idea, but eventually we could integrate a more recent version (http://www.saddi.com/software/flup/dist/). A superficial look to 1.0 shows that it's not 2.3 compatible anymore, so I'll tentatively target 0.12.

comment:19 by Remy Blank, 15 years ago

Am I right to assume that flup is the _fcgi.py module? The current version has a small bug described in #7239.

comment:20 by Christian Boos, 15 years ago

Yes, trac/web/_fcgi.py correspond to flup's client/fcgi_app.py, and last update was done in r3677. So "our" _fcgi.py is based on upstream r2025, and the 1.0 version discussed above corresponds to the newer r2109 (which is apparently not brand new either).

comment:21 by Christian Boos, 15 years ago

Looking again at the _fcgi.py code, I see that r3677 actually removed the special case for E.PIPE. Maybe we can just add it back?

comment:22 by Christian Boos, 15 years ago

Owner: Jonas Borgström removed
Status: reopenednew

comment:23 by Remy Blank, 14 years ago

Milestone: 0.12next-minor-0.12.x

Not a blocker.

comment:24 by Christian Boos, 14 years ago

Milestone: next-minor-0.12.x0.11.6
Owner: set to Christian Boos
Status: newassigned

Seen that in t.e.o logs as well. Will fix it by applying upstream fix.

comment:25 by Christian Boos, 14 years ago

Resolution: fixed
Status: assignedclosed

Fixed in [8820].

comment:26 by Christian Boos, 14 years ago

I also see this error now:

Traceback (most recent call last):
  File "/usr/local/virtualenv/lib/python2.5/site-packages/Trac-0.11.6stable_r8820-py2.5.egg/trac/web/_fcgi.py", line 661, in run
    except (select.error, socket.error), e:
AttributeError: 'NoneType' object has no attribute 'error'

Not sure if it's related to r8820 though.

comment:27 by Christian Boos, 14 years ago

Another backtrace, similar to the original one:

Traceback (most recent call last):
  File ".../trac/web/_fcgi.py", line 658, in run
    self.process_input()
  File ".../trac/web/_fcgi.py", line 694, in process_input
    self._do_params(rec)
  File ".../trac/web/_fcgi.py", line 793, in _do_params
    self._start_request(req)
  File ".../trac/web/_fcgi.py", line 777, in _start_request
    req.run()
  File ".../trac/web/_fcgi.py", line 573, in run
    self.stderr.flush()
  File ".../trac/web/_fcgi.py", line 340, in flush
    self._write(data)
  File ".../trac/web/_fcgi.py", line 311, in _write
    self._conn.writeRecord(rec)
  File ".../trac/web/_fcgi.py", line 709, in writeRecord
    rec.write(self._sock)
  File ".../trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File ".../trac/web/_fcgi.py", line 519, in _sendall
    sent = sock.send(data)

in reply to:  26 comment:28 by Christian Boos, 14 years ago

Also got this, while terminating tracd:

Unhandled exception in thread started by <function serve at 0x02DC74F0>
Traceback (most recent call last):
  File "trac/web/standalone.py", line 268, in serve
    httpd.serve_forever()
  File "C:\Dev\Python261\lib\SocketServer.py", line 224, in serve_forever
    r, w, e = select.select([self], [], [], poll_interval)
AttributeError: 'NoneType' object has no attribute 'select'

Which suggests that there's might be some kind of cleanup done for the select module (an atexit handler?).

comment:29 by Remy Blank, 14 years ago

I observe the same traceback regularly since I switched to Python 2.6.

in reply to:  29 comment:30 by Christian Boos, 13 years ago

Replying to rblank:

I observe the same traceback regularly since I switched to Python 2.6.

Sebastian Krysmanski also reported this today (#9960), still with 2.6.

comment:31 by anonymous, 6 years ago

2018 calling… still seeing this on 0.12.5

Unhandled exception in thread started by <bound method Connection.run of <trac.web._fcgi.Connection object at 0x37d7610>>
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/trac/web/_fcgi.py", line 658, in run
    self.process_input()
  File "/usr/lib/python2.7/dist-packages/trac/web/_fcgi.py", line 694, in process_input
    self._do_params(rec)
  File "/usr/lib/python2.7/dist-packages/trac/web/_fcgi.py", line 793, in _do_params
    self._start_request(req)
  File "/usr/lib/python2.7/dist-packages/trac/web/_fcgi.py", line 777, in _start_request
    req.run()
  File "/usr/lib/python2.7/dist-packages/trac/web/_fcgi.py", line 584, in run
    self._end(appStatus, protocolStatus) 
  File "/usr/lib/python2.7/dist-packages/trac/web/_fcgi.py", line 590, in _end
    self._conn.end_request(self, appStatus, protocolStatus)
  File "/usr/lib/python2.7/dist-packages/trac/web/_fcgi.py", line 725, in end_request
    self.writeRecord(rec)
  File "/usr/lib/python2.7/dist-packages/trac/web/_fcgi.py", line 709, in writeRecord
    rec.write(self._sock)
  File "/usr/lib/python2.7/dist-packages/trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File "/usr/lib/python2.7/dist-packages/trac/web/_fcgi.py", line 520, in _sendall
    sent = sock.send(data)
socket.error: [Errno 104] Connection reset by peer

fastcgi.conf

<IfModule mod_fastcgi.c>
  AddHandler fastcgi-script .fcgi
  FastCgiIpcDir /var/lib/apache2/fastcgi
  FastCgiConfig -initial-env TRAC_ENV_PARENT_DIR=/var/www/vhosts/<redacted>/trac  \
                -initial-env PYTHON_EGG_CACHE=/var/www/egg_cache \
                -initial-env LC_TIME=en_AU.UTF-8 \
                -maxClassProcesses 60 -maxProcesses 150 -idle-timeout 300
</IfModule>

No mysql involved here.

in reply to:  31 comment:32 by Jun Omae, 6 years ago

Replying to anonymous:

2018 calling… still seeing this on 0.12.5

socket.error: [Errno 104] Connection reset by peer

We should ignore errno.ECONNRESET also but 0.12-stable branch is no longer maintained. We recommend to upgrade to Trac 1.0.x, at least. If it is hard to upgrade, try to apply the following patch for 0.12-stable.

1.0-stable:

  • branches/1.0-stable/trac/web/_fcgi.py

     
    6262if not hasattr(socket, 'SHUT_WR'):
    6363    socket.SHUT_WR = 1
    6464
     65from trac.web.api import is_client_disconnect_exception
     66
     67
    6568__all__ = ['WSGIServer']
    6669
    6770# Constants from the spec.
     
    583586            self._flush()
    584587            self._end(appStatus, protocolStatus)
    585588        except socket.error, e:
    586             if e[0] != errno.EPIPE:
     589            if not is_client_disconnect_exception(e):
    587590                raise
    588591
    589592    def _end(self, appStatus=0L, protocolStatus=FCGI_REQUEST_COMPLETE):
     
    12701273                    if hasattr(result, 'close'):
    12711274                        result.close()
    12721275            except socket.error, e:
    1273                 if e[0] != errno.EPIPE:
     1276                if not is_client_disconnect_exception(e):
    12741277                    raise # Don't let EPIPE propagate beyond server
    12751278        finally:
    12761279            if not self.multithreaded:

0.12-stable:

  • branches/0.12-stable/trac/web/_fcgi.py

     
    583583            self._flush()
    584584            self._end(appStatus, protocolStatus)
    585585        except socket.error, e:
    586             if e[0] != errno.EPIPE:
     586            if e[0] not in (errno.EPIPE, errno.ECONNRESET):
    587587                raise
    588588
    589589    def _end(self, appStatus=0L, protocolStatus=FCGI_REQUEST_COMPLETE):
     
    12681268                    if hasattr(result, 'close'):
    12691269                        result.close()
    12701270            except socket.error, e:
    1271                 if e[0] != errno.EPIPE:
     1271                if e[0] not in (errno.EPIPE, errno.ECONNRESET):
    12721272                    raise # Don't let EPIPE propagate beyond server
    12731273        finally:
    12741274            if not self.multithreaded:

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Christian Boos.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Christian Boos to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.