Edgewall Software
Modify

Opened 8 years ago

Closed 5 years ago

Last modified 4 years ago

#3957 closed enhancement (fixed)

unhandled exception - socket

Reported by: james@… Owned by: cboos
Priority: high Milestone: 0.11.6
Component: web frontend Version: devel
Severity: normal Keywords: fastcgi fcgi
Cc: wkornewald
Release Notes:
API Changes:

Description

can we handle this appropriately?

Unhandled exception in thread started by <bound method Connection.run of <trac.web._fcgi.Connection object at 0xbe0757ec>>
Traceback (most recent call last):
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 654, in run
    self.process_input()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 690, in process_input
    self._do_params(rec)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 789, in _do_params
    self._start_request(req)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 773, in _start_request
    req.run()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 582, in run
    self._flush()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 589, in _flush
    self.stdout.close()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 348, in close
    self._conn.writeRecord(rec)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 705, in writeRecord
    rec.write(self._sock)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 519, in _sendall
    sent = sock.send(data)
socket.error: (32, 'Broken pipe')

Attachments (0)

Change History (30)

comment:1 Changed 8 years ago by jonas

  • Keywords needinfo added

Can you give some brief instructions on how to reproduce this?

comment:2 Changed 8 years ago by wkornewald

  • Cc wkornewald added

comment:3 Changed 8 years ago by wkornewald

  • Priority changed from high to highest

From what I've read this bug can be caused by a timed out MySQL connection. Maybe we could catch this exception and reconnect the DB when it happens? At least, I saw in a mailing list post that Rails' ActiveRecord? is handling the exception that way.

It seems that this bug is related to #3369 and #3868. We're experiencing it on a shared host with Apache, FastCGI, and MySQL. Switching to sqlite is not a real option because our user login data is stored in a MySQL DB and I'm missing nice tools like phpmyadmin.

Could you please have a look at this bug? It's really annoying because sometimes pages aren't displayed, at all. Sometimes even reloading after that error fails. It also happens that I get an HTTP 500 error.

I'll also contact my hosting provider and ask whether they could update FastCGI (in case they are using an old version). Maybe that helps…

comment:4 Changed 8 years ago by wkornewald <wkornewald@…>

Interesting articles: http://chris.gaskett.com/articles/rails-keeping-fastcgi-alive/ http://work.alexyoung.org/archives/102/dreamhost-and-rails-500-errors

More info from DreamHost? (Rails-specific, but maybe useful):

If you are using fcgi with Ruby on Rails, you might get what appear to be random 500 errors occasionally. Alex Young's workaround (http://work.alexyoung.org/archives/102/dreamhost-and-rails-500-errors), adapted from Gary Lin's workaround (http://thread.gmane.org/gmane.comp.lang.ruby.rails/33057) seems to resolve this. Basically, add the following code to your dispatch.fcgi, right before RailsFCGIHandler.process!

class RailsFCGIHandler private

def frao_handler(signal)

dispatcher_log :info, "asked to terminate immediately" dispatcher_log :info, "frao handler working its magic!" restart_handler(signal)

end alias_method :exit_now_handler, :frao_handler

end

comment:5 Changed 8 years ago by cboos

Reformatting the Ruby code snippet from above:

class RailsFCGIHandler
  private
    def frao_handler(signal)
      dispatcher_log :info, "asked to terminate immediately"
      dispatcher_log :info, "frao handler working its magic!"
      restart_handler(signal)
    end
    alias_method :exit_now_handler, :frao_handler
end

comment:6 Changed 8 years ago by wkornewald

Maybe this Apache log is helpful (esp. the two errors):

[Wed Nov 29 00:29:35 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 16594) termination signaled
[Wed Nov 29 00:29:36 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 16594) terminated by calling exit with status '0'
[Wed Nov 29 00:30:36 2006] [warn] FastCGI: scheduled the restart of the last (dynamic) server "/home/user/public_html/trac" process: reached dynamicMaxClassProcs (5)
[Wed Nov 29 00:30:36 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (uid 32005, gid 32008) restarted (pid 19715)
[Wed Nov 29 00:34:38 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 19715) termination signaled
[Wed Nov 29 00:34:38 2006] [error] [client 217.12.250.67] (104)Connection reset by peer: FastCGI: comm with server "/home/user/public_html/trac" aborted: read failed
[Wed Nov 29 00:34:38 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 19715) terminated by calling exit with status '0'
[Wed Nov 29 00:34:38 2006] [error] [client 217.12.250.67] FastCGI: incomplete headers (0 bytes) received from server "/home/user/public_html/trac"
[Wed Nov 29 00:36:00 2006] [warn] FastCGI: scheduled the restart of the last (dynamic) server "/home/user/public_html/trac" process: reached dynamicMaxClassProcs (5)

We're on a shared host (with MySQL and Trac 0.10.3dev) and we are only allowed to run five fcgi processes. For some reason, Trac is running about 8-12 processes the whole time. Sometimes this prevents me from doing anything at the shell (-bash: fork: Resource temporarily unavailable).

Could it be possible that Trac is not handling the exit signal correctly?

comment:7 Changed 8 years ago by Daniel Schiffner

at me it looks like this:

Unhandled exception in thread started by <bound method Connection.run of <trac.web._fcgi.Connection object at 0xf577d06c>>
Traceback (most recent call last):
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 654, in run
    self.process_input()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 690, in process_input
    self._do_params(rec)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 789, in _do_params
    self._start_request(req)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 773, in _start_request
    req.run()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 582, in run
    self._flush()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 589, in _flush
    self.stdout.close()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 348, in close
    self._conn.writeRecord(rec)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 705, in writeRecord
    rec.write(self._sock)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 519, in _sendall
    sent = sock.send(data)
socket.error: (32, 'Daten\xfcbergabe unterbrochen (broken pipe)')

comment:8 Changed 8 years ago by anonymous

  • Resolution set to fixed
  • Status changed from new to closed

comment:9 Changed 8 years ago by anonymous

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:10 Changed 8 years ago by TWCore.org

I have the exact same problem, getting the following errors in the error log:

[Tue Feb 27 04:18:03 2007] [error] [client 66.249.65.201] FastCGI: comm with (dynamic) server "/home/twcore/twcore.org/index.fcgi" aborted: (first read) idle timeout (60 sec)
[Tue Feb 27 04:18:03 2007] [error] [client 66.249.65.201] FastCGI: incomplete headers (0 bytes) received from server "/home/twcore/twcore.org/index.fcgi"
[Tue Feb 27 04:18:03 2007] [error] [client 66.249.65.201] File does not exist: /home/twcore/twcore.org/internal_error.html

Does anyone know how this can be solved?

comment:11 Changed 7 years ago by cboos

  • Keywords fastcgi added; fcgi removed

(moving to 0.10.x along with other fastcgi tickets)

comment:12 Changed 7 years ago by cboos

  • Milestone changed from 0.11 to 0.10.5

comment:13 follow-up: Changed 7 years ago by anonymous

As far as I know, this isn't an issue in flup 1.0. As in, it at the very least doesn't print out that error, which in my experience happens when an HTTP request is interrupted. I don't know why it happens with Trac.

But why was flup copied and pasted into Trac anyway? Can't you just declare it as a dependency in setup.py and import it?

comment:14 Changed 7 years ago by eblot

#4976 marked has a duplicate

comment:15 Changed 6 years ago by erose

I get this in my FastCGI/SQLite setup, FWIW.

comment:16 Changed 6 years ago by anonymous

  • Milestone changed from 0.10.6 to 0.11.1
  • Type changed from defect to enhancement

comment:17 Changed 6 years ago by jonas

  • Milestone changed from 0.11.1 to 0.11.2
  • Priority changed from highest to high

Moving this to 0.11.2 since this "only" results in excessive logging and does no failures.

comment:18 in reply to: ↑ 13 Changed 6 years ago by cboos

  • Component changed from general to web frontend
  • Keywords fcgi added; web needinfo removed
  • Milestone changed from 0.11-retriage to 0.12

Replying to anonymous:

As far as I know, this isn't an issue in flup 1.0. … But why was flup copied and pasted into Trac anyway?

No idea, but eventually we could integrate a more recent version (http://www.saddi.com/software/flup/dist/). A superficial look to 1.0 shows that it's not 2.3 compatible anymore, so I'll tentatively target 0.12.

comment:19 Changed 6 years ago by rblank

Am I right to assume that flup is the _fcgi.py module? The current version has a small bug described in #7239.

comment:20 Changed 6 years ago by cboos

Yes, trac/web/_fcgi.py correspond to flup's client/fcgi_app.py, and last update was done in r3677. So "our" _fcgi.py is based on upstream r2025, and the 1.0 version discussed above corresponds to the newer r2109 (which is apparently not brand new either).

comment:21 Changed 5 years ago by cboos

Looking again at the _fcgi.py code, I see that r3677 actually removed the special case for E.PIPE. Maybe we can just add it back?

comment:22 Changed 5 years ago by cboos

  • Owner jonas deleted
  • Status changed from reopened to new

comment:23 Changed 5 years ago by rblank

  • Milestone changed from 0.12 to next-minor-0.12.x

Not a blocker.

comment:24 Changed 5 years ago by cboos

  • Milestone changed from next-minor-0.12.x to 0.11.6
  • Owner set to cboos
  • Status changed from new to assigned

Seen that in t.e.o logs as well. Will fix it by applying upstream fix.

comment:25 Changed 5 years ago by cboos

  • Resolution set to fixed
  • Status changed from assigned to closed

Fixed in [8820].

comment:26 follow-up: Changed 5 years ago by cboos

I also see this error now:

Traceback (most recent call last):
  File "/usr/local/virtualenv/lib/python2.5/site-packages/Trac-0.11.6stable_r8820-py2.5.egg/trac/web/_fcgi.py", line 661, in run
    except (select.error, socket.error), e:
AttributeError: 'NoneType' object has no attribute 'error'

Not sure if it's related to r8820 though.

comment:27 Changed 5 years ago by cboos

Another backtrace, similar to the original one:

Traceback (most recent call last):
  File ".../trac/web/_fcgi.py", line 658, in run
    self.process_input()
  File ".../trac/web/_fcgi.py", line 694, in process_input
    self._do_params(rec)
  File ".../trac/web/_fcgi.py", line 793, in _do_params
    self._start_request(req)
  File ".../trac/web/_fcgi.py", line 777, in _start_request
    req.run()
  File ".../trac/web/_fcgi.py", line 573, in run
    self.stderr.flush()
  File ".../trac/web/_fcgi.py", line 340, in flush
    self._write(data)
  File ".../trac/web/_fcgi.py", line 311, in _write
    self._conn.writeRecord(rec)
  File ".../trac/web/_fcgi.py", line 709, in writeRecord
    rec.write(self._sock)
  File ".../trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File ".../trac/web/_fcgi.py", line 519, in _sendall
    sent = sock.send(data)

comment:28 in reply to: ↑ 26 Changed 5 years ago by cboos

Also got this, while terminating tracd:

Unhandled exception in thread started by <function serve at 0x02DC74F0>
Traceback (most recent call last):
  File "trac/web/standalone.py", line 268, in serve
    httpd.serve_forever()
  File "C:\Dev\Python261\lib\SocketServer.py", line 224, in serve_forever
    r, w, e = select.select([self], [], [], poll_interval)
AttributeError: 'NoneType' object has no attribute 'select'

Which suggests that there's might be some kind of cleanup done for the select module (an atexit handler?).

comment:29 follow-up: Changed 5 years ago by rblank

I observe the same traceback regularly since I switched to Python 2.6.

comment:30 in reply to: ↑ 29 Changed 4 years ago by cboos

Replying to rblank:

I observe the same traceback regularly since I switched to Python 2.6.

Sebastian Krysmanski also reported this today (#9960), still with 2.6.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed The owner will remain cboos.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from cboos to the specified user.
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.