Edgewall Software
Modify

Ticket #3957 (closed enhancement: fixed)

Opened 5 years ago

Last modified 13 months ago

unhandled exception - socket

Reported by: james@… Owned by: cboos
Priority: high Milestone: 0.11.6
Component: web frontend Version: devel
Severity: normal Keywords: fastcgi fcgi
Cc: wkornewald
Release Notes:
API Changes:

Description

can we handle this appropriately?

Unhandled exception in thread started by <bound method Connection.run of <trac.web._fcgi.Connection object at 0xbe0757ec>>
Traceback (most recent call last):
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 654, in run
    self.process_input()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 690, in process_input
    self._do_params(rec)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 789, in _do_params
    self._start_request(req)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 773, in _start_request
    req.run()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 582, in run
    self._flush()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 589, in _flush
    self.stdout.close()
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 348, in close
    self._conn.writeRecord(rec)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 705, in writeRecord
    rec.write(self._sock)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File "/usr/local/lib/python2.4/site-packages/trac/web/_fcgi.py", line 519, in _sendall
    sent = sock.send(data)
socket.error: (32, 'Broken pipe')

Attachments

Change History

comment:1 Changed 5 years ago by jonas

  • Keywords fcgi needinfo added; fcgi, removed

Can you give some brief instructions on how to reproduce this?

comment:2 Changed 5 years ago by wkornewald

  • Cc wkornewald added

comment:3 Changed 5 years ago by wkornewald

  • Priority changed from high to highest

From what I've read this bug can be caused by a timed out MySQL connection. Maybe we could catch this exception and reconnect the DB when it happens? At least, I saw in a mailing list post that Rails' ActiveRecord? is handling the exception that way.

It seems that this bug is related to #3369 and #3868. We're experiencing it on a shared host with Apache, FastCGI, and MySQL. Switching to sqlite is not a real option because our user login data is stored in a MySQL DB and I'm missing nice tools like phpmyadmin.

Could you please have a look at this bug? It's really annoying because sometimes pages aren't displayed, at all. Sometimes even reloading after that error fails. It also happens that I get an HTTP 500 error.

I'll also contact my hosting provider and ask whether they could update FastCGI (in case they are using an old version). Maybe that helps...

comment:4 Changed 5 years ago by wkornewald <wkornewald@…>

Interesting articles:
http://chris.gaskett.com/articles/rails-keeping-fastcgi-alive/
http://work.alexyoung.org/archives/102/dreamhost-and-rails-500-errors

More info from DreamHost? (Rails-specific, but maybe useful):

If you are using fcgi with Ruby on Rails, you might get what appear to be random 500 errors occasionally. Alex Young's workaround (http://work.alexyoung.org/archives/102/dreamhost-and-rails-500-errors), adapted from Gary Lin's workaround (http://thread.gmane.org/gmane.comp.lang.ruby.rails/33057) seems to resolve this. Basically, add the following code to your dispatch.fcgi, right before RailsFCGIHandler.process!

class RailsFCGIHandler
private

def frao_handler(signal)

dispatcher_log :info, "asked to terminate immediately"
dispatcher_log :info, "frao handler working its magic!"
restart_handler(signal)

end
alias_method :exit_now_handler, :frao_handler

end

comment:5 Changed 5 years ago by cboos

Reformatting the Ruby code snippet from above:

class RailsFCGIHandler
  private
    def frao_handler(signal)
      dispatcher_log :info, "asked to terminate immediately"
      dispatcher_log :info, "frao handler working its magic!"
      restart_handler(signal)
    end
    alias_method :exit_now_handler, :frao_handler
end

comment:6 Changed 5 years ago by wkornewald

Maybe this Apache log is helpful (esp. the two errors):

[Wed Nov 29 00:29:35 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 16594) termination signaled
[Wed Nov 29 00:29:36 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 16594) terminated by calling exit with status '0'
[Wed Nov 29 00:30:36 2006] [warn] FastCGI: scheduled the restart of the last (dynamic) server "/home/user/public_html/trac" process: reached dynamicMaxClassProcs (5)
[Wed Nov 29 00:30:36 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (uid 32005, gid 32008) restarted (pid 19715)
[Wed Nov 29 00:34:38 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 19715) termination signaled
[Wed Nov 29 00:34:38 2006] [error] [client 217.12.250.67] (104)Connection reset by peer: FastCGI: comm with server "/home/user/public_html/trac" aborted: read failed
[Wed Nov 29 00:34:38 2006] [warn] FastCGI: (dynamic) server "/home/user/public_html/trac" (pid 19715) terminated by calling exit with status '0'
[Wed Nov 29 00:34:38 2006] [error] [client 217.12.250.67] FastCGI: incomplete headers (0 bytes) received from server "/home/user/public_html/trac"
[Wed Nov 29 00:36:00 2006] [warn] FastCGI: scheduled the restart of the last (dynamic) server "/home/user/public_html/trac" process: reached dynamicMaxClassProcs (5)

We're on a shared host (with MySQL and Trac 0.10.3dev) and we are only allowed to run five fcgi processes. For some reason, Trac is running about 8-12 processes the whole time. Sometimes this prevents me from doing anything at the shell (-bash: fork: Resource temporarily unavailable).

Could it be possible that Trac is not handling the exit signal correctly?

comment:7 Changed 5 years ago by Daniel Schiffner

at me it looks like this:

Unhandled exception in thread started by <bound method Connection.run of <trac.web._fcgi.Connection object at 0xf577d06c>>
Traceback (most recent call last):
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 654, in run
    self.process_input()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 690, in process_input
    self._do_params(rec)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 789, in _do_params
    self._start_request(req)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 773, in _start_request
    req.run()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 582, in run
    self._flush()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 589, in _flush
    self.stdout.close()
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 348, in close
    self._conn.writeRecord(rec)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 705, in writeRecord
    rec.write(self._sock)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File "/var/lib/python-support/python2.4/trac/web/_fcgi.py", line 519, in _sendall
    sent = sock.send(data)
socket.error: (32, 'Daten\xfcbergabe unterbrochen (broken pipe)')

comment:8 Changed 5 years ago by anonymous

  • Resolution set to fixed
  • Status changed from new to closed

comment:9 Changed 5 years ago by anonymous

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:10 Changed 5 years ago by TWCore.org

I have the exact same problem, getting the following errors in the error log:

[Tue Feb 27 04:18:03 2007] [error] [client 66.249.65.201] FastCGI: comm with (dynamic) server "/home/twcore/twcore.org/index.fcgi" aborted: (first read) idle timeout (60 sec)
[Tue Feb 27 04:18:03 2007] [error] [client 66.249.65.201] FastCGI: incomplete headers (0 bytes) received from server "/home/twcore/twcore.org/index.fcgi"
[Tue Feb 27 04:18:03 2007] [error] [client 66.249.65.201] File does not exist: /home/twcore/twcore.org/internal_error.html

Does anyone know how this can be solved?

comment:11 Changed 5 years ago by cboos

  • Keywords fastcgi added; fcgi removed

(moving to 0.10.x along with other fastcgi tickets)

comment:12 Changed 5 years ago by cboos

  • Milestone changed from 0.11 to 0.10.5

comment:13 follow-up: Changed 4 years ago by anonymous

As far as I know, this isn't an issue in flup 1.0. As in, it at the very least doesn't print out that error, which in my experience happens when an HTTP request is interrupted. I don't know why it happens with Trac.

But why was flup copied and pasted into Trac anyway? Can't you just declare it as a dependency in setup.py and import it?

comment:14 Changed 4 years ago by eblot

#4976 marked has a duplicate

comment:15 Changed 4 years ago by erose

I get this in my FastCGI/SQLite setup, FWIW.

comment:16 Changed 4 years ago by anonymous

  • Milestone changed from 0.10.6 to 0.11.1
  • Type changed from defect to enhancement

comment:17 Changed 4 years ago by jonas

  • Milestone changed from 0.11.1 to 0.11.2
  • Priority changed from highest to high

Moving this to 0.11.2 since this "only" results in excessive logging and does no failures.

comment:18 in reply to: ↑ 13 Changed 3 years ago by cboos

  • Component changed from general to web frontend
  • Keywords fcgi added; web needinfo removed
  • Milestone changed from 0.11-retriage to 0.12

Replying to anonymous:

As far as I know, this isn't an issue in flup 1.0. ...
But why was flup copied and pasted into Trac anyway?

No idea, but eventually we could integrate a more recent version (http://www.saddi.com/software/flup/dist/). A superficial look to 1.0 shows that it's not 2.3 compatible anymore, so I'll tentatively target 0.12.

comment:19 Changed 3 years ago by rblank

Am I right to assume that flup is the _fcgi.py module? The current version has a small bug described in #7239.

comment:20 Changed 3 years ago by cboos

Yes, trac/web/_fcgi.py correspond to flup's client/fcgi_app.py, and last update was done in r3677. So "our" _fcgi.py is based on upstream r2025, and the 1.0 version discussed above corresponds to the newer r2109 (which is apparently not brand new either).

comment:21 Changed 3 years ago by cboos

Looking again at the _fcgi.py code, I see that r3677 actually removed the special case for E.PIPE. Maybe we can just add it back?

comment:22 Changed 2 years ago by cboos

  • Owner jonas deleted
  • Status changed from reopened to new

comment:23 Changed 2 years ago by rblank

  • Milestone changed from 0.12 to next-minor-0.12.x

Not a blocker.

comment:24 Changed 2 years ago by cboos

  • Milestone changed from next-minor-0.12.x to 0.11.6
  • Owner set to cboos
  • Status changed from new to assigned

Seen that in t.e.o logs as well. Will fix it by applying upstream fix.

comment:25 Changed 2 years ago by cboos

  • Resolution set to fixed
  • Status changed from assigned to closed

Fixed in [8820].

comment:26 follow-up: Changed 2 years ago by cboos

I also see this error now:

Traceback (most recent call last):
  File "/usr/local/virtualenv/lib/python2.5/site-packages/Trac-0.11.6stable_r8820-py2.5.egg/trac/web/_fcgi.py", line 661, in run
    except (select.error, socket.error), e:
AttributeError: 'NoneType' object has no attribute 'error'

Not sure if it's related to r8820 though.

comment:27 Changed 2 years ago by cboos

Another backtrace, similar to the original one:

Traceback (most recent call last):
  File ".../trac/web/_fcgi.py", line 658, in run
    self.process_input()
  File ".../trac/web/_fcgi.py", line 694, in process_input
    self._do_params(rec)
  File ".../trac/web/_fcgi.py", line 793, in _do_params
    self._start_request(req)
  File ".../trac/web/_fcgi.py", line 777, in _start_request
    req.run()
  File ".../trac/web/_fcgi.py", line 573, in run
    self.stderr.flush()
  File ".../trac/web/_fcgi.py", line 340, in flush
    self._write(data)
  File ".../trac/web/_fcgi.py", line 311, in _write
    self._conn.writeRecord(rec)
  File ".../trac/web/_fcgi.py", line 709, in writeRecord
    rec.write(self._sock)
  File ".../trac/web/_fcgi.py", line 542, in write
    self._sendall(sock, header)
  File ".../trac/web/_fcgi.py", line 519, in _sendall
    sent = sock.send(data)

comment:28 in reply to: ↑ 26 Changed 2 years ago by cboos

Also got this, while terminating tracd:

Unhandled exception in thread started by <function serve at 0x02DC74F0>
Traceback (most recent call last):
  File "trac/web/standalone.py", line 268, in serve
    httpd.serve_forever()
  File "C:\Dev\Python261\lib\SocketServer.py", line 224, in serve_forever
    r, w, e = select.select([self], [], [], poll_interval)
AttributeError: 'NoneType' object has no attribute 'select'

Which suggests that there's might be some kind of cleanup done for the select module (an atexit handler?).

comment:29 follow-up: Changed 2 years ago by rblank

I observe the same traceback regularly since I switched to Python 2.6.

comment:30 in reply to: ↑ 29 Changed 13 months ago by cboos

Replying to rblank:

I observe the same traceback regularly since I switched to Python 2.6.

Sebastian Krysmanski also reported this today (#9960), still with 2.6.

View

Add a comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
The resolution will be deleted. Next status will be 'reopened'
to The owner will be changed from cboos. Next status will be 'closed'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.