|
View:
New views
10 Messages
—
Rating Filter:
Alert me
|
|
|
server stops handling requests - nowhere near 100% CPU or Memory usedHello all,
I recently made a lot of changes to my ZODB app, and now I'm experiencing some puzzling behavior. Normally I have ~15 clients making requests to the server every few seconds. Some of these
requests commit new data to the database, while others just process existing data. I also have a 'heartbeat' client, which access the view called '/is_alive', which is entirely described
by this function: def is_alive(request): return True Once a day or so, I will get a report from the heartbeat client that the server is down - no heartbeat
has succeeded in the past 10 minutes. Indeed, any URL I go to on the server, will simply not load, but take forever - even this '/is_alive' view. I can still SSH into the server, however. Running 'top', I see that the server is not taking any
CPU time, and is not taking any large amount of memory. The computer itself runs just fine. I can even access the database with the paster shell, make requests, commit things, etc., without any exceptional delays. If I CTRL+C the webserver and restart it,
things immediately work just fine. Looking at the immediately preceding stdout, I see no hints of errors of any kind. However, a printline is executed on the server every time two of those clients completes a request,
and I notice that, leading up to the current point in the stdout (where the webserver does not respond anymore), there are a few "Pipe is broken" messages (which happen when, e.g., I go to a view with a web-browser and close the tab before the page loads), and I notice
that requests stopped completing from one of the two clients, then from the other. It seems a gradual slowdown of some kind, though I'm not entirely certain. My question is: what could possibly be causing the server to 'lock up', even on a
simple view like 'is_alive', without using any memory or CPU? Is there some ZODB resource that might be getting gradually exhausted because I'm not handling it properly?
Thanks in advance, - Claudiu
_______________________________________________ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@... https://mail.zope.org/mailman/listinfo/zodb-dev |
|
|
Re: server stops handling requests - nowhere near 100% CPU or Memory usedOn Thu, Apr 19, 2012 at 17:20, Claudiu Saftoiu <csaftoiu@...> wrote:
> My question is: what could possibly be causing the server to 'lock up', even > on a > simple view like 'is_alive', without using any memory or CPU? Is there some > ZODB > resource that might be getting gradually exhausted because I'm not handling > it properly? I don't know, anything could lock up your site if something is waiting for a lock, for example. Use http://pypi.python.org/pypi/z3c.deadlockdebugger to figure out what the threads are doing at this time. Preferably, trigger the dump_threads() method of that module on SIGUSR1, like the Zope signalstack product does (see http://svn.plone.org/svn/collective/Products.signalstack/trunk/Products/signalstack/__init__.py for the exact code to bind the signal handler). That'll tell you exactly what each thread is busy with when you send the signal. -- Martijn Pieters _______________________________________________ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@... https://mail.zope.org/mailman/listinfo/zodb-dev |
|
|
Re: server stops handling requests - nowhere near 100% CPU or Memory usedOn Thu, Apr 19, 2012 at 11:33 AM, Martijn Pieters <mj@...> wrote:
Are there locks that could possibly be used for the 'is_alive' function? Here is the definition in its entirety. In 'configure.zcml':
<view view=".views.is_alive" name="is_alive" renderer="json" /> in 'views.py':
def is_alive(request): return True Whatever the problem is, it causes 'is_alive' to take forever, and the CPU is not spinning at 100%, and memory usage is not high. Could this be a lock problem?
Use http://pypi.python.org/pypi/z3c.deadlockdebugger to figure out Thanks, I'll check this out. - Claudiu _______________________________________________ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@... https://mail.zope.org/mailman/listinfo/zodb-dev |
|
|
Re: server stops handling requests - nowhere near 100% CPU or Memory usedOn Thu, Apr 19, 2012 at 18:22, Claudiu Saftoiu <csaftoiu@...> wrote:
> Are there locks that could possibly be used for the 'is_alive' function? > Here is the > definition in its entirety. > > In 'configure.zcml': > <view > view=".views.is_alive" > name="is_alive" > renderer="json" > /> > in 'views.py': > def is_alive(request): > return True > > Whatever the problem is, it causes 'is_alive' to take forever, and the CPU > is not > spinning at 100%, and memory usage is not high. Could this be a lock > problem? I have no idea; all you told is that you use the ZODB, not what server framework you use to register your views. Is this Grok, Bluebream, Repoze.BFG, Zope 2 or something else? I don't think that is_alive would be the cause of this, it looks like a simple enough view. But if your server uses a thread pool, and all the other threads are now occupied by something that got locked up, then it could be that the server is not answering your is_alive request at all because it is waiting for a thread to free up first. -- Martijn Pieters _______________________________________________ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@... https://mail.zope.org/mailman/listinfo/zodb-dev |
|
|
Re: server stops handling requests - nowhere near 100% CPU or Memory usedI have no idea; all you told is that you use the ZODB, not what server Ah yes, sorry about that. I'm using Repoze.BFG . Does that help any? I don't think that is_alive would be the cause of this, it looks like Yea, I don't think it is the 'is_alive'. I more mentioned that to help diagnose the problem, i.e., whatever the problem is, it also
affects simple views that don't touch the database at all. Hmm, it might be the thread pool issue that you mention. That seems to make sense. I'll have to see if I have any views that are never finishing. Thanks
for the pointer. Thanks again, - Claudiu _______________________________________________ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@... https://mail.zope.org/mailman/listinfo/zodb-dev |
|
|
Re: server stops handling requests - nowhere near 100% CPU or Memory usedI don't know, anything could lock up your site if something is waiting That module seems to be just the trick for seeing if I have a deadlock issue. I set up the SIGUSR1 thing, but whenever I would
send the signal to the server, it would stop, saying something about system call interrupting the 'select', so I just made it output the contents of dump_threads() to a file once a minute. If it happens again I'll look
at the tail end of the file & see what they're all busy with. Thanks, seems like this will likely do the trick, - Claudiu _______________________________________________ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@... https://mail.zope.org/mailman/listinfo/zodb-dev |
|
|
Re: server stops handling requests - nowhere near 100% CPU or Memory usedI don't think that is_alive would be the cause of this, it looks like Ah, so it seems that, as I leave the server running longer & longer, more & more threads are taken up with a `.recv()` call. I think one of
my clients opens requests and does not read them/close them. Eventually all the threads are blocking in that fashion. I will fix my clients. But, is there a server-side fix to this (again, using Repoze.BFG)?
Something to time out the connection after 60 seconds or so if nothing has happened? Thanks, - Claudiu _______________________________________________ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@... https://mail.zope.org/mailman/listinfo/zodb-dev |
|
|
Re: server stops handling requests - nowhere near 100% CPU or Memory used
Additional info: in particular the blocked threads' stack dumps look like this:
Thread 140605868680960: File "/usr/lib/python2.6/threading.py", line 504, in __bootstrap self.__bootstrap_inner() File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
self.run() File "/usr/lib/python2.6/threading.py", line 484, in run self.__target(*self.__args, **self.__kwargs) File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", line 878, in worker_thread_callback
runnable() File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", line 1052, in <lambda> lambda: self.process_request_in_thread(request, client_address))
File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", line 1068, in process_request_in_thread self.finish_request(request, client_address) File "/usr/lib/python2.6/SocketServer.py", line 322, in finish_request
self.RequestHandlerClass(request, client_address, self) File "/usr/lib/python2.6/SocketServer.py", line 617, in __init__ self.handle() File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", line 442, in handle
BaseHTTPRequestHandler.handle(self) File "/usr/lib/python2.6/BaseHTTPServer.py", line 329, in handle self.handle_one_request() File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", line 437, in handle_one_request
self.wsgi_execute() File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", line 287, in wsgi_execute self.wsgi_start_response) File "/home/tsa/env/lib/python2.6/site-packages/repoze/zodbconn/connector.py", line 21, in __call__
result = self.next_app(environ, start_response) File "/home/tsa/env/lib/python2.6/site-packages/repoze/zodbconn/cachecleanup.py", line 25, in __call__ return self.next_app(environ, start_response)
File "/home/tsa/env/lib/python2.6/site-packages/repoze/retry/__init__.py", line 65, in __call__ chunk = original_wsgi_input.read(rest) File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", line 474, in read
data = self.file.read(length) File "/usr/lib/python2.6/socket.py", line 377, in read data = self._sock.recv(left) Was my assessment of the situation accurate?
_______________________________________________ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@... https://mail.zope.org/mailman/listinfo/zodb-dev |
|
|
Re: server stops handling requests - nowhere near 100% CPU or Memory used
I would say so, and it seems that what makes the thread hang on is that the socket is not closed by your clients. (otherwise recv would exit with an exception). Alex _______________________________________________ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@... https://mail.zope.org/mailman/listinfo/zodb-dev |
|
|
Re: server stops handling requests - nowhere near 100% CPU or Memory usedOn Fri, Apr 20, 2012 at 10:06:40AM -0400, Claudiu Saftoiu wrote:
> > > > Ah, so it seems that, as I leave the server running longer & longer, > > more & more threads are taken up with a `.recv()` call. I think one of > > my clients opens requests and does not read them/close them. Does not write, rather. > > Eventually > > all the threads are blocking in that fashion. > > > > I will fix my clients. But, is there a server-side fix to this (again, > > using Repoze.BFG)? > > > > Additional info: in particular the blocked threads' stack dumps look like > this: > Thread 140605868680960: > File "/usr/lib/python2.6/threading.py", line 504, in __bootstrap > self.__bootstrap_inner() > File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner > self.run() > File "/usr/lib/python2.6/threading.py", line 484, in run > self.__target(*self.__args, **self.__kwargs) > File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", > line 878, in worker_thread_callback > runnable() > File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", > line 1052, in <lambda> > lambda: self.process_request_in_thread(request, client_address)) > File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", > line 1068, in process_request_in_thread > self.finish_request(request, client_address) > File "/usr/lib/python2.6/SocketServer.py", line 322, in finish_request > self.RequestHandlerClass(request, client_address, self) > File "/usr/lib/python2.6/SocketServer.py", line 617, in __init__ > self.handle() > File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", > line 442, in handle > BaseHTTPRequestHandler.handle(self) > File "/usr/lib/python2.6/BaseHTTPServer.py", line 329, in handle > self.handle_one_request() > File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", > line 437, in handle_one_request > self.wsgi_execute() > File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", > line 287, in wsgi_execute > self.wsgi_start_response) > File > "/home/tsa/env/lib/python2.6/site-packages/repoze/zodbconn/connector.py", > line 21, in __call__ > result = self.next_app(environ, start_response) > File > "/home/tsa/env/lib/python2.6/site-packages/repoze/zodbconn/cachecleanup.py", > line 25, in __call__ > return self.next_app(environ, start_response) > File > "/home/tsa/env/lib/python2.6/site-packages/repoze/retry/__init__.py", line > 65, in __call__ > chunk = original_wsgi_input.read(rest) > File "/home/tsa/env/lib/python2.6/site-packages/paste/httpserver.py", > line 474, in read > data = self.file.read(length) > File "/usr/lib/python2.6/socket.py", line 377, in read > data = self._sock.recv(left) > > Was my assessment of the situation accurate? https://github.com/repoze/repoze.retry/blob/master/repoze/retry/__init__.py Line numbers differ somewhat; the original_wsgi_input.read(rest) is on line 71 on that github page right now. It does look like this is the bit that reads the HTTP request from the client. > > Something to time out the connection after 60 seconds or so if nothing has > > happened? You're using paste.httpserver. It has support for killing hung threads: http://pythonpaste.org/modules/httpserver.html#paste.httpserver.server_runner has a list of options. You could also put nginx in front, to take care of ill-behaved HTTP clients. Marius Gedminas -- Never be afraid to tell the world who you are. -- Anonymous _______________________________________________ For more information about ZODB, see http://zodb.org/ ZODB-Dev mailing list - ZODB-Dev@... https://mail.zope.org/mailman/listinfo/zodb-dev |
| Free embeddable forum powered by Nabble | Forum Help |