wsgiserver.py possible improvements - LRU thread selection for connections

Hello,

I've been reading through the wsgiserver.py code the last few days, and have this suggestion.

Using the most recently busy threads will make wsgiserver.py a happy camper.

This is nicer because:
- the threads' memory is more likely cached.
- less contention over the one request queue by all the worker threads.
- the busy threads might not have to go to sleep

I'm not quite sure of the best way to do this yet... so here is the plan: any ideas welcome.

The idea would be to have a queue per WorkerThread for requests. Each WorkerThread would wait on its own request queue.

The WorkerThread would need to be put on a queue for most recently busy. It will be responsible for putting itself on a LIFO in the server when it has finished its request... then waiting on its internal queue.

So then the main thread gets the most recently busy WorkerThread off the LIFO, and hands it a connection to its internal request queue.

So, what if there are not enough threads for requests? Where does the request go in this case? Well, since the server now knows it is too busy, it can do:
- wait until a thread is ready.
- grow the thread pool.
- add the request to a backlog of requests, and feed it to workers next time around the loop OR send a special FEEDWORKERS to one of the worker threads OR have a separate worker thread for feeding threads from the backlog.

The selection of threads could be different... like it could now be fairly easy to pick a thread which has already served a similar request... or one which has already done a request to that IP address... but that can come later out of this design change.

I'm not sure if having threads wait on separate queues would be faster and use less memory... but I think it might.

What do you think? Worth trying to implement it/test/benchmark?
cheers,

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "cherrypy-users" group.
To post to this group, send email to cherrypy-users@...
To unsubscribe from this group, send email to cherrypy-users+unsubscribe@...
For more options, visit this group at http://groups.google.com/group/cherrypy-users?hl=en
-~----------~----~----~----~------~----~------~--~---
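For anyone following along, the selection rule in the plan above can be sketched without any threads at all. This is only a rough illustration, not code from wsgiserver.py: the `Dispatcher` class, its method names, and the worker labels are all made up for the example. It shows the idle workers kept on a stack, so the worker that finished most recently is picked first, with a backlog for when nobody is free.

```python
from collections import deque


class Dispatcher:
    """Hypothetical helper: hands each connection to the most recently idle worker."""

    def __init__(self, workers):
        self.idle = list(workers)   # used as a stack: append() / pop()
        self.backlog = deque()      # connections waiting for a free worker

    def dispatch(self, conn):
        """Return the worker chosen for conn, or None if conn was backlogged."""
        if not self.idle:
            self.backlog.append(conn)
            return None
        return self.idle.pop()

    def finished(self, worker):
        """Called when a worker completes a request.

        Returns the backlogged connection the worker should take next,
        or None after putting the worker back on top of the idle stack.
        """
        if self.backlog:
            return self.backlog.popleft()
        self.idle.append(worker)
        return None


d = Dispatcher(["w1", "w2", "w3"])
first = d.dispatch("conn-a")     # takes the top of the stack
d.finished(first)                # that worker goes back on top
second = d.dispatch("conn-b")    # same worker again; the others stay asleep
```

Under light load the same worker keeps getting picked, which is the point of the proposal. Whether that actually pays off in Python is a separate question that only a benchmark can answer.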
Re: wsgiserver.py possible improvements - LRU thread selection for connections

illume wrote:
> I've been reading through the wsgiserver.py code the last few days,
> and have this suggestion.
>
> Using the most recently busy threads will make wsgiserver.py a happy
> camper.
>
> This is nicer because:
> - the threads' memory is more likely cached.
> - less contention over the one request queue by all the worker
> threads.
> - the busy threads might not have to go to sleep
>
> I'm not quite sure the best way to do this yet... so here is the
> plan: any ideas welcome.
>
> The idea would be to have a queue per WorkerThread for requests. Each
> WorkerThread would wait on its own request queue.

Since each thread only handles one request at a time, you don't really need a queue per thread, just set self.conn = server.requests.get, and set self.conn back to None when you're done. The only benefit to a Queue in that case would be the backoff algorithm Queues use (wait longer and longer while empty), which is also available in other threading synchronization classes. See threading._Condition.wait for details.

> The WorkerThread would need to be put on a queue for most recently
> busy. It will be responsible for putting itself on a LIFO in the
> server when it has finished its request... then waiting on its
> internal queue.

Where LIFO is spelled "[].append() and .pop()", right? ;)

> So then the main thread gets the most recently busy WorkerThread off
> the LIFO, and hands it a connection to its internal request queue.

The main thread should obey the ThreadPool API and call request.put(conn). A separate "delegator thread" should get the most recently busy WorkerThread off the LIFO, and hand it the HTTPConnection object it obtains via ThreadPool.get(). If no WorkerThreads are free at the moment, that would allow the main thread to go back to listening on the socket, letting the delegator thread wait/grow/backlog/feed instead.
> So, what if there are not enough threads for requests? Where does the
> request go in this case? Well, since the server now knows it is too
> busy, it can do:
> - wait until a thread is ready.
> - grow the thread pool.
> - add request into a backlog of requests, and feed it to workers next
> time around the loop OR send a special FEEDWORKERS to one of the
> worker threads OR have a separate worker thread for feeding threads
> from the back log.
>
> The selection of threads could be different... like it could now be
> fairly easy to pick a thread which has already served a similar
> request... or one which has already done a request to that ip
> address... but that can come later out of this design change.
>
> I'm not sure if having threads wait on separate queues would be faster
> and use less memory... but I think it might.
>
> What do you think? Worth trying to implement it/test/benchmark?

Sounds worth it to me (although only measurement will tell). This is one of the reasons I pulled the ThreadPool out of the WSGIServer, so you could plug in a compatible one that worked differently.

Robert Brewer
fumanchu@...
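A minimal sketch of the delegator arrangement described in this reply, written for Python 3 and not taken from CherryPy. The names `LifoPool`, `Worker`, `_STOP`, `mark_idle` and `handler` are hypothetical. Only `put()` on the pool, the single `self.conn` slot, and the "[].append() and .pop()" LIFO come from the message itself. The main thread calls `put(conn)`, a delegator thread pulls from that queue and hands each connection to the most recently idle worker, and it blocks (rather than the main thread) when no worker is free.

```python
import queue
import threading

_STOP = object()  # hypothetical shutdown sentinel, not part of any real API


class Worker(threading.Thread):
    def __init__(self, pool, name):
        super().__init__(name=name, daemon=True)
        self.pool = pool
        self.conn = None                    # single slot instead of a queue
        self.ready = threading.Condition()

    def assign(self, conn):
        with self.ready:
            self.conn = conn
            self.ready.notify()

    def run(self):
        while True:
            self.pool.mark_idle(self)       # push self onto the LIFO
            with self.ready:
                while self.conn is None:
                    self.ready.wait()
                conn, self.conn = self.conn, None
            if conn is _STOP:
                return
            self.pool.handler(self.name, conn)


class LifoPool:
    def __init__(self, size, handler):
        self.handler = handler
        self.requests = queue.Queue()       # filled by the main thread
        self.idle = []                      # LIFO: append() and pop()
        self.idle_cv = threading.Condition()
        self.workers = [Worker(self, "w%d" % i) for i in range(size)]
        self.delegator = threading.Thread(target=self._delegate, daemon=True)

    def start(self):
        for w in self.workers:
            w.start()
        self.delegator.start()

    def put(self, conn):
        self.requests.put(conn)

    def mark_idle(self, worker):
        with self.idle_cv:
            self.idle.append(worker)
            self.idle_cv.notify()

    def _next_idle(self):
        # Block until some worker is idle, then take the most recent one.
        with self.idle_cv:
            while not self.idle:
                self.idle_cv.wait()
            return self.idle.pop()

    def _delegate(self):
        while True:
            conn = self.requests.get()
            if conn is _STOP:
                for _ in self.workers:      # one sentinel per worker
                    self._next_idle().assign(_STOP)
                return
            self._next_idle().assign(conn)

    def stop(self):
        self.requests.put(_STOP)
        self.delegator.join()
        for w in self.workers:
            w.join()


handled = []
pool = LifoPool(3, lambda name, conn: handled.append((name, conn)))
pool.start()
for i in range(10):
    pool.put("conn-%d" % i)
pool.stop()
```

Here the request queue doubles as the backlog, so "wait until a thread is ready" is the only overflow strategy shown. Growing the pool would go in `_next_idle`, and which worker ends up serving which connection depends on thread timing.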
Re: wsgiserver.py possible improvements - LRU thread selection for connections

On Sep 27, 2:46 am, "Robert Brewer" <fuman...@...> wrote:
> Since each thread only handles one request at a time, you don't really
> need a queue per thread, just set self.conn = server.requests.get, and
> set self.conn back to None when you're done. The only benefit to a Queue
> in that case would be the backoff algorithm Queues use (wait longer and
> longer while empty), which is also available in other threading
> synchronization classes. See threading._Condition.wait for details.

indeed.

> > The WorkerThread would need to be put on a queue for most recently
> > busy. It will be responsible for putting itself on a LIFO in the
> > server when it has finished its request... then waiting on its
> > internal queue.
>
> Where LIFO is spelled "[].append() and .pop()", right? ;)

yeah, cool :)

> > So then the main thread gets the most recently busy WorkerThread off
> > the LIFO, and hands it a connection to its internal request queue.
>
> The main thread should obey the ThreadPool API and call
> request.put(conn). A separate "delegator thread" should get the most
> recently busy WorkerThread off the LIFO, and hand it the HTTPConnection
> object it obtains via ThreadPool.get(). If no WorkerThreads are free at
> the moment, that would allow the main thread to go back to listening on
> the socket, letting the delegator thread wait/grow/backlog/feed instead.

yeah, I think that will work ok.

cheers!
Re: wsgiserver.py possible improvements - LRU thread selection for connections

illume wrote:
> I've been reading through the wsgiserver.py code the last few days,
> and have this suggestion.
>
> Using the most recently busy threads will make wsgiserver.py a happy
> camper.
>
> This is nicer because:
> - the threads' memory is more likely cached.

No. All threads in a process share the same memory space, and hence the same cache behavior.

> - less contention over the one request queue by all the worker
> threads.

I don't see how "contention" could be an issue. Manipulating a queue is not a resource-intensive operation.

> - the busy threads might not have to go to sleep

Why would busy threads ever go to sleep?

> I'm not quite sure the best way to do this yet... so here is the
> plan: any ideas welcome.
>
> The idea would be to have a queue per WorkerThread for requests. Each
> WorkerThread would wait on its own request queue.

So, at the risk of oversimplifying your proposal, you're suggesting that we replace a single queue of incoming requests with a series of queues, one per thread? The biggest problem with a scheme like that is that it requires the scheduler to be omniscient about the requests that are currently pending. If one thread is given a task that takes a very long time, any requests in its queue will be delayed until the long request finishes. The scheduler can't possibly know that when it is assigning requests to queues. The single queue model eliminates that problem; the long-running task does not interfere with the scheduling of future requests.

This kind of request scheduling is a well-studied research topic, with much of the fundamental research having been done in the 1960s. Unless I have misunderstood your proposal, I think you'll find that your suggestion is counterproductive.

--
Tim Roberts, timr@...
Providenza & Boekelheide, Inc.
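The objection above can be illustrated with a toy calculation. This is only a sketch under simplifying assumptions, none of which come from the thread: all requests are already waiting at time 0, durations are known numbers, and the per-thread scheme assigns requests round-robin up front. The function names are invented for the example.

```python
import heapq


def shared_queue_finish_times(durations, workers):
    """Whichever worker frees up first pulls the next job from one FIFO queue."""
    free_at = [0] * workers          # min-heap of times each worker becomes free
    heapq.heapify(free_at)
    finish = []
    for d in durations:
        start = heapq.heappop(free_at)
        finish.append(start + d)
        heapq.heappush(free_at, start + d)
    return finish


def per_worker_queue_finish_times(durations, workers):
    """Jobs are dealt round-robin into private queues before anything runs."""
    free_at = [0] * workers
    finish = []
    for i, d in enumerate(durations):
        w = i % workers
        free_at[w] += d              # job waits behind everything already queued on w
        finish.append(free_at[w])
    return finish


durations = [10, 1, 1, 1, 1, 1]      # one long request followed by short ones
shared = shared_queue_finish_times(durations, 2)
private = per_worker_queue_finish_times(durations, 2)
print(shared)    # [10, 1, 2, 3, 4, 5]
print(private)   # [10, 1, 11, 2, 12, 3]
```

With the shared queue, the short requests all drain through the second worker while the first is stuck on the long one. With private queues, the requests dealt to the first worker sit behind the long request even though the other worker is idle.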
Re: wsgiserver.py possible improvements - LRU thread selection for connections

On Sep 28, 6:55 pm, Tim Roberts <t...@...> wrote:
> illume wrote:
> > I've been reading through the wsgiserver.py code the last few days,
> > and have this suggestion.
>
> > Using the most recently busy threads will make wsgiserver.py a happy
> > camper.
>
> > This is nicer because:
> > - the threads' memory is more likely cached.
>
> No. All threads in a process share the same memory space, and hence the
> same cache behavior.

Hello,

That's not entirely true these days. Apart from thread local storage, threads have their own stack space (glibc defaults to 2MB I think), and also migrate to different CPUs, and store different data in different CPU caches.

So using threads in this manner - trying to reduce the number of threads doing work - aims to reduce the memory used, and reduce the number of context switches.

> > - less contention over the one request queue by all the worker
> > threads.
>
> I don't see how "contention" could be an issue. Manipulating a queue is
> not a resource-intensive operation.

If multiple threads are trying to get at the queue, and it's a single reader - single writer queue, then there can be problems. I'm not exactly sure how the python Queues work... but when I've used them it's been slower than when I've avoided using them.

> > - the busy threads might not have to go to sleep
>
> Why would busy threads ever go to sleep?

Once a thread has done its job it tries to get something else off the queue. Then there might be 32 other threads waiting around for something to do. So now the thread has a smaller chance of getting right back to work - and can instead go back to sleep. Unfortunately this means you are not making optimal use of your available threads - in that keeping a smaller set of busy threads constantly working is better.

> > I'm not quite sure the best way to do this yet... so here is the
> > plan: any ideas welcome.
>
> > The idea would be to have a queue per WorkerThread for requests. Each
> > WorkerThread would wait on its own request queue.
>
> So, at the risk of oversimplifying your proposal, you're suggesting that
> we replace a single queue of incoming requests with a series of queues,
> one per thread? The biggest problem with a scheme like that is that it
> requires the scheduler to be omniscient about the requests that are
> currently pending. If one thread is given a task that takes a very long
> time, any requests in its queue will be delayed until the long request
> finishes. The scheduler can't possibly know that when it is assigning
> requests to queues. The single queue model eliminates that problem; the
> long-running task does not interfere with the scheduling of future requests.

The idea (currently) is to have one queue still... but feeding the threads in a way such that the most recently used threads are selected first. So rather than have the threads look on the queue for connections, they wait until they are given a connection to process.

How will the workers be fed? A separate thread will take connections from the main thread then feed them to the workers. This will allow for the case when there are more connections than threads. I think the main thread should try and feed the other threads if it is able to. That is, it should first look at the LIFO for an available worker, then give that worker the connection to do some work.

> This kind of request scheduling is a well-studied research topic, with
> much of the fundamental research having been done in the 1960s. Unless
> I have misunderstood your proposal, I think you'll find that your
> suggestion is counterproductive.

The idea is not my own, but is used in other servers and programs I've seen (non python). However python throws a bunch of other variables into the mix. Trying it is probably the only way to see for sure I guess... and I think it's worth it for me to try now (but still far from certain :)

Either way I think it'll be an educational little project for me to try out.

Thanks for your comments.

cheers!
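A toy model of the "smaller set of busy threads" argument, for illustration only. It assumes a light load where each request finishes before the next one arrives, and it models threads blocked on a shared queue as being woken in rotation (FIFO). That rotation is an assumption, since the real wake-up order depends on the Queue and OS implementation. The `workers_touched` function is made up for this example.

```python
from collections import deque


def workers_touched(num_workers, num_requests, lifo):
    """Count how many distinct workers end up serving a run of sequential requests."""
    idle = deque(range(num_workers))
    used = set()
    for _ in range(num_requests):
        # LIFO takes the worker that went idle most recently,
        # FIFO takes the one that has been idle the longest.
        worker = idle.pop() if lifo else idle.popleft()
        used.add(worker)
        idle.append(worker)          # request done, worker goes idle again
    return len(used)


print(workers_touched(32, 100, lifo=True))    # 1
print(workers_touched(32, 100, lifo=False))   # 32
```

Under these assumptions LIFO selection keeps a single worker doing all the work, while rotation cycles every request through a different thread. Whether that translates into fewer context switches or better cache behavior in a real server is exactly what would need benchmarking.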