Paul Davis wrote:
> Changes:
>
> * use poll+read, not just read, when waiting for clients to finish up
> non-process "event" handling
> * mark clients as Finished after their process callback has
> executed
> * remove clients that failed to respond to events
> * add new -r option to completely remove the JACK shm registry
> at startup (orthogonal to everything else, but in my codebase for
> months)
>
> I would commit this directly, but I'm trying to be cautious for once. It
> works much better for me now. Note that I believe there may be some
> locking issues still to address in the code (insufficient locking, that
> is, not deadlocks).
I've tested it, and it seems to lessen the problems.
Before this patch, doing the following was a recipe for disaster:
* jackd -R -d dummy -p 64
* open/close an ardour2 session with 32 tracks
With this patch I can no longer reproduce that. However, the
following still does not work:
* jackd -R -d dummy -p 32
* start ardour2
The error messages (non-debug) are as follows:
loading bindings from /home/ppalmers/.ardour2/ardour.bindings
ardour: [ERROR]: JACK: cannot read result for request type 6 from server
(Success)
ardour: [ERROR]: cannot activate JACK client
ardour: [ERROR]: JACK: cannot send event response to engine (Broken pipe)
ardour: [ERROR]: JACK: cannot complete execution of the processing graph
(Resource temporarily unavailable)
ardour: [ERROR]: JACK: zombified - calling shutdown handler
and, on the jackd side:
loading driver ..
creating dummy driver ... dummy_pcm|48000|32|666|2|2
subgraph starting at ardour timed out (subgraph_wait_fd=-1, status = -1,
state = Not triggered, revents = 0x0000)
bad status (1) for client event handling (type = 5)
cannot write request result to client
could not handle external client request
subgraph starting at ardour timed out (subgraph_wait_fd=-1, status = -1,
state = Not triggered, revents = 0x0000)
bad status (1) for client event handling (type = 5)
cannot write request result to client
could not handle external client request
delay of 1455.978 usecs exceeds estimated spare time of 662.000; restart ...
subgraph starting at ardour timed out (subgraph_wait_fd=-1, status = -1,
state = Not triggered, revents = 0x0000)
bad status (1) for client event handling (type = 5)
cannot write request result to client
could not handle external client request
delay of 898.114 usecs exceeds estimated spare time of 662.000; restart ...
I know it's pushing the limits, but that's what testing is supposed to
do, no? I don't think this is the behavior we want.
Anyway, some comments:
int poll_timeout = (engine->client_timeout_msecs > 0 ?
engine->client_timeout_msecs :
1 + engine->driver->period_usecs/1000);
I don't think the period time is a good basis for the event delivery
timeout. A client might use a lot of CPU (e.g. 60%), so it could take
0.6*period_usecs to receive the request and another 0.6*period_usecs to
send the reply, which together is already more than one period. I
haven't looked at the respective code, so I don't know how realistic
this is.
In any case, I think it's a bit premature to kill a client simply
because it fails to reply within one period. That would mean the same RT
constraint for processing events as for process(). Is that the case?
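To put numbers on it, here is a small sketch of the same expression as a
standalone function. The function and parameter names are mine, and I'm
assuming integer arithmetic; I have not checked the actual types in the
engine. With the 666 usec period from the dummy driver log below
(48000 Hz, 32 frames), period_usecs/1000 truncates to 0, so the fallback
timeout comes out at 1 msec when client_timeout_msecs is not set.

```c
/* Mirrors the expression from the patch. Names are illustrative and the
 * integer types are an assumption, not taken from the JACK sources. */
static int event_poll_timeout_msecs(int client_timeout_msecs,
                                    unsigned long period_usecs)
{
    /* An explicit client timeout wins; otherwise fall back to the
     * period, rounded down to whole msecs, plus one. */
    return client_timeout_msecs > 0
        ? client_timeout_msecs
        : (int)(1 + period_usecs / 1000);
}
```

So event_poll_timeout_msecs(0, 666) gives 1, and a period of about
1333 usecs (64 frames at 48000 Hz) gives 2.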
The main goal of this was to detect and kick clients that don't return
from their process callback (or that fail to respond to events for some
other reason). I therefore think it is ok to use a longer timeout.
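For reference, the poll+read pattern with a caller-chosen timeout could
look roughly like this. It is only a sketch: read_with_timeout() is a
hypothetical helper, not the code from the patch, and it ignores partial
reads for brevity.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not JACK's actual code: wait up to timeout_msecs
 * for fd to become readable, then read len bytes from it.
 * Returns 1 on success, 0 on timeout, -1 on error, hangup or short read. */
static int read_with_timeout(int fd, void *buf, size_t len, int timeout_msecs)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ready;
    ssize_t n;

    /* Retry if a signal interrupts the wait. Note that this restarts
     * the full timeout; a stricter version would track elapsed time. */
    do {
        ready = poll(&pfd, 1, timeout_msecs);
    } while (ready < 0 && errno == EINTR);

    if (ready == 0)
        return 0;               /* client did not answer in time */
    if (ready < 0)
        return -1;              /* poll itself failed */
    if (!(pfd.revents & POLLIN))
        return -1;              /* POLLERR, POLLHUP or POLLNVAL, no data */

    n = read(fd, buf, len);
    return (n == (ssize_t) len) ? 1 : -1;
}
```

The point is that the timeout is just an argument here, so nothing ties
it to the period size; a return value of 0 is the case where the server
would decide whether to remove the client.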
In configure.ac:
"IPC Temporary directory" and "Default tmp dir" both refer to the
default tmpdir. Is that intentional? Is it useful? Isn't the tmp dir the
IPC dir?
Greets,
Pieter