new patch to help jackd along

Changes:

* use poll+read, not just read, when waiting for clients to finish up
  non-process "event" handling
* mark clients as Finished after their process callback has executed
* remove clients that failed to respond to events
* add new -r option to completely remove the JACK shm registry
  at startup (orthogonal to everything else, but in my codebase for months)

I would commit this directly, but I'm trying to be cautious for once. It
works much better for me now. Note that I believe there may be some
locking issues still to address in the code (insufficient locking,
that is, not deadlocks).

--p

[jack-20080508.patch]

Index: libjack/client.c
===================================================================
--- libjack/client.c	(revision 1175)
+++ libjack/client.c	(working copy)
@@ -1683,14 +1683,23 @@
 	/* wait for first wakeup from server */
 	if (jack_thread_first_wait (client) == control->nframes) {
+		/* now run till we're done */
+		if (control->process) {
+			/* run process callback, then wait... ad-infinitum */
-			while (jack_thread_wait (client,
-				control->process (control->nframes,
-					control->process_arg)) ==
-				control->nframes)
-				;
+
+			while (1) {
+				int status = (control->process (control->nframes,
+					control->process_arg) ==
+					control->nframes);
+				control->state = Finished;
+				if (!jack_thread_wait (client, status)) {
+					break;
+				}
+			}
+		} else {
 			/* no process handling but still need to process events */
 			while (jack_thread_wait (client, 0) == control->nframes)
Index: libjack/shm.c
===================================================================
--- libjack/shm.c	(revision 1175)
+++ libjack/shm.c	(working copy)
@@ -243,7 +243,7 @@
  * returns: 0 if successful
  */
 static int
-jack_server_initialize_shm (void)
+jack_server_initialize_shm (int new_registry)
 {
 	int rc;

@@ -254,6 +254,11 @@

 	rc = jack_access_registry (&registry_info);

+	if (new_registry) {
+		jack_remove_shm (&registry_id);
+		rc = ENOENT;
+	}
+
 	switch (rc) {
 	case ENOENT:		/* registry does not exist */
 		rc = jack_create_registry (&registry_info);
@@ -302,6 +307,7 @@
 	jack_set_server_prefix (server_name);

 	jack_shm_lock_registry ();
+
 	if ((rc = jack_access_registry (&registry_info)) == 0) {
 		if ((rc = jack_shm_validate_registry ()) != 0) {
 			jack_error ("Incompatible shm registry, "
@@ -373,7 +379,7 @@
  * ENOMEM if unable to access shared memory registry
  */
 int
-jack_register_server (const char *server_name)
+jack_register_server (const char *server_name, int new_registry)
 {
 	int i;
 	pid_t my_pid = getpid ();
@@ -382,7 +388,7 @@

 	jack_info ("JACK compiled with %s SHM support.", JACK_SHM_TYPE);

-	if (jack_server_initialize_shm ())
+	if (jack_server_initialize_shm (new_registry))
 		return ENOMEM;

 	jack_shm_lock_registry ();
Index: configure.ac
===================================================================
--- configure.ac	(revision 1175)
+++ configure.ac	(working copy)
@@ -789,5 +789,6 @@
 echo \| Shared memory interface............................... : $JACK_SHM_TYPE
 echo \| IPC Temporary directory............................... : $DEFAULT_TMP_DIR
 echo \| Install prefix........................................ : $prefix
+echo \| Default tmp dir....................................... : $DEFAULT_TMP_DIR
 echo
Index: jack/engine.h
===================================================================
--- jack/engine.h	(revision 1175)
+++ jack/engine.h	(working copy)
@@ -116,6 +116,7 @@
 	int reordered;
 	int watchdog_check;
 	int feedbackcount;
+	int removing_clients;
 	pid_t wait_pid;
 	pthread_t freewheel_thread;
 	int nozombies;
Index: jack/shm.h
===================================================================
--- jack/shm.h	(revision 1175)
+++ jack/shm.h	(working copy)
@@ -95,7 +95,7 @@

 /* here beginneth the API */

-extern int jack_register_server (const char *server_name);
+extern int jack_register_server (const char *server_name, int new_registry);
 extern void jack_unregister_server (const char *server_name);

 extern int jack_initialize_shm (const char *server_name);
Index: jackd/jackd.1.in
===================================================================
--- jackd/jackd.1.in	(revision 1175)
+++ jackd/jackd.1.in	(working copy)
@@ -58,6 +58,13 @@
 Set the maximum number of ports the JACK server can manage.
 The default value is 256.
 .TP
+\fB\-r, \-\-replace-registry\fR
+.br
+Remove the shared memory registry used by all JACK server instances
+before startup. This should rarely be used, and is intended only
+for occasions when the structure of this registry changes in ways
+that are incompatible across JACK versions (which is rare).
+.TP
 \fB\-R, \-\-realtime\fR
 .br
 Use realtime scheduling. This is needed for reliable low\-latency
Index: jackd/clientengine.c
===================================================================
--- jackd/clientengine.c	(revision 1175)
+++ jackd/clientengine.c	(working copy)
@@ -25,6 +25,7 @@

 #include <errno.h>
 #include <stdio.h>
+#include <unistd.h>
 #include <string.h>

 #include <jack/internal.h>
@@ -37,8 +38,6 @@
 #include "clientengine.h"
 #include "transengine.h"

-#define JACK_ERROR_WITH_SOCKETS 10000000
-
 static void
 jack_client_disconnect_ports (jack_engine_t *engine,
 	jack_client_internal_t *client)
@@ -68,7 +67,10 @@
 	jack_client_internal_t *client, int sort_graph)
 {
 	/* caller must hold engine->client_lock and must have checked for and/or
-	 * cleared all connections held by client. */
+	 * cleared all connections held by client.
+	 */
+	VERBOSE(engine,"+++ deactivate %s", client->control->name);
+
 	client->control->active = FALSE;

 	jack_transport_client_exit (engine, client);
@@ -77,7 +79,7 @@
 	    engine->external_client_cnt > 0) {
 		engine->external_client_cnt--;
 	}
-
+
 	if (sort_graph) {
 		jack_sort_graph (engine);
 	}
@@ -166,6 +168,12 @@
 	JSList *tmp, *node;
 	int need_sort = FALSE;
 	jack_client_internal_t *client;
+
+	if (engine->removing_clients) {
+		return;
+	}
+
+	engine->removing_clients++;

 	/* remove all dead clients */

@@ -174,6 +182,8 @@
 		tmp = jack_slist_next (node);

 		client = (jack_client_internal_t *) node->data;
+
+		VERBOSE(engine, "client %s error status %d", client->control->name, client->error);

 		if (client->error) {

@@ -220,6 +230,8 @@
 	}

 	jack_engine_reset_rolling_usecs (engine);
+
+	engine->removing_clients--;
 }

 static int
@@ -848,6 +860,10 @@
 		if (((jack_client_internal_t *) node->data)->request_fd == fd) {
 			client = (jack_client_internal_t *) node->data;
+			VERBOSE (engine, "marking socket error on client %s state = "
+				"%s errors = %d", client->control->name,
+				jack_client_state_name (client),
+				client->error);
 			if (client->error < JACK_ERROR_WITH_SOCKETS) {
 				client->error += JACK_ERROR_WITH_SOCKETS;
 			}
@@ -961,3 +977,4 @@
 		req->status = JackFailure;
 	}
 }
+
Index: jackd/engine.c
===================================================================
--- jackd/engine.c	(revision 1175)
+++ jackd/engine.c	(working copy)
@@ -1018,8 +1018,9 @@
 void
 jack_driver_unload (jack_driver_t *driver)
 {
+	void* handle = driver->handle;
 	driver->finish (driver);
-	dlclose (driver->handle);
+	dlclose (handle);
 }

 int
@@ -1471,6 +1472,7 @@
 			}

 			if (pfd[i].revents & ~POLLIN) {
+				VERBOSE (engine, "client poll reports non-input condition, fd was %d", pfd[i].fd);
 				jack_client_disconnect (engine, pfd[i].fd);
 			} else if (pfd[i].revents & POLLIN) {
 				if (handle_external_client_request
@@ -1630,6 +1632,7 @@
 	engine->feedbackcount = 0;
 	engine->wait_pid = wait_pid;
 	engine->nozombies = nozombies;
+	engine->removing_clients = 0;

 	engine->audio_out_cnt = 0;
 	engine->audio_in_cnt = 0;
@@ -1815,7 +1818,6 @@
 static void
 jack_engine_delay (jack_engine_t *engine, float delayed_usecs)
 {
-	JSList *node;
 	jack_event_t event;

 	engine->control->frame_timer.reset_pending = 1;
@@ -1827,13 +1829,7 @@

 	event.type = XRun;

-	jack_lock_graph (engine);
-	for (node = engine->clients; node; node = jack_slist_next (node)) {
-		jack_deliver_event (engine,
-			(jack_client_internal_t *) node->data,
-			&event);
-	}
-	jack_unlock_graph (engine);
+	jack_deliver_event_to_all (engine, &event);
 }

 static inline void
@@ -2265,6 +2261,7 @@
 			event);
 	}
 	jack_unlock_graph (engine);
+	jack_remove_clients (engine);
 }

 static void
@@ -2307,9 +2304,8 @@
 	   our check on a client's continued well-being
 	*/

-	if (client->control->dead
-	    || (client->control->type == ClientExternal
-	        && kill (client->control->pid, 0))) {
+	if (client->control->dead || client->error >= JACK_ERROR_WITH_SOCKETS
+	    || (client->control->type == ClientExternal && kill (client->control->pid, 0))) {
 		DEBUG ("client %s is dead - no event sent",
 			client->control->name);
 		return 0;
@@ -2378,32 +2374,74 @@
 				jack_error ("cannot send event to client [%s]"
 					" (%s)", client->control->name,
 					strerror (errno));
-				client->error++;
+				client->error = JACK_ERROR_WITH_SOCKETS+99;
 			}
-
-			DEBUG ("engine reading from event fd");
-
-			if (!client->error &&
-			    (read (client->event_fd, &status, sizeof (status))
-			     != sizeof (status))) {
-				jack_error ("cannot read event response from "
-					"client [%s] (%s)",
-					client->control->name,
-					strerror (errno));
-				client->error++;
-			}
-			DEBUG ("engine reading from event fd DONE");
-
-			if (status != 0) {
-				jack_error ("bad status for client event "
-					"handling (type = %d)",
-					event->type);
-				client->error++;
-			}
+			if (client->error) {
+				status = 1;
+			} else {
+				// then we check whether there really is an error.... :)
+
+				struct pollfd pfd[1];
+				pfd[0].fd = client->event_fd;
+				//pfd[0].events = POLLERR|POLLIN|POLLHUP|POLLNVAL;
+				pfd[0].events = POLLIN;
+
+				int poll_timeout = (engine->client_timeout_msecs > 0 ?
+					engine->client_timeout_msecs :
+					1 + engine->driver->period_usecs/1000);
+
+				//poll_timeout = 200;
+				//poll_timeout = 30000; // 30 seconds
+
+				int poll_ret;
+				// printf("################ poll_timeout = %d (%d or 1 + %d/1000)\n", poll_timeout, engine->client_timeout_msecs, engine->driver->period_usecs);
+
+				if ( (poll_ret = poll (pfd, 1, poll_timeout)) < 0) {
+					DEBUG ("client event poll not ok! (-1) poll returned an error");
+					jack_error ("poll on subgraph processing failed (%s)", strerror (errno));
+					status = -1;
+				} else {
+
+					DEBUG ("\n\n\n\n\n back from client event poll, revents = 0x%x\n\n\n", pfd[0].revents);
+
+					if (pfd[0].revents & ~POLLIN) {
+						DEBUG ("client event poll not ok! (-2), revents = %d\n", pfd[0].revents);
+						jack_error ("subgraph starting at %s lost client", client->control->name);
+						status = -2;
+					}
+
+					if (pfd[0].revents & POLLIN) {
+						DEBUG ("client event poll ok!");
+						status = 0;
+					} else {
+						DEBUG ("client event poll not ok! (1 = poll timed out, revents = 0x%04x, poll_ret = %d)", pfd[0].revents, poll_ret);
+						jack_error ("subgraph starting at %s timed out "
+							"(subgraph_wait_fd=%d, status = %d, state = %s, revents = 0x%04x)",
+							client->control->name,
+							client->subgraph_wait_fd, status,
+							jack_client_state_name (client), pfd[0].revents);
+						status = 1;
+					}
+				}
+			}
+
+			if (status == 0) {
+				if (read (client->event_fd, &status, sizeof (status)) != sizeof (status)) {
+					jack_error ("cannot read event response from "
+						"client [%s] (%s)",
+						client->control->name,
+						strerror (errno));
+					client->error = JACK_ERROR_WITH_SOCKETS+99;
+				}
+			} else {
+				jack_error ("bad status (%d) for client event "
+					"handling (type = %d)",
+					status,event->type);
+				client->error = JACK_ERROR_WITH_SOCKETS+99;
+			}
 		}
 	}
-
 	DEBUG ("event delivered");

 	return 0;
@@ -2431,6 +2469,10 @@

 		next = jack_slist_next (node);

+		VERBOSE(engine, "+++ client is now %s active ? %d",
+			((jack_client_internal_t *) node->data)->control->name,
+			((jack_client_internal_t *) node->data)->control->active);
+
 		if (((jack_client_internal_t *) node->data)->control->active) {

 			client = (jack_client_internal_t *) node->data;
@@ -2560,6 +2602,8 @@
 			subgraph_client->subgraph_wait_fd, n);
 	}

+	jack_remove_clients (engine);
+
 	VERBOSE (engine, "-- jack_rechain_graph()");

 	return err;
@@ -3161,6 +3205,8 @@
 	}
 	jack_unlock_graph (engine);

+	jack_remove_clients (engine);
+
 	return 0;
 }

@@ -3262,6 +3308,8 @@
 		}
 	}

+	jack_remove_clients (engine);
+
 	if (check_acyclic) {
 		jack_check_acyclic (engine);
 	}
@@ -3785,6 +3833,8 @@
 			}
 		}
 	}
+
+	jack_remove_clients (engine);
 }

 void
@@ -3820,6 +3870,8 @@
 			}
 		}
 	}
+
+	jack_remove_clients (engine);
 }

 int
Index: jackd/clientengine.h
===================================================================
--- jackd/clientengine.h	(revision 1175)
+++ jackd/clientengine.h	(working copy)
@@ -40,6 +40,8 @@
 	return client_state_names[client->control->state];
 }

+#define JACK_ERROR_WITH_SOCKETS 10000000
+
 int jack_client_activate (jack_engine_t *engine, jack_client_id_t id);
 int jack_client_deactivate (jack_engine_t *engine, jack_client_id_t id);
 int jack_client_create (jack_engine_t *engine, int client_fd);
Index: jackd/jackd.c
===================================================================
--- jackd/jackd.c	(revision 1175)
+++ jackd/jackd.c	(working copy)
@@ -371,6 +371,7 @@
 " [ --debug-timer OR -D ]\n"
 " [ --verbose OR -v ]\n"
 " [ --clocksource OR -c [ c(ycle) | h(pet) | s(ystem) ]\n"
+" [ --replace-registry OR -r ]\n"
 " [ --silent OR -s ]\n"
 " [ --version OR -V ]\n"
 " [ --nozombies OR -Z ]\n"
@@ -520,6 +521,7 @@
 		{ "name", 1, 0, 'n' },
 		{ "unlock", 0, 0, 'u' },
 		{ "realtime", 0, 0, 'R' },
+		{ "replace-registry", 0, 0, 'r' },
 		{ "realtime-priority", 1, 0, 'P' },
 		{ "timeout", 1, 0, 't' },
 		{ "temporary", 0, 0, 'T' },
@@ -537,6 +539,7 @@
 	JSList * driver_params;
 	int driver_nargs = 1;
 	int show_version = 0;
+	int replace_registry = 0;
 	int i;
 	int rc;

@@ -593,6 +596,10 @@
 			realtime_priority = atoi (optarg);
 			break;

+		case 'r':
+			replace_registry = 1;
+			break;
+
 		case 'R':
 			realtime = 1;
 			break;
@@ -695,7 +702,7 @@

 	copyright (stdout);

-	rc = jack_register_server (server_name);
+	rc = jack_register_server (server_name, replace_registry);
 	switch (rc) {
 	case EEXIST:
 		fprintf (stderr, "`%s' server already active\n", server_name);

_______________________________________________
Jack-Devel mailing list
Jack-Devel@...
http://lists.jackaudio.org/listinfo.cgi/jack-devel-jackaudio.org
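For readers skimming the patch, the "poll+read, not just read" change in the first bullet can be illustrated outside of JACK. The following is only a minimal sketch under stated assumptions: `wait_for_reply` and the `WAIT_*` codes are hypothetical names made up for this illustration and do not appear in the patch or in libjack. It waits a bounded time for a reply on a file descriptor and only calls `read` once `poll` has reported input, so a peer that never answers cannot block the caller forever.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; they are not JACK's. */
enum wait_result {
    WAIT_OK = 0,          /* a full status word was read */
    WAIT_TIMEOUT = 1,     /* nothing arrived within the timeout */
    WAIT_POLL_ERROR = -1, /* poll() itself failed */
    WAIT_LOST = -2,       /* error/hangup reported and no data to read */
    WAIT_SHORT_READ = -3  /* read() returned fewer bytes than expected */
};

/* Wait up to timeout_msecs for input on fd, then read one int into *status. */
static int wait_for_reply(int fd, int timeout_msecs, int *status)
{
    struct pollfd pfd;
    int ret;

    assert(status != NULL);

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    ret = poll(&pfd, 1, timeout_msecs);
    if (ret < 0) {
        return WAIT_POLL_ERROR;
    }
    if (ret == 0) {
        return WAIT_TIMEOUT;
    }
    if (!(pfd.revents & POLLIN)) {
        /* only POLLERR, POLLHUP or POLLNVAL was reported */
        return WAIT_LOST;
    }
    if (read(fd, status, sizeof(*status)) != (ssize_t) sizeof(*status)) {
        return WAIT_SHORT_READ;
    }
    return WAIT_OK;
}
```

A caller in the spirit of the patch would treat anything other than `WAIT_OK` as a reason to mark the peer as failed and clean it up later, rather than blocking in `read`.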
|
|
Re: new patch to help jackd along

On Thu, 2008-05-08 at 22:24 -0400, Paul Davis wrote:
> Changes:
>
> * use poll+read, not just read, when waiting for clients to finish up
> non-process "event" handling
> * mark clients as Finished after their process callback has
> executed
> * remove clients that failed to respond to events
> * add new -r option to completely remove the JACK shm registry
> at startup (orthogonal to everything else, but in my codebase for
> months)
>
> I would commit this directly, but I'm trying to be cautious for once. It
> works much better for me now. Note that I believe there may be some
> locking issues still to address in the code (insufficient locking, that
> is, not deadlocks).

thanks+++++++
I'll try it out asap...

-- Fernando
|
|
Re: new patch to help jackd along

Paul Davis wrote:
> Changes:
>
> * use poll+read, not just read, when waiting for clients to finish up
> non-process "event" handling
> * mark clients as Finished after their process callback has
> executed
> * remove clients that failed to respond to events
> * add new -r option to completely remove the JACK shm registry
> at startup (orthogonal to everything else, but in my codebase for
> months)
>
> I would commit this directly, but I'm trying to be cautious for once. It
> works much better for me now. Note that I believe there may be some
> locking issues still to address in the code (insufficient locking, that
> is, not deadlocks).

I've tested it, and it seems to lessen the problems.
Before, doing the following was a recipe for disaster:
* jackd -R -d dummy -p 64
* open/close an ardour2 session with 32 tracks

With this patch I cannot reproduce this anymore. However, doing the
following is still not working:
* jackd -R -d dummy -p 32
* start ardour2

The error messages (non-debug) are as follows:

loading bindings from /home/ppalmers/.ardour2/ardour.bindings
ardour: [ERROR]: JACK: cannot read result for request type 6 from server (Success)
ardour: [ERROR]: cannot activate JACK client
ardour: [ERROR]: JACK: cannot send event response to engine (Broken pipe)
ardour: [ERROR]: JACK: cannot complete execution of the processing graph (Resource temporarily unavailable)
ardour: [ERROR]: JACK: zombified - calling shutdown handler

and

loading driver ..
creating dummy driver ... dummy_pcm|48000|32|666|2|2
subgraph starting at ardour timed out (subgraph_wait_fd=-1, status = -1, state = Not triggered, revents = 0x0000)
bad status (1) for client event handling (type = 5)
cannot write request result to client
could not handle external client request
subgraph starting at ardour timed out (subgraph_wait_fd=-1, status = -1, state = Not triggered, revents = 0x0000)
bad status (1) for client event handling (type = 5)
cannot write request result to client
could not handle external client request
delay of 1455.978 usecs exceeds estimated spare time of 662.000; restart ...
subgraph starting at ardour timed out (subgraph_wait_fd=-1, status = -1, state = Not triggered, revents = 0x0000)
bad status (1) for client event handling (type = 5)
cannot write request result to client
could not handle external client request
delay of 898.114 usecs exceeds estimated spare time of 662.000; restart ...

I know it's pushing the limits, but that's what testing is supposed to
be, no? I don't think this is the behavior we want.

Anyway, some comments:

int poll_timeout = (engine->client_timeout_msecs > 0 ?
	engine->client_timeout_msecs :
	1 + engine->driver->period_usecs/1000);

I think that for the event delivery timeout using the period time is not
a good idea. It could be that a client uses a lot of CPU (e.g. 60%),
therefore it takes 0.6*period_usecs to receive the request, then it
could be that it takes 0.6*period_usecs to send the reply. Didn't look
at the respective code, so I don't know how realistic this is.

In any case, I think it's a bit premature to kill a client simply
because it fails to reply within one period. That would mean the same RT
for processing events as for process(). Is that the case?

The main goal of this was to detect and kick clients that don't return
from their process callback (or don't respond to events for another
reason). I think it therefore is ok to use a longer timeout.

In configure.ac:
"IPC Temporary directory" and "Default tmp dir" refer to the default
tmpdir. Intentional? Useful? Isn't the tmp dir the ipc dir?

Greets,

Pieter
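To put numbers on the timeout concern above, here is a small worked sketch of the quoted `poll_timeout` expression. The function names `period_usecs_for` and `event_poll_timeout_msecs` and the plain integer types are assumptions for illustration only; the real fields are `engine->client_timeout_msecs` and `engine->driver->period_usecs`, whose exact types are not shown in the thread. The period computation (frames divided by sample rate, truncated to microseconds) is also an assumption, chosen because it reproduces the 666 visible in the dummy driver line.

```c
#include <assert.h>

/* Assumed period length in microseconds for a given buffer size and rate. */
static long period_usecs_for(long frames, long sample_rate)
{
    assert(sample_rate > 0);
    return frames * 1000000L / sample_rate;
}

/* Same shape as the expression quoted from the patch: use the configured
 * client timeout if it is set, otherwise one millisecond plus the period
 * length in whole milliseconds (integer division). */
static int event_poll_timeout_msecs(int client_timeout_msecs, long period_usecs)
{
    assert(period_usecs >= 0);
    return client_timeout_msecs > 0
        ? client_timeout_msecs
        : (int) (1 + period_usecs / 1000);
}
```

Under these assumptions, `-p 32` at 48000 Hz gives a period of 666 usecs and therefore a 1 msec event timeout when no client timeout is set, and `-p 64` gives 1333 usecs and a 2 msec timeout. That is the same order of magnitude as the period itself, which is the situation the comment above is questioning.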
|
|
Re: new patch to help jackd along

Pieter Palmers wrote:
> I've tested it, and it seems to lessen the problems.
> before doing the following was a recipe for disaster:
> * jackd -R -d dummy -p 64
> * open/close an ardour2 session with 32 tracks
>
> With this patch I cannot reproduce this anymore. However, doing the
> following is still not working:
> * jackd -R -d dummy -p 32
> * start ardour2

OK, I wasn't comparing apples to apples... The first test was using a
debug build, the tests today were using a normal one. If I use a debug
build, this patch doesn't help. The
* jackd -R -d dummy -p 64
* open/close an ardour2 session with 32 tracks
test still fails after a few open/close tries.

Greets,

Pieter
|
|
Re: new patch to help jackd along

On Fri, May 9, 2008 03:24, Paul Davis wrote:
> Changes:
>
> * use poll+read, not just read, when waiting for clients to finish up
> non-process "event" handling
> * mark clients as Finished after their process callback has executed
> * remove clients that failed to respond to events
> * add new -r option to completely remove the JACK shm registry
> at startup (orthogonal to everything else, but in my codebase for months)
>
> I would commit this directly, but I'm trying to be cautious for once. It
> works much better for me now. Note that I believe there may be some
> locking issues still to address in the code (insufficient locking,
> that is, not deadlocks).
>

sorry to tell, but it still fails on the jack_test2.c crash tester (the
very same at stake on that last night in Cologne:)

fyi, this is just one simple client that, after a while, enters into an
endless loop and tests for jackd being able to detect and remove it from
the graph. what happens is that jackd gets severely stuck and
jack_watchdog kicks in and bang! everything is thrown to the floor

funny thing, and it might just be relevant to the case, is that this
meltdown behavior seems to be most evident when, and only when, the bad
client shares the graph with any other client. when left alone, everything
seems to work just fine. puzzled ;)

byee
--
rncbc aka Rui Nuno Capela
rncbc@...
|
|
Re: new patch to help jackd along

On Fri, 2008-05-09 at 12:20 +0200, Fernando Lopez-Lezcano wrote:
> On Thu, 2008-05-08 at 22:24 -0400, Paul Davis wrote:
> > Changes:
> >
> > * use poll+read, not just read, when waiting for clients to finish up
> > non-process "event" handling
> > * mark clients as Finished after their process callback has
> > executed
> > * remove clients that failed to respond to events
> > * add new -r option to completely remove the JACK shm registry
> > at startup (orthogonal to everything else, but in my codebase for
> > months)
> >
> > I would commit this directly, but I'm trying to be cautious for once. It
> > works much better for me now. Note that I believe there may be some
> > locking issues still to address in the code (insufficient locking, that
> > is, not deadlocks).
>
> thanks+++++++
> I'll try it out asap...

ardour is started:
----
15:55:57.269 ALSA connection graph change.
15:55:57.371 ALSA connection change.
**** alsa_pcm: xrun of at least 0.029 msecs [[ always seems to happen, normal? ]]
15:55:57.504 XRUN callback (1).
15:55:57.723 JACK connection graph change.
15:56:06.041 JACK connection graph change.
15:56:06.287 JACK connection graph change.
15:56:06.388 JACK connection graph change.
unknown source port in attempted connection [system:capture_3]
unknown destination port in attempted connection [system:playback_3]
unknown source port in attempted connection [system:capture_4]
unknown destination port in attempted connection [system:playback_4]
unknown source port in attempted connection [system:capture_5]
unknown destination port in attempted connection [system:playback_5]
unknown source port in attempted connection [system:capture_6]
unknown destination port in attempted connection [system:playback_6]
unknown source port in attempted connection [system:capture_7]
unknown destination port in attempted connection [system:playback_7]
unknown source port in attempted connection [system:capture_8]
unknown destination port in attempted connection [system:playback_8]
15:56:06.582 JACK connection change.
----

start jackmix, quit jackmix, start jack-rack, load freeverb:
----
15:56:06.582 JACK connection change.
15:57:26.502 JACK connection graph change.
15:57:26.563 ALSA connection graph change.
15:57:27.551 JACK connection graph change.
15:57:27.685 JACK connection change.
15:57:30.171 JACK connection graph change.
15:57:30.291 JACK connection change.
15:57:35.850 JACK connection graph change.
15:57:35.869 ALSA connection graph change.
15:57:35.899 JACK connection change.
15:57:35.900 ALSA connection change.
----

quit jack-rack:
----
16:00:47.320 JACK connection graph change.
16:00:47.441 JACK connection change.
16:00:47.623 ALSA connection graph change.
16:00:47.644 ALSA connection change.
----

Good, it did not take down ardour with it!!

Now the interesting part: downgrade to the previous version of jackd I
had installed. Start ardour, start jack-rack, load freeverb, quit
jack-rack and _ardour is kicked out of the graph_!! So, your patch is
indeed fixing this particular problem (which was a bad one)...

Thanks...
-- Fernando
|
|
Re: new patch to help jackd along

Rui Nuno Capela wrote:
> On Fri, May 9, 2008 03:24, Paul Davis wrote:
>> Changes:
>>
>> * use poll+read, not just read, when waiting for clients to finish up
>> non-process "event" handling
>> * mark clients as Finished after their process callback has executed
>> * remove clients that failed to respond to events
>> * add new -r option to completely remove the JACK shm registry
>> at startup (orthogonal to everything else, but in my codebase for months)
>>
>> I would commit this directly, but I'm trying to be cautious for once. It
>> works much better for me now. Note that I believe there may be some
>> locking issues still to address in the code (insufficient locking,
>> that is, not deadlocks).
>>
>
> sorry to tell, but it still fails on the jack_test2.c crash tester (the
> very same at stake on that last night in Cologne:)
>
> fyi, this is just one simple client that, after a while, enters into an
> endless loop and tests for jackd being able to detect and remove it from
> the graph. what happens is that jackd gets severely stuck and
> jack_watchdog kicks in and bang! everything is thrown to the floor
>
> funny thing, and it might just be relevant to the case, is that this
> meltdown behavior seems to be most evident when, and only when, the bad
> client shares the graph with any other client. when left alone, everything
> seems to work just fine. puzzled ;)

For me it seems to be working fine. I'm on a dual core machine though so
this might be the reason.

I've attached a slightly modified version of your tester that also
displays a loop counter for when the process callback enters the stuck
loop. For me this gives the following:

ppalmers@ox-D820:~/programming/jack/tests$ ./jack_test2 2> log
seconds to run: 30
num.of ports: 2
client_name: jack_test2-32116
jack_test2-32116: client_new
jack_test2-32116: port_register
jack_test2-32116: set_process_callback
jack_test2-32116: on_shutdown
jack_test2-32116: activate
jack_test2-32116: connect: jack_test2-32116:out_0 -> system:playback_1
jack_test2-32116: connect: jack_test2-32116:out_1 -> system:playback_2
jack_test2-32116: running(0, 0)... 1
jack_test2-32116: mark!
jack_test2-32116: running(1, 2071971914)... 54
*** shutdown ***
jack_test2-32116: running(1, 2147483647)... 48
ppalmers@ox-D820:~/programming/jack/tests$

After the shutdown message, the counter stops increasing, so the process
callback is dead. The client itself stays alive, as I would expect it to
be. This is both with and without other clients in the graph.

Greets,

Pieter

/* jack_test2.c */
/****************************************************************************
   Copyright (C) 2007, rncbc aka Rui Nuno Capela.

   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License
   as published by the Free Software Foundation; either version 2
   of the License, or (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License along
   with this program; if not, write to the Free Software Foundation, Inc.,
   51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*****************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <unistd.h>	/* getpid(), sleep() */

#include <jack/jack.h>

jack_client_t *client;
jack_port_t **iports;
jack_port_t **oports;

unsigned int seconds_to_run = 30;
unsigned int num_of_ports = 2;
float mixer_gain = 1.0f;
int ret = 0;
long int loop_count = 0;

/* shutdown process.. */
void shutdown_0 (void *arg)
{
	fprintf(stdout, "\n*** shutdown ***\n");
	// ret = 0;
}

/* stand-alone process. mix all the input ports to each output port. */
int process_0 (jack_nframes_t frames, void *arg)
{
	jack_default_audio_sample_t *ibuf;
	jack_default_audio_sample_t *obuf;
	jack_nframes_t k;
	int i, j, n;

	/* once main() sets ret, spin here to simulate a stuck callback */
	if (ret > 0)
		for (n = 0; n < 0x7fffffff; ++n) { loop_count++; }

	for (i = 0; i < num_of_ports; i++) {
		obuf = (jack_default_audio_sample_t*)
			jack_port_get_buffer(oports[i], frames);
		for (j = 0; j < num_of_ports; j++) {
			ibuf = (jack_default_audio_sample_t*)
				jack_port_get_buffer(iports[j], frames);
			for (k = 0; k < frames; k++) {
				if (j == 0)
					obuf[k] = 0.0;
				if (ibuf[k] < -1E-6f || ibuf[k] > +1E-6f)
					obuf[k] += ibuf[k];
				if (j == num_of_ports - 1)
					obuf[k] *= mixer_gain;
			}
		}
	}

	return 0; // ret;
}

int main(int argc, char *argv[])
{
	char client_name[33];
	char iport_name[33];
	char oport_name[33];
	const char **pports;
	int i;

	/* seconds_to_run: default = 30 seconds. */
	if (argc > 1)
		seconds_to_run = (unsigned int) atoi(argv[1]);
	fprintf(stdout, "seconds to run: %u\n", seconds_to_run);

	/* num_of_ports: default = 2 ports. */
	if (argc > 2)
		num_of_ports = (unsigned int) atoi(argv[2]);
	fprintf(stdout, "num.of ports: %u\n", num_of_ports);

	sprintf(client_name, "jack_test2-%u", (unsigned int) getpid());
	fprintf(stdout, "client_name: %s\n", client_name);

	fprintf(stdout, "%s: client_new\n", client_name);
	client = jack_client_new(client_name);
	if (!client) {
		fprintf(stdout, "%s: jackd not running?\n", client_name);
		return 1;
	}

	mixer_gain = 1.0f / (float) num_of_ports;
	iports = (jack_port_t **) malloc(num_of_ports * sizeof(jack_port_t *));
	oports = (jack_port_t **) malloc(num_of_ports * sizeof(jack_port_t *));

	fprintf(stdout, "%s: port_register\n", client_name);
	for (i = 0; i < num_of_ports; i++) {
		sprintf(iport_name, "in_%d", i);
		iports[i] = jack_port_register(client, iport_name,
			JACK_DEFAULT_AUDIO_TYPE, JackPortIsInput, 0);
		if (iports[i] == NULL) {
			fprintf(stdout, "%s:%s port registration failed\n",
				client_name, iport_name);
			goto exit1;
		}
		sprintf(oport_name, "out_%d", i);
		oports[i] = jack_port_register(client, oport_name,
			JACK_DEFAULT_AUDIO_TYPE, JackPortIsOutput, 0);
		if (oports[i] == NULL) {
			fprintf(stdout, "%s:%s port registration failed\n",
				client_name, oport_name);
			goto exit1;
		}
	}

	fprintf(stdout, "%s: set_process_callback\n", client_name);
	jack_set_process_callback(client, process_0, 0);

	fprintf(stdout, "%s: on_shutdown\n", client_name);
	jack_on_shutdown(client, shutdown_0, 0);

	fprintf(stdout, "%s: activate\n", client_name);
	jack_activate(client);

	/* try to connect to available physical outputs. */
	pports = jack_get_ports(client, 0, 0, JackPortIsInput | JackPortIsPhysical);
	if (pports) {
		for (i = 0; i < num_of_ports && pports[i]; i++) {
			sprintf(oport_name, "%s:out_%d", client_name, i);
			fprintf(stdout, "%s: connect: %s -> %s\n",
				client_name, oport_name, pports[i]);
			if (jack_connect(client, oport_name, pports[i]) != 0) {
				/* original call omitted the client_name argument */
				fprintf(stdout, "%s: connect failed\n", client_name);
				goto exit2;
			}
		}
		free(pports);
	}

	/* ok, we're up and running... */
	for (i = seconds_to_run; i > 0; --i) {
		fprintf(stdout, "%s: running(%d, %ld)...%3d\r",
			client_name, ret, loop_count, i);
		fflush(stdout);
		sleep(1);
	}

	/* make it blast... */
	fprintf(stdout, "\n%s: mark!\n", client_name);
	ret = 1;

	/* duh?, still here... */
	for (i = seconds_to_run*2; i > 0; --i) {
		fprintf(stdout, "%s: running(%d, %ld)...%3d\r",
			client_name, ret, loop_count, i);
		fflush(stdout);
		sleep(1);
	}

exit2:
	fprintf(stdout, "%s: deactivate\n", client_name);
	jack_deactivate(client);

exit1:
	fprintf(stdout, "%s: close\n", client_name);
	jack_client_close(client);
	free(iports);
	free(oports);

	return 0;
}

/* end of jack_test2.c */
|
|
Re: new patch to help jackd along

Fernando Lopez-Lezcano wrote:
> On Fri, 2008-05-09 at 12:20 +0200, Fernando Lopez-Lezcano wrote:
>> On Thu, 2008-05-08 at 22:24 -0400, Paul Davis wrote:
...
> Good, it did not take down ardour with it!!
>
> Now the interesting part: downgrade to the previous version of jackd I
> had installed. Start ardour, start jack-rack, load freeverb, quit
> jack-rack and _ardour is kicked out of the graph_!! So, your patch is
> indeed fixing this particular problem (which was a bad one)...

I don't think it's 100% fixed. It's just less likely to happen. We have
found a repeatable test case where we create and destroy ports in a
running client that takes down the client. Most likely this is one of
the problems causing issues at client startup/shutdown.

Greets,

Pieter

Re: new patch to help jackd along

On Fri, 2008-05-09 at 16:39 +0200, Pieter Palmers wrote:
> Rui Nuno Capela wrote:
> > On Fri, May 9, 2008 03:24, Paul Davis wrote:
> >> Changes:
> >>
> >> * use poll+read, not just read, when waiting for clients to finish up
> >> non-process "event" handling
> >> * mark clients as Finished after their process callback has executed
> >> * remove clients that failed to respond to events
> >> * add new -r option to completely remove the JACK shm registry
> >> at startup (orthogonal to everything else, but in my codebase for months)
> >>
> >> I would commit this directly, but I'm trying to be cautious for once. It
> >> works much better for me now. Note that I believe there may be some
> >> locking issues still to address in the code (insufficient locking,
> >> that is, not deadlocks).
> >>
> >
> > sorry to tell, but it still fails on the jack_test2.c crash tester (the
> > very same at stake on that last night in Cologne:)
> >
> > fyi, this is just one simple client that, after a while, enters into an
> > endless loop and tests for jackd being able to detect and remove it from
> > the graph. what happens is that jackd gets severely stuck and
> > jack_watchdog kicks in and bang! everything is thrown to the floor
> >
> > funny thing, and it might just be relevant to the case, is that this
> > meltdown behavior seems to be most evident when, and only when, the bad
> > client shares the graph with any other client. when left alone, everything
> > seems to work just fine. puzzled ;)
>
> For me it seems to be working fine. I'm on a dual core machine though so
> this might be the reason.
>
> I've attached a slightly modified version of your tester that also
> displays a loop counter for when the process callback enters the stuck
> loop. for me this gives the following:
>
> ppalmers@ox-D820:~/programming/jack/tests$ ./jack_test2 2> log
> seconds to run: 30
> num.of ports: 2
> client_name: jack_test2-32116
> jack_test2-32116: client_new
> jack_test2-32116: port_register
> jack_test2-32116: set_process_callback
> jack_test2-32116: on_shutdown
> jack_test2-32116: activate
> jack_test2-32116: connect: jack_test2-32116:out_0 -> system:playback_1
> jack_test2-32116: connect: jack_test2-32116:out_1 -> system:playback_2
> jack_test2-32116: running(0, 0)... 1
> jack_test2-32116: mark!
> jack_test2-32116: running(1, 2071971914)... 54
> *** shutdown ***
> jack_test2-32116: running(1, 2147483647)... 48
> ppalmers@ox-D820:~/programming/jack/tests$
>
> after the shutdown message, the counter stops increasing, so the process
> callback is dead. The client itself stays alive, as I would expect it to
> be. This is both with and without other clients in the graph.

On my computer the counter keeps decrementing after a zombified message (no other clients, jackd running in a terminal with -R -d alsa -d hw -n 3):

----
$ ./jack_test2
seconds to run: 30
num.of ports: 2
client_name: jack_test2-27026
jack_test2-27026: client_new
SSE2 detected
SSE2 detected
jack_test2-27026: port_register
jack_test2-27026: set_process_callback
jack_test2-27026: on_shutdown
jack_test2-27026: activate
jack_test2-27026: connect: jack_test2-27026:out_0 -> system:playback_1
jack_test2-27026: connect: jack_test2-27026:out_1 -> system:playback_2
jack_test2-27026: running(0, 0)... 1
jack_test2-27026: mark!
zombified - calling shutdown handler821)... 55
*** shutdown ***
jack_test2-27026: deactivate 2147483647)... 1
jack_test2-27026: close
----

Seems to be repeatable. I'm seeing the same behavior without Paul's patch. Looks like I'm getting an xrun some time during the second countdown (which I don't see with the patch).

But, without the patch and running jackd inside qjackctl I get subgraph timeouts from qjackctl and eventually the whole thing is killed by the watchdog timer. With the patch, jack in qjackctl, the watchdog does not trigger.

-- Fernando

Re: new patch to help jackd along

On Fri, May 9, 2008 15:39, Pieter Palmers wrote:
> Rui Nuno Capela wrote:
>
>> On Fri, May 9, 2008 03:24, Paul Davis wrote:
>>
>>> Changes:
>>>
>>> * use poll+read, not just read, when waiting for clients to finish up
>>> non-process "event" handling
>>> * mark clients as Finished after their process callback has executed
>>> * remove clients that failed to respond to events
>>> * add new -r option to completely remove the JACK shm registry at
>>> startup (orthogonal to everything else, but in my codebase for months)
>>>
>>> I would commit this directly, but I'm trying to be cautious for once. It
>>> works much better for me now. Note that I believe there may be some
>>> locking issues still to address in the code (insufficient locking,
>>> that is, not deadlocks).
>>>
>>
>> sorry to tell, but it still fails on the jack_test2.c crash tester (the
>> very same at stake on that last night in Cologne:)
>>
>> fyi, this is just one simple client that, after a while, enters into an
>> endless loop and tests for jackd being able to detect and remove it
>> from the graph. what happens is that jackd gets severely stuck and
>> jack_watchdog kicks in and bang! everything is thrown to the floor
>>
>> funny thing, and it might just be relevant to the case, is that this
>> meltdown behavior seems to be most evident when, and only when, the bad
>> client shares the graph with any other client. when left alone,
>> everything seems to work just fine. puzzled ;)
>
> For me it seems to be working fine. I'm on a dual core machine though so
> this might be the reason.
>
> I've attached a slightly modified version of your tester that also
> displays a loop counter for when the process callback enters the stuck
> loop. for me this gives the following:
>
> ppalmers@ox-D820:~/programming/jack/tests$ ./jack_test2 2> log
> seconds to run: 30
> num.of ports: 2
> client_name: jack_test2-32116
> jack_test2-32116: client_new
> jack_test2-32116: port_register
> jack_test2-32116: set_process_callback
> jack_test2-32116: on_shutdown
> jack_test2-32116: activate
> jack_test2-32116: connect: jack_test2-32116:out_0 -> system:playback_1
> jack_test2-32116: connect: jack_test2-32116:out_1 -> system:playback_2
> jack_test2-32116: running(0, 0)... 1
> jack_test2-32116: mark!
> jack_test2-32116: running(1, 2071971914)... 54
> *** shutdown ***
> jack_test2-32116: running(1, 2147483647)... 48
> ppalmers@ox-D820:~/programming/jack/tests$
>
> after the shutdown message, the counter stops increasing, so the process
> callback is dead. The client itself stays alive, as I would expect it to
> be. This is both with and without other clients in the graph.
>

did you test it also when some other (sane) client is in the graph? eg. qjackctl active?

i do get a similar (good) behavior when jack_test2 is the *single* client around, having either jackd patched or not. big problem seems to occur when jack_test2 is *not* the first client in the graph

cyaa
--
rncbc aka Rui Nuno Capela
rncbc@...

Re: new patch to help jackd along

Rui Nuno Capela wrote:
> On Fri, May 9, 2008 15:39, Pieter Palmers wrote:
>> Rui Nuno Capela wrote:
>>
>>> On Fri, May 9, 2008 03:24, Paul Davis wrote:
>>>
>>>> Changes:
>>>>
>>>> * use poll+read, not just read, when waiting for clients to finish up
>>>> non-process "event" handling
>>>> * mark clients as Finished after their process callback has executed
>>>> * remove clients that failed to respond to events
>>>> * add new -r option to completely remove the JACK shm registry at
>>>> startup (orthogonal to everything else, but in my codebase for months)
>>>>
>>>> I would commit this directly, but I'm trying to be cautious for once. It
>>>> works much better for me now. Note that I believe there may be some
>>>> locking issues still to address in the code (insufficient locking,
>>>> that is, not deadlocks).
>>>>
>>> sorry to tell, but it still fails on the jack_test2.c crash tester (the
>>> very same at stake on that last night in Cologne:)
>>>
>>> fyi, this is just one simple client that, after a while, enters into an
>>> endless loop and tests for jackd being able to detect and remove it
>>> from the graph. what happens is that jackd gets severely stuck and
>>> jack_watchdog kicks in and bang! everything is thrown to the floor
>>>
>>> funny thing, and it might just be relevant to the case, is that this
>>> meltdown behavior seems to be most evident when, and only when, the bad
>>> client shares the graph with any other client. when left alone,
>>> everything seems to work just fine. puzzled ;)
>> For me it seems to be working fine. I'm on a dual core machine though so
>> this might be the reason.
>>
>> I've attached a slightly modified version of your tester that also
>> displays a loop counter for when the process callback enters the stuck
>> loop. for me this gives the following:
>>
>> ppalmers@ox-D820:~/programming/jack/tests$ ./jack_test2 2> log
>> seconds to run: 30
>> num.of ports: 2
>> client_name: jack_test2-32116
>> jack_test2-32116: client_new
>> jack_test2-32116: port_register
>> jack_test2-32116: set_process_callback
>> jack_test2-32116: on_shutdown
>> jack_test2-32116: activate
>> jack_test2-32116: connect: jack_test2-32116:out_0 -> system:playback_1
>> jack_test2-32116: connect: jack_test2-32116:out_1 -> system:playback_2
>> jack_test2-32116: running(0, 0)... 1
>> jack_test2-32116: mark!
>> jack_test2-32116: running(1, 2071971914)... 54
>> *** shutdown ***
>> jack_test2-32116: running(1, 2147483647)... 48
>> ppalmers@ox-D820:~/programming/jack/tests$
>>
>> after the shutdown message, the counter stops increasing, so the process
>> callback is dead. The client itself stays alive, as I would expect it to
>> be. This is both with and without other clients in the graph.
>>
>
> did you test it also when some other (sane) client is in the graph? eg.
> qjackctl active?
>
> i do get a similar (good) behavior when jack_test2 is the *single* client
> around, having either jackd patched or not. big problem seems to occur
> when jack_test2 is *not* the first client in the graph

works like a charm here.

even with ardour2 and qjackctl in the graph.

I tested with ardour2 connected to the test application too, worked fine. Well... ardour got kicked in that case, but that's fine. the watchdog didn't kick in.

Greets,

Pieter

Re: new patch to help jackd along

Pieter Palmers wrote:
> Rui Nuno Capela wrote:
>> On Fri, May 9, 2008 15:39, Pieter Palmers wrote:
>>> Rui Nuno Capela wrote:
>>>
>>>> On Fri, May 9, 2008 03:24, Paul Davis wrote:
>>>>
>>>>> Changes:
>>>>>
>>>>> * use poll+read, not just read, when waiting for clients to finish up
>>>>> non-process "event" handling
>>>>> * mark clients as Finished after their process callback has executed
>>>>> * remove clients that failed to respond to events
>>>>> * add new -r option to completely remove the JACK shm registry at
>>>>> startup (orthogonal to everything else, but in my codebase for months)
>>>>>
>>>>> I would commit this directly, but I'm trying to be cautious for once. It
>>>>> works much better for me now. Note that I believe there may be some
>>>>> locking issues still to address in the code (insufficient locking,
>>>>> that is, not deadlocks).
>>>>>
>>>> sorry to tell, but it still fails on the jack_test2.c crash tester (the
>>>> very same at stake on that last night in Cologne:)
>>>>
>>>> fyi, this is just one simple client that, after a while, enters into an
>>>> endless loop and tests for jackd being able to detect and remove it
>>>> from the graph. what happens is that jackd gets severely stuck and
>>>> jack_watchdog kicks in and bang! everything is thrown to the floor
>>>>
>>>> funny thing, and it might just be relevant to the case, is that this
>>>> meltdown behavior seems to be most evident when, and only when, the bad
>>>> client shares the graph with any other client. when left alone,
>>>> everything seems to work just fine. puzzled ;)
>>> For me it seems to be working fine. I'm on a dual core machine though so
>>> this might be the reason.
>>>
>>> I've attached a slightly modified version of your tester that also
>>> displays a loop counter for when the process callback enters the stuck
>>> loop. for me this gives the following:
>>>
>>> ppalmers@ox-D820:~/programming/jack/tests$ ./jack_test2 2> log
>>> seconds to run: 30
>>> num.of ports: 2
>>> client_name: jack_test2-32116
>>> jack_test2-32116: client_new
>>> jack_test2-32116: port_register
>>> jack_test2-32116: set_process_callback
>>> jack_test2-32116: on_shutdown
>>> jack_test2-32116: activate
>>> jack_test2-32116: connect: jack_test2-32116:out_0 -> system:playback_1
>>> jack_test2-32116: connect: jack_test2-32116:out_1 -> system:playback_2
>>> jack_test2-32116: running(0, 0)... 1
>>> jack_test2-32116: mark!
>>> jack_test2-32116: running(1, 2071971914)... 54
>>> *** shutdown ***
>>> jack_test2-32116: running(1, 2147483647)... 48
>>> ppalmers@ox-D820:~/programming/jack/tests$
>>>
>>> after the shutdown message, the counter stops increasing, so the process
>>> callback is dead. The client itself stays alive, as I would expect it to
>>> be. This is both with and without other clients in the graph.
>>>
>>
>> did you test it also when some other (sane) client is in the graph? eg.
>> qjackctl active?
>>
>> i do get a similar (good) behavior when jack_test2 is the *single* client
>> around, having either jackd patched or not. big problem seems to occur
>> when jack_test2 is *not* the first client in the graph
>
> works like a charm here.
>
> even with ardour2 and qjackctl in the graph.
>
> I tested with ardour2 connected to the test application too, worked fine.
> Well... ardour got kicked in that case, but that's fine. the watchdog
> didn't kick in.
>

ok. my crap. please forgive today's precipitation.

now that i got home and tested the patch on all of my boxes for real (and retested, unpatched, and back again to have it for sure), i can see that the patch really *is* a healer one.

my early report was based on very dumb tests conducted under a guest of virtualbox, hardly a baseline machine for real-time audio applications :)

so, i'll repeat: paul's patch is the holy one! i shall be damned to hell, oh me unbeliever :)

--
rncbc aka Rui Nuno Capela
rncbc@...

Re: new patch to help jackd along

On Thu, May 08, 2008 at 10:24:16PM -0400, Paul Davis wrote:
> Changes:
>
> * use poll+read, not just read, when waiting for clients to finish up
> non-process "event" handling
> * mark clients as Finished after their process callback has
> executed
> * remove clients that failed to respond to events
> * add new -r option to completely remove the JACK shm registry
> at startup (orthogonal to everything else, but in my codebase for
> months)
>
> I would commit this directly, but I'm trying to be cautious for once. It
> works much better for me now. Note that I believe there may be some
> locking issues still to address in the code (insufficient locking, that
> is, not deadlocks).
>
> --p
>

this helps a lot, but I'm still seeing occasional subgraph timeouts.

guff:~% jackd -d sun -p 4096 -r 44100
jackd:/usr/local/lib/jack/jack_dummy.so: undefined symbol 'clock_nanosleep'
could not open driver .so '/usr/local/lib/jack/jack_dummy.so': Cannot load specified object

jackd 0.111.5
Copyright 2001-2005 Paul Davis and others.
jackd comes with ABSOLUTELY NO WARRANTY
This is free software, and you are welcome to redistribute it
under certain conditions; see the file COPYING for details

JACK compiled with System V SHM support.
loading driver ..
Enhanced3DNow! detected
SSE2 detected
sun_driver: indevbuf 16384 B, outdevbuf 16384 B
sun_driver: playback xrun of 4096 frames (92.879822 msec)
sun_driver: writing 4096 frames of silence to correct I/O sync
sun_driver: running null cycle
sun_driver: running null cycle
sun_driver: running null cycle
sun_driver: running null cycle
sun_driver: running null cycle
subgraph starting at xine timed out (subgraph_wait_fd=9, status = -102, state = Finished, revents = 0x0000)
bad status (1) for client event handling (type = 5)
sun_driver: running null cycle
sun_driver: running null cycle
sun_driver: running null cycle
sun_driver: running null cycle
sun_driver: running null cycle
sun_driver: running null cycle
subgraph starting at xine timed out (subgraph_wait_fd=12, status = -102, state = Finished, revents = 0x0000)
bad status (1) for client event handling (type = 5)

basically, I have a couple clients running (kaffeine, qsynth) then run jack_lsp repeatedly by hand. eventually (though it takes a lot more tries than before), the subgraph times out and the first client gets "zombified". no idea why jackd decides to run null cycles when that happens either, but probably related.

--
jakemsr@...
SDF Public Access UNIX System - http://sdf.lonestar.org
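For readers following the "poll+read, not just read" item in the quoted change list: the general idea can be sketched as below. This is a minimal illustration under stated assumptions, not jackd's actual code; `read_byte_with_timeout` is a hypothetical helper name. It waits a bounded time for one byte on a pipe, so that a peer that never writes produces a timeout (with `revents` left at 0, much like the `revents = 0x0000` in the log above) instead of blocking the caller forever.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not part of JACK): wait up to timeout_ms for one
 * byte on fd. Returns 1 if a byte was read into *out, 0 on timeout,
 * -1 on error, hang-up or end-of-file. */
static int read_byte_with_timeout(int fd, int timeout_ms, char *out)
{
    struct pollfd pfd;
    int rc;

    assert(out != NULL);

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts poll(); note that this simple loop
     * restarts the full timeout each time. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;      /* poll() itself failed */
    if (rc == 0)
        return 0;       /* timed out: nothing became readable */
    if (!(pfd.revents & POLLIN))
        return -1;      /* POLLHUP/POLLERR/POLLNVAL without data */

    /* Data is pending, so this read() does not block. */
    return read(fd, out, 1) == 1 ? 1 : -1;
}
```

A plain blocking `read()` has no such escape hatch: if the other end never writes, the caller stays stuck. Passing a timeout of 0 turns the same helper into a non-blocking check.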

Re: new patch to help jackd along

On Fri, May 09, 2008 at 10:45:22PM +0000, Jacob Meuser wrote:
> On Thu, May 08, 2008 at 10:24:16PM -0400, Paul Davis wrote:
> > Changes:
> >
> > * use poll+read, not just read, when waiting for clients to finish up
> > non-process "event" handling
> > * mark clients as Finished after their process callback has
> > executed
> > * remove clients that failed to respond to events
> > * add new -r option to completely remove the JACK shm registry
> > at startup (orthogonal to everything else, but in my codebase for
> > months)
> >
> > I would commit this directly, but I'm trying to be cautious for once. It
> > works much better for me now. Note that I believe there may be some
> > locking issues still to address in the code (insufficient locking, that
> > is, not deadlocks).
> >
> > --p
> >
>
> this helps a lot, but I'm still seeing occasional subgraph
> timeouts.

actually, it doesn't help at all. the difference was I was using a larger buffer size.

> guff:~% jackd -d sun -p 4096 -r 44100
> jackd:/usr/local/lib/jack/jack_dummy.so: undefined symbol 'clock_nanosleep'
> could not open driver .so '/usr/local/lib/jack/jack_dummy.so': Cannot load specified object
>
> jackd 0.111.5
> Copyright 2001-2005 Paul Davis and others.
> jackd comes with ABSOLUTELY NO WARRANTY
> This is free software, and you are welcome to redistribute it
> under certain conditions; see the file COPYING for details
>
> JACK compiled with System V SHM support.
> loading driver ..
> Enhanced3DNow! detected
> SSE2 detected
> sun_driver: indevbuf 16384 B, outdevbuf 16384 B
> sun_driver: playback xrun of 4096 frames (92.879822 msec)
> sun_driver: writing 4096 frames of silence to correct I/O sync
> sun_driver: running null cycle
> sun_driver: running null cycle
> sun_driver: running null cycle
> sun_driver: running null cycle
> sun_driver: running null cycle
> subgraph starting at xine timed out (subgraph_wait_fd=9, status = -102, state = Finished, revents = 0x0000)
> bad status (1) for client event handling (type = 5)
> sun_driver: running null cycle
> sun_driver: running null cycle
> sun_driver: running null cycle
> sun_driver: running null cycle
> sun_driver: running null cycle
> sun_driver: running null cycle
> subgraph starting at xine timed out (subgraph_wait_fd=12, status = -102, state = Finished, revents = 0x0000)
> bad status (1) for client event handling (type = 5)
>
> basically, I have a couple clients running (kaffeine, qsynth) then run
> jack_lsp repeatedly by hand. eventually (though it takes a lot more
> tries than before), the subgraph times out and the first client gets
> "zombified". no idea why jackd decides to run null cycles when
> that happens either, but probably related.

ah, I guess that would be from waiting in poll()?

--
jakemsr@...
SDF Public Access UNIX System - http://sdf.lonestar.org

Re: new patch to help jackd along

On Fri, May 09, 2008 at 10:58:50PM +0000, Jacob Meuser wrote:
> On Fri, May 09, 2008 at 10:45:22PM +0000, Jacob Meuser wrote:
> > On Thu, May 08, 2008 at 10:24:16PM -0400, Paul Davis wrote:
> > > Changes:
> > >
> > > * use poll+read, not just read, when waiting for clients to finish up
> > > non-process "event" handling
> > this helps a lot, but I'm still seeing occasional subgraph
> > timeouts.
>
> actually, it doesn't help at all. the difference was I was using a
> larger buffer size.

here's a snippet of debug messages from a client with the patch I proposed:

jack: 95:276838844959 client.c:jack_client_core_wait:1428: client polling on event_fd and graph_wait_fd...
jack: 95:276838861577 client.c:jack_client_core_wait:1455: pfd[EVENT].revents = 0x0 pfd[WAIT].revents = 0x1
jack: 95:276838864034 client.c:jack_client_core_wait:1498: time to run process()
jack: 95:276838866039 client.c:jack_wake_next_client:1528: client sent message to next stage by 276838866036
jack: 95:276838866853 client.c:jack_wake_next_client:1530: reading cleanup byte from pipe 17
jack: 95:276838867652 client.c:jack_client_core_wait:1428: client polling on event_fd and graph_wait_fd...
jack: 95:276838901045 client.c:jack_client_core_wait:1455: pfd[EVENT].revents = 0x0 pfd[WAIT].revents = 0x1
jack: 95:276838902050 client.c:jack_client_core_wait:1498: time to run process()
jack: 95:276838915474 client.c:jack_wake_next_client:1528: client sent message to next stage by 276838915470
jack: 95:276838916337 client.c:jack_wake_next_client:1530: reading cleanup byte from pipe 17
jack: 95:276838917174 client.c:jack_wake_next_client:1554: cleanup byte from pipe 17 not available?
jack: 95:276838917958 client.c:jack_client_core_wait:1428: client polling on event_fd and graph_wait_fd...
jack: 95:276838918756 client.c:jack_client_core_wait:1455: pfd[EVENT].revents = 0x1 pfd[WAIT].revents = 0x0
jack: 95:276838919547 client.c:jack_client_process_events:1307: client receives an event, now reading on event fd
jack: 95:276838920871 client.c:jack_handle_reorder:487: graph reorder
jack: 95:276838920994 client.c:jack_handle_reorder:490: closing graph_wait_fd==17
jack: 95:276838921033 client.c:jack_handle_reorder:496: closing graph_next_fd==18
jack: 95:276838921076 client.c:jack_handle_reorder:508: opened new graph_wait_fd 17 (/tmp/jack-1000/default/jack-ack-fifo-25704-0)
jack: 95:276838921104 client.c:jack_handle_reorder:523: opened new graph_next_fd 18 (/tmp/jack-1000/default/jack-ack-fifo-25704-1) (upstream is jackd? 1)
jack: 95:276838921123 client.c:jack_client_process_events:1408: client has dealt with the event, writing response on event fd
jack: 95:276838931247 client.c:jack_client_core_wait:1455: pfd[EVENT].revents = 0x0 pfd[WAIT].revents = 0x1
jack: 95:276838933680 client.c:jack_client_core_wait:1498: time to run process()
jack: 95:276838935685 client.c:jack_wake_next_client:1528: client sent message to next stage by 276838935682
jack: 95:276838936513 client.c:jack_wake_next_client:1530: reading cleanup byte from pipe 17
jack: 95:276838937319 client.c:jack_client_core_wait:1428: client polling on event_fd and graph_wait_fd...
jack: 95:276838959307 client.c:jack_client_core_wait:1455: pfd[EVENT].revents = 0x0 pfd[WAIT].revents = 0x1
jack: 95:276838960548 client.c:jack_client_core_wait:1498: time to run process()
jack: 95:276838961168 client.c:jack_wake_next_client:1528: client sent message to next stage by 276838961165
jack: 95:276838961194 client.c:jack_wake_next_client:1530: reading cleanup byte from pipe 17
jack: 95:276838961226 client.c:jack_client_core_wait:1428: client polling on event_fd and graph_wait_fd...
jack: 95:276838977672 client.c:jack_client_core_wait:1455: pfd[EVENT].revents = 0x0 pfd[WAIT].revents = 0x1
jack: 95:276838979520 client.c:jack_client_core_wait:1498: time to run process()

note the "cleanup byte from pipe 17 not available?" line (that's from my patch; that's printed if the read() is skipped) and how right _after_ that comes "pfd[WAIT].revents = 0x0", which means that the _next_ loop doesn't hit that code. without that patch, or with Paul's patch, that's the point where things go wrong.

looks to me like the original poll() on the wait_fd is in the wrong place. I'm wondering if maybe the cleanup byte simply gets buffered on some systems but not others?

for completeness, here's the corresponding server debug messages:

jack:25704:276838861533 engine.c:jack_engine_process:789: considering client sun for processing
jack:25704:276838861539 engine.c:jack_process_internal:578: invoking an internal client's callbacks
jack:25704:276838861545 engine.c:jack_engine_process:789: considering client xine for processing
jack:25704:276838861552 engine.c:jack_process_external:657: calling process() on an external subgraph, fd==7
jack:25704:276838862663 engine.c:jack_process_external:680: waiting on fd==9 for process() subgraph to finish
jack:25704:276838864883 engine.c:jack_process_external:688: back from subgraph poll, revents = 0x1
jack:25704:276838864987 engine.c:jack_process_external:740: reading byte from subgraph_wait_fd==9
jack:25704:276838865024 ../jack/engine.h:jack_unlock_graph:198: releasing graph lock
jack:25704:276838865030 engine.c:jack_run_one_cycle:2070: cycle finished, status = 0
jack:25704:276838884609 ../jack/engine.h:jack_try_lock_graph:192: TRYING to acquiring graph lock
jack:25704:276838884993 engine.c:jack_run_one_cycle:2013: waiting for driver read
jack:25704:276838885010 engine.c:jack_run_one_cycle:2019: run process
jack:25704:276838885016 engine.c:jack_engine_process:789: considering client sun for processing
jack:25704:276838885022 engine.c:jack_process_internal:578: invoking an internal client's callbacks
jack:25704:276838885028 engine.c:jack_engine_process:789: considering client xine for processing
jack:25704:276838885035 engine.c:jack_process_external:657: calling process() on an external subgraph, fd==7
jack:25704:276838885048 engine.c:jack_process_external:680: waiting on fd==9 for process() subgraph to finish
jack:25704:276838891666 engine.c:jack_server_thread:1461: server thread back from poll
jack:25704:276838892655 engine.c:jack_server_thread:1506: pfd[0].revents & POLLIN
jack:25704:276838892689 ../jack/engine.h:jack_lock_graph:186: acquiring graph lock
jack:25704:276838902896 engine.c:jack_process_external:688: back from subgraph poll, revents = 0x1
jack:25704:276838903025 engine.c:jack_process_external:740: reading byte from subgraph_wait_fd==9
jack:25704:276838903067 ../jack/engine.h:jack_unlock_graph:198: releasing graph lock
jack:25704:276838903075 engine.c:jack_run_one_cycle:2070: cycle finished, status = 0
jack:25704:276838903093 ../jack/engine.h:jack_unlock_graph:198: releasing graph lock
jack:25704:276838903156 ../jack/engine.h:jack_lock_graph:186: acquiring graph lock
jack:25704:276838903165 ../jack/engine.h:jack_unlock_graph:198: releasing graph lock
jack:25704:276838904645 engine.c:jack_server_thread:1450: start while
jack:25704:276838905454 engine.c:jack_server_thread:1461: server thread back from poll
jack:25704:276838906058 engine.c:jack_server_thread:1540: pfd[1].revents & POLLIN
jack:25704:276838906873 engine.c:jack_server_thread:1450: start while
jack:25704:276838908125 engine.c:jack_server_thread:1461: server thread back from poll
jack:25704:276838909366 engine.c:handle_external_client_request:1343: HIT: before lock
jack:25704:276838909511 ../jack/engine.h:jack_lock_graph:186: acquiring graph lock
jack:25704:276838910058 engine.c:handle_external_client_request:1347: HIT: before for
jack:25704:276838910175 engine.c:handle_external_client_request:1351: HIT: in for
jack:25704:276838910718 engine.c:handle_external_client_request:1356: HIT: after for
jack:25704:276838910820 ../jack/engine.h:jack_unlock_graph:198: releasing graph lock
jack:25704:276838911370 ../jack/engine.h:jack_lock_graph:186: acquiring graph lock
jack:25704:276838911552 engine.c:jack_deliver_event:2304: delivering event (type 5)
jack:25704:276838912107 engine.c:jack_deliver_event:2318: client sun is still alive
jack:25704:276838912211 engine.c:jack_deliver_event:2407: event delivered
jack:25704:276838912761 engine.c:jack_get_fifo_fd:3335: /tmp/jack-1000/default/jack-ack-fifo-25704-0
jack:25704:276838912890 engine.c:jack_get_fifo_fd:3335: /tmp/jack-1000/default/jack-ack-fifo-25704-1
jack:25704:276838913440 engine.c:jack_deliver_event:2304: delivering event (type 5)
jack:25704:276838913545 engine.c:jack_deliver_event:2318: client xine is still alive
jack:25704:276838914090 engine.c:jack_deliver_event:2374: engine writing on event fd
jack:25704:276838914194 engine.c:jack_deliver_event:2384: engine reading from event fd
jack:25704:276838914700 ../jack/engine.h:jack_try_lock_graph:192: TRYING to acquiring graph lock
sun_driver: running null cycle
jack:25704:276838921167 engine.c:jack_deliver_event:2396: engine reading from event fd DONE
jack:25704:276838921313 engine.c:jack_deliver_event:2407: event delivered
jack:25704:276838921322 engine.c:jack_get_fifo_fd:3335: /tmp/jack-1000/default/jack-ack-fifo-25704-1
jack:25704:276838921336 ../jack/engine.h:jack_unlock_graph:198: releasing graph lock
jack:25704:276838921344 engine.c:jack_server_thread:1450: start while
jack:25704:276838931040 ../jack/engine.h:jack_try_lock_graph:192: TRYING to acquiring graph lock
jack:25704:276838931175 engine.c:jack_run_one_cycle:2013: waiting for driver read
jack:25704:276838931190 engine.c:jack_run_one_cycle:2019: run process
jack:25704:276838931196 engine.c:jack_engine_process:789: considering client sun for processing
jack:25704:276838931202 engine.c:jack_process_internal:578: invoking an internal client's callbacks
jack:25704:276838931207 engine.c:jack_engine_process:789: considering client xine for processing
jack:25704:276838931215 engine.c:jack_process_external:657: calling process() on an external subgraph, fd==7
jack:25704:276838932314 engine.c:jack_process_external:680: waiting on fd==9 for process() subgraph to finish
jack:25704:276838934515 engine.c:jack_process_external:688: back from subgraph poll, revents = 0x1
jack:25704:276838934622 engine.c:jack_process_external:740: reading byte from subgraph_wait_fd==9
jack:25704:276838934659 ../jack/engine.h:jack_unlock_graph:198: releasing graph lock
jack:25704:276838934666 engine.c:jack_run_one_cycle:2070: cycle finished, status = 0
jack:25704:276838954258 ../jack/engine.h:jack_try_lock_graph:192: TRYING to acquiring graph lock
jack:25704:276838954470 engine.c:jack_run_one_cycle:2013: waiting for driver read

--
jakemsr@...
SDF Public Access UNIX System - http://sdf.lonestar.org
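When lining up the client and server traces above, it can help to compute the gap between two entries rather than eyeballing twelve-digit numbers. The sketch below assumes each debug line starts with `jack:<pid>:<timestamp>` as in the traces, and that the second field is a monotonically increasing time stamp (apparently microseconds, which is an assumption here). `parse_debug_line` and `stamp_delta` are hypothetical helper names, not part of JACK.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: pull the pid and time stamp out of a debug line
 * such as
 *   "jack:25704:276838861533 engine.c:jack_engine_process:789: ..."
 * Returns 1 on success, 0 if the line does not have that prefix
 * (for example "sun_driver: running null cycle"). */
static int parse_debug_line(const char *line, int *pid,
                            unsigned long long *stamp)
{
    assert(line != NULL && pid != NULL && stamp != NULL);
    /* %d skips leading blanks, so "jack: 95:..." parses as pid 95. */
    return sscanf(line, "jack:%d:%llu", pid, stamp) == 2;
}

/* Hypothetical helper: store (later - earlier) in *delta, in whatever
 * unit the time stamps use. Returns 1 on success, 0 if either line
 * cannot be parsed. */
static int stamp_delta(const char *earlier, const char *later,
                       long long *delta)
{
    int pid_a, pid_b;
    unsigned long long a, b;

    assert(delta != NULL);
    if (!parse_debug_line(earlier, &pid_a, &a) ||
        !parse_debug_line(later, &pid_b, &b))
        return 0;
    *delta = (long long) b - (long long) a;
    return 1;
}
```

For instance, given the client entries stamped 276838867652 (going back into poll) and 276838901045 (the next wakeup), `stamp_delta` reports a gap of 33393 units.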

Re: new patch to help jackd along

On Thu, May 08, 2008 at 10:24:16PM -0400, Paul Davis wrote:
> I would commit this directly, but I'm trying to be cautious for once. It
> works much better for me now.

Same thing here, it seems to take care of my problem. But as this particular problem happens only once or twice a day with my normal setup, I'll test for a few more days before concluding.

To sum up, my setup is:

* jackd, realtime, 64 frames, alsa driver
* two oss2jack clients
* one CVO (metro-like) client
* ardour

I launch the setup with a script that ends with a dozen jack_[dis]connect to get my port connections right.

When I upgraded from jackd 0.103 to 0.109 (or CVS), two things happened:

1) my jack_[dis]connect started spewing out "zombified" messages with a 1:10 probability (never happened before), and jackd would then often get stuck and killed by the watchdog,

2) my regular clients started getting disconnected randomly for no apparent reason, a couple of times a day, and not while connecting or disconnecting anything, just while running. That did happen before, but something like a couple of times a *year*.

So I created a test case that showed problem #1 and Jacob Meuser's patch took care of that. On the other hand, I still had problem #2 with that patch.

I'm now trying with your patch (without Jacob's), and so far, it seems to take care of problem #2. On the other hand, #1 is still present: when I [dis]connect client ports in a loop, I get "zombified" messages from jack_[dis]connect, the client and/or jackd dies after a while, etc.

I'll test with both patches, but first, I'll make sure it's really stable with just your patch during normal operation.

Re: new patch to help jackd along

On Sat, 2008-05-10 at 07:39 +0000, Mikael Bouillot wrote:
> On Thu, May 08, 2008 at 10:24:16PM -0400, Paul Davis wrote:
> > I would commit this directly, but I'm trying to be cautious for once. It
> > works much better for me now.
>
> Same thing here, it seems to take care of my problem. But as this
> particular problem happens only once or twice a day with my normal
> setup, I'll test for a few more days before concluding.

We have substantial evidence that the problems people are experiencing
with JACK are actually issues with the Linux kernel. The problems are
not replicable on OS X (with the same codebase and test clients), and
appear to be related to JACK clients being preempted incorrectly and for
absurdly long periods of time. We are discussing the issue with the
people involved in Linux-RT work. This seems to be the result of changes
made to the kernel at some point in the last several months (as in, 3-8
months).

I'll keep you all posted.

Re: new patch to help jackd along

> We have substantial evidence that the problems people are experiencing
> with JACK are actually issues with the Linux kernel.

On the other hand, with the same 2.6.22.6 kernel, jackd 0.103 did work
for me whereas 0.109 doesn't.

Re: new patch to help jackd along

On Sat, May 10, 2008 at 03:26:53PM +0000, Mikael Bouillot wrote:
> > We have substantial evidence that the problems people are experiencing
> > with JACK are actually issues with the Linux kernel.
>
> On the other hand, with the same 2.6.22.6 kernel, jackd 0.103 did work
> for me whereas 0.109 doesn't.

This is similar to my experience with the 2.6.20 kernel. I recently
upgraded to 0.109, but when trying to use ardour2 with jamin, jamin kept
getting zombified. This usually occurred when opening or closing
sessions. It didn't happen every time, but often enough to disrupt
things. I've just reverted to 0.103.0, which doesn't do this.

John

Re: new patch to help jackd along

On Sat, May 10, 2008 at 09:52:35AM -0400, Paul Davis wrote:
>
> On Sat, 2008-05-10 at 07:39 +0000, Mikael Bouillot wrote:
> > On Thu, May 08, 2008 at 10:24:16PM -0400, Paul Davis wrote:
> > > I would commit this directly, but I'm trying to be cautious for once. It
> > > works much better for me now.
> >
> > Same thing here, it seems to take care of my problem. But as this
> > particular problem happens only once or twice a day with my normal
> > setup, I'll test for a few more days before concluding.
>
>
> We have substantial evidence that the problems people are experiencing
> with JACK are actually issues with the Linux kernel. The problems are
> not replicable on OS X (with the same codebase and test clients), and
> appear to be related to JACK clients being preempted incorrectly and for
> absurdly long periods of time. We are discussing the issue with the
> people involved in Linux-RT work. This seems to be the result of changes
> made to the kernel at some point in the last several months (as in, 3-8
> months).

that's definitely not the cause of my problems. I'm seeing problems on
OpenBSD, which has no preemption mechanism. it doesn't have realtime
scheduling either. I see the problem even when jack is running in
non-realtime mode.

OS X is not involved because it doesn't poll() the wait_fd:

    #ifndef JACK_USE_MACH_THREADS
            client->pollfd[WAIT_POLL_INDEX].events = POLLIN|POLLERR|POLLHUP|POLLNVAL;
    #endif

and of course in jack_client_core_wait(), everything with the wait_fd is
in '#ifndef JACK_USE_MACH_THREADS ... #endif'

--
jakemsr@...
SDF Public Access UNIX System - http://sdf.lonestar.org
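
The poll-before-read pattern on a wait fd, as discussed in this thread, can be sketched roughly as follows. This is only an illustration and not JACK's code: `wait_for_byte`, the `wait_result` codes, and the `timeout_ms` parameter are names assumed for this example. Only the event mask is taken from the snippet quoted in the message above.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; these are not JACK's own. */
enum wait_result { WAIT_OK = 0, WAIT_TIMEOUT = 1, WAIT_ERROR = 2 };

/* Wait up to timeout_ms for one byte on fd. Polling first means a peer
 * that never writes leads to a timeout instead of a read() that blocks
 * forever. */
static enum wait_result
wait_for_byte (int fd, int timeout_ms)
{
	struct pollfd pfd;
	char c;
	int n;

	pfd.fd = fd;
	/* Same mask as in the quoted snippet. POLLERR, POLLHUP and POLLNVAL
	 * are reported in revents whether or not they are requested. */
	pfd.events = POLLIN | POLLERR | POLLHUP | POLLNVAL;
	pfd.revents = 0;

	/* Retry if a signal interrupts poll(); note that each retry
	 * restarts the full timeout in this simple version. */
	do {
		n = poll (&pfd, 1, timeout_ms);
	} while (n < 0 && errno == EINTR);

	if (n < 0) {
		return WAIT_ERROR;
	}
	if (n == 0) {
		return WAIT_TIMEOUT;	/* peer did not respond in time */
	}
	if (!(pfd.revents & POLLIN)) {
		return WAIT_ERROR;	/* only an error or hangup was reported */
	}
	if (read (fd, &c, 1) != 1) {
		return WAIT_ERROR;	/* EOF or read failure */
	}
	return WAIT_OK;
}
```

With a plain pipe, calling `wait_for_byte` on the read end before anything has been written returns `WAIT_TIMEOUT` once the timeout expires, which is the case where a caller could decide that the peer failed to respond.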