More odd behaviour

Ok guys, I’ve been working on an application to start, monitor and stop the River core services (LUS, TM, Space, Class Server). I’m using an event-driven model built on LookupCache and ServiceDiscoveryListener.

A ServiceDiscoveryManager is started to set up the LookupCaches:

    mgr = new LookupDiscoveryManager(DiscoveryGroupManagement.ALL_GROUPS,
                                     null,   // unicast locators
                                     null);  // DiscoveryListener
    sdm = new ServiceDiscoveryManager(mgr, new LeaseRenewalManager());

I was having trouble stopping the LUS. I had originally used the SDM directly with a DiscoveryListener, but kept getting uncatchable exceptions from the SDM when I killed the LUS. I can’t terminate the SDM in this app (doing so does stop the exceptions), because I want to monitor the system for as long as the app is running.

So, I tried to use a LookupCache for the ServiceRegistrar:

    classes = new Class[] {ServiceRegistrar.class};
    template = new ServiceTemplate(null, classes, null);
    lusCache = sdm.createLookupCache(template, null, lusMonitor);

I use one for the space and one for the TM as well.

In the lusMonitor code I have this:

    public void serviceAdded(ServiceDiscoveryEvent evt)
    {
        ServiceItem si = evt.getPostEventServiceItem();

        Object service = si.service;
        if(service instanceof ServiceRegistrar)
        {
            sp.setStatusRunning();
        }
    }

    public void serviceRemoved(ServiceDiscoveryEvent serviceDiscoveryEvent)
    {
        sp.setStatusStopped();
    }

And the code that kills the LUS is in another class:

    public void stopLUS()
    {
        LookupCache cache = csm.getLUSCache();
        ServiceItem si = cache.lookup(null);
        Object lusProxy = si.service;
        if(lusProxy instanceof Administrable)
        {
            try
            {
                Object admin = ((Administrable)lusProxy).getAdmin();
                DestroyAdmin da = (DestroyAdmin)admin;
                cache.discard(si);
                da.destroy();
            }
            catch(Exception ex)
            {
                System.out.println("Error getting LUS DestroyAdmin");
                ex.printStackTrace();
            }
        }
    }

Ok, so, when the LUS starts, the serviceAdded method in the ServiceDiscoveryListener (lusMonitor) is quickly invoked.

When the LUS is killed, it takes about 10 minutes for that event to be fired. The LUS is dead, dead, dead, I assume it has been discarded from the LookupCache.

This works almost instantly for the space and TM, but it looks like a lease expiration or something is holding up the discard event for the LUS.

Any ideas on how to get around this?

BAR

Re: More odd behaviour

On Mon, Jan 19, 2009 at 8:16 PM, Rawlings, Bill A <bill.a.rawlings@...> wrote:
> When the LUS is killed, it takes about 10 minutes for that event to be
> fired. The LUS is dead, dead, dead, I assume it has been discarded from the
> LookupCache.

Sounds like RMI/JERI expiry of remote reference.

When you say "killed", how is that done? A nice and clean shutdown or some abrupt process termination?

Cheers
Niclas
--
http://www.qi4j.org - New Energy for Java

--------------------------------------------------------------------------
Getting Started: http://www.jini.org/wiki/Category:Getting_Started
Community Web Site: http://jini.org
jini-users Archive: http://archives.java.sun.com/archives/jini-users.html
Unsubscribing: email "signoff JINI-USERS" to listserv@...

Re: More odd behaviour

Rawlings, Bill A wrote:
> Ok guys, I’ve been working on an application to start, monitor and stop
> the River core services (LUS, TM, Space, Class Server).
> I was having trouble stopping the LUS. I had originally used the SDM
> directly with a DiscoveryListener, but kept getting uncatchable
> exceptions from the SDM when I killed the LUS. I can’t terminate the
> SDM in this app (that stops the exception if you do), because I want to
> monitor the system as long as the app is running.

This is an old "feature request." At issue is that reggie does not use Runtime.addShutdownHook() to cause it to send out appropriate events at termination. Thus, you don't see it disappear until the notify() leases expire.

Gregg Wonderly

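For readers unfamiliar with the mechanism Gregg mentions, here is a minimal plain-JDK sketch of a shutdown hook. It is not reggie's code: the class name ShutdownCleanup and the cleanup action are hypothetical, and in a real service the Runnable would be whatever tells clients the service is going away (for example, cancelling its registrations).

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper (not part of Jini/River): runs a cleanup action at most
// once, either when called directly or from a JVM shutdown hook.
final class ShutdownCleanup {
    private final Runnable cleanup;
    private final AtomicBoolean done = new AtomicBoolean(false);
    private final Thread hook;

    ShutdownCleanup(Runnable cleanup) {
        this.cleanup = cleanup;
        this.hook = new Thread(this::runOnce, "shutdown-cleanup");
    }

    // Registers the hook; the JVM starts this thread during an orderly shutdown.
    void install() {
        Runtime.getRuntime().addShutdownHook(hook);
    }

    // Unregisters the hook; returns true if it had been registered.
    boolean uninstall() {
        return Runtime.getRuntime().removeShutdownHook(hook);
    }

    // Runs the cleanup unless it has already run; returns true if this call ran it.
    boolean runOnce() {
        if (done.compareAndSet(false, true)) {
            cleanup.run();
            return true;
        }
        return false;
    }
}
```

A hook only runs on an orderly JVM exit (System.exit, a normal termination signal). It does not run if the process is killed outright or the machine loses power, so clients still have to cope with services that vanish without notice.
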
Re: More odd behaviour

Hi Bill:

Look in the ServiceDiscoveryManager javadocs and specification for the 'discardWait' configuration parameter and the "service discard problem". Essentially, the fact that a lookup service has disappeared (and remember, there may very well be more than one) doesn't tell SDM anything about the availability of a service. If a service is un-registered with all the LUS's, then they will notify the interested SDM's that the service is gone. However if SDM loses contact with one or more LUS's, or if one or more LUS's still have a service registration, SDM can't say for sure that the service is unregistered, so it waits until the 'discardWait' period (default 10 minutes) expires before it sends out notifications that the service is gone from the lookup cache. You can configure the wait time.

However, there's a bigger concept here, that Jini newcomers often miss. I don't know if you're making this mistake, but for the benefit of posterity, let me state it again:

- You don't know if a service has failed until you try to use it, and you can't.
- Conversely, the fact that there is a service registration, or that you can renew a lease with the service, tells you nothing about whether a service is "up".
- And in an odd twist, the fact that a service registration disappears from your lookup cache in no way indicates that the service is "down". The LUS might be down, or the service may have unregistered itself for some reason, but might still be open for business with its current clients. Perhaps the LUS could come back, or another LUS might take its place, and the service might re-register with it.

I'll say it again for emphasis:

- You don't know a service has failed until you try to use it, and you can't.
- You don't know a service is operational until you try to use it, and you can. Also, the instant after you use it, it might be gone.

So the best you can do is put a time bound on how long it is until you know a service has failed. You would do this by actually accessing the service at some interval. Please don't make the mistake of thinking that renewing your lease with a service proves that the service is operational. It just means whatever service is renewing leases is operational.

By the way, fully embracing this concept of partial failure and limited knowledge of the overall system state is an important step on the road from "Wow, Jini is complex" to "Wow, Jini is a work of genius".

Cheers,

Greg.

On Mon, 2009-01-19 at 14:16, Rawlings, Bill A wrote:
> Any ideas on how to get around this?

Greg Trasuk, President
StratusCom Manufacturing Systems Inc. - We use information technology to solve business problems on your plant floor.
http://stratuscom.com

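The "access the service at some interval" advice above can be sketched with plain JDK classes. This is only an illustration under stated assumptions: ServiceProbe, its constructor arguments and the chosen period are hypothetical names, not Jini API. For a lookup service, the Callable could wrap any cheap remote call on the proxy, such as ServiceRegistrar.getGroups().

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical monitor (not Jini API): decides "up" or "down" only from the
// outcome of a real call on the service, never from registrations or leases.
final class ServiceProbe implements AutoCloseable {
    private final Callable<?> call;            // a cheap, real operation on the service proxy
    private final Consumer<Boolean> onResult;  // receives true if the call succeeded
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    ServiceProbe(Callable<?> call, Consumer<Boolean> onResult) {
        this.call = call;
        this.onResult = onResult;
    }

    // One probe: try to use the service and report whether that worked.
    boolean probeOnce() {
        boolean ok;
        try {
            call.call();
            ok = true;
        } catch (Exception e) {  // a RemoteException from a dead service lands here
            ok = false;
        }
        onResult.accept(ok);
        return ok;
    }

    // Repeats the probe, waiting periodMillis between the end of one probe
    // and the start of the next.
    void start(long periodMillis) {
        timer.scheduleWithFixedDelay(this::probeOnce, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Stops the background probing.
    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

With something like this in place, the time until a failure is noticed is bounded by roughly the probe period plus however long the remote call takes to fail. A successful probe still only says the service was reachable at that moment.
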
Re: More odd behaviour

Rawlings, Bill A wrote:
> Thanks for the responses guys. I figured it was a lease waiting to
> expire. I guess I can get around it by setting the LUS status to be
> “down” right after the destroy() call, because that does indeed kill the
> LUS.
>
> I’d like to use this problem as something I think is important for
> future River use. This is an app I should not have to be writing.

The com.sun.jini.start.ServiceStarter class illustrates how a container can be created which uses the ApplicationDescriptor mechanisms to manage service instance lifecycle. It is configuration based in that application, but could be done with a GUI as well.

I started to do something along these lines a while back, but put it aside when some other work intervened. What little I got done is out at: http://pescade.dev.java.net/

Gregg Wonderly

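As an alternative to setting the status by hand after destroy(), the 'discardWait' period Greg Trasuk describes earlier in the thread can be shortened through the ServiceDiscoveryManager's configuration. The fragment below is a sketch only: the file name sdm.config and the 30-second value are assumptions chosen for illustration. The component name and the default come from the ServiceDiscoveryManager javadoc.

```
/* sdm.config (hypothetical file name) */
net.jini.lookup.ServiceDiscoveryManager {
    /* Milliseconds. The documented default is 2*(5*60*1000), i.e. 10 minutes. */
    discardWait = 30000L;
}
```

The file would be loaded with ConfigurationProvider.getInstance(new String[] { "sdm.config" }) and the resulting Configuration passed as the third argument to the ServiceDiscoveryManager constructor, in place of the two-argument form used in the original post. A shorter wait only makes the cache report the removal sooner. It says nothing more about whether the service itself is still running.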