Reggie not answering on lookup

Hi
First of all: should I be posting questions regarding Jini 2.1 here or on the River list?

We have a problem that's shown up intermittently on our production systems. We have two LUS (Reggie) instances configured with the same group name, a number of service interfaces registered (34 at last count), and about two dozen Jini clients. Via JMX we can monitor the Jini lookup for any single client, and we have a "service monitor" Jini client also available via JMX (which tracks services over time, pings them regularly for reachability using a special interface, etc.). We also have a command-line tool which can locate LUS instances either by group name or via unicast and display any registrars discovered and the service interfaces registered with them.

What has happened on occasion (including today) is that, on starting a given service, the LUS stops responding to any lookup requests. When the service is shut down again, the LUS will respond and all previous registrations are still there (as long as the lease hasn't expired). During this "blackout" period, any clients that already have a handle on a service operate without problems (they all rely on a LookupCache), but any attempts to reach the LUS will fail.

In our command-line LUS viewer, the lookup is initiated by simply

    new LookupDiscoveryManager(groupNames, new LookupLocator[]{}, this);

where "this" is a DiscoveryListener. Our normal client-lookup stack is more complex but also relies on multicast discovery (group name + interface).

What is also unusual is that in today's incident, we have two copies of the same service installed to two different online hosts (e.g. for clustering). We can start instance A without problem, but starting instance B will cause the LUS to stop responding. Shutting down instance B will clear the problem up. The code in this case is the same; they are just located on different hosts. Note that otherwise, all other (34) service interfaces have been available and discoverable without problem.
Unfortunately, we have not been able to reproduce this in dev or staging environments, even with the exact same versions of the service implementations. Trying to track this down while online is very risky: while we can reproduce the problem, if the LUS remains unreachable, at some point clients performing a new lookup will fail to find the services they need.

Note that outside of these "poison-pill" services (about which we see nothing unusual) our Jini infrastructure has been stable for some time now. We'd appreciate any help in trying to track this down.

Thanks
Patrick

--------------------------------------------------------------------------
Getting Started: http://www.jini.org/wiki/Category:Getting_Started
Community Web Site: http://jini.org
jini-users Archive: http://archives.java.sun.com/archives/jini-users.html
Unsubscribing: email "signoff JINI-USERS" to listserv@...
Re: Reggie not answering on lookup

Patrick Wright wrote:
> Follow-up (sometimes it does indeed help to describe this to someone
> in writing): we now suspect a problem with the codebase server for the
> "poison pill" service instance. What we just found in testing was that
> from some of our hosts, the codebase server is not reachable, likely
> due to a misconfigured firewall. We suspect that would cause any
> resolution of service instances to block as the Jini clients tried to
> download that service's downloadable jars.
>
> I will post back to the list (for posterity) what we find out.

I use my vhttp: protocol handler to cache, locally, downloadable codebase jars to speed things up when there are lots of services to find and/or limited bandwidth. One of the features of this protocol handler is that it logs any failure to download something. It is amazing how helpful this kind of logging is for debugging codebase server issues. It is the first sign of an incorrect hostname on a new server, etc.

Gregg Wonderly
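A reachability check of the kind described above can be sketched in plain Java, with no Jini classes involved. This is only an illustration under stated assumptions: the class name CodebaseProbe and its methods are hypothetical, the codebase is assumed to be a space-separated list of HTTP URLs (as in the java.rmi.server.codebase property), and the 3-second timeout is arbitrary. It is not the vhttp: handler mentioned above. The explicit connect and read timeouts are the point of the sketch: a firewall that silently drops packets then shows up as a logged failure rather than as a stalled lookup.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLConnection;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: probes each URL of a codebase annotation from this host.
class CodebaseProbe {

    /** Splits a space-separated codebase annotation into individual URLs. */
    static List<String> splitCodebase(String codebase) {
        List<String> urls = new ArrayList<>();
        if (codebase == null) {
            return urls;
        }
        for (String part : codebase.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                urls.add(part);
            }
        }
        return urls;
    }

    /** Returns null if the URL answered a HEAD request, else a failure description. */
    static String probe(String url, int timeoutMillis) {
        HttpURLConnection http = null;
        try {
            URLConnection conn = URI.create(url).toURL().openConnection();
            if (!(conn instanceof HttpURLConnection)) {
                return "not an HTTP(S) URL";
            }
            http = (HttpURLConnection) conn;
            http.setRequestMethod("HEAD");
            // Without these timeouts a dropped connection can block indefinitely.
            http.setConnectTimeout(timeoutMillis);
            http.setReadTimeout(timeoutMillis);
            int status = http.getResponseCode();
            return status < 400 ? null : "HTTP status " + status;
        } catch (IOException | IllegalArgumentException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        } finally {
            if (http != null) {
                http.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Codebase comes from the first argument, else from the system property.
        String codebase = args.length > 0
                ? args[0]
                : System.getProperty("java.rmi.server.codebase");
        for (String url : splitCodebase(codebase)) {
            String failure = probe(url, 3000); // 3 s timeout is an arbitrary choice
            System.out.println(url + " -> "
                    + (failure == null ? "reachable" : "FAILED (" + failure + ")"));
        }
    }
}
```

Running this on each client host and on the LUS hosts, passing the codebase string of the suspect service instance as the single argument, would show which hosts cannot reach that instance's codebase server.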