New clients sometimes don't join the group

Hello,
I sometimes have the problem that new clients don't join an existing group. Then I get lots of these messages on both clients:

Feb 29, 2000 7:41:07 AM org.jgroups.logging.JDKLogImpl warn
WARNING: localhost-54698: discarded message from non-member l4work-57960, my view is [localhost-54698|0] [localhost-54698]

Most of the time they perform a merge after a while:

Feb 29, 2000 7:42:49 AM org.jgroups.logging.JDKLogImpl warn
WARNING: localhost-54698: discarded message from non-member l4work-1893, my view is [localhost-54698|0] [localhost-54698]
view: MergeView::[localhost-54698|1] [localhost-54698, l4work-1893], subgroups=[[localhost-54698|0] [localhost-54698], [l4work-1893|0] [l4work-1893]]

But sometimes they don't. Then I can wait almost forever and they don't merge:

Feb 29, 2000 7:43:45 AM org.jgroups.logging.JDKLogImpl warn
WARNING: localhost-54698: discarded message from non-member l4work-31577, my view is [localhost-54698|2] [localhost-54698]
[...]
Feb 29, 2000 7:56:10 AM org.jgroups.logging.JDKLogImpl warn
WARNING: localhost-54698: discarded message from non-member l4work-31577, my view is [localhost-54698|2] [localhost-54698]

I have no idea how to reproduce either of these behaviors. Shouldn't they just enter the group immediately after the clients start?

Greets,
--
Kai Timmer | http://kaitimmer.de
Email : email@...
Jabber (Google Talk): kai@...
ICQ: 67765488

_______________________________________________
javagroups-users mailing list
javagroups-users@...
https://lists.sourceforge.net/lists/listinfo/javagroups-users
Re: New clients sometimes don't join the group

Version and config, please?

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
Re: New clients sometimes don't join the group

2009/11/6 Bela Ban <belaban@...>:
> Version and config please ?

I attached the protocol stack configuration to this mail and I'm using version: 2.8.0CR3

Greets,
--
Kai Timmer | http://kaitimmer.de
Email : email@...
Jabber (Google Talk): kai@...
ICQ: 67765488

<!--
  Default stack using IP multicasting. It is similar to the "udp"
  stack in stacks.xml, but doesn't use streaming state transfer and flushing
  author: Bela Ban
  version: $Id: udp.xml,v 1.32 2009/06/17 16:35:43 belaban Exp $
-->
<config xmlns="urn:org:jgroups"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:org:jgroups file:schema/JGroups-2.8.xsd">
    <UDP
         ip_mcast="true"
         mcast_addr="${jgroups.udp.mcast_addr:232.10.10.10}"
         mcast_port="${jgroups.udp.mcast_port:45588}"
         tos="8"
         ucast_recv_buf_size="20000000"
         ucast_send_buf_size="640000"
         mcast_recv_buf_size="25000000"
         mcast_send_buf_size="640000"
         loopback="false"
         discard_incompatible_packets="true"
         max_bundle_size="64000"
         max_bundle_timeout="30"
         ip_ttl="${jgroups.udp.ip_ttl:2}"
         enable_bundling="true"
         enable_diagnostics="true"
         thread_naming_pattern="cl"

         thread_pool.enabled="true"
         thread_pool.min_threads="2"
         thread_pool.max_threads="8"
         thread_pool.keep_alive_time="5000"
         thread_pool.queue_enabled="true"
         thread_pool.queue_max_size="10000"
         thread_pool.rejection_policy="discard"

         oob_thread_pool.enabled="true"
         oob_thread_pool.min_threads="1"
         oob_thread_pool.max_threads="8"
         oob_thread_pool.keep_alive_time="5000"
         oob_thread_pool.queue_enabled="false"
         oob_thread_pool.queue_max_size="100"
         oob_thread_pool.rejection_policy="Run"/>

    <PING timeout="10000" num_initial_members="1"/>
    <MERGE2 max_interval="30000" min_interval="10000"/>
    <FD_ALL interval="1000" timeout="5000" />
    <VERIFY_SUSPECT timeout="2000"/>
    <pbcast.NAKACK use_stats_for_retransmission="false"
                   exponential_backoff="150"
                   use_mcast_xmit="true"
                   gc_lag="0"
                   xmit_from_random_member="true"
                   retransmit_timeout="50,300,600,1200"
                   discard_delivered_msgs="false"/>
    <UNICAST timeout="300,600,1200"/>
    <pbcast.STABLE stability_delay="1000"
                   desired_avg_gossip="50000"
                   max_bytes="1000000"/>
    <pbcast.GMS print_local_addr="true" join_timeout="3000" view_bundling="true"/>
    <FC max_credits="500000" min_threshold="0.20"/>
    <FRAG2 frag_size="60000" />
    <!--pbcast.STREAMING_STATE_TRANSFER /-->
    <pbcast.STATE_TRANSFER />
    <pbcast.FLUSH />
</config>
Re: New clients sometimes don't join the group

Using 2.8RC3 we are also seeing nodes unable to join and high CPU. We are also seeing nodes leaving the cluster when others try to join.

We are currently trying to reproduce it with the log level set to debug and will provide more detail soon.

UDP(bind_addr=10.4.68.61;enable_diagnostics=false;mcast_addr=228.8.8.8;mcast_port=54000;loopback=false;mcast_recv_buf_size=120000):PING(timeout=10000;num_initial_members=10000;num_ping_requests=1):MERGE2(max_interval=10000;min_interval=5000):FD_ALL(interval=5000;timeout=16000):VERIFY_SUSPECT(timeout=3000):BARRIER():pbcast.NAKACK():UNICAST():pbcast.STABLE():pbcast.GMS(join_timeout=11000;print_local_addr=true;view_bundling=true):FRAG2(frag_size=60000):pbcast.STATE_TRANSFER()}

David Forget
Re: New clients sometimes don't join the group

David,

Is there a reason why num_ping_requests is 1? Only one ping request is going to be sent out in a time frame of 10 sec. Ping requests are sent over an unreliable UDP socket (PING is below UNICAST) and could be lost in your case. Give a thorough look to all PING properties, experiment with a higher value for num_ping_requests and do not forget to report back.

Regards,
Vladimir
Re: New clients sometimes don't join the group

Hi Vladimir,

Good suggestion, I will increase PING::timeout and set PING::num_ping_requests=2. We set PING to every 10 sec to reduce as much as possible the amount of UDP messages. As you know, the coordinator sends PING to every member of the cluster periodically for the entire duration of the cluster and every node has to reply. Some of our clusters should reach over 350 nodes in early 2010, and having PING with more aggressive values has an impact on the network and creates CPU spikes on the coordinator.

David Forget
Re: New clients sometimes don't join the group

David, no need to increase PING:timeout, I suppose; keep it aligned with your GMS:join_timeout, which you already do. I agree with your increase of PING::num_ping_requests to 2. Also, what is your reasoning behind setting num_initial_members to 10000? See the details of the discovery algorithm in Discovery.java.

Hearing about a 350+ node deployment and getting your feedback is certainly very interesting!

Regards,
Vladimir
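For illustration only, the PING change agreed on above could look like this in the XML format used by the udp.xml posted earlier in the thread. David's stack was posted as a property string, so the XML form is an assumption about how it would be carried over. The timeout stays at 10000, close to his join_timeout of 11000, and num_initial_members is left at the posted value that the thread questions.

```xml
<!-- Sketch: David's posted PING settings with num_ping_requests raised from 1 to 2.
     timeout and num_initial_members are unchanged from his property string. -->
<PING timeout="10000"
      num_initial_members="10000"
      num_ping_requests="2"/>
```

With two requests per discovery round, a single lost UDP datagram no longer means the whole round finds nobody.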
Re: New clients sometimes don't join the group

david.forget@... wrote:
> Hi Vladimir,
> Good suggestion, I will increase PING::timeout and set
> PING::num_ping_requests=2. We set PING to every 10 sec to reduce as
> much as possible the amount of UDP messages. As you know, the
> coordinator sends PING to every member of the cluster periodically
> for the entire duration of the cluster and every node has to reply,

That shouldn't be an issue though, as you use IP multicasting, which only sends 1 multicast packet. Yes, every receiver responds with a UDP datagram back to the coordinator though...

After startup, PING is only used by MERGE2, so if you have relatively high values for min_interval and max_interval in MERGE2, then you decrease the frequency at which PING is sending out messages.

> Some of our clusters should reach over 350 nodes in early 2010

Interesting!

> and having PING with more aggressive values has an impact on the network
> and creates CPU spikes on the coordinator.

Understood.

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
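As a sketch of the MERGE2 point above: David's posted stack uses max_interval=10000 and min_interval=5000. Raising both, for example to the values from Kai's udp.xml earlier in the thread, would make MERGE2 trigger discovery less often after startup. The numbers below are borrowed from that file purely as an example, not as a recommendation from the thread.

```xml
<!-- Example values taken from Kai's udp.xml; higher intervals mean
     MERGE2 asks PING to run discovery less frequently. -->
<MERGE2 max_interval="30000" min_interval="10000"/>
```

The trade-off is that partitions which do need a merge may take correspondingly longer to heal.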
Re: New clients sometimes don't join the group

david.forget@... wrote:
> Using 2.8RC3 we are also seeing nodes unable to join and high CPU. We
> are also seeing nodes leaving the cluster when others try to join.
>
> We are currently trying to reproduce it with log level to debug and
> will provide more detail soon.

If you can reproduce this, the sooner the better. 2.8.0 has 4 issues left and I'm working towards closing them, so I can release GA.

Comments about your config below.

> UDP(bind_addr=10.4.68.61;enable_diagnostics=false;mcast_addr=228.8.8.8;mcast_port=54000;loopback=false;mcast_recv_buf_size=120000):

That's a small receive buffer. The bigger the better, but don't forget to set net.core.rmem_max (def: 131K IIRC) too (on UNIX systems).

> PING(timeout=10000;num_initial_members=10000;num_ping_requests=1):

As Vladimir pointed out, num_initial_members=10000? What's the rationale here?

> MERGE2(max_interval=10000;min_interval=5000):
> FD_ALL(interval=5000;timeout=16000):

No FD_SOCK? You'll have to wait for up to 23 seconds (worst case) to discover a crashed member...

> VERIFY_SUSPECT(timeout=3000):
> BARRIER():
> pbcast.NAKACK():
> UNICAST():
> pbcast.STABLE():
> pbcast.GMS(join_timeout=11000;print_local_addr=true;view_bundling=true):

join_timeout of 11 seconds? Seems too high... Note that I'd set max_bundling_time (def: 50ms) too if you set view_bundling=true: this way, many concurrent joins generate only a few view changes rather than 1 view change per JOIN.

> FRAG2(frag_size=60000):
> pbcast.STATE_TRANSFER()}

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
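The comments above could be applied roughly as in the following fragment, written in the XML format of the udp.xml posted earlier. This is a sketch under assumptions, not a tested configuration: the receive buffer size is simply copied from Kai's file, placing FD_SOCK directly before FD_ALL is a common arrangement rather than something stated in the thread, and join_timeout is left at the posted 11000 because no replacement value was given. Protocols not mentioned in the comments are omitted.

```xml
<!-- Larger multicast receive buffer; 25000000 is borrowed from Kai's udp.xml.
     The OS limit (net.core.rmem_max) has to allow it as well. -->
<UDP bind_addr="10.4.68.61"
     enable_diagnostics="false"
     mcast_addr="228.8.8.8"
     mcast_port="54000"
     loopback="false"
     mcast_recv_buf_size="25000000"/>

<!-- FD_SOCK added next to FD_ALL so a crashed member is noticed sooner
     (placement is an assumption). -->
<FD_SOCK/>
<FD_ALL interval="5000" timeout="16000"/>

<!-- max_bundling_time set explicitly (50 ms is the default mentioned above)
     so it can be tuned together with view_bundling. -->
<pbcast.GMS join_timeout="11000"
            print_local_addr="true"
            view_bundling="true"
            max_bundling_time="50"/>
```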
Re: New clients sometimes don't join the group

I suggest you change PING.timeout to 3000 and PING.num_initial_members to 3. This way, you reduce the chances of members not finding each other, or returning after having found another (starting) member.

Kai Timmer wrote:
> I attached the protocol stack configuration to this mail and I'm using
> version: 2.8.0CR3

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
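Applied to the udp.xml posted earlier in the thread, the suggestion above amounts to replacing the existing PING line (timeout="10000", num_initial_members="1") with something like the following. Only these two attributes change; the rest of the stack is assumed to stay as posted.

```xml
<!-- Was: <PING timeout="10000" num_initial_members="1"/> -->
<PING timeout="3000" num_initial_members="3"/>
```

With num_initial_members="1", discovery can return as soon as a single response arrives, which may come from another member that is itself still starting up.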