|
View:
New views
9 Messages
—
Rating Filter:
Alert me
|
|
|
long pauses during hbase.PerformanceEvaluation testsHi all
I have been running 'bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation ' script to get an idea of the cluster. I see long PAUSES during writes. here is my setup: - 5 nodes on EC2 (m1.large) : 1 hbase master + 4 regions - hadoop & hbase get 2G heap - I don't see any JVM heap getting full (not even close) - during the pauses, the CPU usage is pretty much zero (on client, running on master machine & region servers) - no swap going on - network connections between machines are pretty fast I am curious to know what is causing the delay. I don't see errors in logs. Could region-splits be causing such long delays? Because re-runs go through much faster. Any pointers/explanations much appreciated. Here is a couple of examples: A) SequentialWrite : TestTable created notice the large gaps in timestamps (*) # bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1 ==> 201 writes / sec 09/10/28 23:17:55 INFO hbase.PerformanceEvaluation: Table {NAME => 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]} created 09/10/28 23:17:55 INFO hbase.PerformanceEvaluation: Start sequentialWrite at offset 0 for 1048576 rows 09/10/28 23:18:10 INFO hbase.PerformanceEvaluation: 0/104857/1048576 09/10/28 23:18:24 INFO hbase.PerformanceEvaluation: 0/209714/1048576 * 09/10/28 23:52:16 INFO hbase.PerformanceEvaluation: 0/314571/1048576 09/10/29 00:25:50 INFO hbase.PerformanceEvaluation: 0/419428/1048576 09/10/29 00:26:06 INFO hbase.PerformanceEvaluation: 0/524285/1048576 09/10/29 00:26:25 INFO hbase.PerformanceEvaluation: 0/629142/1048576 * 09/10/29 00:41:47 INFO hbase.PerformanceEvaluation: 0/733999/1048576 09/10/29 00:43:34 INFO hbase.PerformanceEvaluation: 0/838856/1048576 09/10/29 00:43:49 INFO hbase.PerformanceEvaluation: 0/943713/1048576 09/10/29 00:44:43 INFO hbase.PerformanceEvaluation: 0/1048570/1048576 09/10/29 00:44:44 INFO hbase.PerformanceEvaluation: Finished sequentialWrite in 5207981ms at offset 0 for 1048576 rows During one of the pauses, here is what I see on one region server. Notice how the table split took 14 mins @ hadoop4 2009-10-29 00:26:33,925 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting split of region TestTable,0000088768,1256773950625 2009-10-29 00:41:30,428 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TestTable,0000088768,1256773950625 2009-10-29 00:41:32,218 INFO org.apache.hadoop.hbase.regionserver.HRegion: region TestTable,0000088768,1256775993930/1603364931 available; sequence id is 58208257 2009-10-29 00:41:32,218 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TestTable,0000088768,1256775993930 2009-10-29 00:41:32,407 INFO org.apache.hadoop.hbase.regionserver.HRegion: region TestTable,0000249152,1256775993930/1798535948 available; sequence id is 58208258 2009-10-29 00:41:32,407 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TestTable,0000249152,1256775993930 2009-10-29 00:41:32,422 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=domU-12-31-39-06-B9-61.compute-1.internal:2181,domU-12-31-39-06-E0-62.compute-1.internal:2181,domU-12-31-39-07-8A-22.compute-1.internal:2181,domU-12-31-39-06-B0-32.compute-1.internal:2181,domU-12-31-39-06-3D-A1.compute-1.internal:2181 sessionTimeout=60000 watcher=org.apache.hadoop.hbase.client.HConnectionManager$ClientZKWatcher@1f5b63822009-10-29 00:41:32,422 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server domU-12-31-39-06-B0-32.compute-1.internal/10.208.183.192:2181 2009-10-29 00:41:32,466 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/10.208.190.143:53364 remote=domU-12-31-39-06-B0-32.compute-1.internal/10.208.183.192:2181]2009-10-29 00:41:32,467 INFO org.apache.zookeeper.ClientCnxn: Server connection successful 2009-10-29 00:41:32,528 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. Old region=REGION => {NAME => 'TestTable,0000088768,1256773950625', STARTKE Y => '0000088768', ENDKEY => '', ENCODED => 1389834479, OFFLINE => true, SPLIT => true, TABLE => {{NAME => 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '6 5536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}, new regions: TestTable,0000088768,1256775993930, TestTable,0000249152,1256775993930. Split took 14mins, 58sec 2009-10-29 00:41:34,379 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: TestTable,00002 49152,1256775993930 2009-10-29 00:41:34,380 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: TestTable,0000249152,1256775993930 B) 'TestTable' exists. When I re-ran the test, ==> 3898 writes / sec 09/10/29 04:28:22 INFO hbase.PerformanceEvaluation: Finished sequentialWrite in 269966ms at offset 0 for 1048576 rows C) For kickers, here is a RandomWrite (Table created) bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 1 ==> 304 writes /sec 09/10/29 05:33:46 INFO hbase.PerformanceEvaluation: Table {NAME => 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]} created 09/10/29 05:33:46 INFO hbase.PerformanceEvaluation: Start randomWrite at offset 0 for 1048576 rows 09/10/29 05:34:01 INFO hbase.PerformanceEvaluation: 0/104857/1048576 09/10/29 05:34:15 INFO hbase.PerformanceEvaluation: 0/209714/1048576 09/10/29 05:35:16 INFO hbase.PerformanceEvaluation: 0/314571/1048576 09/10/29 05:36:25 INFO hbase.PerformanceEvaluation: 0/419428/1048576 09/10/29 05:36:38 INFO hbase.PerformanceEvaluation: 0/524285/1048576 09/10/29 05:37:25 INFO hbase.PerformanceEvaluation: 0/629142/1048576 09/10/29 05:37:37 INFO hbase.PerformanceEvaluation: 0/733999/1048576 09/10/29 05:37:51 INFO hbase.PerformanceEvaluation: 0/838856/1048576 * 09/10/29 05:56:47 INFO hbase.PerformanceEvaluation: 0/943713/1048576 * 09/10/29 06:31:12 INFO hbase.PerformanceEvaluation: 0/1048570/1048576 09/10/29 06:31:13 INFO hbase.PerformanceEvaluation: Finished randomWrite in 3446240ms at offset 0 for 1048576 rows D) randomWrite rerun, with existing TestTable bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 1 ==> 3075 writes / sec 09/10/29 06:57:04 INFO hbase.PerformanceEvaluation: Finished randomWrite in 341746ms at offset 0 for 1048576 rows Thanks Sujee -- http://sujee.net |
|
|
Re: long pauses during hbase.PerformanceEvaluation testsforgot to add that I am running
hbase v0.20.1 hadoop v0.20.1 pretty much at default settings... On Thu, Oct 29, 2009 at 12:01 AM, Sujee Maniyam <sujee@...> wrote: > Hi all > > I have been running 'bin/hbase > org.apache.hadoop.hbase.PerformanceEvaluation ' script to get an idea > of the cluster. I see long PAUSES during writes. > > here is my setup: > - 5 nodes on EC2 (m1.large) : 1 hbase master + 4 regions > - hadoop & hbase get 2G heap > > - I don't see any JVM heap getting full (not even close) > - during the pauses, the CPU usage is pretty much zero (on client, > running on master machine & region servers) > - no swap going on > - network connections between machines are pretty fast > > I am curious to know what is causing the delay. I don't see errors in logs. > Could region-splits be causing such long delays? Because re-runs go > through much faster. > > Any pointers/explanations much appreciated. > > Here is a couple of examples: > > A) SequentialWrite : TestTable created > notice the large gaps in timestamps (*) > # bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1 > ==> 201 writes / sec > > > 09/10/28 23:17:55 INFO hbase.PerformanceEvaluation: Table {NAME => > 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', VERSIONS > => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', > BLOCKCACHE => 'true'}]} created > 09/10/28 23:17:55 INFO hbase.PerformanceEvaluation: Start sequentialWrite > at offset 0 for 1048576 rows > 09/10/28 23:18:10 INFO hbase.PerformanceEvaluation: 0/104857/1048576 > 09/10/28 23:18:24 INFO hbase.PerformanceEvaluation: 0/209714/1048576 > > * 09/10/28 23:52:16 INFO hbase.PerformanceEvaluation: 0/314571/1048576 > 09/10/29 00:25:50 INFO hbase.PerformanceEvaluation: 0/419428/1048576 > 09/10/29 00:26:06 INFO hbase.PerformanceEvaluation: 0/524285/1048576 > 09/10/29 00:26:25 INFO hbase.PerformanceEvaluation: 0/629142/1048576 > > * 09/10/29 00:41:47 INFO hbase.PerformanceEvaluation: 0/733999/1048576 > 09/10/29 00:43:34 INFO hbase.PerformanceEvaluation: 0/838856/1048576 > 09/10/29 00:43:49 INFO hbase.PerformanceEvaluation: 0/943713/1048576 > 09/10/29 00:44:43 INFO hbase.PerformanceEvaluation: 0/1048570/1048576 > 09/10/29 00:44:44 INFO hbase.PerformanceEvaluation: Finished > sequentialWrite in 5207981ms at offset 0 for 1048576 rows > > > During one of the pauses, here is what I see on one region server. > Notice how the table split took 14 mins > > @ hadoop4 > 2009-10-29 00:26:33,925 INFO org.apache.hadoop.hbase.regionserver.HRegion: > Starting split of region TestTable,0000088768,1256773950625 > 2009-10-29 00:41:30,428 INFO org.apache.hadoop.hbase.regionserver.HRegion: > Closed TestTable,0000088768,1256773950625 > 2009-10-29 00:41:32,218 INFO org.apache.hadoop.hbase.regionserver.HRegion: > region TestTable,0000088768,1256775993930/1603364931 available; sequence id > is 58208257 > 2009-10-29 00:41:32,218 INFO org.apache.hadoop.hbase.regionserver.HRegion: > Closed TestTable,0000088768,1256775993930 > 2009-10-29 00:41:32,407 INFO org.apache.hadoop.hbase.regionserver.HRegion: > region TestTable,0000249152,1256775993930/1798535948 available; sequence id > is 58208258 > 2009-10-29 00:41:32,407 INFO org.apache.hadoop.hbase.regionserver.HRegion: > Closed TestTable,0000249152,1256775993930 > 2009-10-29 00:41:32,422 INFO org.apache.zookeeper.ZooKeeper: Initiating > client connection, > connectString=domU-12-31-39-06-B9-61.compute-1.internal:2181,domU-12-31-39-06-E0-62.compute-1.internal:2181,domU-12-31-39-07-8A-22.compute-1.internal:2181,domU-12-31-39-06-B0-32.compute-1.internal:2181,domU-12-31-39-06-3D-A1.compute-1.internal:2181 > sessionTimeout=60000 > watcher=org.apache.hadoop.hbase.client.HConnectionManager$ClientZKWatcher@1f5b63822009-10-29 > 00:41:32,422 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to > server domU-12-31-39-06-B0-32.compute-1.internal/10.208.183.192:2181 > 2009-10-29 00:41:32,466 INFO org.apache.zookeeper.ClientCnxn: Priming > connection to java.nio.channels.SocketChannel[connected > local=/10.208.190.143:53364 > remote=domU-12-31-39-06-B0-32.compute-1.internal/10.208.183.192:2181]2009-10-29 > 00:41:32,467 INFO org.apache.zookeeper.ClientCnxn: Server connection > successful > 2009-10-29 00:41:32,528 INFO > org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META > updated, and report to master all successful. Old region=REGION => {NAME => > 'TestTable,0000088768,1256773950625', STARTKE > Y => '0000088768', ENDKEY => '', ENCODED => 1389834479, OFFLINE => true, > SPLIT => true, TABLE => {{NAME => 'TestTable', FAMILIES => [{NAME => > 'info', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', > BLOCKSIZE => '6 > 5536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}, new regions: > TestTable,0000088768,1256775993930, TestTable,0000249152,1256775993930. > Split took 14mins, 58sec > 2009-10-29 00:41:34,379 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: > TestTable,00002 49152,1256775993930 > 2009-10-29 00:41:34,380 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: > MSG_REGION_OPEN: TestTable,0000249152,1256775993930 > > > > B) 'TestTable' exists. When I re-ran the test, > > ==> 3898 writes / sec > > 09/10/29 04:28:22 INFO hbase.PerformanceEvaluation: Finished > sequentialWrite in 269966ms at offset 0 for 1048576 rows > > > C) For kickers, here is a RandomWrite > (Table created) > > bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 1 > > ==> 304 writes /sec > > 09/10/29 05:33:46 INFO hbase.PerformanceEvaluation: Table {NAME => > 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', VERSIONS > => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', > BLOCKCACHE => 'true'}]} created > 09/10/29 05:33:46 INFO hbase.PerformanceEvaluation: Start randomWrite at > offset 0 for 1048576 rows > 09/10/29 05:34:01 INFO hbase.PerformanceEvaluation: 0/104857/1048576 > 09/10/29 05:34:15 INFO hbase.PerformanceEvaluation: 0/209714/1048576 > 09/10/29 05:35:16 INFO hbase.PerformanceEvaluation: 0/314571/1048576 > 09/10/29 05:36:25 INFO hbase.PerformanceEvaluation: 0/419428/1048576 > 09/10/29 05:36:38 INFO hbase.PerformanceEvaluation: 0/524285/1048576 > 09/10/29 05:37:25 INFO hbase.PerformanceEvaluation: 0/629142/1048576 > 09/10/29 05:37:37 INFO hbase.PerformanceEvaluation: 0/733999/1048576 > 09/10/29 05:37:51 INFO hbase.PerformanceEvaluation: 0/838856/1048576 > * 09/10/29 05:56:47 INFO hbase.PerformanceEvaluation: 0/943713/1048576 > * 09/10/29 06:31:12 INFO hbase.PerformanceEvaluation: 0/1048570/1048576 > > 09/10/29 06:31:13 INFO hbase.PerformanceEvaluation: Finished randomWrite in > 3446240ms at offset 0 for 1048576 rows > > > D) randomWrite rerun, with existing TestTable > bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 1 > > ==> 3075 writes / sec > 09/10/29 06:57:04 INFO hbase.PerformanceEvaluation: Finished > randomWrite in 341746ms at offset 0 for 1048576 rows > > > Thanks > Sujee > > -- > http://sujee.net > -- http://sujee.net |
|
|
Re: long pauses during hbase.PerformanceEvaluation tests14 minutes seems way too much, anything relevant in the region server
logs around the same time? (The one who holds the region at that point) J-D On Thu, Oct 29, 2009 at 1:11 AM, Sujee Maniyam <sujee@...> wrote: > forgot to add that I am running > hbase v0.20.1 > hadoop v0.20.1 > pretty much at default settings... > > On Thu, Oct 29, 2009 at 12:01 AM, Sujee Maniyam <sujee@...> wrote: >> Hi all >> >> I have been running 'bin/hbase >> org.apache.hadoop.hbase.PerformanceEvaluation ' script to get an idea >> of the cluster. I see long PAUSES during writes. >> >> here is my setup: >> - 5 nodes on EC2 (m1.large) : 1 hbase master + 4 regions >> - hadoop & hbase get 2G heap >> >> - I don't see any JVM heap getting full (not even close) >> - during the pauses, the CPU usage is pretty much zero (on client, >> running on master machine & region servers) >> - no swap going on >> - network connections between machines are pretty fast >> >> I am curious to know what is causing the delay. I don't see errors in logs. >> Could region-splits be causing such long delays? Because re-runs go >> through much faster. >> >> Any pointers/explanations much appreciated. >> >> Here is a couple of examples: >> >> A) SequentialWrite : TestTable created >> notice the large gaps in timestamps (*) >> # bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1 >> ==> 201 writes / sec >> >> >> 09/10/28 23:17:55 INFO hbase.PerformanceEvaluation: Table {NAME => >> 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', VERSIONS >> => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', >> BLOCKCACHE => 'true'}]} created >> 09/10/28 23:17:55 INFO hbase.PerformanceEvaluation: Start sequentialWrite >> at offset 0 for 1048576 rows >> 09/10/28 23:18:10 INFO hbase.PerformanceEvaluation: 0/104857/1048576 >> 09/10/28 23:18:24 INFO hbase.PerformanceEvaluation: 0/209714/1048576 >> >> * 09/10/28 23:52:16 INFO hbase.PerformanceEvaluation: 0/314571/1048576 >> 09/10/29 00:25:50 INFO hbase.PerformanceEvaluation: 0/419428/1048576 >> 09/10/29 00:26:06 INFO hbase.PerformanceEvaluation: 0/524285/1048576 >> 09/10/29 00:26:25 INFO hbase.PerformanceEvaluation: 0/629142/1048576 >> >> * 09/10/29 00:41:47 INFO hbase.PerformanceEvaluation: 0/733999/1048576 >> 09/10/29 00:43:34 INFO hbase.PerformanceEvaluation: 0/838856/1048576 >> 09/10/29 00:43:49 INFO hbase.PerformanceEvaluation: 0/943713/1048576 >> 09/10/29 00:44:43 INFO hbase.PerformanceEvaluation: 0/1048570/1048576 >> 09/10/29 00:44:44 INFO hbase.PerformanceEvaluation: Finished >> sequentialWrite in 5207981ms at offset 0 for 1048576 rows >> >> >> During one of the pauses, here is what I see on one region server. >> Notice how the table split took 14 mins >> >> @ hadoop4 >> 2009-10-29 00:26:33,925 INFO org.apache.hadoop.hbase.regionserver.HRegion: >> Starting split of region TestTable,0000088768,1256773950625 >> 2009-10-29 00:41:30,428 INFO org.apache.hadoop.hbase.regionserver.HRegion: >> Closed TestTable,0000088768,1256773950625 >> 2009-10-29 00:41:32,218 INFO org.apache.hadoop.hbase.regionserver.HRegion: >> region TestTable,0000088768,1256775993930/1603364931 available; sequence id >> is 58208257 >> 2009-10-29 00:41:32,218 INFO org.apache.hadoop.hbase.regionserver.HRegion: >> Closed TestTable,0000088768,1256775993930 >> 2009-10-29 00:41:32,407 INFO org.apache.hadoop.hbase.regionserver.HRegion: >> region TestTable,0000249152,1256775993930/1798535948 available; sequence id >> is 58208258 >> 2009-10-29 00:41:32,407 INFO org.apache.hadoop.hbase.regionserver.HRegion: >> Closed TestTable,0000249152,1256775993930 >> 2009-10-29 00:41:32,422 INFO org.apache.zookeeper.ZooKeeper: Initiating >> client connection, >> connectString=domU-12-31-39-06-B9-61.compute-1.internal:2181,domU-12-31-39-06-E0-62.compute-1.internal:2181,domU-12-31-39-07-8A-22.compute-1.internal:2181,domU-12-31-39-06-B0-32.compute-1.internal:2181,domU-12-31-39-06-3D-A1.compute-1.internal:2181 >> sessionTimeout=60000 >> watcher=org.apache.hadoop.hbase.client.HConnectionManager$ClientZKWatcher@1f5b63822009-10-29 >> 00:41:32,422 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to >> server domU-12-31-39-06-B0-32.compute-1.internal/10.208.183.192:2181 >> 2009-10-29 00:41:32,466 INFO org.apache.zookeeper.ClientCnxn: Priming >> connection to java.nio.channels.SocketChannel[connected >> local=/10.208.190.143:53364 >> remote=domU-12-31-39-06-B0-32.compute-1.internal/10.208.183.192:2181]2009-10-29 >> 00:41:32,467 INFO org.apache.zookeeper.ClientCnxn: Server connection >> successful >> 2009-10-29 00:41:32,528 INFO >> org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META >> updated, and report to master all successful. Old region=REGION => {NAME => >> 'TestTable,0000088768,1256773950625', STARTKE >> Y => '0000088768', ENDKEY => '', ENCODED => 1389834479, OFFLINE => true, >> SPLIT => true, TABLE => {{NAME => 'TestTable', FAMILIES => [{NAME => >> 'info', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', >> BLOCKSIZE => '6 >> 5536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}, new regions: >> TestTable,0000088768,1256775993930, TestTable,0000249152,1256775993930. >> Split took 14mins, 58sec >> 2009-10-29 00:41:34,379 INFO >> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: >> TestTable,00002 49152,1256775993930 >> 2009-10-29 00:41:34,380 INFO >> org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: >> MSG_REGION_OPEN: TestTable,0000249152,1256775993930 >> >> >> >> B) 'TestTable' exists. When I re-ran the test, >> >> ==> 3898 writes / sec >> >> 09/10/29 04:28:22 INFO hbase.PerformanceEvaluation: Finished >> sequentialWrite in 269966ms at offset 0 for 1048576 rows >> >> >> C) For kickers, here is a RandomWrite >> (Table created) >> >> bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 1 >> >> ==> 304 writes /sec >> >> 09/10/29 05:33:46 INFO hbase.PerformanceEvaluation: Table {NAME => >> 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', VERSIONS >> => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', >> BLOCKCACHE => 'true'}]} created >> 09/10/29 05:33:46 INFO hbase.PerformanceEvaluation: Start randomWrite at >> offset 0 for 1048576 rows >> 09/10/29 05:34:01 INFO hbase.PerformanceEvaluation: 0/104857/1048576 >> 09/10/29 05:34:15 INFO hbase.PerformanceEvaluation: 0/209714/1048576 >> 09/10/29 05:35:16 INFO hbase.PerformanceEvaluation: 0/314571/1048576 >> 09/10/29 05:36:25 INFO hbase.PerformanceEvaluation: 0/419428/1048576 >> 09/10/29 05:36:38 INFO hbase.PerformanceEvaluation: 0/524285/1048576 >> 09/10/29 05:37:25 INFO hbase.PerformanceEvaluation: 0/629142/1048576 >> 09/10/29 05:37:37 INFO hbase.PerformanceEvaluation: 0/733999/1048576 >> 09/10/29 05:37:51 INFO hbase.PerformanceEvaluation: 0/838856/1048576 >> * 09/10/29 05:56:47 INFO hbase.PerformanceEvaluation: 0/943713/1048576 >> * 09/10/29 06:31:12 INFO hbase.PerformanceEvaluation: 0/1048570/1048576 >> >> 09/10/29 06:31:13 INFO hbase.PerformanceEvaluation: Finished randomWrite in >> 3446240ms at offset 0 for 1048576 rows >> >> >> D) randomWrite rerun, with existing TestTable >> bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 1 >> >> ==> 3075 writes / sec >> 09/10/29 06:57:04 INFO hbase.PerformanceEvaluation: Finished >> randomWrite in 341746ms at offset 0 for 1048576 rows >> >> >> Thanks >> Sujee >> >> -- >> http://sujee.net >> > > > > -- > http://sujee.net > |
|
|
Re: long pauses during hbase.PerformanceEvaluation testsJean,
that would be the logs under '@hadoop4' section - I believe that was the region server holding the server at the time thanks Sujee On Thu, Oct 29, 2009 at 9:32 AM, Jean-Daniel Cryans <jdcryans@...> wrote: > 14 minutes seems way too much, anything relevant in the region server > logs around the same time? (The one who holds the region at that > point) > |
|
|
Re: long pauses during hbase.PerformanceEvaluation testsYes, anything there? Care to paste some lines in a pastebin?
Thx, J-D On Thu, Oct 29, 2009 at 10:19 AM, Sujee Maniyam <sujee@...> wrote: > Jean, > that would be the logs under '@hadoop4' section - I believe that was > the region server holding the server at the time > thanks > Sujee > > On Thu, Oct 29, 2009 at 9:32 AM, Jean-Daniel Cryans <jdcryans@...> wrote: >> 14 minutes seems way too much, anything relevant in the region server >> logs around the same time? (The one who holds the region at that >> point) >> > |
|
|
Re: long pauses during hbase.PerformanceEvaluation testshttp://pastebin.com/f37d75e1d
This is what I previously sent out in the email as well. I don't have the logs anymore (shoulda saved them!). I also realized that my 'ulimit -n = 1024'. Going to up this to 32k and re-run the test. Will post the results. Any thing else, I should keep an eye out for? thanks Sujee On Thu, Oct 29, 2009 at 11:01 AM, Jean-Daniel Cryans <jdcryans@...> wrote: > Yes, anything there? Care to paste some lines in a pastebin? > > Thx, > > J-D > > On Thu, Oct 29, 2009 at 10:19 AM, Sujee Maniyam <sujee@...> wrote: >> Jean, >> that would be the logs under '@hadoop4' section - I believe that was >> the region server holding the server at the time >> thanks >> Sujee >> >> On Thu, Oct 29, 2009 at 9:32 AM, Jean-Daniel Cryans <jdcryans@...> wrote: >>> 14 minutes seems way too much, anything relevant in the region server >>> logs around the same time? (The one who holds the region at that >>> point) >>> >> > -- http://sujee.net |
|
|
Re: long pauses during hbase.PerformanceEvaluation testsSujee,
Sorry the mailing list often doesn't ship the attached files so I couldn't see it. So your log tells me that your split really took 14 mins... !!! Could you re-run that with DEBUG? See http://wiki.apache.org/hadoop/Hbase/FAQ#A5 Thx! J-D On Thu, Oct 29, 2009 at 3:23 PM, Sujee Maniyam <sujee@...> wrote: > http://pastebin.com/f37d75e1d > This is what I previously sent out in the email as well. > > I don't have the logs anymore (shoulda saved them!). > > I also realized that my 'ulimit -n = 1024'. Going to up this to 32k > and re-run the test. Will post the results. > > Any thing else, I should keep an eye out for? > > thanks > Sujee > > On Thu, Oct 29, 2009 at 11:01 AM, Jean-Daniel Cryans > <jdcryans@...> wrote: >> Yes, anything there? Care to paste some lines in a pastebin? >> >> Thx, >> >> J-D >> >> On Thu, Oct 29, 2009 at 10:19 AM, Sujee Maniyam <sujee@...> wrote: >>> Jean, >>> that would be the logs under '@hadoop4' section - I believe that was >>> the region server holding the server at the time >>> thanks >>> Sujee >>> >>> On Thu, Oct 29, 2009 at 9:32 AM, Jean-Daniel Cryans <jdcryans@...> wrote: >>>> 14 minutes seems way too much, anything relevant in the region server >>>> logs around the same time? (The one who holds the region at that >>>> point) >>>> >>> >> > > > > -- > http://sujee.net > |
|
|
Re: long pauses during hbase.PerformanceEvaluation testsfollowing up on this; hopefully this can help out some one else.
turned out the culprit was 'ulimit -n'. it was the default 1024. I upped it to 32k. And that was like flicking a switch and making every thing work great! :-) now the benchmark-tests look much, much better. regards Sujee On Thu, Oct 29, 2009 at 3:57 PM, Jean-Daniel Cryans <jdcryans@...> wrote: > Sujee, > > Sorry the mailing list often doesn't ship the attached files so I > couldn't see it. > > So your log tells me that your split really took 14 mins... !!! Could > you re-run that with DEBUG? See > http://wiki.apache.org/hadoop/Hbase/FAQ#A5 > > Thx! > > J-D > > On Thu, Oct 29, 2009 at 3:23 PM, Sujee Maniyam <sujee@...> wrote: >> http://pastebin.com/f37d75e1d >> This is what I previously sent out in the email as well. >> >> I don't have the logs anymore (shoulda saved them!). >> >> I also realized that my 'ulimit -n = 1024'. Going to up this to 32k >> and re-run the test. Will post the results. >> >> Any thing else, I should keep an eye out for? >> >> thanks >> Sujee >> >> On Thu, Oct 29, 2009 at 11:01 AM, Jean-Daniel Cryans >> <jdcryans@...> wrote: >>> Yes, anything there? Care to paste some lines in a pastebin? >>> >>> Thx, >>> >>> J-D >>> >>> On Thu, Oct 29, 2009 at 10:19 AM, Sujee Maniyam <sujee@...> wrote: >>>> Jean, >>>> that would be the logs under '@hadoop4' section - I believe that was >>>> the region server holding the server at the time >>>> thanks >>>> Sujee >>>> >>>> On Thu, Oct 29, 2009 at 9:32 AM, Jean-Daniel Cryans <jdcryans@...> wrote: >>>>> 14 minutes seems way too much, anything relevant in the region server >>>>> logs around the same time? (The one who holds the region at that >>>>> point) >>>>> >>>> >>> >> >> >> >> -- >> http://sujee.net >> > -- http://sujee.net |
|
|
Re: long pauses during hbase.PerformanceEvaluation testsAs a remainder, on the Getting Started page in the Requirements it says
"HBase currently is a file handle hog. The usual default of 1024 on *nix systems is insufficient if you are loading any significant amount of data into regionservers. See the FAQ: Why do I see "java.io.IOException...(Too many open files)" in my logs? for how to up the limit." http://hadoop.apache.org/hbase/docs/r0.20.1/api/overview-summary.html#overview_description J-D On Fri, Oct 30, 2009 at 4:50 PM, Sujee Maniyam <sujee@...> wrote: > following up on this; hopefully this can help out some one else. > > turned out the culprit was 'ulimit -n'. > it was the default 1024. I upped it to 32k. And that was like > flicking a switch and making every thing work great! :-) > > now the benchmark-tests look much, much better. > > regards > Sujee > > > On Thu, Oct 29, 2009 at 3:57 PM, Jean-Daniel Cryans <jdcryans@...> wrote: >> Sujee, >> >> Sorry the mailing list often doesn't ship the attached files so I >> couldn't see it. >> >> So your log tells me that your split really took 14 mins... !!! Could >> you re-run that with DEBUG? See >> http://wiki.apache.org/hadoop/Hbase/FAQ#A5 >> >> Thx! >> >> J-D >> >> On Thu, Oct 29, 2009 at 3:23 PM, Sujee Maniyam <sujee@...> wrote: >>> http://pastebin.com/f37d75e1d >>> This is what I previously sent out in the email as well. >>> >>> I don't have the logs anymore (shoulda saved them!). >>> >>> I also realized that my 'ulimit -n = 1024'. Going to up this to 32k >>> and re-run the test. Will post the results. >>> >>> Any thing else, I should keep an eye out for? >>> >>> thanks >>> Sujee >>> >>> On Thu, Oct 29, 2009 at 11:01 AM, Jean-Daniel Cryans >>> <jdcryans@...> wrote: >>>> Yes, anything there? Care to paste some lines in a pastebin? >>>> >>>> Thx, >>>> >>>> J-D >>>> >>>> On Thu, Oct 29, 2009 at 10:19 AM, Sujee Maniyam <sujee@...> wrote: >>>>> Jean, >>>>> that would be the logs under '@hadoop4' section - I believe that was >>>>> the region server holding the server at the time >>>>> thanks >>>>> Sujee >>>>> >>>>> On Thu, Oct 29, 2009 at 9:32 AM, Jean-Daniel Cryans <jdcryans@...> wrote: >>>>>> 14 minutes seems way too much, anything relevant in the region server >>>>>> logs around the same time? (The one who holds the region at that >>>>>> point) >>>>>> >>>>> >>>> >>> >>> >>> >>> -- >>> http://sujee.net >>> >> > > > > -- > http://sujee.net > |
| Free embeddable forum powered by Nabble | Forum Help |