Re: HBase Exceptions on version 0.20.1

View: New views
11 Messages — Rating Filter:   Alert me  

Re: HBase Exceptions on version 0.20.1

by elsif :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I am interested as to what other folks are seeing when running this test
against their clusters.  I think my configuration is sound and the load
is fairly realistic.  Please correct me if I'm missing something here.

Here are my results from running the load test on a more recent
cluster.  This cluster is 13 x 3.0GHz Quad Core Xeon servers with 16GB
of RAM and two 1.5TB disks.  One server is the namenode, one is the
zookeeper (1 of 5), one is the hbase master, and the other ten are data
nodes + region servers. Nodes are connected via IP over Infiniband (10Gbps).

[- Configuration -]

Hadoop Version:      0.20.1, r810220
Hadoop Compiled:     Tue Sep 1 20:55:56 UTC 2009 by oom

HBase Version    0.20.1, r822817  
HBase Compiled    Wed Oct 7 11:55:42 PDT 2009, stack

hadoop-env:
# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=8000 # namenode
#export HADOOP_HEAPSIZE=2000 # datanode

# Extra Java runtime options.  Empty by default.
# export HADOOP_OPTS=-server
export HADOOP_OPTS="$HADOOP_OPTS -XX:+HeapDumpOnOutOfMemoryError
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"

hbase-env:
# The maximum amount of heap to use, in MB. Default is 1000.
export HBASE_HEAPSIZE=6000

export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"


[- Test -]

Test was started with a newly formatted namenode.  Average load during
the test was 2.66. Highest load seen was 12.95 (62.8%wa) with an average
50MB/s network utilization and 0k swap in use.

After running 2 instances with 30 threads each for just under two hours
both instances aborted with "
org.apache.hadoop.hbase.client.NoServerForRegionException: No server
address listed in .META. for region" errors.  At this point two region
servers aborted.  I can send you the complete set of logs (74M) if you
are interested.

Error Counts:

    1 x ZooKeeper session expired
    3 x OutOfMemoryError messages were logged on two different regions
servers.
    8 x ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed
openScanner
    554 x ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration


[- Preliminary Status -]

Attribute Name    Value    Description
HBase Version    0.20.1, r822817    HBase version and svn revision
HBase Compiled    Wed Oct 7 11:55:42 PDT 2009, stack    When HBase
version was compiled and by whom
Hadoop Version    Unknown, rUnknown    Hadoop version and svn revision
Hadoop Compiled    Unknown, Unknown    When Hadoop version was compiled
and by whom
HBase Root Directory    hdfs://hfs-027010:8020/hbase    Location of
HBase home directory
Load average    14.5    Average number of regions per regionserver.
Naive computation.
Regions On FS    163    Number of regions on FileSystem. Rough count.
Zookeeper Quorum  
hfs-027013:2181,hfs-027012:2181,hfs-027011:2181,hfs-027015:2181,hfs-027014:2181  
Addresses of all registered ZK servers. For more, see zk dump.
Catalog Tables
Table    Description
-ROOT-    The -ROOT- table holds references to all .META. regions.
.META.    The .META. table holds references to all User Table regions
User Tables

1 table(s) in set.
Table    Description
fc_test     {NAME => 'fc_test', FAMILIES => [{NAME => 'data', VERSIONS
=> '1', COMPRESSION => 'LZO', TTL => '2147483647', BLOCKSIZE => '65536',
IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
Region Servers
    Address    Start Code    Load
hfs-027013:60030    1257291638999    requests=22, regions=15,
usedHeap=2660, maxHeap=5991
hfs-027014:60030    1257291639075    requests=25, regions=15,
usedHeap=2282, maxHeap=5991
hfs-027015:60030    1257291639015    requests=20, regions=15,
usedHeap=1552, maxHeap=5991
hfs-027016:60030    1257291638947    requests=16, regions=16,
usedHeap=2786, maxHeap=5991
hfs-027017:60030    1257291639078    requests=14, regions=14,
usedHeap=2775, maxHeap=5991
hfs-027018:60030    1257291639284    requests=19, regions=14,
usedHeap=1070, maxHeap=5991
hfs-027019:60030    1257291639149    requests=22, regions=13,
usedHeap=2095, maxHeap=5991
hfs-027020:60030    1257291639289    requests=37, regions=15,
usedHeap=2352, maxHeap=5991
hfs-027021:60030    1257291639067    requests=26, regions=14,
usedHeap=2087, maxHeap=5991
hfs-027022:60030    1257291638995    requests=19, regions=14,
usedHeap=1570, maxHeap=5991
Total:     servers: 10         requests=220, regions=145


1154 files and directories, 4417 blocks = 5571 total. Heap Size is 81.06
MB / 7.8 GB (1%)
Configured Capacity    :    13.57 TB
DFS Used    :    138.83 GB
Non DFS Used    :    10.39 GB
DFS Remaining    :    13.43 TB
DFS Used%    :    1 %
DFS Remaining%    :    98.93 %
Live Nodes     :    10
Dead Nodes     :    0

[- End Status -]
Region Servers
    Address    Start Code    Load
hfs-027013:60030    1257291638999    requests=0, regions=101,
usedHeap=5428, maxHeap=5991
hfs-027014:60030    1257291639075    requests=0, regions=103,
usedHeap=5373, maxHeap=5991
hfs-027015:60030    1257291639015    requests=0, regions=104,
usedHeap=3973, maxHeap=5991
hfs-027016:60030    1257291638947    requests=0, regions=104,
usedHeap=5051, maxHeap=5991
hfs-027017:60030    1257291639078    requests=0, regions=108,
usedHeap=5829, maxHeap=5991
hfs-027018:60030    1257291639284    requests=0, regions=102,
usedHeap=4481, maxHeap=5991
hfs-027019:60030    1257291639149    requests=0, regions=104,
usedHeap=4875, maxHeap=5991
hfs-027020:60030    1257291639289    requests=0, regions=101,
usedHeap=5625, maxHeap=5991
Total:     servers: 8         requests=0, regions=827


[- Test Program -]

package org.apache.hadoop.hbase.util;

import java.io.IOException;
import java.io.FileInputStream;
import java.lang.Integer;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

/* create 'fc_test', {NAME => 'json', VERSIONS => 1}, {NAME => 'data',
VERSIONS =>1} */

public class TableTest
    extends Thread
{
    protected int datasize = 4000000;

    public TableTest(int datasize)
    {
        this.datasize = datasize;
    }

    public void run()
    {
        try {
            FileInputStream    random = new FileInputStream("/dev/urandom");
            byte[]    key = new byte[32];
            byte[]  data = new byte[datasize];

            HBaseConfiguration hbaseConfiguration = new
HBaseConfiguration();
            HTable table = new HTable(hbaseConfiguration, "fc_test");
            random.read(data, 0, datasize);

            while (true)
            {
                random.read(key, 0, 32);
                Put update = new Put(Bytes.toBytes(getHexString(key)));
                update.add(Bytes.toBytes("data"), Bytes.toBytes(""),
                    data);
                table.put(update);
            }
        } catch (Exception e) {
            System.out.println(e.toString());
        }

    }

    public static String getHexString(byte[] b) throws Exception
    {
          String result = "";
          for (int i=0; i < b.length; i++)
          {
            result +=
                Integer.toString( ( b[i] & 0xff ) + 0x100,
16).substring( 1 );
          }
         return result;
    }

    public static void main(String args[]) throws IOException
    {
        Thread t = null;
        int i;

        int threads = 30;
        int datasize = 400000;

        try {
            threads = Integer.getInteger(args[0]).intValue();
            datasize = Integer.getInteger(args[1]).intValue();
        } catch (Exception e) {
        }

        System.out.println("Starting " + threads + " threads");

        for (i = 0; i< threads; i++)
        {
               t = new TableTest(datasize);
               t.start();
        }
        // wait for last thread
        try {
           t.join();
        } catch (InterruptedException e) {
        }
    }
}


Re: HBase Exceptions on version 0.20.1

by stack-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

elsif, i'll give your program a go in a day or so and report back.
St.Ack

On Tue, Nov 3, 2009 at 6:47 PM, elsif <elsif.then@...> wrote:

> I am interested as to what other folks are seeing when running this test
> against their clusters.  I think my configuration is sound and the load
> is fairly realistic.  Please correct me if I'm missing something here.
>
> Here are my results from running the load test on a more recent
> cluster.  This cluster is 13 x 3.0GHz Quad Core Xeon servers with 16GB
> of RAM and two 1.5TB disks.  One server is the namenode, one is the
> zookeeper (1 of 5), one is the hbase master, and the other ten are data
> nodes + region servers. Nodes are connected via IP over Infiniband
> (10Gbps).
>
> [- Configuration -]
>
> Hadoop Version:      0.20.1, r810220
> Hadoop Compiled:     Tue Sep 1 20:55:56 UTC 2009 by oom
>
> HBase Version    0.20.1, r822817
> HBase Compiled    Wed Oct 7 11:55:42 PDT 2009, stack
>
> hadoop-env:
> # The maximum amount of heap to use, in MB. Default is 1000.
> export HADOOP_HEAPSIZE=8000 # namenode
> #export HADOOP_HEAPSIZE=2000 # datanode
>
> # Extra Java runtime options.  Empty by default.
> # export HADOOP_OPTS=-server
> export HADOOP_OPTS="$HADOOP_OPTS -XX:+HeapDumpOnOutOfMemoryError
> -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
>
> hbase-env:
> # The maximum amount of heap to use, in MB. Default is 1000.
> export HBASE_HEAPSIZE=6000
>
> export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError
> -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
>
>
> [- Test -]
>
> Test was started with a newly formatted namenode.  Average load during
> the test was 2.66. Highest load seen was 12.95 (62.8%wa) with an average
> 50MB/s network utilization and 0k swap in use.
>
> After running 2 instances with 30 threads each for just under two hours
> both instances aborted with "
> org.apache.hadoop.hbase.client.NoServerForRegionException: No server
> address listed in .META. for region" errors.  At this point two region
> servers aborted.  I can send you the complete set of logs (74M) if you
> are interested.
>
> Error Counts:
>
>    1 x ZooKeeper session expired
>    3 x OutOfMemoryError messages were logged on two different regions
> servers.
>    8 x ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed
> openScanner
>    554 x ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration
>
>
> [- Preliminary Status -]
>
> Attribute Name    Value    Description
> HBase Version    0.20.1, r822817    HBase version and svn revision
> HBase Compiled    Wed Oct 7 11:55:42 PDT 2009, stack    When HBase
> version was compiled and by whom
> Hadoop Version    Unknown, rUnknown    Hadoop version and svn revision
> Hadoop Compiled    Unknown, Unknown    When Hadoop version was compiled
> and by whom
> HBase Root Directory    hdfs://hfs-027010:8020/hbase    Location of
> HBase home directory
> Load average    14.5    Average number of regions per regionserver.
> Naive computation.
> Regions On FS    163    Number of regions on FileSystem. Rough count.
> Zookeeper Quorum
>
> hfs-027013:2181,hfs-027012:2181,hfs-027011:2181,hfs-027015:2181,hfs-027014:2181
> Addresses of all registered ZK servers. For more, see zk dump.
> Catalog Tables
> Table    Description
> -ROOT-    The -ROOT- table holds references to all .META. regions.
> .META.    The .META. table holds references to all User Table regions
> User Tables
>
> 1 table(s) in set.
> Table    Description
> fc_test     {NAME => 'fc_test', FAMILIES => [{NAME => 'data', VERSIONS
> => '1', COMPRESSION => 'LZO', TTL => '2147483647', BLOCKSIZE => '65536',
> IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
> Region Servers
>    Address    Start Code    Load
> hfs-027013:60030    1257291638999    requests=22, regions=15,
> usedHeap=2660, maxHeap=5991
> hfs-027014:60030    1257291639075    requests=25, regions=15,
> usedHeap=2282, maxHeap=5991
> hfs-027015:60030    1257291639015    requests=20, regions=15,
> usedHeap=1552, maxHeap=5991
> hfs-027016:60030    1257291638947    requests=16, regions=16,
> usedHeap=2786, maxHeap=5991
> hfs-027017:60030    1257291639078    requests=14, regions=14,
> usedHeap=2775, maxHeap=5991
> hfs-027018:60030    1257291639284    requests=19, regions=14,
> usedHeap=1070, maxHeap=5991
> hfs-027019:60030    1257291639149    requests=22, regions=13,
> usedHeap=2095, maxHeap=5991
> hfs-027020:60030    1257291639289    requests=37, regions=15,
> usedHeap=2352, maxHeap=5991
> hfs-027021:60030    1257291639067    requests=26, regions=14,
> usedHeap=2087, maxHeap=5991
> hfs-027022:60030    1257291638995    requests=19, regions=14,
> usedHeap=1570, maxHeap=5991
> Total:     servers: 10         requests=220, regions=145
>
>
> 1154 files and directories, 4417 blocks = 5571 total. Heap Size is 81.06
> MB / 7.8 GB (1%)
> Configured Capacity    :    13.57 TB
> DFS Used    :    138.83 GB
> Non DFS Used    :    10.39 GB
> DFS Remaining    :    13.43 TB
> DFS Used%    :    1 %
> DFS Remaining%    :    98.93 %
> Live Nodes     :    10
> Dead Nodes     :    0
>
> [- End Status -]
> Region Servers
>    Address    Start Code    Load
> hfs-027013:60030    1257291638999    requests=0, regions=101,
> usedHeap=5428, maxHeap=5991
> hfs-027014:60030    1257291639075    requests=0, regions=103,
> usedHeap=5373, maxHeap=5991
> hfs-027015:60030    1257291639015    requests=0, regions=104,
> usedHeap=3973, maxHeap=5991
> hfs-027016:60030    1257291638947    requests=0, regions=104,
> usedHeap=5051, maxHeap=5991
> hfs-027017:60030    1257291639078    requests=0, regions=108,
> usedHeap=5829, maxHeap=5991
> hfs-027018:60030    1257291639284    requests=0, regions=102,
> usedHeap=4481, maxHeap=5991
> hfs-027019:60030    1257291639149    requests=0, regions=104,
> usedHeap=4875, maxHeap=5991
> hfs-027020:60030    1257291639289    requests=0, regions=101,
> usedHeap=5625, maxHeap=5991
> Total:     servers: 8         requests=0, regions=827
>
>
> [- Test Program -]
>
> package org.apache.hadoop.hbase.util;
>
> import java.io.IOException;
> import java.io.FileInputStream;
> import java.lang.Integer;
>
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.KeyValue;
> import org.apache.hadoop.hbase.client.HTable;
> import org.apache.hadoop.hbase.client.Get;
> import org.apache.hadoop.hbase.client.Put;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.util.Bytes;
>
> /* create 'fc_test', {NAME => 'json', VERSIONS => 1}, {NAME => 'data',
> VERSIONS =>1} */
>
> public class TableTest
>    extends Thread
> {
>    protected int datasize = 4000000;
>
>    public TableTest(int datasize)
>    {
>        this.datasize = datasize;
>    }
>
>    public void run()
>    {
>        try {
>            FileInputStream    random = new FileInputStream("/dev/urandom");
>            byte[]    key = new byte[32];
>            byte[]  data = new byte[datasize];
>
>            HBaseConfiguration hbaseConfiguration = new
> HBaseConfiguration();
>            HTable table = new HTable(hbaseConfiguration, "fc_test");
>            random.read(data, 0, datasize);
>
>            while (true)
>            {
>                random.read(key, 0, 32);
>                Put update = new Put(Bytes.toBytes(getHexString(key)));
>                update.add(Bytes.toBytes("data"), Bytes.toBytes(""),
>                    data);
>                table.put(update);
>            }
>        } catch (Exception e) {
>            System.out.println(e.toString());
>        }
>
>    }
>
>    public static String getHexString(byte[] b) throws Exception
>    {
>          String result = "";
>          for (int i=0; i < b.length; i++)
>          {
>            result +=
>                Integer.toString( ( b[i] & 0xff ) + 0x100,
> 16).substring( 1 );
>          }
>         return result;
>    }
>
>    public static void main(String args[]) throws IOException
>    {
>        Thread t = null;
>        int i;
>
>        int threads = 30;
>        int datasize = 400000;
>
>        try {
>            threads = Integer.getInteger(args[0]).intValue();
>            datasize = Integer.getInteger(args[1]).intValue();
>        } catch (Exception e) {
>        }
>
>        System.out.println("Starting " + threads + " threads");
>
>        for (i = 0; i< threads; i++)
>        {
>               t = new TableTest(datasize);
>               t.start();
>        }
>        // wait for last thread
>        try {
>           t.join();
>        } catch (InterruptedException e) {
>        }
>    }
> }
>
>

Re: HBase Exceptions on version 0.20.1

by elsif :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Cool.  I am very interested to see what your results are.  I am running
another test with with one change from last time:

  <property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
  </property>

The 554 errors from the last run were all due to
BlockAlreadyExistsExceptions:

2009-11-03 15:44:30,322 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(10.3.27.21:50010,
storageID=DS-930303933-10.3.27.21-50010-1257291478576, infoPort=50075,
ipcPort=50020):DataXceiver
org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException:
Block blk_-6444291931475091735_1344 is valid, and cannot be written to.

The cpu load appears more even this time around - highest is 3.98.  So
far I have 1402 regions using 1.04TB of DFS space.


stack wrote:hightest
> elsif, i'll give your program a go in a day or so and report back.
> St.Ack
>
>  


Re: HBase Exceptions on version 0.20.1

by stack-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Just to say I'm starting up test now....  will report back.  Thanks for your
patience.
St.Ack

On Wed, Nov 4, 2009 at 3:24 PM, elsif <elsif.then@...> wrote:

> Cool.  I am very interested to see what your results are.  I am running
> another test with with one change from last time:
>
>  <property>
>  <name>dfs.datanode.max.xcievers</name>
>  <value>4096</value>
>  </property>
>
> The 554 errors from the last run were all due to
> BlockAlreadyExistsExceptions:
>
> 2009-11-03 15:44:30,322 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(10.3.27.21:50010,
> storageID=DS-930303933-10.3.27.21-50010-1257291478576, infoPort=50075,
> ipcPort=50020):DataXceiver
> org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException:
> Block blk_-6444291931475091735_1344 is valid, and cannot be written to.
>
> The cpu load appears more even this time around - highest is 3.98.  So
> far I have 1402 regions using 1.04TB of DFS space.
>
>
> stack wrote:hightest
> > elsif, i'll give your program a go in a day or so and report back.
> > St.Ack
> >
> >
>
>

Re: HBase Exceptions on version 0.20.1

by elsif :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

My most recent test made it past a couple days, and then one of the
region servers failed during a compaction:

2009-11-06 09:09:59,284 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Start
ing compaction on region
fc_test,e9e8bab582b05d18d550d4112258b4c29707c33ebdd45447c20cce5c5a9e95a0,1257440539235
2009-11-06 09:09:59,294 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 6 on 60020 took 4611ms appending an edit to hlog;
editcount=66
2009-11-06 09:10:14,298 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 7 on 60020 took 3414ms appending an edit to hlog;
editcount=93
2009-11-06 09:10:17,729 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 7 on 60020 took 3430ms appending an edit to hlog;
editcount=94
2009-11-06 09:10:17,734 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 5 on 60020 took 6889ms appending an edit to hlog;
editcount=95
2009-11-06 09:10:17,739 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 1 on 60020 took 6894ms appending an edit to hlog;
editcount=96
2009-11-06 09:10:17,743 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 0 on 60020 took 6898ms appending an edit to hlog;
editcount=97
2009-11-06 09:10:17,745 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 3 on 60020 took 6901ms appending an edit to hlog;
editcount=98
2009-11-06 09:10:17,752 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 9 on 60020 took 6908ms appending an edit to hlog;
editcount=99
2009-11-06 09:10:20,047 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 2 on 60020 took 9203ms appending an edit to hlog;
editcount=100
2009-11-06 09:10:20,074 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Forced flushing of
fc_test,d96130bef65b02598c22140983e596f78cb3aac9249361054a3ac6ca287eb6ed,1257447874792
because global memstore limit of 2.3g exceeded; currently 2.3g and
flushing till 1.5g
2009-11-06 09:10:20,079 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 8 on 60020 took 9235ms appending an edit to hlog;
editcount=107
2009-11-06 09:10:20,081 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 6 on 60020 took 9238ms appending an edit to hlog;
editcount=108
.
.
.
2009-11-06 09:10:25,953 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 4 on 60020 took 15105ms appending an edit to hlog;
editcount=114
2009-11-06 09:10:44,071 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event,
state: Disconnected, type: None, path: null
2009-11-06 09:15:37,144 WARN org.apache.hadoop.hbase.util.Sleeper: We
slept 286007ms, ten times longer than scheduled: 10000
2009-11-06 09:15:37,146 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
-1103392202834949346 lease expired
2009-11-06 09:15:37,146 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError,
aborting.
java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hbase.client.Put.readFields(Put.java:342)
        at
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:421)
        at
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:350)
        at
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:399)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:174)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:867)


Upon restarting the regionserver the following messages are written to
the log file over and over:

2009-11-07 00:01:00,299 WARN org.apache.hadoop.hbase.regionserver.Store:
Exception processing reconstruction log
hdfs://hfs-027010:8020/hbase/fc_test/338686730/oldlogfile.log opening
data                                                    
org.apache.hadoop.fs.ChecksumException: Checksum error:
/blk_7784569113613463369:of:/hbase/fc_test/338686730/oldlogfile.log at
2818048                                  at
org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)                                                                                      
at
org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)                                                                              
at
org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)          
at
org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)          
at
org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1147)                                                                                      
at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1696)                                                                            
at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1746)                                                                                  
at java.io.DataInputStream.readFully(Unknown Source)                  
@                                                                      

My hbase settings are:

export HBASE_HEAPSIZE=6000
export HBASE_OPTS="$HBASE_OPTS -XX:NewSize=6m -XX:MaxNewSize=6m
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"

I have 16G of memory on each node with 6G allocated to the region server
and 2G allocated to the datanode.

Is there any way to salvage the contents of the log file or is my only
option to move it out of the way and loose all the edits?

Would it be possible to have the regionserver do a clean shutdown before
it goes down the GC spiral?

stack wrote:
> Just to say I'm starting up test now....  will report back.  Thanks for your
> patience.
> St.Ack
>
>
>  


Re: HBase Exceptions on version 0.20.1

by Andrew Purtell-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

You should consider provisioning more nodes to get beyond this ceiling you encountered.

DFS write latency spikes from 3 seconds to 6 seconds, to 15! Flushing cannot happen fast enough to avoid an OOME. Possibly there was even insufficient CPU to GC. The log entries you highlighted indicate the load you are exerting on your current cluster needs to be spread out over more resources than currently allocated.

This:
> 2009-11-06 09:15:37,144 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 286007ms, ten times longer than scheduled: 10000

indicates a thread that wanted to sleep for 10 seconds was starved for CPU for 286 seconds. Obviously Zookeeper timeouts and resulting HBase process shutdowns, missed DFS heartbeats possibly resulting in spurious declaration of dead datanodes, and other serious problems will result from this.

Did your systems start to swap?

When region servers shut down, the master notices this and splits their HLogs into per region reconstruction logs. These are the "oldlogfile.log" files. The master log will shed light on why this particular reconstruction log was botched. Would have happened at the master. The region server probably did do a clean shutdown. I suspect DFS was in extremis due to overloading so the split failed. The checksum error indicates incomplete write at the OS level. Did a datanode crash?

HBASE-1956 is about making the DFS latency metric exportable via the
Hadoop metrics layer, perhaps via Ganglia. Write latency above 1 or 2
seconds is a warning. Anything above 5 seconds is an alarm.  It's a
good indication that an overloading condition is in progress.

The Hadoop stack, being pre 1.0, has some rough edges. Response to overloading is one of them. For one thing, HBase could be better about applying backpressure to writing clients when the system is under stress. We will get there. HBASE-1956 is a start.

    - Andy





________________________________
From: elsif <elsif.then@...>
To: hbase-user@...
Sent: Sun, November 8, 2009 1:12:12 AM
Subject: Re: HBase Exceptions on version 0.20.1

My most recent test made it past a couple days, and then one of the
region servers failed during a compaction:

2009-11-06 09:09:59,284 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Start
ing compaction on region
fc_test,e9e8bab582b05d18d550d4112258b4c29707c33ebdd45447c20cce5c5a9e95a0,1257440539235
2009-11-06 09:09:59,294 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 6 on 60020 took 4611ms appending an edit to hlog;
editcount=66
2009-11-06 09:10:14,298 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 7 on 60020 took 3414ms appending an edit to hlog;
editcount=93
2009-11-06 09:10:17,729 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 7 on 60020 took 3430ms appending an edit to hlog;
editcount=94
2009-11-06 09:10:17,734 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 5 on 60020 took 6889ms appending an edit to hlog;
editcount=95
2009-11-06 09:10:17,739 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 1 on 60020 took 6894ms appending an edit to hlog;
editcount=96
2009-11-06 09:10:17,743 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 0 on 60020 took 6898ms appending an edit to hlog;
editcount=97
2009-11-06 09:10:17,745 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 3 on 60020 took 6901ms appending an edit to hlog;
editcount=98
2009-11-06 09:10:17,752 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 9 on 60020 took 6908ms appending an edit to hlog;
editcount=99
2009-11-06 09:10:20,047 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 2 on 60020 took 9203ms appending an edit to hlog;
editcount=100
2009-11-06 09:10:20,074 INFO
org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Forced flushing of
fc_test,d96130bef65b02598c22140983e596f78cb3aac9249361054a3ac6ca287eb6ed,1257447874792
because global memstore limit of 2.3g exceeded; currently 2.3g and
flushing till 1.5g
2009-11-06 09:10:20,079 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 8 on 60020 took 9235ms appending an edit to hlog;
editcount=107
2009-11-06 09:10:20,081 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 6 on 60020 took 9238ms appending an edit to hlog;
editcount=108
.
.
.
2009-11-06 09:10:25,953 WARN org.apache.hadoop.hbase.regionserver.HLog:
IPC Server handler 4 on 60020 took 15105ms appending an edit to hlog;
editcount=114
2009-11-06 09:10:44,071 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event,
state: Disconnected, type: None, path: null
2009-11-06 09:15:37,144 WARN org.apache.hadoop.hbase.util.Sleeper: We
slept 286007ms, ten times longer than scheduled: 10000
2009-11-06 09:15:37,146 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
-1103392202834949346 lease expired
2009-11-06 09:15:37,146 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError,
aborting.
java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hbase.client.Put.readFields(Put.java:342)
        at
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:421)
        at
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:350)
        at
org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:399)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:174)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:867)


Upon restarting the regionserver the following messages are written to
the log file over and over:

2009-11-07 00:01:00,299 WARN org.apache.hadoop.hbase.regionserver.Store:
Exception processing reconstruction log
hdfs://hfs-027010:8020/hbase/fc_test/338686730/oldlogfile.log opening
data                                                    
org.apache.hadoop.fs.ChecksumException: Checksum error:
/blk_7784569113613463369:of:/hbase/fc_test/338686730/oldlogfile.log at
2818048                                  at
org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)                                                                                      
at
org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)                                                                              
at
org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)          
at
org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)          
at
org.apache.hadoop.hdfs.DFSClient$BlockReader.read(DFSClient.java:1147)                                                                                      
at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1696)                                                                            
at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1746)                                                                                  
at java.io.DataInputStream.readFully(Unknown Source)                  
@                                                                      

My hbase settings are:

export HBASE_HEAPSIZE=6000
export HBASE_OPTS="$HBASE_OPTS -XX:NewSize=6m -XX:MaxNewSize=6m
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"

I have 16G of memory on each node with 6G allocated to the region server
and 2G allocated to the datanode.

Is there any way to salvage the contents of the log file or is my only
option to move it out of the way and loose all the edits?

Would it be possible to have the regionserver do a clean shutdown before
it goes down the GC spiral?

stack wrote:
> Just to say I'm starting up test now....  will report back.  Thanks for your
> patience.
> St.Ack
>
>
>  


     

Re: HBase Exceptions on version 0.20.1

by elsif :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


The larger issue here is that any hbase cluster will reach this tipping
point at some point in its lifetime as more and more data is added.  We
need to have a graceful method to put the cluster into safe mode until
more resources can be added or the load on the cluster has been
reduced.  We cannot allow hbase to run itself into the ground causing
data loss or corruption under any circumstances.
*
*
Andrew Purtell wrote:

> You should consider provisioning more nodes to get beyond this ceiling you encountered.
>
> DFS write latency spikes from 3 seconds to 6 seconds, to 15! Flushing cannot happen fast enough to avoid an OOME. Possibly there was even insufficient CPU to GC. The log entries you highlighted indicate the load you are exerting on your current cluster needs to be spread out over more resources than currently allocated.
>
> This:
>  
>> 2009-11-06 09:15:37,144 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 286007ms, ten times longer than scheduled: 10000
>>    
>
> indicates a thread that wanted to sleep for 10 seconds was starved for CPU for 286 seconds. Obviously Zookeeper timeouts and resulting HBase process shutdowns, missed DFS heartbeats possibly resulting in spurious declaration of dead datanodes, and other serious problems will result from this.
>
> Did your systems start to swap?
>
> When region servers shut down, the master notices this and splits their HLogs into per region reconstruction logs. These are the "oldlogfile.log" files. The master log will shed light on why this particular reconstruction log was botched. Would have happened at the master. The region server probably did do a clean shutdown. I suspect DFS was in extremis due to overloading so the split failed. The checksum error indicates incomplete write at the OS level. Did a datanode crash?
>
> HBASE-1956 is about making the DFS latency metric exportable via the
> Hadoop metrics layer, perhaps via Ganglia. Write latency above 1 or 2
> seconds is a warning. Anything above 5 seconds is an alarm.  It's a
> good indication that an overloading condition is in progress.
>
> The Hadoop stack, being pre 1.0, has some rough edges. Response to overloading is one of them. For one thing, HBase could be better about applying backpressure to writing clients when the system is under stress. We will get there. HBASE-1956 is a start.
>
>     - Andy
>
>  


Re: HBase Exceptions on version 0.20.1

by stack-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Agreed.  Please make an issue.

Meantime, it should be possible to have a cron run a script that checks
cluster resources from time-to-time -- e.g. how full hdfs is, how much each
regionserver is carrying -- and when it determines the needle is in the red,
flip the cluster to be read-only.

St.Ack

On Mon, Nov 9, 2009 at 9:25 AM, elsif <elsif.then@...> wrote:

>
> The larger issue here is that any hbase cluster will reach this tipping
> point at some point in its lifetime as more and more data is added.  We
> need to have a graceful method to put the cluster into safe mode until
> more resources can be added or the load on the cluster has been
> reduced.  We cannot allow hbase to run itself into the ground causing
> data loss or corruption under any circumstances.
> *
> *
> Andrew Purtell wrote:
> > You should consider provisioning more nodes to get beyond this ceiling
> you encountered.
> >
> > DFS write latency spikes from 3 seconds to 6 seconds, to 15! Flushing
> cannot happen fast enough to avoid an OOME. Possibly there was even
> insufficient CPU to GC. The log entries you highlighted indicate the load
> you are exerting on your current cluster needs to be spread out over more
> resources than currently allocated.
> >
> > This:
> >
> >> 2009-11-06 09:15:37,144 WARN org.apache.hadoop.hbase.util.Sleeper: We
> slept 286007ms, ten times longer than scheduled: 10000
> >>
> >
> > indicates a thread that wanted to sleep for 10 seconds was starved for
> CPU for 286 seconds. Obviously Zookeeper timeouts and resulting HBase
> process shutdowns, missed DFS heartbeats possibly resulting in spurious
> declaration of dead datanodes, and other serious problems will result from
> this.
> >
> > Did your systems start to swap?
> >
> > When region servers shut down, the master notices this and splits their
> HLogs into per region reconstruction logs. These are the "oldlogfile.log"
> files. The master log will shed light on why this particular reconstruction
> log was botched. Would have happened at the master. The region server
> probably did do a clean shutdown. I suspect DFS was in extremis due to
> overloading so the split failed. The checksum error indicates incomplete
> write at the OS level. Did a datanode crash?
> >
> > HBASE-1956 is about making the DFS latency metric exportable via the
> > Hadoop metrics layer, perhaps via Ganglia. Write latency above 1 or 2
> > seconds is a warning. Anything above 5 seconds is an alarm.  It's a
> > good indication that an overloading condition is in progress.
> >
> > The Hadoop stack, being pre 1.0, has some rough edges. Response to
> overloading is one of them. For one thing, HBase could be better about
> applying backpressure to writing clients when the system is under stress. We
> will get there. HBASE-1956 is a start.
> >
> >     - Andy
> >
> >
>
>

Re: HBase Exceptions on version 0.20.1

by Jonathan Gray-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

It's fairly easy to run HDFS into the ground if you eat up all the
resources.

It's also fairly easy to run a Linux machine into the ground if you eat
up all the resources;  or just about anything by starving it of CPU.

I don't disagree with a read-only mode if the server is full, but in
general believe admins of any production cluster need to be constantly
aware of capacity and usage and plan properly.  Not just so things don't
fill up, but also from a performance POV.

I completely disagree with the statement "any hbase cluster will reach
this tipping point at some point in its lifetime as more and more data
is added".  Anyone who continuously adds data without paying any
attention to capacity, then this will happen, but this is the case with
anything that has finite resources.


We should certainly do everything we can to prevent data loss and
corruption in all cases, but also must also set realistic expectations.
  One should not assume they can fill their HBase cluster to the brink
and for everything to always be okay (even if this is the case).

JG

stack wrote:

> Agreed.  Please make an issue.
>
> Meantime, it should be possible to have a cron run a script that checks
> cluster resources from time-to-time -- e.g. how full hdfs is, how much each
> regionserver is carrying -- and when it determines the needle is in the red,
> flip the cluster to be read-only.
>
> St.Ack
>
> On Mon, Nov 9, 2009 at 9:25 AM, elsif <elsif.then@...> wrote:
>
>> The larger issue here is that any hbase cluster will reach this tipping
>> point at some point in its lifetime as more and more data is added.  We
>> need to have a graceful method to put the cluster into safe mode until
>> more resources can be added or the load on the cluster has been
>> reduced.  We cannot allow hbase to run itself into the ground causing
>> data loss or corruption under any circumstances.
>> *
>> *
>> Andrew Purtell wrote:
>>> You should consider provisioning more nodes to get beyond this ceiling
>> you encountered.
>>> DFS write latency spikes from 3 seconds to 6 seconds, to 15! Flushing
>> cannot happen fast enough to avoid an OOME. Possibly there was even
>> insufficient CPU to GC. The log entries you highlighted indicate the load
>> you are exerting on your current cluster needs to be spread out over more
>> resources than currently allocated.
>>> This:
>>>
>>>> 2009-11-06 09:15:37,144 WARN org.apache.hadoop.hbase.util.Sleeper: We
>> slept 286007ms, ten times longer than scheduled: 10000
>>> indicates a thread that wanted to sleep for 10 seconds was starved for
>> CPU for 286 seconds. Obviously Zookeeper timeouts and resulting HBase
>> process shutdowns, missed DFS heartbeats possibly resulting in spurious
>> declaration of dead datanodes, and other serious problems will result from
>> this.
>>> Did your systems start to swap?
>>>
>>> When region servers shut down, the master notices this and splits their
>> HLogs into per region reconstruction logs. These are the "oldlogfile.log"
>> files. The master log will shed light on why this particular reconstruction
>> log was botched. Would have happened at the master. The region server
>> probably did do a clean shutdown. I suspect DFS was in extremis due to
>> overloading so the split failed. The checksum error indicates incomplete
>> write at the OS level. Did a datanode crash?
>>> HBASE-1956 is about making the DFS latency metric exportable via the
>>> Hadoop metrics layer, perhaps via Ganglia. Write latency above 1 or 2
>>> seconds is a warning. Anything above 5 seconds is an alarm.  It's a
>>> good indication that an overloading condition is in progress.
>>>
>>> The Hadoop stack, being pre 1.0, has some rough edges. Response to
>> overloading is one of them. For one thing, HBase could be better about
>> applying backpressure to writing clients when the system is under stress. We
>> will get there. HBASE-1956 is a start.
>>>     - Andy
>>>
>>>
>>
>

Re: HBase Exceptions on version 0.20.1

by Andrew Purtell-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Yes I agree with you.

Individually each HBase daemon can block clients and/or reject writes and/or defer compactions as write latency increases. This is a second order indication of cluster stress. By the time write latency falls outside bounds, it may be too late to shut down gracefully. But anyway, we should do these things.

Metrics can be used for HBase and applications/writers and cluster admins to globally know directly that the cluster is under stress in order to coordinate a response. Such a response could happen concurrently on multiple levels.

But getting back to your point, there are certainly opportunities for HBase to take self preserving actions where currently it does not.

    - Andy




________________________________
From: elsif <elsif.then@...>
To: hbase-user@...
Sent: Tue, November 10, 2009 1:25:20 AM
Subject: Re: HBase Exceptions on version 0.20.1


The larger issue here is that any hbase cluster will reach this tipping
point at some point in its lifetime as more and more data is added.  We
need to have a graceful method to put the cluster into safe mode until
more resources can be added or the load on the cluster has been
reduced.  We cannot allow hbase to run itself into the ground causing
data loss or corruption under any circumstances.
*
*
Andrew Purtell wrote:

> You should consider provisioning more nodes to get beyond this ceiling you encountered.
>
> DFS write latency spikes from 3 seconds to 6 seconds, to 15! Flushing cannot happen fast enough to avoid an OOME. Possibly there was even insufficient CPU to GC. The log entries you highlighted indicate the load you are exerting on your current cluster needs to be spread out over more resources than currently allocated.
>
> This:
>  
>> 2009-11-06 09:15:37,144 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 286007ms, ten times longer than scheduled: 10000
>>    
>
> indicates a thread that wanted to sleep for 10 seconds was starved for CPU for 286 seconds. Obviously Zookeeper timeouts and resulting HBase process shutdowns, missed DFS heartbeats possibly resulting in spurious declaration of dead datanodes, and other serious problems will result from this.
>
> Did your systems start to swap?
>
> When region servers shut down, the master notices this and splits their HLogs into per region reconstruction logs. These are the "oldlogfile.log" files. The master log will shed light on why this particular reconstruction log was botched. Would have happened at the master. The region server probably did do a clean shutdown. I suspect DFS was in extremis due to overloading so the split failed. The checksum error indicates incomplete write at the OS level. Did a datanode crash?
>
> HBASE-1956 is about making the DFS latency metric exportable via the
> Hadoop metrics layer, perhaps via Ganglia. Write latency above 1 or 2
> seconds is a warning. Anything above 5 seconds is an alarm.  It's a
> good indication that an overloading condition is in progress.
>
> The Hadoop stack, being pre 1.0, has some rough edges. Response to overloading is one of them. For one thing, HBase could be better about applying backpressure to writing clients when the system is under stress. We will get there. HBASE-1956 is a start.
>
>     - Andy
>
>  


     

Re: HBase Exceptions on version 0.20.1

by Andrew Purtell-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hi JG,

I fully agree that administration of Hadoop/HDFS/HBase clusters is an active process. HDFS and HBase (and mapreduce, etc.) are easily scaled horizontally, but it requires an
admin to actively make available more resources and join them to the cluster in time to meet demand. We just expect these things because we anticipate terascale or petascale clusters of many nodes, which require professional administration.

One can corrupt an e.g. Oracle database if you overcommit memory on the server and the kernel panics in get_free_page...

However, HBase has opportunities to sense overloading and take self preserving actions that is is not taking advantage of. We can work toward that.

But, it will always be the case that in some circumstances without operational care it may be too late when HBase is aware of a problem for it to effectively manage the situation.

   - Andy




________________________________
From: Jonathan Gray <jlist@...>
To: hbase-user@...
Sent: Tue, November 10, 2009 3:23:20 AM
Subject: Re: HBase Exceptions on version 0.20.1

It's fairly easy to run HDFS into the ground if you eat up all the resources.

It's also fairly easy to run a Linux machine into the ground if you eat up all the resources;  or just about anything by starving it of CPU.

I don't disagree with a read-only mode if the server is full, but in general believe admins of any production cluster need to be constantly aware of capacity and usage and plan properly.  Not just so things don't fill up, but also from a performance POV.

I completely disagree with the statement "any hbase cluster will reach this tipping point at some point in its lifetime as more and more data is added".  Anyone who continuously adds data without paying any attention to capacity, then this will happen, but this is the case with anything that has finite resources.


We should certainly do everything we can to prevent data loss and corruption in all cases, but also must also set realistic expectations.  One should not assume they can fill their HBase cluster to the brink and for everything to always be okay (even if this is the case).

JG

stack wrote:

> Agreed.  Please make an issue.
>
> Meantime, it should be possible to have a cron run a script that checks
> cluster resources from time-to-time -- e.g. how full hdfs is, how much each
> regionserver is carrying -- and when it determines the needle is in the red,
> flip the cluster to be read-only.
>
> St.Ack
>
> On Mon, Nov 9, 2009 at 9:25 AM, elsif <elsif.then@...> wrote:
>
>> The larger issue here is that any hbase cluster will reach this tipping
>> point at some point in its lifetime as more and more data is added.  We
>> need to have a graceful method to put the cluster into safe mode until
>> more resources can be added or the load on the cluster has been
>> reduced.  We cannot allow hbase to run itself into the ground causing
>> data loss or corruption under any circumstances.
>> *
>> *
>> Andrew Purtell wrote:
>>> You should consider provisioning more nodes to get beyond this ceiling
>> you encountered.
>>> DFS write latency spikes from 3 seconds to 6 seconds, to 15! Flushing
>> cannot happen fast enough to avoid an OOME. Possibly there was even
>> insufficient CPU to GC. The log entries you highlighted indicate the load
>> you are exerting on your current cluster needs to be spread out over more
>> resources than currently allocated.
>>> This:
>>>
>>>> 2009-11-06 09:15:37,144 WARN org.apache.hadoop.hbase.util.Sleeper: We
>> slept 286007ms, ten times longer than scheduled: 10000
>>> indicates a thread that wanted to sleep for 10 seconds was starved for
>> CPU for 286 seconds. Obviously Zookeeper timeouts and resulting HBase
>> process shutdowns, missed DFS heartbeats possibly resulting in spurious
>> declaration of dead datanodes, and other serious problems will result from
>> this.
>>> Did your systems start to swap?
>>>
>>> When region servers shut down, the master notices this and splits their
>> HLogs into per region reconstruction logs. These are the "oldlogfile.log"
>> files. The master log will shed light on why this particular reconstruction
>> log was botched. Would have happened at the master. The region server
>> probably did do a clean shutdown. I suspect DFS was in extremis due to
>> overloading so the split failed. The checksum error indicates incomplete
>> write at the OS level. Did a datanode crash?
>>> HBASE-1956 is about making the DFS latency metric exportable via the
>>> Hadoop metrics layer, perhaps via Ganglia. Write latency above 1 or 2
>>> seconds is a warning. Anything above 5 seconds is an alarm.  It's a
>>> good indication that an overloading condition is in progress.
>>>
>>> The Hadoop stack, being pre 1.0, has some rough edges. Response to
>> overloading is one of them. For one thing, HBase could be better about
>> applying backpressure to writing clients when the system is under stress. We
>> will get there. HBASE-1956 is a start.
>>>     - Andy
>>>
>>>
>>
>