WrongRegionException: How do I recover?

WrongRegionException: How do I recover?

by Joost Ouwerkerk-2

HBase has started throwing WrongRegionExceptions at me when trying to access
certain regions.  I'm guessing that the META table has somehow gone out of
sync with reality.  I've tried compacting and I've tried restarting, but the
problem does not go away.  The errors are always on the same regions.  Has
anyone else seen this and succeeded at getting their table back into working
order?

*Example get:*

org.apache.hadoop.hbase.regionserver.WrongRegionException:
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for HRegion
crawled_pages,r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F,1256932686084,
startKey='r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F',
getEndKey()='r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fhermosa-beach\x2Fall-cuisines\x2Ftags\x2Foutdoor-dining\x2F',
row='r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Finglewood\x2Fall-cuisines\x2F'
    at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:1522)
    at org.apache.hadoop.hbase.regionserver.HRegion.obtainRowLock(HRegion.java:1554)
    at org.apache.hadoop.hbase.regionserver.HRegion.getLock(HRegion.java:1622)
    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:2278)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1785)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

*Example put:*

put 'crawled_pages','r:http://com.xxxx.yyyy/restaurants/all-areas/inglewood/all-cuisines/', 'curi:test','test'
NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=4, numtries=5, i=0, listsize=1,
region=crawled_pages,r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F,1256932686084
for region
crawled_pages,r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F,1256932686084,
row
'r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Finglewood\x2Fall-cuisines\x2F',
but failed after 5 attempts.
Exceptions:

    from org/apache/hadoop/hbase/client/HConnectionManager.java:1119:in `process'
    from org/apache/hadoop/hbase/client/HConnectionManager.java:1200:in `processBatchOfRows'
    from org/apache/hadoop/hbase/client/HTable.java:605:in `flushCommits'
    from org/apache/hadoop/hbase/client/HTable.java:470:in `put'
    from org/apache/hadoop/hbase/client/HTable.java:1761:in `commit'
    from org/apache/hadoop/hbase/client/HTable.java:1742:in `commit'
    from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
    from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
    from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
    from java/lang/reflect/Method.java:597:in `invoke'
    from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
    from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
    from org/jruby/java/invokers/InstanceMethodInvoker.java:44:in `call'
    from org/jruby/runtime/callsite/CachingCallSite.java:110:in `call'
    from org/jruby/ast/CallOneArgNode.java:57:in `interpret'
    from org/jruby/ast/NewlineNode.java:104:in `interpret'

Re: WrongRegionException: How do I recover?

by stack-3

Meta is giving out the wrong address for a region?  Do a scan of .META.  It
might be easier dumping the scan into a file so you can grep around:

echo "scan '.META.'" | ./bin/hbase shell --format-width=300 &> /tmp/meta.txt

Grep in here for the region that contains the row you are looking for.  What
does it have for info:server?  Go to that regionserver (UI or log).  Is it
carrying the region?  If not, that's what the WrongRegionException (WRE) is about.
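
If grep gets unwieldy on long row names, the same lookup can be sketched in a few lines of Python. This is only an illustration: the function name `servers_for_region`, the sample lines, and the host addresses are made up, and it assumes each cell in the dump sits on one line in the form `row column=info:server, timestamp=N, value=host:port`. Check that against your own /tmp/meta.txt before relying on it.

```python
def servers_for_region(dump_lines, region_prefix):
    """Collect info:server values from scan lines whose row starts with region_prefix."""
    servers = []
    for line in dump_lines:
        row = line.strip()
        if not row.startswith(region_prefix) or "column=info:server," not in row:
            continue
        # Assumed cell layout: "... column=info:server, timestamp=N, value=host:port"
        _, _, value = row.partition("value=")
        servers.append(value.strip())
    return servers


# Hypothetical sample lines standing in for the contents of /tmp/meta.txt.
SAMPLE = [
    " crawled_pages,AAA,1256932686084 column=info:regioninfo, timestamp=1256932686084, value=REGION => {...}",
    " crawled_pages,AAA,1256932686084 column=info:server, timestamp=1256932686084, value=10.0.0.9:60020",
    " crawled_pages,DDD,1256932686999 column=info:server, timestamp=1256932686999, value=10.0.0.7:60020",
]

print(servers_for_region(SAMPLE, "crawled_pages,AAA,"))
```

For a real dump, pass the open file instead of SAMPLE, for example `servers_for_region(open("/tmp/meta.txt"), "crawled_pages,r:http:")`.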

For the same region, grep for its name in the master log (hopefully you have
DEBUG enabled).

What's its history?  Could it have been assigned to one server and then
another?

If so, close the region in both places.  Type 'tools' in the shell to see
the doc on the "close_region" command.  You can pass it the server to send
the close message to.  Close in both places.

If it's a double-assignment issue, our name for the above phenomenon, I suggest
you upgrade to 0.20.1.  It has at least one pointed fix for this scenario
(HBASE-1878).

St.Ack


On Wed, Nov 4, 2009 at 12:35 PM, Joost Ouwerkerk <joost@...> wrote:

> [...]

Re: WrongRegionException: How do I recover?

by Joost Ouwerkerk-2

I investigated following your guidance, Stack.  Unfortunately I am not
seeing evidence of double assignment. It looks more like a case of missing
assignment.  There appear to be key ranges that are not represented in the
.META. table.  So, I have a region that handles keys AAA to BBB, and the
next region handles DDD to EEE.  Now when I try to access key CCC, I get
routed to the region that handles AAA to BBB, presumably because my key is
after AAA and before DDD.  Then HRegion.checkRow fails because the requested
key is outside the region's range.
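
To make the AAA/BBB/DDD/EEE scenario concrete, here is a minimal Python sketch of what I believe is happening. It is not HBase code: `REGIONS`, `locate_region` and `check_row` are hypothetical stand-ins, and the sketch assumes the lookup picks the region whose start key is the closest one at or before the requested row, and that the region then checks start key inclusive, end key exclusive.

```python
from bisect import bisect_right

# Hypothetical stand-in for .META.: (start_key, end_key) pairs sorted by start key.
REGIONS = [("AAA", "BBB"), ("DDD", "EEE")]


def locate_region(regions, row):
    """Return the region whose start key is the closest one at or before row."""
    starts = [start for start, _ in regions]
    index = bisect_right(starts, row) - 1
    if index < 0:
        raise LookupError("no region starts at or before %r" % row)
    return regions[index]


def check_row(region, row):
    """Mimic the range check: start key inclusive, end key exclusive."""
    start, end = region
    # An empty end key is treated as "no upper bound" (last region of a table).
    if row < start or (end != "" and row >= end):
        raise ValueError(
            "Requested row out of range: %r not in [%r, %r)" % (row, start, end)
        )


region = locate_region(REGIONS, "CCC")
print(region)  # the lookup lands on the AAA to BBB region
try:
    check_row(region, "CCC")
except ValueError as err:
    print(err)
```

With a hole between BBB and DDD, the lookup still returns a region, so the failure only shows up at the range check on the region server.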

Consider this error:

org.apache.hadoop.hbase.regionserver.WrongRegionException:
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for HRegion
crawled_pages,r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fbasil-in-the-grove,
startKey='r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fbasil-in-the-grove',
getEndKey()='r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Feast-broward',
row='r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fhavana-hideout'

As the error points out, the requested row is outside the range for the
region.  In the .META. table, the next region starts at
'r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fpashas-3'.  The requested row
falls after one region's end key and before the next region's start key.
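
A rough way to look for such holes across a whole table is to sort the regions by start key and compare each end key with the next start key. The Python sketch below does that for the two regions in this example. `find_holes` and `PREFIX` are made-up names, the keys are written unescaped for readability, and the end key of the second region is not known from the error, so it is left empty as a placeholder.

```python
PREFIX = "r:http://com.xxx.yyy/restaurants/"

# (start_key, end_key) pairs as read from .META.; the second end key is a
# placeholder because it does not appear in the error message.
REGIONS = [
    (PREFIX + "basil-in-the-grove", PREFIX + "east-broward"),
    (PREFIX + "pashas-3", ""),
]


def find_holes(regions):
    """Return (end_key, next_start_key) pairs where consecutive regions leave a gap."""
    ordered = sorted(regions)
    holes = []
    for (_, end), (next_start, _) in zip(ordered, ordered[1:]):
        # Plain string comparison matches byte order for ASCII keys like these.
        if end < next_start:
            holes.append((end, next_start))
    return holes


holes = find_holes(REGIONS)
row = PREFIX + "havana-hideout"
for end, next_start in holes:
    print("hole between %r and %r" % (end, next_start))
    print("requested row falls in it:", end <= row < next_start)
```

Any row that sorts into one of the reported gaps would hit the same exception as the havana-hideout row.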

jo

On Wed, Nov 4, 2009 at 4:56 PM, stack <stack@...> wrote:

> [...]

Re: WrongRegionException: How do I recover?

by Joost Ouwerkerk-2

Is there a way to rebuild the META?  I'm really hoping there's no data loss
here, and it's just a question of META being out of sync with data...
jo

On Wed, Nov 4, 2009 at 7:07 PM, Joost Ouwerkerk <joost@...> wrote:

> [...]

Re: WrongRegionException: How do I recover?

by Jean-Daniel Cryans-2

Today on the IRC channel we fixed it with Joost using Stack's tool in
HBASE-1867. This was caused by a file going missing in the META table
and we are still investigating why it happened.

So Joost, could you send us your NN's log so we can grep for the file names?

Thx,

J-D

On Thu, Nov 5, 2009 at 11:08 AM, Joost Ouwerkerk <joost@...> wrote:

> [...]

Re: WrongRegionException: How do I recover?

by stack-3

Can we have the factory09 datanode log too?  (Sorry, we need all the pieces
because not much info is available when running at INFO level, especially
when debugging stuff like this.)
St.Ack


On Thu, Nov 5, 2009 at 3:56 PM, Jean-Daniel Cryans <jdcryans@...>wrote:

> Today on the IRC channel we fixed it with Joost using Stack's tool in
> HBASE-1867. This was caused by a file going missing in the META table
> and we are still investigating why it happened.
>
> So Joost, could you send us your NN's log so we can grep for the file
> names?
>
> Thx,
>
> J-D
>
> On Thu, Nov 5, 2009 at 11:08 AM, Joost Ouwerkerk <joost@...>
> wrote:
> > Is there a way to rebuild the META?  I'm really hoping there's no data
> loss
> > here, and it's just a question of META being out of sync with data...
> > jo
> >
> > On Wed, Nov 4, 2009 at 7:07 PM, Joost Ouwerkerk <joost@...
> >wrote:
> >
> >> I investigated following your guidance, Stack.  Unfortunately I am not
> >> seeing evidence of double assignment. It looks more like a case of
> missing
> >> assignment.  There appear to be key ranges that are not represented in
> the
> >> .META. table.  So, I have a region that handles keys AAA to BBB, and the
> >> next region handles DDD to EEE.  Now when I try to access key CCC, I get
> >> routed to the region that handles AAA to BBB, presumably because my key
> is
> >> after AAA and before DDD.  Then HRegion.checkRow fails because the
> requested
> >> key is outside the region's range.
> >>
> >> Consider this error:
> >>
> >> org.apache.hadoop.hbase.regionserver.WrongRegionException:
> >> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row
> >> out of range for HRegion
> >> crawled_pages,r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fbasil-in-the-grove,
> >> startKey='r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fbasil-in-the-grove',
> >> getEndKey()='r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Feast-broward',
> >> row='r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fhavana-hideout'
> >>
> >> As the error points out, the requested row is outside the range for the
> >> region.  In the .META. table, the next region starts at
> >> 'r:http:\x2F\x2Fcom.xxx.yyy\x2Frestaurants\x2Fpashas-3'.  The request row
> >> falls after one region's End key, and before the next region's Start key.
> >>
> >> jo
> >>
> >>
> >> On Wed, Nov 4, 2009 at 4:56 PM, stack <stack@...> wrote:
> >>
> >>> Meta is giving out the wrong address for a region?  Do a scan of .META.
> >>> It might be easier to dump the scan into a file so you can grep around:
> >>>
> >>> echo "scan '.META.'" | ./bin/hbase shell --format-width=300 &> /tmp/meta.txt
> >>>
> >>> Grep in there for the region that contains the row you are looking for.
> >>> What does it have for info:server?  Go to that regionserver (UI or log).
> >>> Is it carrying the region?  If not, that's what the WRE is about.
> >>>
> >>> For the same region, grep its name in the master log (hopefully you have
> >>> DEBUG enabled).
> >>>
> >>> What's its history?  Could it have been assigned to one server and then
> >>> another?
> >>>
> >>> If so, close the region in both places.  Type 'tools' in the shell to see
> >>> the doc on the "close_region" command.  You can pass it the server to send
> >>> the close message to.  Close in both places.
> >>>
> >>> If it's a double-assignment issue, our name for the above phenomenon, I
> >>> suggest you upgrade to 0.20.1.  It has at least one pointed fix for this
> >>> scenario (HBASE-1878).
> >>>
> >>> St.Ack
> >>>
> >>>
> >>> On Wed, Nov 4, 2009 at 12:35 PM, Joost Ouwerkerk <joost@...> wrote:
> >>>
> >>> > HBase has started throwing WrongRegionExceptions at me when trying to
> >>> > access certain regions.  I'm guessing that the META table has somehow
> >>> > gone out of sync with reality.  I've tried compacting and I've tried
> >>> > restarting, but the problem does not go away.  The errors are always on
> >>> > the same regions.  Has anyone else seen this and succeeded at getting
> >>> > their table back into working order?
> >>> >
> >>> > *Example get:*
> >>> >
> >>> > org.apache.hadoop.hbase.regionserver.WrongRegionException:
> >>> > org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row
> >>> > out of range for HRegion
> >>> > crawled_pages,r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F,1256932686084,
> >>> > startKey='r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F',
> >>> > getEndKey()='r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fhermosa-beach\x2Fall-cuisines\x2Ftags\x2Foutdoor-dining\x2F',
> >>> > row='r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Finglewood\x2Fall-cuisines\x2F'
> >>> >    at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:1522)
> >>> >    at org.apache.hadoop.hbase.regionserver.HRegion.obtainRowLock(HRegion.java:1554)
> >>> >    at org.apache.hadoop.hbase.regionserver.HRegion.getLock(HRegion.java:1622)
> >>> >    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:2278)
> >>> >    at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1785)
> >>> >    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
> >>> >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>> >    at java.lang.reflect.Method.invoke(Method.java:597)
> >>> >    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
> >>> >    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> >>> >
> >>> > *Example put:*
> >>> > put 'crawled_pages','r:http://com.xxxx.yyyy/restaurants/all-areas/inglewood/all-cuisines/', 'curi:test','test'
> >>> > NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> >>> > Trying to contact region server Some server, retryOnlyOne=true, index=0,
> >>> > islastrow=true, tries=4, numtries=5, i=0, listsize=1,
> >>> > region=crawled_pages,r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F,1256932686084
> >>> > for region
> >>> > crawled_pages,r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Fbeverly-hills\x2Fall-cuisines\x2Ftags\x2Flunch\x2F2\x2F,1256932686084,
> >>> > row
> >>> > 'r:http:\x2F\x2Fcom.xxxx.yyyy\x2Frestaurants\x2Fall-areas\x2Finglewood\x2Fall-cuisines\x2F',
> >>> > but failed after 5 attempts.
> >>> > Exceptions:
> >>> >
> >>> >    from org/apache/hadoop/hbase/client/HConnectionManager.java:1119:in `process'
> >>> >    from org/apache/hadoop/hbase/client/HConnectionManager.java:1200:in `processBatchOfRows'
> >>> >    from org/apache/hadoop/hbase/client/HTable.java:605:in `flushCommits'
> >>> >    from org/apache/hadoop/hbase/client/HTable.java:470:in `put'
> >>> >    from org/apache/hadoop/hbase/client/HTable.java:1761:in `commit'
> >>> >    from org/apache/hadoop/hbase/client/HTable.java:1742:in `commit'
> >>> >    from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
> >>> >    from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
> >>> >    from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
> >>> >    from java/lang/reflect/Method.java:597:in `invoke'
> >>> >    from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
> >>> >    from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
> >>> >    from org/jruby/java/invokers/InstanceMethodInvoker.java:44:in `call'
> >>> >    from org/jruby/runtime/callsite/CachingCallSite.java:110:in `call'
> >>> >    from org/jruby/ast/CallOneArgNode.java:57:in `interpret'
> >>> >    from org/jruby/ast/NewlineNode.java:104:in `interpret'
> >>> >
> >>>
> >>
> >>
> >
>
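
The missing-assignment scenario Joost describes in the quoted thread (one region covering AAA to BBB, the next covering DDD to EEE, and a request for CCC) can be reproduced in a few lines. What follows is a minimal Python sketch, not HBase code. The region list and the functions locate_region, check_row and find_holes are hypothetical names made up for illustration. Keys are compared as plain strings (HBase compares bytes, which gives the same order for ASCII keys), every region except possibly the last is assumed to have a non-empty end key, and the lookup rule (pick the region with the greatest start key at or before the row) is the behaviour the thread presumes rather than something verified here.

```python
from bisect import bisect_right

# Hypothetical (start_key, end_key) pairs sorted by start key, mirroring the
# AAA/BBB/DDD/EEE example from the thread. The range BBB..DDD is not covered.
REGIONS = [("AAA", "BBB"), ("DDD", "EEE")]


def locate_region(regions, row):
    """Return the region whose start key is the greatest one <= row.

    This models the presumed catalog lookup: only start keys are consulted,
    so a row that falls into a hole is still routed to the preceding region.
    """
    starts = [start for start, _ in regions]
    idx = bisect_right(starts, row) - 1
    if idx < 0:
        raise KeyError("row %r sorts before the first region" % row)
    return regions[idx]


def check_row(region, row):
    """Model the server-side range check: start <= row < end.

    An empty end key is treated as "no upper bound" (assumption).
    """
    start, end = region
    return start <= row and (end == "" or row < end)


def find_holes(regions):
    """Return (end_key, next_start_key) pairs where a key range is uncovered."""
    holes = []
    for (_, end), (next_start, _) in zip(regions, regions[1:]):
        if end < next_start:
            holes.append((end, next_start))
    return holes


if __name__ == "__main__":
    region = locate_region(REGIONS, "CCC")
    print("CCC routed to", region)
    print("row in range:", check_row(region, "CCC"))
    print("holes:", find_holes(REGIONS))
```

In this model the request for CCC is routed to the AAA to BBB region and then fails the range check, which matches the WrongRegionException in the thread. Feeding find_holes the start and end keys taken from a .META. scan would be one way to list the uncovered ranges.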