Region server going down

Hi,

Today one regionserver crashed and I can't figure out why. Everything started with the message "server,60020,1255644477834 znode expired". I'm still running the cluster on little memory, and swap gets in my way from time to time (it's rare, but I need to fix it). Could that be the cause of the error below? Do you think five minutes is enough for the zookeeper.session.timeout property? And why do I get the message "wrong key class: org.apache.hadoop.hbase.regionserver.HLogKey is not class"?

My tests show that whenever ZooKeeper "shakes", the whole cluster goes down. Shouldn't HBase be more robust with regard to ZooKeeper? Something like a retry strategy...

Lucas

2009-10-16 15:07:32,167 INFO org.apache.hadoop.hbase.master.ServerManager: 2 region servers, 0 dead, average load 7.0
2009-10-16 15:07:32,537 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scanning meta region {server: 192.168.1.2:60020, regionname: -ROOT-,,0, startKey: <>}
2009-10-16 15:07:32,560 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scan of 1 row(s) of meta region {server: 192.168.1.2:60020, regionname: -ROOT-,,0, startKey: <>} complete
2009-10-16 15:07:32,654 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scanning meta region {server: 192.168.1.3:60020, regionname: .META.,,1, startKey: <>}
2009-10-16 15:07:32,804 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scan of 12 row(s) of meta region {server: 192.168.1.3:60020, regionname: .META.,,1, startKey: <>} complete
2009-10-16 15:07:32,804 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s) scanned
2009-10-16 15:08:09,551 INFO org.apache.hadoop.hbase.master.ServerManager: server,60020,1255644477834 znode expired
2009-10-16 15:08:09,605 INFO org.apache.hadoop.hbase.master.RegionManager: -ROOT- region unset (but not set to be reassigned)
2009-10-16 15:08:09,605 INFO org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of server server,60020,1255644477834: logSplit: false, rootRescanned: false, numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
2009-10-16 15:08:09,623 INFO org.apache.hadoop.hbase.regionserver.HLog: Splitting 20 hlog(s) in hdfs://server2:9000/hbase/.logs/server,60020,1255644477834
2009-10-16 15:08:09,841 WARN org.apache.hadoop.hbase.regionserver.HLog: Exception processing hdfs://server2:9000/hbase/.logs/server,60020,1255644477834/hlog.dat.1255644478353 -- continuing. Possible DATA LOSS!
java.io.IOException: wrong key class: org.apache.hadoop.hbase.regionserver.HLogKey is not class org.apache.hadoop.hbase.regionserver.transactional.THLogKey
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1824)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1876)
    at org.apache.hadoop.hbase.regionserver.HLog.splitLog(HLog.java:896)
    at org.apache.hadoop.hbase.regionserver.HLog.splitLog(HLog.java:802)
    at org.apache.hadoop.hbase.master.ProcessServerShutdown.process(ProcessServerShutdown.java:274)
    at org.apache.hadoop.hbase.master.HMaster.processToDoQueue(HMaster.java:490)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:425)
2009-10-16 15:08:09,870 WARN org.apache.hadoop.hbase.regionserver.HLog: Exception processing hdfs://server2:9000/hbase/.logs/server,60020,1255644477834/hlog.dat.1255648058463 -- continuing. Possible DATA LOSS!
java.io.IOException: wrong key class: org.apache.hadoop.hbase.regionserver.HLogKey is not class org.apache.hadoop.hbase.regionserver.transactional.THLogKey
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1824)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1876)
    at org.apache.hadoop.hbase.regionserver.HLog.splitLog(HLog.java:896)
    at org.apache.hadoop.hbase.regionserver.HLog.splitLog(HLog.java:802)
    at org.apache.hadoop.hbase.master.ProcessServerShutdown.process(ProcessServerShutdown.java:274)
    at org.apache.hadoop.hbase.master.HMaster.processToDoQueue(HMaster.java:490)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:425)
2009-10-16 15:08:09,886 WARN org.apache.hadoop.hbase.regionserver.HLog: Exception processing hdfs://server2:9000/hbase/.logs/server,60020,12556

// More wrong key class errors...

2009-10-16 15:08:10,203 INFO org.apache.hadoop.hbase.regionserver.HLog: hlog file splitting completed in 594 millis for hdfs://server2:9000/hbase/.logs/server,60020,1255644477834
2009-10-16 15:08:10,203 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Log split complete, meta reassignment and scanning:
2009-10-16 15:08:10,203 INFO org.apache.hadoop.hbase.master.RegionServerOperation: ProcessServerShutdown reassigning ROOT region
2009-10-16 15:08:10,203 INFO org.apache.hadoop.hbase.master.RegionManager: -ROOT- region unset (but not set to be reassigned)
2009-10-16 15:08:10,203 INFO org.apache.hadoop.hbase.master.RegionManager: ROOT inserted into regionsInTransition
2009-10-16 15:08:32,167 INFO org.apache.hadoop.hbase.master.ServerManager: 1 region servers, 1 dead, average load 6.0[server,60020,1255644477834]
Re: Region server going down

Hey,

ZooKeeper is a pretty fundamental part of how we make things happen in HBase. The problem is what happens when you lose your session, because the session is how we synchronize between the master and the regionserver. At that point neither side knows what the other knows, and the safest thing is to abort the regionserver. Without that, we can end up with multiple region assignments, which is pretty messy.

ZK is like DNS and the network: without it running, we are more or less in trouble. There is no effective difference between a crashed machine and one that is having network problems, so they are treated the same and recovery is the same.

Having said that, the session timeout is set in HBase, and I think it ships at 40 seconds or so. So it should take more than a minor problem or a few lost packets to induce a crash. That said, if you are killing the entire ZK cluster and expecting HBase to be OK, that is not really what will happen. This is why ZK is run in a 2N+1 configuration, so you can do rolling reboots and survive the loss of N machines. But ZK is required to be up 24/7; luckily it is fairly reliable.

With HDFS 0.21, at least we'll be able to have effective hlog recovery.

Now, your specific problem looks like a common issue with the master and regionservers being confused about what type of server they are running. I don't personally run the indexed or transactional extensions (they are not as inherently scalable), so maybe someone else can chime in.

-ryan

On Fri, Oct 16, 2009 at 1:29 PM, Lucas Nazário dos Santos <nazario.lucas@...> wrote:
> My tests show that whenever zookeeper "shakes" the whole cluster goes down.
> Shouldn't HBase be more robust regarding Zookeeper? Something like a retry
> strategy...
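As an aside on the 2N+1 sizing mentioned in the reply above, here is a minimal Python sketch of the majority arithmetic. The function names are hypothetical and this is not ZooKeeper's API; it only assumes that an ensemble stays available while a strict majority of its servers is up.

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can be lost while a strict majority remains."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    # For an ensemble of 2N+1 servers this evaluates to N.
    return (ensemble_size - 1) // 2


def has_quorum(ensemble_size: int, servers_up: int) -> bool:
    """True when more than half of the ensemble is reachable."""
    return servers_up > ensemble_size // 2
```

Under this model a 3-server ensemble (N = 1) survives one lost server and a 5-server ensemble (N = 2) survives two, while a fourth server adds no extra tolerance over three. Taking every server down at once leaves no majority, which matches the point that killing the whole ZK cluster is not survivable.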
Re: Region server going down

Thanks a lot, Ryan. Your explanation was very helpful. It's not the first time I've seen someone say that the indexed option is not "as inherently scalable". I'll remove it and take care of my indexes manually. Also, I need to fix the swap problem.

Lucas

On Fri, Oct 16, 2009 at 10:12 PM, Ryan Rawson <ryanobjc@...> wrote:
> I don't personally run the indexed or transactional
> extensions (they are not as inherently scalable), so maybe someone
> else can chime in.
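One possible link between the swap problem and the "znode expired" message is a process that stalls for longer than the session timeout. The sketch below is a toy model in Python, not ZooKeeper's API: the function name, the heartbeat list, and the numbers are all hypothetical. It assumes only that a session expires when the server hears nothing from the client for longer than the timeout.

```python
from typing import Optional, Sequence


def first_expiry(heartbeats_ms: Sequence[int], timeout_ms: int) -> Optional[int]:
    """Return the time (ms) at which the session would expire, or None.

    heartbeats_ms: sorted times at which the server heard from the client.
    A gap between consecutive heartbeats longer than timeout_ms is treated
    as a session expiry, measured from the last heartbeat before the gap.
    """
    for prev, nxt in zip(heartbeats_ms, heartbeats_ms[1:]):
        if nxt - prev > timeout_ms:
            return prev + timeout_ms
    return None


# Hypothetical trace: regular heartbeats, then a 50-second stall.
trace = [0, 10_000, 20_000, 70_000]
expired_at_40s = first_expiry(trace, 40_000)    # timeout of about 40 seconds
expired_at_5min = first_expiry(trace, 300_000)  # timeout of five minutes
```

In this model the 50-second stall expires a 40-second session but not a five-minute one. A longer timeout hides such stalls at the price of slower detection of servers that really are dead, so fixing the swapping itself is still the more direct remedy.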
Re: Region server going down

In your first post, you are hitting 1858. It is fixed in trunk and in the 0.20 branch, but you will need to add the config value to recover from the WAL.

I take issue with Ryan's handwavy statement about the index/trx extensions not being scalable. With indexing you pay an extra cost on puts, which is essentially a constant * number of indexes. But this still scales with the number of rows/requests. If you want those indexes, then you will have to pay that maintenance cost. And putting the maintenance in the regionserver makes the gets needed to rebuild the indexes a bit cheaper.

Trx is a different story; it really depends on your workloads. But if you have lots of small requests that don't often interfere with each other, then it should scale.

On Mon, Oct 19, 2009 at 3:42 AM, Lucas Nazário dos Santos <nazario.lucas@...> wrote:
> It's not the first time
> that I see someone saying that the indexed option is not "as inherently
> scalable". I'll remove it and take care of my indexes manually.
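The cost argument above ("a constant * number of indexes") can be put into a small Python sketch. The names are hypothetical, and the constant is an assumption: it defaults to one extra write per index per put, which is not a measured figure for the HBase indexed extension.

```python
def writes_per_put(num_indexes: int, writes_per_index: int = 1) -> int:
    """Writes caused by one put: the row itself plus index maintenance."""
    if num_indexes < 0 or writes_per_index < 0:
        raise ValueError("counts must be non-negative")
    return 1 + writes_per_index * num_indexes


def total_writes(num_puts: int, num_indexes: int, writes_per_index: int = 1) -> int:
    """Total writes for a workload; linear in num_puts for fixed indexes."""
    return num_puts * writes_per_put(num_indexes, writes_per_index)
```

With three indexes, every put costs four writes instead of one in this model, so the overhead is a fixed multiplier. Doubling the number of puts doubles the total either way, which is the sense in which the indexed path still scales with the number of rows and requests.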