Records missing

I have dumped all my data into HBase using MapReduce.
The result shows I have processed 4,413,160 records,
but there are only 4,217,742 rows in the table (counted by rowcounter).

dump:
Map input records 4,413,160 0 4,413,160
count:
Map input records 4,217,742 0 4,217,742

There is no error, and the row key is unique ( Math.Random()+"_"+fileName+"_"+currentPosition )
Re: Records missing

Which version of HBase? Any region server crash during the upload?

J-D

On Wed, Nov 4, 2009 at 10:09 PM, Eason.Lee <leongfans@...> wrote:
> I have dumped all my data into Hbase using mapreduce
> the result shows i have processed 4,413,160 records
> But there are only 4217742 rows in the table(count by rowcounter)
>
> dump:
> Map input records 4,413,160 0 4,413,160
> count:
> Map input records 4,217,742 0 4,217,742
> There is no error,and row key is unique(
> Math.Random()+"_"+fileName+"_"+currentPosition )
Re: Records missing

0.20.1

I didn't see any error during upload~~
and there is no error in logs

2009/11/5 Jean-Daniel Cryans <jdcryans@...>
> Which version of HBase? Any region server crash during the upload?
>
> J-D
Re: Records missing

For sure all your keys are unique? Write them as output from your job and
count that output too (does the reduce count match the map count)?

St.Ack

On Wed, Nov 4, 2009 at 10:30 PM, Eason.Lee <leongfans@...> wrote:
> 0.20.1
>
> I didn't see any error during upload~~
>
> and there is no error in logs
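One way to run the uniqueness check suggested above is to count repeated keys in the list of keys the job wrote out. The following is only a minimal sketch in plain Java, not code from this thread: the class name KeyUniqueness, the method duplicateCount, and the one-key-per-line input file are all assumptions made for illustration, and it assumes the key list fits in memory. Writes that share a row key land in the same HBase row, so any repeats here would lower the rowcounter total.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of HBase or Hadoop.
class KeyUniqueness {

    // Returns how many keys repeat a key seen earlier in the input
    // (that is, total keys minus distinct keys).
    static long duplicateCount(Iterable<String> keys) {
        Set<String> seen = new HashSet<>();
        long duplicates = 0;
        for (String key : keys) {
            // Set.add returns false when the key is already present.
            if (!seen.add(key)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    // Assumed usage: java KeyUniqueness.java keys-from-job.txt
    // where the file holds one row key per line (hypothetical file name).
    public static void main(String[] args) throws IOException {
        List<String> keys = Files.readAllLines(Paths.get(args[0]));
        System.out.println("total keys:     " + keys.size());
        System.out.println("duplicate keys: " + duplicateCount(keys));
    }
}
```

If the duplicate count comes out at zero, key collisions can be ruled out as the cause of the gap between the two counters.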
Re: Records missing

Sorry for the late reply~

2009/11/6 stack <stack@...>
> For sure all your keys are unique? Write them as output from your job and
> count that output too (does the reduce count match the map count)?
> St.Ack

Yes, they are unique, I have just checked it~~
I don't have a reduce. I just save records into HBase in the map.
But I just did a test that collects all the row keys, and found that the reduce count matches the map count.
Re: Records missing

So, can you dump the keys from HBase and compare them to those you entered and
see if you can figure out what the difference is? That might give you a clue as to
what's happening: e.g. take one of the missing keys and grep for it in your logs;
maybe there was an error around it?

I can insert hundreds of millions of entries into an HBase instance without losing
any. This is why I'm of the opinion that it's something to do with your
environment.

If you can turn up more than the below, that'd help.

Thanks Eason.
St.Ack

On Fri, Nov 6, 2009 at 12:33 AM, Eason.Lee <leongfans@...> wrote:
> Sorry for reply to late~
>
> Yes, they are unique, I have just checked it~~
> I don't have reduce. Just save records into hbase in the map.
> But i just did a test , collect all the row keys, and found that the reduce
> count matches the map count
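The comparison suggested above can be sketched as a set difference between the keys the job wrote and the keys dumped back out of the table. This is a rough illustration in plain Java under stated assumptions, not code from this thread: the class KeyDiff, the method missingKeys, and the two one-key-per-line input files are hypothetical, and both key lists are assumed to fit in memory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of HBase or Hadoop.
class KeyDiff {

    // Returns, in sorted order, the keys that were written by the job
    // but do not appear in the dump taken from the table.
    static Set<String> missingKeys(Collection<String> written, Collection<String> stored) {
        Set<String> storedSet = new HashSet<>(stored);
        Set<String> missing = new TreeSet<>();
        for (String key : written) {
            if (!storedSet.contains(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    // Assumed usage: java KeyDiff.java keys-from-job.txt keys-from-table.txt
    // Both files hold one row key per line (hypothetical file names).
    public static void main(String[] args) throws IOException {
        Set<String> missing = missingKeys(
                Files.readAllLines(Paths.get(args[0])),
                Files.readAllLines(Paths.get(args[1])));
        System.out.println("missing keys: " + missing.size());
        // Print a small sample to grep for in the region server and task logs.
        missing.stream().limit(20).forEach(System.out::println);
    }
}
```

The printed sample gives concrete keys to search for in the logs, and sorting them may make it easier to spot whether the missing rows cluster around one input file or one key range.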
Re: Records missing

Thanks for the reply
I will check that~~

2009/11/7 stack <stack@...>
> So, can you dump the keys from hbase and compare to those you entered and
> see if you can figure what the difference is? Might give you a clue as to
> whats happening: e.g. take one of the missing keys and grep it in your logs,
> maybe there was an error around it?
>
> I can insert into an hbase instance hundreds of millions without losing
> entries. This is why I'm of the opinion that its something to do with your
> environment.
>
> If you can turn up more than the below, that'd help.
>
> Thanks Eason.
> St.Ack