Hbase can we insert such (inside) data faster?

Hello,

We are using hadoop + hbase (0.20.1) for tests now. The machines we are testing on have the following configuration:

VMware
4-core Intel Xeon, 2.27 GHz
Two HBase nodes (one master and one regionserver), 6 GB RAM each.

The table has the following definition:

12-byte string as row key
Column family C1 with 3 qualifiers: q1, q2, q3 (about 200 bytes per record)
Column family C2 with 2 qualifiers: q1, q2 (about 2-4 KB per record)

I've implemented a simple Java utility which parses our data source and inserts the results into HBase (write buffer is 12 MB, autoflush off). We got the following results:

~450K records ~= 4 GB of data.
Total insertion time is about 600-650 seconds, or ~7 MB/second, or 675 rows per second, or 2 ms per row.

So the question is: is this time OK for such hardware, or did I miss something important?

Thank you.

Regards, Dmitriy.
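As a quick sanity check of the figures quoted above, the sketch below recomputes them from the stated totals (about 450K rows, about 4 GB, 600-650 seconds, 12 MB write buffer). The class and method names are made up for illustration, 4 GB is taken as 4096 MB, and the rows-per-flush estimate assumes the buffer is flushed only when full and counts raw payload only, which the real client may not match. Under those assumptions the totals work out to roughly 692-750 rows per second, 1.3-1.4 ms per row and 6.3-6.8 MB/second, so the quoted 675 rows per second and 2 ms per row look like rounded estimates.

```java
import java.util.Locale;

// Hypothetical helper: recomputes the throughput figures from the stated totals.
public class InsertThroughput {

    // Average rows written per second.
    public static double rowsPerSecond(long rows, double seconds) {
        return rows / seconds;
    }

    // Average wall-clock milliseconds per row.
    public static double millisPerRow(long rows, double seconds) {
        return seconds * 1000.0 / rows;
    }

    // Average megabytes written per second.
    public static double megabytesPerSecond(double totalMb, double seconds) {
        return totalMb / seconds;
    }

    // Rows that fit into one write buffer, ASSUMING the buffer is flushed
    // only when full and that only raw payload bytes count against it.
    public static long rowsPerFlush(double bufferMb, double totalMb, long rows) {
        double avgRowMb = totalMb / rows;
        return (long) (bufferMb / avgRowMb);
    }

    public static void main(String[] args) {
        long rows = 450_000;          // "~450K records"
        double totalMb = 4 * 1024.0;  // "~4GB", taken as 4096 MB (assumption)
        double bufferMb = 12.0;       // "write buffer is 12MB"

        for (double seconds : new double[] {600.0, 650.0}) {
            System.out.printf(Locale.ROOT,
                "%.0f s: %.1f rows/s, %.2f ms/row, %.2f MB/s%n",
                seconds,
                rowsPerSecond(rows, seconds),
                millisPerRow(rows, seconds),
                megabytesPerSecond(totalMb, seconds));
        }
        System.out.printf(Locale.ROOT, "about %d rows per buffer flush%n",
            rowsPerFlush(bufferMb, totalMb, rows));
    }
}
```

The average row in this estimate is about 9 KB, so a 12 MB buffer would hold on the order of 1,300 rows between flushes.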
|
|
Re: Hbase can we insert such (inside) data faster?

This is slow. We get about 4k inserts per second per region server, with a row size of about 30 kB. Running under VMware could be causing the slowdown.

Amandeep

On Mon, Oct 26, 2009 at 2:04 AM, Dmitriy Lyfar <dlyfar@...> wrote:
> Total time of insertion is about 600-650 seconds or ~7 MB/second or 675 rows
> per second, or 2ms per row.
>
> So the question is: is this time ok for such hardware or did I miss
> something important?
|
|
Re: Hbase can we insert such (inside) data faster?

Hi Amandeep,

Thank you. I also forgot to mention that ZooKeeper is managed by HBase on both nodes, and the quorum consists of two zookeepers per node. How many ZooKeeper servers should I have for this configuration, and what is the usual setup? BTW, which hard disks did you use?

2009/10/26 Amandeep Khurana <amansk@...>
> This is slow.. We get about 4k inserts per second per region server with row
> size being about 30kB. Using Vmware could be causing the slow down.

--
Regards, Lyfar Dmitriy
mailto: dlyfar@...
jabber: dlyfar@...
|
|
Re: Hbase can we insert such (inside) data faster?

1. You need an odd number of servers for the ZK quorum. 3-5 should be good enough. In your case, even 1 is fine since the load is not much.
2. We used 7200 rpm SATA drives.

On Mon, Oct 26, 2009 at 2:57 AM, Dmitriy Lyfar <dlyfar@...> wrote:
> Could you tell me how much Zookeepers should I have per this configuration
> and how it usually should be?
> BTW, which hards disks did you use?
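The reasoning behind the odd-number advice can be sketched with majority arithmetic: a ZooKeeper ensemble stays available only while a strict majority of its servers is up. The class and method names below are made up for illustration, and the snippet does nothing more than the arithmetic.

```java
// Hypothetical helper: majority arithmetic for a ZooKeeper ensemble.
public class ZkQuorum {

    // Smallest number of servers that forms a strict majority.
    public static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // Number of servers that can fail while a majority is still alive.
    public static int toleratedFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.printf("%d server(s): majority %d, tolerates %d failure(s)%n",
                n, majority(n), toleratedFailures(n));
        }
    }
}
```

Two servers tolerate no failures, the same as one, and four tolerate one failure, the same as three, so an even-sized ensemble adds a machine without adding fault tolerance.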
|
|
Re: Hbase can we insert such (inside) data faster?

Dmitriy,

Are you using any system/resource monitoring software? You should be able to see whether you are IO, CPU, memory/GC, or network bound by investigating during the import. This should tell you whether you can get better performance or not (and if things are maxed out, you can identify the bottleneck and try to optimize).

Also, if you are importing into a new table, you could use HFileOutputFormat. In my benchmarking, I saw about a 10X improvement in performance compared to a heavily optimized normal import. Check out HBASE-48 for more information.

JG

Amandeep Khurana wrote:
> 1. You need odd number of servers for the zk quorum. 3-5 should be good
> enough. In your case, even 1 is fine since the load is not much.
> 2. We used 7200rpm SATA drives.
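If no monitoring tool is at hand, one rough way to look at the memory/GC side from inside the Java import utility itself is the standard `java.lang.management` API. The sketch below is only an illustration: the class name is made up, the allocation loop is a stand-in for the real import work, and it observes the client JVM only, not the region server, disks or network.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: coarse GC and load figures for the current JVM.
public class JvmSnapshot {

    // Total milliseconds this JVM has spent in garbage collection so far.
    // Collectors that report -1 (undefined) are skipped.
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // One-minute system load average; negative where the platform lacks it.
    public static double loadAverage() {
        return ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();

        // Stand-in for the real import loop: allocate some short-lived buffers.
        byte[][] scratch = new byte[1000][];
        for (int i = 0; i < 100_000; i++) {
            scratch[i % scratch.length] = new byte[4096];
        }

        long wallMillis = (System.nanoTime() - start) / 1_000_000;
        long gcMillis = totalGcMillis() - gcBefore;
        System.out.printf("wall %d ms, GC %d ms, load average %.2f%n",
            wallMillis, gcMillis, loadAverage());
    }
}
```

A GC share that is a large fraction of wall time would point at memory pressure in the client; a small share suggests looking at IO, CPU or network on the servers instead.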