Visitor
Posts: 0
Registered: ‎04-14-2017

Why is the HBase Java client slow compared to REST/Thrift?

I am running some performance tests on the HBase Java client, Thrift, and REST interfaces.

I have a table called “Airline” which has 500K rows.

I am fetching all 500K rows from the table through 4 different Java programs (using the Java client, Thrift, Thrift2, and REST).

 

The following are the performance numbers for various fetch sizes.

For all of these runs, the batch size is set to 100000.

Fetch size (number of rows)

              1000    2000    5000    7500   10000   15000   20000
REST        135923   67520   31293   22417   18210   14281   12348
Thrift      135912   78630   38525   32470   27617   25223   27127
Thrift2     133807   74559   39691   32457   28241   27189   25426
Java API     45086   43945   44591   45393   44936   45849   45060

With REST, Thrift, and Thrift2, performance improves as the fetch size increases.

 

But with the Java API, the performance stays the same irrespective of the fetch size.

Why does the fetch size have no effect with the Java client?
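To put numbers on this observation, here is a minimal sketch that divides the timing at fetch size 1000 by the timing at fetch size 20000 for each interface. The values are copied from the table in this post; the class name `FetchSizeSpeedup` and the method `speedup` are made up for illustration, and the sketch only assumes that every cell of the table uses the same unit.

```java
import java.util.Locale;

public class FetchSizeSpeedup {
    // Ratio of the timing at the smallest fetch size to the timing at the largest one.
    // A value near 1.0 means the fetch size made no measurable difference.
    static double speedup(long timeAtSmallestFetch, long timeAtLargestFetch) {
        return (double) timeAtSmallestFetch / timeAtLargestFetch;
    }

    public static void main(String[] args) {
        String[] interfaces = {"REST", "Thrift", "Thrift2", "Java API"};
        long[] at1000 = {135923, 135912, 133807, 45086};   // fetch size 1000 column
        long[] at20000 = {12348, 27127, 25426, 45060};     // fetch size 20000 column
        for (int i = 0; i < interfaces.length; i++) {
            double ratio = speedup(at1000[i], at20000[i]);
            System.out.println(String.format(Locale.ROOT, "%-8s %.2fx", interfaces[i], ratio));
        }
    }
}
```

By this measure REST improves by roughly 11x and Thrift and Thrift2 by roughly 5x between the smallest and largest fetch sizes, while the Java API ratio stays at about 1.0.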

 

Here is a snippet of my Java program:

 

 

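import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

// conn and fetchSize are defined elsewhere in the program.
Table table = conn.getTable(TableName.valueOf("Airline"));
Scan scan = new Scan();
ResultScanner scanner = table.getScanner(scan);

// Ask the scanner for fetchSize rows at a time until it returns no more rows.
for (Result[] result = scanner.next(fetchSize); result.length != 0; result = scanner.next(fetchSize)) {
    // process the rows
}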

 

 

Can someone help me with this? Am I using the wrong methods or classes to fetch data through the Java client?
