HBase scan slow after inserting a million records into a table
- Labels: Apache HBase
Created ‎04-12-2016 06:47 AM
Hi,
I am on HDP 2.3.4 (3-node cluster), and my HBase scans are slow after inserting a million rows of data.
As I am a newbie to HBase, I would really appreciate any suggestions experts can provide to help me tune performance.
Thanks,
Divya
Created ‎04-18-2016 10:47 PM
@Divya Gehlot Are you specifying start and stop keys in your scans? An open-ended scan that doesn't specify start and stop keys usually ends up as a complete table scan and hence becomes slow. As @Randy Gelhausen mentioned, an optimal rowkey design will help you specify start and stop keys.
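For example, a bounded scan with the Java client might look like the sketch below. The table name and keys are made up for illustration, and it assumes the HBase 1.x client API that ships with HDP 2.3.x:

```java
// Sketch only: "events" and the "user123|..." keys are hypothetical placeholders.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class BoundedScanExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("events"))) {

      // Bound the scan to a key range instead of reading the whole table.
      Scan scan = new Scan();
      scan.setStartRow(Bytes.toBytes("user123|20160401"));  // inclusive
      scan.setStopRow(Bytes.toBytes("user123|20160501"));   // exclusive

      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result r : scanner) {
          System.out.println(Bytes.toString(r.getRow()));
        }
      }
    }
  }
}
```

With start and stop rows set, the RegionServers only read the regions that overlap the key range instead of the entire table.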
Created ‎04-12-2016 08:15 AM
A couple of suggestions:
- HBase is not performant for full scans; it is a database designed for random reads/writes.
- If scans are to be performed, do them on the key and not the columns (see the sketch below).
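To make the difference concrete, here is a rough sketch contrasting a rowkey lookup with a column-value filter. The table "events", column family "cf", and qualifier "status" are made up, and the HBase 1.x client API is assumed:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class KeyVsColumnAccess {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Table table = conn.getTable(TableName.valueOf("events"))) {

      // Fast: a point lookup driven by the rowkey.
      Get get = new Get(Bytes.toBytes("user123|20160412"));
      Result byKey = table.get(get);
      System.out.println(byKey.isEmpty() ? "not found" : Bytes.toString(byKey.getRow()));

      // Slow: filtering on a column value still touches every row in the table,
      // because the filter is evaluated only after the rows have been read.
      Scan scan = new Scan();
      scan.setFilter(new SingleColumnValueFilter(
          Bytes.toBytes("cf"), Bytes.toBytes("status"),
          CompareFilter.CompareOp.EQUAL, Bytes.toBytes("ERROR")));
      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result r : scanner) {
          // only matching rows come back, but the RegionServers still read them all
        }
      }
    }
  }
}
```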
Created ‎04-12-2016 10:08 AM
Hi @Divya Gehlot, go to HBase -> Quick Links -> HBase Master UI, then select Table Details at the top, locate your table and click on it. It will show you the table's regions, their server layout, and the number of requests per region. You can then consider splitting regions that are too busy and moving some regions to other nodes for better load balancing. Refer to this for split/move, and to this for a good backgrounder. Since you have only 3 nodes, the results might be limited. Regarding other properties, if you can afford it, be sure to have enough RAM for the RegionServers, not less than 16 GB.
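If you prefer to do the split/move programmatically, a rough sketch with the HBase 1.x Admin API follows. The table name, split key, encoded region name, and server name are placeholders you would read from the Master UI:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionMaintenance {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {

      // Split a hot region of the table at an explicit key.
      admin.split(TableName.valueOf("events"), Bytes.toBytes("user500"));

      // Move a region (identified by its encoded name from the Master UI)
      // to a specific RegionServer for better balance.
      admin.move(Bytes.toBytes("1588230740abcdef"),                          // encoded region name (placeholder)
                 Bytes.toBytes("node2.example.com,16020,1460000000000"));    // destination server (placeholder)
    }
  }
}
```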
Created ‎04-13-2016 04:02 AM
@Divya Gehlot - as @Sunile Manjee noted, HBase is an indexed lookup system that can also perform scans. This means you need to think a bit about your data access/query patterns before you can create an optimal table design.
In general, you want to design your rowkeys around your access patterns. Ensure that the highest-order rowkey bytes are always known to your application at HBase read time; otherwise your access will be a full scan instead of a range scan.
Users of the raw HBase API often find themselves performing logic in their application code instead of server-side within HBase's RegionServer processes. A simple but powerful way to avoid both writing large amounts of client application code and pulling significant chunks of data back to the client is to use Apache Phoenix on top of HBase. It makes it easy to perform a more selective HBase query via SQL (a quick sketch follows the list below), which also:
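As a rowkey-design sketch, a composite key of fixed-width customer id plus reversed timestamp keeps the leading bytes known at read time and lets all events for one customer sit in a contiguous key range. All names here are illustrative, and an HBase 1.1+ client is assumed:

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class RowKeyDesign {
  // Composite key: fixed-width customer id + reversed timestamp (newest first).
  static byte[] rowKey(String customerId, long eventTimeMs) {
    byte[] idPart = Bytes.toBytes(String.format("%-8s", customerId)); // pad id to fixed width
    byte[] tsPart = Bytes.toBytes(Long.MAX_VALUE - eventTimeMs);      // reverse sort order
    return Bytes.add(idPart, tsPart);
  }

  // Because the customer id is the leading part of the key, a prefix scan
  // replaces a full-table scan when only the customer is known.
  static Scan scanForCustomer(String customerId) {
    Scan scan = new Scan();
    scan.setRowPrefixFilter(Bytes.toBytes(String.format("%-8s", customerId)));
    return scan;
  }
}
```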
1. Lends itself more naturally to thinking about how data is laid out in your tables
2. Lets you define secondary indexes on the data your queries access, regardless of whether your application knows a specific rowkey (or range) it needs to access.
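Here is a rough Phoenix sketch via JDBC. The connect string, table, and column names are hypothetical, and it assumes the Phoenix JDBC driver that ships with HDP is on the classpath:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixQueryExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181")) {
      try (Statement stmt = conn.createStatement()) {
        // Secondary index so queries on STATUS don't need to know the rowkey (EVENT_ID).
        stmt.execute("CREATE INDEX IF NOT EXISTS EVENTS_STATUS_IDX ON EVENTS (STATUS)");
      }
      // Phoenix turns this into a selective HBase scan through the index.
      try (PreparedStatement ps =
               conn.prepareStatement("SELECT EVENT_ID, STATUS FROM EVENTS WHERE STATUS = ?")) {
        ps.setString(1, "ERROR");
        try (ResultSet rs = ps.executeQuery()) {
          while (rs.next()) {
            System.out.println(rs.getString("EVENT_ID"));
          }
        }
      }
    }
  }
}
```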
Created ‎05-29-2017 03:55 AM
If you are using the hbase shell for scanning, you can try:
> scan '<table>', CACHE => 1000
The CACHE option tells the scanner to fetch that many rows from the RegionServer per RPC call before returning them to the client, which can save a lot of RPC round trips.
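The same idea from the Java client looks roughly like the sketch below, where setCaching controls how many rows each scanner RPC fetches (the table name "events" is a placeholder):

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;

public class ScanCachingExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Table table = conn.getTable(TableName.valueOf("events"))) {
      Scan scan = new Scan();
      scan.setCaching(1000);  // fetch 1000 rows per RPC instead of the small default
      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result r : scanner) {
          // process rows; far fewer round trips to the RegionServer
        }
      }
    }
  }
}
```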
