
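To avoid a full table scan when running an HBase CopyTable job, use the --startrow and --stoprow parameters instead of --starttime and --endtime. A row-key range lets the scan go straight to the regions that hold those keys, whereas a time range generally has to be evaluated against data across the whole table.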

Examples:

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --startrow=r1 --stoprow=r12 --peer.adr=zk1,zk2,zk3:2181:/hbase-unsecure TestTable
hbase -Dhbase.client.scanner.caching=1 -Dmapred.map.tasks.speculative.execution=false org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=phbaiskdc2000.phx.qa.xyz.com:2181:/hbase-unsecure --new.name=REVIEW --startrow="BnTQA.AR.8aa1b29c-4613-4c40-924b-2294759854c4" --stoprow="BnTQA.AR.8aa1b29c-4613-4c40-924b-2294759854c4" REVIEW
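
The commands above can be wrapped in a small helper so that a row range is always supplied. This is only a sketch: build_copytable_cmd is a hypothetical function name, not part of HBase, and it prints the command (a dry run) rather than executing it. The peer address and table name are the ones from the first example. Note that in HBase scans the start row is inclusive and the stop row is exclusive.

```shell
# Hypothetical helper (not part of HBase): prints a CopyTable command
# that is restricted to a row-key range. It only echoes the command.
build_copytable_cmd() {
  # Expect exactly: start row, stop row, peer address, table name.
  if [ "$#" -ne 4 ] || [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: build_copytable_cmd START_ROW STOP_ROW PEER_ADR TABLE" >&2
    return 1
  fi
  # Use the --key=value form for every CopyTable option.
  printf 'hbase org.apache.hadoop.hbase.mapreduce.CopyTable --startrow=%s --stoprow=%s --peer.adr=%s %s\n' \
    "$1" "$2" "$3" "$4"
}

# Dry run with the values from the first example.
build_copytable_cmd r1 r12 zk1,zk2,zk3:2181:/hbase-unsecure TestTable
```

Review the printed command before running it, for instance by pasting it into the shell once the row bounds look right.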

Reference: https://hbase.apache.org/book.html

Comments

@Rohan Pednekar

This is also true of any scan that requires evaluation before retrieving anything. I am not sure why this would be an HCC article. It is merely one paragraph of what could have been a well-written article about tips and tricks for dealing with HBase. I recommend looking at some of the featured articles in HCC and writing to that standard. The section you published could be very useful in a larger article. Thanks for your efforts.