Created 12-07-2021 03:14 AM
I want to copy a table from one HBase cluster to another using the CopyTable command. By default it runs with a single mapper and scans all rows, which causes a timeout. Are there any options for the CopyTable command that improve performance without adding parameters to hbase-site.xml?
hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=myserver:/hbase --new.name=<<tablename>> <<tablename>>
Created 12-07-2021 12:59 PM
Hello @rootuser,
Thanks for using Cloudera Community. Based on your post, you are using CopyTable to copy an HBase table from one cluster to another, and only 1 mapper is being observed.
Please confirm whether the source table has only 1 region. Also confirm whether CopyTable on a table with more than 1 region (say, 5 regions) creates 1 mapper or 5 mappers. Finally, please state the HBase version your team is using and share the exact timeout being observed.
As far as I recall, the CopyTable MapReduce job uses 1 mapper per region, so it is likely that the source table has only 1 region. In that case, splitting the source table into more regions or increasing the timeout should help.
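Since you asked for something that does not touch hbase-site.xml: CopyTable accepts generic -D options on the command line, placed right after the class name, and these apply to that job only. Below is a minimal sketch of what that could look like. The table name, the ZooKeeper port 2181 in --peer.adr, and all numeric values are assumptions for illustration, not tuned recommendations. The script only prints the command (dry run), so you can review it before running it on a host with the HBase client. Note also that the region servers apply their own scanner lease, so a client-side timeout override may not be sufficient by itself.

```shell
#!/bin/sh
# Dry-run sketch: builds a CopyTable command with per-job -D overrides and prints it.
# Nothing is executed against HBase; remove "echo" inside the function to run it for real.

TABLE="mytable"                # hypothetical table name, replace with yours
PEER="myserver:2181:/hbase"    # assumed form quorum:port:znode of the target cluster

copytable_cmd() {
  # -D options must come before the CopyTable-specific arguments.
  # scanner.caching:        rows fetched per scanner round trip (illustrative value)
  # scanner.timeout.period: client scanner timeout in ms (illustrative value)
  # rpc.timeout:            per-RPC timeout in ms (illustrative value)
  # map.speculative=false:  avoids duplicate map attempts writing the same rows
  echo hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
    -Dhbase.client.scanner.caching=500 \
    -Dhbase.client.scanner.timeout.period=600000 \
    -Dhbase.rpc.timeout=600000 \
    -Dmapreduce.map.speculative=false \
    --peer.adr="$PEER" \
    --new.name="$TABLE" \
    "$TABLE"
}

copytable_cmd
```

If the table turns out to have a single region, these overrides only give the one mapper more time; the copy will still not run in parallel until the table has more regions.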
Regards, Smarak