Help setting up HBase Bulkload Replication

Rising Star

I am currently setting up replication on HBase. Normal HBase replication (based on the WALs) is set up and working well. However, I am struggling to set up replication for bulk loads as specified in [HBASE-13153].

I am running HDP 2.6. I initially thought the version of HBase was the issue, but according to the release notes this was patched in HDP 2.4.3, so the HBase version is not the issue here. I have followed the instructions provided in the link as well as I can understand them, but bulk-loaded rows are not being replicated.

I have set the following configurations:

  • On the source cluster: hbase.replication.bulkload.enabled=true
  • On the source cluster: hbase.replication.cluster.id=source
  • On the peer cluster: hbase.replication.conf.dir=/tmp/fs_conf
  • On the peer cluster: hbase.replication.source.fs.conf.provider=org.apache.hadoop.hbase.replication.regionserver.DefaultSourceFSConfigurationProvider
  • I have copied core-site.xml, hdfs-site.xml, yarn-site.xml, and hbase-site.xml from the source cluster to /tmp/fs_conf/source on every region server in the peer cluster

These are the only changes specified in the JIRA (at least as far as I understand it); however, the bulk-loaded rows are not being replicated. I suspect I have missed or misinterpreted part of the instructions.
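The peer-side layout described above can be sanity-checked on each region server with a short script. This is only a sketch: it assumes the configuration provider looks for the copied files in a subdirectory of hbase.replication.conf.dir named after hbase.replication.cluster.id, which is how the post lays them out. The function name and the list of expected files are illustrative and not part of HBase.

```python
from pathlib import Path

# Files the post says were copied from the source cluster.
EXPECTED_FILES = ("core-site.xml", "hdfs-site.xml", "yarn-site.xml", "hbase-site.xml")


def missing_source_conf_files(conf_dir, cluster_id, expected=EXPECTED_FILES):
    """Return the expected files that are absent from <conf_dir>/<cluster_id>.

    Hypothetical helper: conf_dir stands for hbase.replication.conf.dir and
    cluster_id for hbase.replication.cluster.id.
    """
    source_dir = Path(conf_dir) / cluster_id
    return [name for name in expected if not (source_dir / name).is_file()]


if __name__ == "__main__":
    # Values taken from the post; adjust them for your own clusters.
    missing = missing_source_conf_files("/tmp/fs_conf", "source")
    print("missing:", missing if missing else "none")
```

Running it on every region server of the peer cluster shows whether any node lacks part of the copied configuration. It does not check that the files' contents are correct.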

Any help will be appreciated.

3 REPLIES

Re: Help setting up HBase Bulkload Replication

Super Collaborator

Is hbase.replication.conf.dir defined on every node in the peer cluster?

Specifying /tmp is not a good idea, since files under /tmp are cleaned up periodically.
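One way to check the first point is to read the hbase-site.xml that each peer node actually uses and print the two peer-side properties. The sketch below assumes the standard Hadoop configuration XML layout (property elements with name and value children). The file path is a placeholder, and the helper name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Peer-side properties mentioned in the question.
KEYS = (
    "hbase.replication.conf.dir",
    "hbase.replication.source.fs.conf.provider",
)


def read_properties(xml_text, keys=KEYS):
    """Map each key to its value in a Hadoop-style configuration XML, or None if unset."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = prop.findtext("value")
        found[name] = value.strip() if value is not None else None
    return {key: found.get(key) for key in keys}


if __name__ == "__main__":
    # Placeholder path: where hbase-site.xml lives depends on the installation.
    with open("/etc/hbase/conf/hbase-site.xml", encoding="utf-8") as handle:
        for key, value in read_properties(handle.read()).items():
            print(f"{key} = {value}")
```

A value of None on any node means the property is missing from that node's file. This only inspects the file on disk, so a region server that has not been restarted since the change may still be running with older settings.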

Re: Help setting up HBase Bulkload Replication

Rising Star

I set that configuration through Ambari, which should propagate the config to all nodes in the cluster if I understand correctly. Should I perhaps include a trailing / in the config?

I am only using /tmp for testing purposes while replicating from our dev cluster. Once this goes to our production and DR clusters, I will choose a more suitable location for the configuration files.

Re: Help setting up HBase Bulkload Replication

New Contributor

Is this issue resolved?