will hive/hbase be affected after hdfs rebalance?

New Contributor

Hi guys:

After adding a new node to the cluster and rebalancing HDFS, do I need to do anything for Hive or HBase, such as rebuilding the metastore? Thanks.

4 REPLIES

Re: will hive/hbase be affected after hdfs rebalance?

Super Collaborator

For HBase, it would be worth running a major compaction on your tables to improve data locality.
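In case it helps, here is a minimal Python sketch of how a major compaction could be requested for several tables through the HBase shell. It assumes the `hbase` launcher is on the PATH of the machine running it; the helper names `major_compact_script` and `run_major_compactions` are made up for illustration and are not part of HBase.

```python
import subprocess


def major_compact_script(tables):
    """Build HBase shell input that requests a major compaction per table."""
    lines = []
    for name in tables:
        # Table names are placed inside single quotes, so reject stray quotes.
        if "'" in name:
            raise ValueError(f"unexpected quote in table name: {name!r}")
        lines.append(f"major_compact '{name}'")
    lines.append("exit")
    return "\n".join(lines) + "\n"


def run_major_compactions(tables):
    """Pipe the script into a non-interactive HBase shell.

    Assumption: `hbase` is installed and on PATH where this runs.
    """
    return subprocess.run(
        ["hbase", "shell", "-n"],
        input=major_compact_script(tables),
        text=True,
        check=True,
    )
```

For example, `run_major_compactions(["t1", "t2"])` would feed two `major_compact` commands to the shell. The shell command only queues the compaction, so the work itself continues in the background on the region servers after the call returns.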

Re: will hive/hbase be affected after hdfs rebalance?

New Contributor

What about Hive? Does anyone have any idea? Thanks.

Re: will hive/hbase be affected after hdfs rebalance?

Hello Daniel,

There is nothing mandatory to do for Hive or HBase after an HDFS rebalance.

As Sergey mentioned, in the HBase case your HBase files might no longer be fully co-located with their corresponding region servers, and a major compaction could help there. That said, this will not stop HBase from working, and over time HBase will "re-localize", so to speak. You will only incur a mild performance degradation, depending on your usage pattern.

On the Hive front, the Metastore, YARN and Tez will still work together to find your files and run compute locally as much as possible, so there is nothing to do there either.

I will let more knowledgeable experts like Sergey or others weigh in with more detailed technical specifics if I missed something. But an HDFS rebalance is a routine operation and should be as transparent as possible to your daily work.
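To make the locality point concrete, here is a rough Python sketch of how a locality ratio could be computed for one region server. It is a simplification that counts blocks rather than bytes; the host names and block placements are invented for illustration, and `locality_index` is a hypothetical helper, not an HBase API.

```python
def locality_index(block_replica_hosts, region_server_host):
    """Fraction of blocks with at least one replica on the region server's host."""
    if not block_replica_hosts:
        return 0.0  # no blocks, so nothing is local
    local = sum(
        1 for hosts in block_replica_hosts if region_server_host in hosts
    )
    return local / len(block_replica_hosts)


# Hypothetical placement of four HFile blocks served by the region server on node1.
before_rebalance = [
    {"node1", "node2", "node3"},
    {"node1", "node3", "node4"},
    {"node1", "node2", "node4"},
    {"node1", "node2", "node3"},
]

# After the balancer moved two of node1's replicas to a newly added node5.
after_rebalance = [
    {"node5", "node2", "node3"},
    {"node1", "node3", "node4"},
    {"node5", "node2", "node4"},
    {"node1", "node2", "node3"},
]

print(locality_index(before_rebalance, "node1"))  # 1.0
print(locality_index(after_rebalance, "node1"))   # 0.5
```

In this made-up case half of the reads would go over the network until a compaction rewrites the files, since the rewritten blocks normally get their first replica on the region server's own DataNode.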

Re: will hive/hbase be affected after hdfs rebalance?

Hive won't be affected in a homogeneous cluster (all worker nodes having the same spec). In a heterogeneous cluster, some job execution times may vary a little because of changed data locality. In HBase, region data locality will drop initially, as mentioned above, but there is no need to do anything for tables with normal read/write traffic, because as regions grow the HBase engine will major compact them. You may want to run a major compaction manually only for read-only tables (that is, tables that are read-only between two HDFS balancer runs).