About Andoroid

Andoroid · ‎11-09-2018

Hello Vinod,

Please refer to the previously mentioned hbck guide[1], or review Appendix C[2], which is referenced at the end of that guide and discusses hbck usage in more depth.

Nearly every hbck troubleshooting session is best begun with a single run of:

$ sudo -u hbase hbase hbck -fixAssignments

This tries to assign every region that is not deployed at the time of the run. Holes can appear for many different reasons, so the first step is to check whether the hole persists once every region has been assigned successfully. Successfully reassigning regions usually eliminates holes in the region chain.

It is also good practice to check whether the Apache HBase reference guide ("the book") has any information about the issue at hand. As CDH5.8+ uses HBase 1.2, it is best to check the corresponding version of the Apache HBase documentation[3]. If you are on CDH6.0.x, review the HBase 2.0 documentation on the same topic[4], which covers HBCK2.

[1] - Checking and Repairing HBase tables CDH5.15.x - https://www.cloudera.com/documentation/enterprise/5-15-x/topics/admin_hbase_hbck.html
[2] - Apache HBase documentation v1.2 / Appendix C - http://hbase.apache.org/1.2/book.html#hbck.in.depth
[3] - Apache HBase documentation v1.2 / HBase hbck - http://hbase.apache.org/1.2/book.html#hbck
[4] - Apache HBase documentation / HBase HBCK2 - http://hbase.apache.org/book.html#HBCK2
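The check, fix, re-check sequence described above could be scripted roughly as follows. This is only a sketch for HBase 1.x hbck: the function names (hbck_status, run_hbck, check_and_fix_assignments) are made up for illustration, and it assumes that hbck ends its report with a "Status: OK" or "Status: INCONSISTENT" line and that the hbase user can be reached via sudo.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of HBase or CDH.

# Read hbck output on stdin and print the last reported status word
# ("OK" or "INCONSISTENT"), or "UNKNOWN" if no "Status:" line was seen.
hbck_status() {
  awk '/^Status:/ { status = $2 } END { print (status == "" ? "UNKNOWN" : status) }'
}

# Run hbck as the hbase user, passing any extra options through.
run_hbck() {
  sudo -u hbase hbase hbck "$@" 2>&1
}

# 1) read-only check, 2) -fixAssignments only if needed, 3) re-check.
check_and_fix_assignments() {
  local status
  status=$(run_hbck | hbck_status)
  echo "status before: $status"
  if [ "$status" != "OK" ]; then
    run_hbck -fixAssignments > /dev/null
    status=$(run_hbck | hbck_status)
    echo "status after -fixAssignments: $status"
  fi
}

# Usage (on a node with HBase client configs deployed):
#   check_and_fix_assignments
```

If the status is still INCONSISTENT after the second check, the remaining inconsistencies (holes, for example) need the more specific options discussed in the guides referenced above.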

Andoroid · ‎11-09-2018

In addition to the previous solution, some best practices:

- hbck is basically just an HBase client command.
- Client commands are best run from nodes that have the relevant service's client configuration deployed. This can be done manually (not recommended, for the reason given below) or via Cloudera Manager.

Accordingly, whichever node you run hbck on should have the HBase client configs deployed, so that hbck actually uses the cluster's current configuration (which includes settings such as the heap size for client commands, the ZooKeeper ensemble hostnames, etc.). To achieve this, it is recommended to deploy an HBase GATEWAY role[1], which does exactly this: it deploys the active configs of the HBase service via Cloudera Manager. Additionally, if any HBase client config changes are made later via Cloudera Manager, they are propagated automatically, in the same way that config changes are propagated to every node that hosts an HBase role instance.

Further reference material on using hbck is available here[2], as this is an advanced topic.

[1] - Gateway roles CDH latest version - https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_managing_roles.html#managing_roles__section_scv_ywt_cn
[2] - Checking and Repairing HBase tables CDH5.15.x - https://www.cloudera.com/documentation/enterprise/5-15-x/topics/admin_hbase_hbck.html (please note that several hbck options are deprecated in CDH6.0.0)
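A quick way to see which client configuration a node would hand to hbck is to look up a property such as hbase.zookeeper.quorum in the deployed hbase-site.xml. The sketch below is an assumption-laden illustration: the function names are hypothetical, /etc/hbase/conf is assumed to be the client config directory, and the XML is parsed with tr and awk, which only works for the plain name/value layout of Hadoop-style site files.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the default config path are assumptions.

# Print the value of property $2 from the Hadoop-style site file $1.
# Prints nothing if the property is not present.
site_property() {
  tr '<' '\n' < "$1" | awk -F'>' -v want="$2" '
    $1 == "name"           { found = ($2 == want) }
    $1 == "value" && found { print $2; exit }'
}

# Report the ZooKeeper quorum the local HBase client config points to.
check_client_conf() {
  local conf="${1:-/etc/hbase/conf}/hbase-site.xml"
  if [ ! -r "$conf" ]; then
    echo "no readable client config at $conf (is a GATEWAY role deployed?)" >&2
    return 1
  fi
  echo "hbase.zookeeper.quorum = $(site_property "$conf" hbase.zookeeper.quorum)"
}

# Usage:
#   check_client_conf            # uses the assumed default directory
#   check_client_conf /some/dir  # or point it at another config directory
```

If the reported quorum does not match the cluster's ZooKeeper ensemble, hbck run from that node would not be talking to the intended cluster.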

Andoroid · ‎11-09-2018

It is worth checking whether the use case is actually suited to HDFS's NFS Gateway role[1], which is designed for this kind of remote cluster access.

[1] - Adding and Configuring an NFS Gateway - https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_hdfs_nfsgateway.html

Andoroid · ‎10-23-2018

Resurrecting this topic with some clarity on the issue and its remedy.

If RegionServers keep dead connections to the DataNodes open, the same symptoms are seen: many connections in CLOSE_WAIT and a steadily increasing number of open file descriptors. In extreme cases the file descriptor limit can be reached, which causes the host node to fail because no more file descriptors are available.

There was a bug in HBase prior to CDH5.13, which is described in more detail in this upstream JIRA[1]: HBASE-9393 "Hbase does not closing a closed socket resulting in many CLOSE_WAIT".

This issue was patched in the following CDH releases: CDH5.13.0, CDH5.13.1, CDH5.13.2, CDH5.13.3, CDH5.14.0, CDH5.14.2, CDH5.14.4, CDH5.15.0, CDH5.15.1, CDH6.0.0.

[1] - upstream HBase JIRA - https://issues.apache.org/jira/browse/HBASE-9393?attachmentOrder=asc
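To confirm these symptoms on a RegionServer host, the CLOSE_WAIT sockets and the open file descriptors of the process can be counted. The following is a minimal sketch for Linux; the function names are hypothetical, and the RegionServer PID has to be looked up separately.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Count sockets in CLOSE_WAIT in socket listing output read from stdin.
# ss prints the state as CLOSE-WAIT, netstat as CLOSE_WAIT; match both.
count_close_wait() {
  awk '/CLOSE[-_]WAIT/ { n++ } END { print n + 0 }'
}

# Count the open file descriptors of the process with PID $1.
# Needs to run as root or as the user that owns the process.
open_fd_count() {
  ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Usage:
#   ss -tan | count_close_wait
#   open_fd_count <RegionServer PID>
```

Sampling both numbers a few times over a longer period shows whether they keep growing, which is the pattern described above.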

Andoroid · ‎08-28-2018

I am just sharing the relevant part of the linked docs, as they contain the instructions on how to enable the HBase balancer via hbase shell:

Load Balancer

It is assumed that the Region Load Balancer is disabled while the graceful_stop script runs (otherwise the balancer and the decommission script will end up fighting over region deployments). Use the shell to disable the balancer:

hbase(main):001:0> balance_switch false
true
0 row(s) in 0.3590 seconds

This turns the balancer OFF. To reenable, do:

hbase(main):001:0> balance_switch true
false
0 row(s) in 0.3590 seconds

The graceful_stop will check the balancer and if enabled, will turn it off before it goes to work. If it exits prematurely because of error, it will not have reset the balancer. Hence, it is better to manage the balancer apart from graceful_stop, reenabling it after you are done w/ graceful_stop.
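Note that balance_switch prints the previous balancer state, which is why "true" is shown right after switching it off in the quoted session. Managing the balancer apart from graceful_stop could be scripted along these lines; this is a sketch with hypothetical function names, and it assumes an HBase version whose shell supports the non-interactive -n option.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Map "on"/"off" to the corresponding hbase shell command.
balancer_command() {
  case "$1" in
    on)  echo "balance_switch true" ;;
    off) echo "balance_switch false" ;;
    *)   echo "usage: balancer_command on|off" >&2; return 1 ;;
  esac
}

# Send the command to a non-interactive hbase shell.
# The shell prints the PREVIOUS balancer state, not the new one.
set_balancer() {
  local cmd
  cmd=$(balancer_command "$1") || return 1
  echo "$cmd" | hbase shell -n
}

# Usage around a decommission:
#   set_balancer off
#   ... run graceful_stop for the host in question ...
#   set_balancer on
```

Re-enabling the balancer explicitly at the end covers the case where graceful_stop exits early and leaves it switched off.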

Last Visited	‎09-11-2020 07:17 AM

Member Since	‎10-03-2017 12:32 AM
Posts	17
Kudos received	2

Cloudera Community

Re: Access HDFS/MR from remote HDFS cluster

Re: ERROR: Found inconsistency in table SYSTEM.CAT...

Re: hbase hbck reports inconsistency immediately a...

Re: Open File Descriptors warning in Cloudera Mana...

Re: HBase will not balance Regions across RegionSe...