Hive/HDFS issues when the primary NameNode host is shut down

New Contributor

I have recently enabled NameNode HA, ResourceManager HA, and Hive HA on an HDP 2.6 cluster by following the instructions provided here: https://docs.cloudera.com/HDPDocuments/Ambari-2.5.1.0/bk_ambari-operations/content/ch_managing_servi....

Everything continues to work fine if I stop all HDP components on the host where the primary NameNode and ResourceManager are running (by logging into Ambari, selecting that host, and then selecting the Host Action "Stop All Components"). But as soon as I power down that host, beeline requests become intolerably slow. For example, a beeline request for "show tables;" takes about 20 minutes to complete. The request hangs at this point for an extended period of time:

Setting property: [silent, false]
issuing: !connect jdbc:hive2://nn2:10000/intrepid hdfs [passwd stripped]
Connecting to jdbc:hive2://nn2:10000/intrepid

(nn2 is the host name where the standby NameNode is running.)

After a long while, the following output appears:

Connected to: Apache Hive (version 1.2.1000.2.6.0.3-8)
Driver: Hive JDBC (version 1.2.1000.2.6.0.3-8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Executing command: show tables;

and shortly afterward the command completes.

As soon as I power up the primary NameNode host, everything works normally again, even though all of the HDP components on that host remain stopped.

Please advise.

1 ACCEPTED SOLUTION

New Contributor

The issue was resolved by setting dfs.client.failover.proxy.provider to org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider in the Custom hdfs-site settings of the HDFS service in Ambari.
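
For reference, here is a sketch of how this setting might look as an hdfs-site.xml property. In the Apache HDFS HA documentation the property key carries the nameservice ID as a suffix (dfs.client.failover.proxy.provider.[nameservice ID]). The ID "mycluster" below is a placeholder and not a value from this thread; substitute the value of dfs.nameservices from your own cluster.

```xml
<!-- "mycluster" is a placeholder nameservice ID (assumption); use your dfs.nameservices value -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider</value>
</property>
```

As far as I understand it, this provider sends the first request to all configured NameNodes concurrently and continues with whichever one responds as active, rather than trying them one at a time. That would be consistent with clients no longer stalling on connection attempts to the powered-off host.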

2 REPLIES

New Contributor

After enabling log4j debug logging for beeline, it appears that this issue is caused by delays around transport.TSaslTransport writes. In the excerpt below, more than two minutes pass between a write and the next read:

19/10/16 13:03:49 [main]: DEBUG transport.TSaslTransport: writing data length: 72
19/10/16 13:06:11 [main]: DEBUG transport.TSaslTransport: CLIENT: reading data length: 109

However, I am unable to determine the cause of these delays.
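
To quantify stalls like this across a longer debug log, a small Python sketch can report the gaps between consecutive timestamps. It assumes the log4j timestamp layout is yy/MM/dd HH:mm:ss (so 19/10/16 would be 16 October 2019). The function names and the 30-second threshold are illustrative choices, not part of any Hive or Hadoop tooling.

```python
from datetime import datetime

# The two log lines quoted above, used as sample input.
LOG_LINES = [
    "19/10/16 13:03:49 [main]: DEBUG transport.TSaslTransport: writing data length: 72",
    "19/10/16 13:06:11 [main]: DEBUG transport.TSaslTransport: CLIENT: reading data length: 109",
]


def parse_timestamp(line):
    """Parse the leading 17-character timestamp (assumed yy/MM/dd HH:mm:ss)."""
    return datetime.strptime(line[:17], "%y/%m/%d %H:%M:%S")


def find_gaps(lines, threshold_seconds=30):
    """Return (gap_in_seconds, line) for each line preceded by a long pause."""
    stamps = [parse_timestamp(line) for line in lines]
    gaps = []
    for previous, current, line in zip(stamps, stamps[1:], lines[1:]):
        delta = (current - previous).total_seconds()
        if delta >= threshold_seconds:
            gaps.append((delta, line))
    return gaps


if __name__ == "__main__":
    for seconds, line in find_gaps(LOG_LINES):
        print(f"{seconds:.0f}s pause before: {line}")
```

For the two lines above this reports a single pause of 142 seconds, which is the time the client spent waiting between sending its request and reading the reply.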

New Contributor

The issue was resolved by setting dfs.client.failover.proxy.provider to org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider in the Custom hdfs-site settings of the HDFS service in Ambari.