Created on 10-11-2019 10:58 AM - last edited on 10-11-2019 09:27 PM by ask_bill_brooks
I have recently enabled NameNode HA, ResourceManager HA, and Hive HA on an HDP 2.6 cluster by following the instructions provided here: https://docs.cloudera.com/HDPDocuments/Ambari-2.5.1.0/bk_ambari-operations/content/ch_managing_servi....
Everything continues to work fine if I stop all HDP components on the host where the primary NameNode and ResourceManager are running (by logging into Ambari, selecting that host, and choosing the Host Action "Stop All Components"). But as soon as I power down that host, beeline requests become intolerably slow; for example, a "show tables;" request takes about 20 minutes to complete. The request hangs at this point for an extended period of time:
Setting property: [silent, false]
issuing: !connect jdbc:hive2://nn2:10000/intrepid hdfs [passwd stripped]
Connecting to jdbc:hive2://nn2:10000/intrepid
(nn2 is the host name where the standby NameNode is running.)
After a long while, the following output will appear:
Connected to: Apache Hive (version 1.2.1000.2.6.0.3-8)
Driver: Hive JDBC (version 1.2.1000.2.6.0.3-8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Executing command: show tables;
and shortly afterward the command will complete.
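For reference, the whole sequence can be reproduced non-interactively with a one-liner like the following (a sketch assuming beeline's standard -u/-n/-e options, with the intrepid database and hdfs user taken from the session above):

beeline -u "jdbc:hive2://nn2:10000/intrepid" -n hdfs -e "show tables;"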
As soon as I power up the primary NameNode host, everything works normally again even though all of the HDP components on that host remain stopped.
Please advise.
Created 10-16-2019 06:13 AM
After enabling DEBUG logging (log4j) for beeline, it appears that the issue is a long gap between transport.TSaslTransport writing a request and reading the response:
19/10/16 13:03:49 [main]: DEBUG transport.TSaslTransport: writing data length: 72
19/10/16 13:06:11 [main]: DEBUG transport.TSaslTransport: CLIENT: reading data length: 109
but I am unable to determine the cause of these delays.
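For anyone wanting to reproduce this trace: a minimal sketch of enabling client-side DEBUG logging, assuming the log4j 1.x beeline-log4j.properties that ships with Hive 1.2 on HDP (the file path may vary; copy it from beeline-log4j.properties.template if it does not exist):

# /etc/hive/conf/beeline-log4j.properties (assumed location)
log4j.rootLogger=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} [%t]: %p %c{2}: %m%n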
Created on 01-08-2020 08:24 AM - edited 01-08-2020 08:25 AM
The issue was resolved by setting dfs.client.failover.proxy.provider to org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider under Custom hdfs-site in the HDFS configuration in Ambari.
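That fix is consistent with the symptoms above: the default ConfiguredFailoverProxyProvider tries the configured NameNodes one at a time, so when the first host is powered off (rather than merely having its services stopped) each HDFS client call sits through lengthy TCP connection timeouts and retries before failing over, whereas RequestHedgingProxyProvider invokes the NameNodes concurrently and uses whichever responds first. For reference, the equivalent hdfs-site.xml entry looks like this (a sketch; in a standard HA setup the key carries the nameservice ID as a suffix, shown here with a hypothetical nameservice named mycluster):

<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider</value>
</property>

After saving the change in Ambari, restart the services Ambari flags as stale so that clients pick up the new setting.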