org.apache.hadoop.ipc.StandbyException

New Contributor

I have successfully set up a Hadoop cluster with HDFS HA enabled. I did a manual failover of the active NameNode (nn1), and the standby NameNode (nn2) became active as expected; nn1 is now the standby. All HDFS read and write operations work fine from the hdfs command line. However, the error below pops up when using Hive (CREATE TABLE). I know that switching nn1 back to active solves the issue, but I am looking for a workaround that doesn't require this manual operation. Thanks in advance.

FAILED: SemanticException MetaException(message:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1915)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1407)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4501)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:961)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:835)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)
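
For reference, a minimal sketch of the manual-failover commands involved, assuming the two NameNodes are registered under the HA service IDs nn1 and nn2:

hdfs haadmin -failover nn1 nn2        # manual failover from nn1 to nn2, as described above
hdfs haadmin -getServiceState nn1     # should now report "standby"
hdfs haadmin -getServiceState nn2     # should now report "active"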

1 ACCEPTED SOLUTION

A correctly configured HDFS client handles the StandbyException by failing itself over to the other NameNode in the HA pair and then reattempting the operation. It's possible that the application is misconfigured and unaware of the NameNode HA pair, in which case the StandbyException becomes a fatal error.

I recommend reviewing the configuration properties related to NameNode HA described in this Apache documentation:

http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.htm...

In particular, note the section about "ConfiguredFailoverProxyProvider". This is the component that enables the automatic failover behavior in the client; HDP clusters that use NameNode HA set this property. This error appears to be coming from the metastore, so I recommend checking that the metastore is in fact running with the correct set of configuration files.
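
For illustration only (the nameservice ID "mycluster" and the hostnames below are hypothetical placeholders, not taken from this cluster), the client-side NameNode HA properties in hdfs-site.xml look roughly like this:

<!-- logical nameservice that clients address instead of a single NameNode host -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<!-- enables the client-side automatic failover described above -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

With this in place, fs.defaultFS in core-site.xml should point at hdfs://mycluster rather than at either NameNode's host:port.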

4 REPLIES

Contributor

It looks like you have the nn1 address hard-coded somewhere in your Hive configuration (hive-site.xml). You will need to change that to be NameNode HA-aware.
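
For example (the property name is real; the hostnames, nameservice ID, and path are hypothetical), a warehouse location pinned to a single NameNode versus one that references the HA nameservice:

<!-- hard-coded to nn1: breaks whenever nn1 is in standby state -->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://nn1.example.com:8020/apps/hive/warehouse</value>
</property>

<!-- HA-aware: uses the logical nameservice ID instead -->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://mycluster/apps/hive/warehouse</value>
</property>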

Master Mentor

@Nalini Kumar Kasturi, are you still having problems with this? Can you post your own solution or accept the best answer?

New Contributor

We had this issue because some partitions pointed to a non-HA location on HDFS. We fixed it by running:

hive --config /etc/hive/conf/conf.server --service metatool [-dryRun] -updateLocation hdfs://h2cluster hdfs://h2namenode:8020
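
A sketch of how this might be run end to end, assuming the same nameservice (hdfs://h2cluster) and old NameNode address (hdfs://h2namenode:8020) as above; -listFSRoot and -dryRun are standard Hive metatool options:

# list the filesystem roots currently recorded in the metastore
hive --config /etc/hive/conf/conf.server --service metatool -listFSRoot

# preview the location rewrite without changing the metastore
hive --config /etc/hive/conf/conf.server --service metatool -dryRun -updateLocation hdfs://h2cluster hdfs://h2namenode:8020

# apply it: rewrite old non-HA locations to the HA nameservice
hive --config /etc/hive/conf/conf.server --service metatool -updateLocation hdfs://h2cluster hdfs://h2namenode:8020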
