
org.apache.hadoop.ipc.StandbyException

New Contributor

I have successfully set up a Hadoop cluster with HDFS HA enabled. I did a manual failover of the active NameNode (nn1), and the standby NameNode (nn2) became active as expected; nn1 is now the standby. All HDFS read and write operations work fine from the hdfs command line. However, the error below pops up when using Hive (CREATE TABLE). I know that making nn1 active again resolves the issue, but I am looking for a workaround that doesn't require this manual operation. Thanks in advance.

FAILED: SemanticException MetaException(message:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1915)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1407)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4501)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:961)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:835)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)

1 ACCEPTED SOLUTION

A correctly configured HDFS client handles a StandbyException by failing over to the other NameNode in the HA pair and retrying the operation. If the application is misconfigured and unaware of the NameNode HA pair, the StandbyException instead becomes a fatal error.

I recommend reviewing the configuration properties related to NameNode HA described in this Apache documentation:

http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.htm...

In particular, note the section about "ConfiguredFailoverProxyProvider". This is the class that enables the automatic failover behavior in the client, and HDP clusters that use NameNode HA will set this property. Since the error appears to be coming from the metastore, I recommend checking that the metastore is in fact running with the correct set of configuration files.
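For reference, the client-side HA properties described in that document look roughly like the sketch below. The nameservice name "mycluster", the NameNode IDs, and the host names are placeholders, not values taken from this thread; substitute your own.

```xml
<!-- hdfs-site.xml (sketch; names are placeholders) -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1-host:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2-host:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<!-- core-site.xml: fs.defaultFS should point at the nameservice,
     not at either individual NameNode -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
```

If the metastore host is missing any of these (especially the failover proxy provider), its HDFS client talks to a single fixed NameNode and fails as soon as that node goes standby.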


4 REPLIES

It looks like you have the nn1 address hard-coded somewhere in your hive-site.xml file. You will need to change it to the HA nameservice URI so that the client is NameNode-HA-aware.
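As a minimal illustration of that fix (using placeholder names "nn1-host" and "mycluster" rather than values from this thread): a hard-coded URI like hdfs://nn1-host:8020 only works while nn1 is active, whereas the nameservice URI hdfs://mycluster follows whichever NameNode is currently active.

```shell
# Sketch: rewrite a hard-coded NameNode URI to the HA nameservice URI
# in a sample config snippet (names are placeholders).
conf=$(mktemp)
cat > "$conf" <<'EOF'
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://nn1-host:8020</value>
</property>
EOF

# Replace the single-NameNode URI with the nameservice URI:
sed 's|hdfs://nn1-host:8020|hdfs://mycluster|' "$conf" > "$conf.fixed"
grep '<value>' "$conf.fixed"   # prints: <value>hdfs://mycluster</value>
```

On a real cluster you would grep /etc/hive/conf/ for the old NameNode address first, and restart the metastore after changing it.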


Mentor

@Nalini Kumar Kasturi, are you still having problems with this? Can you provide your own solution or accept the best answer?

New Contributor

We had this issue because some partitions pointed to a non-HA location on HDFS. Fixed it by running the following (the optional -dryRun flag previews the changes without applying them):

hive --config /etc/hive/conf/conf.server --service metatool [-dryRun] -updateLocation hdfs://h2cluster hdfs://h2namenode:8020

