Support Questions


“No common protection layer between client and server” while trying to communicate with kerberized HDFS

Explorer

I'm trying to communicate programmatically with a kerberized Hadoop cluster (CDH 5.3/HDFS 2.5.0).

I have a valid Kerberos ticket on the client side, but I'm getting the error below: "No common protection layer between client and server".

 

What does this error mean and are there any ways to fix or work around it?

 

Is this related to HDFS-5688? The ticket seems to imply that the property "hadoop.rpc.protection" must be set, presumably to "authentication".

 

Would this need to be set on all servers in the cluster and then the cluster bounced? I don't have easy access to the cluster so I need to understand whether 'hadoop.rpc.protection' is the actual cause. It seems that 'authentication' should be the value used by default, at least according to the core-default.xml documentation.
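
For context, the client code is roughly equivalent to the sketch below (the keytab path and HDFS path are placeholders, and the principal/host names are the anonymized ones from the trace); the hadoop.rpc.protection line is the part I'm unsure about:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

import java.net.URI;

public class KerberizedHdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://server2.acme.net:8020");
        conf.set("hadoop.security.authentication", "kerberos");
        // Does this need to match whatever the cluster enforces
        // ("authentication", "integrity", or "privacy")?
        conf.set("hadoop.rpc.protection", "authentication");

        UserGroupInformation.setConfiguration(conf);
        // Log in from a keytab; a ticket cache obtained via kinit would also work.
        UserGroupInformation.loginUserFromKeytab(
                "principal1/server1.acme.net@xxx.acme.net", "/path/to/principal1.keytab");

        // This is the call that fails with the exception pasted below.
        FileSystem fs = FileSystem.get(URI.create("hdfs://server2.acme.net:8020"), conf);
        System.out.println(fs.exists(new Path("/some/path")));
    }
}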

 

java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for principal1/server1.acme.net@xxx.acme.net to server2.acme.net/10.XX.XXX.XXX:8020; Host Details : local host is: "some-host.acme.net/168.XX.XXX.XX"; destination host is: "server2.acme.net":8020;

    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)

    at org.apache.hadoop.ipc.Client.call(Client.java:1415)

    at org.apache.hadoop.ipc.Client.call(Client.java:1364)

    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)

    at com.sun.proxy.$Proxy24.getFileInfo(Unknown Source)

    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

    at java.lang.reflect.Method.invoke(Method.java:498)

    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)

    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)

    at com.sun.proxy.$Proxy24.getFileInfo(Unknown Source)

    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:707)

    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1785)

    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1068)

    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1064)

    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)

    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1064)

    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1398)

    ... 11 more

Caused by: java.io.IOException: Couldn't setup connection for principal1/server1.acme.net@xxx.acme.net to server2.acme.net/10.XX.XXX.XXX:8020;

    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:671)

    at java.security.AccessController.doPrivileged(Native Method)

    at javax.security.auth.Subject.doAs(Subject.java:422)

    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)

    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:642)

    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:725)

    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)

    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463)

    at org.apache.hadoop.ipc.Client.call(Client.java:1382)

    ... 31 more

Caused by: javax.security.sasl.SaslException: No common protection layer between client and server

    at com.sun.security.sasl.gsskerb.GssKrb5Client.doFinalHandshake(GssKrb5Client.java:251)

    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:186)

    at org.apache.hadoop.security.SaslRpcClient.saslEvaluateToken(SaslRpcClient.java:483)

    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:427)

    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:552)

    at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:367)

    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:717)

    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:713)

    at java.security.AccessController.doPrivileged(Native Method)

    at javax.security.auth.Subject.doAs(Subject.java:422)

    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)

    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)

    ... 34 more

 


13 REPLIES

Mentor
You certainly do need to set hadoop.rpc.protection to the exact value the
cluster expects. While "authentication" is the default, the other values
your cluster services may be using/enforcing are "privacy" or "integrity".

If your cluster is run via CM, I highly recommend downloading a client configuration zip from its services, looking over all the properties present in it, and applying the same in your project (the simplest way is to place the *.xml files into your src/main/resources if you use Maven, but you can also apply them programmatically).

That said, from CDH 5.1.0 onwards the clients are designed to auto-negotiate the SASL QOP properties with the server, so you do not have to get hadoop.rpc.protection exactly right. That feature, combined with the error you are seeing, leads me to believe that your hadoop-client dependency libraries are much older than 5.1.0.

Explorer

Hi Harsh,

 

My Hadoop dependencies are: hadoop-common, hadoop-hdfs, version=2.5.0, since we're running CDH 5.3.  Does that sound like the right version?

 

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.5.0</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
  <version>2.5.0</version>
</dependency>

 

Thanks.

- Dmitry

Mentor
That is an Apache Hadoop upstream version. Please add Cloudera's repository:

<repository>
  <id>cloudera</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>

And use the right version of the dependency (generally the hadoop-client wrapper artifact should be used, not the specific ones such as hadoop-common/hadoop-hdfs/etc.):

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.5.0-cdh5.3.10</version>
</dependency>

That said, did you check what protection mode the server configuration is expecting?

Explorer

Harsh,

 

Thanks for the hadoop-client suggestion; I've changed the pom file. However, that did not make any difference as far as the issue is concerned.

 

As for downloading a client configuration zip, is that something I could do via Hue? I do not have access to the main SCM interface. Any other means of retrieving this?

 

Per your comment "You certainly do need to set hadoop.rpc.protection to the exact value the cluster expects", I've tried the other values. Neither "authentication" nor "integrity" made a difference; I was still getting the error “No common protection layer between client and server”.

 

However, setting "hadoop.rpc.protection" to "privacy" caused a different type of error (see below).  Any recommendations at this point?  Thanks.

 

Exception in thread "main" org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1775)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1402)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4221)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:881)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getFileInfo(AuthorizationProviderProxyClientProtocol.java:526)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:822)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)

at org.apache.hadoop.ipc.Client.call(Client.java:1405)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:744)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1912)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1089)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1085)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1085)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)

Mentor
This is good: it looks like your server does use privacy and expects it. The new error is because your cluster has HA HDFS but you are passing only a single hostname for the NN, which is currently not the active one.


You will need to use the entire HA config set, or for the moment pass in
the other NN hostname.


As to client configs, you can perhaps ask your admin to generate a zip via the CM -> Cluster -> Actions -> Client Configuration URLs option that is visible to them. You'll have an easier time developing apps once you have all the required properties set, which is what the client configuration download in CM is designed for.

Explorer

Harsh,

 

There are 3 host names at play: A, B, and C. Things actually started working when I set fs.defaultFS to one of these (B); originally I was using A. I'm told, however, that all 3 are supposed to be 'active'.

 

>> you are passing only a single hostname for the NN

 

Per this comment you made, should I be passing in all 3 hostnames? The doc states that fs.defaultFS is "The name of the default file system.", so a) should all 3 names be passed, and b) if so, how?

 

Thanks for your help.

Mentor

HDFS currently is deeply tested only with 2x NameNodes, so while you can technically run 3x NNs, not everything would behave as intended. There is ongoing work to support more than two NameNodes in future versions of HDFS.

 

The HDFS HA architecture is also Active-Standby based, so having two NNs active at the same time is not possible, at least by HDFS HA design. If you're using CDH, this certainly isn't available, so I am unsure what they mean by 3x active NameNodes.

 

As to HA configuration, it involves a few properties that are associated with one another. Here are example core-site.xml and hdfs-site.xml properties relevant to an HA setup, taken from one such cluster. You can adapt them to your hostnames, but once again I'd recommend obtaining a client configuration zip from your administrator: it is easier to deploy with your cluster's actual configs than to hand-set each relevant property. If you have access to some form of command/gateway/edge host, you can also usually find such config files under its /etc/hadoop/conf/ directory:

 

core-site.xml

 

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ha-nameservice-name</value>
  </property>
… </configuration>

hdfs-site.xml

 

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>ha-nameservice-name</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.ha-nameservice-name</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled.ha-nameservice-name</name>
    <value>true</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>ZKHOST:2181</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.ha-nameservice-name</name>
    <value>namenode10,namenode142</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ha-nameservice-name.namenode10</name>
    <value>NN1HOST:8020</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.ha-nameservice-name.namenode10</name>
    <value>NN1HOST:8022</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ha-nameservice-name.namenode10</name>
    <value>NN1HOST:20101</value>
  </property>
  <property>
    <name>dfs.namenode.https-address.ha-nameservice-name.namenode10</name>
    <value>NN1HOST:20102</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ha-nameservice-name.namenode142</name>
    <value>NN2HOST:8020</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.ha-nameservice-name.namenode142</name>
    <value>NN2HOST:8022</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ha-nameservice-name.namenode142</name>
    <value>NN2HOST:20101</value>
  </property>
  <property>
    <name>dfs.namenode.https-address.ha-nameservice-name.namenode142</name>
    <value>NN2HOST:20102</value>
  </property>
… </configuration>

With this configuration in place, all HDFS paths must be accessed via the FS URI hdfs://ha-nameservice-name. Ideally you want to use the same nameservice name your cluster uses, so remote services can reuse the name too, which is why grabbing an actual cluster client configuration set is important.
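
For example (a rough sketch; the config file paths are illustrative and could equally come from the classpath or the client configuration zip), client code then refers only to the nameservice and never to an individual NN host:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.net.URI;

public class HaNameserviceExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

        // The failover proxy provider resolves whichever NameNode is currently
        // active behind the nameservice, so no NN hostname appears in the code.
        FileSystem fs = FileSystem.get(URI.create("hdfs://ha-nameservice-name"), conf);
        System.out.println(fs.exists(new Path("/user")));
    }
}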

 

 

 

Explorer

Thanks, Harsh, very helpful.

 

I've been poking around on an edge node, so I do have access to hdfs-site.xml and core-site.xml. We may have to munge these files before we can use them, as they contain some values such as host names for fs.defaultFS which are cluster internal; we'll have to use different host names to be able to get in from outside the cluster...

 

Since we deal with multiple clusters organized by stage (dev, prod, etc.), we'd have to maintain multiple pairs of core-site.xml/hdfs-site.xml files and load them dynamically at runtime via the Configuration.addResource() method...
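
Concretely, something like the sketch below is what I have in mind (the directory layout and class name are just illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class StageHadoopConf {
    // Load the core-site.xml/hdfs-site.xml pair maintained for a given
    // deployment stage, e.g. "dev" or "prod" (paths are illustrative only).
    public static Configuration forStage(String stage) {
        Configuration conf = new Configuration();
        String dir = "/opt/myapp/hadoop-conf/" + stage;
        conf.addResource(new Path(dir + "/core-site.xml"));
        conf.addResource(new Path(dir + "/hdfs-site.xml"));
        return conf;
    }
}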

 

 

Explorer

Because of our requirements (needing to target a different cluster per deployment, and the HDFS config files potentially containing cluster-internal host names), we're going with the approach of maintaining the minimal set of Configuration properties required to make Kerberos work on the client side. These are, again (a rough sketch putting them together follows at the end of this post):

 

* dfs.namenode.kerberos.principal

* hadoop.rpc.protection

 

Having said that, Harsh's comments are all valid and relevant.
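
For reference, the minimal client-side setup ends up looking roughly like the sketch below (host names, realm, principal, and keytab path are placeholders; fs.defaultFS and hadoop.security.authentication are set as well, even though they aren't part of the Kerberos-specific pair above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

import java.net.URI;

public class MinimalKerberosHdfsClient {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://nn-host.example.net:8020");
        conf.set("hadoop.security.authentication", "kerberos");
        // The two properties we maintain per cluster to make Kerberos work:
        conf.set("dfs.namenode.kerberos.principal", "hdfs/_HOST@EXAMPLE.NET");
        conf.set("hadoop.rpc.protection", "privacy");

        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab(
                "appuser@EXAMPLE.NET", "/path/to/appuser.keytab");

        FileSystem fs = FileSystem.get(URI.create("hdfs://nn-host.example.net:8020"), conf);
        System.out.println(fs.exists(new Path("/")));
    }
}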