
Running distcp between two cluster: One Kerberized and the other is not


hadoop distcp -i -log /tmp/ hdfs://xxx:8020/apps/yyyy hdfs://xxx_cid/tmp/

In this case "xxx" is the unsecured cluster, while "xxx_cid" is the secured cluster.

We are launching the job from the Kerberized cluster, with the appropriate kinit for the user, and getting the following error:

java.io.IOException: Failed on local exception: java.io.IOException: Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.; Host Details : local host is: "xxx/10.x.x.x"; destination host is: "xxx":8020;

...

Caused by: java.io.IOException: Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.

I thought that by launching the job from the secure cluster we could avoid any access issues. But it appears that the processes are kicked off from the "source" cluster; in this case, that's the insecure cluster.

Ideas on getting around this?

1 ACCEPTED SOLUTION


I recommend not setting this in core-site.xml, and instead setting it on the command line invocation specifically for the DistCp command that needs to communicate with the unsecured cluster. Setting it in core-site.xml means that all RPC connections for any application are eligible for fallback to simple authentication. This potentially expands the attack surface for man-in-the-middle attacks.

Here is an example of overriding the setting on the command line while running DistCp:

hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo

The command must be run while logged into the secured cluster, not the unsecured cluster.
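Adapted to the original command in this thread (a sketch reusing the placeholder hosts from the question; note the generic -D option must come before the DistCp-specific options, since it is handled by the generic options parser):

hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true -i -log /tmp/ hdfs://xxx:8020/apps/yyyy hdfs://xxx_cid/tmp/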


17 REPLIES


Sounds like the DistCp process is running secured, but is configured to reject simple (unauthenticated) connections.

Try setting the config option

ipc.client.fallback-to-simple-auth-allowed=true
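For example, passed per-invocation with the generic -D option (source and destination paths are placeholders):

hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true <source> <destination>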


@dstreever@hortonworks.com To use DistCp for copying between a secure cluster and an insecure one, add the following to the HDFS core-site.xml using Ambari.

<property>
  <name>ipc.client.fallback-to-simple-auth-allowed</name>
  <value>true</value> 
</property>

Explorer

Adding this property in core-site.xml helped resolve the error.

Master Mentor

@Pardeep Nice find!

When copying data from a secure cluster to a secure cluster, the following configuration setting is required in the core-site.xml file:

<property>
    <name>hadoop.security.auth_to_local</name>
    <value></value>
    <description>Maps kerberos principals to local user names</description>
</property> 
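The value is left empty in the excerpt above. As an illustration only (a minimal sketch assuming a hypothetical realm EXAMPLE.COM), a rule that strips the realm from one-component principals could look like:

<property>
    <name>hadoop.security.auth_to_local</name>
    <!-- Hypothetical rule: maps user@EXAMPLE.COM to the local user "user";
         DEFAULT handles principals in the cluster's own realm. -->
    <value>
        RULE:[1:$1@$0](.*@EXAMPLE\.COM)s/@.*//
        DEFAULT
    </value>
    <description>Maps kerberos principals to local user names</description>
</property>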


Master Mentor

@Chris Nauroth Thanks for sharing this. Could you update the answer with more details? I believe this is the best answer once expanded.

avatar

@Neeraj Sabharwal, thank you. I updated the answer to show an example of overriding the property from the DistCp command line.


Explorer
Getting the below error after running the command "hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo":

java.io.EOFException: End of File Exception between local host is: ***; destination host is: ***

Please suggest.