I have an unsecured, open-source Apache Hadoop cluster with some data written to it. I recently created another cluster through Ambari and then Kerberized it. I now want to copy my data from the old unsecured cluster to this new secured cluster, so that I can retire the old machines and continue my operations on the new, secure Ambari cluster.
I have a file test.txt on the unsecured cluster, and I am running the command below from the secured cluster:
hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://10.2.3.100:9000/test/test.txt hdfs://test-uat/test/
I am getting the exception below:
18/01/02 09:40:35 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=false, fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[hdfs://10.2.3.100:9000/test/test.txt], targetPath=hdfs://test-uat/test/, targetPathExists=true, filtersFile='null', verboseLog=false}
18/01/02 09:40:35 INFO client.RMProxy: Connecting to ResourceManager at test-bigdatamaster01/10.1.9.155:8050
18/01/02 09:40:35 INFO client.AHSProxy: Connecting to Application History server at test-bigdatanode02/10.1.9.205:10200
18/01/02 09:40:36 INFO hdfs.DFSClient: Cannot get delegation token from rm/test-bigdatamaster01@TEST.COM
18/01/02 09:40:36 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; dirCnt = 0
18/01/02 09:40:36 INFO tools.SimpleCopyListing: Build file listing completed.
18/01/02 09:40:36 INFO tools.DistCp: Number of paths in the copy list: 1
18/01/02 09:40:36 INFO tools.DistCp: Number of paths in the copy list: 1
18/01/02 09:40:36 INFO client.RMProxy: Connecting to ResourceManager at test-bigdatamaster01/10.1.9.155:8050
18/01/02 09:40:36 INFO client.AHSProxy: Connecting to Application History server at test-bigdatanode02/10.1.9.205:10200
18/01/02 09:40:36 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 15 for spark on ha-hdfs:test-uat
18/01/02 09:40:36 INFO security.TokenCache: Got dt for hdfs://test-uat; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:test-uat, Ident: (HDFS_DELEGATION_TOKEN token 15 for spark)
18/01/02 09:40:36 INFO mapreduce.JobSubmitter: number of splits:1
18/01/02 09:40:36 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1514703479932_0009
18/01/02 09:40:36 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:test-uat, Ident: (HDFS_DELEGATION_TOKEN token 15 for spark)
18/01/02 09:40:37 INFO impl.TimelineClientImpl: Timeline service address: http://test-bigdatanode02:8188/ws/v1/timeline/
18/01/02 09:40:37 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/spark/.staging/job_1514703479932_0009
18/01/02 09:40:37 ERROR tools.DistCp: Exception encountered
java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, URL: http://test-bigdatanode02:8188/ws/v1/timeline/?op=GETDELEGATIONTOKEN&renewer=rm%2Ftest-bigdatamaster..., status: 403, message: org.apache.hadoop.security.authentication.client.AuthenticationException
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientRetryOpForOperateDelegationToken.run(TimelineClientImpl.java:704)
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:186)
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:465)
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:375)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:360)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:331)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291)
at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:302)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:193)
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:155)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:128)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:462)
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, URL: http://test-bigdatanode02:8188/ws/v1/timeline/?op=GETDELEGATIONTOKEN&renewer=rm%2Ftest-bigdatamaster..., status: 403, message: org.apache.hadoop.security.authentication.client.AuthenticationException
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.extractToken(AuthenticatedURL.java:281)
at org.apache.hadoop.security.authentication.client.PseudoAuthenticator.authenticate(PseudoAuthenticator.java:77)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:133)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:212)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:133)
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:216)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:299)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:171)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:373)
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$1.run(TimelineClientImpl.java:371)
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$1.run(TimelineClientImpl.java:363)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientRetryOpForOperateDelegationToken.run(TimelineClientImpl.java:702)
... 20 more
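From the stack trace, the job submission seems to fail while fetching a delegation token from the YARN Timeline Server at test-bigdatanode02:8188 (the HTTP 403 on op=GETDELEGATIONTOKEN), not while reading from the source cluster itself. One workaround I have come across, which I have not yet verified on my setup, is to disable Timeline Server integration for this one client invocation so that no timeline delegation token is requested:

hadoop distcp \
    -D ipc.client.fallback-to-simple-auth-allowed=true \
    -D yarn.timeline-service.enabled=false \
    hdfs://10.2.3.100:9000/test/test.txt hdfs://test-uat/test/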
I am running this as my server user 'dummy'. I have also tried with the spark and hdfs users, but they give:
18/01/02 09:31:05 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
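As I understand it, that GSSException means the client has no valid Kerberos ticket (TGT) for those users. I assume they would first need a kinit along these lines before retrying; the keytab path and principal below are guesses based on a typical Ambari layout, not my actual values:

# hypothetical keytab and principal for realm TEST.COM; adjust to the real ones
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-test-uat@TEST.COM
klist   # confirm a valid TGT is present before re-running distcp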
How can I achieve this data transfer, and what step am I missing?
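For completeness, I have also seen the suggestion that when the source cluster is unsecured it should be read over WebHDFS instead of HDFS RPC. I have not tried this yet; 50070 below is the default NameNode HTTP port on Hadoop 2.x and may differ on my old cluster:

hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true \
    webhdfs://10.2.3.100:50070/test/test.txt hdfs://test-uat/test/

Is one of these the step I am missing?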