HDFS distcp between clusters error

Hi,


I have two isolated clusters, each with its own gateway node (only the gateway has a public IP).
I want to copy data from one cluster to the other.
I configured access between them with haproxy and iptables.
The clusters can see each other, but when I run distcp I get an error:


shamrock@hadoop-cluster-src-gw:~$ hadoop distcp hdfs://hadoop-cluster-src-gw.some.domain/user/shamrock/test.pl hdfs://hadoop-cluster-dst-gw.some.domain/user/shamrock/
16/06/14 07:29:00 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[hdfs://hadoop-cluster-src-gw.some.domain/user/shamrock/test.pl], targetPath=hdfs://hadoop-cluster-dst-gw.some.domain/user/shamrock, targetPathExists=true, preserveRawXattrs=false, filtersFile='null'}
16/06/14 07:29:00 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
16/06/14 07:29:00 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
16/06/14 07:29:00 ERROR tools.DistCp: Exception encountered
java.io.IOException: Mkdirs failed to create file:/user/shamrock1794299338/.staging/_distcp-1567163966 (exists=false, cwd=file:/home/shamrock)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:442)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:428)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:917)
    at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1072)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:271)
    at org.apache.hadoop.tools.SimpleCopyListing.getWriter(SimpleCopyListing.java:407)
    at org.apache.hadoop.tools.SimpleCopyListing.doBuildListing(SimpleCopyListing.java:174)
    at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
    at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90)
    at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
    at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:365)
    at org.apache.hadoop.tools.DistCp.execute(DistCp.java:171)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:122)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:429)


What is wrong with my configuration that makes distcp try to create a bogus directory on the local file system (file:/user/shamrock1794299338/.staging/...) instead of using the right one?
Why is Hadoop trying to create a directory with such a weird name?


Best regards,


Shamrock