Apache Hadoop FileSystem method copyToLocalFile is not working in cluster mode.

Explorer

Hello, I have a scenario where I need to copy a file from HDFS to the local file system on an edge server.

When I run the code in cluster mode with spark-submit, I get the following error:

 

User class threw exception: java.io.IOException: Mkdirs failed to create
/some_specific/path/in_edge_server (exists=false, cwd=file:/some/path/yarn/nm/usercache/1234/appcache/application_123456744/container_t1234)

 

It seems that it looks for the destination path on the node where the code executes and, because the path does not exist there, tries to create the directory and fails. Is that right?
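One way to check this hypothesis is to log, from inside the driver, which host the JVM is running on and whether the destination exists there. The following is only a minimal sketch using the Java standard library; the class name `LocalTargetCheck` and the method `describe` are hypothetical and are not part of Hadoop or Spark.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LocalTargetCheck {

    /** Describes how a local destination path looks from the JVM running this code. */
    public static String describe(String dest) {
        Path target = Paths.get(dest).toAbsolutePath();
        String host;
        try {
            host = InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            host = "unknown-host"; // host name could not be resolved
        }
        return "host=" + host
                + ", cwd=" + Paths.get("").toAbsolutePath()
                + ", target=" + target
                + ", exists=" + Files.exists(target)
                + ", writable=" + Files.isWritable(target);
    }

    public static void main(String[] args) {
        // The default is the placeholder path from the error message above.
        String dest = args.length > 0 ? args[0] : "/some_specific/path/in_edge_server";
        System.out.println(describe(dest));
    }
}
```

If the logged host is a cluster node rather than the edge server, the "local" path in copyToLocalFile refers to that node's file system, which would match the cwd shown in the error.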

 

Below is the function being used:
fs.copyToLocalFile(new Path(src_HDFSPath), new Path(dest_edgePath))

 

Is there a way to copy a file from HDFS to a specific path on the edge server when the program runs in cluster mode?

4 REPLIES

Master Mentor

@Sanchari 
It would be good to share a snippet of your code.
Logically, I think you copy FROM --> TO.
Below is the function being used:
fs.copyFromLocalFile(new Path(src_HDFSPath), new Path(dest_edgePath))

Disclaimer: I am not a Spark/Python developer.

Explorer

@Shelton Below is the Hadoop FileSystem function being used:

copyToLocalFile(new Path(src_HDFSPath), new Path(dest_edgePath))

Please note that my goal is to copy the file from HDFS to the edge server's local file system when I run the Spark job in cluster mode.

Master Mentor

@Sanchari 
I suspect the java.io.IOException: Mkdirs failed to create error is due to permissions on the edge node.
I am assuming the HDFS copy runs as the hdfs user and that your edge-node directory belongs to a different user/group.
Just for test purposes, can you do the following on the edge node:

# mkdir -p /some_specific/path/in_edge_server

Then run chmod on the destination path:

# chmod 777  /some_specific/path/in_edge_server

Finally, rerun your spark-submit and let me know.

 

Explorer

@Shelton Please note that the directory on the edge server to which I am trying to copy the file already exists, so ideally no mkdir should be attempted. As I mentioned in my first post, the code looks for the directory on the node where it is being executed (see the cwd in the error) and, because it cannot find it there, tries to create it. So basically it should look for the directory on the edge server rather than on the node shown in the cwd of the error I posted.
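If that reading is right, the copy has to run in a process that is actually on the edge server. One option is to submit with `--deploy-mode client`, so that the driver runs on the machine where spark-submit is invoked. Another is to let the cluster-mode job write to HDFS and pull the file afterwards from the edge server. The following is only a rough sketch of the second option, assuming the `hdfs` command is on the PATH of the edge server and that this program is started there after the Spark job has finished; the class name `EdgeNodeFetch` and its methods are hypothetical.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class EdgeNodeFetch {

    /** Builds the HDFS shell command that copies an HDFS path to a local path. */
    public static List<String> buildCommand(String hdfsSrc, String localDest) {
        return Arrays.asList("hdfs", "dfs", "-copyToLocal", hdfsSrc, localDest);
    }

    /** Runs the command on the current machine and returns its exit code. */
    public static int fetch(String hdfsSrc, String localDest)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(hdfsSrc, localDest))
                .inheritIO() // show the CLI's output and errors in this console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: EdgeNodeFetch <hdfs-src> <local-dest>");
            System.exit(2);
        }
        System.exit(fetch(args[0], args[1]));
    }
}
```

Because this process runs on the edge server, the local destination is resolved against the edge server's file system and not against a YARN container's working directory.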