
How to improve the performance of copyFromLocal


We have to load one month of data from the local file system into HDFS.

Loading one day of data takes 30 minutes, and loading one month of data takes 15 hours.

How can we speed up loading data from the local file system into HDFS?


Super Guru

@Nikhil Belure

You can use any of the following:

NiFi: use the ListFile + FetchFile (or ListSFTP + FetchSFTP) processors to pick up the files, and the PutHDFS processor to write them to HDFS.


Alternatively, try hadoop distcp to copy the local files into HDFS, as described in this thread.


If your directory contains a lot of files, it will be much faster to tar (or zip) the files first and then run the copyFromLocal command on the archive.
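
As a rough sketch of the last two options, the shell helpers below show one way they might look. The function names (bundle_dir, upload_file, distcp_dir) and every path are hypothetical placeholders, not something from the thread. The distcp helper assumes the local directory is readable at the same path on the nodes that run the copy tasks (for example a shared mount), since distcp runs as a MapReduce job.

```shell
# Hypothetical helpers, meant to be sourced into a bash session.
# All names and paths are placeholders; adjust them to your environment.

# Pack a directory of many small files into a single gzip-compressed tar archive.
# Usage: bundle_dir <source_dir> <archive_path>
bundle_dir() {
    local src_dir="$1" archive="$2"
    # -C switches to the parent directory so the archive stores relative paths.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Upload one local file into an HDFS directory with copyFromLocal.
# -f overwrites the target if it already exists.
# Usage: upload_file <local_file> <hdfs_dir>
upload_file() {
    local local_file="$1" hdfs_dir="$2"
    hdfs dfs -mkdir -p "$hdfs_dir"
    hdfs dfs -copyFromLocal -f "$local_file" "$hdfs_dir/"
}

# Copy a local directory with distcp, using a file:// source URI.
# Assumption: the absolute path exists on the nodes running the map tasks.
# Usage: distcp_dir <absolute_local_dir> <hdfs_target_uri>
distcp_dir() {
    local src_dir="$1" hdfs_target="$2"
    hadoop distcp "file://$src_dir" "$hdfs_target"
}
```

For example, one day of data could be bundled with `bundle_dir /data/2019-01-01 /tmp/2019-01-01.tar.gz` and then uploaded with `upload_file /tmp/2019-01-01.tar.gz /landing/archives`. Keep in mind that the archive arrives in HDFS as a single file, so it would still need to be unpacked before jobs can read the individual files.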
