
How to improve the performance of copyFromLocal



We have to load one month of data from the local file system into HDFS.

Loading one day of data takes 30 minutes, and loading one month of data takes 15 hours.

How can we speed up loading data from the local file system into HDFS?

1 REPLY

Re: How to improve the performance of copyFromLocal


@Nikhil Belure

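You can use any of the following:

NiFi: use the ListFile + FetchFile processors (or ListSFTP + FetchSFTP if the files are on a remote host) to pick up the files, followed by the PutHDFS processor to write them into HDFS.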


(or)


Try using hadoop distcp to copy local files into HDFS as described in this thread.

(or)

If your directory has a lot of files in it, it would be much faster to tar (or) zip the files first and then run the copyFromLocal command on the archive, since one large file transfers with far less per-file overhead than many small ones.
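
As a rough sketch of the tar-then-copy option, something like the following Python could bundle one directory into a single archive and then hand it to copyFromLocal. The function names and the paths in the usage comment are hypothetical and only for illustration; the sketch assumes the `hdfs` client is on the PATH and that the archive is written outside the source directory.

```python
import subprocess
import tarfile
from pathlib import Path


def bundle_directory(src_dir, archive_path):
    """Pack every regular file under src_dir into one gzip-compressed tar.

    Returns the number of files added. archive_path should live outside
    src_dir, otherwise the archive would end up inside itself.
    """
    src = Path(src_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so the archive unpacks cleanly.
                tar.add(str(path), arcname=path.relative_to(src).as_posix())
                count += 1
    return count


def build_upload_command(archive_path, hdfs_dir):
    """Build the copyFromLocal invocation for a single archive."""
    return ["hdfs", "dfs", "-copyFromLocal", archive_path, hdfs_dir]


def upload_bundled(src_dir, archive_path, hdfs_dir):
    """Bundle src_dir, then copy the single archive into HDFS."""
    bundle_directory(src_dir, archive_path)
    # check=True raises CalledProcessError if the hdfs command fails.
    subprocess.run(build_upload_command(archive_path, hdfs_dir), check=True)


# Example call (hypothetical paths):
# upload_bundled("/data/incoming/day01", "/tmp/day01.tar.gz", "/landing/month")
```

Keep in mind that the data then sits in HDFS as one archive, so it has to be unpacked or read by something that understands tar.gz before individual files can be used, and a gzip-compressed file is not splittable across mappers.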