how to improve performance of copyfromlocal

We have to load one month of data from the local file system into HDFS.

Loading one day of data takes 30 minutes, and loading one month of data takes 15 hours.

How can we speed up loading data from the local file system into HDFS?

Re: how to improve performance of copyfromlocal

@Nikhil Belure

You can use any of the following:

NiFi: use the ListFile + FetchFile (or ListSFTP + FetchSFTP) processors to pick up the files, and the PutHDFS processor to write them into HDFS.

(or)

Try using hadoop distcp to copy the local files into HDFS, as described in this thread.

(or)

If your directory has a lot of files in it, it will be much faster to tar (or) zip the files first and then run the copyFromLocal command.
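
The tar-then-copy option could be sketched roughly like this in Python. It is only an illustration: the function names, the gzip compression, and the example paths are my assumptions and not something from your setup. The only Hadoop command it relies on is the standard `hadoop fs -copyFromLocal <localsrc> <dst>`.

```python
import subprocess
import tarfile
from pathlib import Path


def bundle_directory(src_dir: Path, archive_path: Path) -> list:
    """Pack every regular file under src_dir into one gzip-compressed tar.

    Returns the archive member names in sorted order. archive_path should
    live outside src_dir so the archive does not try to include itself.
    """
    members = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir, with forward slashes.
                name = path.relative_to(src_dir).as_posix()
                tar.add(path, arcname=name)
                members.append(name)
    return members


def copy_from_local_command(archive_path: Path, hdfs_dir: str) -> list:
    """Build the argv for a single copyFromLocal call (hypothetical helper)."""
    return ["hadoop", "fs", "-copyFromLocal", str(archive_path), hdfs_dir]


def upload(archive_path: Path, hdfs_dir: str) -> None:
    """Run copyFromLocal once for the whole archive; raises if it fails."""
    subprocess.run(copy_from_local_command(archive_path, hdfs_dir), check=True)
```

For example, `bundle_directory(Path("/data/day1"), Path("/tmp/day1.tar.gz"))` followed by `upload(Path("/tmp/day1.tar.gz"), "/landing")` sends one large file instead of many small ones (both paths are placeholders). Keep in mind that the data then sits in HDFS as an archive, so it has to be unpacked before jobs that expect the individual files can read it.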
