Created 04-03-2019 04:46 PM
We have to load one month of data from the local file system to HDFS.
Loading one day of data takes 30 minutes, and loading one month of data takes 15 hours.
How can we improve the speed of loading data from the local file system to HDFS?
Created 04-03-2019 11:37 PM
@Nikhil Belure
You can use either:
NiFi for this case, by using the
ListFile + FetchFile (or ListSFTP + FetchSFTP) processors followed by the PutHDFS processor
(or)
Try using hadoop distcp to copy local files into HDFS as described in this thread.
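For the distcp option, here is a rough sketch of how the command could be put together and launched from Python. The function names, the default of 20 maps, and any paths you pass in are my own placeholders, not something from the linked thread. It also assumes the local directory is visible at the same path on every node that runs a copy task (for example through a shared mount), because distcp runs as a distributed job.

```python
import subprocess


def build_distcp_command(local_dir, hdfs_uri, max_maps=20):
    """Return the argv list for a DistCp run from a local directory to HDFS.

    local_dir must be an absolute path; hdfs_uri is the target, for example
    hdfs://<namenode>:<port>/<dir>. Both are placeholders for your own values.
    """
    # The file:// scheme tells DistCp to read from the local file system.
    source = "file://" + local_dir
    # -m caps the number of simultaneous map (copy) tasks.
    return ["hadoop", "distcp", "-m", str(max_maps), source, hdfs_uri]


def run_distcp(local_dir, hdfs_uri, max_maps=20):
    """Launch the copy; raises CalledProcessError if distcp exits non-zero."""
    subprocess.run(build_distcp_command(local_dir, hdfs_uri, max_maps), check=True)
```

Calling run_distcp with your source directory and target URI starts the job. The number of maps is worth tuning to your cluster, since too many parallel copies from a single source can saturate its disk or network.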
If your directory has a lot of files in it, then it would be much faster to tar (or) zip the files first and then run the copyFromLocal command.
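A minimal sketch of the tar-then-copy approach, assuming Python is available on the machine that holds the files. The function names and the paths you pass in are hypothetical; the only Hadoop piece used is the standard hdfs dfs -copyFromLocal command.

```python
import os
import subprocess
import tarfile


def bundle_directory(src_dir, archive_path):
    """Pack everything under src_dir into one uncompressed tar archive."""
    with tarfile.open(archive_path, "w") as tar:
        # arcname keeps member paths relative to the directory's own name
        # instead of storing the full local path.
        tar.add(src_dir, arcname=os.path.basename(os.path.normpath(src_dir)))
    return archive_path


def upload_archive(archive_path, hdfs_dir):
    """Copy the single archive into HDFS; raises if the command fails."""
    subprocess.run(
        ["hdfs", "dfs", "-copyFromLocal", archive_path, hdfs_dir],
        check=True,
    )
```

Write the archive outside the directory being packed so it does not include itself. Keep in mind that the files stay packed inside the archive once it is in HDFS, so whatever reads the data afterwards has to unpack it first.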