How to improve performance of copyFromLocal
Labels: Apache Hadoop
Explorer
Created ‎04-03-2019 04:46 PM
We have to load one month of data from the local file system into HDFS.
Loading one day of data takes 30 minutes, and loading one month takes 15 hours.
How can we speed up loading data from the local file system into HDFS?
1 REPLY
Master Guru
Created ‎04-03-2019 11:37 PM
You can use any of the following:
NiFi: use the List+Fetch File[Sftp] processors (ListFile + FetchFile, or ListSFTP + FetchSFTP) followed by a PutHDFS processor.
(or)
Try using hadoop distcp to copy the local files into HDFS, as described in this thread.
(or)
If your directory contains a lot of files, it will be much faster to tar (or) zip the files first and then run the copyFromLocal command on the archive.
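The last suggestion (archive first, then copyFromLocal) could be sketched roughly as below. This is only an illustration under assumptions: the function names and the example paths are hypothetical, it assumes the hdfs client is on the PATH, and it assumes one archive per day of data is acceptable. Note that the data then lands in HDFS as a single .tar.gz file, so downstream jobs have to unpack it or read it as an archive.

```python
import subprocess
import tarfile
from pathlib import Path
from typing import List


def bundle_directory(src_dir: str, archive_path: str) -> int:
    """Pack every regular file under src_dir into one gzip-compressed tar.

    Returns the number of files added to the archive.
    """
    src = Path(src_dir)
    archive = Path(archive_path).resolve()
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself if it lives under src_dir.
            if path.is_file() and path.resolve() != archive:
                tar.add(str(path), arcname=path.relative_to(src).as_posix())
                count += 1
    return count


def copy_from_local_command(local_path: str, hdfs_dir: str) -> List[str]:
    """Build the argument list for a single copyFromLocal call."""
    return ["hdfs", "dfs", "-copyFromLocal", local_path, hdfs_dir]


def upload(local_path: str, hdfs_dir: str) -> None:
    """Run copyFromLocal; raises CalledProcessError if the command fails."""
    subprocess.run(copy_from_local_command(local_path, hdfs_dir), check=True)


# Hypothetical usage (paths are placeholders, not taken from the question):
#   bundle_directory("/data/export/2019-04-01", "/tmp/2019-04-01.tar.gz")
#   upload("/tmp/2019-04-01.tar.gz", "/landing/raw")
```

With this approach the client issues one copyFromLocal for one large file per day instead of one transfer per small file, which is where the time saving is expected to come from.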
