Created 05-25-2016 10:07 AM
I have written logic to transfer data from the local file system to HDFS, and I need to send an alert to a particular mail ID. Please suggest how to write a script with validations to achieve this, or any alternative approach.
for i in $(cat compare.txt); do hadoop fs -copyFromLocal "Bdata1/$i" hdfs://192.168.1.xxx:8020/hbdata; done
Note: I tried to find a direct way to compare a local directory with HDFS but couldn't, so I added more steps.
Created 05-25-2016 10:24 AM
IMHO, you should avoid putting complex logic into home-grown shell scripts. Such scripts are fine for quick tests, but when you move to a PoC you need something less error-prone and more efficient (a shell script launches many Java processes, which adds considerable overhead and latency).
I recommend looking at ingestion tools such as Flume (http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source) or NiFi (https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.FetchFile/inde...). These tools already have many features for ingesting files into your cluster and archiving them afterwards.
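If a script is still needed for quick tests in the meantime, here is a minimal bash sketch of the copy-validate-alert loop from the question. It is only an illustration under assumptions: the variable names (LIST_FILE, SRC_DIR, DEST, ALERT_TO), the function names, and the address admin@example.com are hypothetical, and it presumes a working `mail` (mailx) command on the host. Validation here is just "the local file exists, the copy returned success, and `hadoop fs -test -e` finds the file in HDFS".

```shell
#!/usr/bin/env bash
# Sketch only. All names below are illustrative assumptions, not part of the original post.
LIST_FILE="${LIST_FILE:-compare.txt}"              # one file name per line
SRC_DIR="${SRC_DIR:-Bdata1}"                       # local source directory
DEST="${DEST:-hdfs://192.168.1.xxx:8020/hbdata}"   # HDFS target directory
ALERT_TO="${ALERT_TO:-admin@example.com}"          # hypothetical alert address

# Copy one file and confirm that it exists in HDFS afterwards.
copy_one() {
    local name="$1"
    [ -f "$SRC_DIR/$name" ] || return 1
    # </dev/null keeps the command from consuming the caller's input list.
    hadoop fs -copyFromLocal "$SRC_DIR/$name" "$DEST/" </dev/null || return 1
    hadoop fs -test -e "$DEST/$name" </dev/null
}

# Copy every file named in LIST_FILE and mail a summary.
# Returns non-zero if at least one file failed.
copy_all() {
    local failed=() name
    while IFS= read -r name; do
        [ -n "$name" ] || continue                 # skip blank lines
        copy_one "$name" || failed+=("$name")
    done < "$LIST_FILE"

    if [ "${#failed[@]}" -gt 0 ]; then
        printf '%s\n' "${failed[@]}" | mail -s "HDFS copy failed" "$ALERT_TO"
        return 1
    fi
    echo "All files copied" | mail -s "HDFS copy OK" "$ALERT_TO"
}
```

After sourcing the file, calling `copy_all` runs the whole transfer. Each `hadoop fs` call still starts its own JVM, so the overhead mentioned above applies to this sketch as well.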
Created 05-26-2016 04:18 AM
@Sourygna Luangsay
Thanks for your valuable post. I will try to understand NiFi with HDF and let you know. Since I'm new to big data technologies, please help me if I get stuck. Thanks again.