Created 12-05-2017 08:54 AM
I need to get files from an SFTP server into HDFS directly.
Created 12-06-2017 04:51 PM
Apache NiFi can read from SFTP and then use PutHDFS to put the raw file in an HDFS directory.
Two boxes, one line, no code.
5 minutes of work.
Created 12-06-2017 03:04 PM
If you have an HDF cluster running, you can create a NiFi flow to accomplish this. Otherwise you will need a client to download the file first before importing into HDFS.
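For the non-NiFi route, a minimal sketch of "download first, then import" could look like the following. It assumes the OpenSSH `sftp` client and the `hdfs` CLI are on the PATH of the machine running it, and that key-based (non-interactive) SFTP authentication is already set up. The function names, user, host, and paths are hypothetical placeholders, not anything from this thread.

```python
import subprocess


def sftp_command(user, host):
    # "-b -" tells the OpenSSH sftp client to read batch commands from stdin.
    return ["sftp", "-b", "-", f"{user}@{host}"]


def sftp_batch(remote_path, local_path):
    # One "get" per file; paths containing spaces would need quoting.
    return f"get {remote_path} {local_path}\n"


def hdfs_put_command(local_path, hdfs_dir):
    # "-f" overwrites a file of the same name already in the HDFS directory.
    return ["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir]


def copy_sftp_to_hdfs(user, host, remote_path, local_path, hdfs_dir):
    # Step 1: download the file from the SFTP server to local disk.
    subprocess.run(
        sftp_command(user, host),
        input=sftp_batch(remote_path, local_path),
        text=True,
        check=True,
    )
    # Step 2: import the downloaded file into HDFS.
    subprocess.run(hdfs_put_command(local_path, hdfs_dir), check=True)
```

Calling `copy_sftp_to_hdfs("etl", "sftp.example.com", "/out/a.csv", "/tmp/a.csv", "/data/in")` would run both steps. Scheduling it (cron, for example) is left out of the sketch.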
Created 12-06-2017 05:04 PM
Hi @anarasimham, thanks for the response.
I have an HDP cluster. How do I set up NiFi on my cluster? I also need to automate this job, because I need to load those files into Hive tables continuously.
Created 12-06-2017 05:12 PM
If you're using Ambari 2.5.2 you should be able to install NiFi on the same cluster using the HDF management pack: https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.1.1/bk_installing-hdf-and-hdp/content/ch_inst...
Yes you can automate the job with NiFi - you'll have to create a way to query your SFTP endpoint for incremental changes and then get those new files.
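One way to picture "querying for incremental changes" is a watermark on file modification time: remember the newest timestamp you have processed, and on each poll pick up only files newer than it. The sketch below is just an illustration of that idea, not how NiFi implements it; `select_new_files` and the `(path, mtime)` listing format are assumptions made for the example.

```python
def select_new_files(listing, last_seen_mtime):
    """Return (new_paths, new_watermark) for one poll of the SFTP directory.

    listing: iterable of (path, mtime) pairs from the remote directory.
    last_seen_mtime: newest modification time handled by earlier polls.
    """
    # Keep only entries strictly newer than the watermark, oldest first.
    new_entries = sorted(
        (entry for entry in listing if entry[1] > last_seen_mtime),
        key=lambda entry: entry[1],
    )
    # Advance the watermark only if something new was found.
    watermark = new_entries[-1][1] if new_entries else last_seen_mtime
    return [path for path, _ in new_entries], watermark


listing = [("a.csv", 100), ("b.csv", 200), ("c.csv", 150)]
new_paths, watermark = select_new_files(listing, 100)
```

A known limitation of this simple approach: a file that arrives later with the same modification time as the current watermark would be skipped, so a real flow would also need to track file names at the watermark or persist a richer state.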
Created 12-06-2017 05:01 PM
Hi @Timothy Spann, thank you for the response. Can we automate this job? I need to load files into Hive tables continuously. I have an HDP cluster; can we install NiFi on it?
Created on 12-06-2017 05:16 PM - edited 08-17-2019 08:06 PM
Yes, continuously and automatically.
By default it polls for new files every 60 seconds; you can shorten that interval.
You can also convert those files to Apache ORC and auto build new Hive tables on them if the files are CSV, TSV, Avro, Excel, JSON, XML, EDI, HL7 or C-CDA.
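To make the "build new Hive tables on them" step concrete, here is a rough sketch of the kind of DDL that puts an external Hive table over a directory of ORC files. The helper `orc_table_ddl`, the table name, columns, and HDFS location are all hypothetical; in a NiFi flow the statement would come from the flow itself rather than a hand-written script.

```python
def orc_table_ddl(table, columns, location):
    """Build a CREATE EXTERNAL TABLE statement for ORC files in HDFS.

    columns: list of (column_name, hive_type) pairs.
    location: HDFS directory that holds the ORC files.
    """
    # Backticks guard column names that clash with Hive keywords.
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols}) "
        f"STORED AS ORC LOCATION '{location}'"
    )


ddl = orc_table_ddl("events", [("id", "INT"), ("name", "STRING")], "/data/events_orc")
```

The resulting string can be run through whatever Hive client you already use; because the table is external, dropping it later leaves the ORC files in HDFS untouched.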
Install Apache NiFi on an edge node. There are ways to combine HDP 2.6 and HDF 3 with the new Ambari, but it's easiest to start with a separate node for Apache NiFi.
You can also just download NiFi, unzip it, and run it on a laptop that has JDK 8 installed:
https://www.apache.org/dyn/closer.lua?path=/nifi/1.4.0/nifi-1.4.0-bin.zip