How to copy data from HDFS to NiFi (NiFi and Hadoop are on separate clusters)

Rising Star

Hi All,

I have created a 2-node NiFi cluster and a 3-node Hadoop cluster (Hortonworks). Now I want to interact with HDFS from NiFi. Which configuration properties do I have to set? Do I have to copy core-site.xml into the NiFi bin directory? And how do I create a processor to copy data from one HDFS directory to another? Please explain with a simple example.

1 ACCEPTED SOLUTION

You can use PutHDFS or FetchHDFS, depending on your use case. In the processor, set the Hadoop Configuration Resources property to the location of your core-site.xml and hdfs-site.xml configuration files; these files must be readable from the NiFi nodes. Make sure the network connections between NiFi and HDP are OK. Also, I didn't fully understand your use case: why would you like to use NiFi to copy files between HDFS directories?
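As a rough way to check the two points above (the configuration files and the network path), the following Python sketch reads `fs.defaultFS` from core-site.xml text and tries a TCP connection to that host and port from a NiFi node. The sample XML, the hostname `namenode.example.com`, the fallback port 8020, and all function names are assumptions made for illustration; they do not come from this thread, and this is not a NiFi or Hadoop API.

```python
import socket
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Placeholder core-site.xml content; the hostname is hypothetical.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""


def read_default_fs(core_site_xml):
    """Return the fs.defaultFS value found in core-site.xml text."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == "fs.defaultFS":
            return (prop.findtext("value") or "").strip()
    raise KeyError("fs.defaultFS not found in core-site.xml")


def namenode_endpoint(default_fs, fallback_port=8020):
    """Split an hdfs:// URI into (host, port); the fallback port is an assumption."""
    parsed = urlparse(default_fs)
    return parsed.hostname, parsed.port or fallback_port


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    default_fs = read_default_fs(SAMPLE_CORE_SITE)
    host, port = namenode_endpoint(default_fs)
    print(f"fs.defaultFS = {default_fs}")
    print(f"reachable from this node: {can_connect(host, port)}")
```

To use it against a real setup, replace `SAMPLE_CORE_SITE` with the contents of the core-site.xml file that Hadoop Configuration Resources points to. If the cluster uses NameNode HA, `fs.defaultFS` is usually a logical nameservice name rather than a hostname, so the connection test would not apply as written.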

4 REPLIES

Expert Contributor

Hi @AnjiReddy Anumolu, I would take a look at the Twitter streaming example. It puts files into HDFS, but the configuration is somewhat similar.

https://community.hortonworks.com/articles/1282/sample-hdfnifi-flow-to-push-tweets-into-solrbanana.h...

Master Guru

Are you looking to use NiFi (on a different cluster) to move files within the Hadoop cluster? This may not be the most efficient approach, as you would be moving files off the cluster just to move them back again.

If you can install NiFi on the Hadoop cluster (or on a host that has a Hadoop client), you could use ExecuteProcess or ExecuteStreamCommand to run something like "hadoop fs -cp /path/to/file /new/path/to/file".
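To make the command-based approach concrete, here is a minimal Python sketch that builds the copy command and optionally runs it on a host with a Hadoop client. Note that the HDFS shell's copy subcommand is `-cp`. The helper names are hypothetical, and the `;` delimiter reflects my understanding of the default Argument Delimiter in ExecuteStreamCommand, so verify it against your NiFi version.

```python
import subprocess


def build_copy_command(src, dst):
    """Build the argument list for an in-cluster HDFS copy using `hadoop fs -cp`."""
    return ["hadoop", "fs", "-cp", src, dst]


def to_nifi_arguments(command, delimiter=";"):
    """Join everything after the executable into one delimited string.

    This mirrors how a Command Arguments property could be filled in,
    with the executable itself ("hadoop") going into the command path.
    The ";" default is an assumption about the processor's delimiter.
    """
    return delimiter.join(command[1:])


def run_copy(src, dst):
    """Run the copy; requires the `hadoop` client on PATH. Raises on a non-zero exit."""
    return subprocess.run(
        build_copy_command(src, dst),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    cmd = build_copy_command("/path/to/file", "/new/path/to/file")
    print(" ".join(cmd))           # the command as typed in a shell
    print(to_nifi_arguments(cmd))  # the arguments as one delimited string
```

Because the copy runs inside the Hadoop cluster, the file contents never pass through NiFi; NiFi only triggers the command.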

Contributor

Hi @milind pandit, @AnjiReddy Anumolu,

I am facing the same issue. Would you please share how you resolved it? I am confused about the GetFile and PutHDFS processor configuration. Let me explain a little more:

For now, I just want to set up a data flow with NiFi to test whether I can transfer files from the HDF cluster to HDFS on the HDP cluster. For that I am using just two processors, "GetFile" and "PutHDFS". I deployed 2 clusters with 2 nodes each: HDF 3.1 (2 nodes, NiFi and services) and HDP 2.6.4 (2 nodes, master and worker). Now I want to transfer files through NiFi and write the data to HDFS. How can I do that? Please share how you resolved this. Thank you.