Get Data from HDP using HDF

I'm using both HDF and HDP, and I use NiFi in HDF to stream data into HDP. For a specific ETL use case, though, I need to fetch data from HDFS on HDP. What is the best practice for doing this in NiFi? Can HDF connect to the HDP HDFS?

7 REPLIES

Super Guru
@nedox

HDF is not an ETL tool. How much data do you want to fetch from HDP? If it's a big chunk (millions of records or more), then why not use Sqoop? Can you please describe what you intend to do with the data you fetch?

No, it's a rather small chunk: thousands of records.

Super Guru

@nedox

Then just use GetHDFS, or ListHDFS -> FetchHDFS. In these processors you have to specify the client config files from your HDP cluster, which is how NiFi knows where to connect. In the same processors you also set which keytab and principal to use if Kerberos is enabled, and which directories to fetch files from.
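To check which cluster a set of client config files points at before handing them to the processor, you can read `fs.defaultFS` out of the `core-site.xml` you copied from HDP. The snippet below is only a minimal sketch: the helper name `read_hadoop_property` and the sample NameNode host are made up for illustration, and in practice you would pass in the contents of your own file.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the value of a Hadoop <property> by name, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            value = prop.findtext("value")
            return value.strip() if value else None
    return None


# Hypothetical core-site.xml content; the host name is a placeholder,
# not a value taken from this thread.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hdp-namenode.example.com:8020</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    # Shows the file system URI that HDFS clients using this file connect to.
    print(read_hadoop_property(SAMPLE_CORE_SITE, "fs.defaultFS"))
```

If the printed URI names the HDP NameNode, the files are the right ones to reference in the processor's Hadoop Configuration Resources property (typically `core-site.xml` and `hdfs-site.xml`, comma-separated).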

Master Guru

Certainly! You can get files from HDFS using the GetHDFS processor or the ListHDFS -> FetchHDFS processors.

How can I specify in the ListHDFS processor that it needs to list the HDFS of my HDP cluster rather than the HDF one?

Expert Contributor

There are a number of HDFS-based processors in NiFi, including GetHDFS, FetchHDFS, GetHDFSEvents, etc.

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hadoop.GetHDFS/

Native processors can read or write to HDFS, depending on your requirement.

Full docs below:

http://docs.hortonworks.com/HDPDocuments/HDF2/HDF-2.1.2/index.html
