
Hortonworks Data Flow (Apache Nifi)


New Contributor

Hi,

I am new to HDF and have a few questions about HDF and its configuration. Can anyone please help with the questions below?

  1. What steps are required to define a workflow so that a NiFi job can be invoked? I am looking for something similar to Oozie, which can schedule any Hadoop-related task, and would like to know how to achieve the same in HDF.
  2. What are the ways to secure access to an HDF cluster? We want to run an HDF cluster on AWS, with a VPC established from our network to AWS. We also want the cluster to be secured and ring-fenced so that only designated people and machines can invoke NiFi processing. How can we achieve this?
  3. Extending the security question, is something similar to Knox available for HDF? If not, how can we achieve similar ring-fencing?

Thanks

Re: Hortonworks Data Flow (Apache Nifi)

Mentor

@Greenhorn Techie

1. NiFi is not a replacement for Oozie. You can't schedule jobs, though you can run cron commands and execute shell commands within NiFi. A flow is not a start-and-stop operation; it runs continuously until you explicitly stop it. You can take a look at the REST API to start and stop a workflow, if that's what you're asking. In the next release, NiFi will have scripting capabilities, so you will essentially be able to execute Groovy, shell, maybe Python and maybe Pig, but I cannot comment on the last two.

2. https://community.hortonworks.com/content/kbentry/886/securing-nifi-step-by-step.html

3. File a JIRA.
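
As a rough illustration of the REST API route mentioned in point 1, here is a minimal Python sketch that builds (but does not send) a request to start or stop a process group. The endpoint path `/flow/process-groups/{id}` and the `{"id", "state"}` body follow the NiFi 1.x REST API and are an assumption here; the release discussed in this thread may expose a different path. The base URL and group ID are hypothetical placeholders.

```python
import json
import urllib.request


def build_schedule_request(base_url, process_group_id, state):
    """Build (without sending) a PUT request that sets a process group's run state.

    Assumption: NiFi 1.x style endpoint, PUT {base_url}/flow/process-groups/{id}.
    """
    if state not in ("RUNNING", "STOPPED"):
        raise ValueError("state must be 'RUNNING' or 'STOPPED'")
    url = f"{base_url.rstrip('/')}/flow/process-groups/{process_group_id}"
    body = json.dumps({"id": process_group_id, "state": state}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical host and group ID; replace with values from your own instance.
    req = build_schedule_request(
        "http://localhost:8080/nifi-api", "hypothetical-group-id", "RUNNING"
    )
    print(req.get_method(), req.full_url)
    # Passing req to urllib.request.urlopen() would send it. That call is left
    # out because a secured instance also needs authentication.
```

An external scheduler (cron, Oozie, or anything else that can make an HTTP call) could send such a request to start or stop a flow on demand.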

Re: Hortonworks Data Flow (Apache Nifi)

New Contributor

@Artem Ervits Thanks for the info. For the first query, my intention was not to see whether NiFi works as an Oozie replacement, but to see how to get Oozie-like functionality in the HDF world. On further reading, I found that each processor can have its own scheduling strategy (timer-based, cron-based, event-based, etc.). This is sufficient for our requirements.
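
For the cron-based option, NiFi expects a Quartz-style expression, which starts with a seconds field, unlike classic Unix cron. The small Python sketch below only labels the fields of such an expression so the layout is easier to read; the helper name `describe_cron` is made up for illustration and is not part of NiFi.

```python
# Field order of a Quartz-style cron expression (the year field is optional).
QUARTZ_FIELDS = (
    "seconds",
    "minutes",
    "hours",
    "day_of_month",
    "month",
    "day_of_week",
    "year",
)


def describe_cron(expression):
    """Map each whitespace-separated field of a Quartz cron string to its name."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError("expected 6 or 7 fields, got %d" % len(parts))
    # zip stops at the shorter sequence, so a 6-field expression has no 'year'.
    return dict(zip(QUARTZ_FIELDS, parts))


if __name__ == "__main__":
    # "0 0 13 * * ?" schedules a run at 1:00 PM every day.
    for name, value in describe_cron("0 0 13 * * ?").items():
        print(name, "=", value)
```

The expression string itself goes into the processor's run schedule field once the cron-driven strategy is selected.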

For security, I need to look into it deeper. Will come back later with further queries.

Many Thanks

Re: Hortonworks Data Flow (Apache Nifi)

New Contributor

@Neeraj Sabharwal @Artem Ervits

Just wondering what the best mechanism is for ingesting data from relational sources into HDP: a combination of the ExecuteSQL and PutHDFS processors, or Sqoop to deliver the data to HDP?

Many Thanks
