Integration between Apache Pig, Apache Nifi and Apache Spark

What are the various ways to integrate Apache Pig, NiFi and Spark?

I know I can connect some of them through Kafka or via files.

1 ACCEPTED SOLUTION

Re: Integration between Apache Pig, Apache Nifi and Apache Spark

Hello Timothy,

There are multiple ways to integrate these three services. As a starting point, NiFi will probably be your ingestion flow. During this flow you could:

- put your data into Kafka and have Spark read from it

- push your NiFi data to Spark: https://blogs.apache.org/nifi/entry/stream_processing_nifi_and_spark

- use an ExecuteScript processor to start a Pig job
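For the first option, the Spark side could look roughly like the sketch below. It assumes Spark Structured Streaming with the `spark-sql-kafka-0-10` package on the classpath, which the answer does not specify. The broker address, the topic name `nifi-events` and both function names are made up for illustration.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Options for Spark's Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,  # e.g. "broker1:9092"
        "subscribe": topic,                            # topic that NiFi publishes to
        "startingOffsets": starting_offsets,           # "latest" or "earliest"
    }


def read_nifi_topic(spark, bootstrap_servers, topic):
    """Return a streaming DataFrame over the topic that NiFi writes to.

    `spark` is an existing SparkSession; it is passed in so that this
    sketch does not have to create one.
    """
    reader = spark.readStream.format("kafka")
    for key, value in kafka_source_options(bootstrap_servers, topic).items():
        reader = reader.option(key, value)
    # Kafka delivers key and value as bytes; cast them to strings here.
    return reader.load().selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
    )
```

On the NiFi side, the flow would end in a Kafka publishing processor that writes to the same topic. The two sides share only the topic name, so each can be restarted independently.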

In summary, you can have a push-and-forget connection, a push-to-a-service approach where the next flow picks the data up, or, as a corner case, execution directly inside a processor.
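For the corner case of starting a Pig job from a processor, a minimal sketch of the launcher is shown below. It assumes a standalone Python 3 script that NiFi invokes on a host where the `pig` CLI is on the PATH. How the script is wired into the flow is left open, and the script path and the `input` parameter name are hypothetical.

```python
import shlex
import subprocess


def build_pig_command(script_path, params=None, exec_type="mapreduce"):
    """Build the argument list for a `pig` CLI call (hypothetical helper)."""
    cmd = ["pig", "-x", exec_type]          # -x selects the execution mode
    for key, value in sorted((params or {}).items()):
        cmd += ["-param", f"{key}={value}"]  # substituted as $key in the script
    cmd += ["-f", script_path]              # -f names the Pig Latin script
    return cmd


def run_pig_job(script_path, params=None, exec_type="mapreduce"):
    """Run the job and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_pig_command(script_path, params, exec_type), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Pig.
    print(shlex.join(build_pig_command("/tmp/clean.pig", {"input": "/data/in"})))
```

Because `check=True` turns a failed Pig run into an exception, the calling flow can route the failure instead of silently continuing.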

I hope this gives you some insight.
