Integration between Apache Pig, Apache NiFi and Apache Spark

What are the various ways to integrate Apache Pig, NiFi, and Spark?

I know some of them can be connected through Kafka or via files.

1 ACCEPTED SOLUTION

Hello Timothy,

There are multiple ways to integrate these three services. As a starting point, NiFi will probably be your ingestion flow. During this flow you could:

- put your data into Kafka and have Spark read from it

- push your NiFi data to Spark: https://blogs.apache.org/nifi/entry/stream_processing_nifi_and_spark

- use an ExecuteScript processor to start a Pig job
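To make the first option in the list more concrete, here is a minimal PySpark sketch of the Spark side. It is only an illustration under some assumptions: NiFi is already publishing to a Kafka topic (for example with one of its PublishKafka processors), the Spark job is launched with the spark-sql-kafka connector package available, and the broker address, topic name, and function names are placeholders I made up.

```python
from typing import Dict


def kafka_source_options(bootstrap_servers: str, topic: str,
                         starting_offsets: str = "latest") -> Dict[str, str]:
    """Build the options that Spark's Kafka source expects."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,  # e.g. "broker1:9092"
        "subscribe": topic,                            # topic NiFi writes to
        "startingOffsets": starting_offsets,           # "latest" or "earliest"
    }


def read_nifi_topic(spark, bootstrap_servers: str, topic: str):
    """Return a streaming DataFrame over the topic that NiFi publishes to.

    `spark` is an existing pyspark.sql.SparkSession (assumed to be created
    by the caller, with the Kafka connector on the classpath).
    """
    raw = (spark.readStream
           .format("kafka")
           .options(**kafka_source_options(bootstrap_servers, topic))
           .load())
    # Kafka delivers key and value as binary; cast the value to a string.
    return raw.selectExpr("CAST(value AS STRING) AS value")
```

From there the job would attach whatever transformations and sink it needs via `writeStream`. NiFi and Spark stay decoupled in this setup, since each only needs to know the topic.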

In summary, you can use a push-and-forget connection, push to an intermediate service and pick the data up in the next flow, or, as a corner case, execute the job from within a processor.
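For the corner case of starting a Pig job from the flow, a rough sketch of a launcher script could look like the one below. The assumptions: the `pig` client is on the PATH of the NiFi host, the script path and parameter names are invented for illustration, and this is a standalone CPython script (something an ExecuteStreamCommand processor could call), not code written for the Jython engine inside ExecuteScript.

```python
import subprocess
from typing import Dict, List, Optional


def build_pig_command(script_path: str,
                      params: Optional[Dict[str, str]] = None,
                      exec_type: str = "mapreduce") -> List[str]:
    """Assemble the argument list for the Pig command-line client."""
    cmd = ["pig", "-x", exec_type]           # -x selects the execution mode
    for name, value in sorted((params or {}).items()):
        cmd += ["-param", f"{name}={value}"]  # substituted as $name in the script
    cmd += ["-f", script_path]                # -f names the Pig Latin script
    return cmd


def run_pig_job(script_path: str,
                params: Optional[Dict[str, str]] = None,
                exec_type: str = "mapreduce") -> int:
    """Run the Pig script and return its exit code (0 means success)."""
    completed = subprocess.run(
        build_pig_command(script_path, params, exec_type), check=False)
    return completed.returncode
```

A call such as `run_pig_job("/opt/jobs/clean.pig", {"input": "/data/landing"})` would block until Pig finishes, so the exit code can be used to route the flow to a success or failure path.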

Hope this gives you some insight.
