
Integration between Apache Pig, Apache NiFi and Apache Spark


What are the various ways to integrate Apache Pig, NiFi and Spark?

I know I can connect some of them through Kafka or via files.

1 ACCEPTED SOLUTION

Hello Timothy,

There are multiple ways to integrate these three services. As a starting point, NiFi will probably be your ingestion flow. During this flow you could:

- publish your data to Kafka and have Spark read from it

- push your NiFi data to Spark: https://blogs.apache.org/nifi/entry/stream_processing_nifi_and_spark

- use an ExecuteScript processor and start a Pig job from it
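For the third option, here is a minimal sketch of how the Pig command line could be assembled before a NiFi processor launches it. The helper name `build_pig_command`, the script path and the parameter names are all hypothetical; only the `pig` flags `-x`, `-param` and `-f` are taken from the standard Pig CLI. Treat it as an illustration, not a tested flow.

```python
import shlex


def build_pig_command(script_path, params=None, exectype="mapreduce"):
    """Return the argv list for launching a Pig script from the command line."""
    cmd = ["pig", "-x", exectype]
    # Each -param key=value pair replaces $key inside the Pig script.
    for key, value in sorted((params or {}).items()):
        cmd += ["-param", f"{key}={value}"]
    cmd += ["-f", script_path]
    return cmd


# Hypothetical paths: NiFi lands files in /data/landing, Pig cleans them.
cmd = build_pig_command(
    "/opt/jobs/clean_events.pig",
    {"input": "/data/landing/events", "output": "/data/clean/events"},
)

# Print a shell-safe version, e.g. to paste into a processor's command property.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would start the job from a script, assuming `pig` is on the PATH of the NiFi host.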

In summary, you can have a push-and-forget connection, you can push to a service such as Kafka and pick the data up in the next flow, or, as a corner case, you can execute the job from within a processor.
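The "push to a service and pick up in the next flow" pattern could look roughly like this on the Spark side, assuming NiFi publishes to Kafka (for example with a PublishKafka processor) and Spark reads the topic with Structured Streaming. The broker address, topic name and function names are assumptions made for illustration, and the Spark job would also need the `spark-sql-kafka` connector package on its classpath.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Build the option map for Spark's Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def start_console_stream(options, app_name="nifi-kafka-spark"):
    """Start a streaming query that prints each Kafka record value."""
    # Imported lazily so the helper above works without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName(app_name).getOrCreate()
    records = (
        spark.readStream.format("kafka")
        .options(**options)
        .load()
        # Kafka delivers bytes; cast the payload to a readable string.
        .selectExpr("CAST(value AS STRING) AS value")
    )
    return records.writeStream.format("console").outputMode("append").start()


# Hypothetical broker and topic that the NiFi flow publishes to.
options = kafka_source_options("localhost:9092", "nifi-events")
```

Calling `start_console_stream(options).awaitTermination()` in a job submitted with `spark-submit` would keep the stream running until it is stopped.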

Hope this gives you some insight.


