
nifi + spark structured streaming



Hi, I want to use NiFi to simulate a stream from a large dataset whose records contain the items A, B, C, D, and E. The idea is to split the file into individual lines and continuously feed them to Spark through Kafka, so that I can analyse the data with Structured Streaming.

Dataset example:

A, B, C, D

A, C, E

D, E

B, C, D, E

....

I am currently using a .txt file with the flow ListFile -> FetchFile -> SplitText -> PutKafka. When I run spark-submit, it shows some errors on the topic.

[Attached screenshots: 92877-untitled-picture-3.png, 92878-untitled-picture.png, 92879-untitled-picture-2.png]

I was wondering what type of file I should create to hold this sample dataset, which executor I should use, and what the Spark code (Python) should look like.

1 REPLY

Re: nifi + spark structured streaming

@Po-Heng Chen

I recently built a presentation around NiFi -> Kafka -> Spark to showcase image analysis from Twitter feeds. Please take a look at this GitHub repository, as I think it could help in your case:

https://github.com/felixalbani/future-of-data-santiago-e1-spark-nifi
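
Separately from that repository, here is a minimal PySpark sketch of the Kafka-to-Structured-Streaming side, for the comma-separated lines described in the question. It is not taken from the repository above. The broker address `localhost:9092`, the topic name `items`, and the helper names `parse_line`, `build_item_counts`, and `main` are all assumptions made for illustration. It also assumes NiFi publishes one text line per Kafka message, and that the Spark Kafka connector (the `spark-sql-kafka-0-10` artifact matching your Spark and Scala versions) is available to spark-submit.

```python
from typing import List


def parse_line(line: str) -> List[str]:
    """Plain-Python mirror of the Spark expression below.

    Handy for checking sample records locally: 'A, C, E' -> ['A', 'C', 'E'].
    """
    return [item.strip() for item in line.split(",") if item.strip()]


def build_item_counts(spark, bootstrap_servers="localhost:9092", topic="items"):
    """Return a streaming DataFrame with a running count per item.

    bootstrap_servers and topic are assumed values; replace them with yours.
    pyspark is imported here so parse_line can be used without Spark installed.
    """
    from pyspark.sql.functions import col, explode, split, trim

    # Each Kafka message value is expected to hold one line, e.g. "A, B, C, D".
    lines = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .option("startingOffsets", "earliest")
        .load()
        .select(col("value").cast("string").alias("line"))
    )

    # Split every line on commas, emit one row per item, and drop blanks.
    items = (
        lines.select(explode(split(col("line"), ",")).alias("item"))
        .select(trim(col("item")).alias("item"))
        .where(col("item") != "")
    )

    return items.groupBy("item").count()


def main():
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ItemCounts").getOrCreate()
    counts = build_item_counts(spark)

    # "complete" mode reprints the full aggregate table on every trigger.
    query = counts.writeStream.outputMode("complete").format("console").start()
    query.awaitTermination()
```

To try it, add a call to `main()` at the bottom of the script and launch it with spark-submit. Counting items is only a placeholder for the real analysis; the same `items` DataFrame could feed any other Structured Streaming query.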

HTH