Streaming data from Kafka to HDFS: All relevant solutions

To understand what it takes to work with various streaming tools, I have set up this question as an umbrella for an overview of the ways to stream data.

 

For consistency, I picked a simple reference use case: messages arrive from Kafka and need to be written to HDFS.

 

Source topic name: input

Output folder name on HDFS: output

 

The core use case is picking up a bit of data from Kafka and putting it on HDFS.

The bonus use case is adding a new field C, defined as field A divided by field B (both of which occur in the data), ideally using the schema to do so.
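To make the bonus use case concrete, here is a minimal, tool-independent sketch of the per-message transformation. It assumes messages are JSON objects and that A and B are numeric; the `SCHEMA` dictionary, the `add_field_c` function name, and the choice to emit `None` when B is zero are all hypothetical and not part of the question. Each tool-specific answer would express the same logic in its own way.

```python
import json

# Hypothetical schema: the field names A and B come from the question,
# the numeric types are an assumption.
SCHEMA = {"A": float, "B": float}


def add_field_c(raw_message: str) -> dict:
    """Parse one JSON message, validate it against SCHEMA, and add C = A / B."""
    record = json.loads(raw_message)
    for name, field_type in SCHEMA.items():
        if name not in record:
            raise ValueError(f"missing field {name!r}")
        # Coerce each input field to the type the schema declares.
        record[name] = field_type(record[name])
    # Assumption: a zero divisor yields None instead of failing the stream.
    record["C"] = record["A"] / record["B"] if record["B"] != 0 else None
    return record
```

For example, `add_field_c('{"A": 6, "B": 3}')` returns a record in which C is `2.0`.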

 

Subquestions:

Streaming data from Kafka to HDFS with NiFi

Streaming data from Kafka to HDFS with Flink

Streaming data from Kafka to HDFS with Flink SQL

Streaming data from Kafka to HDFS with Spark Interactive

Streaming data from Kafka to HDFS with a Spark Jar

Streaming data from Kafka to HDFS with Kafka Connect

 

If a substep is well documented elsewhere, feel free to refer to it, but please make sure the end-to-end process is documented, including building and deployment.

 

If you notice that this question is not well specified, or that something is blocking one of the subquestions from being answered, please post a comment.

- Dennis Jaheruddin

If this answer helped, please mark it as 'solved' and/or if it is valuable for future readers please apply 'kudos'.
1 REPLY

The subquestions can be found here; please note that these may or may not have been answered yet:

 

Subquestions:

Streaming data from Kafka to HDFS with NiFi

Streaming data from Kafka to HDFS with Flink Jar

Streaming data from Kafka to HDFS with Flink SQL

Streaming data from Kafka to HDFS with Spark Interactive

Streaming data from Kafka to HDFS with a Spark Jar

 
Also note that each question asks for one example, even though there may be multiple language choices and other decisions to be made.
 

- Dennis Jaheruddin
