Moving data from Kafka to HDFS

New Contributor

I need help with two questions, please:

1. How can I move data from Apache Kafka to HDFS?

2. How can I move data from Apache Kafka to Spark Streaming?

Thank you.

1 ACCEPTED SOLUTION

Expert Contributor

You can use Flume or NiFi to move data from Kafka to HDFS:

a. Using Flume:

Kafka source -> channel -> HDFS sink

b. Using NiFi:

ConsumeKafka processor --> PutHDFS processor (ConsumeKafka reads from a topic; PublishKafka writes to one, so it is not the right processor here)

 

To integrate Kafka with Spark Streaming, you need to build a Spark Streaming job. Refer to the documentation below for more details:

https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_spark-component-guide/content/using-spark-s...
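
As a rough idea of what such a job can look like, here is a minimal PySpark sketch that reads a Kafka topic and writes the records to HDFS as Parquet. It uses Spark Structured Streaming, which may not be the same API the linked guide describes. The function names, broker address, topic, and paths are placeholders, and the `spark-sql-kafka-0-10` connector matching your Spark version is assumed to be on the classpath.

```python
def kafka_source_options(bootstrap_servers, topic):
    """Options for Spark's built-in "kafka" streaming source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Read the topic from the beginning on the first run.
        "startingOffsets": "earliest",
    }


def start_kafka_to_hdfs_stream(spark, bootstrap_servers, topic,
                               output_path, checkpoint_path):
    """Start a streaming query that copies Kafka records to Parquet files.

    `spark` is an existing SparkSession; the started query is returned.
    """
    records = (
        spark.readStream
        .format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
    )
    # Kafka delivers key and value as binary, so cast them to strings.
    decoded = records.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
        "timestamp",
    )
    return (
        decoded.writeStream
        .format("parquet")
        .option("path", output_path)
        # The checkpoint directory lets the query resume after a restart.
        .option("checkpointLocation", checkpoint_path)
        .outputMode("append")
        .start()
    )
```

To run it, create a session with `SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()`, call `start_kafka_to_hdfs_stream(...)` with your own values, and then call `awaitTermination()` on the returned query.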
