Moving data from Kafka to HDFS

New Contributor

I need help with two questions, please:

1. How can I transfer data from Apache Kafka to HDFS?

2. How can I transfer data from Apache Kafka to Spark Streaming?

Thank you.

1 ACCEPTED SOLUTION

Contributor

You can use Flume or NiFi to move data from Kafka to HDFS:

 

a. Using Flume:

Kafka Source -> Flume channel -> HDFS Sink

b. Using NiFi:

Configure a ConsumeKafka processor --> PutHDFS processor (ConsumeKafka reads from the topic; PublishKafka writes to Kafka and is not what you need here)
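
For the Flume route, an agent configuration could look roughly like the sketch below. It assumes Flume 1.7 or later (older releases, including the Flume bundled with some HDP versions, use different Kafka Source property names). The agent name `a1`, the component names, the broker address, the topic, and the HDFS path are all placeholders, not values from this thread.

```properties
# Hypothetical agent "a1": one Kafka source, one memory channel, one HDFS sink
a1.sources = kafka-src
a1.channels = mem-ch
a1.sinks = hdfs-sink

# Kafka Source: consumes from the topic (broker and topic are placeholders)
a1.sources.kafka-src.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.kafka-src.kafka.bootstrap.servers = broker1:9092
a1.sources.kafka-src.kafka.topics = events
a1.sources.kafka-src.kafka.consumer.group.id = flume-hdfs
a1.sources.kafka-src.channels = mem-ch

# Memory channel: simple, but buffered events are lost if the agent dies
a1.channels.mem-ch.type = memory
a1.channels.mem-ch.capacity = 10000
a1.channels.mem-ch.transactionCapacity = 1000

# HDFS Sink: writes plain text files, rolling a new file every 5 minutes
a1.sinks.hdfs-sink.type = hdfs
a1.sinks.hdfs-sink.hdfs.path = hdfs://namenode:8020/data/events/%Y-%m-%d
a1.sinks.hdfs-sink.hdfs.fileType = DataStream
a1.sinks.hdfs-sink.hdfs.writeFormat = Text
a1.sinks.hdfs-sink.hdfs.rollInterval = 300
a1.sinks.hdfs-sink.hdfs.rollSize = 0
a1.sinks.hdfs-sink.hdfs.rollCount = 0
a1.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
a1.sinks.hdfs-sink.channel = mem-ch
```

If you cannot afford to lose buffered events, a file channel is the usual alternative to the memory channel.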

 

To integrate Kafka with Spark Streaming, you need to build a Spark Streaming job. Refer to the documentation below for more details:

https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_spark-component-guide/content/using-spark-s...
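
As a rough illustration of such a job, here is a minimal PySpark sketch. Note that it uses Structured Streaming (Spark 2.3 or later with the spark-sql-kafka-0-10 package available), which may differ from the API described in the linked guide. The function names, broker address, topic, and HDFS paths are hypothetical.

```python
"""Sketch: read a Kafka topic with Spark Structured Streaming and write it to HDFS."""


def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Build the option map for Spark's built-in "kafka" streaming source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def run(bootstrap_servers, topic, output_path, checkpoint_path):
    """Start the streaming query and block until it stops (hypothetical helper)."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("kafka-to-hdfs").getOrCreate()

    # Each Kafka record arrives as a row with binary `key` and `value` columns,
    # so cast them to strings before writing.
    records = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
        .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
    )

    # The file sink requires a checkpoint location to track progress.
    query = (
        records.writeStream.format("parquet")
        .option("path", output_path)
        .option("checkpointLocation", checkpoint_path)
        .outputMode("append")
        .start()
    )
    query.awaitTermination()
```

You would call something like `run("broker1:9092", "events", "hdfs:///data/events", "hdfs:///checkpoints/events")` from a script submitted with `spark-submit`, passing the Kafka connector that matches your Spark and Scala versions via `--packages`. Because this job writes to an HDFS path, it also covers the first question as a third option alongside Flume and NiFi.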

