
import data from kafka topics to HDFS


Hello, I am new to Hortonworks. Can you please tell me how to import data from Kafka topics into HDFS? Thanks.

P.S. I don't want to use NiFi; I want to work with Spark Streaming.


Re: import data from kafka topics to HDFS

Mentor

@maha Rm

Can you describe your Kafka cluster? Is it a standalone cluster, and would you like to sink the data to HDFS?
Here is a link to the Confluent HDFS connector.

Re: import data from kafka topics to HDFS

@Geoffrey Shelton Okot

I am working in the Hortonworks sandbox (Ambari 2.6). I created a topic, a producer, and a consumer, and the consumer is receiving the data. I now want to send this data from the Kafka consumer to HDFS.


Re: import data from kafka topics to HDFS

Super Collaborator

I would suggest using the Kafka Connect HDFS connector rather than Spark Streaming, as it is more fault tolerant.

Kafka Connect is built into the base Kafka libraries, but you need to compile the HDFS connector separately and add it to Connect's classpath. Build it from https://github.com/confluentinc/kafka-connect-hdfs, and use a tagged branch rather than master: the tagged releases depend on publicly available libraries, whereas master depends on SNAPSHOT builds that require you to compile Kafka from source.
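
Once the connector jar is on Connect's classpath, registering a sink could look roughly like the sketch below. It only builds the JSON payload and, optionally, posts it to the Connect REST API. The topic name, the HDFS URL, and the Connect address (`http://localhost:8083`) are assumptions for illustration, and the helper names `build_hdfs_sink_config` and `register_connector` are hypothetical; adjust them to your sandbox.

```python
import json
import urllib.request


def build_hdfs_sink_config(topic, hdfs_url, flush_size=1000, tasks_max=1):
    """Build the payload for registering an HDFS sink connector.

    The keys under "config" are HDFS connector settings; the values
    passed in (topic, hdfs_url, flush_size) are placeholders.
    """
    return {
        "name": f"hdfs-sink-{topic}",
        "config": {
            "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
            "tasks.max": str(tasks_max),
            "topics": topic,
            "hdfs.url": hdfs_url,
            # Number of records written to a file before it is committed.
            "flush.size": str(flush_size),
        },
    }


def register_connector(payload, connect_url="http://localhost:8083"):
    """POST the payload to a running Kafka Connect worker.

    Assumes a Connect worker is listening at connect_url.
    """
    req = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed values: replace with your own topic and NameNode address.
    payload = build_hdfs_sink_config("test-topic", "hdfs://sandbox.hortonworks.com:8020")
    print(json.dumps(payload, indent=2))
    # register_connector(payload)  # uncomment once Connect is running
```

The call to `register_connector` is left commented out so the script can be run just to inspect the payload before anything is sent to the cluster.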
