

New Contributor

Hi, we are using Cloudera 5.4 and Flafka to stream data from a source database into Hadoop.


Basically, the data is ingested into Kafka in plain String format, and the source DB table name is used as the key for each message.


We are using the Flume HDFS sink to store the messages in Hadoop.


Our configuration is similar to what's documented here (Kafka channel), except we want to store the messages from each DB table in its own file. Since each message in Kafka is keyed by the DB table name, I'm hoping I can do something like the following:


tier1.sinks.sink1.hdfs.path = /tmp/kafka/%{topic}/%{messageKey}.csv

Can anybody please let me know if there is a way to store Kafka messages in HDFS using the message key as the file name?




Re: Flafka

Cloudera Employee

Hi Leon,



When the Kafka source reads from Kafka, it looks for the message key and sets it in the "key" header.


So you should be able to do this:

tier1.sinks.sink1.hdfs.path = /tmp/kafka/%{topic}/%{key}.csv



But I haven't tested this.
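
For context, here is a fuller sketch of what that topology might look like, equally untested. It assumes a Kafka source feeding a memory channel and an HDFS sink; the ZooKeeper host, topic name, group id, and channel sizes are placeholders that are not from the original post. One thing to keep in mind is that hdfs.path names a directory, so %{key}.csv in the path would end up as a directory name. The sketch instead puts the key into hdfs.filePrefix, on the assumption that the prefix accepts the same header escapes as the path.

```properties
# Untested sketch. Host names, topic, group id, and sizes are placeholders.
tier1.sources  = source1
tier1.channels = channel1
tier1.sinks    = sink1

# Kafka source: sets the "topic" and "key" headers on each event
tier1.sources.source1.type = org.apache.flume.source.kafka.KafkaSource
tier1.sources.source1.zookeeperConnect = zk01.example.com:2181
tier1.sources.source1.topic = db-changes
tier1.sources.source1.groupId = flume
tier1.sources.source1.channels = channel1

# Plain memory channel, so the headers survive until the sink
tier1.channels.channel1.type = memory
tier1.channels.channel1.capacity = 10000
tier1.channels.channel1.transactionCapacity = 1000

# HDFS sink: one directory per topic, file names prefixed with the message key
tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.channel = channel1
tier1.sinks.sink1.hdfs.path = /tmp/kafka/%{topic}
# Assumption: filePrefix is escaped with event headers like hdfs.path
tier1.sinks.sink1.hdfs.filePrefix = %{key}
tier1.sinks.sink1.hdfs.fileSuffix = .csv
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.hdfs.writeFormat = Text
```

With this layout the sink still adds its own counter to each file name and rolls files over time, so expect several files per table key rather than exactly one.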


If you are using the Kafka channel and going directly to HDFS, then you're out of luck, as we don't really use the headers or message keys for text messages, or really for any messages that are not Flume Avro events.