
Flume Kafka HDFS Sink Empty Lines

Hi,

I am using Flume to sink data from a Kafka topic to HDFS, with KafkaChannel as the channel. At first, Flume inserted an empty line after each record in the HDFS file, and when I queried the Hive table pointing to this directory, I saw a NULL row after each record. I added serializer.appendNewline=false to my Flume config, which eliminated the empty lines between records. However, in every HDFS file Flume creates, the first line is still empty, and the Hive query shows a NULL row first, followed by the records. Is there a property I can add to my Flume config to eliminate these null lines? Please suggest. Below is my Flume config.

 

ftest.channels = ctest
ftest.sinks = stest

ftest.channels.ctest.type = org.apache.flume.channel.kafka.KafkaChannel
ftest.channels.ctest.brokerList = broker1-host:9092,broker2-host:9092,broker3-host:9092
ftest.channels.ctest.topic = ftest_pb
ftest.channels.ctest.groupId = ftest_pb_flume
ftest.channels.ctest.zookeeperConnect = host1:2181,host2:2181,host3:2181
ftest.channels.ctest.parseAsFlumeEvent = false
ftest.channels.ctest.kafka.consumer.session.timeout.ms=120000
ftest.channels.ctest.kafka.consumer.request.timeout.ms=120002
ftest.channels.ctest.kafka.consumer.linger.ms=5000

ftest.sinks.stest.type=hdfs
ftest.sinks.stest.hdfs.path=/data/incoming/ftest_pb
ftest.sinks.stest.hdfs.filePrefix=ft
ftest.sinks.stest.hdfs.useLocalTimeStamp = true
ftest.sinks.stest.hdfs.rollSize=1024000000
ftest.sinks.stest.hdfs.batchSize=10000
ftest.sinks.stest.hdfs.rollCount=0
ftest.sinks.stest.hdfs.minBlockReplicas=1
ftest.sinks.stest.hdfs.txnEventMax=10000
ftest.sinks.stest.hdfs.callTimeout=1000000
ftest.sinks.stest.channel=ctest
ftest.sinks.stest.serializer=text
ftest.sinks.stest.serializer.appendNewline=false
ftest.sinks.stest.hdfs.kerberosPrincipal = $KERBEROS_PRINCIPAL
ftest.sinks.stest.hdfs.kerberosKeytab = $KERBEROS_KEYTAB
ftest.sinks.stest.hdfs.fileType=DataStream
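
In case it helps with diagnosis, here is a minimal sketch for checking where the blank lines fall in one of the output files, independent of Hive. It assumes the file has first been copied to local disk (for example with `hdfs dfs -get`); the function name `blank_line_numbers` and the path in the usage note are hypothetical, and this only reports the symptom, it does not fix it.

```python
def blank_line_numbers(path):
    """Return the 1-based numbers of lines that are empty or whitespace-only.

    Assumption: `path` is a local copy of one Flume output file,
    stored as plain text (hdfs.fileType=DataStream).
    """
    blanks = []
    # errors="replace" keeps the scan going even if a record is not valid UTF-8.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if line.strip() == "":
                blanks.append(number)
    return blanks
```

Calling `blank_line_numbers("ft.local-copy")` on a file with the problem described above should list line 1 only, while a file written before appendNewline was changed should list every second line.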
