
File size transfer problem

New Contributor

I have a problem with the size of the files sent by Flume to Kafka. I do not know where the problem lies, but if a file is larger than some threshold x, Flume partitions the file into pieces, which breaks the JSON format I want in the output. Can anyone help me find a solution, for example changing the size reserved for the transfer, or forcing the file to stay intact?


Flume configuration:


# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /root/Bureau/V1/outputCV
a1.sources.r1.fileHeader = true
a1.sources.r1.interceptors = timestampInterceptor
a1.sources.r1.interceptors.timestampInterceptor.type = timestamp
a1.sources.r1.handler = org.apache.flume.source.http.JSONHandler

# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = flumekafka
a1.sinks.k1.kafka.bootstrap.servers = quickstart.cloudera:9090,quickstart.cloudera:9091,quickstart.cloudera:9092
a1.sinks.k1.kafka.flumeBatchSize = 200000
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.compression.type = snappy

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000000
a1.channels.c1.transactionCapacity = 100000

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
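One thing worth checking (an assumption on my part, not confirmed from logs): the spooldir source uses the LINE deserializer by default, which truncates each event at deserializer.maxLineLength (2048 bytes by default), so a large single-line JSON file would be broken into several events. A sketch of two possible adjustments, with illustrative sizes:

```properties
# Option 1 (assumption): raise the per-event line limit so a large
# single-line JSON document fits in one Flume event.
a1.sources.r1.deserializer = LINE
a1.sources.r1.deserializer.maxLineLength = 10000000

# Option 2 (assumption): emit each spooled file as a single event,
# using the BlobDeserializer shipped with the morphline Solr sink.
# a1.sources.r1.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
# a1.sources.r1.deserializer.maxBlobLength = 10000000
```

Either way the resulting events get bigger, so the channel capacity and the Kafka producer limits have to keep up.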


Kafka configuration:


Default settings; no changes made.
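Note that even with default Kafka settings, a broker rejects messages larger than message.max.bytes (about 1 MB by default) and the producer caps requests at max.request.size, so whole-file events above those limits would still fail. A sketch of the settings involved (the 10 MB value is illustrative, not a recommendation):

```properties
# Broker side (server.properties) - assumes events up to ~10 MB:
message.max.bytes=10485760
replica.fetch.max.bytes=10485760

# Producer side, passed through the Flume Kafka sink:
a1.sinks.k1.kafka.producer.max.request.size = 10485760
```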