Created on 04-03-2016 11:04 PM - edited 09-16-2022 03:12 AM
I'm trying to load data from the local filesystem into HDFS using a spooldir source, but I'm getting a "process failed" error.
Here is my error:
process failed
org.apache.flume.ChannelException: Take list for MemoryTransaction, capacity 100 full, consider committing more frequently, increasing capacity, or increasing thread count
    at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doTake(MemoryChannel.java:96)
    at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
    at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:374)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:745)
And here are my config files.
Local agent config file
agent.sources.localsource.type = spooldir
#agent.sources.localsource.shell = /bin/bash -c
agent.sources.localsource.spoolDir = /home/dwh/teja/Flumedata/
agent.sources.localsource.fileHeader = true
# The channel can be defined as follows.
agent.sources.localsource.channels = memoryChannel
# Each sink's type must be defined
agent.sinks.avro_Sink.type = avro
agent.sinks.avro_Sink.hostname = 192.168.4.444
agent.sinks.avro_Sink.port = 8021
agent.sinks.avro_Sink.avro.batchSize = 1000
agent.sinks.avro_Sink.avro.rollCount = 0
agent.sinks.avro_Sink.avro.rollSize = 1000000
agent.sinks.avro_Sink.avro.rollInterval = 0
agent.sinks.avro_Sink.channel = memoryChannel
# Each channel's type is defined.
agent.channels.memoryChannel.type = memory
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 10000
agent.channels.memoryChannel.transactionCapacity = 10000
Remote config file
# Please paste flume.conf here. Example:
# Sources, channels, and sinks are defined per
# agent name, in this case 'tier1'.
tier1.sources = source1
tier1.channels = channel1
tier1.sinks = sink1
tier1.sources.source1.type = avro
tier1.sources.source1.bind = 192.168.4.444
tier1.sources.source1.port = 8021
tier1.sources.source1.channels = channel1
tier1.channels.channel1.type = memory
tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.channel = channel1
tier1.sinks.sink1.hdfs.path = hdfs://192.168.4.444:8020/user/hadoop/flumelogs/
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.hdfs.writeFormat = Text
tier1.sinks.sink1.hdfs.batchSize = 1000
tier1.sinks.sink1.hdfs.rollCount = 0
tier1.sinks.sink1.hdfs.rollSize = 1000000
tier1.sinks.sink1.hdfs.rollInterval = 0
tier1.channels.channel1.capacity = 10000
tier1.channels.channel1.transactioncapacity=10000
Please help.
Created 04-04-2016 03:30 AM
Hi, I got it working by changing the roll size and batch size on the HDFS sink. It now works fine.
rollSize = 100000 and batchSize = 100
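A likely explanation for why the smaller batch size helps, offered as a reading of the posted config rather than a confirmed diagnosis: the error reports a take list capacity of 100, which matches the memory channel's default transactionCapacity. In the remote config the key is written transactioncapacity (lowercase c). Flume property names are case-sensitive, so that line is probably ignored and the channel falls back to 100, which is smaller than hdfs.batchSize = 1000. Setting batchSize to 100 fits within that default. If the larger batch is wanted instead, a minimal sketch reusing the names from the remote config (the value 1000 is just an example) might look like this:

```properties
# Remote agent (tier1): size the channel so one sink transaction can hold a full batch.
tier1.channels.channel1.type = memory
tier1.channels.channel1.capacity = 10000
# Note the camelCase key. The lowercase spelling is assumed to be ignored,
# which would leave the default of 100 in effect.
tier1.channels.channel1.transactionCapacity = 1000

# hdfs.batchSize should not exceed the channel's transactionCapacity.
tier1.sinks.sink1.hdfs.batchSize = 1000
```

On the local agent, the Avro sink's batch setting is batch-size. The avro.batchSize and avro.roll* lines are not Avro sink properties, so they presumably have no effect there.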