04-11-2018 08:13 AM - last edited on 04-12-2018 06:59 AM by cjervis
Hi,
I have two agents running on different servers. I've configured them to send text files from one machine to the other over Avro. When a file arrives in HDFS, I see only a tiny .tmp file. If I stop the second agent, which writes the file to HDFS, the file is renamed and I then see the entire file.
Could you help me please?
I'm including both configuration files and the log:
First conf:
Agent1.sources = spooldir-source
Agent1.channels = file-channel
Agent1.sinks = avro-sink
# Describe/configure Source
Agent1.sources.spooldir-source.type = spooldir
Agent1.sources.spooldir-source.spoolDir = /path1
Agent1.sources.spooldir-source.inputCharset = ISO-8859-1
Agent1.sources.spooldir-source.fileSuffix = .OK
Agent1.sources.spooldir-source.decodeErrorPolicy = IGNORE
# Describe the sink
Agent1.sinks.avro-sink.type = avro
Agent1.sinks.avro-sink.hostname = 99.99.99.99
Agent1.sinks.avro-sink.port = 58000
# Use a channel that buffers events in a file
#Agent1.channels.file-channel.type = memory
Agent1.channels.file-channel.type = file
Agent1.channels.file-channel.capacity = 10000
Agent1.channels.file-channel.transactionCapacity = 10000
Agent1.channels.file-channel.write-timeout = 60
Agent1.channels.file-channel.checkpointDir = /pathcp
Agent1.channels.file-channel.dataDirs = /pathdd
# Bind the source and sink to the channel
Agent1.sources.spooldir-source.channels = file-channel
Agent1.sinks.avro-sink.channel = file-channel
Conf 2:
Agent2.sources = avro-source
Agent2.channels = file-channel
Agent2.sinks = hdfs-sink
# Describe/configure Source
Agent2.sources.avro-source.type = avro
Agent2.sources.avro-source.bind = 99.99.99.99
Agent2.sources.avro-source.port = 58000
# Describe the sink
Agent2.sinks.hdfs-sink.type = hdfs
Agent2.sinks.hdfs-sink.hdfs.path = /pathHdfs
Agent2.sinks.hdfs-sink.hdfs.rollInterval = 0
Agent2.sinks.hdfs-sink.hdfs.rollCount = 0
Agent2.sinks.hdfs-sink.hdfs.fileType = DataStream
Agent2.sinks.hdfs-sink.hdfs.rollSize = 268435456
Agent2.sinks.hdfs-sink.hdfs.batchSize = 10000
# Use a channel that buffers events in a file
Agent2.channels.file-channel.type = file
Agent2.channels.file-channel.checkpointInterval = 300000
Agent2.channels.file-channel.keep-alive = 1
Agent2.channels.file-channel.checkpointOnClose = true
Agent2.channels.file-channel.checkpointDir = /pathcd
Agent2.channels.file-channel.dataDirs = /pathdd
# Bind the source and sink to the channel
Agent2.sources.avro-source.channels = file-channel
Agent2.sinks.hdfs-sink.channel = file-channel
Logs:
18/04/11 11:56:41 INFO hdfs.BucketWriter: Creating /pathHdfs/FlumeData.1523458601607.tmp
^C18/04/11 12:00:15 INFO lifecycle.LifecycleSupervisor: Stopping lifecycle supervisor 10
18/04/11 12:00:15 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider stopping
18/04/11 12:00:15 INFO hdfs.HDFSEventSink: Closing /pathHdfs/FlumeData
18/04/11 12:00:15 INFO hdfs.BucketWriter: Closing /pathHdfs/FlumeData.1523458601607.tmp
18/04/11 12:00:15 INFO hdfs.BucketWriter: Renaming /pathHdfs/FlumeData.1523458601607.tmp to /pathHdfs/FlumeData.1523458601607
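One observation on the Agent2 sink settings above: with hdfs.rollInterval and hdfs.rollCount both set to 0, the only remaining roll trigger is hdfs.rollSize (268435456 bytes, i.e. 256 MiB), so a smaller file would stay open as .tmp until the sink is closed, which seems consistent with the log. Below is a minimal sketch of one possible adjustment, assuming it is acceptable to close a file after a period with no writes. hdfs.idleTimeout is a documented HDFS sink property, but the 60-second value is only an illustrative assumption, not a recommendation:

```properties
# Assumption: close (and rename) a file that has received no events for 60 seconds.
# The default value of 0 disables automatic closing of idle files.
Agent2.sinks.hdfs-sink.hdfs.idleTimeout = 60
```

Setting hdfs.rollInterval to a non-zero number of seconds would be another option, at the cost of rolling on a timer even while events are still arriving.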
04-12-2018 09:22 AM