New Contributor
Posts: 1
Registered: ‎05-21-2018

Streaming Log File into HBase using Flume


I am trying to stream a log file into Apache HBase. The file looks like this:

::1 - - [04/Apr/2018:11:00:45 +0530] "GET /rescuewagon/auth.php HTTP/1.1" 200 4596 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"

I am trying to split each log line into separate columns in HBase. My .conf is:

agent1.sources = source1
agent1.channels = channel1
agent1.sinks = sink2


agent1.sources.source1.type = spooldir
agent1.sources.source1.spoolDir = /home/Test


agent1.sinks.sink2.type = hbase
agent1.sinks.sink2.table = Test1
agent1.sinks.sink2.columnFamily = Log_Details
agent1.sinks.sink2.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
agent1.sinks.sink2.serializer.regex = ([^ ]*),([^ ]*),([^ ]*), (-|\\[[^\\]]*\\]), ([^ \"]*|\"[^\"]*\"), (-|[0-9]*), (-|[0-9]*),
(?: ([^\"]*|\"[^\"]*\"), ([^ \"]*|\"[^\"]*\"))?
agent1.sinks.sink2.serializer.colNames = Key,Host,Identity,User,Time,Method,Path,Protocol,Status


agent1.channels.channel1.type = memory
agent1.channels.channel1.capacity = 100000
agent1.channels.channel1.transactionCapacity = 100000


agent1.sources.source1.channels = channel1
agent1.sinks.sink2.channel = channel1

When the Flume agent starts, I don't see any errors, but I also don't see any entries in HBase. When I tried without the regex serializer, the files were pushed into HBase. I guess I am making a mistake in the regex, but I can't tell what it is. It would be great if someone could help me get this working. Thanks in advance.
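
One way to narrow this down is to test the regex against the sample line outside Flume. The sketch below uses Python instead of Java, since the constructs in this pattern behave the same way in both engines. It rests on several assumptions: that the two wrapped regex lines in the .conf are really a single line, that the properties file turns `\\[` into `\[` and `\"` into `"`, and that the serializer needs the whole line to match and quietly skips events that do not (which would explain the missing rows without an error). `CANDIDATE` is a hypothetical space-separated pattern written only for illustration; it is not taken from the Flume documentation.

```python
import re

# Sample access-log line from the post.
LINE = ('::1 - - [04/Apr/2018:11:00:45 +0530] '
        '"GET /rescuewagon/auth.php HTTP/1.1" 200 4596 "-" '
        '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
        '(KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"')

# Regex from the .conf, with the two wrapped lines joined and the
# properties-file escapes resolved (both are assumptions).
POSTED = re.compile(
    r'([^ ]*),([^ ]*),([^ ]*), (-|\[[^\]]*\]), ([^ "]*|"[^"]*"), '
    r'(-|[0-9]*), (-|[0-9]*),(?: ([^"]*|"[^"]*"), ([^ "]*|"[^"]*"))?'
)

# Hypothetical alternative: fields separated by spaces, as in the log line.
# Groups: host, identity, user, time, method, path, protocol, status, size,
# and optionally referer and user agent.
CANDIDATE = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) \[([^\]]*)\] "([^ ]*) ([^ ]*) ([^"]*)" '
    r'(-|[0-9]*) (-|[0-9]*)(?: "([^"]*)" "([^"]*)")?'
)


def split_fields(pattern, line):
    """Return the captured groups if the whole line matches, else None."""
    m = pattern.fullmatch(line)
    return list(m.groups()) if m else None


posted_fields = split_fields(POSTED, LINE)
candidate_fields = split_fields(CANDIDATE, LINE)

print("posted regex matches:", posted_fields is not None)
print("candidate regex matches:", candidate_fields is not None)
if candidate_fields:
    for index, value in enumerate(candidate_fields, start=1):
        print(index, value)
```

The posted pattern expects commas between fields, while the log line separates them with spaces, so it cannot match this line. If a space-separated pattern is used instead, the number of capturing groups would presumably have to line up with the number of names in `serializer.colNames`; the candidate above has 11 groups, while the .conf lists 9 names.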
