Support Questions

Find answers, ask questions, and share your expertise

Flume without agents on application server logs

Rising Star

Hello all,

I want to ingest logs from our application servers into HDFS using Flume 1.5. Do I need to install a Flume agent (client) on these application servers? How can I pull these application logs without installing a Flume agent? Note that these servers are not part of the Hadoop cluster. Can you please help?

Thanks

JN

1 ACCEPTED SOLUTION

Super Collaborator

Hi @JT Ng,

Yes, that is possible with the NetCat TCP source, without installing an agent on the application server. However, you need to tail the log and pipe it to the listener from the server whose logs you want to feed.

That means starting the log push process on the source server (the application server) with:

tail -f <application_log_file>.log | nc <flume_agent_host> <configured_netcat_source_port>

Before you trigger this, make sure you have started the Flume agent in the HDP cluster (or wherever the Flume agent can be installed) with a NetCat source configured:

a1.sources = r1
a1.channels = c1
# the channel needs a type; a memory channel is the simplest choice
a1.channels.c1.type = memory
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 6666
a1.sources.r1.channels = c1


On the other side, you can configure an HDFS sink to write these events into the HDFS file system with the following configuration:

a1.channels = c1
a1.sinks = k1
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
# required here: the netcat source adds no timestamp header,
# and the path above uses timestamp escape sequences
a1.sinks.k1.hdfs.useLocalTimeStamp = true
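Both fragments above belong to the same agent configuration file. Assuming they are saved together as, say, /etc/flume/conf/app-logs.conf (a hypothetical path), the agent can be started with the standard flume-ng launcher:

```shell
# Start agent "a1" with the combined source/channel/sink configuration.
# The paths and file names here are illustrative assumptions.
flume-ng agent \
  --conf /etc/flume/conf \
  --conf-file /etc/flume/conf/app-logs.conf \
  --name a1 \
  -Dflume.root.logger=INFO,console
```

Once the agent is up, a quick smoke test from the application server is `echo hello | nc <flume_agent_host> 6666`; the NetCat source acknowledges each received line with `OK` by default.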


NB: Make sure you handle the tail and nc processes when the server stops or your application shuts down. You can manage the tail process with a proper shell wrapper that includes restartability, e.g. by running it as a service on the Linux host.
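As a rough sketch of such a wrapper (the function name, host, port, and paths are illustrative assumptions, not values from this thread), the pipeline can be wrapped in a retry loop; using `tail -F` instead of `-f` additionally survives log rotation:

```shell
#!/bin/sh
# Hypothetical restart wrapper for the tail | nc pipeline.
# Usage: ship_logs <log_file> <flume_host> <flume_port>
ship_logs() {
  log_file="$1"; flume_host="$2"; flume_port="$3"
  while true; do
    # -F follows the file across rotation/truncation
    tail -F "$log_file" | nc "$flume_host" "$flume_port"
    # nc exits when the connection drops (e.g. agent restart); retry
    echo "nc exited; reconnecting in 5s..." >&2
    sleep 5
  done
}

# Example invocation (runs until killed):
# ship_logs /var/log/myapp/app.log flume01.example.com 6666
```

Running this under a service manager on the host (so it restarts on boot and on failure) covers the shutdown cases mentioned above.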

Hope this helps!


2 REPLIES 2


Rising Star

Thank you, bkosaraju!