Flume without agents on application server logs

Hello all,

I want to ingest application server logs into HDFS using Flume version 1.5. Do I need to install a Flume agent (client) on these application servers? How can I pull the application logs without installing a Flume agent? Note that these servers are not part of the Hadoop cluster. Can you please help?

Thanks

JN

Accepted Solutions

Re: Flume without agents on application server logs

Hi @JT Ng,

Yes, that is possible with the "Netcat TCP Source", without installing an agent on the application server. However, you need to tail the log on the server that produces it and pipe the output to the Flume listener.

That means starting the log push process on the source (application) server with:

tail -f <application_log_file>.log | nc <flume_agent_host> <configured_netcat_source_port>

Before you trigger this, make sure you have started the Flume agent in the HDP cluster (or wherever the Flume agent can be installed), configured with a netcat source:

# Agent a1: netcat source r1 listens on TCP port 6666 and writes to channel c1
a1.sources = r1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 6666
a1.sources.r1.channels = c1

Ref

On the other side, you can configure the HDFS sink to write the events into HDFS with the following configuration:

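# HDFS sink k1 drains channel c1 into time-bucketed directories
a1.channels = c1
a1.sinks = k1
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.filePrefix = events-
# Round the event timestamp down to the nearest 10 minutes when resolving the path
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute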

Ref
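
The two fragments above do not define the channel itself, and the time escapes in hdfs.path need a timestamp on each event, which the netcat source does not add on its own. A minimal sketch of a complete agent file, written out by a small shell script, might look like the following. The file name netcat-hdfs.conf, the memory channel with its capacities, and the hdfs.useLocalTimeStamp setting are assumptions added for illustration and are not part of the original answer.

```shell
#!/bin/sh
# Write a single-agent Flume config: netcat source -> memory channel -> HDFS sink.
# CONF_FILE is a hypothetical location; override it through the environment.
CONF_FILE="${CONF_FILE:-./netcat-hdfs.conf}"

cat > "$CONF_FILE" <<'EOF'
# Name the components of agent a1
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Netcat source: accepts newline-terminated text on TCP port 6666
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 6666
a1.sources.r1.channels = c1

# Assumption: in-memory channel (fast, but buffered events are lost if the agent dies)
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# HDFS sink, as in the answer above
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
# Assumption: events carry no timestamp header, so resolve the time
# escapes in hdfs.path from the agent's local clock instead
a1.sinks.k1.hdfs.useLocalTimeStamp = true
EOF

echo "Wrote $CONF_FILE"
echo "Start the agent with:"
echo "  flume-ng agent --conf conf --conf-file $CONF_FILE --name a1"
```

If durability matters more than throughput, a file channel could be swapped in for the memory channel; the rest of the file would stay the same.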

NB: Make sure you handle the tail and nc processes when your server stops or your application shuts down completely. You can manage the tail process with a proper shell script that includes restartability, run as a service on the Linux host.
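
One way to get that restartability is a small wrapper that re-runs the tail and nc pipeline whenever it exits. This is only a sketch under stated assumptions: the function name ship_logs, the variables LOG_FILE, FLUME_HOST and FLUME_PORT, and the retry settings are all hypothetical, and it uses tail -n 0 -F (instead of the plain tail -f shown earlier) so that log rotation is followed and old lines are not resent after a restart.

```shell
#!/bin/sh
# Restartable log shipper: re-runs `tail | nc` whenever the pipeline exits.
# All names below (ship_logs, LOG_FILE, FLUME_HOST, FLUME_PORT) are hypothetical.

# ship_logs LOG_FILE HOST PORT [RETRY_SECS] [MAX_RESTARTS]
# MAX_RESTARTS=0 (the default) means retry forever.
ship_logs() {
    log_file=$1; host=$2; port=$3
    retry_secs=${4:-5}; max_restarts=${5:-0}
    restarts=0
    while :; do
        # -F follows the file by name and retries, so it survives log rotation;
        # -n 0 starts at the end of the file. The netcat source answers "OK"
        # for every event by default, so nc's output is discarded.
        tail -n 0 -F "$log_file" | nc "$host" "$port" >/dev/null || true
        restarts=$((restarts + 1))
        echo "tail|nc exited (restart $restarts); retrying in ${retry_secs}s" >&2
        if [ "$max_restarts" -gt 0 ] && [ "$restarts" -ge "$max_restarts" ]; then
            return 1
        fi
        sleep "$retry_secs"
    done
}

# Run only when a log file is given, for example:
#   LOG_FILE=/var/log/app/app.log FLUME_HOST=<flume_agent_host> sh ship_logs.sh
if [ -n "${LOG_FILE:-}" ]; then
    ship_logs "$LOG_FILE" "${FLUME_HOST:-localhost}" "${FLUME_PORT:-6666}"
fi
```

Note that when nc loses its connection, tail only notices on its next write, so the line being written at that moment can be lost. Lines logged while the Flume agent is unreachable are not replayed either.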

Hope this helps!

Re: Flume without agents on application server logs

Thank you bkosaraju
