Created 12-14-2017 09:56 PM
Hello all,
I want to ingest application server logs into HDFS using Flume version 1.5. Do I need to install a Flume agent (client) on these application servers? These servers are not part of the Hadoop cluster. How can I pull the application logs without installing a Flume agent? Can you please help?
Thanks
JN
Created 12-14-2017 11:00 PM
Hi @JT Ng,
Yes, that is possible with the "Netcat TCP Source", without installing an agent on the application server. However, you will need to tail the log on the server where it is produced and pipe it to the listener on the Flume agent host.
That means starting the log push process on the source server (the application server) with:
tail -f <application_log_file>.log | nc <flume_agent_host> <configured_netcat_source_port>
Before you trigger this, make sure you have started the Flume agent in the HDP cluster (or wherever the Flume agent can be installed) with a Netcat source, for example:
a1.sources = r1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 6666
a1.sources.r1.channels = c1
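For reference, the usual way to launch the agent once the properties are in place is the standard flume-ng command (the file name netcat-hdfs.conf below is just a placeholder for wherever you save the configuration; the agent name must match the a1 prefix used above):

flume-ng agent --conf conf --conf-file netcat-hdfs.conf --name a1 -Dflume.root.logger=INFO,console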
On the other side, you can configure an HDFS Sink to write the events into the HDFS file system with the following configuration:
a1.channels = c1
a1.sinks = k1
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
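Putting the two snippets together, the agent also needs the channel itself defined. A minimal end-to-end sketch could look like the lines below; the memory channel and its capacity values are assumptions you should tune for your environment, and hdfs.useLocalTimeStamp is added because the HDFS path uses time escape sequences while the Netcat source does not put a timestamp header on events.

# source
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 6666
a1.sources.r1.channels = c1
# channel (memory channel chosen here as an example; sizes are assumptions)
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 100
# sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
# needed because the path escapes (%y, %m, ...) require a timestamp on each event
a1.sinks.k1.hdfs.useLocalTimeStamp = true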
NB: Make sure you handle the tail and nc processes when the server or the application stops or shuts down completely; you can manage the tail process with a proper shell wrapper that provides restartability as a service on the Linux host, as sketched below.
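As a rough illustration of that restartability point, a minimal wrapper sketch could be the script below (the log path, Flume host, and port are placeholders to adapt; -F is used so tail survives log rotation). You could then run it under systemd, supervisord, or nohup so it comes back up with the host.

#!/bin/bash
# push_logs.sh - keep the tail | nc pipeline running (placeholder names throughout)
LOG_FILE=/var/log/myapp/application.log    # application log path (assumption)
FLUME_HOST=flume-agent.example.com         # host running the Flume netcat source (assumption)
FLUME_PORT=6666                            # must match a1.sources.r1.port

while true; do
  # -F re-opens the file after rotation; the loop restarts the pipeline if nc or tail exits
  tail -F "$LOG_FILE" | nc "$FLUME_HOST" "$FLUME_PORT"
  echo "$(date) pipeline exited, retrying in 5s" >&2
  sleep 5
done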
Hope this helps !!
Created 12-15-2017 06:51 PM
Thank you, bkosaraju.