Flume - source exec and sink hdfs. File is not loaded to hdfs

Rising Star

Hi,

I used the attached Flume configuration (exec-source.txt) to copy a file from the local file system (LFS) to HDFS, as a learning exercise.

But no folder is created in HDFS.

Do you see any issue in this file? I wanted to see how the interceptor works.

Thank you.

1 ACCEPTED SOLUTION

Super Collaborator

The easiest way on Hortonworks Hadoop is to run Flume from Ambari. It will show you some basic metrics and the status of the agents.

If you don't want to use Ambari, or you have a custom Flume installation, I'd recommend reading this doc: http://flume.apache.org/FlumeUserGuide.html#monitoring

In any Linux environment you can at least install Ganglia. It will cover most of your needs for agent monitoring.


6 REPLIES

Super Collaborator

It would help to see the agent's log.

Master Guru

Check the following two lines in your sink block:

source_agent.sinks.avro_sink.hdfs.filetype = Datastream
source_agent.sinks.avro_sink.hdfs.a1.sinks.k2.hdfs.path = /Revathy/Flume/%y-%m-%d/%H%M/%S 

In the first line the capitalization is wrong (both the property name and its value are case-sensitive), and in the second the property name on the left side is incorrect. Change them and retry:

source_agent.sinks.avro_sink.hdfs.fileType = DataStream
source_agent.sinks.avro_sink.hdfs.path = hdfs://sandbox.hortonworks.com:8020/user/Revathy/Flume/%y-%m-%d/%H%M/%S
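
Since the original post asks how an interceptor works, here is a minimal sketch of how a timestamp interceptor relates to the corrected sink lines. The time escapes in hdfs.path (%y, %m, %d, %H, %M, %S) are resolved from a "timestamp" header on each event, so the path only works if something sets that header. The source name exec_source and the interceptor name ts are hypothetical, since the attached file is not shown here; the agent and sink names are taken from the thread.

```properties
# Timestamp interceptor: adds a "timestamp" header to every event.
# exec_source and ts are placeholder names, not taken from the attached file.
source_agent.sources.exec_source.interceptors = ts
source_agent.sources.exec_source.interceptors.ts.type = timestamp

# HDFS sink with the corrected property names from the answer above.
source_agent.sinks.avro_sink.type = hdfs
source_agent.sinks.avro_sink.hdfs.fileType = DataStream
source_agent.sinks.avro_sink.hdfs.path = hdfs://sandbox.hortonworks.com:8020/user/Revathy/Flume/%y-%m-%d/%H%M/%S

# Alternative to the interceptor: have the sink use the agent's local clock.
# source_agent.sinks.avro_sink.hdfs.useLocalTimeStamp = true
```

Either the interceptor or hdfs.useLocalTimeStamp should be enough on its own; which one fits depends on whether you want event time set at the source or at the sink.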

Rising Star

Hi Predrag,

Thank you for your response. I am preparing for certification and trying to run a Flume agent.

This is the command I use:

flume-ng agent --conf conf --conf-file /Revathy/Flume/source_agent.conf --name source_agent

(Attachment: exec-source.txt)

I am trying to understand the messages printed when the Flume agent runs.

1. How do I confirm that the agent is running successfully?

2. I have attached the conf file and the execution log (log1.png, log2.png, log3.png). There are a few warnings, such as "Configuration property ignored" and "No channel configured". The configuration looks fine to me. Can the warnings be ignored, or should they be treated as errors and fixed?

3. The source file is in LFS. The sink file is not created. Is the path hdfs://sandbox.hortonworks.com:8080/Revathy/Flume/test correct? I am not sure where to find the port.

Thank you.

Rising Star

Hi Predrag, I have found the reason for the warnings. It should be: source_agent.sinks.avro_sink.channel = memoryChannel

I had written channels (plural) for the sink. I have corrected this, the warning is gone, and the sink file is now created in HDFS. But how do I know that Flume is running successfully? Thank you.
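
For anyone hitting the same "No channel configured" warning, the wiring rule behind this fix can be sketched as below. A source may write to several channels, so its key is plural; a sink reads from exactly one channel, so its key is singular. The source name exec_source is a hypothetical placeholder; the agent, sink, and channel names are the ones used in this thread.

```properties
# Declare the components of the agent named source_agent.
# exec_source is a placeholder name for the exec source.
source_agent.sources = exec_source
source_agent.channels = memoryChannel
source_agent.sinks = avro_sink

source_agent.channels.memoryChannel.type = memory

# Source side: plural key, a space-separated list of channels is allowed.
source_agent.sources.exec_source.channels = memoryChannel

# Sink side: singular key, exactly one channel.
source_agent.sinks.avro_sink.channel = memoryChannel
```

A sink configured with the plural key is reported as having no channel, which matches the warning described above.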

Master Guru

Okay, great. Yes, the error was about "no channel configured". Regarding the path in HDFS, I edited my answer to include the full HDFS path, including the NameNode: hdfs://sandbox.hortonworks.com:8020/user/Revathy/Flume/%y-%m-%d/%H%M/%S. It's good to organize your folders in HDFS in some way; here I used your home directory in HDFS.

How does one know that the Flume agent works? It keeps running, there are no errors in the logs, and the data written to the sinks is as expected. You can find a lot of details here. You can also run Flume from Ambari, in which case Ambari will tell you whether the Flume process is healthy and running. However, one still has to inspect the sinks to be sure.
