Flume agent on edge node
- Labels: Apache Flume, Apache Hadoop
Created on ‎05-18-2016 12:27 PM - edited ‎09-16-2022 03:20 AM
Hi,
I have the following questions about Flume:
- On which node should the Flume agent run: on an edge node, or on one of the Hadoop cluster nodes?
- Do I need to run the Flume agent with nohup in production, given that it keeps running until it is interrupted?
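For reference, starting an agent by hand under nohup, as the second question describes, might look like the sketch below. The agent name `a1`, the configuration directory, the configuration file name, and the log file location are all assumptions for illustration; only the `flume-ng agent` options `--conf`, `--conf-file`, and `--name` are standard.

```shell
#!/bin/sh
# All names and paths below are assumptions for illustration; adjust to your setup.
AGENT_NAME="a1"                                    # hypothetical agent name used in the config file
CONF_DIR="/etc/flume-ng/conf"                      # assumed Flume configuration directory
CONF_FILE="$CONF_DIR/flume.conf"                   # assumed agent configuration file
LOG_FILE="${TMPDIR:-/tmp}/flume-$AGENT_NAME.out"   # where the agent's stdout/stderr are written

if command -v flume-ng >/dev/null 2>&1; then
    # nohup makes the agent ignore the hangup signal sent when the terminal closes;
    # the trailing & runs it in the background.
    nohup flume-ng agent --conf "$CONF_DIR" --conf-file "$CONF_FILE" \
        --name "$AGENT_NAME" >"$LOG_FILE" 2>&1 &
    echo "Flume agent $AGENT_NAME started with PID $!"
else
    echo "flume-ng not found on PATH" >&2
fi
```

Note that nohup only keeps the process alive after logout. It does not restart the agent if it exits or if the machine reboots, which is why a managed service is usually preferred in production.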
Created ‎05-18-2016 03:37 PM
If you are using Flume to collect events from other applications and send them downstream to another agent, which then delivers them to their final destination (HDFS, Solr, etc.), you can run that agent either on a cluster node or on the machine where the events are generated.
If it is not running on a CDH node, you can install Flume from packages and then use the start and stop scripts to run it as a daemon.
-pd
Created ‎05-19-2016 07:53 AM
Thank you for the solution.
In my case, I am reading logs from a web server and writing them to HDFS.
Currently I run one agent on the web server and one on the edge node to push data to HDFS. The edge node is not part of the cluster, but all the clients are installed on it, so I can start the Flume agent there manually with the flume-ng command.
What is the difference between running Flume on the edge node (as I do now) and running it on one of the cluster nodes (as you suggested)?
Also, I don't know where to find the start and stop scripts. Do I need to write my own?
We are using CDH 5.3.3 and Flume 1.5.0.
Any help is appreciated.
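To make the setup described above concrete, here is a minimal sketch of a two-agent configuration: one agent on the web server tailing a log and forwarding over Avro, and one on the edge node receiving the events and writing them to HDFS. The agent names (`web`, `edge`), component names, log path, hostname, port, and HDFS path are all placeholders, not values from this thread.

```properties
# --- Agent on the web server (hypothetical agent name: web) ---
web.sources = tail1
web.channels = mem1
web.sinks = avro1

# Tail the web server log; the log path is an assumption.
web.sources.tail1.type = exec
web.sources.tail1.command = tail -F /var/log/httpd/access_log
web.sources.tail1.channels = mem1

web.channels.mem1.type = memory
web.channels.mem1.capacity = 10000

# Forward events to the edge node; hostname and port are placeholders.
web.sinks.avro1.type = avro
web.sinks.avro1.hostname = edge-node.example.com
web.sinks.avro1.port = 4545
web.sinks.avro1.channel = mem1

# --- Agent on the edge node (hypothetical agent name: edge) ---
edge.sources = avro1
edge.channels = mem1
edge.sinks = hdfs1

# Receive events sent by the web server agent.
edge.sources.avro1.type = avro
edge.sources.avro1.bind = 0.0.0.0
edge.sources.avro1.port = 4545
edge.sources.avro1.channels = mem1

edge.channels.mem1.type = memory
edge.channels.mem1.capacity = 10000

# Write events to HDFS as plain text, bucketed by day; the path is a placeholder.
edge.sinks.hdfs1.type = hdfs
edge.sinks.hdfs1.hdfs.path = /flume/weblogs/%Y-%m-%d
edge.sinks.hdfs1.hdfs.fileType = DataStream
edge.sinks.hdfs1.hdfs.useLocalTimeStamp = true
edge.sinks.hdfs1.channel = mem1
```

The memory channel is used here only to keep the sketch short. It drops buffered events if the agent stops, so a file channel may be the better choice when events must survive a restart.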
Created ‎05-19-2016 10:56 AM
If your edge node is part of the cluster and you are using parcels, then you won't have start and stop scripts, and the recommended way to run Flume is to set up a Flume service in CM that runs on the edge node.
The only difference between an edge node and a cluster node is that edge nodes generally don't run Hadoop services.
Have you installed the Flume RPMs on this edge node, or are you using parcels? To see where the flume-ng command is being run from, please post the output of:
which flume-ng
alternatives --display flume-ng
-pd
Created ‎05-19-2016 11:19 AM
I guess Flume was installed using parcels. I am running the flume-ng commands on the edge node.
The details are below:
[@ ~]$ which flume-ng
/usr/bin/flume-ng
[@ ~]$ alternatives --display flume-ng
flume-ng - status is auto.
link currently points to /opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/bin/flume-ng
/opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/bin/flume-ng - priority 10
Current `best' version is /opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/bin/flume-ng.
Also, it would be very helpful if you could provide details about setting up a Flume service in CM.
Thank you
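The check above (resolving `flume-ng` and seeing whether it points into the parcel directory) can be scripted. The sketch below is one way to do it, assuming parcels live under the default `/opt/cloudera/parcels` location shown in the output; the function name `install_type` is made up for this example, and anything outside that directory is simply reported as `non-parcel`.

```shell
#!/bin/sh
# Classify a flume-ng path by where it lives.
# Assumption: parcels are installed under /opt/cloudera/parcels.
install_type() {
    case "$1" in
        "") echo "not-found" ;;
        /opt/cloudera/parcels/*) echo "parcel" ;;
        *) echo "non-parcel" ;;
    esac
}

# Find flume-ng on PATH and follow any alternatives symlinks to the real file.
flume_bin=$(command -v flume-ng 2>/dev/null || true)
if [ -n "$flume_bin" ]; then
    flume_bin=$(readlink -f "$flume_bin")
fi
echo "flume-ng install type: $(install_type "$flume_bin")"
```

A `parcel` result corresponds to the situation in this thread, where there are no packaged start and stop scripts and the agent is expected to be managed through CM.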
Created ‎05-19-2016 11:27 AM
I can see Flume running in the CM portal, which means we already have Flume set up as a service in Cloudera Manager.
Created ‎05-19-2016 12:40 PM
http://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_flume_service.html
You can have multiple flume services within a CM cluster. Each configuration would be separate.
-pd
Created ‎05-26-2016 05:15 AM
Thank you for the detailed reply.
I have started Flume as a service on the edge node, and it is working as expected.
