<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Flume with oozie in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172672#M50223</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/9789/vamsivalivetiedu.html" nodeid="9789"&gt;@vamsi valiveti&lt;/A&gt; the easiest way is to detach the command from the shell using nohup:&lt;/P&gt;&lt;PRE&gt;nohup &amp;lt;my_command&amp;gt; &amp;amp;
&lt;/PRE&gt;&lt;P&gt;Another option is to create a Flume init.d service script and run Flume as a service. I've posted an example script &lt;A target="_blank" href="http://www.dataprocessingtips.com/2016/04/03/data-ingestion-using-flume-ganglia-monitoring-on-ec2/"&gt;here&lt;/A&gt; (search for "Setup flume agent auto startup" on the page).&lt;/P&gt;&lt;P&gt;A third option is to use Ambari to control the agents.
&lt;/P&gt;</description>
    <pubDate>Thu, 29 Dec 2016 22:40:18 GMT</pubDate>
    <dc:creator>bluesmix</dc:creator>
    <dc:date>2016-12-29T22:40:18Z</dc:date>
    <item>
      <title>Flume with oozie</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172666#M50217</link>
      <description>&lt;P&gt;We are using Flume to load data into HDFS. After that we run Pig and Hive for data transformation. How can we trigger Flume from Oozie?&lt;/P&gt;</description>
      <pubDate>Tue, 27 Dec 2016 19:52:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172666#M50217</guid>
      <dc:creator>vamsi123</dc:creator>
      <dc:date>2016-12-27T19:52:45Z</dc:date>
    </item>
    <item>
      <title>Re: Flume with oozie</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172667#M50218</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/9789/vamsivalivetiedu.html" nodeid="9789"&gt;@vamsi valiveti&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Oozie is a scheduler, whereas Flume does not work on a schedule; it processes data as it arrives. So you use the Flume configuration to say, for example, that whenever a file appears in a certain directory Flume will put it into HDFS (if you use the spooldir source), and so on.&lt;/P&gt;&lt;P&gt;/Best regards, Mats&lt;/P&gt;</description>
      <pubDate>Tue, 27 Dec 2016 21:52:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172667#M50218</guid>
      <dc:creator>mjohansson</dc:creator>
      <dc:date>2016-12-27T21:52:01Z</dc:date>
    </item>
    <item>
      <title>Re: Flume with oozie</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172668#M50219</link>
      <description>&lt;P&gt;a) I am starting the Flume agent with the command below. Currently I run it manually at the Unix command prompt; how should we trigger it in production? I also want to create a dependency with Hive.&lt;/P&gt;&lt;P&gt;b) Can I place the command below in a Unix shell script and call it from a shell action in Oozie?&lt;/P&gt;&lt;PRE&gt;flume-ng agent --conf $FLUME_CONF_DIR --conf-file $FLUME_CONF_DIR/flume.conf --name Agent7
&lt;/PRE&gt;</description>
      <pubDate>Tue, 27 Dec 2016 23:19:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172668#M50219</guid>
      <dc:creator>vamsi123</dc:creator>
      <dc:date>2016-12-27T23:19:50Z</dc:date>
    </item>
    <item>
      <title>Re: Flume with oozie</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172669#M50220</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/210/mjohansson.html" nodeid="210"&gt;@Mats Johansson&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Any input on my clarifications?&lt;/P&gt;</description>
      <pubDate>Thu, 29 Dec 2016 20:09:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172669#M50220</guid>
      <dc:creator>vamsi123</dc:creator>
      <dc:date>2016-12-29T20:09:29Z</dc:date>
    </item>
    <item>
      <title>Re: Flume with oozie</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172670#M50221</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/9789/vamsivalivetiedu.html" nodeid="9789"&gt;@vamsi valiveti&lt;/A&gt; you can trigger Flume from an Oozie shell action. However, note that the action will be executed on a random cluster node, so all your nodes would need to have Flume installed. You would also need some way to control the agents afterwards, and with &amp;gt;10 nodes that becomes a problem. That is why this is not a common way to use Flume.&lt;/P&gt;&lt;P&gt;I'd say the better approach is to keep Flume running all the time and schedule Oozie jobs to process the data whenever you need.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Dec 2016 22:22:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172670#M50221</guid>
      <dc:creator>bluesmix</dc:creator>
      <dc:date>2016-12-29T22:22:33Z</dc:date>
    </item>
    <item>
      <title>Re: Flume with oozie</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172671#M50222</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/2167/bluesmix.html" nodeid="2167"&gt;@Michael M&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Thanks a lot for your time. One small clarification:&lt;/P&gt;&lt;PRE&gt;You mentioned a good approach is to keep Flume running all the time, and schedule Oozie jobs to process the data whenever you need.&lt;/PRE&gt;&lt;P&gt;Clarification 1:&lt;/P&gt;&lt;P&gt;How do I keep Flume running all the time? Currently I am using the command below on my gateway node.&lt;/P&gt;&lt;PRE&gt;flume-ng agent --conf $FLUME_CONF_DIR --conf-file $FLUME_CONF_DIR/flume.conf --name Agent7
&lt;/PRE&gt;</description>
      <pubDate>Thu, 29 Dec 2016 22:28:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172671#M50222</guid>
      <dc:creator>vamsi123</dc:creator>
      <dc:date>2016-12-29T22:28:17Z</dc:date>
    </item>
    <item>
      <title>Re: Flume with oozie</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172672#M50223</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/9789/vamsivalivetiedu.html" nodeid="9789"&gt;@vamsi valiveti&lt;/A&gt; the easiest way is to detach the command from the shell using nohup:&lt;/P&gt;&lt;PRE&gt;nohup &amp;lt;my_command&amp;gt; &amp;amp;
&lt;/PRE&gt;&lt;P&gt;Another option is to create a Flume init.d service script and run Flume as a service. I've posted an example script &lt;A target="_blank" href="http://www.dataprocessingtips.com/2016/04/03/data-ingestion-using-flume-ganglia-monitoring-on-ec2/"&gt;here&lt;/A&gt; (search for "Setup flume agent auto startup" on the page).&lt;/P&gt;&lt;P&gt;A third option is to use Ambari to control the agents.
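&lt;/P&gt;&lt;P&gt;As a rough illustration of the service option in this post (not taken from the linked page): on a systemd-based host, a unit file can play the same role as an init.d script. This is a minimal sketch, not a tested setup. The unit file name, the flume user, the /usr/bin/flume-ng path, and the /etc/flume/conf directory are all assumptions; only the agent name Agent7 and the flume-ng arguments come from this thread.&lt;/P&gt;

```ini
# /etc/systemd/system/flume-agent7.service (hypothetical file name)
[Unit]
Description=Flume agent Agent7 (illustrative sketch)
After=network.target

[Service]
Type=simple
# Assumed service account; use whichever user owns your Flume install.
User=flume
# Assumed location of the Flume configuration directory.
Environment=FLUME_CONF_DIR=/etc/flume/conf
# Same arguments as the command in this thread; the binary path is an assumption.
ExecStart=/usr/bin/flume-ng agent --conf ${FLUME_CONF_DIR} --conf-file ${FLUME_CONF_DIR}/flume.conf --name Agent7
# Restart the agent if it exits with an error.
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
```

&lt;P&gt;With a unit like this, systemctl start, stop, and status would cover starting and stopping the agent without a manual shell session, and systemctl enable would start it at boot.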
&lt;/P&gt;</description>
      <pubDate>Thu, 29 Dec 2016 22:40:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172672#M50223</guid>
      <dc:creator>bluesmix</dc:creator>
      <dc:date>2016-12-29T22:40:18Z</dc:date>
    </item>
    <item>
      <title>Re: Flume with oozie</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172673#M50224</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/2167/bluesmix.html" nodeid="2167"&gt;@Michael M&lt;/A&gt;&lt;/P&gt;&lt;P&gt;For the first option:&lt;/P&gt;&lt;P&gt;In production, can I place the command below in a shell script and schedule that script using crontab so that Flume runs continuously? In our production environment we are not allowed to run any command manually on the gateway node. Please correct me if I am wrong.&lt;/P&gt;&lt;PRE&gt;nohup &amp;lt;my_command&amp;gt; &amp;amp;&lt;/PRE&gt;</description>
      <pubDate>Thu, 29 Dec 2016 23:07:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172673#M50224</guid>
      <dc:creator>vamsi123</dc:creator>
      <dc:date>2016-12-29T23:07:20Z</dc:date>
    </item>
    <item>
      <title>Re: Flume with oozie</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172674#M50225</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/9789/vamsivalivetiedu.html" nodeid="9789"&gt;@vamsi valiveti&lt;/A&gt; yes, that could be an option.&lt;/P&gt;&lt;P&gt;But for production usage I'd also think about how to stop the agents and how to monitor them. In my experience, an init.d service script plus Ganglia monitoring is the best option.&lt;/P&gt;&lt;P&gt;It lets you start and stop agents easily with commands like: /etc/init.d/flume "agent" stop/start. And Ganglia provides a nice web interface for monitoring.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Dec 2016 23:51:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flume-with-oozie/m-p/172674#M50225</guid>
      <dc:creator>bluesmix</dc:creator>
      <dc:date>2016-12-29T23:51:23Z</dc:date>
    </item>
  </channel>
</rss>

