<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Is there a way to easily detect when a MR/Tez job has completed? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-a-way-to-easily-detect-when-a-MR-Tez-job-has/m-p/155110#M56995</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt;&lt;/P&gt;&lt;P&gt;One idea is to use WebHCat for this: &lt;A href="https://cwiki.apache.org/confluence/display/Hive/WebHCat+UsingWebHCat#WebHCatUsingWebHCat-ErrorCodesandResponses" target="_blank"&gt;https://cwiki.apache.org/confluence/display/Hive/WebHCat+UsingWebHCat#WebHCatUsingWebHCat-ErrorCodesandResponses&lt;/A&gt;&lt;/P&gt;&lt;P&gt;# this executes a Hive query and saves the result to a directory called output in your HDFS home directory&lt;/P&gt;&lt;PRE&gt;curl -s -d execute="select+*+from+sample_08;" \
 -d statusdir="output" \
 'http://localhost:50111/templeton/v1/hive?user.name=root'&lt;/PRE&gt;&lt;P&gt;# if you ls the directory, you will see the stderr and stdout files&lt;/P&gt;&lt;PRE&gt;hdfs dfs -ls output&lt;/PRE&gt;&lt;P&gt;# if the job succeeded, you can cat the stdout file and view the results&lt;/P&gt;&lt;PRE&gt;hdfs dfs -cat output/stdout&lt;/PRE&gt;&lt;P&gt;When you invoke the job, you get a response containing the job ID. You can then use the WebHDFS API to check whether the output directory exists and there is no error log; in that case the job succeeded.&lt;/P&gt;&lt;PRE&gt; curl -i "http://sandbox.hortonworks.com:50070/webhdfs/v1/user/root/output/?op=LISTSTATUS"&lt;/PRE&gt;&lt;P&gt;Another idea is to use Oozie to wire the jobs together. Once a job completes, you can use Oozie's SLA monitoring features to check whether it completed, or have Oozie send an email (SLA is not needed for that). Whichever way you go, you can have NiFi watch for these events, either from a JMS topic in ActiveMQ if you intend to use SLA, or from the email alert. &lt;A href="https://community.hortonworks.com/articles/83787/apache-ambari-workflow-manager-view-for-apache-ooz-1.html" target="_blank"&gt;https://community.hortonworks.com/articles/83787/apache-ambari-workflow-manager-view-for-apache-ooz-1.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Probably an even better idea is to query ATS via its REST API: &lt;A href="https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/TimelineServer.html" target="_blank"&gt;https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/TimelineServer.html&lt;/A&gt; I think this is probably the most sane approach: you can query ATS for the finished job and get its status. Once you know the job ID (there are several ways to get it, one of them being my first example), a second processor can query ATS for the completion state.&lt;/P&gt;</description>
    <pubDate>Tue, 14 Mar 2017 03:17:13 GMT</pubDate>
    <dc:creator>aervits</dc:creator>
    <dc:date>2017-03-14T03:17:13Z</dc:date>
    <item>
      <title>Is there a way to easily detect when a MR/Tez job has completed?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-a-way-to-easily-detect-when-a-MR-Tez-job-has/m-p/155109#M56994</link>
      <description>&lt;P&gt;I am using NiFi for my data flow, and then I kick off an ETL script which runs many (Hive/Pig) MR/Tez jobs. Is there an easy way to detect (i.e. trigger on) when a job has finished? Creating a trigger manually per job is not scalable since there are many jobs. Going into each job and having it create a trigger is off the table.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Mar 2017 02:54:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-a-way-to-easily-detect-when-a-MR-Tez-job-has/m-p/155109#M56994</guid>
      <dc:creator>sunile_manjee</dc:creator>
      <dc:date>2017-03-14T02:54:30Z</dc:date>
    </item>
    <item>
      <title>Re: Is there a way to easily detect when a MR/Tez job has completed?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-a-way-to-easily-detect-when-a-MR-Tez-job-has/m-p/155110#M56995</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt;&lt;/P&gt;&lt;P&gt;One idea is to use WebHCat for this: &lt;A href="https://cwiki.apache.org/confluence/display/Hive/WebHCat+UsingWebHCat#WebHCatUsingWebHCat-ErrorCodesandResponses" target="_blank"&gt;https://cwiki.apache.org/confluence/display/Hive/WebHCat+UsingWebHCat#WebHCatUsingWebHCat-ErrorCodesandResponses&lt;/A&gt;&lt;/P&gt;&lt;P&gt;# this executes a Hive query and saves the result to a directory called output in your HDFS home directory&lt;/P&gt;&lt;PRE&gt;curl -s -d execute="select+*+from+sample_08;" \
 -d statusdir="output" \
 'http://localhost:50111/templeton/v1/hive?user.name=root'&lt;/PRE&gt;&lt;P&gt;# if you ls the directory, you will see the stderr and stdout files&lt;/P&gt;&lt;PRE&gt;hdfs dfs -ls output&lt;/PRE&gt;&lt;P&gt;# if the job succeeded, you can cat the stdout file and view the results&lt;/P&gt;&lt;PRE&gt;hdfs dfs -cat output/stdout&lt;/PRE&gt;&lt;P&gt;When you invoke the job, you get a response containing the job ID. You can then use the WebHDFS API to check whether the output directory exists and there is no error log; in that case the job succeeded.&lt;/P&gt;&lt;PRE&gt; curl -i "http://sandbox.hortonworks.com:50070/webhdfs/v1/user/root/output/?op=LISTSTATUS"&lt;/PRE&gt;&lt;P&gt;Another idea is to use Oozie to wire the jobs together. Once a job completes, you can use Oozie's SLA monitoring features to check whether it completed, or have Oozie send an email (SLA is not needed for that). Whichever way you go, you can have NiFi watch for these events, either from a JMS topic in ActiveMQ if you intend to use SLA, or from the email alert. &lt;A href="https://community.hortonworks.com/articles/83787/apache-ambari-workflow-manager-view-for-apache-ooz-1.html" target="_blank"&gt;https://community.hortonworks.com/articles/83787/apache-ambari-workflow-manager-view-for-apache-ooz-1.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Probably an even better idea is to query ATS via its REST API: &lt;A href="https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/TimelineServer.html" target="_blank"&gt;https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/TimelineServer.html&lt;/A&gt; I think this is probably the most sane approach: you can query ATS for the finished job and get its status. Once you know the job ID (there are several ways to get it, one of them being my first example), a second processor can query ATS for the completion state.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Mar 2017 03:17:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-a-way-to-easily-detect-when-a-MR-Tez-job-has/m-p/155110#M56995</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2017-03-14T03:17:13Z</dc:date>
    </item>
    <item>
      <title>Re: Is there a way to easily detect when a MR/Tez job has completed?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-a-way-to-easily-detect-when-a-MR-Tez-job-has/m-p/155111#M56996</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Since you are using NiFi to launch the jobs, why don't you use NiFi itself to monitor them? &lt;span class="lia-unicode-emoji" title=":face_with_tongue:"&gt;😛&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I tried using NiFi to monitor YARN jobs by querying the ResourceManager. I have documented the approach, and my flow XML is attached in the comments. Check it out.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/content/kbentry/42995/yarn-application-monitoring-with-nifi.html" target="_blank"&gt;https://community.hortonworks.com/content/kbentry/42995/yarn-application-monitoring-with-nifi.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;In the demo I used it to monitor only Failed and Killed jobs. You can change the query to ask for all the jobs that, say, user &lt;STRONG&gt;smanjee&lt;/STRONG&gt; submitted, and have it alert you as soon as one is completed/failed/killed.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Jobin&lt;/P&gt;</description>
      <pubDate>Wed, 15 Mar 2017 01:33:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Is-there-a-way-to-easily-detect-when-a-MR-Tez-job-has/m-p/155111#M56996</guid>
      <dc:creator>jgeorge</dc:creator>
      <dc:date>2017-03-15T01:33:55Z</dc:date>
    </item>
  </channel>
</rss>

