<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Job ID for a scheduled job in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Job-ID-for-a-scheduled-job/m-p/148369#M110895</link>
    <description>&lt;P&gt;
	Hi,&lt;/P&gt;&lt;P&gt;
	How can I retrieve the job ID of a job submitted from a script that crontab runs at regular intervals? For example, my script runs a distcp job like this:&lt;/P&gt;&lt;PRE&gt;
	 hadoop distcp hdfs://nn1:8020/src_path hdfs://nn2:8020/dst_path &lt;/PRE&gt;&lt;P&gt;
	How do I find the YARN job ID so that the script can poll the job's status until it completes and then take appropriate action?&lt;/P&gt;&lt;P&gt;
	PS: For various reasons we are not using Oozie, so this has to be done in a script scheduled with crontab.&lt;/P&gt;</description>
    <pubDate>Tue, 25 Oct 2016 18:15:06 GMT</pubDate>
    <dc:creator>bigdata_superno</dc:creator>
    <dc:date>2016-10-25T18:15:06Z</dc:date>
    <item>
      <title>Job ID for a scheduled job</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Job-ID-for-a-scheduled-job/m-p/148369#M110895</link>
      <description>&lt;P&gt;
	Hi,&lt;/P&gt;&lt;P&gt;
	How can I retrieve the job ID of a job submitted from a script that crontab runs at regular intervals? For example, my script runs a distcp job like this:&lt;/P&gt;&lt;PRE&gt;
	 hadoop distcp hdfs://nn1:8020/src_path hdfs://nn2:8020/dst_path &lt;/PRE&gt;&lt;P&gt;
	How do I find the YARN job ID so that the script can poll the job's status until it completes and then take appropriate action?&lt;/P&gt;&lt;P&gt;
	PS: For various reasons we are not using Oozie, so this has to be done in a script scheduled with crontab.&lt;/P&gt;</description>
      <pubDate>Tue, 25 Oct 2016 18:15:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Job-ID-for-a-scheduled-job/m-p/148369#M110895</guid>
      <dc:creator>bigdata_superno</dc:creator>
      <dc:date>2016-10-25T18:15:06Z</dc:date>
    </item>
    <item>
      <title>Re: Job ID for a scheduled job</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Job-ID-for-a-scheduled-job/m-p/148370#M110896</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2696/bigdatasupernova.html" nodeid="2696"&gt;@bigdata.neophyte&lt;/A&gt; &lt;/P&gt;&lt;P&gt;To fetch the job status programmatically, you would need to use the JobStatus API (https://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/mapreduce/JobStatus.html).&lt;/P&gt;&lt;P&gt;If you want a simpler solution, you could try something like this:&lt;/P&gt;&lt;P&gt;1) Set a unique job name (e.g. one containing the date or time) using -Dmapred.job.name=testdist01 (on Hadoop 2 the non-deprecated property name is mapreduce.job.name). The job then shows up in YARN as "distcp: testdist01".&lt;/P&gt;&lt;P&gt;2) Get the application state and final status (fields 7 and 8 of the matching line) using:&lt;/P&gt;&lt;PRE&gt;yarn application -list -appStates ALL | grep -i "distcp: testdist01" | awk '{print $7,$8}'

FINISHED SUCCEEDED&lt;/PRE&gt;</description>
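A minimal sketch of how the name-based lookup in this reply could be wrapped for a cron script. The function names (find_app_id, app_state, wait_for_app) are hypothetical, and the field positions assume the usual `yarn application -list` layout in which the last four columns are State, Final-State, Progress and Tracking-URL; check that against the output on your own cluster before relying on it.

```shell
#!/bin/sh
# Hypothetical helpers; none of these function names are part of Hadoop.

# Print the application ID (first column) of the first application whose
# listing line contains the given name. Assumes the name is unique.
find_app_id() {
    yarn application -list -appStates ALL 2>/dev/null | grep -F "$1" | awk '{print $1}' | head -n 1
}

# Print "STATE FINAL_STATE" for the given application ID. Fields are counted
# from the end of the line so that spaces in the job name do not matter.
app_state() {
    yarn application -list -appStates ALL 2>/dev/null | awk -v id="$1" '$1 == id {print $(NF-3), $(NF-2)}'
}

# Poll until the application reaches a terminal state.
# Returns 0 on FINISHED SUCCEEDED, 1 on any other terminal state.
# Note: keeps polling if the application is not listed at all.
wait_for_app() {
    app_id=$1
    interval=${2:-30}
    while :; do
        status=$(app_state "$app_id")
        case "$status" in
            "FINISHED SUCCEEDED") return 0 ;;
            FINISHED*|FAILED*|KILLED*) echo "$app_id ended as: $status"; return 1 ;;
        esac
        sleep "$interval"
    done
}
```

With the job submitted under the name testdist01 as in step 1, the script could call `wait_for_app "$(find_app_id 'distcp: testdist01')"` and branch on the exit status.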
      <pubDate>Tue, 25 Oct 2016 18:54:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Job-ID-for-a-scheduled-job/m-p/148370#M110896</guid>
      <dc:creator>sandyy006</dc:creator>
      <dc:date>2016-10-25T18:54:34Z</dc:date>
    </item>
    <item>
      <title>Re: Job ID for a scheduled job</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Job-ID-for-a-scheduled-job/m-p/148371#M110897</link>
      <description>&lt;P&gt;Another alternative is to submit the application through the YARN REST API.&lt;/P&gt;&lt;P&gt;The New Application API returns an application-id, which you then pass to the &lt;A href="https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_APISubmit_Application"&gt;Cluster Submit Applications API&lt;/A&gt; to submit the application.&lt;/P&gt;&lt;PRE&gt;curl -X POST http://&amp;lt;resource_manager&amp;gt;:8088/ws/v1/cluster/apps/new-application&lt;/PRE&gt;&lt;P&gt;Reference: &lt;A href="https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_New_Application_API"&gt;Resource Manager REST API Documentation&lt;/A&gt;&lt;/P&gt;</description>
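A rough sketch of the REST route from this reply, limited to obtaining an application-id and reading back an application's state. RM_URL and both function names are placeholders, the JSON is parsed with sed only to keep the example dependency-free, and the submission step itself (which needs a full application submission context) is not shown.

```shell
#!/bin/sh
# RM_URL is a placeholder for the ResourceManager web address.
RM_URL=${RM_URL:-http://localhost:8088}

# Ask the ResourceManager for a new application ID and print it.
# Assumes the response is JSON containing an "application-id" field.
new_application_id() {
    curl -s -X POST "$RM_URL/ws/v1/cluster/apps/new-application" | sed -n 's/.*"application-id" *: *"\([^"]*\)".*/\1/p'
}

# Print the current state of the given application using the
# Cluster Application State API (response assumed to be {"state":"..."}).
application_state() {
    curl -s "$RM_URL/ws/v1/cluster/apps/$1/state" | sed -n 's/.*"state" *: *"\([^"]*\)".*/\1/p'
}
```

An ID returned by new_application_id only becomes a real application once it is used in a submit request; application_state can then be polled with that ID from the cron script.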
      <pubDate>Wed, 26 Oct 2016 23:19:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Job-ID-for-a-scheduled-job/m-p/148371#M110897</guid>
      <dc:creator>jlopez</dc:creator>
      <dc:date>2016-10-26T23:19:41Z</dc:date>
    </item>
  </channel>
</rss>

