<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How do I limit the number of simultaneous tasks for a single Tez job? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99426#M12627</link>
    <description>&lt;P&gt;
	Have you tried setting mapreduce.jobtracker.maxtasks.perjob for your Pig application?&lt;/P&gt;&lt;P&gt;
	Alternatively, you can use &lt;A href="http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_yarn_resource_mgt/content/using_node_labels.html"&gt;node labels&lt;/A&gt; to run your Pig job on specific nodes.&lt;/P&gt;</description>
    <pubDate>Wed, 16 Dec 2015 07:02:15 GMT</pubDate>
    <dc:creator>ajay_kumar</dc:creator>
    <dc:date>2015-12-16T07:02:15Z</dc:date>
    <item>
      <title>How do I limit the number of simultaneous tasks for a single Tez job?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99425#M12626</link>
      <description>&lt;P&gt;I'm using the Elasticsearch Hadoop connector to push data via Pig to Elasticsearch. This process worked very well for me on my 6-node cluster. I now have a 12-node cluster running HDP 2.3, and it seems that Pig is pushing more data to Elasticsearch than it can keep up with.&lt;/P&gt;&lt;P&gt;My cluster is running 134 tasks at once for this Pig job. Is there an easy way to change the number of simultaneous tasks running for a Pig/Tez job? I changed my queue configuration to limit the resources to 50% for one of the queues and I'm still overloading Elasticsearch.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Dec 2015 05:35:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99425#M12626</guid>
      <dc:creator>Jaraxal</dc:creator>
      <dc:date>2015-12-16T05:35:41Z</dc:date>
    </item>
    <item>
      <title>Re: How do I limit the number of simultaneous tasks for a single Tez job?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99426#M12627</link>
      <description>&lt;P&gt;
	Have you tried setting mapreduce.jobtracker.maxtasks.perjob for your Pig application?&lt;/P&gt;&lt;P&gt;
	Alternatively, you can use &lt;A href="http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_yarn_resource_mgt/content/using_node_labels.html"&gt;node labels&lt;/A&gt; to run your Pig job on specific nodes.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Dec 2015 07:02:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99426#M12627</guid>
      <dc:creator>ajay_kumar</dc:creator>
      <dc:date>2015-12-16T07:02:15Z</dc:date>
    </item>
    <item>
      <title>Re: How do I limit the number of simultaneous tasks for a single Tez job?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99427#M12628</link>
      <description>&lt;P&gt;In my searching for ways to reduce parallel tasks, I had not yet seen that.  I will give it a try.  Thank you!&lt;/P&gt;</description>
      <pubDate>Mon, 21 Dec 2015 23:48:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99427#M12628</guid>
      <dc:creator>Jaraxal</dc:creator>
      <dc:date>2015-12-21T23:48:32Z</dc:date>
    </item>
    <item>
      <title>Re: How do I limit the number of simultaneous tasks for a single Tez job?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99428#M12629</link>
      <description>&lt;P&gt;I would switch the engine back to MapReduce to make your life easier, and then control the number of Mappers spawned by setting the input split sizes as shown below, where N is the minimum number of bytes you want a Mapper to process and X is the maximum. This is easier than trying to understand how the Tez waves setting works, because that depends on the current capacity of a queue and not just the size of the underlying data.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;#start pig with &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;pig -x mr&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;#in your pig script &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;set mapreduce.input.fileinputformat.split.minsize N&lt;/P&gt;&lt;P&gt;set mapreduce.input.fileinputformat.split.maxsize X&lt;/P&gt;</description>
      <pubDate>Tue, 22 Dec 2015 20:54:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99428#M12629</guid>
      <dc:creator>jniemiec</dc:creator>
      <dc:date>2015-12-22T20:54:25Z</dc:date>
    </item>
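    <!--
    Editorial sketch, not part of the original thread. The reply above from
    jniemiec suggests bounding the number of Mappers through the split.minsize
    and split.maxsize settings. The Python below is a rough estimate of how many
    input splits (and so, at most, how many map tasks) one splittable file would
    produce. It assumes the standard Hadoop FileInputFormat rule
    split_size = max(min_size, min(max_size, block_size)) with a 10 percent slop
    on the last split. The function names, the 128 MiB block size, and the 10 GiB
    file size are illustrative assumptions, and the sketch ignores Pig's own
    split combination (for example pig.maxCombinedSplitSize) and non-splittable
    input formats, so real counts may differ.

    ```python
    import sys

    MIB = 1024 * 1024
    GIB = 1024 * MIB
    SPLIT_SLOP = 1.1  # the last split may be up to 10 percent larger than split_size


    def compute_split_size(block_size, min_size, max_size):
        """Split size rule assumed here: clamp the block size between min and max."""
        return max(min_size, min(max_size, block_size))


    def estimate_splits(file_size, block_size, min_size=1, max_size=sys.maxsize):
        """Estimate the number of input splits for one splittable file.

        Simplification: an empty file is counted as producing no split.
        """
        split_size = compute_split_size(block_size, min_size, max_size)
        remaining = file_size
        splits = 0
        # Carve off full-size splits while more than 1.1 split sizes remain.
        while remaining / split_size > SPLIT_SLOP:
            splits += 1
            remaining -= split_size
        # Whatever is left becomes the final split.
        if remaining > 0:
            splits += 1
        return splits


    if __name__ == "__main__":
        # Hypothetical input: one 10 GiB file stored in 128 MiB blocks.
        default_count = estimate_splits(10 * GIB, 128 * MIB)
        raised_min_count = estimate_splits(10 * GIB, 128 * MIB, min_size=1 * GIB)
        print("default settings:", default_count, "splits")
        print("split.minsize = 1 GiB:", raised_min_count, "splits")
    ```

    Raising N (split.minsize) above the block size is what reduces the Mapper
    count in this model. The number of splits only caps how many map tasks can
    run at once; the queue capacity can still make the actual concurrency lower.
    -->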
    <item>
      <title>Re: How do I limit the number of simultaneous tasks for a single Tez job?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99429#M12630</link>
      <description>&lt;P&gt;I think this is probably the better approach. We were initially using Tez for its better performance over M/R. However, with our cluster easily overwhelming Elasticsearch, it seems reasonable to revert to M/R and tweak settings that are easier to control.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Wed, 23 Dec 2015 06:27:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-do-I-limit-the-number-of-simultaneous-tasks-for-a-single/m-p/99429#M12630</guid>
      <dc:creator>Jaraxal</dc:creator>
      <dc:date>2015-12-23T06:27:23Z</dc:date>
    </item>
  </channel>
</rss>

