<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Why the number of reducer determined by Hadoop MapReduce and Tez has  a great differ? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98484#M11864</link>
    <description>&lt;P&gt;With Hive on Tez, the number of reducers is sometimes far too low: where Hadoop MapReduce uses 2000 reducers, Tez uses only 10. As a result, the query takes a long time to complete.&lt;/P&gt;&lt;P&gt;hive.exec.reducers.bytes.per.reducer is set to the same value in both cases. Is Tez misjudging the size of the map output?&lt;/P&gt;&lt;P&gt;How can I solve this problem?&lt;/P&gt;</description>
    <pubDate>Thu, 10 Dec 2015 14:59:09 GMT</pubDate>
    <dc:creator>connectchen</dc:creator>
    <dc:date>2015-12-10T14:59:09Z</dc:date>
    <item>
      <title>Why the number of reducer determined by Hadoop MapReduce and Tez has  a great differ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98484#M11864</link>
      <description>&lt;P&gt;With Hive on Tez, the number of reducers is sometimes far too low: where Hadoop MapReduce uses 2000 reducers, Tez uses only 10. As a result, the query takes a long time to complete.&lt;/P&gt;&lt;P&gt;hive.exec.reducers.bytes.per.reducer is set to the same value in both cases. Is Tez misjudging the size of the map output?&lt;/P&gt;&lt;P&gt;How can I solve this problem?&lt;/P&gt;</description>
      <pubDate>Thu, 10 Dec 2015 14:59:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98484#M11864</guid>
      <dc:creator>connectchen</dc:creator>
      <dc:date>2015-12-10T14:59:09Z</dc:date>
    </item>
    <item>
      <title>Re: Why the number of reducer determined by Hadoop MapReduce and Tez has  a great differ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98485#M11865</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/1169/connectchen.html" nodeid="1169" target="_blank"&gt;@Jun Chen&lt;/A&gt;&lt;P&gt;The Tez architecture is different from MapReduce.&lt;/P&gt;&lt;P&gt;&lt;A href="http://hortonworks.com/hadoop/tez/" rel="nofollow noopener noreferrer" target="_blank"&gt;http://hortonworks.com/hadoop/tez/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Tez requires fewer jobs (1 vs. 3) and does not require the I/O synchronization barriers that HDFS provides between the MR jobs.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="753-screen-shot-2015-12-10-at-71226-am.png" style="width: 1412px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/23905iF9A64430388183CE/image-size/medium?v=v2&amp;amp;px=400" role="button" title="753-screen-shot-2015-12-10-at-71226-am.png" alt="753-screen-shot-2015-12-10-at-71226-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 19 Aug 2019 12:40:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98485#M11865</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2019-08-19T12:40:11Z</dc:date>
    </item>
    <item>
      <title>Re: Why the number of reducer determined by Hadoop MapReduce and Tez has  a great differ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98486#M11866</link>
      <description>&lt;P&gt;If you're experiencing performance issues on Tez, start by checking hive.tez.container.size. We have done a lot of Hive / Tez performance optimization, and very often you need to check your jobs. Sometimes we lowered hive.tez.container.size to 1024 (less memory per container means more containers); other times we needed to set it to 8192. It really depends on your workload.&lt;/P&gt;&lt;P&gt;Hive / Tez optimization can be a long process, but you can achieve good performance by tuning hive.tez.container.size, using ORC (with a suitable compression algorithm), and "pre-warming" Tez containers.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Dec 2015 05:25:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98486#M11866</guid>
      <dc:creator>dorio</dc:creator>
      <dc:date>2015-12-11T05:25:18Z</dc:date>
    </item>
    <item>
      <title>Re: Why the number of reducer determined by Hadoop MapReduce and Tez has  a great differ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98487#M11867</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/5785/why-the-number-of-reducer-determined-by-hadoop-map.html#"&gt;@Jun Chen&lt;/A&gt; Check whether the parameter below is turned on: &lt;/P&gt;hive.tez.auto.reducer.parallelism&lt;P&gt;When it is on, Tez automatically decreases the number of reducer tasks based on the map output. You can disable it if you need to.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Dec 2015 10:20:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98487#M11867</guid>
      <dc:creator>gbraccialli3</dc:creator>
      <dc:date>2015-12-11T10:20:30Z</dc:date>
    </item>
    <item>
      <title>Re: Why the number of reducer determined by Hadoop MapReduce and Tez has  a great differ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98488#M11868</link>
      <description>&lt;P&gt;Tez sets very few reducers initially, even before the automatic decrease. The screenshot below shows the details:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="773-image-1.png" style="width: 867px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/23904i809B8EBC77448F49/image-size/medium?v=v2&amp;amp;px=400" role="button" title="773-image-1.png" alt="773-image-1.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 19 Aug 2019 12:40:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98488#M11868</guid>
      <dc:creator>connectchen</dc:creator>
      <dc:date>2019-08-19T12:40:03Z</dc:date>
    </item>
    <item>
      <title>Re: Why the number of reducer determined by Hadoop MapReduce and Tez has  a great differ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98489#M11869</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1169/connectchen.html" nodeid="1169"&gt;@Jun Chen&lt;/A&gt; &lt;/P&gt;&lt;P&gt;I see... I know Tez has a new way to determine the number of mapper tasks, described in the link below, but I'm not sure about the number of reducers. Usually, we set a high number of reducers by default (in Ambari) and use the auto.reducer parameter (hive.tez.auto.reducer.parallelism); that works well.&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://cwiki.apache.org/confluence/display/TEZ/How+initial+task+parallelism+works"&gt;https://cwiki.apache.org/confluence/display/TEZ/How+initial+task+parallelism+works&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 11 Dec 2015 11:15:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98489#M11869</guid>
      <dc:creator>gbraccialli3</dc:creator>
      <dc:date>2015-12-11T11:15:11Z</dc:date>
    </item>
    <item>
      <title>Re: Why the number of reducer determined by Hadoop MapReduce and Tez has  a great differ?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98490#M11870</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1169/connectchen.html" nodeid="1169"&gt;@Jun Chen&lt;/A&gt; Are you still having issues with this? Can you accept the best answer or provide your own solution?&lt;/P&gt;</description>
      <pubDate>Wed, 03 Feb 2016 23:46:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-the-number-of-reducer-determined-by-Hadoop-MapReduce-and/m-p/98490#M11870</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-03T23:46:39Z</dc:date>
    </item>
  </channel>
</rss>

