<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Easy explanation on Map Reduce phase - From InputSplit to Reducer in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Easy-explaination-on-Map-Reduce-phase-From-InputSplit-to/m-p/172998#M45939</link>
    <description>&lt;P&gt;&lt;STRONG&gt;Map-reduce&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;This is a good high-level (easy) explanation: &lt;A href="http://www.thegeekstuff.com/2014/05/map-reduce-algorithm/" target="_blank"&gt;http://www.thegeekstuff.com/2014/05/map-reduce-algorithm/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;To really understand it, you need to dive deeper. For example, each map task writes its output to an in-memory buffer, which spills to local disk as it fills; this intermediate data is then sent across the network to the reducer(s). If you want to understand map-reduce well enough to optimize performance, this book is a good way to go: &lt;A href="http://shop.oreilly.com/product/0636920033448.do" target="_blank"&gt;http://shop.oreilly.com/product/0636920033448.do&lt;/A&gt;&lt;/P&gt;&lt;P&gt;You can write your own map-reduce programs, but more often they are generated for you when you run a Hive or Pig job.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Tez&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;If you are running Hive or Pig queries, you should run them in &lt;STRONG&gt;&lt;EM&gt;tez&lt;/EM&gt;&lt;/STRONG&gt; mode. Tez is an alternative processing engine to map-reduce and is typically much faster.&lt;/P&gt;&lt;P&gt;See:&lt;/P&gt;&lt;P&gt;&lt;A href="http://hortonworks.com/apache/tez/" target="_blank"&gt;http://hortonworks.com/apache/tez/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://www.slideshare.net/Hadoop_Summit/w-1205phall1saha" target="_blank"&gt;http://www.slideshare.net/Hadoop_Summit/w-1205phall1saha&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 12 Nov 2016 00:09:18 GMT</pubDate>
    <dc:creator>gkeys</dc:creator>
    <dc:date>2016-11-12T00:09:18Z</dc:date>
    <item>
      <title>Easy explanation on Map Reduce phase - From InputSplit to Reducer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Easy-explaination-on-Map-Reduce-phase-From-InputSplit-to/m-p/172997#M45938</link>
      <description>&lt;P&gt;I am looking for an easy explanation of the MapReduce phases, from InputSplit to Reducer.&lt;/P&gt;&lt;P&gt;The role of InputSplit and RecordReader in the map phase&lt;/P&gt;&lt;P&gt;When the shuffle/sort phase runs&lt;/P&gt;&lt;P&gt;The partition phase&lt;/P&gt;&lt;P&gt;How the data gets to the reducer&lt;/P&gt;</description>
      <pubDate>Fri, 11 Nov 2016 21:50:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Easy-explaination-on-Map-Reduce-phase-From-InputSplit-to/m-p/172997#M45938</guid>
      <dc:creator>amit_dass</dc:creator>
      <dc:date>2016-11-11T21:50:36Z</dc:date>
    </item>
    <item>
      <title>Re: Easy explanation on Map Reduce phase - From InputSplit to Reducer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Easy-explaination-on-Map-Reduce-phase-From-InputSplit-to/m-p/172998#M45939</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Map-reduce&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;This is a good high-level (easy) explanation: &lt;A href="http://www.thegeekstuff.com/2014/05/map-reduce-algorithm/" target="_blank"&gt;http://www.thegeekstuff.com/2014/05/map-reduce-algorithm/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;To really understand it, you need to dive deeper. For example, each map task writes its output to an in-memory buffer, which spills to local disk as it fills; this intermediate data is then sent across the network to the reducer(s). If you want to understand map-reduce well enough to optimize performance, this book is a good way to go: &lt;A href="http://shop.oreilly.com/product/0636920033448.do" target="_blank"&gt;http://shop.oreilly.com/product/0636920033448.do&lt;/A&gt;&lt;/P&gt;&lt;P&gt;You can write your own map-reduce programs, but more often they are generated for you when you run a Hive or Pig job.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Tez&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;If you are running Hive or Pig queries, you should run them in &lt;STRONG&gt;&lt;EM&gt;tez&lt;/EM&gt;&lt;/STRONG&gt; mode. Tez is an alternative processing engine to map-reduce and is typically much faster.&lt;/P&gt;&lt;P&gt;See:&lt;/P&gt;&lt;P&gt;&lt;A href="http://hortonworks.com/apache/tez/" target="_blank"&gt;http://hortonworks.com/apache/tez/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://www.slideshare.net/Hadoop_Summit/w-1205phall1saha" target="_blank"&gt;http://www.slideshare.net/Hadoop_Summit/w-1205phall1saha&lt;/A&gt;&lt;/P&gt;</description>
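      <!--
To make the flow asked about in this thread concrete (InputSplit, RecordReader, map, partition, shuffle/sort, reduce), here is a minimal single-process sketch in plain Python. It is not the Hadoop API: every function name, the word-count job, the split size, and the reducer count are assumptions made for illustration, and line numbers stand in for the byte offsets a real text record reader would use as keys. The in-memory buckets only stand in for the buffered spill files and network transfer described in the reply above.

```python
from collections import defaultdict
from itertools import groupby
from zlib import crc32

NUM_REDUCERS = 2  # assumed value for this toy example


def input_splits(lines, split_size):
    """Cut the input into splits; each split feeds one map task."""
    for start in range(0, len(lines), split_size):
        yield lines[start:start + split_size]


def record_reader(split, first_line_no):
    """Turn a split into (key, value) records: (line number, line text)."""
    for offset, line in enumerate(split):
        yield first_line_no + offset, line


def map_fn(_key, line):
    """Word-count mapper: emit (word, 1) for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def partition(key, num_reducers):
    """Pick the reducer for a key; crc32 keeps the choice stable across runs."""
    return crc32(key.encode("utf-8")) % num_reducers


def reduce_fn(key, values):
    """Word-count reducer: sum the counts for one word."""
    return key, sum(values)


def run_job(lines, split_size=2, num_reducers=NUM_REDUCERS):
    # Map side: each split is read record by record, mapped, and the output
    # is bucketed by partition (standing in for the buffered spill files).
    map_outputs = defaultdict(list)
    for split_no, split in enumerate(input_splits(lines, split_size)):
        for key, value in record_reader(split, split_no * split_size):
            for out_key, out_value in map_fn(key, value):
                bucket = partition(out_key, num_reducers)
                map_outputs[bucket].append((out_key, out_value))

    # Shuffle and sort: each reducer receives its partition, sorted by key.
    results = {}
    for reducer_no in range(num_reducers):
        pairs = sorted(map_outputs[reducer_no], key=lambda kv: kv[0])
        # Reduce side: reduce_fn runs once per distinct key.
        for key, group in groupby(pairs, key=lambda kv: kv[0]):
            word, total = reduce_fn(key, (v for _, v in group))
            results[word] = total
    return results


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "the fox"]
    print(run_job(sample))
```

The partition step is what guarantees that every occurrence of a given key lands on the same reducer, which is why the per-reducer sort and group is enough to produce correct totals.
      -->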
      <pubDate>Sat, 12 Nov 2016 00:09:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Easy-explaination-on-Map-Reduce-phase-From-InputSplit-to/m-p/172998#M45939</guid>
      <dc:creator>gkeys</dc:creator>
      <dc:date>2016-11-12T00:09:18Z</dc:date>
    </item>
    <item>
      <title>Re: Easy explanation on Map Reduce phase - From InputSplit to Reducer</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Easy-explaination-on-Map-Reduce-phase-From-InputSplit-to/m-p/172999#M45940</link>
      <description>&lt;P&gt;Hi Amith,&lt;/P&gt;&lt;P&gt;Please refer to this video; it explains the MapReduce flow very well. I hope you find it useful.&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.youtube.com/watch?v=6OemZEJdMp8" target="_blank"&gt;https://www.youtube.com/watch?v=6OemZEJdMp8&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Mahesh&lt;/P&gt;</description>
      <pubDate>Mon, 14 Nov 2016 23:53:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Easy-explaination-on-Map-Reduce-phase-From-InputSplit-to/m-p/172999#M45940</guid>
      <dc:creator>maheshmsh88</dc:creator>
      <dc:date>2016-11-14T23:53:00Z</dc:date>
    </item>
  </channel>
</rss>

