In which memory are Map and Reduce tasks performed?
- Labels: Apache Hadoop
Created 03-29-2017 05:05 AM
I need detailed information on which memory map and reduce tasks run in. Does the reduce phase bring all the map tasks' output onto one node, perform the reduce there, and then give the final output?
Created 03-29-2017 05:28 AM
@heta desai The following docs give more detail on this:
2. https://hortonworks.com/apache/mapreduce/#section_1
3. https://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
You can specify the minimum unit of RAM to allocate for a Container. The tasks are run within containers launched by YARN. mapreduce.{map|reduce}.memory.mb is used by YARN to set the memory size of the container being used to run the map or reduce task. If the task grows beyond this limit, YARN will kill the container.
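To make the sizing rule above concrete, here is a minimal Python sketch. It is a simplified model, not YARN's actual code: it assumes the "minimum unit of RAM" corresponds to `yarn.scheduler.minimum-allocation-mb`, that requests are rounded up to a multiple of that unit and capped at `yarn.scheduler.maximum-allocation-mb`, and that a task is killed once its usage exceeds the container size. The exact rounding depends on the scheduler in use, and the function names are made up for illustration.

```python
import math


def normalize_container_mb(requested_mb, min_alloc_mb, max_alloc_mb):
    """Round a request (e.g. mapreduce.map.memory.mb) up to a multiple of the
    minimum allocation unit, capped at the maximum allocation.
    Simplified assumption; real schedulers may round differently."""
    units = max(1, math.ceil(requested_mb / min_alloc_mb))
    return min(units * min_alloc_mb, max_alloc_mb)


def exceeds_container(task_usage_mb, container_mb):
    """True when the task has grown past its container limit,
    the situation in which YARN kills the container."""
    return task_usage_mb > container_mb


# Hypothetical values: a 1500 MB map request with a 1024 MB minimum unit.
container_mb = normalize_container_mb(1500, min_alloc_mb=1024, max_alloc_mb=8192)
print(container_mb)                           # size of the granted container
print(exceeds_container(2300, container_mb))  # would this task be killed?
```

In this model a 1500 MB request occupies two 1024 MB units, so it pays to pick task sizes that are multiples of the minimum unit.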
Created 03-29-2017 07:30 AM
The task takes its memory from the NodeManager on the same node as the DataNode where the map's input split is stored (because of data locality), and the map output is stored in an in-memory buffer.
When this buffer is almost full, the spilling phase starts (in parallel) to remove data from it, and the spilled map output is stored on the local filesystem. The reducer's final output is normally written to the job's output filesystem, typically HDFS.
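The buffer-and-spill behaviour described above can be sketched as a toy Python model. It is not Hadoop's implementation: the class `MapOutputBuffer` and its parameters are hypothetical, capacity is counted in records rather than bytes, spills happen synchronously rather than in a background thread, and plain lists stand in for spill files on the local filesystem.

```python
import heapq


class MapOutputBuffer:
    """Toy model of a map task's in-memory output buffer."""

    def __init__(self, capacity, spill_fraction=0.8):
        # Spill once the buffer holds this many records.
        self.threshold = max(1, int(capacity * spill_fraction))
        self.records = []   # in-memory buffer
        self.spills = []    # stand-in for sorted spill files on local disk

    def collect(self, key, value):
        self.records.append((key, value))
        if len(self.records) >= self.threshold:
            self.spill()

    def spill(self):
        # Sort the buffered records by key and move them out of memory.
        if self.records:
            self.spills.append(sorted(self.records))
            self.records = []

    def close(self):
        # Flush what is left, then merge all sorted spills into one sorted run.
        self.spill()
        return list(heapq.merge(*self.spills))


buf = MapOutputBuffer(capacity=4, spill_fraction=0.5)
for word in ["b", "a", "c", "a", "b"]:
    buf.collect(word, 1)
merged = buf.close()
print(len(buf.spills), merged)
```

The merged, key-sorted run is what a reducer would later fetch from this map task; the reducer repeats a similar merge across the outputs of all map tasks assigned to it.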
Created 03-30-2017 07:47 AM
If the data is distributed over 3 nodes, the final result set will be a merge of the data from these 3 nodes. So my confusion is: where is this merge operation performed?
