Created on 04-10-2016 12:32 PM - edited 09-16-2022 03:13 AM
Hi experts,
I need help. I have installed CDH 5.7.0 on CentOS 6, and all services are up and running.
However, testing the installation by running the simple pi example with the following command does not execute a MapReduce job: sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100
The job is scheduled, but nothing happens, so it runs forever.
What can I do to find out what is wrong with my installation?
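For reference, the queued application and the registered nodes can be inspected with the standard yarn CLI; the ResourceManager web UI (default port 8088) shows the same information:
# list applications that were accepted but are still waiting for resources
yarn application -list -appStates ACCEPTED
# list the NodeManagers registered with the ResourceManager
yarn node -list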
Thanks & Regards
Created on 04-11-2016 09:44 PM - edited 04-11-2016 09:45 PM
I've seen this happen when the job requests more memory than you have allowed YARN to use on any one host. The job is scheduled and waits for a node with enough memory to check in and take it, which never happens, since no single node can take it on.
Try adjusting mapreduce.map.memory.mb, mapreduce.reduce.memory.mb, and yarn.app.mapreduce.am.resource.mb, both in the cluster configuration and on the node you are submitting the job from.
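If you only want to test, you can also pass the overrides per job: the pi example is, as far as I know, launched through ToolRunner, so it accepts generic -D options before its own arguments. The 512 MiB values below are only illustrative; pick numbers that fit the memory your node actually has:
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi -Dmapreduce.map.memory.mb=512 -Dmapreduce.reduce.memory.mb=512 -Dyarn.app.mapreduce.am.resource.mb=512 10 100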
Here are some helpful resources for YARN tuning:
https://support.pivotal.io/hc/en-us/articles/201462036-MapReduce-YARN-Memory-Parameters
http://www.cloudera.com/documentation/enterprise/5-3-x/topics/cdh_ig_yarn_tuning.html
Created 04-13-2016 01:51 PM
Thanks, I will try it.
The first parameter, mapreduce.map.memory.mb, was set to 0 GiB; maybe that is the problem.
Created 04-13-2016 02:25 PM
No success, the job is still running forever 😞
I have updated the memory setting from 0 GiB to 1 GiB.
That memory is also available on the node, but the job will not start.
I'm lost.
I have not altered the
Created 04-13-2016 09:23 PM
Based on the screenshot, it looks like you have only one node with 1 GB available? Note that this 1 GB is what is usable with heap and virtual memory combined on the server side; what you specify in mapreduce.map.memory.mb is heap memory only.
From the link above, re-linked here [1]: YARN will automatically calculate the virtual memory to add to the task using yarn.nodemanager.vmem-pmem-ratio (default 2.1), and then round up to a multiple of yarn.scheduler.minimum-allocation-mb (default 1024 MiB).
The math would be:
1 GiB (task heap memory) × 1024 = 1024 MiB task heap memory
1024 MiB × 2.1 = 2150.4 MiB including virtual memory
2150.4 MiB rounded up to the next multiple of 1024 = 3072 MiB total memory needed for the task.
If you can't give the cluster more resources, you can tweak yarn.nodemanager.vmem-pmem-ratio (you do still need some overhead), and you can set yarn.scheduler.minimum-allocation-mb to smaller increments.
Setting mapreduce.map.memory.mb and mapreduce.reduce.memory.mb to 1024 MiB / 2.1 (about 487 MiB) would allow the AM to run without tweaking either. But note that for YARN, you need the AM to run as well as at least one task. So you really must set yarn.scheduler.minimum-allocation-mb to 512 or smaller, and then the target for mapreduce.map.memory.mb and mapreduce.reduce.memory.mb would be 512 MiB / 2.1 ≈ 240 MiB.
240 MiB may work, depending on what you are running.
[1] https://support.pivotal.io/hc/en-us/articles/201462036-MapReduce-YARN-Memory-Parameters
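It is also worth confirming which values were actually deployed to the client. On a CDH node the client configuration normally lives under /etc/hadoop/conf (assuming the standard client-config deployment), so something like:
# scheduler increment the ResourceManager was deployed with
grep -A 1 'yarn.scheduler.minimum-allocation-mb' /etc/hadoop/conf/yarn-site.xml
# per-job defaults picked up by the client
grep -A 1 'mapreduce.map.memory.mb' /etc/hadoop/conf/mapred-site.xml
Keep in mind that values changed in Cloudera Manager only take effect after the configuration is redeployed and the services are restarted.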
Created on 04-14-2016 01:45 PM - edited 04-14-2016 01:47 PM
The "cluster" is pseudo distributed with one node on CentOs6 and I've updated the settings according to your recommendation:
mapreduce.map.memory.mb and mapreduce.reduce.memory.mb = 240 MiB
Deployed the configuration and restarted the services, but the result is the same: the job runs forever.
I really need to test something urgently, and I'm lost.
The health of the services is good, except for "under-replicated blocks".
I will follow up on this.
Thanks for any hints.
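In case it is useful, the capacity the single NodeManager actually reports can be checked with the yarn CLI; yarn node -status prints, among other fields, the node's memory capacity and usage (<Node-Id> comes from the -list output and is a placeholder here):
yarn node -list
yarn node -status <Node-Id>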
Created 04-16-2016 01:24 PM
So I've fixed it by adjusting the following YARN settings:
yarn.scheduler.maximum-allocation-mb = 8 GiB
mapreduce.map.memory.mb = 4 GiB
mapreduce.reduce.memory.mb = 4 GiB
And I got the test example running with the following command:
sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100
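For anyone verifying their own run: when it works, the example should finish with output along these lines (numbers elided, as they depend on the run):
Job Finished in ... seconds
Estimated value of Pi is 3.14...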
Thanks for the comments.