Created on 03-16-2016 06:49 AM - edited 09-16-2022 03:09 AM
Hello,
I'm new to Hadoop and I have a design question:
We plan to install Hadoop under Ubuntu on Hyper-V machines. The Hyper-V admin defined a default memory allocation of 1GB and a maximum allocation of 32GB. Is the Hadoop framework able to allocate the additional 31GB if needed?
Regards
Klaus
Created 03-16-2016 11:40 AM
Hi Klaus,
Hadoop on 1GB doesn't make much sense. Also, YARN is not really flexible when it comes to memory allocation. The NodeManagers get a maximum amount of memory they can use to schedule YARN tasks (MapReduce, Spark, ...): yarn.nodemanager.resource.memory-mb
And a minimum container size, which every allocation is rounded up to a multiple of:
yarn.scheduler.minimum-allocation-mb
So if you have a maximum of 16GB and a minimum of 1GB, it can give out up to 16 task slots to applications.
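As a rough sketch, those two settings would go into yarn-site.xml like this (the 16GB/1GB values are just the example numbers from above, not recommendations):

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <!-- Total memory this NodeManager may hand out to containers, in MB -->
    <value>16384</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <!-- Smallest container size; requests are rounded up to a multiple of this -->
    <value>1024</value>
  </property>

With these values the node can run at most 16384 / 1024 = 16 minimum-sized containers at once.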
But AFAIK there is no way to dynamically change that without restarting the NodeManagers.
YARN makes sure no task exceeds the slots it is given, but it doesn't give a damn about operating system limits. So if you give your VM 4GB and set the YARN memory to 32GB, YARN will happily schedule tasks until your system is brought to its knees. You can of course enable swapping, but that will result in bad performance.
So in summary: flexible memory settings on a NodeManager node are not a good idea.
Created 03-17-2016 04:44 AM
Hello Benjamin,
Many thanks for your explanations. I will forward them to the Hyper-V admin.
🙂 Klaus