As we know, the default minimum container size in YARN (yarn.scheduler.minimum-allocation-mb) is 1024 MB.
Consider this scenario:
I have set up a Hadoop cluster (1 master and 3 datanodes) in VirtualBox.
Namenode: master (3 GB RAM)
Datanodes: data1 (512 MB RAM), data2 (512 MB RAM), data3 (1 GB RAM)
Input file size: 500 MB with replication factor 3 (so every datanode holds every block).
Questions:
1. Say I want to run a MapReduce program. Will containers get allocated on data1 and data2, even though each has less RAM than the minimum container size?
   According to my results, the job created containers on all three datanodes.
2. Is there any way to make a job run on a specific node?
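
The arithmetic behind question 1 can be sketched as follows. This is only a rough illustration, not YARN's actual scheduler code. The class and method names are hypothetical. It assumes each memory request is rounded up to a multiple of the minimum container size, and that a map task requests 1024 MB. The 8192 MB figure is assumed to be the value a NodeManager advertises through yarn.nodemanager.resource.memory-mb when that setting has not been lowered to match the VM's RAM, which should be checked against the cluster's yarn-site.xml.

```java
// Hypothetical helper, not part of Hadoop: estimates how many containers
// fit on a node from the memory its NodeManager advertises.
class ContainerMath {

    // Round a request up to the next multiple of the minimum allocation
    // (assumed normalization rule; at least one unit is always granted).
    static int normalize(int requestMb, int minAllocMb) {
        int units = Math.max(1, (requestMb + minAllocMb - 1) / minAllocMb);
        return units * minAllocMb;
    }

    // Whole containers that fit into the memory advertised by one node.
    static int containersPerNode(int advertisedMb, int requestMb, int minAllocMb) {
        return advertisedMb / normalize(requestMb, minAllocMb);
    }

    public static void main(String[] args) {
        int minAllocMb = 1024; // minimum container size from the question
        int requestMb = 1024;  // assumed per-map-task request

        // If the NodeManager advertised the VM's real RAM:
        System.out.println("data1/data2 (512 MB): "
                + containersPerNode(512, requestMb, minAllocMb));
        System.out.println("data3 (1024 MB): "
                + containersPerNode(1024, requestMb, minAllocMb));

        // If the NodeManager advertised an assumed 8192 MB instead:
        System.out.println("any node (8192 MB advertised): "
                + containersPerNode(8192, requestMb, minAllocMb));
    }
}
```

Under these assumptions the 512 MB nodes would fit 0 containers and the 1 GB node would fit 1, whereas a node advertising 8192 MB would fit 8. If the NodeManagers on data1 and data2 advertise a configured value larger than their real RAM, that would be consistent with containers appearing on all three datanodes.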