According to the documentation, map tasks run on DataNodes, subject to data-locality constraints that the scheduler tries to honor, while reduce tasks can run anywhere in the cluster.
Does "can run anywhere in the cluster" for reduce tasks refer only to DataNodes, or is the ResourceManager machine also considered part of the cluster, so that reduce tasks are allowed to run on the ResourceManager as well?
Each map task has a circular buffer that it writes its output to. The buffer is 100 MB by default (the size can be tuned via the mapreduce.task.io.sort.mb property). When the contents of the buffer reach a certain threshold, controlled by the mapreduce.map.sort.spill.percent property (default 80%), a background thread starts to spill the contents to disk. Spills are written in round-robin fashion to the directories specified by the mapreduce.cluster.local.dir property, in a job-specific directory. Where is the spill directory named by mapreduce.cluster.local.dir located?
1) Local to the ResourceManager machine
2) Local to the DataNode machine on which the particular map task runs
3) Local to any DataNode machine other than the one on which the particular map task runs
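For reference, the three properties discussed above live in mapred-site.xml. A sketch with the default values cited above (the directory paths in the last property are hypothetical examples, not defaults):

```xml
<!-- mapred-site.xml: spill-related settings discussed above -->
<configuration>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>100</value>  <!-- circular in-memory buffer size, in MB -->
  </property>
  <property>
    <name>mapreduce.map.sort.spill.percent</name>
    <value>0.80</value> <!-- spill threshold: 80% of the buffer -->
  </property>
  <property>
    <name>mapreduce.cluster.local.dir</name>
    <!-- local directories on each worker machine; example paths only -->
    <value>/data/1/mapred/local,/data/2/mapred/local</value>
  </property>
</configuration>
```

Note that mapreduce.cluster.local.dir points at local disks of the machine where the map task itself runs, which bears directly on the question above.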
Yes, it is my imagination, the result of looking at the bigger picture theoretically rather than from anything work-related. I was also trying to foresee a situation where it becomes necessary for NameNodes to talk to each other: when an application's huge namespace data gets partitioned (due to RAM constraints) and resides on other NameNodes.
From your explanation and the links provided, it is clear that such a situation does not exist and is most likely never going to occur in real-world scenarios. Thus NameNodes never need to talk to each other.
Hi @mqureshi Understood that the client needs to know upfront the name of the namespace it wants to work with; in other words, the namespace containing the data blocks that are needed to distribute the map tasks in parallel for a particular use case.
So the naming of a namespace enters the Hadoop picture at the time of storing the data into the common HDFS storage cluster. The input data is split into input splits and stored as several blocks across the DataNodes. All the metadata generated as a result of storing the input splits into various blocks in the DataNode cluster is added to the namespace in the corresponding NameNode. Now suppose the input file we want to process is so immensely huge that the input splits generated for it are correspondingly huge in number, and the resulting metadata for the storage references of all those input splits grows beyond the NameNode's RAM capacity.
In such a scenario, do we have any option other than the one below?
Divide the input file such that the metadata in each NameNode is allowed to grow only within that NameNode's RAM capacity.
For example, suppose we have two NameNodes (NN1, NN2) in our cluster, federated as follows:
NN1 has 8 GB RAM
NN2 has 6 GB RAM
Our original input file is so huge that, let's say, its metadata would occupy 10 GB of RAM during HDFS storage.
As we know, we cannot store this file in HDFS under a single namespace without partitioning the input file manually.
So we divide the input file into two proportions such that the first part's metadata references (created during HDFS storage in NN1) occupy less than 8 GB of RAM, i.e. 5 GB, and the second part's metadata references (created during HDFS storage in NN2) occupy less than 6 GB of RAM, i.e. 5 GB.
Does the system auto-detect such cases, or is manual partitioning of the input data unavoidable here?
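To make the sizing scenario above concrete, here is a rough back-of-the-envelope sketch. It assumes the commonly quoted rule of thumb of roughly 150 bytes of NameNode heap per namespace object (file or block); that figure is an approximation, not an exact Hadoop guarantee, and the workload numbers in the example are invented:

```python
# Rough NameNode heap sizing sketch.
# Assumption: ~150 bytes of heap per namespace object (file or block),
# a commonly quoted rule of thumb, not an exact figure.

BYTES_PER_OBJECT = 150

def objects_per_heap(heap_gb: float) -> int:
    """How many namespace objects (files + blocks) fit in a given heap size."""
    return int(heap_gb * 1024**3 / BYTES_PER_OBJECT)

def heap_needed_gb(num_files: int, avg_blocks_per_file: float) -> float:
    """Heap (in GB) needed to track num_files files, each averaging
    avg_blocks_per_file blocks (1 object per file + 1 per block)."""
    objects = num_files * (1 + avg_blocks_per_file)
    return objects * BYTES_PER_OBJECT / 1024**3

# NN1 (8 GB) vs NN2 (6 GB) from the example above:
print(objects_per_heap(8))  # objects NN1 can track
print(objects_per_heap(6))  # objects NN2 can track
# A hypothetical dataset of 1M files averaging 1.5 blocks each:
print(round(heap_needed_gb(1_000_000, 1.5), 3))
```

In practice one sizes the NameNode heap ahead of time from estimates like this; HDFS does not automatically repartition a namespace across federated NameNodes when one NameNode's heap fills up.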
In Hadoop 1, horizontal scaling of the NameNode was not possible. Because of this, there was a limit to how far one could horizontally scale the DataNodes in a cluster. Hadoop 2 allows horizontal scaling of NameNodes using a technique called HDFS federation.
Without HDFS federation, one cannot add DataNodes to the Hadoop cluster beyond a certain limit: a single NameNode's RAM eventually becomes fully occupied with the metadata/namespace information used to manage all the DataNodes in the cluster. At that point, adding new DataNodes, i.e. horizontal scaling of DataNodes, becomes impossible, because the NameNode's RAM is already full and cannot hold the additional metadata/namespace information generated by managing the extra DataNodes. So the limiting factor is the capacity of the NameNode's RAM. HDFS federation comes to the rescue by dividing the metadata/namespace across multiple NameNodes, thereby offloading the part of one application's metadata/namespace information that has grown beyond the capacity of a single NameNode's RAM to a second NameNode's RAM. Is my understanding correct?
If YES: the documentation says NameNodes are independent and do not communicate with each other. In that case, who manages an application's metadata/namespace information that is spread across two NameNodes? Some sort of daemon would have to communicate with all the NameNodes and locate the portions of metadata split between them.
If NO: one application's metadata/namespace information does not get split; rather, it stays in a single NameNode's RAM. The next NameNode is used only when the system foresees a situation where an application's metadata/namespace information cannot be fully accommodated in the current NameNode's RAM for all of its MapReduce jobs to execute successfully. If it does not fit, the system has to try a NameNode whose RAM capacity is sufficient to hold it.
So my take here is that HDFS federation is just a workaround and has not actually solved the issue. Please confirm.
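For context on the "who manages the split namespaces" part of the question: in a federated cluster no daemon coordinates the NameNodes; instead, clients use a client-side ViewFS mount table that maps path prefixes to specific namespaces. A sketch of such a core-site.xml (the mount-table name, hostnames, and paths are hypothetical examples):

```xml
<!-- core-site.xml: client-side ViewFS mount table (hypothetical hosts/paths) -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>viewfs://clusterX</value>
  </property>
  <!-- /user is served by the namespace on nn1 -->
  <property>
    <name>fs.viewfs.mounttable.clusterX.link./user</name>
    <value>hdfs://nn1.example.com:8020/user</value>
  </property>
  <!-- /logs is served by the namespace on nn2 -->
  <property>
    <name>fs.viewfs.mounttable.clusterX.link./logs</name>
    <value>hdfs://nn2.example.com:8020/logs</value>
  </property>
</configuration>
```

The split is by path prefix, so any one file's metadata lives wholly in one NameNode; the mount table only routes clients, and the NameNodes never talk to each other.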