New Contributor
Posts: 1
Registered: 01-27-2015

Which disk hosts the staging area?

Hi,

In my experiments with CDH5, I have always set the staging area parameter "hadoop.tmp.dir" to "/tmp" on HDFS.

My main question is: is this staging area located on the local disk of some DataNode that the NameNode picks at random?

If so, then I understand the HDFS file write path to be as follows: the client gets the name of the DataNode hosting the staging area, writes the first block to it, and then initiates the replication pipeline to mirror this block to the other DataNodes (specified by the NameNode). Is this correct?
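To make the question concrete, here is a toy sketch of the write path as I currently picture it. It is not Hadoop code: the names `ToyNameNode`, `choose_pipeline`, and `write_block` are hypothetical, and the random placement is only my assumption about how the NameNode behaves, which is exactly what I would like confirmed or corrected.

```python
# Toy model of the write path described in the question.
# NOT Hadoop code: every class and function name here is hypothetical.
import random


class ToyNameNode:
    def __init__(self, datanodes, replication=3):
        self.datanodes = list(datanodes)
        self.replication = replication

    def choose_pipeline(self, rng):
        # Assumption: the NameNode picks the target DataNodes at random.
        return rng.sample(self.datanodes, self.replication)


def write_block(namenode, block, storage, rng):
    # The client asks the NameNode which DataNodes should hold the block.
    pipeline = namenode.choose_pipeline(rng)
    # The client sends the block to the first DataNode in the pipeline;
    # each DataNode then forwards the block to the next one.
    for node in pipeline:
        storage.setdefault(node, []).append(block)
    return pipeline


rng = random.Random(0)
storage = {}  # maps DataNode name -> list of blocks stored on it
namenode = ToyNameNode(["dn1", "dn2", "dn3", "dn4"])
pipeline = write_block(namenode, "blk_0001", storage, rng)
print(pipeline)
```

In this sketch the first DataNode in `pipeline` plays the role of what I have been calling the host of the staging area.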

 

Regards

Rajesh

 
