10-29-2013 08:51 PM
I have updated "dfs.datanode.data.dir" on one datanode in CM4.6.3.
After rebooting the whole cluster, I can still see the new value in the CM configuration tab.
But I cannot find the new value in any hdfs-site.xml on this datanode.
I suppose this new value must be somewhere in this datanode, right?
So, could you tell me in which configuration file this value is stored?
10-30-2013 02:43 AM
Did you apply this new dfs value to ALL datanode roles globally, or did you set it as an override on just one datanode? (Perhaps you have a node with fewer or more data volumes, and you wanted it to use different values than the rest of the DNs.)
If you've set this as a "one-off" configuration, look for an override within HDFS > Configuration > View & Edit for the property you've altered.
More to the point, this configuration is picked up for the scope in which you applied it. If you applied it globally, you'd find after restarting the Datanodes that the DNs are using the new values. A new hdfs-site.xml is generated each time a service starts, and these files can be viewed by going to
Services > hdfs > Instances > [click on a Datanode from the list] > Processes
Then on this page, expand the link near the middle of the page that says "Show" under "Configuration Files/Environment". This will give links to ALL the configuration files used for that specific process start of the Datanode. The hdfs-site.xml should reflect the dfs.data.dir settings you applied, if that role is intended to be a recipient of that property/value.
If it is not, then there could be some other explanation such as use of Role Config Groups, or unintended overrides.
10-31-2013 04:14 AM
@smark, thanks for your informative reply.
Yes, this is an "Overriding Configuration Settings" case. I changed the value for just one instance in the datanode role group.
I followed "Services > hdfs > Instances > [click on a Datanode from the list] > Processes", expanded "Show" under "Configuration Files/Environment", and clicked hdfs-site.xml. I found the updated value in this file.
But the hyperlink for hdfs-site.xml points to the remote CM machine instead of the local machine.
So, my questions are:
Is this hdfs-site.xml on the local machine or the remote one?
If it is local, where is it? I expected it to be "/etc/hadoop/conf/hdfs-site.xml", but it is not there.
If it is remote (the hyperlink points to the remote CM node), this is a big surprise to me. Can the datanode service really start using a remote hdfs-site.xml as its configuration file!? I cannot even find the file on the remote CM machine. Is it only held in memory?
10-31-2013 10:22 AM
The hdfs-site.xml file that you are viewing in CM which @smark helped you to find, resides on the local filesystem of that remote datanode. It will not be in /etc/hadoop/conf, though (unless you re-deploy your client configs to that machine), as CM maintains its own configuration directory in /var/run/cloudera-scm-agent/process for the roles that it manages. You will find the hdfs-site.xml file under that directory in the latest ???-Datanode directory.
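As a rough sketch of how you might check this on the datanode itself, the Python below looks under /var/run/cloudera-scm-agent/process for the most recently modified directory whose name contains "datanode" and prints the data dir property from its hdfs-site.xml. The helper names (latest_datanode_dir, read_property) are made up for illustration, and both the name match and the use of modification time to pick the "latest" directory are assumptions, so adjust them to what you actually see in that directory. You may need to run it as root to read the files.

```python
import os
import xml.etree.ElementTree as ET

# Directory named in the reply above; CM keeps per-process configs here.
PROCESS_ROOT = "/var/run/cloudera-scm-agent/process"


def latest_datanode_dir(process_root=PROCESS_ROOT):
    """Return the newest directory whose name contains 'datanode', or None.

    Assumption: matching on the name (case-insensitive) and choosing by
    modification time is enough to find the current Datanode process dir.
    """
    if not os.path.isdir(process_root):
        return None
    candidates = [
        os.path.join(process_root, name)
        for name in os.listdir(process_root)
        if "datanode" in name.lower()
        and os.path.isdir(os.path.join(process_root, name))
    ]
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


def read_property(site_xml_path, prop_name):
    """Return the <value> of a named <property> in a Hadoop *-site.xml file."""
    root = ET.parse(site_xml_path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name") == prop_name:
            return prop.findtext("value")
    return None


if __name__ == "__main__":
    dn_dir = latest_datanode_dir()
    site_xml = os.path.join(dn_dir, "hdfs-site.xml") if dn_dir else None
    if site_xml is None or not os.path.isfile(site_xml):
        print("no Datanode hdfs-site.xml found under", PROCESS_ROOT)
    else:
        print("reading", site_xml)
        # Both property names used in this thread are checked.
        for name in ("dfs.datanode.data.dir", "dfs.data.dir"):
            print(name, "=", read_property(site_xml, name))
```

If the value printed here matches what the CM configuration tab shows, the override reached that datanode's generated configuration.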
10-31-2013 08:41 PM