Where is the new value of "dfs.datanode.data.dir" stored after updating it in CM?

Explorer

hi,

 

I updated "dfs.datanode.data.dir" on one datanode in CM 4.6.3.

After rebooting the whole cluster, I can still see the new value in the CM Configuration tab.

But I cannot find the new value in any hdfs-site.xml on this datanode.

I assume the new value must be stored somewhere on this datanode, right?

So, could you tell me which configuration file holds this value?

 

Thanks.

 

 


5 REPLIES

Explorer
The cluster is on CentOS 6.

Super Collaborator

Did you apply this new dfs value to ALL datanode roles globally, or did you set it as an override on just one datanode (perhaps you have a node with fewer or more data volumes, and you wanted it to use different values than the rest of the DNs)?

 

If you've set this as a "one-off" configuration, look for an override [1] under HDFS > Configuration > View & Edit for the property you've altered.

 

More to the point, this configuration is picked up for the scope in which you applied it. If you applied it globally, you'll find after restarting the Datanodes that the DNs are using the new values. A new hdfs-site.xml is generated each time a service starts, and these files can be viewed by going to:

 

Services > hdfs > Instances > [click on a Datanode from the list] > Processes 

 

Then on this page, expand the link near the middle of the page that says "Show" under "Configuration Files/Environment". This gives links to ALL the configuration files used for that specific start of the Datanode process. The hdfs-site.xml should reflect the dfs.datanode.data.dir settings you applied, if that Datanode is intended to receive that property/value.
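If you save the hdfs-site.xml shown on that page, a short script can confirm which value it carries. This is only a minimal sketch: the helper name read_hadoop_property and the sample XML (including its paths) are made up for illustration, and it assumes the file uses the usual Hadoop <configuration>/<property>/<name>/<value> layout.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the value of property `name` from Hadoop-style XML, or None.

    Hypothetical helper for illustration; not part of CM or Hadoop.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Made-up sample content; the directories below are placeholders.
SAMPLE = """<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/1/dfs/dn,/data/2/dfs/dn</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(read_hadoop_property(SAMPLE, "dfs.datanode.data.dir"))
```

To check a real file, read its contents into a string and pass that instead of SAMPLE.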

 

If it is not, then there could be some other explanation such as use of Role Config Groups, or unintended overrides.

 

[1] = http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Managi...

Explorer

@smark, thanks for your informative reply.

 

Yes, this is a case of "Overriding Configuration Settings". I changed the value for just one instance in the datanode role group.

 

Following "Services > hdfs > Instances > [click on a Datanode from the list] > Processes", then "Show" under "Configuration Files/Environment", I clicked hdfs-site.xml and found the updated value in this file.

But the hyperlink for hdfs-site.xml points to the remote CM machine, not to the local machine.

 

So, my questions are:

Is this hdfs-site.xml local or remote?

If it is local, where is it? I cannot find it. I assumed it would be "/etc/hadoop/conf/hdfs-site.xml", but it is not.

If it is remote (the hyperlink points to the remote CM node), that is a big surprise to me. Can the datanode service really use a remote hdfs-site.xml as its configuration file to start the service!? I cannot even find the file on the remote CM machine. Does it exist only in memory?

 

Thanks.

 

Guru

The hdfs-site.xml file that you are viewing in CM, which @smark helped you find, resides on the local filesystem of that remote datanode. It will not be in /etc/hadoop/conf, though (unless you re-deploy your client configs to that machine), because CM maintains its own configuration directory, /var/run/cloudera-scm-agent/process, for the roles that it manages. You will find the hdfs-site.xml file under that directory, in the latest ???-Datanode directory.
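To locate that directory without browsing by hand, something like the following could be run on the datanode host. It is a rough sketch under two assumptions that do not come from CM documentation: the role directory has "datanode" somewhere in its name (matched case-insensitively), and "latest" is taken to mean the most recently modified directory. The function name latest_datanode_dir is hypothetical, and reading the directory may need elevated privileges.

```python
import os

# Path taken from the reply above.
PROCESS_ROOT = "/var/run/cloudera-scm-agent/process"


def latest_datanode_dir(process_root=PROCESS_ROOT):
    """Return the most recently modified datanode directory, or None.

    Assumption: the directory name contains "datanode" (any case).
    """
    candidates = []
    for entry in os.listdir(process_root):
        path = os.path.join(process_root, entry)
        if os.path.isdir(path) and "datanode" in entry.lower():
            candidates.append(path)
    if not candidates:
        return None
    # Assumption: "latest" means newest modification time.
    return max(candidates, key=os.path.getmtime)


if __name__ == "__main__":
    if os.path.isdir(PROCESS_ROOT):
        found = latest_datanode_dir()
        if found:
            print(os.path.join(found, "hdfs-site.xml"))
        else:
            print("no datanode directory found under " + PROCESS_ROOT)
    else:
        print(PROCESS_ROOT + " not found; run this on the datanode host")
```

The printed path is the hdfs-site.xml to inspect for the overridden dfs.datanode.data.dir value.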

Explorer
I found it under /var/run/....

Thanks.

However, I still don't know where it is defined that the datanode service uses /var/run/cloudera-scm-agent/process/.../hdfs-site.xml as the configuration file to start the service.