What is the path of hdfs-site.xml and core-site.xml?

Rising Star

Hello,

I installed HDFS using CM on my Linux system. I want to know the paths of hdfs-site.xml and core-site.xml.

 

Thanks

Bala

1 ACCEPTED SOLUTION

Master Collaborator

It is important to understand the difference between using CM and using just CDH.

 

When using Cloudera Manager, configuration is stored in a central DB. When a service starts on a target cluster node, Cloudera Manager passes the runtime configuration to be used through the Agent on that host, then starts the processes and points them to that runtime location.


As a result, the actual service configuration is stored in a non-standard location:

 

/var/run/cloudera-scm-agent/process/###-[service]-[SERVICE-ROLE]


The most recent "instance" of a path is the current runtime config (run ls -ltr as root in the /var/run/cloudera-scm-agent/process path; the last entry listed is the most current).
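The ls -ltr lookup can also be scripted. The following is only a minimal sketch, assuming the directory layout described above and that modification time identifies the newest instance, as it does with ls -ltr. The function name newest_process_dir and the role suffix "-hdfs-NAMENODE" are hypothetical names chosen for illustration, and reading the real path normally requires root.

```python
import os

# Default agent process directory, as described in this post.
PROCESS_ROOT = "/var/run/cloudera-scm-agent/process"


def newest_process_dir(role_suffix, root=PROCESS_ROOT):
    """Return the most recently modified directory under root whose
    name ends with role_suffix, or None if nothing matches."""
    candidates = []
    for entry in os.scandir(root):
        if entry.is_dir() and entry.name.endswith(role_suffix):
            # Same ordering key that ls -ltr uses: modification time.
            candidates.append((entry.stat().st_mtime, entry.path))
    if not candidates:
        return None
    return max(candidates)[1]


if __name__ == "__main__":
    # "-hdfs-NAMENODE" is an illustrative suffix; adjust it to the
    # directory names you actually see on your host.
    if os.path.isdir(PROCESS_ROOT):
        print(newest_process_dir("-hdfs-NAMENODE"))
```

The returned directory is where you would look for that role's hdfs-site.xml and core-site.xml.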


You can access the SAME information for each service role instance from the "Processes" tab. For example, for HDFS:

 

Cloudera Manager > Cluster > HDFS > Instances > (pick for example, the NameNode from the list)> Processes 

 

Under "Configuration Files/Environment" you will see a greater-than sign (>) that you can click to expand and show all the current configs passed to the server. This is the same information as in the path I describe above. It is handy because not all cluster administrators have the root access needed to reach that path.

 

The Cloudera Manager function "Deploy client configuration" pushes the current configuration information SPECIFIC TO CLIENT APPLICATIONS to the cluster hosts and defined gateway nodes. It ends up in the default /etc/ locations you are used to from the Hadoop and CDH documentation. Those locations will not have the complete configuration used by the server, just the values that client applications (CLI, custom apps, etc.) need in order to use the cluster.
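To see the client/server difference concretely, you can load both copies of a file and compare property names. This is a rough sketch, assuming only the standard Hadoop configuration layout (a configuration element containing property elements with name and value children). The helper names parse_site_xml and server_only_properties, and the sample values, are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET


def parse_site_xml(xml_text):
    """Parse the text of a Hadoop *-site.xml file into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def server_only_properties(client_props, server_props):
    """Return the sorted property names set for the server but not for clients."""
    return sorted(set(server_props) - set(client_props))


# Illustrative content only; real files come from /etc/hadoop/conf (client)
# and the agent process directory (server).
CLIENT_XML = """<configuration>
  <property><name>dfs.replication</name><value>3</value></property>
</configuration>"""

SERVER_XML = """<configuration>
  <property><name>dfs.replication</name><value>3</value></property>
  <property><name>dfs.namenode.name.dir</name><value>/data/nn</value></property>
</configuration>"""

if __name__ == "__main__":
    client = parse_site_xml(CLIENT_XML)
    server = parse_site_xml(SERVER_XML)
    print(server_only_properties(client, server))
```

In practice you would read the two files from disk and pass their contents to parse_site_xml instead of using the inline samples.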

 

Todd


7 REPLIES

Explorer
/etc/hadoop/[service name]/hdfs-site.xml

Example:
/etc/hadoop/conf.cloudera.hdfs1/hdfs-site.xml

core-site.xml is in the same directory.

Expert Contributor
When using CM, you should manage your services via CM as well.

The client configuration can be found under /etc/hadoop/conf but the
configuration used by various services can be different and is visible via
the CM web interface.

Explorer

This is an old post, but I have a follow-up question to Tgrayson's excellent response.

 

These directories have the "client" configurations. Is there a similar location where the server information is kept? I am trying to recreate and preserve the history of changes made to our cluster in order to baseline and, perhaps, roll back.

 

Or is there a better way?

 

Thanks

Arthur

Master Collaborator
In a CM-managed cluster, the current runtime configs are in the current instance path (newly created at each startup) under /var/run/cloudera-scm-agent/process/###-SERVICE-roleInstance. To identify the current one, run ls -lrt in that path.

Rollback is a function of the CM UI. You would never attempt a manual rollback by manipulating anything in that path, because each start/stop of the process instantiates a new runtime config out of the SCM DB.

Explorer

When I said roll-back, I was not referring to an automated one; obviously that is best handled by your system.

 
I meant that over time we are changing settings. I would like to somehow capture those changes and put them into Perforce so that we can track what was done when. This would make it much easier to get back to a known correct setup when things go wrong, as they always do.
 
Also, where are items like the TaskTracker settings? I don't see them in that path, but maybe I'm overlooking them.
 
Thanks again.

New Contributor

Great info.

I have one point of confusion.

Can we manually start/stop services that were installed using parcels?

If yes, what configuration is needed?