How to export all HDP configuration files (xml/properties,etc)?

Expert Contributor

Guys,

I would like to export all the configuration files of an HDP 2.3 cluster for reference (not the blueprint). Is there a command or utility that exports all the *-site*.xml files and other configuration files?

Thanks in advance

1 ACCEPTED SOLUTION

Guru

In the Ambari UI, go to Hosts, select a host that has clients installed, open Host Actions, and click Download Client Configs.

This will get you all the configuration files. You can also get them by exporting an Ambari blueprint, but that gives you JSON rather than individual XML files.

8 REPLIES

Super Guru

@Smart Solutions

As @Ravi Mutyala mentioned, those will be only the client configurations. I read your question as broader: the cluster as a whole, with server-side services as well as clients, so I don't think you got a complete answer. A simple shell script that finds all *-site.xml files and tars them would help cover the rest.
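The find-and-tar idea can be sketched in a few lines. This is a minimal Python version rather than a shell script, and it makes assumptions: the function name `archive_site_files` is made up for illustration, and the root directory to search is whatever holds the configs on your hosts. It only sees files on the host where it runs.

```python
import fnmatch
import os
import tarfile


def archive_site_files(conf_root, archive_path, pattern="*-site.xml"):
    """Tar every file under conf_root whose name matches pattern.

    Returns the sorted list of archived paths, relative to conf_root.
    Hypothetical helper for illustration, not an Ambari or HDP tool.
    """
    archived = []
    with tarfile.open(archive_path, "w:gz") as tar:
        # os.walk does not descend into symlinked directories by default.
        for dirpath, _dirnames, filenames in os.walk(conf_root):
            for name in fnmatch.filter(filenames, pattern):
                full_path = os.path.join(dirpath, name)
                rel_path = os.path.relpath(full_path, conf_root)
                # Store relative paths so the archive unpacks anywhere.
                tar.add(full_path, arcname=rel_path)
                archived.append(rel_path)
    return sorted(archived)
```

Running it once per host and collecting the archives would give a cluster-wide snapshot. If the config directories on your hosts are symlinks, pass `followlinks=True` to `os.walk` or point the function at the link targets.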

Hi @Smart Solutions. There is already a good answer above, but I'd also add the following.

Are you using Ambari on this cluster?

If you're not using Ambari then all the configuration files you mentioned should be under some sort of configuration management control, such as Chef, Puppet, Ansible etc.

If you are using Ambari, then of course all this information is kept in the Ambari database, where all of the configuration is represented and where you can do revision control of the configs within the Ambari environment (plus perform config version comparison etc).

Technically it is possible to use the Ambari API to query all of the configuration parameters that make up the various XML configurations, but doing it that way would be a giant chunk of work.

See https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/configuration.md
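To give a feel for what that work involves: as I read the linked document, the cluster resource exposes a `desired_configs` map of config type to current tag, and each type/tag pair can then be fetched from the `configurations` endpoint. The sketch below only builds the URLs from such a map. The function name and the sample host are assumptions, and the response shape should be checked against your Ambari version.

```python
from urllib.parse import urlencode


def current_config_urls(base_url, cluster, desired_configs):
    """Map each config type to the URL of its currently desired version.

    desired_configs is assumed to look like the Clusters/desired_configs
    section of the cluster resource: {"hdfs-site": {"tag": "version1"}, ...}.
    Hypothetical helper for illustration.
    """
    urls = {}
    for config_type, info in desired_configs.items():
        query = urlencode({"type": config_type, "tag": info["tag"]})
        urls[config_type] = (
            f"{base_url}/api/v1/clusters/{cluster}/configurations?{query}"
        )
    return urls


# Example with made-up values:
# current_config_urls("http://ambari.example.com:8080", "c1",
#                     {"hdfs-site": {"tag": "version1"}})
```

Each URL would then need an authenticated GET, and the returned properties would have to be written back out as XML, which is where the "giant chunk of work" comes from.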

What's your requirement to export the configs driven by?

Always happy to understand more :o)

New Contributor

I stumbled on this topic late, but we faced a similar problem, and here is how we solved it.

As Ravi mentioned, you can download the host's settings as XML files from the Ambari UI. After that, open MS Excel and, on the Data tab, choose the option to import from other sources, select the XML source, and browse to the XML file; all the properties will then be loaded into the Excel sheet.

We did this to see how different services are configured on the 3 clusters we use. Simply filtering to the required service and properties in Excel did the job quickly for us 🙂
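For anyone who prefers a script to Excel, the same comparison can be sketched with the Python standard library. This assumes the downloaded files use the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children. The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def load_site_xml(path):
    """Read a Hadoop-style *-site.xml file into a {name: value} dict."""
    props = {}
    for prop in ET.parse(path).getroot().iter("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing or empty <value> is treated as an empty string.
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def diff_properties(left, right):
    """Return {name: (left_value, right_value)} for properties that differ.

    A value of None means the property is absent on that side.
    """
    diff = {}
    for name in sorted(set(left) | set(right)):
        if left.get(name) != right.get(name):
            diff[name] = (left.get(name), right.get(name))
    return diff
```

Loading the same file from two clusters with `load_site_xml` and passing both dicts to `diff_properties` lists only the settings that differ, which is roughly what the Excel filter gave us.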

Contributor

The options presented here are pretty discouraging. My use case would be to export the client configuration via a Jenkins job, to allow continuous integration from dynamically scaling nodes that aren't Ambari-managed. Dynamically scaling edge nodes should be a pretty standard use case, but Ambari is anything but dynamic. On the other hand, it's convenient enough for managing the master/worker nodes. Ideally, I'd integrate Ambari with Git and perform a simple checkout of the current client configuration via Jenkins. Accessing the database directly is a big hack, and using Ambari's REST API is an insane overhead for something that is conceptually so simple. I also can't believe that Ambari is cooking its own version management when there are dozens of highly evolved VCSs that could be reused. It's not as though the performance of the database-backed Ambari VCS were a compelling reason to stick with it.

I wish Knox was a better piece of software, since technically it's mostly designed to abstract client config away, but documentation and scalability, as well as sheer functionality are still in their infancy.

And yes, I know that a "simple" shell script could do this, but once you want it to work reliably, simplicity goes right out the window. VCS integration is probably worth filing a JIRA feature request for, and it could help obsolete, replace, or extend blueprints, which are too cumbersome for most of the applications they should support.

Contributor

I was able to get the client config as a tarball with Python requests:

TARBALL_URL = AMBARI_URI + ":" + AMBARI_PORT + "/api/v1/clusters/" + CLUSTER_NAME + '/components?format=client_config_tar'
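For completeness, here is a fuller sketch of that download. It swaps `requests` for the standard library's `urllib` so that it has no third-party dependency, and it assumes Ambari's HTTP basic authentication. The function names and the way credentials are passed are illustrative only.

```python
import base64
import urllib.request


def client_config_tar_url(ambari_uri, ambari_port, cluster_name):
    """Build the client-config tarball URL from the same parts as above."""
    return (
        f"{ambari_uri}:{ambari_port}/api/v1/clusters/{cluster_name}"
        "/components?format=client_config_tar"
    )


def download_client_configs(url, user, password, dest_path):
    """Fetch the tarball with HTTP basic auth and save it to dest_path.

    Hypothetical helper; error handling and TLS settings are left out.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}
    )
    with urllib.request.urlopen(request) as response:
        with open(dest_path, "wb") as out:
            out.write(response.read())
    return dest_path
```

Something like `download_client_configs(client_config_tar_url(AMBARI_URI, AMBARI_PORT, CLUSTER_NAME), user, password, "client_configs.tar.gz")` would then be the whole job, with the user and password coming from wherever your CI keeps secrets.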

However, as others have stated, this is a limited set. I also need the Ranger configs, such as ranger-hive-security.xml. I have been looking at the Ranger API and at pages that describe developing Ranger plugins, since obviously anything in Hive etc. that needs to talk to Ranger has to be aware of this config. It is available under the Hive conf.server folder on a given HiveServer2 host:

$ sudo ls /usr/hdp/current/hive-client/conf/conf.server/
hadoop-metrics2-hiveserver2.properties  hive-env.sh.template        hiveserver2-site.xml  ranger-hive-audit.xml     ranger-security.xml
hive-default.xml.template               hive-exec-log4j.properties  hive-site.xml         ranger-hive-security.xml  zkmigrator_jaas.conf
hive-env.sh                             hive-log4j.properties       mapred-site.xml       ranger-policymgr-ssl.xml

I essentially need this set from conf.server (I am working on a Hive sidecar instance). I do not want to pull these from a server via rsync or cp, as it needs to be portable for my purposes.

Contributor

Another useful endpoint for those looking for properties. From the JSON response you can then write out the known files, converting each JSON dictionary to XML if needed.

Making API connection to: https://host.port/api/v1/clusters/cluster_name/configurations/service_config_versions?is_current=tru...
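The JSON-dictionary-to-XML step can be sketched as below. The response shape is an assumption on my part (an `items` list whose entries carry a `configurations` list with `type` and `properties`), so check it against what your Ambari version actually returns. Both function names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def current_properties_by_type(response):
    """Collect {config_type: properties} from a parsed JSON response.

    Assumes response["items"][i]["configurations"][j] has "type" and
    "properties" keys; this shape is an assumption, not confirmed here.
    """
    by_type = {}
    for item in response.get("items", []):
        for config in item.get("configurations", []):
            by_type[config["type"]] = config.get("properties", {})
    return by_type


def properties_to_site_xml(properties):
    """Render a {name: value} dict as a Hadoop-style <configuration> string."""
    root = ET.Element("configuration")
    for name in sorted(properties):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(properties[name])
    return ET.tostring(root, encoding="unicode")
```

Writing `properties_to_site_xml(props)` to a file named after each config type (for example `ranger-hive-security.xml`) would rebuild the individual files without copying anything from a server.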

Explorer

Similar to this, I have a use case to compare our Ansible code with the Ambari configs. We are doing this because we found several inconsistencies between the two. But comparing them is a big task: Hadoop settings are spread across many playbooks, so checking the whole code base is a lot of work. Is there any other option for doing the comparison?