Ambari Audit log


Where can I find an audit trail of every change made via Ambari? I would like something similar to the configuration diff available in the UI, with the addition of the username.

E.g. Olivier changed umask to 077 in hdfs-site on Monday 5th of December 2014 at 2:20:21.123

I've found /var/log/ambari-server/ambari-config-changes.log, but it doesn't show the specific change that happened. I understand that I have the version number and can diff it with the previous version, but I was wondering whether it is recorded anywhere else.

1 ACCEPTED SOLUTION


@Olivier Renault I don't think there is a separate audit tool or record of the changes available, but a short Python script should solve this problem.

I just created a short example (a quick-and-dirty solution that needs some tweaking! :P) - take a look at https://github.com/mr-jstraub/ambari-audit-config

The repo contains an audit.py script that you can use as follows:

Example (audit hive-site to shell):

python audit.py --target horton01.myhost.com:8080 --cluster bigdata --user admin --config hive-site

Example (audit hive-site to hive-site_audit.log):

python audit.py --target horton01.myhost.com:8080 --cluster bigdata --user admin --config hive-site --output hive-site_audit.log

Result:

hive-site: version 1 - ADDED - javax.jdo.option.ConnectionDriverName - com.mysql.jdbc.Driver
hive-site: version 1 - ADDED - hive.fetch.task.aggr - false
hive-site: version 1 - ADDED - hive.execution.engine - tez
hive-site: version 1 - ADDED - hive.tez.java.opts - -server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps
hive-site: version 1 - ADDED - hive.vectorized.groupby.maxentries - 100000
hive-site: version 1 - ADDED - hive.server2.table.type.mapping - CLASSIC
...
...
...
hive-site: version 1 - ADDED - hive.compactor.check.interval - 300L
hive-site: version 1 - ADDED - hive.compactor.delta.pct.threshold - 0.1f
hive-site: version 2 - CHANGED - javax.jdo.option.ConnectionURL - jdbc:mysql://horton03.myhost.com/hive?createDatabaseIfNotExist=true => jdbc:mysql://horton03.myhost.com:3306/hive?createDatabaseIfNotExist=true
hive-site: version 2 - CHANGED - hive.zookeeper.quorum - horton03.myhost.com:2181,horton02.myhost.com:2181,horton01.myhost.com:2181 => horton02.myhost.com:2181,horton03.myhost.com:2181,horton01.myhost.com:2181
hive-site: version 2 - CHANGED - hive.cluster.delegation.token.store.zookeeper.connectString - horton03.myhost.com:2181,horton02.myhost.com:2181,horton01.myhost.com:2181 => horton02.myhost.com:2181,horton03.myhost.com:2181,horton01.myhost.com:2181
hive-site: version 3 - CHANGED - javax.jdo.option.ConnectionURL - jdbc:mysql://horton03.myhost.com:3306/hive?createDatabaseIfNotExist=true => jdbc:mysql://horton03.myhost.com/hive?createDatabaseIfNotExist=true
hive-site: version 4 - ADDED - atlas.cluster.name - default
hive-site: version 4 - CHANGED - hive.exec.post.hooks - org.apache.hadoop.hive.ql.hooks.ATSHook => org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.atlas.hive.hook.HiveHook
hive-site: version 4 - CHANGED - hive.metastore.sasl.enabled - false => true
hive-site: version 4 - CHANGED - hive.server2.authentication.spnego.principal - /etc/security/keytabs/spnego.service.keytab => HTTP/_HOST@EXAMPLE.COM
hive-site: version 4 - CHANGED - hive.server2.authentication.spnego.keytab - HTTP/_HOST@EXAMPLE.COM => /etc/security/keytabs/spnego.service.keytab
hive-site: version 4 - ADDED - hive.server2.authentication.kerberos.keytab - /etc/security/keytabs/hive.service.keytab
hive-site: version 4 - CHANGED - hive.zookeeper.quorum - horton02.myhost.com:2181,horton03.myhost.com:2181,horton01.myhost.com:2181 => horton03.myhost.com:2181,horton02.myhost.com:2181,horton01.myhost.com:2181
hive-site: version 4 - ADDED - hive.server2.authentication.kerberos.principal - hive/_HOST@EXAMPLE.COM
hive-site: version 4 - ADDED - atlas.rest.address - http://horton03.myhost.com:21000
hive-site: version 4 - CHANGED - hive.cluster.delegation.token.store.zookeeper.connectString - horton02.myhost.com:2181,horton03.myhost.com:2181,horton01.myhost.com:2181 => horton03.myhost.com:2181,horton02.myhost.com:2181,horton01.myhost.com:2181
hive-site: version 4 - CHANGED - hive.server2.authentication - NONE => KERBEROS
hive-site: version 5 - CHANGED - atlas.cluster.name - default => bigdata
hive-site: version 6 - ADDED - my.prop.test - blub
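The ADDED/CHANGED lines above can be produced by diffing consecutive versions of a config's property dict. Here is a minimal sketch of that logic (my own illustration, not the actual implementation in the repo):

```python
def diff_versions(old, new, version, config_type="hive-site"):
    """Emit audit lines (ADDED/CHANGED) between two property dicts."""
    lines = []
    for key, value in sorted(new.items()):
        if key not in old:
            lines.append("%s: version %s - ADDED - %s - %s"
                         % (config_type, version, key, value))
        elif old[key] != value:
            lines.append("%s: version %s - CHANGED - %s - %s => %s"
                         % (config_type, version, key, old[key], value))
    return lines

# Tiny sample modeled on the output above
v3 = {"hive.server2.authentication": "NONE"}
v4 = {"hive.server2.authentication": "KERBEROS",
      "atlas.cluster.name": "default"}
print("\n".join(diff_versions(v3, v4, 4)))
```

Note that this only reports additions and changes; detecting DELETED properties would need a second pass over the old dict's keys.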

I still need to add the username, but I haven't found it for every config version. Does anyone know whether I can retrieve the username of the person who changed the configuration?

Hope that helps 🙂

Update: found the usernames, but I need to map the config type (hive-site, hive-env, ...) to the service name (HIVE), which is a bit tricky.

http://horton01.myhost.com.com:8080/api/v1/clusters/bigdata/configurations/service_config_versions?s...
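For anyone who wants to pull the usernames themselves, here is a rough sketch against that endpoint. The JSON field names ("items", "service_name", "service_config_version", "user") are assumptions based on what the Ambari 2.x REST API returned on my cluster - verify them against yours. The live call is shown in comments (hypothetical host/cluster/credentials); the runnable part works on an inline sample response:

```python
# Sketch: summarize who changed each Ambari service config version.
# Assumed field names from the service_config_versions endpoint - verify.

def summarize_versions(payload):
    """One summary line per service config version in an Ambari response."""
    lines = []
    for item in payload.get("items", []):
        lines.append("%s v%s changed by %s" % (
            item.get("service_name"),
            item.get("service_config_version"),
            item.get("user")))
    return lines

# Against a live cluster (hypothetical host/cluster/credentials):
#   import requests
#   r = requests.get("http://horton01.myhost.com:8080/api/v1/clusters/"
#                    "bigdata/configurations/service_config_versions",
#                    auth=("admin", "admin"))
#   for line in summarize_versions(r.json()):
#       print(line)

# Inline sample shaped like the endpoint's response
sample = {"items": [
    {"service_name": "HIVE", "service_config_version": 4, "user": "admin"},
    {"service_name": "HIVE", "service_config_version": 5, "user": "olivier"},
]}
print("\n".join(summarize_versions(sample)))
```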



Expert Contributor

Ah - thanks. I am far from an expert on Python and did not realize this was a non-Hadoop package. A quick 'yum install python-requests' did the trick.

FYI: I'm not sure if you're implying that 'audit.py' should be present on the system, but it certainly isn't here. I grabbed it from GitHub.

This is a very valuable utility, and it would be even more so if it could crawl all configuration files and generate a master listing. My immediate problem is how to effectively clone a running configuration to a new cluster, and I'm surprised to find so little discussion of this task. Perhaps I'm missing something obvious, but exporting and copying configurations does not correct the myriad machine names and IP addresses that will certainly be inappropriate in a new setting. Ambari badly needs the ability to generate bulk delta files (differences from the defaults at install time) that can be imported "smartly" into a new cluster - e.g. spotting anything that looks like a URL or port address and prompting for manual intervention.

Expert Contributor

Thanks, Jonas. As I mentioned in my last post, I'm in search of a way to generate a set of deltas that can propagate my tuning and tweaks to a new cluster. I try to keep notes, but there have been many occasions where I was in the midst of troubleshooting and failed to write down what I did. Exporting configuration in bulk isn't really what's needed to propagate diffs to a new cluster, since it will contain machine names (and perhaps port addresses) that won't apply to the new target. Your script looks promising as a starting point, but I'd have to flesh it out with a framework that traverses all configuration files at a minimum. From a quick look, it also appears that change tracking on shell environment files will require a lot of manual work, since there's no attempt at differencing - the entire file is dumped at each revision.
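The whole-file dumps for the -env configs could be reduced to proper unified diffs with Python's standard-library difflib module. A minimal sketch, using a made-up pair of hive-env snippets as input:

```python
# Sketch: render two versions of an -env template as a unified diff
# instead of dumping the entire file at each revision.
import difflib

def env_diff(old_text, new_text, name="hive-env"):
    """Return unified-diff lines between two versions of an env template."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile=name + " (previous)", tofile=name + " (current)",
        lineterm=""))

# Hypothetical template contents for illustration
v1 = "export HADOOP_HEAPSIZE=1024\nexport HIVE_AUX_JARS_PATH=/usr/lib\n"
v2 = "export HADOOP_HEAPSIZE=2048\nexport HIVE_AUX_JARS_PATH=/usr/lib\n"
print("\n".join(env_diff(v1, v2)))
```

Only the changed export line shows up in the output, which is much easier to audit than two full file dumps.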


Yeah, the script is really a starting point for doing Ambari audits. It sounds like you need something more like export/import functionality; I have worked on something similar in the past. Or are you looking for a way to export the config deltas from two clusters and compare them?

How would the export of configuration deltas work? Export all adjusted configurations but automatically ignore any configuration containing a hostname, IP, or cluster name? Or export all delta configurations, select the configuration values you want for the new cluster, and import only the selected values?
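For the "ignore or flag site-specific values" idea, a rough sketch of the filtering step. The patterns below are hypothetical and would need tuning to your naming conventions; the property-name heuristics (".port", ".address", "connectString") are guesses from common Hadoop property names:

```python
# Sketch: flag config values that look site-specific (hostname, IP,
# port-carrying property) so an import can prompt for manual review.
import re

# Hypothetical patterns - tune for your environment's naming conventions.
IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
HOST_RE = re.compile(r"\b[\w-]+\.(?:myhost|example)\.com\b")
PORT_PROP_RE = re.compile(r"\.port$|\.address$|connectString", re.IGNORECASE)

def needs_review(prop, value):
    """True if a property/value pair probably embeds site-specific detail."""
    return bool(IP_RE.search(value) or HOST_RE.search(value)
                or PORT_PROP_RE.search(prop))

print(needs_review("hive.zookeeper.quorum", "horton03.myhost.com:2181"))
print(needs_review("hive.execution.engine", "tez"))
```

Anything flagged would be held back for manual confirmation during import; everything else could be applied automatically.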

Expert Contributor

The behavior I'm looking for is something like this:

Export all deltas for all configuration files beyond the settings that are built-in defaults for a Hortonworks installation. I believe this would be defined as version numbers > 1 (is that correct?)

Import these into a newly built cluster and be prompted for manual intervention when a delta includes a machine name, IP address, or port (the latter can probably be determined by a regex match on the property name).

As an audit tool, the presentation of environment (shell) scripts should be in the form of unified diffs rather than a dump of the entire file at each revision.

Just a few ideas off the top of my head. There's no way this process can be totally automated, but I think it's possible to get very close.