
Why do I have two hive-site.xml config files on my HDP host?

Solved

Expert Contributor

I just saw that I have two hive-site.xml files on my Host running HDP 2.5.

  • /etc/hive/conf/hive-site.xml
  • /usr/hdp/current/spark-client/conf/hive-site.xml

Why is one file in the Spark directory and the other outside the /usr/hdp/ directory?

Which one is managed by Ambari? And can I overwrite the one in spark-client dir with the other one as described here: http://henning.kropponline.de/2016/11/06/connecting-livy-to-a-secured-kerberized-hdp-cluster/ (section "Hive Context")?

9 REPLIES

Re: Why do I have two hive-site.xml config files on my HDP host?

Contributor

Spark uses the copy in its own conf directory to connect to Hive when initializing a HiveContext. You can overwrite Spark's hive-site.xml with Hive's hive-site.xml, and it is recommended you do so in order to be able to connect to Hive from Spark.

This is what I did in order to run Livy.Spark within Zeppelin, and I was able to connect to Hive via this method.
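
For reference, a minimal sketch of that copy step on an HDP host, using the two paths from the question (keeping a backup first is my own addition, not something required by this thread):

  # Back up Spark's current hive-site.xml, then overwrite it with Hive's copy
  sudo cp /usr/hdp/current/spark-client/conf/hive-site.xml /usr/hdp/current/spark-client/conf/hive-site.xml.bak
  sudo cp /etc/hive/conf/hive-site.xml /usr/hdp/current/spark-client/conf/hive-site.xml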

Re: Why do I have two hive-site.xml config files on my HDP host?

Contributor

They are both technically managed by Ambari, so if you restart Spark, you will need to copy Hive's hive-site.xml back over to overwrite Spark's hive-site.xml, as the two are sometimes not the same.

Re: Why do I have two hive-site.xml config files on my HDP host?

Expert Contributor

Will Ambari notice the new settings in the hive-site.xml file when I copy it over with the cp command?

Re: Why do I have two hive-site.xml config files on my HDP host?

Expert Contributor

Thank you for the clarification, my Spark Livy application now works in my Kerberized cluster! This hint saved my day! As you said, I changed the listed parameters in

  • /usr/hdp/current/spark-client/conf/hive-site.xml

to match the values in

  • /etc/hive/conf/hive-site.xml

Re: Why do I have two hive-site.xml config files on my HDP host?

Expert Contributor

I think the Kerberos activation changed these default values in /etc/hive/conf/hive-site.xml, but not in /usr/hdp/current/spark-client/conf/hive-site.xml. So my application tried to connect to, e.g., the wrong port and was not able to run as an impersonated user.

Re: Why do I have two hive-site.xml config files on my HDP host?

Contributor

Wonderful! If you could close the other question you had posted and accept my answer, it would be greatly appreciated! It is a bit funky how it works, but as long as you make sure you have the correct hive-site.xml in the Spark conf, you should be okay, as all of your other configs looked correct.

For some reason the hive-site.xml in Spark doesn't use the same template as Hive's. Ambari will overwrite the hive-site.xml in the Spark directory whenever Spark is restarted, hence the need to copy it over again. I have a cron job set up to cp the hive-site.xml over every 5 minutes so I don't have to worry about that; it is something you might think about doing as well.
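
As a rough sketch of such a cron job (the file name is hypothetical, the 5-minute interval matches what is described above, and running it as root is an assumption):

  # /etc/cron.d/spark-hive-site-sync (hypothetical file name)
  # Every 5 minutes, copy Hive's hive-site.xml over Spark's copy
  */5 * * * * root cp /etc/hive/conf/hive-site.xml /usr/hdp/current/spark-client/conf/hive-site.xml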

Re: Why do I have two hive-site.xml config files on my HDP host?

Expert Contributor

Exactly my problem: when I restart Spark, the /usr/hdp/current/spark-client/conf/hive-site.xml is overwritten with some default entries. So there's no way to change these properties via the Spark configs in Ambari to make them persistent and survive service restarts, right? The cronjob idea is a good workaround! Thank you!

Re: Why do I have two hive-site.xml config files on my HDP host?

Expert Contributor

I found another solution for me. I had to change a property from true to false. As @Doroszlai, Attila explained in my other question https://community.hortonworks.com/answers/98279/view.html, I was able to find the property (which was reset on each Spark service restart) in the Ambari Hive configs. When I changed it there and restarted Hive and afterwards also the Spark component, my /usr/hdp/current/spark-client/conf/hive-site.xml contained the correct value! So properties which are part of Spark's hive-site.xml but not listed in the Ambari Spark configs need to be changed via the Ambari Hive configs!

Re: Why do I have two hive-site.xml config files on my HDP host?

Expert Contributor

/etc/hive/conf/hive-site.xml is the config for the Hive service itself and is managed via Ambari through the Hive service config page.

/usr/hdp/current/spark-client/conf/hive-site.xml actually points to /etc/spark/conf/hive-site.xml. This is the minimal Hive config that Spark needs to access Hive, and it is managed via Ambari through the Spark service config page. Ambari correctly configures this hive-site.xml for Kerberos. Depending on your version of HDP, you may not have the correct support in Ambari for configuring Livy.
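
If you want to confirm where the Spark client's conf actually points on your host, a quick check like the following should work (standard shell commands; the exact symlink layout can vary by HDP version):

  # Show what the Spark client's conf directory and hive-site.xml resolve to
  ls -l /usr/hdp/current/spark-client/conf
  readlink -f /usr/hdp/current/spark-client/conf/hive-site.xml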
