
Why do I have two hive-site.xml config files on my HDP host?

Expert Contributor

I just noticed that I have two hive-site.xml files on my host running HDP 2.5.

  • /etc/hive/conf/hive-site.xml
  • /usr/hdp/current/spark-client/conf/hive-site.xml

Why is one file in the Spark directory and the other outside the /usr/hdp/ directory?

Which one is managed by Ambari? And can I overwrite the one in the spark-client directory with the other one, as described here: http://henning.kropponline.de/2016/11/06/connecting-livy-to-a-secured-kerberized-hdp-cluster/ (section "Hive Context")?

1 ACCEPTED SOLUTION

Rising Star

Spark uses the copy in its own directory to connect to Hive when initializing a Hive context. You can overwrite Spark's hive-site.xml with Hive's hive-site.xml, and it is recommended you do that in order to be able to connect to Hive from Spark.

This is what I did in order to run Livy.Spark within Zeppelin, and I was able to connect to Hive via this method.
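
If it helps, here is a minimal sketch of that copy, assuming the default HDP paths from the question (backing up Spark's generated file first, in case you need to revert):

  # back up Spark's generated hive-site.xml, then replace it with Hive's copy
  cp /usr/hdp/current/spark-client/conf/hive-site.xml /usr/hdp/current/spark-client/conf/hive-site.xml.bak
  cp /etc/hive/conf/hive-site.xml /usr/hdp/current/spark-client/conf/hive-site.xml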




Rising Star

They are both technically managed by Ambari, so if you restart Spark, you will need to copy Hive's hive-site.xml back over to overwrite Spark's hive-site.xml, as sometimes they are not the same.

Expert Contributor

Will Ambari notice the new settings in the hive-site.xml file when I copy it over with the cp command?

Expert Contributor

Thank you for the clarification! My Spark Livy application now works in my Kerberized cluster; this hint saved my day. As you said, I changed the listed parameters in

  • /usr/hdp/current/spark-client/conf/hive-site.xml

to match the values in

  • /etc/hive/conf/hive-site.xml
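
For anyone else doing this: one quick way to see which values differ between the two files (same paths as above) is a plain diff, e.g.:

  # list the entries that differ between Hive's and Spark's hive-site.xml
  diff /etc/hive/conf/hive-site.xml /usr/hdp/current/spark-client/conf/hive-site.xml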

Expert Contributor

I think the Kerberos activation changed these default values in /etc/hive/conf/hive-site.xml, but not in /usr/hdp/current/spark-client/conf/hive-site.xml. So my application tried to connect to the wrong port, for example, and wasn't able to run as an impersonated user.
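
As an illustration (the property names here are my guesses at typical suspects for the port and impersonation symptoms; they are not confirmed anywhere in this thread), you can compare a single entry across both files like this:

  # hive.server2.thrift.port is a guess matching the "wrong port" symptom;
  # hive.server2.enable.doAs would be the analogous check for impersonation
  grep -A1 'hive.server2.thrift.port' /etc/hive/conf/hive-site.xml
  grep -A1 'hive.server2.thrift.port' /usr/hdp/current/spark-client/conf/hive-site.xml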

Rising Star

Wonderful! If you could close the other question you posted and accept my answer, it would be greatly appreciated! It is a bit funky how it works, but as long as you make sure you have the correct hive-site.xml in the Spark conf, you should be okay, as all of your other configs looked correct.

For some reason the hive-site.xml in Spark doesn't have the same template as Hive's. Ambari will notice the hive-site.xml and overwrite it in the Spark directory whenever Spark is restarted, hence the need to copy it over again. I have a cron job set up to cp the hive-site.xml over every 5 minutes so I don't have to worry about that; it's something you might think about doing.
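
For reference, a sketch of such a crontab entry, assuming root's crontab and the default paths used throughout this thread (the 5-minute interval matches what's described above):

  # re-copy Hive's hive-site.xml into Spark's conf directory every 5 minutes
  */5 * * * * cp /etc/hive/conf/hive-site.xml /usr/hdp/current/spark-client/conf/hive-site.xml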

Expert Contributor

Exactly my problem: when I restart Spark, /usr/hdp/current/spark-client/conf/hive-site.xml is overwritten with some default entries. So there's no way to change these properties via the Spark configs in Ambari to make them persistent and survive service restarts, right? The cron job idea is a good workaround! Thank you!

Expert Contributor

I found another solution that worked for me. I had to change a property from true to false. As @Doroszlai, Attila explained in my other question https://community.hortonworks.com/answers/98279/view.html, I was able to find the property (which was reset on each Spark service restart) in the Ambari Hive configs. When I changed it there and restarted Hive and afterwards also the Spark component, my /usr/hdp/current/spark-client/conf/hive-site.xml contained the correct value! So the properties which are part of Spark's hive-site.xml but not listed in the Ambari Spark configs need to be changed via the Ambari Hive configs.

Super Collaborator

/etc/hive/conf/hive-site.xml is the config for Hive service itself and is managed via Ambari through the Hive service config page.

/usr/hdp/current/spark-client/conf/hive-site.xml actually points to /etc/spark/conf/hive-site.xml. This is the minimal Hive config that Spark needs to access Hive, and it is managed via Ambari through the Spark service config page. Ambari correctly configures this hive-site.xml for Kerberos. Depending on your version of HDP, you may not have the correct support in Ambari for configuring Livy.
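
If you want to verify that symlink chain on your own host (assuming GNU coreutils for readlink -f):

  # show where Spark's conf directory and its hive-site.xml actually resolve
  ls -l /usr/hdp/current/spark-client/conf
  readlink -f /usr/hdp/current/spark-client/conf/hive-site.xml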