Why do I have two hive-site.xml config files on my HDP host?
Created ‎04-24-2017 01:39 PM
I just noticed that I have two hive-site.xml files on my host running HDP 2.5:
- /etc/hive/conf/hive-site.xml
- /usr/hdp/current/spark-client/conf/hive-site.xml
Why is one file in the Spark directory and the other outside the /usr/hdp/ directory?
Which one is managed by Ambari? And can I overwrite the one in the spark-client dir with the other one, as described here: http://henning.kropponline.de/2016/11/06/connecting-livy-to-a-secured-kerberized-hdp-cluster/ (section "Hive Context")?
Created ‎04-24-2017 01:48 PM
Spark uses the hive-site.xml in its own directory to connect to Hive when initializing a Hive context. You can overwrite Spark's hive-site.xml with Hive's hive-site.xml, and doing so is recommended so that you can connect to Hive from Spark.
This is what I did to be able to run Livy.Spark within Zeppelin, and I was able to connect to Hive this way.
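The copy described above can be sketched as a small shell helper. The two paths are the ones from the thread; the helper name and the backup step are my additions, not anything Ambari provides:

```shell
#!/bin/sh
# Sketch: sync Hive's hive-site.xml into Spark's conf dir.
# Keeps a .bak copy of Spark's previous file before overwriting it.
sync_hive_site() {
  src="$1"
  dst="$2"
  [ -f "$dst" ] && cp -p "$dst" "$dst.bak"  # back up the existing Spark copy
  cp -p "$src" "$dst"                        # overwrite with Hive's version
}

# On an HDP 2.5 host this would be invoked as:
# sync_hive_site /etc/hive/conf/hive-site.xml /usr/hdp/current/spark-client/conf/hive-site.xml
```

`cp -p` preserves ownership and permissions, which matters for files the Spark user has to read.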
Created ‎04-24-2017 01:49 PM
They are both technically managed by Ambari, so if you restart Spark you will need to copy the hive-site.xml back over Spark's hive-site.xml, because the two are sometimes not the same.
Created ‎04-24-2017 02:34 PM
Will Ambari notice the new settings in the hive-site.xml file when I copy it over with the cp command?
Created ‎04-24-2017 01:57 PM
Thank you for the clarification, my Spark Livy application now works in my Kerberized cluster! This hint saved my day! As you suggested, I changed the listed parameters in
- /usr/hdp/current/spark-client/conf/hive-site.xml
to values which equal the values in
- /etc/hive/conf/hive-site.xml
Created ‎04-24-2017 02:00 PM
I think the Kerberos activation changed these default values in /etc/hive/conf/hive-site.xml but not in /usr/hdp/current/spark-client/conf/hive-site.xml. So my application tried to connect to, e.g., the wrong port and couldn't run as an impersonated user.
Created ‎04-24-2017 07:16 PM
Wonderful! If you could close the other question you posted and accept my answer, it would be greatly appreciated! It is a bit funky how it works, but as long as you make sure you have the correct hive-site.xml in the Spark conf directory you should be okay, since all of your other configs looked correct.
For some reason the hive-site.xml in Spark doesn't use the same template as Hive's. Ambari will notice the hive-site.xml in the Spark directory and overwrite it whenever Spark is restarted, hence the need to copy it over again. I have a cronjob set up to cp the hive-site.xml over every 5 minutes so I don't have to worry about it; that's something you might consider doing as well.
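The cronjob mentioned above might look like the following crontab entry (the 5-minute schedule is from the post; the exact cron line is my sketch, not the author's verbatim entry):

```shell
# Hypothetical crontab entry (edit with `crontab -e` as a user with write
# access to the Spark conf dir): re-copy Hive's hive-site.xml every 5 minutes
# so a Spark restart can't leave the stale version in place for long.
*/5 * * * * cp /etc/hive/conf/hive-site.xml /usr/hdp/current/spark-client/conf/hive-site.xml
```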
Created ‎04-25-2017 06:58 AM
Exactly my problem: when I restart Spark, /usr/hdp/current/spark-client/conf/hive-site.xml is overwritten with some default entries. So there's no way to change these properties via the Spark configs in Ambari to make them persistent across service restarts, right? The cronjob idea is a good workaround! Thank you!
Created ‎04-25-2017 07:11 AM
I found another solution that works for me. I had to change a property from true to false. As @Doroszlai, Attila explained in my other question https://community.hortonworks.com/answers/98279/view.html, I was able to find the property (which was reset on each Spark service restart) in the Ambari Hive configs. When I changed it there and restarted Hive, and afterwards also the Spark component, my /usr/hdp/current/spark-client/conf/hive-site.xml contained the correct value! So properties that are part of Spark's hive-site.xml but not listed in the Ambari Spark configs need to be changed via the Ambari Hive configs.
Created ‎04-24-2017 08:47 PM
/etc/hive/conf/hive-site.xml is the config for the Hive service itself and is managed by Ambari through the Hive service config page.
/usr/hdp/current/spark-client/conf/hive-site.xml actually points to /etc/spark/conf/hive-site.xml. This is the minimal Hive config that Spark needs to access Hive, and it is managed by Ambari through the Spark service config page. Ambari correctly configures this hive-site.xml for Kerberos. Depending on your version of HDP, you may not have the correct support in Ambari for configuring Livy.
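One way to confirm this indirection on your own host is `readlink -f`, which resolves a chain of symlinks to the final file. On an HDP host you would point it at the path above; here is a generic, self-contained sketch using a temporary symlink instead:

```shell
#!/bin/sh
# On an HDP host the real check would be:
#   readlink -f /usr/hdp/current/spark-client/conf/hive-site.xml
# Generic demonstration with a throwaway symlink:
tmp=$(mktemp -d)
echo "<configuration/>" > "$tmp/real-hive-site.xml"   # stand-in for /etc/spark/conf/hive-site.xml
ln -s "$tmp/real-hive-site.xml" "$tmp/hive-site.xml"  # stand-in for the /usr/hdp/current path
readlink -f "$tmp/hive-site.xml"                      # prints the path of real-hive-site.xml
```

This also explains why editing the file through either path changes the same underlying config.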
