Created 04-26-2016 12:08 AM
Hi, I am trying to execute a query in Hive through the WebHCat service of Knox using the following (Kerberos enabled):
curl -i -u testuser:abcdef \
  -d execute="select+*+from+test_table;" \
  -d statusdir="pokes.output" \
  'https://knox.testserver.com:8443/gateway/sample/templeton/v1/hive?user.name=testuser'
But this does not work; it fails with the following message:
HTTP/1.1 500 Server Error
Set-Cookie: JSESSIONID=18unyi37n3omieug23ruoetn0;Path=/gateway/sample;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Server: Jetty(7.6.0.v20120127)
Content-Type: application/json
Content-Length: 93
{"error":"File /user/hadoop/2.3.4.0-3485/hive/lib/hive-common.jar/zookeeper.jar does not exist."}
In the first place, instead of hive-common.jar I have hive-common-1.2.1.2.3.4.0-3485.jar, and it does not contain zookeeper.jar. Any idea how to solve this?
Created 04-26-2016 12:31 AM
This is probably not in any way related to your use of Knox. You could try the same request directly against WebHCat to test it out, if you like.
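For example, a direct request to WebHCat (bypassing Knox) might look like the following. The hostname is a placeholder and 50111 is the default WebHCat port; substitute the actual WebHCat host and port for your cluster:

```shell
# webhcat.testserver.com is a hypothetical host; 50111 is WebHCat's default port.
curl -i -u testuser:abcdef \
  -d execute="select+*+from+test_table;" \
  -d statusdir="pokes.output" \
  'http://webhcat.testserver.com:50111/templeton/v1/hive?user.name=testuser'
```

If this fails with the same error, the problem is in the WebHCat configuration rather than in Knox.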
I believe you should be able to manage the templeton.libjars setting in Ambari under Advanced webhcat-site.xml; that is where it is for me. As an example, here are my current settings:
/usr/hdp/${hdp.version}/zookeeper/zookeeper.jar,/usr/hdp/${hdp.version}/hive/lib/hive-common.jar
Adjust the filenames to match your environment and see if that helps.
If you are not using Ambari, you will need to find webhcat-site.xml, edit it manually, and restart WebHCat.
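For reference, the corrected property in webhcat-site.xml would look something like this (the jar paths are examples based on a standard HDP layout; verify them against your installation):

```xml
<property>
  <name>templeton.libjars</name>
  <value>/usr/hdp/${hdp.version}/zookeeper/zookeeper.jar,/usr/hdp/${hdp.version}/hive/lib/hive-common.jar</value>
</property>
```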
Additionally, you may want to consider using HiveServer2 for SQL access; I believe WebHCat is generally used for metadata-related queries.
Created 04-26-2016 05:47 AM
Hi, thanks for your suggestion. I have just started using this, so can you please help me understand a few more things?
I found the WebHCat server using Ambari, and the templeton.libjars value in webhcat-site.xml is as follows:
<name>templeton.libjars</name>
<value>/usr/hdp/${hdp.version}/zookeeper,/usr/hdp/${hdp.version}/hive/lib/hive-common.jar/zookeeper.jar</value>
I think this has wrong values or a typo in it, as you suggested. I don't have access to edit this file.
1. Is there any other way to use WebHCat without editing the webhcat-site.xml file, such as passing a POST parameter via curl?
2. My cluster has an edge node. Why is webhcat-site.xml present only on the WebHCat server and not on the edge node? The edge node only has webhcat-default.xml. Shouldn't all the *-site.xml files be present on the edge node as well?
3. How do I access HiveServer2 via Knox? Is it possible to use HiveServer2 to insert values into a Hive table from outside the cluster using Knox?
Created 04-26-2016 11:50 AM
Yes, that looks like an incorrect setting and it should be changed. I don't think there is any way to override it from the curl side. Even if there were, WebHCat would still be misconfigured for your deployment and for any other clients that need to access it.
Your questions:
1. No
2. I'm not sure about this, actually. However, one of the benefits of using the REST API is that you don't need client-side config files, so having webhcat-site.xml on the edge node would do you no good anyway.
3. HiveServer2 is accessed via ODBC/JDBC or the beeline client (which uses JDBC). You can certainly use HiveServer2 to insert values; its primary purpose is to provide a server for executing SQL against Hive tables. See http://hortonworks.com/hadoop-tutorial/secure-jdbc-odbc-clients-access-hiveserver2-using-apache-knox... and http://knox.apache.org/books/knox-0-9-0/user-guide.html#Hive for details of accessing it through Knox. You may also look at the samples in {GATEWAY_HOME}/samples/hive.
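As a sketch, a beeline connection through Knox might look like the following. The gateway host, port, and "sample" topology are taken from your curl example; the truststore path and password are placeholders, and the exact connection-string format for your Knox version is in the user guide linked above:

```shell
# Host/port/topology assumed from the earlier curl example; truststore
# path and password are placeholders for your Knox gateway certificate.
beeline -u "jdbc:hive2://knox.testserver.com:8443/;ssl=true;sslTrustStore=/path/to/gateway.jks;trustStorePassword=changeit;transportMode=http;httpPath=gateway/sample/hive" \
  -n testuser -p abcdef \
  -e "select * from test_table;"
```

Because this goes through HiveServer2 rather than WebHCat, it does not depend on the templeton.libjars setting at all.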
Created 05-27-2016 05:06 PM
I had the same error in my config; fixing it as proposed resolved the issue.