
Issue accessing Hive via WebHcat/Knox

Expert Contributor

Hi, I am trying to execute a query in Hive through the WebHCat service via Knox (Kerberos is enabled), using the command below:

curl -i -u testuser:abcdef -d execute="select+*+from+test_table;" \
-d statusdir="pokes.output" \
'https://knox.testserver.com:8443/gateway/sample/templeton/v1/hive?user.name=testuser'

But this does not work; it fails with the following message:

HTTP/1.1 500 Server Error
Set-Cookie: JSESSIONID=18unyi37n3omieug23ruoetn0;Path=/gateway/sample;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Server: Jetty(7.6.0.v20120127)
Content-Type: application/json
Content-Length: 93
{"error":"File /user/hadoop/2.3.4.0-3485/hive/lib/hive-common.jar/zookeeper.jar does not exist."}

To begin with, instead of hive-common.jar I have hive-common-1.2.1.2.3.4.0-3485.jar, and it does not contain zookeeper.jar. Any idea how to solve this?

1 ACCEPTED SOLUTION

Expert Contributor

This is probably not in any way related to your use of Knox. To rule it out, you could try the same request directly against WebHCat.
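For example, a quick way to take Knox out of the picture is to hit WebHCat's status endpoint directly. This is only a sketch: the hostname is an assumption, 50111 is WebHCat's default port, and on a Kerberized cluster you would typically add `--negotiate -u :` after a kinit.

```shell
# Sketch only: hostname is assumed; 50111 is WebHCat's default port.
# On a Kerberized cluster, kinit first and use: curl --negotiate -u :
curl -s -i --max-time 10 \
  'http://webhcat.testserver.com:50111/templeton/v1/status?user.name=testuser' \
  || echo "request failed (host unreachable in this sketch)"
```

If the same zookeeper.jar error comes back here, the problem is in WebHCat's configuration rather than in Knox.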

I believe you can manage the templeton.libjars setting in Ambari under Advanced webhcat-site.xml; that is where it is for me. As an example, here are my current settings:

/usr/hdp/${hdp.version}/zookeeper/zookeeper.jar,/usr/hdp/${hdp.version}/hive/lib/hive-common.jar

Adjust the filenames to match what you have in your environment and see if that helps.

If you are not using Ambari, you will need to find webhcat-site.xml, edit it manually, and restart WebHCat.
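For reference, the corrected property would look something like this in webhcat-site.xml (the paths are from my environment above; adjust them to match yours):

```xml
<property>
  <name>templeton.libjars</name>
  <value>/usr/hdp/${hdp.version}/zookeeper/zookeeper.jar,/usr/hdp/${hdp.version}/hive/lib/hive-common.jar</value>
</property>
```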

Additionally, you may want to consider using HiveServer2 for SQL access; I believe WebHCat is generally used for metadata-related queries.




Expert Contributor

Hi, thanks for your suggestion. I have just started using this, so can you please help me understand a few more things?

I found the WebHCat server using Ambari, and the templeton.libjars value in webhcat-site.xml is as follows:

<name>templeton.libjars</name>
      <value>/usr/hdp/${hdp.version}/zookeeper,/usr/hdp/${hdp.version}/hive/lib/hive-common.jar/zookeeper.jar</value>

I think this has wrong values or a typo in it, as you suggested. I don't have access to edit this file.

1. Is there any other way to use WebHCat without editing the webhcat-site.xml file, such as passing the setting as a POST parameter in curl?

2. My cluster has an edge node. Why is webhcat-site.xml present only on the WebHCat server and not on the edge node? The edge node only has webhcat-default.xml. Shouldn't all the *-site.xml files be present on the edge node as well?

3. How do I access HiveServer2 via Knox? Is it possible to use HiveServer2 to insert values into a Hive table from outside the cluster using Knox?

Expert Contributor

Yes, that looks like an incorrect setting and it should be changed. I don't think there is any way to override it from the curl side. Even if there were, WebHCat would still be misconfigured for your deployment and for any other clients that need to access it.

Your questions:

1. No

2. I'm not sure about this, actually. However, one of the beauties of using the REST API is that you don't need client-side config files, so having webhcat-site.xml on the edge node would do you no good anyway.

3. HiveServer2 is accessed via ODBC/JDBC or the beeline client (which uses JDBC). You can certainly use HiveServer2 to insert values; its primary purpose is to provide a server for executing SQL against Hive tables. See http://hortonworks.com/hadoop-tutorial/secure-jdbc-odbc-clients-access-hiveserver2-using-apache-knox... and http://knox.apache.org/books/knox-0-9-0/user-guide.html#Hive for details on accessing it through Knox. You may also look at the samples in {GATEWAY_HOME}/samples/hive.
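As a sketch of the JDBC route through Knox: the connection URL below is an assumption built from your gateway host and topology name ("sample"), and the httpPath and credentials are placeholders; check the Knox user guide for the exact form your version expects.

```shell
# Sketch only: URL parameters, topology name, and credentials are assumptions.
if command -v beeline >/dev/null 2>&1; then
  beeline -u 'jdbc:hive2://knox.testserver.com:8443/;ssl=true;transportMode=http;httpPath=gateway/sample/hive' \
    -n testuser -p abcdef \
    -e 'INSERT INTO test_table VALUES (1);'
else
  echo "beeline not installed (sketch only)"
fi
```

Because beeline speaks to HiveServer2 over HTTP in this mode, it works from outside the cluster through the Knox gateway, which is exactly the insert-from-outside scenario you describe.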


Had the same error in my config; fixing it as proposed resolved the issue.