Hive is one of the most commonly used databases on Hadoop. The number of Hive users is doubling every year thanks to steady enhancements and the addition of Tez and Spark, which let Hive move past the MapReduce era to in-memory execution and changed how people use it.
In this blog post, I will show you how to connect SQuirreL SQL Client to Hive. The concept is similar for any other client: as long as you use open-source libraries that match the ones listed here, you should be fine.
The files you should look for are the following (versions will differ based on which Sandbox you are running, but different versions are unlikely to cause a problem):
If you are running Windows, you might need to install WinSCP to grab the files from their locations.
Once all the JARs above are downloaded to your local machine, open SQuirreL, go to Drivers, and choose Add New Driver.
Name: Hive Driver (could be anything else you want)
Example URL: jdbc:hive2://localhost:10000/default
Class Name: org.apache.hive.jdbc.HiveDriver
Go to Extra Class Path and add all the JARs you downloaded.
You may change the port number or IP address in the URL if you are not running with the defaults.
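To make it clearer which parts of the Example URL change when you are not on the defaults, here is a minimal Python sketch that assembles the URL from its pieces. The function name hive_jdbc_url is hypothetical and only for illustration. The defaults mirror the Example URL above, and the IP address, port and database name in the usage note are made-up values.

```python
def hive_jdbc_url(host: str = "localhost", port: int = 10000,
                  database: str = "default") -> str:
    """Build a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.

    hive_jdbc_url is a hypothetical helper, not part of Hive or SQuirreL.
    """
    # Reject ports outside the valid TCP range before building the URL.
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return f"jdbc:hive2://{host}:{port}/{database}"
```

Calling hive_jdbc_url() with no arguments gives the Example URL above, while something like hive_jdbc_url("192.168.56.101", 10001, "sales") shows where a different IP address, port and database would go. Paste the resulting string into the URL field in SQuirreL.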
Log in to your Hadoop Sandbox and verify that HiveServer2 is running using:
netstat -anp | grep 10000
If nothing is listening on that port, you can start HiveServer2 manually.
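The netstat check runs on the Sandbox itself. It can also help to confirm that the port is reachable from the machine where SQuirreL runs, since port forwarding or a firewall may sit in between. The following is a minimal sketch using only the Python standard library. The function name is_port_open is hypothetical, and the host and port are assumed to be the defaults from the Example URL.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    is_port_open is a hypothetical helper for illustration only.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, is_port_open("localhost", 10000) should return True once HiveServer2 is listening and reachable from your machine. Note that this only shows that something accepts TCP connections on that port, not that it is actually HiveServer2.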
Once you have verified that HiveServer2 is up and running, you are ready to test the connection in SQuirreL by creating a new Alias as follows:
You are now ready to connect. Once the connection succeeds, you should see a screen like this:
Step 8 (Optional)
On your first Hive query, SQuirreL can be buggy and complain about memory and heap size. If this happens and you are on a Mac, right-click the app icon, choose Show Package Contents, open Info.plist, and add the following snippet: