Cannot connect to Hive on CDH4.5 EC2 installation using Cloudera ODBC 2.5.5 Windows x64 drivers

Explorer

Hi All,

I've installed a CDH4.5 Hadoop cluster on Amazon EC2 using the instructions here:

http://blog.cloudera.com/blog/2013/03/how-to-create-a-cdh-cluster-on-amazon-ec2-via-cloudera-manager...

All seems to be working OK; however, I can't connect to it from a Windows VM on my laptop using either the Hive or Impala ODBC drivers. I've connected this VM to the Quickstart VM in the past via the Impala ODBC drivers, but I can't seem to connect to CDH4 running on EC2 at all. Checking one of the EC2 instances, it doesn't even seem as if port 10000 (the Hive port) is in use, even though Hive is running and the HiveServer2 configuration properties in CM say it's using port 10000.

The ports are open within the EC2 security group. Is there something obvious I'm missing here?

Mark

1 ACCEPTED SOLUTION

Cloudera Employee

I believe your first post mentions that you are using CM. If you're using CM to manage the cluster, you won't see the hive-server2 service from the command line; you'll have to add the instance and start it from CM. The default settings for HiveServer2 are listed in the configuration, but by default the instance is neither added nor started. Here is the documentation for adding a role instance. Once you have added the HiveServer2 instance, you can start it and should be able to access it straight away.

Hopefully this will get you going. Please let me know your results.

You can also use the following command on the Quickstart VM or your EC2 setup to verify that port 10000 is in use once you start HiveServer2:

sudo netstat -tulpn | grep 10000
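The netstat check above runs on the server. To see the same port from the client side (for example, from the Windows VM), a small TCP probe can help separate "nothing is listening" from "something in between is blocking the port". This is only a sketch: `probe_port` is a hypothetical helper name, and the three result labels are my own convention, not anything the ODBC driver reports.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt to host:port.

    Returns one of three labels (hypothetical, for illustration only):
      "open"        - the TCP handshake succeeded, so something is listening
      "refused"     - the host answered but no process is bound to the port
      "unreachable" - timeout, DNS failure, or no route (often a firewall,
                      security group, or missing port-forwarding rule)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        # socket.timeout and socket.gaierror are both OSError subclasses.
        return "unreachable"
```

For example, calling `probe_port("<your-ec2-hostname>", 10000)` from the client before HiveServer2 has been added in CM would be expected to report "refused" if the security group lets the traffic through, and "unreachable" if it does not.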

 

Dave

Explorer

Hi - thanks for the background.

 

One last question (promise) - if I'm also going to connect to Impala on either the Quickstart VM or an EC2 install (using Cloudera's ODBC drivers for Impala), should I also connect using port 10000, i.e. the HiveServer2 port? Or should I use 21050?

The reason I ask is that, now that I'm testing the Impala drivers, 10000 works, but I can't get a connection to work on 21050 (although I seem to remember it worked on that port before...)

 

Mark

Cloudera Employee

You should use port 21050 to connect to Impala, as long as that port hasn't been changed in your settings.  Choose No Authentication if you do not have security set up on EC2/Quickstart.
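If you connect without a DSN, the same two choices (port 21050, no authentication) end up in the connection string. Below is a rough sketch of how such a string could be assembled. The key names `Host`, `Port`, `AuthMech`, and `UID`, the mapping of `AuthMech=0` to "No Authentication", and the driver name are assumptions recalled from the Cloudera ODBC driver documentation, so check them against the install guide for your driver version. `build_connection_string` is a hypothetical helper.

```python
from typing import Optional


def build_connection_string(
    driver: str,
    host: str,
    port: int,
    auth_mech: int = 0,
    uid: Optional[str] = None,
) -> str:
    """Assemble a DSN-less ODBC connection string.

    Key names and AuthMech values are assumptions; verify them in the
    driver's installation guide before relying on them.
    """
    parts = [
        f"Driver={{{driver}}}",  # braces protect spaces in the driver name
        f"Host={host}",
        f"Port={port}",
        f"AuthMech={auth_mech}",  # assumed: 0 means No Authentication
    ]
    if uid is not None:
        parts.append(f"UID={uid}")
    return ";".join(parts)


# Impala on the default port from this thread, with no authentication.
impala_conn = build_connection_string(
    "Cloudera ODBC Driver for Impala",  # assumed driver name
    "localhost",
    21050,
)
```

The resulting string can then be handed to whichever ODBC client you are using.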

 

Glad to see the HS2 connection is up and running!

Explorer

Thanks. One other issue I hit with Impala is that, on the EC2 install, port 21050 isn't open; this looks like it's because the installer has exceeded the maximum number of rules allowed in an AWS security group. You can add more security groups to an instance, so I'll try that route.

New Contributor
Hi,

Please help me resolve a connectivity issue with the Hive ODBC driver.

Problem: Unable to connect to the Hive DB using the Cloudera ODBC driver.
What works: I can connect to the Hive DB through the JDBC driver; port 10000 is active and works fine with JDBC.
What is needed: I need to connect OBIEE to the Hive DB through ODBC.
What is done: Installed the Cloudera ODBC Driver for Hive on Windows.

Environment:
Linux VirtualBox VM: Oracle Big Data Lite (Hadoop environment)
Version: Cloudera Distribution including Apache Hadoop (CDH5.4.0), Oracle Linux 6.6
Windows 7: Installed the Cloudera ODBC Driver for Hive to test the connectivity.
Error messages:
==========================
Driver Version: V2.5.12.1005
Running connectivity tests...
Attempting connection
Failed to establish connection
SQLSTATE: HY000[Cloudera][HiveODBC] (34) Error from Hive: No more data to read..
TESTS COMPLETED WITH ERROR.
=========================

New Contributor
Sorry, the error message actually looks like this:
=============
Driver Version: V2.5.12.1005

Running connectivity tests...

Attempting connection
Failed to establish connection
SQLSTATE: HY000[Cloudera][HiveODBC] (34) Error from Hive: ETIMEDOUT.

TESTS COMPLETED WITH ERROR
=====================

New Contributor

I am encountering issues as well.

 

I'm using the Cloudera Quickstart VM with NAT networking and port forwarding, and I have forwarded port 10000.

I managed to connect Pentaho Kettle to Hive. I have both the 32-bit and 64-bit versions of Tableau installed.

I have followed the instructions above (starting up HiveServer2, etc.); however, I still get this error:

 

Driver Version: V2.5.0.1001

Running connectivity tests...

Attempting connection
Failed to establish connection
SQLSTATE: HY000[Cloudera][Hardy] (34) Error from Hive: Bad version identifier.

TESTS COMPLETED WITH ERROR.

 

I am running CDH 4.4... is this an issue? Does anyone know how to solve this?

 

Many thanks in advance.

 

 

 


I'd guess that your driver and HS2 have a version mismatch. It's not clear from your post exactly what you are using to try to connect to HS2. You said that you got Pentaho Kettle to connect successfully, and that you have both 32-bit and 64-bit Tableau, but not which client failed.

You might want to verify the compatibility of your driver with CDH versions (i.e., check with your driver's vendor), and/or try posting your question in the Hive forums, as this doesn't seem to be an issue with Cloudera Manager.

Cloudera Employee

Hi, sorry to hear you're having problems.

 

CDH 4.4 should work.  I have the CDH4.4 VM running on my laptop with NAT and can connect to it.

 

Are you choosing an authentication mechanism or leaving it as no authentication?

 

If you are choosing No Authentication, you will need to disable impersonation for HiveServer2 and add the following to the safety valve for hive-site.xml:

 

<property>
<name>hive.server2.authentication</name>
<value>NOSASL</value>
</property>
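Before pasting a snippet like this into the safety valve, it can be worth confirming that it is well-formed and carries the value you intend. The sketch below uses only the Python standard library. `read_properties` is a hypothetical helper, and it assumes the input is a bare run of `<property>` elements (as in the snippet above) rather than a full hive-site.xml with an XML declaration.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text: str) -> dict:
    """Parse bare <property> elements into a {name: value} dict.

    The snippet is wrapped in a <configuration> root because a
    safety-valve fragment has no single root element of its own.
    Malformed XML raises xml.etree.ElementTree.ParseError.
    """
    root = ET.fromstring(f"<configuration>{xml_text}</configuration>")
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


snippet = """
<property>
<name>hive.server2.authentication</name>
<value>NOSASL</value>
</property>
"""

props = read_properties(snippet)
```

Here `props["hive.server2.authentication"]` should come back as `NOSASL`; a typo such as an unclosed tag would surface as a parse error instead of a silently ignored setting.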

 

Another option you could try is to leave HiveServer2 as is and choose User Name authentication and supply a user name.

 

You may also want to try the newest version of the ODBC driver.