Can't connect to Impala through JDBC on Amazon EMR
Labels: Apache Hadoop, Apache Impala
Created on ‎07-09-2014 07:13 AM - edited ‎09-16-2022 02:01 AM
Hi all,
- Set up an SSH tunnel to the master node like this: ssh -ND 21050 hadoop@master-node-external-dns-hostname
- Downloaded the correct JDBC drivers from here: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/impala-jdbc.html
- Tried to set up a connection using SquirrelSQL and SQLWorkbenchJ with the downloaded drivers and the following connection string: jdbc:hive2://localhost:21050/;auth=noSasl
- Result: Could not establish connection to jdbc:hive2://localhost:21050/;auth=noSasl: null
- I checked whether Impala works by running impala-shell on the master node. I can show tables, query, etc.
- I checked whether the port is forwarded through the tunnel by telnetting to localhost 21050.
- I checked with beeline on the master node whether it's possible at all to connect to Impala through JDBC on that port. That works just fine.
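The telnet check in the list above can also be scripted. The following is a minimal Python sketch using only the standard library; the function name `port_is_open` and the timeout value are illustrative assumptions, not part of the original setup. Note that a successful TCP connect only shows that something is listening on the port, not that it speaks the protocol the JDBC driver expects.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the same TCP handshake telnet would.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 21050 is the local port used in the steps above.
    print(port_is_open("localhost", 21050))
```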
Created ‎08-12-2014 11:56 AM
I tried a different SSH tunnel and it worked for me:
ssh -L 12345:localhost:21050 your_user_name@your_node.compute.amazonaws.com
This opens port 12345 on your local machine and forwards it to port 21050 on the Hadoop node.
More info here: http://marcelkrcah.net/blog/how-to-wire-pandas-to-impala/
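Some background on why this tunnel may behave differently: `ssh -D` opens a SOCKS proxy on the local port, so a client has to speak SOCKS to use it, while `ssh -L` forwards one fixed local port straight to the remote port. That would be consistent with telnet connecting but the JDBC driver failing in the original setup, although the thread does not confirm this. Below is a minimal Python sketch that builds the tunnel command and the matching JDBC URL, plus an optional query through Impyla. The function names are hypothetical, and the Impyla part assumes the package is installed and that the cluster accepts unauthenticated (NOSASL) connections, as `auth=noSasl` in the question suggests.

```python
from typing import List


def tunnel_command(user: str, host: str, local_port: int = 12345,
                   remote_port: int = 21050) -> List[str]:
    """Build the ssh -L command from the post as an argument list."""
    # -L local_port:localhost:remote_port forwards one fixed local port to
    # remote_port as seen from the remote node.
    return ["ssh", "-L", f"{local_port}:localhost:{remote_port}",
            f"{user}@{host}"]


def jdbc_url(local_port: int = 12345) -> str:
    """JDBC URL from the question, pointed at the forwarded local port."""
    return f"jdbc:hive2://localhost:{local_port}/;auth=noSasl"


def query_via_impyla(sql: str, local_port: int = 12345):
    """Run one query through the tunnel with Impyla (assumed installed)."""
    # Imported lazily so the helpers above work without Impyla.
    from impala.dbapi import connect

    conn = connect(host="localhost", port=local_port,
                   auth_mechanism="NOSASL")
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Placeholder user and host, as in the post above.
    print(" ".join(tunnel_command("your_user_name",
                                  "your_node.compute.amazonaws.com")))
    print(jdbc_url())
```

With the tunnel running in another terminal, something like `query_via_impyla("SHOW TABLES")` should return the table list, provided the assumptions above hold for your cluster.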
Created ‎07-09-2014 01:49 PM
Daan,
I think you'd need to ask Amazon about this; it provides support for Impala on EMR.
Created ‎07-10-2014 03:12 AM
Thought that maybe this was a general Impala JDBC issue that people have seen before.
Created ‎08-12-2014 01:04 PM
Thanks Marcel! That does seem to work, at least with Tableau and Impyla. Apparently the instructions on the Amazon website for setting up a tunnel don't work that well. Tomorrow I'm going to try whether this tunnel also works with Squirrel and other generic JDBC DB tools.
