Connecting BI tools to Spark
Labels: Apache Spark
Created 02-21-2017 02:28 PM
We need to connect different BI reporting frontends to Spark / Spark SQL. Ideally, connecting via JDBC or ODBC would give us the broadest options.
We have already reviewed this question, but have not succeeded:
If you run spark/sbin/beeline and execute
!connect jdbc:hive2://localhost:10015
you are asked for a user and password, which we leave blank by pressing Enter. However, when we try to access that host:port via ODBC from an outside application, we get an SSL error.
On the other hand, I've noticed there is a commercial driver released by Simba (and also promoted by Databricks) for achieving this.
Which is the recommended way to go?
Thanks!
Created 02-27-2017 05:54 PM
The Spark SQL JDBC driver works. Do you have a firewall or something else blocking that port?
1. Make sure the Spark Thrift server is running (check through Ambari).
2. Test Spark SQL from a local Spark shell.
3. Check from another machine to make sure a firewall or networking rule is not blocking that port. Is this a sandbox, a local cluster, or cloud based?
Can you post a screenshot of the tool you tried?
see:
Which version of Spark/HDP are you using?
Do you see anything in the logs, either in your tool or on the server?
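For step 3, a quick reachability test can be scripted instead of done by hand. The following is only a sketch using the Python standard library. The host and port in the usage part are placeholders taken from this thread, and `port_is_open` is a helper name made up for illustration. A successful TCP connection only shows that nothing is blocking the port; it does not prove that the Thrift server accepts your authentication or transport settings.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    # Placeholder values: replace with your Thrift server host and port.
    host, port = "localhost", 10015
    state = "reachable" if port_is_open(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

Run it once on the server itself and once from the machine where the BI tool lives. If the first succeeds and the second fails, the problem is most likely a firewall or port-forwarding rule in between.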
Created 02-21-2017 09:42 PM
That connection string is a JDBC one, yet you then mention ODBC not working against it. Please clarify which of the two you are using, and with which BI tool.
Created 02-22-2017 01:28 AM
Sorry for the confusion. I tried both. My example refers to JDBC.
Created 03-08-2017 03:23 PM
For JDBC, is the Simba commercial driver the only option?
Thanks!
Created 03-04-2017 12:14 AM
Thanks @Timothy Spann
Actually, we had overlooked the Docker firewall (we were trying this on a sandbox). After modifying iptables it worked. We used the *free* ODBC driver (we could not find a free one for JDBC).
Created 03-08-2017 04:49 AM
@Fernando Lopez Bello did you find a solution for ODBC connections? I can connect just fine over JDBC using Beeline and port 10015, but the ODBC driver fails.
Created 03-08-2017 03:21 PM
Yes, we could connect to Thrift Server via ODBC:
- Download and install the ODBC driver for Spark from the HDP downloads page.
- Make sure the Thrift Server is up and running (default port 10015). Double-check by, for instance, running telnet against that port.
- Configure the ODBC driver like this:
Driver=Hortonworks Spark ODBC Driver;Host=192.168.170.45;Port=10015;SparkServerType=3;AuthMech=2;ThriftTransport=1;
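In case it helps anyone scripting this, here is a rough sketch of building the same connection string in Python. The builder uses only the standard library. `run_query` assumes the third-party `pyodbc` package and an installed Hortonworks Spark ODBC driver, neither of which is covered in this thread. The function and variable names are made up for illustration, and the comments on the numeric options are my reading of the driver documentation, so verify them against your driver version.

```python
def build_odbc_connection_string(settings: dict) -> str:
    """Join key/value pairs into a 'Key=Value;' ODBC connection string."""
    return "".join(f"{key}={value};" for key, value in settings.items())


# Settings copied from the connection string above. The inline notes are
# assumptions about what each numeric code means; check the driver manual.
SPARK_ODBC_SETTINGS = {
    "Driver": "Hortonworks Spark ODBC Driver",
    "Host": "192.168.170.45",
    "Port": 10015,
    "SparkServerType": 3,  # assumed: Spark Thrift Server
    "AuthMech": 2,         # assumed: user name authentication
    "ThriftTransport": 1,  # assumed: SASL transport
}


def run_query(sql: str, settings: dict = SPARK_ODBC_SETTINGS):
    """Run one query via pyodbc (third-party; assumed to be installed)."""
    import pyodbc  # imported lazily so the builder works without pyodbc

    conn = pyodbc.connect(build_odbc_connection_string(settings), autocommit=True)
    try:
        return conn.cursor().execute(sql).fetchall()
    finally:
        conn.close()


# Prints the string that would be handed to the ODBC driver manager.
print(build_odbc_connection_string(SPARK_ODBC_SETTINGS))
```

Keeping the settings in a dictionary makes it easy to change one option at a time (for example the transport) while troubleshooting.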
On the other hand, I still need to connect via JDBC without the Simba commercial driver. How did you do this?
Regards,
Fernando
Created 03-08-2017 06:54 PM
For JDBC there is a built-in jar for JDBC support. No need for Simba.
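As a minimal sketch of what a JDBC connection without Simba could look like from Python: the URL format is the one used with Beeline earlier in this thread, and `org.apache.hive.jdbc.HiveDriver` is the driver class that handles `jdbc:hive2://` URLs. Everything else is an assumption: the third-party `jaydebeapi` package, a working JVM, and `jar_path`, which is a placeholder because the JAR location is not given in this thread. The function names are made up for illustration.

```python
from typing import Optional

# Driver class that handles jdbc:hive2:// URLs.
HIVE_JDBC_DRIVER_CLASS = "org.apache.hive.jdbc.HiveDriver"


def build_hive2_jdbc_url(host: str, port: int = 10015,
                         database: Optional[str] = None) -> str:
    """Build a jdbc:hive2 URL; the database suffix is optional."""
    url = f"jdbc:hive2://{host}:{port}"
    return f"{url}/{database}" if database else url


def connect_via_jdbc(host: str, jar_path: str, user: str = "",
                     password: str = "", port: int = 10015):
    """Open a connection through jaydebeapi (third-party, assumed installed).

    jar_path is a placeholder: point it at the Hive JDBC JAR (and any
    dependency JARs it needs) shipped with your distribution.
    """
    import jaydebeapi  # imported lazily so the URL builder works without it

    return jaydebeapi.connect(
        HIVE_JDBC_DRIVER_CLASS,
        build_hive2_jdbc_url(host, port),
        [user, password],
        jar_path,
    )


# Same URL as the Beeline example earlier in the thread.
print(build_hive2_jdbc_url("localhost"))
```

A BI tool that accepts a generic JDBC driver would need the same three pieces: the JAR, the driver class name, and the URL.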
Created 03-10-2017 07:18 PM
Where can we find the JAR you are referring to?
Thanks