Connecting BI tools to Spark

Expert Contributor

We need to connect different BI reporting frontends to Spark / Spark SQL. Ideally, connecting via JDBC or ODBC would give us the broadest options.

We have already reviewed this question, but have not succeeded:

If you run spark/sbin/beeline and execute

!connect jdbc:hive2://localhost:10015 

you are asked for a user and password, which we leave blank and press Enter. However, when we try to access that host:port via ODBC from an outside application, we get an SSL error.

On the other hand, I've noticed there is a commercial driver released by Simba (and also promoted by Databricks) for achieving this.

Which is the recommended way to go?

Thanks!

1 ACCEPTED SOLUTION

Master Guru

The SparkSQL JDBC driver works. Do you have a firewall or something else blocking that port?

1. Make sure Spark Thrift server is running through Ambari

2. Test Spark SQL from local spark shell

3. Check from another machine and make sure a firewall or networking issue is not blocking that port. Is this a sandbox? A local cluster? Cloud based?
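Step 3 can be scripted instead of checked by hand. This is only a minimal sketch using the Python standard library; the host and port in the usage section are placeholders (10015 is the port mentioned in the question), and a successful TCP connect only shows that the port is reachable, not that the Thrift Server accepts your credentials or transport settings.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when the port cannot be reached from this machine.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder values: replace with your Thrift Server host and port.
    host, port = "localhost", 10015
    state = "reachable" if is_port_open(host, port) else "blocked or not listening"
    print(f"{host}:{port} is {state}")
```

Run it once on the server itself and once from the machine where the BI tool lives. If it succeeds locally but fails remotely, a firewall or port mapping between the two is the likely culprit.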

Can you post a screenshot of the tool you tried?

See:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_spark-component-guide/content/jdbc-odbc-...

Which version of Spark/HDP are you using?

Do you see anything in the logs in your tool or on the server?

9 REPLIES 9

Super Guru

@Fernando Lopez Bello

Your connection string shows JDBC, but then you mention ODBC not working against that same host and port. Please clarify which one you are using, and with which BI tool.

Expert Contributor

Sorry for the confusion. I tried both. My example refers to JDBC.

Expert Contributor

For JDBC, is the Simba commercial driver the only option?

Thanks!

Expert Contributor

Thanks @Timothy Spann

Actually, we had missed the Docker firewall (we were testing on a sandbox). After modifying iptables it worked. We used the *free* ODBC driver (we could not find a free one for JDBC).

New Contributor

@Fernando Lopez Bello did you find a solution for ODBC connections? I can connect just fine over JDBC using Beeline on port 10015, but the ODBC driver fails.

Expert Contributor

Yes, we could connect to Thrift Server via ODBC:

- Download and install ODBC driver for Spark from HDP downloads page.

- Make sure Thrift Server is up and running (default port 10015). Double check with telnet to that port, for instance.

- Configure ODBC driver like this:

Driver=Hortonworks Spark ODBC Driver;Host=192.168.170.45;Port=10015;SparkServerType=3;AuthMech=2;ThriftTransport=1;
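For anyone scripting this rather than filling in a DSN dialog, here is a small sketch that assembles the same connection string from named settings. The helper `build_odbc_connection_string` is made up for illustration, and the comments on the numeric codes are my reading of the driver options rather than something confirmed in this thread, so check them against the driver documentation.

```python
def build_odbc_connection_string(params: dict) -> str:
    """Join key/value pairs into a 'Key=Value;' style ODBC connection string."""
    return "".join(f"{key}={value};" for key, value in params.items())


# Values copied from the working configuration above.
spark_odbc_params = {
    "Driver": "Hortonworks Spark ODBC Driver",
    "Host": "192.168.170.45",
    "Port": 10015,
    "SparkServerType": 3,  # assumed meaning: Spark Thrift Server
    "AuthMech": 2,         # assumed meaning: user name authentication
    "ThriftTransport": 1,  # assumed meaning: SASL transport
}

connection_string = build_odbc_connection_string(spark_odbc_params)
print(connection_string)
```

The resulting string could then be handed to an ODBC client library. With the third-party pyodbc package, for example, that would be `pyodbc.connect(connection_string, autocommit=True)`, assuming the driver is installed under exactly that name on the client machine.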

On the other hand, I still need to connect via JDBC without the Simba commercial driver. How can this be done?

Regards,

Fernando

Super Collaborator

For JDBC there is a built-in jar for JDBC support. No need for Simba.

Expert Contributor

Where can we find the referred jar?

Thanks