- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Accessing spark dataframe in spark-shell through JDBC
- Labels:
-
Apache Spark
Created ‎12-01-2017 12:23 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
So my workflow for everyday purposes is,
1. I have data on hive that is refreshed by oozie periodically
2. I access this data from spark-shell and then build data sets/run ml algorithms and create new dataframes.
3. Then I am expected to persist these dataframes back to hive.
4. Then I can connect from Tableau through a JDBC connection and visualize the results.
I wish to skip the step between 3 and 4 and be able to connect directly from Tableau to my spark-shell where my dataframe is and then visualize these results from there. Is this possible with the spark-thrift server? I see a lot of restrictions on how this can be run and havent managed to get it running even once before.
Note: I am not admin on the HDP cluster so I dont have access to the keytabs etc that are needed for running hive server. I only wish to run as myself but trigger the creation of the dataset from tableau instead of going to hive but running a job that updates the hive table.
Created ‎12-01-2017 06:35 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, @Subramaniam Ramasubramanian. You cannot connect to your Spark-shell via JDBC.
Created ‎12-01-2017 06:35 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, @Subramaniam Ramasubramanian. You cannot connect to your Spark-shell via JDBC.
Created ‎12-04-2017 11:11 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Appreciate the trouble to look into this @Dongjoon Hyun!
Created ‎12-04-2017 11:16 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Maybe just one follow up question. What exactly would be the usecase when you would want to use the spark thrift server then? Is it simply to have a spark backend to hive instead of lets say tez or mr?
Created ‎12-04-2017 05:12 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
In addition to that, STS supports Spark SQL syntax since v2.0.0. If you want to use Spark SQL Syntax with SQL 2003 support, it's a good choice. Also, you can use Spark-specific syntax like `CACHE TABLE`, too.
