Created 10-24-2017 09:31 PM
I can use SparkSession to get the list of tables in Hive, or to access a Hive table, as shown in the code below. Now my question is: in this case, am I using Spark with a Hive context?
Or, to use a Hive context in Spark, must I use the HiveContext object directly to access tables and perform other Hive-related functions?
spark.catalog.listTables.show
val personnelTable = spark.catalog.getTable("personnel")
Created 10-25-2017 04:13 AM
I assume you're on Spark 2?
SparkSession encapsulates SparkConf, SparkContext, and SQLContext, so you don't need to create them explicitly.
In Spark 2.0, SparkSession also merges SQLContext and HiveContext into a single object.
When building a session object, for example:
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("SparkSessionZipsExample")
  .config("spark.sql.warehouse.dir", warehouseLocation)
  .enableHiveSupport()
  .getOrCreate()
.enableHiveSupport() provides the HiveContext functionality, so you can use the catalog functions: once .enableHiveSupport() is called, Spark connects to the Hive metastore.
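For example, with the spark session built above, both of the calls below go through the Hive metastore (the personnel table is just the example from the question and is assumed to exist):

// List the tables registered in the Hive metastore
spark.catalog.listTables().show()

// Query a Hive table directly through the session
spark.sql("SELECT * FROM personnel LIMIT 10").show()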
You'll find more detail in this post: https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html
Created 10-25-2017 04:40 PM
Thanks for the reply. Does this mean that the spark object in spark-shell already has enableHiveSupport() enabled? Or are the spark.sql() and spark.catalog functions that the spark object provides implemented by SparkSession even without enableHiveSupport()?
Created 10-25-2017 05:37 PM
Yes, the spark object in spark-shell already has enableHiveSupport() enabled.
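One way to check this for yourself: the session exposes which catalog implementation it is running with. A quick sketch (assuming a default Spark 2.x build; the config key is internal but commonly used for this check):

// Returns "hive" when Hive support is enabled, "in-memory" otherwise
spark.conf.get("spark.sql.catalogImplementation")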