
Spark Scala with Hive access

New Contributor

Hi All ,

I am trying to access Hive from a Spark application written in Scala. My code is as follows:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession

val hiveLocation = "hdfs://master:9000/user/hive/warehouse"
val conf = new SparkConf()
  .setAppName("SOME APP NAME")
  .setMaster("local[*]")
  .set("spark.sql.warehouse.dir", hiveLocation)

val sc = new SparkContext(conf)
val spark = SparkSession
  .builder()
  .appName("SparkHiveExample")
  .master("local[*]")
  .config("spark.sql.warehouse.dir", hiveLocation)
  .config("spark.driver.allowMultipleContexts", "true")
  .enableHiveSupport()
  .getOrCreate()
println("Start of SQL Session--------------------")

spark.sql("select * from test").show()
println("End of SQL session-------------------")

But it ends with the error message "Table or view not found". However, when I run "show tables;" in the Hive console, I can see that table and can run "select * from test". Everything is in the "/user/hive/warehouse" location. Just for testing, I also tried creating a table from Spark, to find out the table location:

val spark = SparkSession
  .builder()
  .appName("SparkHiveExample")
  .master("local[*]")
  .config("spark.sql.warehouse.dir", hiveLocation)
  .config("spark.driver.allowMultipleContexts", "true")
  .enableHiveSupport()
  .getOrCreate()
println("Start of SQL Session--------------------")
spark.sql("CREATE TABLE IF NOT EXISTS test11(name String)")
println("End of SQL session-------------------")

This code also executed properly (with a success note), but the strange thing is that I can't find this table from the Hive console.

Even when I run "select * from TBLS;" in MySQL (in my setup I configured MySQL as the metastore for Hive), I did not find the tables created from Spark. Is the Spark warehouse location different from the Hive console's? (As far as I know, both must be the same location.) What do I have to do to access an existing Hive table from Spark?
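In case it helps diagnose, here is a minimal check of which warehouse and catalog the Spark session is actually using (a sketch only; spark is the session from the code above, and "default" is assumed as the database name):

println(spark.conf.get("spark.sql.warehouse.dir"))  // the warehouse path the session resolved
spark.catalog.listDatabases().show(false)           // databases visible to Spark's catalog
spark.catalog.listTables("default").show(false)     // tables Spark sees in the default database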

Please suggest.... Thanks in advance...

2 REPLIES

Re: Spark Scala with Hive access

Super Collaborator

@Biswajit Chakraborty,

In Spark, Hive is accessed through a HiveContext, and SQL then needs to run from that context (instead of the plain Spark SQL context).

Please add the following lines to your code after you have created the Spark context:

// HiveContext takes the SparkContext (sc), not the SparkSession
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.sql("select * from test").show()

Additionally, you may refer to the step-by-step instructions on how to connect to Hive from Spark given here (and make use of the ORC file format).

Hope this helps!!

Re: Spark Scala with Hive access

New Contributor

Creating a HiveContext is no longer recommended in newer Spark versions, I believe; in fact, in Spark 2 it has been deprecated.
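For completeness, a minimal Spark 2 sketch of the SparkSession-based replacement (the metastore URI thrift://master:9083 is an assumption for this setup; normally it comes from a hive-site.xml placed on the classpath or in Spark's conf directory, in which case the explicit config line is unnecessary):

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("SparkHiveExample")
  .master("local[*]")
  // assumption: point at your Hive metastore; redundant if hive-site.xml is on the classpath
  .config("hive.metastore.uris", "thrift://master:9083")
  .enableHiveSupport()
  .getOrCreate()

// SQL runs directly on the session; no separate HiveContext is needed
spark.sql("select * from test").show()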
