Spark - Hive Configuration needed in Cloudera Quickstart VM 5.12

New Contributor



I am new to the Cloudera Quickstart VM 5.12.

I am trying to run a small program that creates a database and a table and loads data using Spark SQL.

I wrote some sample code, but it fails with a Hive error. To fix that, I tried copying hive-site.xml to the spark/conf directory, and also tried creating a hard link, but both attempts failed with "permission denied".

Can anyone help me fix this? It would be a great help.

As this VM ships with Spark 1.6.1, can somebody give sample code for this? As of now, I am running it in IntelliJ.





Eswar K



Expert Contributor

Welcome, Eswar! 

Since you are using Spark 1.6, all you need is a Hive gateway to explore Hive tables from Spark SQL (there is no need to copy hive-site.xml manually).


Make sure the Hive gateway role is added to the node from which you run spark-shell (in your case there is just one node, so it should be your Quickstart VM) using CM > Hive > Instances > Gateway Role.
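
Once the gateway is deployed, a quick sanity check from spark-shell can confirm that Spark actually sees the Hive metastore. This is only a sketch: it assumes the shell's predefined sqlContext is a HiveContext (which is what a Hive-enabled Spark 1.6 build is expected to provide) and that the "default" database exists.

```scala
// Run inside spark-shell on the Quickstart VM.
import org.apache.spark.sql.hive.HiveContext

// Assumption: in a Hive-enabled Spark 1.6 shell, sqlContext is a HiveContext.
val hiveEnabled = sqlContext.isInstanceOf[HiveContext]
println(s"sqlContext is a HiveContext: $hiveEnabled")

// List the databases known to the Hive metastore; "default" should be among them.
sqlContext.sql("SHOW DATABASES").collect().foreach(println)

// List the tables in the current database.
sqlContext.sql("SHOW TABLES").show()
```

If the first line prints false, or the database list does not match what you see in beeline, the Hive client configuration is probably not being picked up yet.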




As for sample code, you can start by creating a sequence or an array in spark-shell:


scala> val data = Seq(("Falcon", 30), ("IronMan", 40), ("BlackWidow", 10))



Next, parallelize the collection and create a DataFrame from the resulting RDD:

scala> val df = sc.parallelize(data).toDF("Name", "Count")


After this, set the Hive warehouse path for the table:

scala> val options = Map("path" -> "/user/hive/warehouse/avengers")


Then save the DataFrame as a table:

scala> df.write.options(options).saveAsTable("default.avengers")
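
One thing to watch for: the default save mode is ErrorIfExists, so running the saveAsTable step a second time fails because the table already exists. A possible variant, assuming df and options from the steps above are still defined in the same shell session, sets the save mode explicitly:

```scala
// Run inside the same spark-shell session as the steps above.
import org.apache.spark.sql.SaveMode

// Overwrite replaces the existing table instead of failing when it already exists.
// Use SaveMode.Append instead if the new rows should be added to the existing data.
df.write.mode(SaveMode.Overwrite).options(options).saveAsTable("default.avengers")
```

Overwrite discards whatever the table held before, so it is only a good fit for throwaway test tables like this one.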


Finally, query the table using Spark SQL and beeline 

scala> sqlContext.sql("select * from avengers").collect.foreach(println);
[Falcon, 30]
[IronMan, 40]
[BlackWidow, 10]
$ beeline …
> show tables;
> select * from avengers;
Falcon      30
IronMan     40
BlackWidow  10
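
Since you mentioned running from IntelliJ rather than the shell, here is a rough sketch of the same flow as a standalone Spark 1.6 application, including the database creation you asked about. The object name AvengersHiveExample and the database name demo_db are made up for illustration. It assumes the spark-core, spark-sql and spark-hive 1.6.x artifacts are on the project classpath and that hive-site.xml is visible to the application (for example on the classpath); without it, HiveContext falls back to a local Derby metastore in the working directory instead of the VM's Hive metastore.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.hive.HiveContext

// Hypothetical object name, chosen for this example only.
object AvengersHiveExample {
  def main(args: Array[String]): Unit = {
    // Assumption: fall back to a local master when none is supplied,
    // which is convenient when launching from the IDE.
    val conf = new SparkConf()
      .setAppName("AvengersHiveExample")
      .setIfMissing("spark.master", "local[*]")
    val sc = new SparkContext(conf)

    // HiveContext is the Hive-aware entry point for Spark SQL in Spark 1.x.
    val hiveContext = new HiveContext(sc)
    import hiveContext.implicits._

    // Hypothetical database name for illustration.
    hiveContext.sql("CREATE DATABASE IF NOT EXISTS demo_db")

    // Build a small DataFrame from an in-memory collection.
    val data = Seq(("Falcon", 30), ("IronMan", 40), ("BlackWidow", 10))
    val df = sc.parallelize(data).toDF("Name", "Count")

    // Save it as a table, replacing any previous copy from an earlier run.
    df.write.mode(SaveMode.Overwrite).saveAsTable("demo_db.avengers")

    // Read the table back through Spark SQL and print each row.
    hiveContext.sql("SELECT * FROM demo_db.avengers").collect().foreach(println)

    sc.stop()
  }
}
```

Because the master is set with setIfMissing, the same class can also be packaged and launched with spark-submit, where the master given on the command line takes effect.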


Hope this helps. Let us know if you already got past it and/or if you are still stuck.

Good Luck!