
Saving DataFrame using Scala in IntelliJ throws exception


I am trying to load a CSV or an XML file into a pre-existing Hive table from a Spark Scala application built in IntelliJ, and the job throws the exceptions below on the last step, while saving the DataFrame.

Oddly, the code below works fine in spark-shell, without any issues, in all four cases.

1. When I use HiveContext and insertInto():

val sparkConf = new SparkConf().setAppName("TEST")
val sc = new SparkContext(sparkConf)
val hiveContext = new HiveContext(sc)
hiveContext.setConf("hive.exec.dynamic.partition", "true")
hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
println("CONFIG DONE!!!!!")
val xml = hiveContext.read.format("com.databricks.spark.xml")
  .option("rowTag", "employee")
  .load("/PUBLIC_TABLESPACE/updatedtest1.xml")
println("XML LOADED!!!!!!")
xml.write.format("parquet")
  .mode("overwrite")
  .partitionBy("designation")
  .insertInto("test2")
println("TABLE SAVED!!!!!!!")

Exception in thread "main" java.lang.NoSuchMethodException: org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(org.apache.hadoop.fs.Path, java.lang.String, java.util.Map, boolean, int, boolean, boolean, boolean)
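This NoSuchMethodException (as opposed to a NoSuchMethodError) comes from a reflective lookup, which suggests the hive-exec classes on the application classpath do not have the method signature Spark expects. A minimal diagnostic sketch to see which jar the Hive class is actually loaded from, so it can be compared with the one spark-shell sees:

val hiveClass = classOf[org.apache.hadoop.hive.ql.metadata.Hive]
// Location of the jar that provided the Hive class on this JVM's classpath;
// if it differs from the cluster's hive-exec jar, the signatures can diverge.
println(hiveClass.getProtectionDomain.getCodeSource.getLocation)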

2. When I use HiveContext and saveAsTable():

val sparkConf = new SparkConf().setAppName("TEST")
val sc = new SparkContext(sparkConf)
val hiveContext = new HiveContext(sc)
hiveContext.setConf("hive.exec.dynamic.partition", "true")
hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
println("CONFIG DONE!!!!!")
val xml = hiveContext.read.format("com.databricks.spark.xml")
  .option("rowTag", "employee")
  .load("/PUBLIC_TABLESPACE/updatedtest1.xml")
println("XML LOADED!!!!!!")
xml.write.format("parquet")
  .mode("overwrite")
  .partitionBy("designation")
  .saveAsTable("test2")

Exception in thread "main" java.lang.NoSuchMethodException: org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(org.apache.hadoop.fs.Path, java.lang.String, java.util.Map, boolean, int, boolean, boolean, boolean)

3. When I use SQLContext and insertInto():

val sparkConf = new SparkConf().setAppName("TEST")
val sc = new SparkContext(sparkConf)
val hiveContext = new SQLContext(sc)
hiveContext.setConf("hive.exec.dynamic.partition", "true")
hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
println("CONFIG DONE!!!!!")
val xml = hiveContext.read.format("com.databricks.spark.xml")
  .option("rowTag", "employee")
  .load("/PUBLIC_TABLESPACE/updatedtest1.xml")
println("XML LOADED!!!!!!")
xml.write.format("parquet")
  .mode("overwrite")
  .partitionBy("designation")
  .insertInto("test2")
println("TABLE SAVED!!!!!!!")

Exception in thread "main" org.apache.spark.sql.AnalysisException: Table not found: test2;

4. When I use SQLContext and saveAsTable():

val sparkConf = new SparkConf().setAppName("TEST")
val sc = new SparkContext(sparkConf)
val hiveContext = new SQLContext(sc)
hiveContext.setConf("hive.exec.dynamic.partition", "true")
hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
println("CONFIG DONE!!!!!")
val xml = hiveContext.read.format("com.databricks.spark.xml")
  .option("rowTag", "employee")
  .load("/PUBLIC_TABLESPACE/updatedtest1.xml")
println("XML LOADED!!!!!!")
xml.write.format("parquet")
  .mode("overwrite")
  .partitionBy("designation")
  .saveAsTable("test2")
println("TABLE SAVED!!!!!!!")

Exception in thread "main" java.lang.RuntimeException: Tables created with SQLContext must be TEMPORARY. Use a HiveContext instead.
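For context, the last two failures are expected behavior in Spark 1.6: a plain SQLContext keeps only an in-memory catalog, so it cannot resolve the pre-existing Hive table test2 and can only create temporary tables. A minimal sketch of the distinction (the JSON input path is hypothetical):

// SQLContext: in-memory catalog only.
val sqlCtx = new SQLContext(sc)
val df = sqlCtx.read.json("/tmp/sample.json") // hypothetical input
df.registerTempTable("tmp_ok")                // fine: temporary table
// sqlCtx.table("test2")                      // fails: Table not found

// HiveContext: backed by the Hive metastore, so "test2" resolves
// and persistent writes (insertInto / saveAsTable) are possible.
val hiveCtx = new HiveContext(sc)
hiveCtx.table("test2").show()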


Re: Saving DataFrame using Scala in IntelliJ throws exception

The SQLContext errors occur because SQLContext does not use Hive. The rest of the errors, together with the fact that it works in spark-shell, tell me this is likely a classpath issue: spark-shell picks up all of the correct Hive classes, but your project in IntelliJ does not.
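If that is the cause, a common way to align the two is to mark the Spark artifacts as "provided" and run the packaged jar through spark-submit, so the driver uses the same Hive jars spark-shell does. A sketch (the main class name is hypothetical, and the jar name may differ):

// build.sbt — let the cluster's Spark/Hive jars win at runtime
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.0" % "provided"
libraryDependencies += "org.apache.spark" % "spark-hive_2.10" % "1.6.0" % "provided"
libraryDependencies += "com.databricks" % "spark-xml_2.10" % "0.4.1"

// Then, instead of running from the IDE:
//   sbt package
//   spark-submit --class TestApp target/scala-2.10/testonspark_2.10-1.0.jar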

Re: Saving DataFrame using Scala in IntelliJ throws exception

I have checked, and I am using the same library dependencies in IntelliJ; they are listed below. Also, when the table does not already exist, the code creates it, but it cannot overwrite or append to an existing table.

 

name := "testonSpark"

version := "1.0"

scalaVersion := "2.10.5"


libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.0"
libraryDependencies += "com.databricks" % "spark-xml_2.10" % "0.4.1"
libraryDependencies += "org.apache.spark" % "spark-hive_2.10" % "1.6.0"

 
