
How to write DataFrame data into a Hive table using Spark Scala


New Contributor

I am trying to push data into an existing Hive table. I have already created an ORC table in Hive, but I cannot push data into it. The code below works when I paste it into the Spark console, but it fails when run with spark-submit.

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object TestCode {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("first example").setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    // Sample values; the real job replaces this loop with business logic
    // that pushes rows into the table.
    for (i <- 0 to 100 - 1) {
      var fstring = "fstring" + i
      var cmd = "cmd" + i
      var idpath = "idpath" + i
      import sqlContext.implicits._
      val sDF = Seq((fstring, cmd, idpath)).toDF("t_als_s_path", "t_als_s_cmd", "t_als_s_pd")
      sDF.write.insertInto("l_sequence") // sDF.write.format("orc").saveAsTable("l_sequence")
      println("write data ==> " + i)
    }
  }
}

This gives the following error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Table or view not found: l_sequence;
        at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$lookupTableFromCatalog(Analyzer.scala:449)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$8.applyOrElse(Analyzer.scala:455)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$8.applyOrElse(Analyzer.scala:453)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$anonfun$resolveOperators$1.apply(LogicalPlan.scala:61)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$anonfun$resolveOperators$1.apply(LogicalPlan.scala:61)
        at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:60)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:453)
        at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:443)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1$anonfun$apply$1.apply(RuleExecutor.scala:85)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1$anonfun$apply$1.apply(RuleExecutor.scala:82)
        at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
        at scala.collection.immutable.List.foldLeft(List.scala:84)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1.apply(RuleExecutor.scala:82)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1.apply(RuleExecutor.scala:74)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:74)
        at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:65)
        at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:63)
        at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:51)
        at org.apache.spark.sql.execution.QueryExecution.withCachedData$lzycompute(QueryExecution.scala:69)
        at org.apache.spark.sql.execution.QueryExecution.withCachedData(QueryExecution.scala:68)
        at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:74)
        at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:74)
        at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:78)
        at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:76)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:83)
        at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:83)
        at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
        at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
        at org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:259)
        at org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:239)
        at com.hq.bds.Helloword$anonfun$main$1.apply$mcVI$sp(Helloword.scala:16)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
        at com.hq.bds.Helloword$.main(Helloword.scala:10)
        at com.hq.bds.Helloword.main(Helloword.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:729)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Re: How to write DataFrame data into a Hive table using Spark Scala

Expert Contributor

1. Import SparkSession; in Spark 2.x it replaces the deprecated SQLContext (and HiveContext).

import org.apache.spark.sql.SparkSession
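
A minimal sketch of the replacement, assuming Spark 2.x: build a SparkSession with Hive support enabled so that insertInto can resolve tables in the Hive metastore (the app name here is illustrative):

import org.apache.spark.sql.SparkSession

// Hive-enabled session; supersedes SQLContext/HiveContext in Spark 2.x
val spark = SparkSession
  .builder()
  .appName("first example") // illustrative name
  .enableHiveSupport()      // without this, "l_sequence" is looked up in the
                            // session-local catalog and is not found
  .getOrCreate()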

2. Check that you have hive-site.xml in the "/usr/lib/spark/conf" directory.

You can also try passing it explicitly with spark-submit:

 --files /usr/hdp/current/spark-client/conf/hive-site.xml 
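
For reference, a complete invocation might look like the following sketch; the master, jar path, and conf path are illustrative and depend on your cluster (the class name is taken from the stack trace above):

spark-submit \
  --class com.hq.bds.Helloword \
  --master yarn \
  --files /usr/hdp/current/spark-client/conf/hive-site.xml \
  /path/to/your-app.jar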

Re: How to write DataFrame data into a Hive table using Spark Scala

Expert Contributor

Try this (HCC has buggy formatting issues; ignore that, and note the import and the spark val):

import org.apache.spark.sql.SparkSession

object TestCode {

  def main(args: Array[String]): Unit = {

    val spark = SparkSession
      .builder()
      .appName("SilverTailParser")
      .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .enableHiveSupport()
      .getOrCreate()

    /*
     * Sample values; the real job replaces this loop
     * with business logic that pushes rows into the table.
     */
    for (i <- 0 to 100 - 1) {
      var fstring = "fstring" + i
      var cmd = "cmd" + i
      var idpath = "idpath" + i

      import spark.implicits._ // NOTE: implicits come from the SparkSession

      val sDF = Seq((fstring, cmd, idpath)).toDF("t_als_s_path", "t_als_s_cmd", "t_als_s_pd")
      sDF.write.insertInto("l_sequence")
      println("write data ==> " + i)
    }
  }
}
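
Note that insertInto does not create the table; it must already exist in the metastore. As a quick sanity check, assuming the table lives in the default database, you can list what the Hive-enabled session actually sees before writing:

// Both of these should succeed and show l_sequence and its three columns
spark.sql("show tables").show()
spark.sql("describe l_sequence").show()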