Member since
06-29-2017
7
Posts
3
Kudos Received
0
Solutions
07-12-2017
04:37 AM
1 Kudo
Thank you, Guna. I linked the Hive configuration file to Spark with ln -s /etc/hive/conf/hive-site.xml /etc/spark/conf/hive-site.xml. It started working only after a restart.
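The fix described above can be sketched as follows. This is a minimal sketch assuming the CDH QuickStart VM's default paths; the exact service restart step depends on how Spark is managed (Cloudera Manager vs. init scripts), so treat the restart command as an example, not the only option.

```shell
# Let Spark see the Hive metastore settings by linking Hive's
# hive-site.xml into Spark's configuration directory
# (paths as on the CDH QuickStart VM; adjust for your install).
sudo ln -s /etc/hive/conf/hive-site.xml /etc/spark/conf/hive-site.xml

# Verify the link points where we expect.
ls -l /etc/spark/conf/hive-site.xml

# Then restart any running spark-shell sessions (and, if applicable,
# Spark services via Cloudera Manager) so the new config is picked up.
```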
07-04-2017
10:31 PM
Hi Guna, I tried what you suggested. Both the default and cloudera databases contain tables, but none are shown. When I run "show databases" it returns only default, not both. Please see below:

scala> val df = hiveObj.sql("show tables in cloudera")
df: org.apache.spark.sql.DataFrame = [tableName: string, isTemporary: boolean]

scala> df.show()
+---------+-----------+
|tableName|isTemporary|
+---------+-----------+
+---------+-----------+

scala> val df1 = hiveObj.sql("show tables in default")
df1: org.apache.spark.sql.DataFrame = [tableName: string, isTemporary: boolean]

scala> df1.show()
+---------+-----------+
|tableName|isTemporary|
+---------+-----------+
+---------+-----------+

scala> val df2 = hiveObj.sql("show databases")
df2: org.apache.spark.sql.DataFrame = [result: string]

scala> df2.show()
+-------+
| result|
+-------+
|default|
+-------+
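Output like the above (only an empty default database) is the classic symptom of Spark starting its own empty local Derby metastore because it cannot find hive-site.xml, rather than connecting to the shared Hive metastore. A hypothetical spark-shell check, assuming Spark 1.x on CDH where `sc` is provided by the shell:

```scala
// Minimal diagnostic sketch for spark-shell (Spark 1.x); `sc` is the
// shell's SparkContext. If Spark never loaded hive-site.xml, it spins
// up a fresh local Derby metastore, so only an empty "default"
// database is visible.
import org.apache.spark.sql.hive.HiveContext

val hiveCtx = new HiveContext(sc)

// Check which warehouse directory this session resolved. A local
// path instead of the cluster's /user/hive/warehouse suggests the
// wrong (local) metastore is being used.
println(hiveCtx.getConf("hive.metastore.warehouse.dir"))

hiveCtx.sql("show databases").show()
```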
07-04-2017
02:18 AM
Hi Guna, I did as you said, but the same thing keeps happening. Please see below:

scala> import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.HiveContext

scala> import sqlContext.implicits._
import sqlContext.implicits._

scala> val hiveObj = new HiveContext(sc)
17/07/04 02:10:55 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
hiveObj: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@3474ddfe

scala> hiveObj.refreshTable("cloudera.test1")

scala> val s = hiveObj.sql("select * from cloudera.test1").collect()
org.apache.spark.sql.AnalysisException: Table not found: `cloudera`.`test1`; line 1 pos 23
    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:54)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:50)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:121)
07-03-2017
11:17 PM
1 Kudo
Thank you, Saranvisa and csguna, for the quick replies. Please see below. The table exists and its data loads:

scala> val l = sc.textFile("/user/hive/warehouse/cloudera.db/test1").collect().foreach(println)
1|Raj|200
2|Rahul|300
3|Ram|400
4|Sham|250
5|John|500
l: Unit = ()

But the following does not work; could you please check it (a metastore warning also appears in the second step)? I have tried different ways, but it keeps saying "table not found":

scala> import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.HiveContext

scala> val sqlContext = new HiveContext(sc)
17/07/03 22:47:30 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@36bcf0b6

scala> import sqlContext.implicits._
import sqlContext.implicits._

scala> val r = sqlContext.sql("select * from cloudera.test1")
org.apache.spark.sql.AnalysisException: Table not found: `cloudera`.`test1`; line 1 pos 23
    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:54)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:50)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:121)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:120)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:120)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:120)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:50)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
    at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44)
    at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:48)
    at $iwC$$iwC$$iwC.<init>(<console>:50)
    at $iwC$$iwC.<init>(<console>:52)
    at $iwC.<init>(<console>:54)
    at <init>(<console>:56)
    at .<init>(<console>:60)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1045)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1326)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:821)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:852)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:800)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1064)
    at org.apache.spark.repl.Main$.main(Main.scala:35)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
07-02-2017
10:36 PM
Thank you for your reply, but it is still not loading; I still get the error "table not found". Could you please show how to load Hive data in spark-shell? (I'm using the Cloudera QuickStart 5.10 VM in VirtualBox.)
06-29-2017
10:34 PM
I have downloaded the Cloudera QuickStart 5.10 VM for VirtualBox, but it's not loading Hive data into Spark:

import org.apache.spark.sql.hive.HiveContext
import sqlContext.implicits._
val hiveObj = new HiveContext(sc)
hiveObj.refreshTable("db.table") // if you have upgraded your Hive, do this to refresh the tables
val sample = sqlContext.sql("select * from table").collect()
sample.foreach(println)

I'm still getting a "table not found" error (it's not accessing the metadata). What should I do? Can anyone please help me? (In the Cloudera QuickStart VM we are unable to copy hive-site.xml into spark/conf.)
Labels:
- Cloudera Manager
- Cloudera Navigator
06-29-2017
04:39 AM
1 Kudo
I have downloaded the Cloudera QuickStart 5.10 VM for VirtualBox, but it's not loading Hive data into Spark:

import org.apache.spark.sql.hive.HiveContext
import sqlContext.implicits._
val hiveObj = new HiveContext(sc)
hiveObj.refreshTable("db.table") // if you have upgraded your Hive, do this to refresh the tables
val sample = sqlContext.sql("select * from table").collect()
sample.foreach(println)

I'm still getting a "table not found" error (it's not accessing the metadata). What should I do? Can anyone please help me?
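One detail worth checking in the snippet above: the HiveContext is bound to hiveObj, but the query goes through sqlContext, so the two may not be the same context depending on the shell setup. A minimal sketch of the intended flow, assuming a Spark 1.x spark-shell where `sc` is provided, with "db.table" as a placeholder name:

```scala
// Sketch only: run the query through the same HiveContext that was
// created, using a fully qualified database.table name.
import org.apache.spark.sql.hive.HiveContext

val hiveObj = new HiveContext(sc)

// Query via hiveObj itself, not a separately created sqlContext.
val sample = hiveObj.sql("select * from db.table").collect()
sample.foreach(println)
```

This only helps once Spark can actually reach the Hive metastore (i.e. hive-site.xml is visible under Spark's conf directory); otherwise the "table not found" error remains.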