Created on 01-25-2016 05:07 PM - edited 09-16-2022 02:59 AM
Hi,
I am trying to access an existing Hive table from the Spark shell,
but when I run the instructions I get a "table not found" error.
For example, a table named "department" exists in the default database in Hive.
I start the spark-shell and execute the following instructions:
import org.apache.spark.sql.hive.HiveContext
val sqlContext = new HiveContext(sc)
val depts = sqlContext.sql("select * from departments")
depts.collect().foreach(println)
but it couldn't find the table.
Now my questions are:
1. As far as I know, Spark can access the Hive metastore by using HiveContext. But that is not happening here, so is there any configuration required? I am using Cloudera QuickStart VM 5.5.
2. As an alternative, I created the table in the spark-shell, loaded a data file, ran some queries, and then exited the spark-shell.
3. Even though I created the table using the spark-shell, it does not appear anywhere when I try to access it from the Hive editor.
4. When I start the spark-shell again, the table I created earlier no longer exists. So where exactly are this table and its metadata stored?
I am very confused, because according to the theoretical concepts, it should go under the Hive metastore.
Thanks & Regards
Created 02-18-2019 01:34 PM
Hi there,
Just in case someone still needs the solution, here is what I tried, and it works.
spark-shell --driver-java-options "-Dhive.metastore.uris=thrift://quickstart:9083"
I am using Spark 1.6 with the Cloudera VM.
val df=sqlContext.sql("show databases")
df.show
You should be able to see all the databases in Hive. I hope it helps.
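As a possible follow-up check, here is a minimal sketch to run inside a spark-shell started with the option above. It assumes that the shell's predefined sqlContext is Hive-enabled (as it normally is in a Spark 1.6 build with Hive support) and that the metastore service is reachable at the thrift URI given on the command line. The database name "default" comes from the original question.

```scala
// Run inside a spark-shell launched with the -Dhive.metastore.uris option.
// sqlContext is the context that spark-shell creates for you.

// List the databases known to the metastore; Hive databases should appear here.
sqlContext.sql("show databases").show()

// List the tables in the "default" database (name taken from the question).
// If the Hive table is missing from this list, Spark is still not talking
// to the same metastore that Hive uses.
sqlContext.tableNames("default").foreach(println)
```

If only an empty or unfamiliar list comes back, that would suggest Spark has fallen back to its own local metastore rather than the one Hive is using.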
Created 11-09-2017 02:42 AM
Created 11-09-2017 03:29 AM
On the Spark configuration page I don't have a Hive checkbox either.
Try to install another version of Spark.
Created 02-23-2019 02:41 AM
Created 05-24-2017 05:17 AM
Hi,
Did you fix this issue?
Created 05-20-2018 11:49 PM
Try "select * from db.table" in line 3
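A minimal sketch of that suggestion, assuming the table lives in the "default" database and is named "departments" as in the original query (the question text also calls it "department", so use whichever name Hive actually shows):

```scala
import org.apache.spark.sql.hive.HiveContext

// sc is the SparkContext that spark-shell provides.
val sqlContext = new HiveContext(sc)

// Qualify the table with its database so the lookup does not depend
// on which database is currently selected.
// "default.departments" is an assumption based on the question.
val depts = sqlContext.sql("select * from default.departments")
depts.collect().foreach(println)
```

Qualifying the name only helps when Spark is already connected to the right metastore; it does not fix a missing metastore configuration.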
Created 10-15-2018 11:45 PM
Hi,
I am trying to access an existing Hive table from pyspark.
For example, a table named "department" exists in the default database in Hive.
Error message:
18/10/15 22:01:23 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
18/10/15 22:02:35 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.1.0-cdh5.13.0
18/10/15 22:02:38 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
I checked the files below; they are identical.
/usr/lib/hive/conf/hive-site.xml
/usr/lib/spark/conf/hive-site.xml
Any help on how to set up the HiveContext from pyspark would be highly appreciated.
Created 03-13-2023 01:11 PM
You are a lifesaver. I had been struggling with this for 7-8 hours, and my deadline to submit a case study was close. Thanks a lot!