Explorer
Posts: 13
Registered: ‎01-08-2016

how to access the hive tables from spark-shell

Hi,


I am trying to access an existing Hive table from spark-shell.

But when I run the commands, I get a "table not found" error.

For example, Hive has a table named "department" in the default database.


I start spark-shell and run the following commands:


import org.apache.spark.sql.hive.HiveContext
val sqlContext = new HiveContext(sc)
val depts = sqlContext.sql("select * from departments")
depts.collect().foreach(println)



But it couldn't find the table.



Now my questions are:

1. As far as I know, Spark can access the Hive metastore through HiveContext. That is not happening here, so is some configuration required? I am using the Cloudera QuickStart VM 5.5.

2. As an alternative, I created a table in spark-shell, loaded a data file, ran some queries, and then exited spark-shell.

3. Even the table I created in spark-shell does not show up anywhere when I try to access it from the Hive editor.

4. When I started spark-shell again, the table I had created earlier no longer existed. So where exactly are this table and its metadata stored?


I am very confused, because in theory it should go into the Hive metastore.

Thanks & Regards

Explorer
Posts: 13
Registered: ‎01-08-2016

Re: how to access the hive tables from spark-shell

To connect to the Hive metastore, you need to copy the hive-site.xml file into the spark/conf directory. After that, Spark will be able to connect to the Hive metastore.
So run the following command after logging in as the root user:

 

cp /usr/lib/hive/conf/hive-site.xml /usr/lib/spark/conf/
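
To confirm that the copy landed where Spark looks for it, something like the following sketch may help. It assumes the paths used above; check_hive_site is a made-up helper name, not a Spark or Hive tool. It reports whether hive-site.xml is present in a conf directory and whether it mentions the hive.metastore.uris property. A file without that property is not necessarily wrong, since the metastore can also be reached through a direct JDBC connection configured in the same file.

```shell
# Sketch only: check_hive_site is a hypothetical helper, not part of Spark or Hive.
# Prints "missing", "no-metastore-uris", or "ok" for the given conf directory.
check_hive_site() {
    conf_dir="$1"
    site="$conf_dir/hive-site.xml"

    # -e is false for a missing file and also for a dangling symlink
    if [ ! -e "$site" ]; then
        echo "missing"
        return 1
    fi

    # hive.metastore.uris is the Hive property that names a remote metastore
    if grep -q "hive.metastore.uris" "$site"; then
        echo "ok"
    else
        echo "no-metastore-uris"
        return 2
    fi
}

# Example call, using the directory from the command above:
# check_hive_site /usr/lib/spark/conf
```

After copying the file, restart spark-shell so the new configuration is picked up.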

New Contributor
Posts: 6
Registered: ‎11-10-2015

Re: how to access the hive tables from spark-shell

Or you can create a symbolic link, which avoids having to keep two copies of the file in sync:

ln -s /usr/lib/hive/conf/hive-site.xml /usr/lib/spark/conf/hive-site.xml
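
If the command has to be re-run, for example because a copied hive-site.xml is already sitting in the Spark conf directory, a plain ln -s fails with "File exists". A minimal sketch of a repeatable version follows; link_hive_site is a hypothetical helper name, and the paths in the comments are the ones from this thread and may differ on other installs.

```shell
# Sketch only: link_hive_site is a hypothetical helper name.
# Creates or refreshes <dest_dir>/hive-site.xml as a symlink to <src>.
link_hive_site() {
    src="$1"        # e.g. /usr/lib/hive/conf/hive-site.xml
    dest_dir="$2"   # e.g. /usr/lib/spark/conf

    if [ ! -f "$src" ]; then
        echo "source not found: $src" >&2
        return 1
    fi

    # -s: symbolic link, -f: replace an existing file or link,
    # -n: do not follow an existing link at the destination
    ln -sfn "$src" "$dest_dir/hive-site.xml"
}

# Example call:
# link_hive_site /usr/lib/hive/conf/hive-site.xml /usr/lib/spark/conf
```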
Explorer
Posts: 13
Registered: ‎11-22-2016

Re: how to access the hive tables from spark-shell

The issue still persists.

What else can we do to make it work, other than hive-site.xml?

Explorer
Posts: 13
Registered: ‎11-22-2016

Re: how to access the hive tables from spark-shell


Which version of Spark are you using?

Assuming you are using 1.4 or higher:

 

import org.apache.spark.sql.hive.HiveContext
val hiveObj = new HiveContext(sc)
import hiveObj.implicits._

// If you have upgraded Hive, refresh the cached metadata for the table first.
hiveObj.refreshTable("db.table")

// Query through the HiveContext created above, not the shell's default sqlContext.
val sample = hiveObj.sql("select * from db.table").collect()
sample.foreach(println)

 

This has worked for me
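
If the table is still not found, it may be worth checking whether Spark quietly fell back to a local metastore. As far as I know, when HiveContext cannot find a hive-site.xml it creates an embedded Derby metastore in a metastore_db directory under the directory spark-shell was started from, and tables created there are not visible to Hive. A small sketch of that check, where has_local_metastore is a made-up helper name:

```shell
# Sketch only: has_local_metastore is a hypothetical helper name.
# Looks for a metastore_db directory, which Spark creates when it falls
# back to an embedded Derby metastore instead of the shared Hive one.
has_local_metastore() {
    dir="${1:-.}"   # default: the current directory
    if [ -d "$dir/metastore_db" ]; then
        echo "local Derby metastore found in $dir"
    else
        echo "no local metastore in $dir"
        return 1
    fi
}

# Example call, run from the directory where spark-shell was started:
# has_local_metastore .
```

Finding such a directory suggests that Spark did not pick up hive-site.xml in that session.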

New Contributor
Posts: 2
Registered: ‎05-24-2017

Re: how to access the hive tables from spark-shell

Hi,

 

Did you fix this issue?

Explorer
Posts: 7
Registered: ‎06-29-2017

Re: how to access the hive tables from spark-shell

I have downloaded the Cloudera QuickStart VM 5.10 for VirtualBox.

But it's not loading Hive data into Spark.

 

import org.apache.spark.sql.hive.HiveContext
import sqlContext.implicits._
val hiveObj = new HiveContext(sc)

hiveObj.refreshTable("db.table") // if you have upgraded your Hive, do this to refresh the tables.

val sample = sqlContext.sql("select * from table").collect()
sample.foreach(println)

 

I'm still getting a "table not found" error (it's not accessing the metadata).

What should I do? Can anyone please help me?

New Contributor
Posts: 2
Registered: ‎05-09-2017

Re: how to access the hive tables from spark-shell


I'm having the same issue. I'm using CDH 5.10 with Spark on YARN.

 

Also, is there a way to include hive-site.xml through Cloudera Manager? At the moment I have a script that makes sure the symlink is there (and links to the correct hive-site.xml) on every node in the cluster, but getting Cloudera Manager to do it for me would be easier, faster, and less error-prone.
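
For reference, the per-node check in such a script could look roughly like the sketch below. It is only an illustration: verify_hive_link is a hypothetical name, and the paths in the comments are the ones used earlier in this thread, which may not match every install.

```shell
# Sketch only: verify_hive_link is a hypothetical helper name.
# Prints "ok", "not-a-symlink", "wrong-target: <path>", or "dangling".
verify_hive_link() {
    link="$1"       # e.g. /usr/lib/spark/conf/hive-site.xml
    expected="$2"   # e.g. /usr/lib/hive/conf/hive-site.xml

    if [ ! -L "$link" ]; then
        echo "not-a-symlink"
        return 1
    fi

    # readlink prints the path stored in the link, without resolving it
    target=$(readlink "$link")
    if [ "$target" != "$expected" ]; then
        echo "wrong-target: $target"
        return 2
    fi

    # -e follows the link, so it fails when the target file is gone
    if [ ! -e "$link" ]; then
        echo "dangling"
        return 3
    fi

    echo "ok"
}

# Example call:
# verify_hive_link /usr/lib/spark/conf/hive-site.xml /usr/lib/hive/conf/hive-site.xml
```

The distinct return codes make it possible to tell the failure cases apart when the check is run on each node in turn.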
