Support Questions

fnushelly · ‎01-25-2016

Hi,

I am trying to access an existing Hive table from the spark-shell.

But when I run the instructions, I get the error "table not found".

For example, a table named "department" exists in the default database in Hive.

I start the spark-shell and execute the following instructions:

import org.apache.spark.sql.hive.HiveContext
val sqlContext = new HiveContext(sc)
val depts = sqlContext.sql("select * from departments")
depts.collect().foreach(println)

But it couldn't find the table.

Now my questions are:

1. As far as I know, Spark can access the Hive metastore by using HiveContext. That is not happening here, so is there any configuration required? I am using the Cloudera QuickStart VM 5.5.

2. As an alternative, I created the table in the spark-shell, loaded a data file, ran some queries, and then exited the spark-shell.

3. Even though I created the table in the spark-shell, it does not appear anywhere when I try to access it from the Hive editor.

4. When I started the spark-shell again, the table I had created earlier no longer existed. So where exactly are this table and its metadata stored?

I am very confused, because according to the theory, it should go into the Hive metastore.

Thanks & Regards

MichelleY · ‎02-18-2019

Hi there,

Just in case someone still needs the solution, here is what I tried, and it works.

spark-shell --driver-java-options "-Dhive.metastore.uris=thrift://quickstart:9083"

I am using Spark 1.6 with the Cloudera VM.

val df=sqlContext.sql("show databases")

df.show

You should be able to see all the databases in Hive. I hope it helps.
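
The metastore address passed on the command line above can also be read out of hive-site.xml rather than typed by hand. Here is a rough sketch of that idea, not a tested recipe: the function name metastore_uri and the HIVE_SITE variable are made up for illustration, the path is the one mentioned in this thread, and the parsing assumes the <value> element sits on the same line as the <name> element or on the line right after it.

```shell
#!/bin/sh
# Hypothetical helper: print the value of hive.metastore.uris from a hive-site.xml.
# Assumes <value> is on the same line as <name>, or on the line directly after it.
metastore_uri() {
    grep -A1 '<name>hive.metastore.uris</name>' "$1" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p' \
        | head -n 1
}

# Path taken from this thread; adjust it for your installation.
HIVE_SITE="${HIVE_SITE:-/usr/lib/hive/conf/hive-site.xml}"

if [ -f "$HIVE_SITE" ]; then
    uri="$(metastore_uri "$HIVE_SITE")"
    if [ -n "$uri" ]; then
        # Only print the command; run it yourself once the URI looks right.
        echo "spark-shell --driver-java-options \"-Dhive.metastore.uris=$uri\""
    else
        echo "hive.metastore.uris not found in $HIVE_SITE" >&2
    fi
fi
```

If nothing is printed, the file was not found at that path. If the property is missing from the file, the script says so on stderr instead of printing a command.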

fnushelly · ‎01-26-2016

To connect to the Hive metastore, you need to copy the hive-site.xml file into the spark/conf directory. After that, Spark will be able to connect to the Hive metastore.
So run the following command after logging in as the root user:

cp /usr/lib/hive/conf/hive-site.xml /usr/lib/spark/conf/
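
A quick way to see whether the copy is in place is sketched below. As far as I know, when Spark 1.x finds no hive-site.xml, HiveContext falls back to a local Derby metastore and creates a metastore_db directory in whatever directory spark-shell was started from. That would explain tables that never show up in Hive and seem to vanish between sessions. The function name check_metastore_setup is made up for illustration, and SPARK_CONF_DIR defaulting to /usr/lib/spark/conf is an assumption based on the path above.

```shell
#!/bin/sh
# Hypothetical diagnostic: hint at whether spark-shell will use the shared
# Hive metastore or a local Derby one.
# $1 = Spark conf directory, $2 = directory spark-shell is started from.
check_metastore_setup() {
    conf_dir="$1"
    work_dir="$2"
    if [ -f "$conf_dir/hive-site.xml" ]; then
        echo "hive-site.xml found in $conf_dir"
    else
        echo "hive-site.xml missing from $conf_dir"
    fi
    # A metastore_db directory here suggests an earlier session used local Derby.
    if [ -d "$work_dir/metastore_db" ]; then
        echo "local metastore_db present in $work_dir"
    fi
}

check_metastore_setup "${SPARK_CONF_DIR:-/usr/lib/spark/conf}" "$PWD"
```

Run it from the same directory where you normally start spark-shell, otherwise the metastore_db check looks in the wrong place.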

sofiane · ‎09-01-2016

Or create a symbolic link instead, so that you do not have to keep two copies of the file in sync:

ln -s /usr/lib/hive/conf/hive-site.xml /usr/lib/spark/conf/hive-site.xml
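
If the link has to be recreated from a script (for example after a stale copy was left behind), something along these lines may help. It is only a sketch: link_hive_site is a made-up name, and it assumes the destination is a regular file or a link, not a directory.

```shell
#!/bin/sh
# Hypothetical helper: make $2 a symbolic link to $1, replacing a stale copy or link.
link_hive_site() {
    src="$1"
    dest="$2"
    # Refuse to create a dangling link if the Hive config is not where we expect.
    [ -f "$src" ] || { echo "source not found: $src" >&2; return 1; }
    # -s: symbolic link, -f: replace an existing file or link at the destination
    ln -sf "$src" "$dest"
    # Confirm that the link now points at the intended file.
    [ "$(readlink "$dest")" = "$src" ]
}
```

With the paths from this thread it would be called as link_hive_site /usr/lib/hive/conf/hive-site.xml /usr/lib/spark/conf/hive-site.xml, run as a user allowed to write to the Spark conf directory.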

jack0188 · ‎01-03-2017

The issue still persists.

What else can we do to make it work, other than hive-site.xml?

jack0188 · ‎01-14-2017

Which version of Spark are you using?

Assuming you are using 1.4 or higher:

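import org.apache.spark.sql.hive.HiveContext

// Create a Hive-aware context from the SparkContext that spark-shell provides as sc
val hiveObj = new HiveContext(sc)
import hiveObj.implicits._

// If you have upgraded Hive, refresh the cached table metadata ("db.table" is a placeholder)
hiveObj.refreshTable("db.table")

// Query through the same HiveContext and print every row
val sample = hiveObj.sql("select * from db.table").collect()
sample.foreach(println)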

This has worked for me

hadoopSparkZen · ‎06-29-2017

I have downloaded the Cloudera QuickStart VM 5.10 for VirtualBox.

But it is not loading Hive data into Spark.

import org.apache.spark.sql.hive.HiveContext
import sqlContext.implicits._
val hiveObj = new HiveContext(sc)

hiveObj.refreshTable("db.table") // if you have upgraded your Hive, do this to refresh the tables

val sample = sqlContext.sql("select * from table").collect()
sample.foreach(println)

I am still getting the error "table not found" (it is not accessing the metadata).

What should I do? Can anyone please help me?

miguelalonso · ‎07-25-2017

I'm having the same issue. I'm using CDH 5.10 with Spark on YARN.

Also, is there a way to include hive-site.xml through Cloudera Manager? At the moment I have a script that makes sure the symlink is there (and links to the correct hive-site.xml) across the whole cluster, but getting Cloudera Manager to do it for me would be easier, faster and less error-prone.

Jeno · ‎10-23-2017

Hi!

Last week I resolved the same problem for Spark 2.

To do this, I selected the Hive service dependency on the Spark 2 service Configuration page (Service-Wide category).

After the stale services were restarted, Spark 2 started to work correctly.

Sentara · ‎11-08-2017

I am having the same issue and copying the hive-site.xml did not resolve the issue for me. I am not using spark2, but the v1.6 that comes with Cloudera 5.13 - and there is no spark/hive configuration setting. Was anyone else able to figure out how to fix this? Thanks!

Jeno · ‎11-08-2017

Hi!

Have you installed the appropriate Gateways on the server where these configuration settings are required?

Cloudera Community

how to access the hive tables from spark-shell