Contributor
Posts: 56
Registered: ‎02-09-2015

Re: I am using a hive context in pyspark cdh5.3 virtual box and i get the error

I got it working in spark-shell, as below:
spark-shell --master spark://bdvm01:7077 --driver-memory 1G --executor-memory 1G

Once the shell starts:

import org.apache.spark.sql._
import org.apache.spark.sql.hive._
import org.apache.spark.sql.hive.execution.InsertIntoHiveTable

// sc is the SparkContext that spark-shell already provides
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

// load the HDFS file into one partition of the existing Hive table
sqlContext.sql("LOAD DATA INPATH '/user/admin/test_partition_1.txt' INTO TABLE `default.test_partition_2` PARTITION (c0='56')")

It worked fine for me, so I expect the same will work with spark-submit. Good luck!
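Since the original question was about PySpark, roughly the same thing as a standalone script for spark-submit might look like the sketch below (untested; the script name, app name, and comments are my own assumptions, while the file path and table come from the example above):

# load_partition.py -- hypothetical file name
# Submit with, for example:
#   spark-submit --master spark://bdvm01:7077 --driver-memory 1G --executor-memory 1G load_partition.py
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="load-partition-example")  # placeholder app name
sqlContext = HiveContext(sc)  # picks up hive-site.xml from the classpath / HADOOP_CONF_DIR

# Same LOAD DATA statement as in the spark-shell session above
sqlContext.sql("LOAD DATA INPATH '/user/admin/test_partition_1.txt' "
               "INTO TABLE `default.test_partition_2` PARTITION (c0='56')")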

New Contributor
Posts: 3
Registered: ‎01-02-2016

Re: I am using a hive context in pyspark cdh5.3 virtual box and i get the error

I got HiveContext and queries working in spark-shell too. I took another poster's advice and set HADOOP_CONF_DIR=/etc/hadoop/conf:/etc/hive/conf, which got the Hive metastore connection working in Spark. You should see messages like the following in spark-shell. (Before I added /etc/hive/conf to HADOOP_CONF_DIR, I saw messages about an embedded metastore being created, even though I have MySQL backing the Hive metastore.)

16/01/04 14:51:57 INFO metastore: Trying to connect to metastore with URI thrift://localhost:9083
16/01/04 14:51:58 INFO metastore: Connected to metastore.
:
16/01/04 14:52:02 INFO HiveContext: default warehouse location is /user/hive/warehouse
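As a quick sanity check from PySpark, something like the following minimal sketch (my own example, assuming HADOOP_CONF_DIR is set as described above) should go through the same thrift metastore and list the existing Hive tables:

from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="metastore-check")  # placeholder app name
sqlContext = HiveContext(sc)

# With /etc/hive/conf on HADOOP_CONF_DIR, the "Trying to connect to metastore
# with URI thrift://..." lines above should appear here instead of an embedded
# metastore being created.
for row in sqlContext.sql("SHOW TABLES").collect():
    print(row)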

I also got hive-server2 working so that I can create Hive tables in beeline. I think most applications, including Spark, interface with Hive using hive-server2 as the Hive client. hive-server2 has a dependency on ZooKeeper. With the correct spark-env.sh configuration, and with hive-metastore and hive-server2 in place, I can query existing tables and create new tables without problems.
