import org.apache.spark.sql._
import org.apache.spark.sql.hive._
import org.apache.spark.sql.hive.execution.InsertIntoHiveTable

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext.sql("LOAD DATA INPATH '/user/admin/test_partition_1.txt' INTO TABLE `default.test_partition_2` PARTITION (c0='56')")
and I got it working pretty well, so I'd guess the same will work for spark-submit. Good luck!
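For anyone trying the spark-submit route, a minimal invocation might look like the sketch below. The class and jar names are placeholders for your own build; shipping hive-site.xml via --files assumes your Hive client config lives under /etc/hive/conf.

# Hypothetical app class and jar names; adjust to your build.
# --files ships hive-site.xml along with the job so the HiveContext
# created inside the app can locate the metastore.
spark-submit \
  --class com.example.HivePartitionLoad \
  --master yarn \
  --files /etc/hive/conf/hive-site.xml \
  my-spark-app.jar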
I got HiveContext and queries working in spark-shell too. I took a fellow poster's advice to set HADOOP_CONF_DIR=/etc/hadoop/conf:/etc/hive/conf. That got the hive-metastore working in Spark. You should see messages like the following in spark-shell. (Before I added /etc/hive/conf to HADOOP_CONF_DIR, I saw messages indicating an embedded metastore was being created, even though I have MySQL working behind the hive-metastore.)
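Concretely, that means something like this in conf/spark-env.sh (the paths are the ones from my setup; adjust them to wherever your Hadoop and Hive client configs live):

# Put both the Hadoop and Hive client config dirs on the path so
# spark-shell picks up hive-site.xml and the thrift metastore URI
export HADOOP_CONF_DIR=/etc/hadoop/conf:/etc/hive/conf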
16/01/04 14:51:57 INFO metastore: Trying to connect to metastore with URI thrift://localhost:9083
16/01/04 14:51:58 INFO metastore: Connected to metastore.
:
16/01/04 14:52:02 INFO HiveContext: default warehouse location is /user/hive/warehouse
I also got hive-server2 working so that I can create Hive tables in beeline. I think most applications, including Spark, interface with Hive through hive-server2 as the Hive client. hive-server2 has a dependency on ZooKeeper. With a correct spark-env.sh configuration and with hive-metastore and hive-server2 in place, I can query existing tables and create new tables without problems.
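As a quick sanity check, here is roughly how I verify things from beeline. The host, port, user, and table layout below are illustrative only (10000 is the default hive-server2 port; adjust everything for your cluster):

# Connect through hive-server2 (adjust host/port/user as needed)
beeline -u jdbc:hive2://localhost:10000 -n admin

-- then, at the beeline prompt (hypothetical schema, just to prove
-- that create and query both work end to end):
CREATE TABLE smoke_test (id INT, name STRING);
SELECT COUNT(*) FROM smoke_test;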