Created 12-08-2015 07:44 PM
Java code:
DataFrame peopleDataFrame = sqlContext.createDataFrame(rowRDD, schema);
HiveContext hiveContext = new org.apache.spark.sql.hive.HiveContext( jsc.sc() );
hiveContext.sql("CREATE TABLE IF NOT EXISTS people_t1 (emp_id string, first_name string, last_name string, job_title string, mgr_emp_id string)");
// Register the DataFrame as a table.
peopleDataFrame.registerTempTable("people");
....
peopleDataFrame.insertInto("default.people_t1", true);
This fails with:
java.lang.RuntimeException: Table Not Found: default.people_t1
However, the table does exist in Hive:
hive> describe people_t1;
OK
emp_id string
first_name string
last_name string
job_title string
mgr_emp_id string
Time taken: 0.284 seconds, Fetched: 5 row(s)
Created 12-08-2015 09:50 PM
Figured it out: it has to be a HiveContext, not a SQLContext. After making the change below, it works:
HiveContext hiveContext = new org.apache.spark.sql.hive.HiveContext(sc.sc());
//SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
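For anyone landing here later, here is a minimal sketch of how the whole flow might look with that change applied, so that the DataFrame is created from the same HiveContext that knows about the Hive table. It assumes Spark 1.3.x with Hive support on the classpath. The class name PeopleHiveInsert, the buildSchema() helper, and the single sample row are made up for illustration; only the table definition and the insertInto call come from the question.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.hive.HiveContext;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Hypothetical class name, used only for this sketch.
public class PeopleHiveInsert {

    // Column names taken from the CREATE TABLE statement in the question.
    static final String[] COLUMNS = {
        "emp_id", "first_name", "last_name", "job_title", "mgr_emp_id"
    };

    // Hypothetical helper: builds an all-string schema matching people_t1.
    static StructType buildSchema() {
        StructField[] fields = new StructField[COLUMNS.length];
        for (int i = 0; i < COLUMNS.length; i++) {
            fields[i] = DataTypes.createStructField(COLUMNS[i], DataTypes.StringType, true);
        }
        return DataTypes.createStructType(fields);
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("PeopleHiveInsert");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // A single HiveContext is used for everything; no plain SQLContext.
        HiveContext hiveContext = new HiveContext(sc.sc());

        hiveContext.sql("CREATE TABLE IF NOT EXISTS people_t1 "
            + "(emp_id string, first_name string, last_name string, "
            + "job_title string, mgr_emp_id string)");

        // Made-up sample row; in the original code rowRDD came from elsewhere.
        List<Row> rows = Arrays.asList(
            RowFactory.create("1", "Jane", "Doe", "Engineer", "0"));
        JavaRDD<Row> rowRDD = sc.parallelize(rows);

        // The DataFrame is created from the HiveContext, so insertInto
        // resolves the table name through the Hive metastore.
        DataFrame peopleDataFrame = hiveContext.createDataFrame(rowRDD, buildSchema());
        peopleDataFrame.insertInto("default.people_t1", true); // true = overwrite

        sc.stop();
    }
}
```

The key difference from the code in the question is that createDataFrame is called on hiveContext rather than on sqlContext. A DataFrame built from a plain SQLContext presumably looks the table up in that context's own catalog, which would explain the "Table Not Found" error even though Hive can see the table.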
Created 12-08-2015 08:12 PM
Which version of Spark are you using? I assume Spark 1.5.1 from the Tech Preview. Have you tried the saveAsTable() method of the DataFrameWriter class? That may not work for you if you really are INSERTing records into an existing table, but it could work if you mean to OVERWRITE the table.
Documentation here:
http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.sql.DataFrameWriter
Created 12-08-2015 08:45 PM
I am using Spark 1.3.1. It seems that saveAsTable() creates an internal Spark table source.