
How to load Hive transactional tables with Spark?


Hi everybody,

I have tried hard to load a Hive transactional table with Spark 2.2, but without success. Below are my notes:

1. Using HiveContext or SparkSession (hiveContext.sql or spark.sql):
   non-transactional table: works; transactional table: not supported.

2. Using JDBC to connect to Hive:
   spark.read.format("jdbc").options(Map("url" -> url, "user" -> user, "password" -> password, "dbtable" -> "table_test")).load()
   OR
   spark.read.format("jdbc").option("url", url).option("driver", "org.apache.hive.jdbc.HiveDriver").option("dbtable", "user_tnguy11.table_test").load().show()
   non-transactional table: returns an empty table; transactional table: returns an empty table.

3. Reading the ORC files directly:
   spark.read.format("orc").load("/apps/hive/warehouse/user_tnguy11.db/table_orc_test")
   non-transactional table: works; transactional table: fails.
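For reference, the first attempt can be reproduced in spark-shell roughly like this (a minimal sketch; the table name is taken from my tests above and is illustrative):

```scala
// Minimal sketch for Spark 2.2 (spark-shell). Requires a Hive-enabled Spark build
// and access to the Hive metastore.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("hive-acid-test")
  .enableHiveSupport()   // needed so spark.sql can see Hive tables
  .getOrCreate()

// Works for a non-transactional table:
spark.sql("SELECT * FROM user_tnguy11.table_orc_test").show()

// The same query against a transactional (ACID) table does not work,
// because Spark 2.2 cannot read the ACID base/delta file layout.
```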

I am new to Spark. Any help would be appreciated. Thanks a lot.




Expert Contributor

Hi, @Tu Nguyen

Are you using HDP 2.6.3+? If so, you can try the SPARK-LLAP connector. It is intended for secure environments (Kerberos and Ranger), but it can read all Hive tables, including transactional ones, through LLAP.

For HDP 2.6.3+ with Spark 2.2 I haven't written an updated article, but the steps are almost the same, except that the SPARK-LLAP jar is already bundled with HDP 2.6.3+, so you don't need to download it.
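Roughly, usage looks like the sketch below. Note the configuration keys and the launch command are my assumptions based on the spark-llap project; check the HDP documentation for your exact version:

```scala
// Sketch only: config names below are assumptions from the spark-llap project
// for HDP 2.6.3 / Spark 2.2, not verified against your cluster.
//
// Launch spark-shell with the bundled connector enabled, e.g.:
//   spark-shell --conf spark.sql.hive.llap=true \
//               --conf spark.sql.hive.hiveserver2.jdbc.url=<your HiveServer2 JDBC URL>
//
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("llap-read")
  .enableHiveSupport()
  .getOrCreate()

// With LLAP enabled, spark.sql reads go through LLAP, which understands
// the ACID file layout, so transactional tables can be queried as well:
spark.sql("SELECT * FROM user_tnguy11.table_test").show()
```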