Support Questions


How to load Hive transactional tables with Spark?

Explorer

Hi everybody,

I have tried hard to load a Hive transactional table with Spark 2.2, but without success. Here are my notes:

1. Using hiveContext or SparkSession (hiveContext.sql or spark.sql):
   - non-transactional table: works
   - transactional table: not supported

2. Using JDBC to connect to Hive:

   hiveContext.read.format("jdbc").options(Map("url" -> url, "user" -> user, "password" -> password, "dbtable" -> "table_test")).load()

   or

   sparkSession.read.format("jdbc").option("url", url).option("driver", "org.apache.hive.jdbc.HiveDriver").option("dbtable", "user_tnguy11.table_test").load().show()

   - non-transactional table: returns an empty table
   - transactional table: returns an empty table

3. Using ORC files directly:

   hiveContext.read.format("orc").load("/apps/hive/warehouse/user_tnguy11.db/table_orc_test")

   - non-transactional table: works
   - transactional table: fails
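For what it's worth, the direct ORC read fails on transactional tables because Hive ACID writes land in delta_* subdirectories with row-level transaction metadata that Spark's plain ORC reader cannot interpret. A workaround sometimes used (not an official solution) is to run a major compaction in Hive so the data is merged into base_* files, then point Spark at those. A minimal sketch, assuming the table_test table and warehouse path from above:

```scala
// In Hive/beeline first, merge the delta files into a base directory
// (standard HiveQL for ACID tables):
//   ALTER TABLE user_tnguy11.table_test COMPACT 'major';
//
// Once the compaction has finished, Spark's ORC reader can load the
// compacted base files (the base_* path pattern is an assumption here,
// check with `hdfs dfs -ls` what your table directory actually contains):
val df = spark.read
  .format("orc")
  .load("/apps/hive/warehouse/user_tnguy11.db/table_test/base_*")
df.show()
```

Note this only reflects data as of the last compaction; rows still sitting in newer delta files remain invisible to Spark.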

I am new to Spark. Any help would be appreciated. Thanks a lot.

Regards,

Tu

1 REPLY

Expert Contributor

Hi, @Tu Nguyen

Are you using HDP 2.6.3+? If so, you can try the SPARK-LLAP connector. It is designed for secure environments (Kerberos and Ranger), but it can read all Hive tables through LLAP.

https://community.hortonworks.com/articles/101181/rowcolumn-level-security-in-sql-for-apache-spark-2...

For HDP 2.6.3+ with Spark 2.2 I haven't written an updated article, but the steps are almost the same, except that the SPARK-LLAP jar file is already built into HDP 2.6.3+, so you don't need to download it.
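In case it helps, usage on HDP 2.6.3+ looks roughly like the sketch below. It assumes a running LLAP instance and a HiveServer2 Interactive endpoint; the hostnames, port, ZooKeeper quorum, and the @llap0 service name are placeholders you would replace with your cluster's values, and the exact config keys may differ by version, so check the article above against your release:

```
spark-shell \
  --conf spark.sql.hive.llap=true \
  --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://hs2-interactive-host:10500/" \
  --conf spark.hadoop.hive.llap.daemon.service.hosts=@llap0 \
  --conf spark.hadoop.hive.zookeeper.quorum="zk1:2181,zk2:2181,zk3:2181"

scala> spark.sql("SELECT * FROM user_tnguy11.table_test").show()
```

With the connector active, the usual spark.sql calls are routed through LLAP, which is what makes transactional tables readable.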