Cloudera Community

SumitraMenon

Community Manager

On HDP3, the SparkSQL API queries Spark2's own catalog namespace directly. The Spark catalog is independent of the Hive catalog, so the HiveWarehouseConnector was developed to let Spark users query Hive data through the HiveWarehouseSession API. Hive tables on HDP3 are ACID by default, and Spark2 does not yet operate on ACID tables. To guarantee data integrity, the HiveWarehouseConnector therefore processes queries through the HiveServer2Interactive (LLAP) service. This is not the case for external tables.
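As a rough illustration of querying Hive through the HiveWarehouseSession API, here is a minimal PySpark sketch. It assumes the session was launched with the HiveWarehouseConnector jar and Python zip on the classpath and that the LLAP connection properties are already configured. The helper names `build_hwc_session` and `top_rows`, and the table name in the usage note, are hypothetical and not part of the connector.

```python
def build_hwc_session(spark):
    """Build a HiveWarehouseSession from an existing SparkSession.

    The import is done lazily because the pyspark_llap module is only
    available when the HiveWarehouseConnector artifacts were supplied
    at launch time (assumption: they were).
    """
    from pyspark_llap import HiveWarehouseSession
    return HiveWarehouseSession.session(spark).build()


def top_rows(hive, table, limit=10):
    """Return the first `limit` rows of a Hive table as a DataFrame.

    `hive` is a HiveWarehouseSession; the query is submitted through
    the connector rather than through spark.sql(), which would only
    see Spark's own catalog.
    """
    sql = "SELECT * FROM {} LIMIT {}".format(table, int(limit))
    return hive.executeQuery(sql)
```

In a PySpark shell this might be used as `hive = build_hwc_session(spark)` followed by `top_rows(hive, "default.sales").show()`, where `default.sales` stands in for a table that exists on your cluster.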

This video explains how to access Hive from Spark2 on HDP3, and covers the related architectural changes and the use cases that are supported.