
Spark HiveContext loads all partitions for a table, causing performance issues on the HiveMetastore



Cluster Version: HDP 2.6.3

Spark v1.6.3

Hive 1.2.*

We use Spark's Hive SQL context (HiveContext) heavily in our jobs. Whenever we run a Spark SQL query against a specific partition of a table, Spark loads the metadata for all partitions of that table. Because many jobs do this concurrently, it puts heavy GC pressure on the HiveMetastore, and we see frequent pauses there.

Is there a way to prevent Spark SQL from loading all partitions of a table when the query filters on a partition column?
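For anyone hitting the same issue: one setting worth trying is `spark.sql.hive.metastorePartitionPruning`, which exists in Spark 1.6 and is off by default there. When enabled, Spark pushes the partition-column predicate down to the metastore so only the matching partitions' metadata is fetched, instead of listing every partition. A minimal sketch (table, database, and partition-column names below are placeholders, not from the original post):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Ask the HiveMetastore to return only partitions matching the
// predicate, rather than the full partition list for the table.
val conf = new SparkConf()
  .setAppName("PartitionPruningExample")
  .set("spark.sql.hive.metastorePartitionPruning", "true")

val sc = new SparkContext(conf)
val hiveContext = new HiveContext(sc)

// With pruning enabled, this query should request metadata only for
// the partition(s) where partition_col = '2018-01-01'.
hiveContext.sql(
  "SELECT * FROM my_db.my_table WHERE partition_col = '2018-01-01'"
).show()
```

One caveat to verify on your cluster: in Spark 1.6 this pruning applies when the table is read through the Hive SerDe path. If Parquet/ORC tables are being converted to Spark's native readers (e.g. `spark.sql.hive.convertMetastoreParquet=true`), Spark 1.6 may still cache metadata for all partitions of the table on first access, so the setting may not help for those tables.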