
Spark Hive SQL context is loading all partitions for a table, causing performance issues on the Hive MetaStore



Hi,

Cluster Version: HDP 2.6.3

Spark v1.6.3

Hive 1.2.*

We use the Spark Hive SQL context (HiveContext) heavily in our jobs, and whenever we run a Spark SQL query against a specific partition of a table, Spark loads all partitions of that table. Because many jobs do this concurrently, it puts heavy pressure on the Hive MetaStore heap/GC, and we see frequent pauses in the Hive MetaStore.
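To illustrate the access pattern (the table name, partition column, and values below are made up for this example):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("PartitionQueryExample"))
    val hiveContext = new HiveContext(sc)

    // The predicate targets a single partition (dt = '2018-01-01'), but the
    // HiveContext still fetches metadata for every partition of events_table
    // from the Hive MetaStore before pruning on the Spark side.
    hiveContext.sql("SELECT * FROM events_table WHERE dt = '2018-01-01'").show()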

Is there a way to stop Spark SQL from loading all partitions of a table when the query filters on a partition column?
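For illustration only, this is a minimal sketch of the kind of switch we are hoping exists, assuming the spark.sql.hive.metastorePartitionPruning property is honoured by HiveContext in our Spark 1.6.3 setup; we have not verified that it relieves the MetaStore pressure in our environment:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val conf = new SparkConf()
      .setAppName("PartitionPruningSketch")
      // Push the partition predicate down to the Hive MetaStore so only the
      // matching partitions are fetched, instead of all partitions of the table.
      .set("spark.sql.hive.metastorePartitionPruning", "true")

    val sc = new SparkContext(conf)
    val hiveContext = new HiveContext(sc)

    // events_table and dt are hypothetical names used only for illustration.
    hiveContext.sql("SELECT * FROM events_table WHERE dt = '2018-01-01'").show()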
