Spark SQL 2.4.4 vs Hive 1.2.1 running job very slow

Hi all,

 

I have a problem when running a query like "select 1 from table_name where partition1 = '1' and partition2 = '2' and partition3 = '3' limit 1" (just to check whether the table has any data). This query runs very slowly on my Spark Thrift Server (STS), even though no other jobs are running.
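
For anyone who wants to reproduce the check, here is a minimal sketch of how such a query string could be assembled. The helper name `build_existence_check` is hypothetical and only for illustration; the table and partition names are the placeholders from the query above, and the values are assumed to be trusted, so no escaping is done.

```python
def build_existence_check(table, partitions):
    """Build a 'does this partition contain any row' probe query.

    `partitions` maps partition column names to string values.
    Illustrative helper only: values are assumed to be trusted,
    so nothing is escaped here.
    """
    predicates = " and ".join(
        f"{column} = '{value}'" for column, value in partitions.items()
    )
    return f"select 1 from {table} where {predicates} limit 1"


query = build_existence_check(
    "table_name",
    {"partition1": "1", "partition2": "2", "partition3": "3"},
)
print(query)
```

The resulting string can be submitted to the Thrift Server through whatever JDBC or beeline client is already in use.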

Here is what happens:

1. The STS log shows that the query has been submitted.

2. The log stalls and no further entries appear. The Spark UI shows no new job.

3. After 4 to 5 minutes, the log resumes and the job appears in the Spark UI.

 

I wonder whether there is a version incompatibility between Spark and Hive. Currently I'm using Spark 2.4.4 and Hive 1.2.1. Before that I used Spark 2.0.2 and everything ran normally.

 

I would appreciate it if anyone could help me with this problem.

 
