Which one is best Hive vs Impala vs Drill vs Kudu, in combination with Spark SQL?

New Contributor
 
1 REPLY

Re: Which one is best Hive vs Impala vs Drill vs Kudu, in combination with Spark SQL?

Rising Star

Assuming you want to access the data via Spark, the main question is how it should be stored.

Drill is not supported by Cloudera for this, but Hive tables and Kudu both are.

So it comes down to whether you store the data in Hive tables or in Kudu, since Spark can work with both.
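
To illustrate that Spark can read from either store, here is a minimal PySpark sketch. It assumes a SparkSession built with Hive support and the kudu-spark connector on the classpath. The helper names are made up for illustration, and the short format name "kudu" is what recent kudu-spark releases register (older ones may need "org.apache.kudu.spark.kudu").

```python
from typing import Any, Dict

# Short name registered by recent kudu-spark releases (assumption: older
# releases may need the full name "org.apache.kudu.spark.kudu" instead).
KUDU_FORMAT = "kudu"


def kudu_read_options(master: str, table: str) -> Dict[str, str]:
    """Build the options the kudu-spark connector expects for a read."""
    return {"kudu.master": master, "kudu.table": table}


def load_hive_table(spark: Any, name: str):
    """Read a Hive metastore table as a DataFrame (needs Hive support enabled)."""
    return spark.table(name)


def load_kudu_table(spark: Any, master: str, table: str):
    """Read a Kudu table as a DataFrame (needs kudu-spark on the classpath)."""
    return (
        spark.read.format(KUDU_FORMAT)
        .options(**kudu_read_options(master, table))
        .load()
    )
```

Either DataFrame can then be registered with createOrReplaceTempView and queried through spark.sql, so the Spark SQL side looks the same whichever store you pick.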

If you want to insert your data record by record, or run interactive queries in Impala, then Kudu is likely the best choice.

If you want to insert and process your data in bulk, then Hive tables are usually a good fit.
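
As a rough summary, the rule of thumb above could be written down like this. It is only a sketch of the advice in this reply, not an official sizing guide, and the function and parameter names are hypothetical.

```python
def suggest_storage(record_by_record_inserts: bool,
                    interactive_impala_queries: bool) -> str:
    """Return "kudu" or "hive" following the rule of thumb in this reply.

    Kudu is suggested when data arrives record by record or when
    interactive Impala queries are needed; otherwise (bulk insert and
    bulk processing) Hive tables are suggested.
    """
    if record_by_record_inserts or interactive_impala_queries:
        return "kudu"
    return "hive"
```

For example, a nightly bulk load with no interactive querying gives suggest_storage(False, False), which returns "hive", while a stream of single-row inserts gives "kudu".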