New Contributor

Which one is best Hive vs Impala vs Drill vs Kudu, in combination with Spark SQL?

Cloudera Employee

Re: Which one is best Hive vs Impala vs Drill vs Kudu, in combination with Spark SQL?

Assuming you want to access the data via Spark, the main question is how it should be stored.

 

For this, Drill is not supported, but Hive tables and Kudu are supported by Cloudera.

 

Now it boils down to whether you want to store the data in Hive or in Kudu, as Spark can work with both of these. 
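To make "Spark can work with both" a bit more concrete, here is a minimal PySpark sketch of reading from each store. It assumes the kudu-spark connector jar is on the Spark classpath and that the session was created with Hive support enabled. The helper names, the master address, and the table names are placeholders made up for this example, not anything from a specific cluster.

```python
# Hypothetical helpers: function names, addresses and table names are placeholders.

# Data source name registered by the kudu-spark connector.
KUDU_FORMAT = "org.apache.kudu.spark.kudu"


def kudu_options(master, table):
    """Build the two options the Kudu data source needs to locate a table."""
    return {"kudu.master": master, "kudu.table": table}


def read_kudu(spark, master, table):
    """Load a Kudu table as a DataFrame (needs the kudu-spark jar)."""
    reader = spark.read.format(KUDU_FORMAT)
    for key, value in kudu_options(master, table).items():
        reader = reader.option(key, value)
    return reader.load()


def read_hive(spark, table):
    """Load a Hive table as a DataFrame (session needs enableHiveSupport())."""
    return spark.table(table)
```

With a real session this might be called as `read_kudu(spark, "kudu-master:7051", "impala::default.events")` or `read_hive(spark, "default.events")`. Either way the result is an ordinary DataFrame, so the Spark SQL code on top looks the same.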

 

If you want to insert your data record by record, or want to run interactive queries in Impala, then Kudu is likely the best choice.

 

If you want to insert and process your data in bulk, then Hive tables are usually a good fit.
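For the write side, a rough PySpark sketch of both paths could look like the following. Again, this assumes the kudu-spark connector is available and Hive support is enabled. The function names and arguments are hypothetical, and the Kudu table is assumed to exist already.

```python
# Hypothetical helpers: function names and arguments are placeholders.

# Data source name registered by the kudu-spark connector.
KUDU_FORMAT = "org.apache.kudu.spark.kudu"


def append_to_hive(df, table):
    """Bulk-append a DataFrame to a Hive table, e.g. one batch per run."""
    df.write.mode("append").saveAsTable(table)


def append_to_kudu(df, master, table):
    """Write the rows of a DataFrame into an existing Kudu table."""
    (
        df.write.format(KUDU_FORMAT)
        .option("kudu.master", master)
        .option("kudu.table", table)
        .mode("append")
        .save()
    )
```

The Hive path suits large batches written in one go. The Kudu path is the one to reach for when small sets of rows arrive continuously and need to be queryable from Impala soon after.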