04-10-2019 04:15 AM
Assuming you want to access the data via Spark, the main question is how it should be stored.
Drill is not supported for this, but Hive tables and Kudu are both supported by Cloudera.
So it boils down to whether you want to store the data in Hive or in Kudu, as Spark can work with both of these.
If you want to insert your data record by record, or want to run interactive queries in Impala, then Kudu is likely the best choice.
If you want to insert and process your data in bulk, then Hive tables are usually a good fit.
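
To make the two options a bit more concrete, here is a rough PySpark sketch of what writing to each store could look like. It assumes the kudu-spark connector jar is on the Spark classpath, that the Kudu table already exists, and that the SparkSession was built with enableHiveSupport(). The helper names, the master address and the table names are placeholders I made up for illustration, so adjust them to your cluster.

```python
# Sketch only: helper names, host names and table names are placeholders.

def kudu_options(master: str, table: str) -> dict:
    """Build the options the kudu-spark data source reads."""
    return {"kudu.master": master, "kudu.table": table}


def append_to_kudu(df, master: str, table: str) -> None:
    """Append the rows of a Spark DataFrame to an existing Kudu table.

    Assumes the kudu-spark connector is available to Spark.
    """
    (df.write
       .format("org.apache.kudu.spark.kudu")
       .options(**kudu_options(master, table))
       .mode("append")
       .save())


def append_to_hive(df, table: str) -> None:
    """Append the rows of a Spark DataFrame to a Hive table in bulk.

    Assumes the SparkSession was created with enableHiveSupport().
    """
    df.write.mode("append").saveAsTable(table)
```

With a DataFrame `df` in hand you would call something like `append_to_kudu(df, "kudu-master.example.com:7051", "impala::default.events")` or `append_to_hive(df, "default.events")`. Both paths write a whole DataFrame at once from Spark; the record-by-record advantage of Kudu shows up mostly when small batches arrive frequently or when rows need to be updated later.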