
Optimal Hive parameters for hive.execution.engine=spark?

New Contributor

We recently moved to using Spark as the Hive execution engine instead of MapReduce, and we are seeing significant improvements in certain queries that need intermediate tables/storage to process. Can someone provide a complete or optimal list of configuration settings that will avoid memory issues while still getting the best out of Spark as the engine?

Nodes: 30
Cores: 16
Memory: 112 GB/node
Hadoop 2.6.0-cdh5.13.0
Hive 1.1.0-cdh5.13.0

1 ACCEPTED SOLUTION

Expert Contributor

@Zeus

There is no single optimal configuration. Fine-tuning has to be done based on your workload and how Hive on Spark (HOS) reacts to the jobs submitted to it, and this will vary from customer to customer.
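
That said, if you want a starting point to tune from, the sketch below follows the general sizing approach Cloudera documents for Hive on Spark: 4-6 cores per executor, node memory divided across the executors that fit, and roughly 15% of each executor's allocation set aside as overhead. All of the numbers are illustrative assumptions derived from the hardware in the question (30 nodes, 16 cores, 112 GB per node, with ~100 GB per node assumed available to YARN containers); they are not a verified optimum. The property names are the standard Spark-on-YARN ones shipped with CDH 5.13.

```sql
-- Starting-point sketch only; verify the per-node numbers against your
-- yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores.
SET hive.execution.engine=spark;

-- 4 cores per executor leaves headroom for HDFS/YARN daemons,
-- giving roughly 3 executors per 16-core node.
SET spark.executor.cores=4;

-- ~33 GB per executor container, split ~85/15 between heap and overhead
-- (spark.yarn.executor.memoryOverhead is specified in MB on this Spark version).
SET spark.executor.memory=28g;
SET spark.yarn.executor.memoryOverhead=5120;

-- Let YARN grow/shrink the executor count with the query instead of fixing it.
-- Dynamic allocation requires the external shuffle service, which CDH
-- enables by default for Spark on YARN.
SET spark.shuffle.service.enabled=true;
SET spark.dynamicAllocation.enabled=true;
SET spark.dynamicAllocation.minExecutors=1;
SET spark.dynamicAllocation.maxExecutors=88;  -- ~3 per node, minus AM/driver headroom

-- Driver sizing for query planning and broadcast joins on larger queries.
SET spark.driver.memory=8g;
SET spark.yarn.driver.memoryOverhead=1024;
```

From there, watch the Spark UI for executor OOMs, GC pressure, and shuffle spills, and adjust the heap/overhead split and executor count per workload; that iterative loop is exactly the fine-tuning described above.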

