
Optimum Hive parameters for using hive.execution.engine=spark?

New Contributor

So, we recently moved to Spark as the Hive execution engine instead of MapReduce (enabled as shown in the snippet below) and we are seeing significant improvements in certain queries that need intermediate tables/storage to process. Can someone provide a complete or optimum list of configuration settings that will avoid memory issues while still getting the best out of Spark as the engine?

Nodes: 30
Cores: 16 per node
Memory: 112 GB per node
Hadoop: 2.6.0-cdh5.13.0
Hive: 1.1.0-cdh5.13.0
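
For reference, the switch itself was done per session like this (it can also be made global via hive.execution.engine in hive-site.xml), in case that matters for the tuning advice:

    -- switch the execution engine for the current Hive session
    set hive.execution.engine=spark;
    -- fall back to MapReduce for a query that misbehaves on Spark
    set hive.execution.engine=mr;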

1 ACCEPTED SOLUTION

Expert Contributor

@Zeus

There is no single optimal configuration. Fine-tuning has to be based on your workload and on how Hive on Spark (HOS) behaves once that workload is submitted to it, so the right values vary from customer to customer.
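
That said, the usual knobs to start with are the Spark executor sizing properties. The values below are sized very loosely for 16-core / 112 GB nodes; treat every number as illustrative, not a recommendation, and validate against your own queries:

    set hive.execution.engine=spark;
    -- executor sizing: a few cores per executor leaves headroom for the OS and NodeManager
    set spark.executor.cores=4;
    set spark.executor.memory=20g;
    -- off-heap overhead in MB, typically 10-20% of executor memory
    set spark.yarn.executor.memoryOverhead=3072;
    -- let YARN scale the executor count with the workload instead of pinning spark.executor.instances
    set spark.dynamicAllocation.enabled=true;
    set spark.dynamicAllocation.initialExecutors=1;

Watch the Spark UI and YARN container usage on your heaviest queries first, and adjust executor memory and overhead before touching anything else.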

