I'm new to Capacity Scheduler and have been assigned to manage a new on-prem CDP 7.1.3 cluster with a dozen servers.
Now that the cluster setup is almost done, the next question is how to manage resources effectively to accommodate different workloads at different times of day. The cluster is built for a data lake. At night, most of the workload is ETL jobs, mainly on HDFS, Hive, and Spark; during the day, it is mainly data query jobs from Python, R, Spark, and Impala.
How should I handle this situation? I'd appreciate it if you could share some of your experience or point me to relevant documentation!