Is there any guidance on using the YARN Capacity Scheduler to handle different workloads at different times?

Dear All,

I'm new to the Capacity Scheduler and have been assigned to manage a new on-prem CDP 7.1.3 cluster with about a dozen servers.

Now that the cluster setup is almost done, the next question is how to manage resources effectively to accommodate different workloads at different times. The cluster is built for a data lake. At night, most of the workload is ETL jobs, mainly on HDFS, Hive, and Spark. During the day, it will be mainly data query jobs from Python, R, Spark, and Impala.

How should I handle this situation? I'd appreciate it if you could share some of your experience or point me to relevant documentation.

Looking forward to hearing from you!

Stay safe and healthy.