I am using CDH 5.7 for both dev and prod clusters. I've been reading about the preemption feature in the Capacity Scheduler and am planning to use it on both clusters. From my reading, it appears to be a mature feature (released almost 3 years ago), but I haven't found any related docs or articles from Cloudera. I would like to know Cloudera's perspective on enabling this feature in the Capacity Scheduler.
If you use CDH, we recommend using the Fair Scheduler (FS) instead of the Capacity Scheduler (CS). We are continuing to improve preemption in FS; you can find more information in YARN-4752.
We had been using EMR and moved to CDH around 5 months ago. Moving from CS to FS will take us some time, though it is on our to-do list. I do understand your recommendation, as Cloudera is making significant progress on FS, and I've been doing a lot of reading about it as well. Given this situation, and given that CDH 5.7 ships Hadoop 2.6.0+CDH 5.7.0, can we use preemption in the Capacity Scheduler on a production cluster, considering that it was implemented in CS in a version much earlier than Hadoop 2.6.0?
In addition, I am seeing recent fixes in YARN-4752 from around 2-3 weeks ago. Assuming they have been incorporated in a CDH release later than 5.7, would that require upgrading our currently running clusters as well? If so, that is quite a big effort for us.
We are looking at https://issues.apache.org/jira/browse/YARN-4752 as well. Can you help confirm which version of CDH picked up YARN-4752? I couldn't find YARN-4752 mentioned in the release notes at https://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_rn_fixed.html#xd_583c10bf....