Created 12-11-2024 01:46 AM
My PySpark yarn-client application was killed by the cluster because of the yarn.scheduler.capacity.root.default-application-lifetime setting. Which configuration should I use to declare my application's lifetime and avoid getting killed?
Created 12-11-2024 10:25 AM
@Jaaaaay Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our Spark experts @Bharati @jagadeesan who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres

Created 12-12-2024 09:32 AM
Admins can enforce application lifetime SLAs at a service level by setting yarn.scheduler.capacity.<queue-path>.maximum-application-lifetime and yarn.scheduler.capacity.root.<queue-path>.default-application-lifetime in capacity-scheduler.xml.
CM > Yarn > Configuration > Capacity Scheduler Configuration Advanced Configuration Snippet (Safety Valve)
Reference: https://blog.cloudera.com/enforcing-application-lifetime-slas-yarn/
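For illustration, here is a minimal sketch of what such a safety-valve entry could look like, assuming the queue path is root.default; the values are only examples (per the Hadoop Capacity Scheduler docs, lifetimes are in seconds and -1 means no limit):

```xml
<!-- Sketch of a Capacity Scheduler safety-valve entry, assuming queue root.default.
     Values are in seconds; example numbers only. -->
<property>
  <!-- Hard cap: applications in root.default are killed after 48 hours. -->
  <name>yarn.scheduler.capacity.root.default.maximum-application-lifetime</name>
  <value>172800</value>
</property>
<property>
  <!-- Default, applied only when an application does not declare its own lifetime. -->
  <name>yarn.scheduler.capacity.root.default.default-application-lifetime</name>
  <value>86400</value>
</property>
```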
How are you setting this parameter? Can you share the entire stacktrace of the failure?
Created 12-12-2024 06:25 PM
The admin only set a value of 86400 for yarn.scheduler.capacity.root.<queue-path>.default-application-lifetime, which should be a soft enforcement of sorts. yarn.scheduler.capacity.<queue-path>.maximum-application-lifetime is not set. My question is how to configure my PySpark application to bypass this soft enforcement.
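Not an authoritative answer, but since only the default lifetime is set (and no maximum is enforced), one reading is that the application can extend its own LIFETIME timeout right after it starts, e.g. through the ResourceManager REST API. Below is a minimal PySpark sketch; the RM address/port and the expiry timestamp are placeholders, and the endpoint paths follow the YARN RM REST documentation rather than anything verified on this particular cluster:

```python
# Sketch: from inside a PySpark app, push the app's own LIFETIME timeout
# further into the future via the YARN ResourceManager REST API.
# The RM address/port and the expiry timestamp below are placeholders.
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("long-running-job").getOrCreate()
app_id = spark.sparkContext.applicationId          # e.g. application_1700000000000_0001

RM = "http://<resourcemanager-host>:8088"          # assumption: RM web address

# New expiry time as an ISO-8601 timestamp (must lie in the future).
payload = {"timeout": {"type": "LIFETIME",
                       "expiryTime": "2025-01-31T00:00:00.000+0000"}}

resp = requests.put(f"{RM}/ws/v1/cluster/apps/{app_id}/timeout",
                    json=payload,
                    headers={"Content-Type": "application/json"})
print(resp.status_code, resp.text)

# ... rest of the job ...
```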
Created 12-18-2024 05:29 PM
@Bharati Any updates? I have tried the REST API method. Getting my app's expiry time works just fine, but the PUT request gets a 401 Unauthorized error. It's a shared cluster and I don't have admin-level authorization.
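A guess only, since it depends on how the cluster is secured: a 401 on the PUT often points at authentication rather than missing admin rights. If the cluster is Kerberized, the RM REST API typically requires SPNEGO for write operations, and the application's owner generally has modify rights on their own application (subject to the cluster's ACLs). A sketch of the same PUT authenticated via SPNEGO, assuming the requests-kerberos package is installed and a valid ticket exists (kinit first):

```python
# Sketch: same timeout update, but authenticated via SPNEGO/Kerberos.
# Assumes requests-kerberos is installed; RM address is a placeholder.
import requests
from requests_kerberos import HTTPKerberosAuth, OPTIONAL

RM = "http://<resourcemanager-host>:8088"
app_id = "application_1700000000000_0001"

payload = {"timeout": {"type": "LIFETIME",
                       "expiryTime": "2025-01-31T00:00:00.000+0000"}}

resp = requests.put(f"{RM}/ws/v1/cluster/apps/{app_id}/timeout",
                    json=payload,
                    headers={"Content-Type": "application/json"},
                    auth=HTTPKerberosAuth(mutual_authentication=OPTIONAL))
print(resp.status_code, resp.text)
```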