Created on 12-23-2021 05:30 AM - last edited on 12-23-2021 07:22 AM by DianaTorres
Hi All,
I'm new to the Hadoop world and cannot find an answer to this question.
How can I gracefully stop the YARN role on a data node and save or pause the running jobs,
so that the jobs do not end with a killed or crashed status?
I know that in Cloudera Manager you can decommission the YARN role and then stop it.
Is this a safe way to keep the jobs running so they do not end up in a failed state?
Is this a graceful YARN role shutdown, or is there another way to do this?
Please advise me how to do this.
Thank you.
Created 01-15-2022 11:32 PM
Hi,
YARN graceful decommission waits for running jobs to complete. You can pass a timeout value of x seconds; YARN then waits up to x seconds for the jobs on the node to finish before it decommissions the node. If no jobs are running on the node, YARN starts the decommission right away without waiting for the timeout to expire.
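To make the timeout rule concrete, here is a small Python model of it. This is only an illustration of the behaviour described above, not YARN's actual code; the function name, parameters, and return strings are made up for this sketch.

```python
def decommission_action(running_jobs: int, elapsed_secs: float, timeout_secs: float) -> str:
    """Toy model of the graceful decommission rule (hypothetical names, not YARN code)."""
    if running_jobs == 0:
        # Nothing is running on the node, so there is no reason to wait.
        return "decommission_now"
    if elapsed_secs >= timeout_secs:
        # The timeout expired while jobs were still running: the node is
        # decommissioned anyway, so work still running there is not protected.
        return "force_decommission"
    # Jobs are still running and the timeout has not expired yet.
    return "wait"


if __name__ == "__main__":
    print(decommission_action(running_jobs=0, elapsed_secs=0, timeout_secs=3600))
    print(decommission_action(running_jobs=2, elapsed_secs=600, timeout_secs=3600))
    print(decommission_action(running_jobs=2, elapsed_secs=3600, timeout_secs=3600))
```

The practical point: pick a timeout longer than your longest running job, otherwise the jobs on that node can still be cut off when the timeout expires.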
CM -> Clusters -> yarn -> Configuration -> In search bar (
To decommission one or more specific hosts:
CM -> Clusters -> yarn -> Instances (Select the hosts that you want to decommission)
Click -> Actions for selected hosts -> Decommission
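If you want to check which state a node is in while the decommission runs, one option is the ResourceManager REST API (`/ws/v1/cluster/nodes`). The sketch below only parses a response of that shape. The sample payload and host names are invented for illustration, and the field names (`state`, `nodeHostName`, `numContainers`) are taken from the Apache documentation as I remember it, so please verify them against your Hadoop version.

```python
from typing import List, Tuple

# Invented sample in the shape of a /ws/v1/cluster/nodes response (assumption).
SAMPLE = {
    "nodes": {
        "node": [
            {"nodeHostName": "worker1.example.com", "state": "RUNNING", "numContainers": 5},
            {"nodeHostName": "worker2.example.com", "state": "DECOMMISSIONING", "numContainers": 3},
            {"nodeHostName": "worker3.example.com", "state": "DECOMMISSIONED", "numContainers": 0},
        ]
    }
}


def nodes_in_state(payload: dict, state: str) -> List[Tuple[str, int]]:
    """Return (hostname, container count) for every node in the given state."""
    # "nodes" may be missing or null when the cluster reports no nodes.
    nodes = (payload.get("nodes") or {}).get("node") or []
    return [
        (n.get("nodeHostName", ""), int(n.get("numContainers", 0)))
        for n in nodes
        if n.get("state") == state
    ]


if __name__ == "__main__":
    for host, containers in nodes_in_state(SAMPLE, "DECOMMISSIONING"):
        print(f"{host} is still draining, {containers} container(s) left")
```

A node that stays in DECOMMISSIONING with a non-zero container count is still waiting for work to finish.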
If you want to decommission all the roles on a host, follow this doc:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_mc_host_maint.html#decomm_host
If this resolves your issue, please mark the answer as the accepted solution.
Created 12-27-2021 04:53 AM
I decommissioned the YARN role on a data node.
Now I see that the MapReduce jobs on that data node have the status KILLED.
So my guess is that "YARN role decommission" is not a graceful shutdown solution.
Please advise how to do a proper graceful shutdown of the YARN role on a data node.
Created 12-31-2021 09:22 AM
@xpouser You may want to take a look at: https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html
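For reference, on a plain Apache Hadoop setup the page above describes triggering this with `yarn rmadmin -refreshNodes -g <timeout in seconds> -client|server`, after the host has been added to the exclude file that `yarn.resourcemanager.nodes.exclude-path` points to. Below is a minimal Python sketch that builds that command. The helper names are hypothetical, and on a Cloudera Manager cluster the exclude file is managed by CM, so treat this as an illustration rather than the recommended procedure there.

```python
import subprocess
from typing import List


def graceful_refresh_cmd(timeout_secs: int, tracking: str = "client") -> List[str]:
    """Build the argv for a graceful refreshNodes call (hypothetical helper)."""
    if tracking not in ("client", "server"):
        raise ValueError("tracking must be 'client' or 'server'")
    return ["yarn", "rmadmin", "-refreshNodes", "-g", str(int(timeout_secs)), f"-{tracking}"]


def run_graceful_refresh(timeout_secs: int) -> None:
    """Run the command; assumes the `yarn` CLI is on PATH and the host is
    already listed in the ResourceManager's exclude file."""
    subprocess.run(graceful_refresh_cmd(timeout_secs), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_graceful_refresh() on a real cluster.
    print(" ".join(graceful_refresh_cmd(3600)))
```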