Created on 10-31-2016 10:35 AM - edited 09-16-2022 03:46 AM
I ran an API call to decommission a datanode:
HDFS.decommission(dn.name)
I used the following check to make sure the role was indeed decommissioned:
if role.commissionState == 'DECOMMISSIONED':
Then I ran the role deletion:
HDFS.delete_role(dn.name)
But I got the error:
Removing datanode roles..Failed to remove datanode role on the host Role hdfs-DATANODE-4a8948d61dc8a4727f810f736d9d3447 has 1 active commands (error 400)
It turned out that after commissionState became DECOMMISSIONED, the UI still showed the decommission command running for another 10 to 15 seconds.
To work around that, I had to let the program sleep for an additional 60 seconds after the commission state became DECOMMISSIONED.
Is this a known issue in API v11?
Created 10-31-2016 12:52 PM
Hello,
What you did to work around this is fine. Basically, what you really want to do is wait till the decommission command is complete since Cloudera Manager will not let you delete the role until commands running against that role are complete.
Since "decommission" returns a command object, I imagine you could use the id to query the commands list for that service. I imagine we have some example code for that lying around, but I'm not sure where it is.
Waiting 60 seconds is likely OK, but verifying the command has completed is more sound.
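For illustration, here is a minimal polling sketch of that idea (Python 3). It is not taken from the Cloudera examples. It assumes the command object returned by decommission() exposes an `active` attribute and a `fetch()` method that returns a refreshed copy, which is how I understand cm_api's ApiCommand to behave. The helper name `wait_for_command` and its parameters are made up for this example.

```python
import time


def wait_for_command(cmd, timeout=120.0, poll_interval=2.0):
    """Poll a command object until it is no longer active.

    Assumption: `cmd.active` is True while the command runs, and
    `cmd.fetch()` returns an up-to-date copy of the command.
    Raises TimeoutError if the command is still active after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while cmd.active:
        if time.monotonic() >= deadline:
            raise TimeoutError("command still active after %.0f s" % timeout)
        time.sleep(poll_interval)
        cmd = cmd.fetch()  # refresh the command state from the server
    return cmd


# Intended use with the calls from the original post (not executed here):
#   cmd = HDFS.decommission(dn.name)
#   cmd = wait_for_command(cmd)
#   if cmd.success:
#       HDFS.delete_role(dn.name)
```

Checking the command itself, rather than commissionState, avoids the window in which the role already reports DECOMMISSIONED while the command is still listed as active.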
Created 10-31-2016 12:55 PM
I think you can use the wait() method too as demonstrated here:
https://cloudera.github.io/cm_api/docs/python-client/#service-lifecycle-and-commands
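A rough sketch of how that could look for the decommission-then-delete case. As far as I know, wait() takes an optional timeout in seconds and returns the last known state of the command, which may still be active if the timeout was hit, so treat that behavior as an assumption and check it against your cm_api version. The function name `decommission_then_delete` is hypothetical. `service` stands for the HDFS service object from the original post.

```python
def decommission_then_delete(service, role_name, timeout=300):
    """Decommission a role, wait for the command to finish, then delete the role.

    Assumptions: `service.decommission()` returns a command object whose
    `wait(timeout)` returns the final (or last known) command state with
    `active`, `success` and `resultMessage` attributes.
    """
    cmd = service.decommission(role_name)
    cmd = cmd.wait(timeout)
    if cmd.active:
        # The wait timed out; deleting now would hit the "active commands" error.
        raise RuntimeError("decommission still running after %s s" % timeout)
    if not cmd.success:
        raise RuntimeError("decommission failed: %s" % cmd.resultMessage)
    service.delete_role(role_name)
```

With the names from the original post, the call would be decommission_then_delete(HDFS, dn.name).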
Created on 10-31-2016 02:25 PM - edited 10-31-2016 03:44 PM
Thanks, wait() worked. But I was wondering which actions have a wait method. I now know that service start, decommission, etc. have a wait method, but role start, stop, etc. don't.
Also, is there a timeout option for wait()?