Created 06-25-2018 10:43 PM
I have a YARN service app with two components, Master and Worker. I restarted the YARN services and then launched the YARN service app.
After the launch, I noticed that the app only gets the Master component. It did not start any worker node.
Can someone please explain why this could happen and how to recover from it?
Created 06-26-2018 01:42 AM
This can happen if the application was still launching when the RM was shut down and restarted: the RM had already created the parent prefix /registry/users/[user]/services/yarn-service/[application]/components, but the components had not yet reached the STABLE state. On recovery, only partial component records can be read from ZooKeeper to report the current running state. The Service AM log contains:
2018-06-19 00:27:03,186 [main] INFO service.ServiceScheduler - Could not read component paths: `/users/ambari-qa/services/yarn-service/mawo-try/components': No such file or directory: KeeperErrorCode = NoNode for /registry/users/ambari-qa/services/yarn-service/mawo-try/components
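To spot this condition when scanning a Service AM log, something like the following could help. It is only a minimal sketch: the function and class names are hypothetical, and the pattern is derived from the single log line quoted above, so other Hadoop versions may word the message differently.

```python
import re
from typing import NamedTuple, Optional

# Pattern based on the one log line quoted above (assumption: the
# message format is the same in your Hadoop version).
_NO_NODE = re.compile(
    r"KeeperErrorCode = NoNode for "
    r"(?P<path>/registry/users/(?P<user>[^/\s]+)"
    r"/services/yarn-service/(?P<service>[^/\s]+)/components)"
)


class MissingComponents(NamedTuple):
    """Hypothetical container for the details pulled out of the log line."""
    path: str
    user: str
    service: str


def parse_no_node(line: str) -> Optional[MissingComponents]:
    """Return the registry path, user and service if the line reports a
    missing 'components' znode, otherwise None."""
    match = _NO_NODE.search(line)
    if match is None:
        return None
    return MissingComponents(match["path"], match["user"], match["service"])
```

Running `parse_no_node` over each line of the AM log would tell you which service is affected before you decide how to recover it.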
The error message is accurate, because the application state is unknown or only partially registered. One way to recover properly is to stop the application and then start it again.
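The stop-and-restart recovery could be scripted roughly as below. This is a sketch under assumptions: it relies on the `yarn app -stop` and `yarn app -start` commands of the YARN Service CLI (Hadoop 3.1+), the helper names are hypothetical, and it defaults to a dry run that only prints the commands. Stopping may not complete instantly, so checking the service state between the two steps is a good idea.

```python
import subprocess
from typing import List


def restart_commands(service_name: str) -> List[List[str]]:
    """Build the stop and start commands for a YARN service.

    Assumes the Hadoop 3.1+ YARN Service CLI is on PATH.
    """
    return [
        ["yarn", "app", "-stop", service_name],
        ["yarn", "app", "-start", service_name],
    ]


def restart_service(service_name: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them in order.

    With dry_run=False, a non-zero exit status raises CalledProcessError
    and the remaining commands are not run.
    """
    for cmd in restart_commands(service_name):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

For the service in the log above, `restart_service("mawo-try")` prints the two commands so they can be reviewed before running them for real with `dry_run=False`.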
In most cases, a partially running application needs a system administrator or end user to step in and clean it up properly. Recovering from such a state automatically is a hard problem, so human intervention is required.