CDH 5.15 shows RM down right after install using CM


I just installed a cluster and hit this issue right away. It did not happen with pre-5.15 releases. Log is attached.

Thanks.

EDIT: I can't seem to attach the log, so I am pasting it below.

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/hadoop-yarn/fail. Name node is in safe mode.
The reported blocks 745 has reached the threshold 0.9990 of total blocks 745. The number of live datanodes 1 has reached the minimum number 1. In safe mode extension. Safe mode will be turned off automatically in 1 seconds.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4527)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4502)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:884)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:328)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:641)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)

at org.apache.hadoop.ipc.Client.call(Client.java:1504)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy90.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:575)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy91.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:3155)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:3122)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1005)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1001)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1001)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:993)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1970)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.writeFlagFileForFailedAM(RMAppImpl.java:1352)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.access$3500(RMAppImpl.java:111)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$AttemptFailedFinalStateSavedTransition.transition(RMAppImpl.java:1036)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$AttemptFailedFinalStateSavedTransition.transition(RMAppImpl.java:1028)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$FinalStateSavedTransition.transition(RMAppImpl.java:1017)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$FinalStateSavedTransition.transition(RMAppImpl.java:1011)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:766)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:110)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:868)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:852)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
at java.lang.Thread.run(Thread.java:745)
2018-08-12 22:16:04,534 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1534112156808_0001 failed 2 times due to AM Container for appattempt_1534112156808_0001_000002 exited with exitCode: 143
For more detailed output, check application tracking page:http://ip-172-31-28-114.ec2.internal:8088/proxy/application_1534112156808_0001/Then, click on links to logs of each attempt.
Diagnostics: Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Killed by external signal
Failing this attempt. Failing the application.
2018-08-12 22:16:04,536 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1534112156808_0001 State change from FINAL_SAVING to FAILED on event = APP_UPDATE_SAVED
2018-08-12 22:16:04,537 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dr.who OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1534112156808_0001 failed 2 times due to AM Container for appattempt_1534112156808_0001_000002 exited with exitCode: 143
For more detailed output, check application tracking page:http://ip-172-31-28-114.ec2.internal:8088/proxy/application_1534112156808_0001/Then, click on links to logs of each attempt.
Diagnostics: Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Killed by external signal
Failing this attempt. Failing the application. APPID=application_1534112156808_0001
2018-08-12 22:16:04,539 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1534112156808_0001,name=hadoop,user=dr.who,queue=root.users.dr_dot_who,state=FAILED,trackingUrl=http://ip-172-31-28-114.ec2.internal:8088/cluster/app/application_1534112156808_0001,appMasterHost=N..., vCores:0>
2018-08-12 22:16:52,666 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 2
2018-08-12 22:16:53,015 WARN org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The specific max attempts: 0 for application: 2 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.
2018-08-12 22:16:53,015 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 2 submitted by user dr.who
2018-08-12 22:16:53,016 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dr.who OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1534112156808_0002
2018-08-12 22:16:53,016 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1534112156808_0002
2018-08-12 22:16:53,016 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1534112156808_0002
2018-08-12 22:16:53,016 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1534112156808_0002 State change from NEW to NEW_SAVING on event = START
2018-08-12 22:16:53,017 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1534112156808_0002 State change from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
2018-08-12 22:16:53,017 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueuePlacementRule: Name dr.who is converted to dr_dot_who when it is used as a queue name.
2018-08-12 22:16:53,017 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Accepted application application_1534112156808_0002 from user: dr.who, in queue: root.users.dr_dot_who, currently num of applications: 1
2018-08-12 22:16:53,019 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1534112156808_0002 State change from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
2018-08-12 22:16:53,019 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1534112156808_0002_000001
2018-08-12 22:16:53,019 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1534112156808_0002_000001 State change from NEW to SUBMITTED on event = START
2018-08-12 22:16:53,020 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1534112156808_0002_000001 to scheduler from user: dr.who
2018-08-12 22:16:53,020 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1534112156808_0002_000001 State change from SUBMITTED to SCHEDULED on event = ATTEMPT_ADDED
2018-08-12 22:16:53,287 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1534112156808_0002_01_000001 Container Transitioned from NEW to ALLOCATED
2018-08-12 22:16:53,287 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dr.who OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1534112156808_0002 CONTAINERID=container_1534112156808_0002_01_000001
2018-08-12 22:16:53,288 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1534112156808_0002_01_000001 of capacity <memory:1024, vCores:1> on host ip-172-31-28-114.ec2.internal:8041, which has 1 containers, <memory:1024, vCores:1> used and <memory:1846, vCores:7> available after allocation
2018-08-12 22:16:53,288 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : ip-172-31-28-114.ec2.internal:8041 for container : container_1534112156808_0002_01_000001
2018-08-12 22:16:53,291 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1534112156808_0002_01_000001 Container Transitioned from ALLOCATED to ACQUIRED
2018-08-12 22:16:53,292 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1534112156808_0002_000001
2018-08-12 22:16:53,292 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1534112156808_0002 AttemptId: appattempt_1534112156808_0002_000001 MasterContainer: Container: [ContainerId: container_1534112156808_0002_01_000001, NodeId: ip-172-31-28-114.ec2.internal:8041, NodeHttpAddress: ip-172-31-28-114.ec2.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.31.28.114:8041 }, ]
2018-08-12 22:16:53,292 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1534112156808_0002_000001 State change from SCHEDULED to ALLOCATED_SAVING on event = CONTAINER_ALLOCATED
2018-08-12 22:16:53,298 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1534112156808_0002_000001 State change from ALLOCATED_SAVING to ALLOCATED on event = ATTEMPT_NEW_SAVED
2018-08-12 22:16:53,299 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1534112156808_0002_000001
2018-08-12 22:16:53,302 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1534112156808_0002_01_000001, NodeId: ip-172-31-28-114.ec2.internal:8041, NodeHttpAddress: ip-172-31-28-114.ec2.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.31.28.114:8041 }, ] for AM appattempt_1534112156808_0002_000001
2018-08-12 22:16:53,302 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1534112156808_0002_000001
2018-08-12 22:16:53,302 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1534112156808_0002_000001
2018-08-12 22:16:53,315 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_1534112156808_0002_01_000001, NodeId: ip-172-31-28-114.ec2.internal:8041, NodeHttpAddress: ip-172-31-28-114.ec2.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.31.28.114:8041 }, ] for AM appattempt_1534112156808_0002_000001
2018-08-12 22:16:53,315 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1534112156808_0002_000001 State change from ALLOCATED to LAUNCHED on event = LAUNCHED
2018-08-12 22:16:53,621 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 3
2018-08-12 22:16:53,978 WARN org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The specific max attempts: 0 for application: 3 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.
2018-08-12 22:16:53,978 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 3 submitted by user dr.who
2018-08-12 22:16:53,978 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dr.who OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1534112156808_0003
2018-08-12 22:16:53,978 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1534112156808_0003
2018-08-12 22:16:53,979 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1534112156808_0003
2018-08-12 22:16:53,979 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1534112156808_0003 State change from NEW to NEW_SAVING on event = START
2018-08-12 22:16:53,979 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1534112156808_0003 State change from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED
2018-08-12 22:16:53,979 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueuePlacementRule: Name dr.who is converted to dr_dot_who when it is used as a queue name.
2018-08-12 22:16:53,980 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Accepted application application_1534112156808_0003 from user: dr.who, in queue: root.users.dr_dot_who, currently num of applications: 2
2018-08-12 22:16:53,981 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1534112156808_0003 State change from SUBMITTED to ACCEPTED on event = APP_ACCEPTED
2018-08-12 22:16:53,981 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1534112156808_0003_000001
2018-08-12 22:16:53,981 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1534112156808_0003_000001 State change from NEW to SUBMITTED on event = START
2018-08-12 22:16:53,981 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1534112156808_0003_000001 to scheduler from user: dr.who
2018-08-12 22:16:53,982 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1534112156808_0003_000001 State change from SUBMITTED to SCHEDULED on event = ATTEMPT_ADDED
2018-08-12 22:16:54,289 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1534112156808_0002_01_000001 Container Transitioned from ACQUIRED to RUNNING
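
A log excerpt like the one above is easier to read once it is reduced to per-application state changes and AM exit codes. The following is only a rough sketch, assuming the line formats match the excerpt pasted here; `summarize`, `STATE_RE` and `EXIT_RE` are hypothetical names made up for illustration and are not part of Hadoop or Cloudera Manager.

```python
import re
from collections import defaultdict

# Matches ResourceManager lines such as:
#   "... application_1534112156808_0002 State change from NEW to NEW_SAVING on event = START"
# (appattempt_... lines are deliberately not matched.)
STATE_RE = re.compile(
    r"(application_\d+_\d+) State change from (\w+) to (\w+) on event = (\w+)"
)

# Matches "... for appattempt_<cluster>_<app>_<attempt> exited with exitCode: 143"
EXIT_RE = re.compile(r"appattempt_(\d+_\d+)_\d+ exited with\s+exitCode: (-?\d+)")


def summarize(lines):
    """Return (states, exit_codes) keyed by application id.

    states[app]     -> list of states the application moved into, in log order
    exit_codes[app] -> set of AM container exit codes reported for that app
    """
    states = defaultdict(list)
    exit_codes = defaultdict(set)
    for line in lines:
        m = STATE_RE.search(line)
        if m:
            app, _old, new, _event = m.groups()
            states[app].append(new)
        m = EXIT_RE.search(line)
        if m:
            exit_codes["application_" + m.group(1)].add(int(m.group(2)))
    return dict(states), dict(exit_codes)


if __name__ == "__main__":
    import sys

    # Usage (the log path is whatever file the RM log was saved to):
    #   python summarize_rm_log.py < resourcemanager.log
    states, codes = summarize(sys.stdin)
    for app in sorted(set(states) | set(codes)):
        print(app, "->", states.get(app, []), "exit codes:", sorted(codes.get(app, [])))
```

Run against the excerpt above, this would list application_1534112156808_0001 as ending in FAILED with exit code 143, which may make it easier to see which submissions are failing and when.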