Created on 06-22-2017 05:05 AM - edited 09-16-2022 04:48 AM
Hi,
Three days ago I started testing Cloudera Hadoop HA on CDH 5.11 using 4 nodes, each with 8 GB RAM and 4 cores (Google Compute Engine instances).
All nodes were defined as data nodes.
During these tests I restarted nodes several times.
After a while, one node failed to start the Impala services, then two, and finally all of them.
Whether I restarted the whole Impala service or a single instance, the service failed to start, with no informative error message in the Impala log files or the agent logs.
cloudera-scm-server log:
2017-06-22 11:50:26,011 INFO CommandPusher:com.cloudera.cmf.service.GenericBringUpRoleCommand: BringUp command (119) has finished on service impala for role 61/impala-IMPALAD-9cfb5b1ab405f5aa4093cb4531cd05dd, with status FAILURE and message MessageWithArgs{messageId=message.command.role.bringUp.supervisor.fatal, args=[]}
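For anyone triaging the same symptom, here is a minimal sketch for pulling failed BringUp commands out of the server log. It assumes every relevant line has the same layout as the single line quoted above; the regular expression and the helper names (`parse_bringup`, `failed_bringups`) are illustrative only and not part of any Cloudera tool.

```python
import re

# Pattern modeled on the one log line quoted in this post.
# Assumption: other BringUp lines in cloudera-scm-server's log share this layout.
BRINGUP_RE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) .*?"
    r"BringUp command \((?P<command_id>\d+)\) has finished on service "
    r"(?P<service>\S+) for role (?P<role>\S+), with status (?P<status>[A-Z]+) "
    r"and message MessageWithArgs\{messageId=(?P<message_id>[^,}]+)"
)


def parse_bringup(line):
    """Return the fields of a BringUp log line as a dict, or None if it does not match."""
    match = BRINGUP_RE.search(line)
    return match.groupdict() if match else None


def failed_bringups(lines):
    """Yield parsed BringUp entries whose status is FAILURE."""
    for line in lines:
        parsed = parse_bringup(line)
        if parsed and parsed["status"] == "FAILURE":
            yield parsed


# The line from this post, used as sample input.
SAMPLE = (
    "2017-06-22 11:50:26,011 INFO CommandPusher:"
    "com.cloudera.cmf.service.GenericBringUpRoleCommand: BringUp command (119) "
    "has finished on service impala for role "
    "61/impala-IMPALAD-9cfb5b1ab405f5aa4093cb4531cd05dd, with status FAILURE "
    "and message MessageWithArgs{messageId="
    "message.command.role.bringUp.supervisor.fatal, args=[]}"
)

if __name__ == "__main__":
    for entry in failed_bringups([SAMPLE]):
        print(entry["timestamp"], entry["role"], entry["message_id"])
```

Feeding the lines of the server log file into `failed_bringups` would list which roles failed and when, which makes it easier to match each failure against the agent and role logs from the same timestamps.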
cloudera-scm-agent log - nothing significant.
I deleted the Impala service and created it again. That did not solve the problem.
I then deleted the services and restarted all machines; again, this did not help.
I uninstalled Cloudera on all machines following this guide:
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_ig_uninstall_cm.html
I restarted all machines and created the cluster again, but it failed while creating the Impala services.
All other services are working with no issues (HDFS, YARN, ZooKeeper, Oozie).
What is going on?
Am I missing something?
I have a QA environment with failed Impala instances and no remedy.
I'm afraid of hitting this problem in production.
I appreciate your help.
Many thanks
Alon