Member since: 08-25-2018 | Posts: 4 | Kudos Received: 0 | Solutions: 0
09-23-2018
10:33 PM
Hi all, when enabling Kerberos on a new cluster (after restarting the failed installation) I got the error message: Generate Missing Credentials Command /usr/share/cmf/bin/gen_credentials.sh failed with exit code 1 and output of <<
+ export PATH=/usr/kerberos/bin:/usr/kerberos/sbin:/usr/lib/mit/sbin:/usr/sbin:/usr/lib/mit/bin:/usr/bin:/sbin:/usr/sbin:/bin:/usr/bin
+ PATH=/usr/kerberos/bin:/usr/kerberos/sbin:/usr/lib/mit/sbin:/usr/sbin:/usr/lib/mit/bin:/usr/bin:/sbin:/usr/sbin:/bin:/usr/bin
+ CMF_REALM=HADM.RU
+ KEYTAB_OUT=/var/run/cloudera-scm-server/cmf5888901524077791261.keytab
+ PRINC=mapred/ip-172-31-46-169.us-west-2.compute.internal@HADM.RU
+ MAX_RENEW_LIFE=604800
+ KADMIN='kadmin -k -t /var/run/cloudera-scm-server/cmf5922922234613877041.keytab -p cloudera-scm/admin@HADM.RU -r HADM.RU'
+ RENEW_ARG=
+ '[' 604800 -gt 0 ']'
+ RENEW_ARG='-maxrenewlife "604800 sec"'
+ '[' -z /etc/krb5.conf ']'
+ echo 'Using custom config path '\''/etc/krb5.conf'\'', contents below:'
+ cat /etc/krb5.conf
+ kadmin -k -t /var/run/cloudera-scm-server/cmf5922922234613877041.keytab -p cloudera-scm/admin@HADM.RU -r HADM.RU -q 'addprinc -maxrenewlife "604800 sec" -randkey mapred/ip-172-31-46-169.us-west-2.compute.internal@HADM.RU'
WARNING: no policy specified for mapred/ip-172-31-46-169.us-west-2.compute.internal@HADM.RU; defaulting to no policy
add_principal: Principal or policy already exists while creating "mapred/ip-172-31-46-169.us-west-2.compute.internal@HADM.RU".
+ '[' 604800 -gt 0 ']'
++ kadmin -k -t /var/run/cloudera-scm-server/cmf5922922234613877041.keytab -p cloudera-scm/admin@HADM.RU -r HADM.RU -q 'getprinc -terse mapred/ip-172-31-46-169.us-west-2.compute.internal@HADM.RU'
++ tail -1
++ cut -f 12
+ RENEW_LIFETIME=0
+ '[' 0 -eq 0 ']'
+ echo 'Unable to set maxrenewlife'
+ exit 1
>> Environment: RedHat 7.5 Linux, Cloudera Manager 5.12.1
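The trace above shows why the script bails out: addprinc reports the principal already exists, and the subsequent getprinc returns a renewable lifetime of 0, so the script prints "Unable to set maxrenewlife" and exits 1. That usually means renewable tickets are disabled at the realm level on the KDC. A minimal sketch of the kind of change typically made on the KDC host (the 604800-second value just mirrors MAX_RENEW_LIFE from the script; adjust to your own policy):

# Allow renewable tickets for the realm's ticket-granting principal and
# for the CM admin principal that gen_credentials.sh authenticates as.
sudo kadmin.local -q 'modprinc -maxrenewlife "604800 sec" krbtgt/HADM.RU@HADM.RU'
sudo kadmin.local -q 'modprinc -maxrenewlife "604800 sec" cloudera-scm/admin@HADM.RU'

# Also check that /var/kerberos/krb5kdc/kdc.conf has something like
#   max_renewable_life = 7d
# then restart the KDC and re-run "Generate Missing Credentials" in Cloudera Manager.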
08-26-2018
10:00 AM
I restricted access to only my IP and doubled the RAM. I know that enabling Kerberos resolves that issue, but in previous versions (5.12, May-June) everything worked by default without problems on a smaller configuration. Thanks
08-26-2018
08:58 AM
Thanks for the response; I don't know what to do.
1. Completely new cluster setup: 4 nodes on Amazon AWS (idle), Cloudera 5.15.1, RedHat 7.5. Cloudera SCM with all main roles on t2.x2large (16 GB RAM); data nodes on t2.medium (4 GB RAM). The cluster is new, without load.
2. The cluster is exposed to the internet on ports 22, 50070, 19888, 8042, 7180, 8020, 7432, 7182-7183, 8088, and ICMP. I use inbound/outbound rules for the security group on AWS (apparently not enough).
3. [ec2-user@ip-172-31-35-169 ~]$ sudo -u yarn crontab -l
no crontab for yarn
Thanks
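Since the dr.who jobs reported in threads like this are generally submitted through the unauthenticated YARN REST API rather than through cron, an empty yarn crontab is not conclusive; it can be worth scanning every local account as well. A rough, generic sketch (not specific to this cluster):

# List crontab entries for all local users; injected entries sometimes
# hide under accounts other than yarn.
for u in $(cut -d: -f1 /etc/passwd); do
    entries=$(sudo crontab -l -u "$u" 2>/dev/null)
    [ -n "$entries" ] && printf '=== %s ===\n%s\n' "$u" "$entries"
done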
08-25-2018
11:01 PM
Hi hadoopNoob, I have the same problem with dr.who and unexpected exits. Could you give an idea (links or simple guidance) of how you resolved it? Thanks
6:00:25.009 AM INFO NodeStatusUpdaterImpl Registered with ResourceManager as ip-172-31-42-197.us-west-2.compute.internal:8041 with total resource of <memory:1024, vCores:2>
6:00:25.009 AM INFO NodeStatusUpdaterImpl Notifying ContainerManager to unblock new container-requests
6:00:25.318 AM ERROR RecoveredContainerLaunch Unable to recover container container_1535261132868_0172_01_000001
java.io.IOException: Timeout while waiting for exit code from container_1535261132868_0172_01_000001
    at org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor.reacquireContainer(ContainerExecutor.java:199)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.RecoveredContainerLaunch.call(RecoveredContainerLaunch.java:83)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.RecoveredContainerLaunch.call(RecoveredContainerLaunch.java:46)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
6:00:25.325 AM WARN RecoveredContainerLaunch Recovered container exited with a non-zero exit code 154
6:00:25.329 AM INFO Container Container container_1535261132868_0172_01_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
6:00:25.329 AM INFO ContainerLaunch Cleaning up container container_1535261132868_0172_01_000001
6:00:25.416 AM INFO DefaultContainerExecutor Deleting absolute path : /yarn/nm/usercache/dr.who/appcache/application_1535261132868_0172/container_1535261132868_0172_01_000001
6:00:25.418 AM WARN NMAuditLogger USER=dr.who OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE APPID=application_1535261132868_0172 CONTAINERID=container_1535261132868_0172_01_000001
6:00:25.423 AM INFO Container Container container_1535261132868_0172_01_000001 transitioned from EXITED_WITH_FAILURE to DONE
6:00:25.423 AM INFO Application Removing container_1535261132868_0172_01_000001 from application application_1535261132868_0172
6:00:25.424 AM INFO AppLogAggregatorImpl Considering container container_1535261132868_0172_01_000001 for log-aggregation
6:00:25.424 AM INFO AuxServices Got event CONTAINER_STOP for appId application_1535261132868_0172
6:00:25.452 AM INFO ContainersMonitorImpl Starting resource-monitoring for container_1535261132868_0172_01_000001
6:00:25.454 AM INFO ContainersMonitorImpl Stopping resource-monitoring for container_1535261132868_0172_01_000001
6:00:26.429 AM INFO NodeStatusUpdaterImpl Removed completed containers from NM context: [container_1535261132868_0172_01_000001]
6:00:26.491 AM INFO ContainerManagerImpl Start request for container_1535261132868_0174_01_000001 by user dr.who
6:00:26.492 AM INFO ContainerManagerImpl Creating a new application reference for app application_1535261132868_0174
6:00:26.500 AM INFO Application Application application_1535261132868_0174 transitioned from NEW to INITING
6:00:26.505 AM INFO AppLogAggregatorImpl rollingMonitorInterval is set as -1. The log rolling monitoring interval is disabled. The logs will be aggregated after this application is finished.
6:00:26.519 AM INFO NMAuditLogger USER=dr.who IP=172.31.35.169 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1535261132868_0174 CONTAINERID=container_1535261132868_0174_01_000001
6:00:26.545 AM INFO Application Adding container_1535261132868_0174_01_000001 to application application_1535261132868_0174
6:00:26.545 AM INFO Application Application application_1535261132868_0174 transitioned from INITING to RUNNING
6:00:26.546 AM INFO Container Container container_1535261132868_0174_01_000001 transitioned from NEW to LOCALIZED
6:00:26.546 AM INFO AuxServices Got event CONTAINER_INIT for appId application_1535261132868_0174
6:00:26.655 AM INFO Container Container container_1535261132868_0174_01_000001 transitioned from LOCALIZED to RUNNING
6:00:26.678 AM INFO DefaultContainerExecutor launchContainer: [bash, /yarn/nm/usercache/dr.who/appcache/application_1535261132868_0174/container_1535261132868_0174_01_000001/default_container_executor.sh]
6:00:26.788 AM WARN DefaultContainerExecutor Exit code from container container_1535261132868_0174_01_000001 is : 143
6:00:26.789 AM INFO Container Container container_1535261132868_0174_01_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
6:00:26.789 AM INFO ContainerLaunch Cleaning up container container_1535261132868_0174_01_000001
6:00:26.851 AM INFO DefaultContainerExecutor Deleting absolute path : /yarn/nm/usercache/dr.who/appcache/application_1535261132868_0174/container_1535261132868_0174_01_000001
6:00:26.865 AM WARN NMAuditLogger USER=dr.who OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE APPID=application_1535261132868_0174 CONTAINERID=container_1535261132868_0174_01_000001
6:00:26.865 AM INFO Container Container container_1535261132868_0174_01_000001 transitioned from EXITED_WITH_FAILURE to DONE
6:00:26.866 AM INFO Application Removing container_1535261132868_0174_01_000001 from application application_1535261132868_0174
6:00:26.866 AM INFO AppLogAggregatorImpl Considering container container_1535261132868_0174_01_000001 for log-aggregation
6:00:26.866 AM INFO AuxServices Got event CONTAINER_STOP for appId application_1535261132868_0174
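For what it's worth, containers launched as dr.who on an internet-exposed cluster usually mean anonymous jobs were submitted through the YARN ResourceManager REST API (port 8088). A hedged sketch of how such applications are typically listed and killed with the standard YARN CLI (the application id below is just the one from the log above; kill whatever -list actually shows):

# List applications still running or waiting to run
sudo -u yarn yarn application -list -appStates RUNNING,ACCEPTED

# Kill a rogue application by id (example id copied from the log)
sudo -u yarn yarn application -kill application_1535261132868_0174

# Longer term: remove 8088/8042/50070 from the security group's public
# inbound rules and/or enable Kerberos so anonymous submissions are rejected.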