Adding one more Oozie server failed in Ambari.

New Contributor

Hello, 

We have one Oozie server in our lower environment, and I was assigned to add a second one. I added the Oozie server in Ambari, but the installation got stuck due to keytab/Kerberos issues.

Error--> Execution of '/usr/bin/kinit -l 5m20s -c /var/lib/ambari-agent/tmp/oozie_alert_cc_7579 -kt /etc/security/keytabs/oozie.service.keytab oozie/FQDN; ' returned 1. kinit: Client 'oozie/FQDN' not found in Kerberos database while getting initial credentials

 

 

Any help? How do I fix this issue so I can add one more Oozie server?

 

Thanks in advance.

 

 

 

 

 


Cloudera Employee

Hi @Mdali ,

 

Based on the error message "'oozie/FQDN' not found in Kerberos database", it looks like creation of the Oozie Kerberos principal failed. Could you check the Ambari server logs from the time you tried to add the second Oozie server to identify the cause?

 

Thanks,

Prashanth Vishnu

New Contributor

@pvishnu 

Hi, thank you for the reply. Because I aborted the operation after 4 hours, no stderr output was generated for this issue. I checked the ambari-server log; during the installation, the messages below were generated continuously. It looks like Ambari kept retrying the install but failed each time.

 

Any idea?

 

 

 

StackAdvisorHelper:255 - Clear stack advisor caches, hosts: [FQDN]
2021-09-10 14:43:17,110 INFO [ambari-client-thread-381065] HostComponentResourceProvider:973 - Received a updateHostComponent request, clusterName=XXX_DEV, serviceName=OOZIE, componentName=OOZIE_SERVER, hostname=FQDN, request={ clusterName=XXX_DEV, serviceName=OOZIE, componentName=OOZIE_SERVER, hostname=FQDN publicHostname=null, desiredState=INSTALLED, state=null, desiredStackId=null, staleConfig=null, adminState=null, maintenanceState=null}
2021-09-10 14:43:17,110 INFO [ambari-client-thread-381065] HostComponentResourceProvider:697 - Handling update to host component, clusterName=XXX_DEV, serviceName=OOZIE, componentName=OOZIE_SERVER, hostname=FQDN, currentState=INIT, newDesiredState=INSTALLED
2021-09-10 14:43:17,353 INFO [ambari-action-scheduler] ServiceComponentHostImpl:1054 - Host role transitioned to a new state, serviceComponentName=OOZIE_SERVER, hostName=FQDN, oldState=INIT, currentState=INSTALLING
2021-09-10 14:43:17,364 INFO [ambari-action-scheduler] AgentCommandsPublisher:124 - AgentCommandsPublisher.sendCommands: sending ExecutionCommand for host FQDN, role OOZIE_SERVER, roleCommand INSTALL, and command ID 13530-0, task ID 56448
2021-09-10 14:44:07,441 ERROR [agent-report-processor-3] HeartbeatProcessor:516 - Operation failed - may be retried. Service component host: OOZIE_SERVER, host: FQDN Action id 13530-0 and taskId 56448
2021-09-10 14:44:07,442 INFO [agent-report-processor-3] ServiceComponentHostImpl:1054 - Host role transitioned to a new state, serviceComponentName=OOZIE_SERVER, hostName=FQDN, oldState=INSTALLING, currentState=INSTALL_FAILED
2021-09-10 14:44:07,742 INFO [ambari-action-scheduler] ActionDBAccessorImpl:227 - Aborting command. Hostname FQDN role KERBEROS_CLIENT requestId 13530 taskId 56450 stageId 2
2021-09-10 14:44:07,742 INFO [ambari-action-scheduler] ActionDBAccessorImpl:227 - Aborting command. Hostname FQDN role KERBEROS_CLIENT requestId 13530 taskId 56453 stageId 5

Cloudera Employee

Hi @Mdali ,


Could you ensure the KDC server is reachable from the Ambari server? If it isn't, the tasks may time out.

 

# ping <KDC host>
# telnet <KDC host> 88
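
If telnet is not installed on the Ambari host, roughly the same TCP check can be sketched with bash's /dev/tcp redirection. This is only a minimal sketch: the function name check_tcp_port is made up for illustration, the 5-second timeout is an arbitrary choice, and it tests TCP only, whereas a KDC may also be contacted over UDP on port 88.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Ambari): returns 0 if a TCP connection
# to host:port can be opened within 5 seconds, non-zero otherwise.
check_tcp_port() {
    host="$1"
    port="$2"
    # /dev/tcp/<host>/<port> is a bash feature, so bash is invoked explicitly.
    # "_" fills $0 so that host and port arrive as $1 and $2.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

Usage would look like: check_tcp_port <KDC host> 88 && echo "KDC port reachable"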

Also, search ambari-server.log for the keyword "CreatePrincipalsServerAction". Ideally, these are the messages you can expect when you add an Oozie server to the cluster:

 

-------------
14 Sep 2021 03:03:23,511 INFO [Server Action Executor Worker 2577] KerberosServerAction:359 - Processing identities...
14 Sep 2021 03:03:23,518 INFO [Server Action Executor Worker 2577] CreatePrincipalsServerAction:205 - Processing principal, oozie/<FQDN>@HADOOP.COM
14 Sep 2021 03:03:23,921 INFO [Server Action Executor Worker 2577] KerberosServerAction:463 - Processing identities completed.
-------------
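
To see whether Ambari attempted to create the principal at all, the log can be filtered on that keyword. The following is a minimal sketch, not an Ambari tool: the function name find_principal_events is hypothetical, and the log path in the usage note is the usual default location, which may differ on your installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the log lines where CreatePrincipalsServerAction
# handled a principal containing the given fixed string (for example "oozie/").
# Returns non-zero when no such line exists.
find_principal_events() {
    logfile="$1"
    pattern="$2"
    grep -F 'CreatePrincipalsServerAction' "$logfile" | grep -F -- "$pattern"
}
```

Usage would look like: find_principal_events /var/log/ambari-server/ambari-server.log "oozie/"
If nothing is printed for the new host, the principal creation step most likely never ran or never completed.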

 

Thanks,
Prashanth Vishnu

New Contributor

Hello Prashanth,

I was able to install Oozie; the problem was an ambari-agent permission issue. However, the Oozie server does not start. I tried to start it, but it failed. It looks like the tar command, which I believe Ambari runs automatically, is failing. I tried running the tar command manually, but it failed as well.

 

exception:

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

 Execute[('tar', '-xvf', u'/usr/hdp/current/oozie-server/oozie-sharelib.tar.gz', '-C', u'/usr/hdp/current/oozie-server')] {'not_if': "ambari-sudo.sh su oozie -l -s /bin/bash -c 'ls /hadoop/var/run/oozie/oozie.pid >/dev/null 2>&1 && ps -p `cat /hadoop/var/run/oozie/oozie.pid` >/dev/null 2>&1' || test -f /usr/hdp/current/oozie-server/.hashcode && test -d /usr/hdp/current/oozie-server/share", 'sudo': True}

Command failed after 1 tries

 

 

Any advice?

 

Thank you. 

Cloudera Employee

Hi @Mdali ,

 

Maybe the file "/usr/hdp/current/oozie-server/oozie-sharelib.tar.gz" is corrupted? Could you try copying the file from the other Oozie server, if it's the same version? Then try restarting again and let us know how it goes.
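
One way to confirm corruption before and after copying is to test the archive without extracting it. This is a minimal sketch under the assumption that gzip and tar are available on both hosts; the function name check_sharelib_archive is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 only if the gzip stream is complete and
# tar can list every member of the archive; nothing is extracted.
check_sharelib_archive() {
    archive="$1"
    gzip -t "$archive" 2>/dev/null && tar -tzf "$archive" >/dev/null 2>&1
}
```

Usage would look like: check_sharelib_archive /usr/hdp/current/oozie-server/oozie-sharelib.tar.gz || echo "archive is damaged"
Running the same check on the source host, and comparing sha256sum output on both hosts after the copy, shows whether the source file is itself truncated or the copy was cut short.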

 

Thanks,

Prashanth Vishnu

New Contributor

Hi @pvishnu,

I already did what you suggested above, but I got the same error.

Any further advice?

 

Thank you.