I'm facing a new issue on our Kerberized cluster:
I submit a coordinator job to the Oozie server, which schedules a workflow execution once a day.
Before submitting the coordinator job to Oozie, I have to run the "kinit" command. This is the expected behavior: the command obtains a Kerberos ticket that is used to authenticate against the Oozie server when I submit my job.
The tricky part: my Oozie workflow runs a Java action that submits new "sub-workflows" to Oozie and monitors their execution through the Oozie Java client API (org.apache.oozie.client.OozieClient.getJobInfo()).
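To make the monitoring pattern concrete, here is a simplified sketch of such a polling loop. It is not the actual job code. The class name, the local Status enum (which only mirrors the constant names of org.apache.oozie.client.WorkflowJob.Status) and the statusSource supplier are all hypothetical, so that the sketch runs without the Oozie client jar. In the real action, the supplier would presumably wrap a call such as client.getJobInfo(jobId).getStatus(), and each such call is a request to the Oozie server that has to be authenticated.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

public class SubWorkflowMonitor {

    /** Hypothetical stand-in; mirrors the names in org.apache.oozie.client.WorkflowJob.Status. */
    enum Status { PREP, RUNNING, SUSPENDED, SUCCEEDED, KILLED, FAILED }

    /** A sub-workflow is considered finished once it reaches one of these states. */
    static boolean isTerminal(Status s) {
        return s == Status.SUCCEEDED || s == Status.KILLED || s == Status.FAILED;
    }

    /**
     * Polls statusSource until a terminal status is seen or maxPolls is reached.
     * statusSource is an assumption of this sketch: in the real Java action it
     * would wrap something like client.getJobInfo(jobId).getStatus().
     */
    static Status waitForCompletion(Supplier<Status> statusSource, int maxPolls, long sleepMillis)
            throws InterruptedException {
        Status last = statusSource.get();
        for (int i = 1; i < maxPolls && !isTerminal(last); i++) {
            Thread.sleep(sleepMillis);
            last = statusSource.get();
        }
        return last;
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake status sequence standing in for successive answers from the Oozie server.
        Iterator<Status> fake =
                Arrays.asList(Status.PREP, Status.RUNNING, Status.SUCCEEDED).iterator();
        Status result = waitForCompletion(fake::next, 10, 10L);
        System.out.println("final status: " + result);
    }
}
```

The point of the sketch is that the action keeps talking to the Oozie server for as long as the sub-workflows run, so whatever credential it uses has to stay valid for that whole period.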
Everything had been working fine for more than 10 days, but it unexpectedly failed last night because of a Kerberos ticket expiration. Here is what I found in the mapper logs:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.JavaMain], main() threw exception, IO_ERROR : java.io.IOException: Error while connecting Oozie server. No of retries = 1. Exception = Could not authenticate, GSSException: No valid credentials provided (Mechanism level: Ticket expired (32))
org.apache.oozie.action.hadoop.JavaMainException: IO_ERROR : java.io.IOException: Error while connecting Oozie server. No of retries = 1. Exception = Could not authenticate, GSSException: No valid credentials provided (Mechanism level: Ticket expired (32))
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.security.AccessController.doPrivileged(Native Method)
Caused by: IO_ERROR : java.io.IOException: Error while connecting Oozie server. No of retries = 1. Exception = Could not authenticate, GSSException: No valid credentials provided (Mechanism level: Ticket expired (32))
I understand that the Oozie server basically submits a map/reduce job and that the mappers execute my Java code.
I also read that if this Java action has to authenticate against the Hadoop cluster, it has to retrieve the Oozie delegation token through the "HADOOP_TOKEN_FILE_LOCATION" environment variable (populated by Oozie?).
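One way to verify this from inside the action would be to print what the launcher process actually sees. The following is a minimal diagnostic sketch using only the JDK. The class and method names are hypothetical, and it only checks that the variable is set and points to a readable file; it does not parse the token file or tell which services the tokens are valid for.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TokenFileCheck {

    /**
     * Describes the state of the token file location passed in.
     * The location is expected to be the value of HADOOP_TOKEN_FILE_LOCATION.
     */
    static String describe(String location) {
        if (location == null || location.isEmpty()) {
            return "HADOOP_TOKEN_FILE_LOCATION is not set";
        }
        Path p = Paths.get(location);
        if (!Files.isReadable(p)) {
            return "token file is not readable: " + p;
        }
        return "token file found: " + p;
    }

    public static void main(String[] args) {
        // Read the variable from the environment of the running action.
        System.out.println(describe(System.getenv("HADOOP_TOKEN_FILE_LOCATION")));
    }
}
```

Calling describe(...) at the start of the Java action's main() would at least show whether a token file is available to the mapper at all.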
However, it seems that my Java code is "somehow" able to authenticate against the Oozie server to submit sub-workflows (it ran successfully for 10 days). How does that work? The Kerberos ticket is not propagated to the mappers, so how is it possible?
The stack trace above clearly mentions a "ticket expired" error, so the mapper seems to use a Kerberos ticket. Which one? Where does it come from?
I'm a bit confused about Kerberos tickets and delegation tokens. I thought that no Kerberos ticket was needed (except for the very first coordinator submission) and that everything relied on delegation tokens afterward. Did I miss something?