Created 03-05-2025 11:04 PM
Hi Team,
We are looking for your help to troubleshoot and fix the issue below.
Issue description: All YARN jobs fail at submission with the error below after enabling Ozone.
Error:
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1741072771401_0004 to YARN : Failed to renew token: Kind: OzoneToken, Service: <host ip>:9862, Ident: (OzoneToken owner=hdfs/<hostname>@PLATFORMKRB.COM, renewer=yarn, realUser=, issueDate=2025-03-06T06:57:54.532Z, maxDate=2025-03-13T06:57:54.532Z, sequenceNumber=13, masterKeyId=1, strToSign=null, signature=null, awsAccessKeyId=null, omServiceId=<hostname>, omCertSerialId=6)
Regards,
Sushant
Created 03-06-2025 08:06 AM
@Jaguar Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our Ozone expert @Devesh who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres
Created 03-06-2025 09:34 PM
Possible Causes & Solutions:
1. Token Expired
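The error message shows the token’s maxDate=2025-03-13T06:57:54.532Z, seven days after its issueDate. maxDate is the token’s maximum lifetime: it cannot be renewed past March 13, 2025. If the token had already expired before the renewal attempt, or the renewal itself was handled incorrectly, this failure would result.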
Solution:
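Check that the Kerberos ticket cache is current (klist lists Kerberos tickets, not delegation tokens):
klist -e
If a delegation token saved to a file has expired, renew it manually as the renewer named in the token (here, yarn):
hdfs fetchdt --renew <token-file>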
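As a rough way to rule expiry in or out, the dates can be read straight from the token identifier in the error message. The sketch below is only an illustration: it assumes the identifier text keeps the issueDate=/maxDate= layout shown in the original post, and the function names are made up for this example.

```python
import re
from datetime import datetime, timezone

# Token identifier copied (shortened) from the error message in the original post.
ERROR_LINE = (
    "Ident: (OzoneToken owner=hdfs/<hostname>@PLATFORMKRB.COM, renewer=yarn, "
    "realUser=, issueDate=2025-03-06T06:57:54.532Z, "
    "maxDate=2025-03-13T06:57:54.532Z, sequenceNumber=13, masterKeyId=1)"
)

def parse_token_dates(text):
    """Return (issue_date, max_date) as timezone-aware UTC datetimes."""
    dates = {}
    for key in ("issueDate", "maxDate"):
        match = re.search(key + r"=([0-9T:.\-]+)Z", text)
        if match is None:
            raise ValueError(f"{key} not found in token identifier")
        parsed = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
        dates[key] = parsed.replace(tzinfo=timezone.utc)
    return dates["issueDate"], dates["maxDate"]

def past_max_lifetime(text, now=None):
    """True if `now` is later than the token's maxDate."""
    now = now or datetime.now(timezone.utc)
    _, max_date = parse_token_dates(text)
    return now > max_date

issue_date, max_date = parse_token_dates(ERROR_LINE)
print("issued:           ", issue_date.isoformat())
print("max lifetime ends:", max_date.isoformat())
print("past max lifetime:", past_max_lifetime(ERROR_LINE))
```

If the failure occurs well before maxDate, as the dates in the post suggest, plain expiry is an unlikely cause and the other items are worth checking first.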
2. Incorrect or Missing Kerberos Credentials
If the application is running on a Kerberized cluster, YARN must have a valid Kerberos ticket.
Solution:
Ensure the Kerberos ticket is valid:
klist
If the ticket is expired, re-authenticate:
kinit -kt /etc/security/keytabs/yarn.service.keytab yarn/<hostname>@PLATFORMKRB.COM
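The check-then-reauthenticate step can be scripted. This is a minimal sketch, assuming MIT Kerberos client tools are on the PATH; the keytab path and principal are the ones quoted above, and ensure_ticket is a name invented for this example.

```python
import subprocess

KEYTAB = "/etc/security/keytabs/yarn.service.keytab"  # path quoted in this reply
PRINCIPAL = "yarn/<hostname>@PLATFORMKRB.COM"         # replace <hostname> before use

def ensure_ticket(run=subprocess.run):
    """Return 'valid' if a usable ticket is cached, otherwise kinit and return 'renewed'."""
    # `klist -s` prints nothing; it exits 0 only if the cache is readable and not expired.
    if run(["klist", "-s"]).returncode == 0:
        return "valid"
    # check=True raises CalledProcessError if kinit fails (bad keytab, wrong principal).
    run(["kinit", "-kt", KEYTAB, PRINCIPAL], check=True)
    return "renewed"
```

Calling ensure_ticket() as the yarn service user on the affected host would report whether a fresh kinit was needed. The run parameter exists only so the logic can be exercised without a KDC.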
3. Incorrect Token Renewer Principal
The error message includes:
renewer=yarn
This means the token was issued for YARN to renew, but YARN may not have the required permissions.
Solution:
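Ensure the renewer principal maps to the local yarn user in core-site.xml. A service principal of the form yarn/<hostname>@PLATFORMKRB.COM has two components, so the rule must start with [2:...]; a [1:...] rule only applies to single-component principals such as yarn@PLATFORMKRB.COM:
<property>
<name>hadoop.security.auth_to_local</name>
<value>RULE:[2:$1@$0](yarn@PLATFORMKRB.COM)s/.*/yarn/</value>
</property>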
Restart YARN after making changes.
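To see how the component count in an auth_to_local rule decides whether it applies, here is a simplified emulation of a single rule. It is a sketch of the documented rule semantics, not Hadoop's own code: it handles only the RULE:[n:format](regex)s/pattern/replacement/ form, and node1.example.com is a placeholder hostname.

```python
import re

RULE_RE = re.compile(r"RULE:\[(\d+):([^\]]+)\]\((.+?)\)s/(.*?)/(.*?)/(g?)$")

def apply_rule(rule, principal):
    """Return the short name the rule yields, or None if the rule does not apply."""
    m = RULE_RE.match(rule)
    if m is None:
        raise ValueError("unsupported rule syntax: " + rule)
    count, fmt, accept, pattern, repl, glob = m.groups()
    name, _, realm = principal.partition("@")
    components = name.split("/")
    # A rule applies only to principals with exactly `count` components.
    if len(components) != int(count):
        return None
    # $0 is the realm; $1, $2, ... are the principal's components.
    values = [realm] + components
    built = re.sub(r"\$(\d)", lambda d: values[int(d.group(1))], fmt)
    # The parenthesised regex must match the whole formatted string.
    if re.fullmatch(accept, built) is None:
        return None
    # Without a trailing 'g', only the first match is replaced.
    return re.sub(pattern, repl, built, count=0 if glob else 1)

one = "RULE:[1:$1@$0](yarn@PLATFORMKRB.COM)s/.*/yarn/"
two = "RULE:[2:$1@$0](yarn@PLATFORMKRB.COM)s/.*/yarn/"
service = "yarn/node1.example.com@PLATFORMKRB.COM"  # placeholder hostname
print(apply_rule(one, service))  # None: two components, rule expects one
print(apply_rule(two, service))  # yarn
```

On a cluster host, the real mapping can be checked with hadoop org.apache.hadoop.security.HadoopKerberosName <principal>, which prints the short name produced by the configured rules.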
4. Misconfigured Ozone Token in YARN
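If the Ozone token is not being renewed properly, you may need to obtain a fresh one.
Solution:
Try manually obtaining a new Ozone token. <token-file> is an output path of your choosing; fetchdt writes the token to it:
hdfs fetchdt -fs o3fs://<bucket>.<volume>.<host>:9862/ <token-file>
Pass the token explicitly when running the application, for example by setting the HADOOP_TOKEN_FILE_LOCATION environment variable to that file.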
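The fetch-and-pass steps might be wired together roughly as follows. This is a sketch under several assumptions: the URI uses the o3fs://<bucket>.<volume>.<om-host>:<port>/ form, the token is requested with yarn as renewer via fetchdt's --renewer option, and the application picks the token up through HADOOP_TOKEN_FILE_LOCATION. The helper names are invented for this example.

```python
import os
import subprocess

def o3fs_uri(bucket, volume, om_host, port=9862):
    """Build an o3fs URI; 9862 is the OM port shown in the error message."""
    return f"o3fs://{bucket}.{volume}.{om_host}:{port}/"

def fetch_token(uri, token_file, renewer="yarn", run=subprocess.run):
    """Ask fetchdt to write a delegation token for `uri` into token_file."""
    run(["hdfs", "fetchdt", "-fs", uri, "--renewer", renewer, token_file], check=True)
    return token_file

def submit_with_token(command, token_file, run=subprocess.run):
    """Run `command` with HADOOP_TOKEN_FILE_LOCATION pointing at token_file."""
    env = dict(os.environ)
    # Hadoop clients load tokens from the file named by this variable at login.
    env["HADOOP_TOKEN_FILE_LOCATION"] = os.path.abspath(token_file)
    return run(command, env=env, check=True)
```

For example, fetch_token(o3fs_uri("<bucket>", "<volume>", "<host>"), "/tmp/ozone.token") followed by submit_with_token([...], "/tmp/ozone.token") with your usual submit command. Whether this avoids the renewal failure depends on why YARN cannot renew the token in the first place.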
5. OM (Ozone Manager) Certificate or Service ID Issues
The error mentions omServiceId=<hostname> and omCertSerialId=6. If the OM certificate has expired or the service ID is incorrect, it can prevent token renewal.
Solution:
Check if the OM certificate is valid:
ozone admin cert list
If expired, renew the certificates and restart OM.
6. Mismatched Ozone and YARN Versions
If the Ozone and YARN versions are incompatible, token renewal may fail.
Solution:
Check the Ozone and YARN versions:
ozone version
yarn version
Ensure they are compatible.
Next Steps:
a. Check if the Kerberos ticket is valid (klist).
b. Try renewing the token manually (hdfs fetchdt --renew <token-file>).
c. Ensure the correct renewer principal is configured in YARN.
d. Verify Ozone Manager certificates (ozone admin cert list).
e. Restart YARN and OM after making changes.
Created 03-12-2025 09:08 AM
@Jaguar Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
Regards,
Diana Torres
Created 03-12-2025 11:43 PM
Hi Team,
4. Misconfigured Ozone Token in YARN
hdfs fetchdt -fs o3fs://<bucket>.<volume>/<host>:9862 <token-file>
Question: Where do we get this token file (<token-file>)?
5. OM (Ozone Manager) Certificate or Service ID Issues
ozone admin cert list
Question: We see that a few certificates have expired. Which commands renew them?
Regards,
Sushant
Created 03-25-2025 10:31 AM
@Devesh Hi! Do you have some insights here? Thanks!
Regards,
Diana Torres
Created 03-31-2025 05:01 PM
@sathishkr @willx Hi! Do you have some insights here? Thanks!
Regards,
Diana Torres