Created 07-09-2025 04:03 AM
Hi everyone,
I am having an issue where aggregated logs are not being generated or saved for Spark jobs.
I am seeing the error below, but I don't know what could be causing it.
The user has permissions in Ranger for ADLS.
Failed to setup application log directory for application_1751535521169_0953
Failed to acquire a SAS token for get-status on /oplogs/yarn-app-logs/csso_luis.simoes/bucket-logs-ifile/0953/application_1751535521169_0953 due to org.apache.hadoop.security.AccessControlException: org.apache.ranger.raz.intg.RangerRazException: HTTP Status 401 – Unauthorized (Authentication required: the request has not been applied to the target resource because it lacks valid authentication credentials for that resource; Apache Tomcat/8.5.100); HttpStatus: 401
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.appendSASTokenToQuery(AbfsClient.java:1233)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.appendSASTokenToQuery(AbfsClient.java:1199)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getPathStatus(AbfsClient.java:905)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getFileStatus(AzureBlobFileSystemStore.java:1007)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:729)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:719)
    at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.checkExists(LogAggregationFileController.java:530)
    at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController$1.run(LogAggregationFileController.java:479)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
    at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.createAppDir(LogAggregationFileController.java:460)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:273)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:223)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:366)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:69)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:267)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:157)
    at java.lang.Thread.run(Thread.java:750)
Any help would be very much appreciated.
Created on 08-12-2025 10:46 AM - edited 08-12-2025 10:54 AM
Hello @LSIMS,
The logs are complaining about missing valid credentials; they report "HTTP Status 401 – Unauthorized".
This can also happen when a valid Kerberos token is not being passed along.
Something you can try is adding the setting below under "YARN Service Advanced Configuration Snippet (Safety Valve) for core-site.xml" on the YARN Configuration tab.
Name: hadoop.kerberos.keytab.login.autorenewal.enabled
Value: true
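For reference, the safety valve just injects that property into core-site.xml on the YARN hosts; the resulting entry should look roughly like this (a sketch, assuming a standard core-site.xml layout):

  <property>
    <!-- Let UGI automatically renew Kerberos TGTs obtained from keytab logins -->
    <name>hadoop.kerberos.keytab.login.autorenewal.enabled</name>
    <value>true</value>
  </property>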
Then restart any stale services and retry; log aggregation should work after that.
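One way to verify the fix (assuming the YARN CLI is available on a gateway host): submit a new job and try to fetch its aggregated logs with "yarn logs -applicationId <application_id>", using your own application ID. If the command prints the container logs, the NodeManagers were able to write to the aggregated log directory on ADLS again.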
Created 08-13-2025 05:15 AM
Created 08-14-2025 09:30 AM
They are using the service principals; it's the auto-renewal that is not happening on time.
Adding the setting forces that renewal to happen.
Created 08-15-2025 07:03 AM
Created 08-18-2025 08:13 AM
Default configurations are the tested ones, and for that reason they are set that way; but they are configurable for a reason. Sometimes, depending on the environment, the use case, and much more, they need to be tuned in a specific way.