Error generating aggregated logs for Spark Applications on Cloudera CDP 7.2.18

Explorer

Hi everyone,

I am having an issue where aggregated logs are not being generated or saved for Spark jobs.
The error below is reported, but I don't know what is causing it.
The user has the required permissions for ADLS in Ranger.

Failed to setup application log directory for application_1751535521169_0953
Failed to acquire a SAS token for get-status on /oplogs/yarn-app-logs/csso_luis.simoes/bucket-logs-ifile/0953/application_1751535521169_0953 due to org.apache.hadoop.security.AccessControlException: org.apache.ranger.raz.intg.RangerRazException: <!doctype html><html lang="en"><head><title>HTTP Status 401 – Unauthorized</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 401 – Unauthorized</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Message</b> Authentication required</p><p><b>Description</b> The request has not been applied to the target resource because it lacks valid authentication credentials for that resource.</p><hr class="line" /><h3>Apache Tomcat/8.5.100</h3></body></html>; HttpStatus: 401
	at org.apache.hadoop.fs.azurebfs.services.AbfsClient.appendSASTokenToQuery(AbfsClient.java:1233)
	at org.apache.hadoop.fs.azurebfs.services.AbfsClient.appendSASTokenToQuery(AbfsClient.java:1199)
	at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getPathStatus(AbfsClient.java:905)
	at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getFileStatus(AzureBlobFileSystemStore.java:1007)
	at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:729)
	at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:719)
	at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.checkExists(LogAggregationFileController.java:530)
	at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController$1.run(LogAggregationFileController.java:479)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
	at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.createAppDir(LogAggregationFileController.java:460)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:273)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:223)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:366)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:69)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:267)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:157)
	at java.lang.Thread.run(Thread.java:750)

Any help would be very much appreciated.

5 REPLIES

Contributor

Hello @LSIMS

The log shows an authentication failure: the request for a SAS token was rejected with "HTTP Status 401 – Unauthorized" because no valid credentials were presented.

This can also happen when a valid Kerberos ticket is not being passed.

Something you can try is adding the setting below under "YARN Service Advanced Configuration Snippet (Safety Valve) for core-site.xml" on the YARN Configuration tab.

Name: hadoop.kerberos.keytab.login.autorenewal.enabled
Value: true

Then restart any stale services and retry. That should resolve the issue.
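
For reference, the name/value pair above corresponds to a standard Hadoop property element. A minimal sketch, assuming the safety valve is edited in its XML view instead of through the separate Name and Value fields:

```xml
<!-- Assumed XML form of the setting above, for the core-site.xml safety valve -->
<property>
  <name>hadoop.kerberos.keytab.login.autorenewal.enabled</name>
  <value>true</value>
</property>
```

Either form should produce the same entry in the core-site.xml that is deployed to the YARN roles after the restart.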


Regards,
Andrés Fallas
--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs-up button.

Explorer
But isn't this supposed to be configured out of the box?
Doesn't log aggregation use the service principals and built-in authentication?


Contributor

It does use the service principals; the problem is that the auto-renewal is not happening in time.
Adding the setting forces the renewal to happen.


Regards,
Andrés Fallas

Explorer
So does this mean that the default configuration is the most correct configuration?
I would assume the defaults are supposed to ensure that core functionality like this works 100% of the time.

Contributor

Default configurations are the tested ones, which is why they are set that way. They are configurable for a reason, though: depending on the environment, the use case, and other factors, they sometimes need to be tuned in a specific way.


Regards,
Andrés Fallas