I'm trying to increase the YARN application log retention in my cluster and have set the following parameters:
yarn.log-aggregation-enable is set to true
spark.eventLog.enabled is set to true
yarn.log-aggregation.retain-seconds is set to 7 days
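For reference, yarn.log-aggregation.retain-seconds takes its value in seconds, so a quick sanity check of the 7-day figure looks like the sketch below. RETENTION_DAYS is just an illustrative name for the target stated above, not a YARN setting.

```python
from datetime import timedelta

# Assumption: the retention target is exactly 7 days, as stated in the post.
RETENTION_DAYS = 7

# Convert the retention window to seconds, the unit the property expects.
retain_seconds = int(timedelta(days=RETENTION_DAYS).total_seconds())

print(f"yarn.log-aggregation.retain-seconds = {retain_seconds}")
```

If the value configured in the cluster differs from the number printed here, the effective retention is not the intended 7 days.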
However, in the ResourceManager I can see application logs for only the last 3 days, and from the ResourceManager application page in CM I can access logs only from the last day.
Can you please help with these two issues?
I see that the permissions on /tmp/logs are 770. When I run the command yarn logs -applicationId as the application owner, I can see the logs from the CLI, but no one can see these logs from the UI.
The group for the directory is cloudera-scm, and I don't have an LDAP entry for this group. How should I let users access these logs from the ResourceManager UI?
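To make the 770 mode concrete, here is a minimal sketch that decodes it using the POSIX-style owner/group/other model. It only interprets the permission bits; it does not account for HDFS ACLs, superuser access, or which identity the UI actually uses to read the logs, so those remain open assumptions. The names DIR_MODE and can_list are illustrative only.

```python
import stat

# Mode reported for /tmp/logs in the post, with the directory bit added for display.
DIR_MODE = stat.S_IFDIR | 0o770

# (read bit, execute/traverse bit) for each permission class.
CLASS_BITS = {
    "owner": (stat.S_IRUSR, stat.S_IXUSR),
    "group": (stat.S_IRGRP, stat.S_IXGRP),
    "other": (stat.S_IROTH, stat.S_IXOTH),
}


def can_list(mode: int, klass: str) -> bool:
    """Return True if the class may both read and traverse the directory."""
    read_bit, exec_bit = CLASS_BITS[klass]
    return bool(mode & read_bit) and bool(mode & exec_bit)


print(stat.filemode(DIR_MODE))
for klass in CLASS_BITS:
    print(klass, can_list(DIR_MODE, klass))
```

Under this model, only the owner and members of the cloudera-scm group can list the directory, so any user who falls into the "other" class would be denied at the top-level directory.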
I notice that yarn.resourcemanager.max-completed-applications is set to 10,000, which is roughly the number of applications run in 2-3 days.
I want to increase the number to 40,000 to cover the 7-day retention. What other changes should I take into consideration? Should I increase the Java heap size for the ResourceManager? Should I change any other configuration at the NodeManager level? What should I monitor after increasing max completed applications to 40,000? How will this impact ResourceManager recovery performance?
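The arithmetic behind the 40,000 figure can be sketched as below. It assumes the application rate stays roughly constant and that 10,000 completed applications really do correspond to 2-3 days, as estimated above; required_limit is a hypothetical helper, not part of YARN.

```python
import math

CURRENT_LIMIT = 10_000   # current yarn.resourcemanager.max-completed-applications
DAYS_COVERED = (2, 3)    # the post estimates this limit spans roughly 2-3 days
RETENTION_DAYS = 7       # target retention window
PROPOSED_LIMIT = 40_000  # value proposed in the post


def required_limit(apps: int, days_covered: int, retention_days: int) -> int:
    """Scale the observed application count to the retention window, rounding up."""
    return math.ceil(apps * retention_days / days_covered)


# Slower rate: 10,000 applications spread over 3 days.
low = required_limit(CURRENT_LIMIT, DAYS_COVERED[1], RETENTION_DAYS)
# Faster rate: 10,000 applications spread over 2 days.
high = required_limit(CURRENT_LIMIT, DAYS_COVERED[0], RETENTION_DAYS)

print(f"estimated need for {RETENTION_DAYS} days: {low} to {high} applications")
print(f"headroom with {PROPOSED_LIMIT}: {PROPOSED_LIMIT - high}")
```

This only checks that the proposed limit covers the estimated volume; it says nothing about the heap or recovery cost of keeping that many completed applications, which is what the questions above are about.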
Thanks in advance.