<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Manage YARN  local log-dirs  space in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168865#M29802</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We have enabled log aggregation in our Hadoop cluster, but we still see a lot of files stored locally on the individual nodes (/u01/hadoop/yarn/local/usercache/hive/appcache). I assume these files should be moved to HDFS once the job completes, but I am not sure whether this is happening. Is there a way to troubleshoot this, or is it safe to delete these files?&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Venkatesh S&lt;/P&gt;</description>
    <pubDate>Thu, 26 May 2016 17:45:07 GMT</pubDate>
    <dc:creator>vsivalingam</dc:creator>
    <dc:date>2016-05-26T17:45:07Z</dc:date>
    <item>
      <title>Manage YARN  local log-dirs  space</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168865#M29802</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We have enabled log aggregation in our Hadoop cluster, but we still see a lot of files stored locally on the individual nodes (/u01/hadoop/yarn/local/usercache/hive/appcache). I assume these files should be moved to HDFS once the job completes, but I am not sure whether this is happening. Is there a way to troubleshoot this, or is it safe to delete these files?&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Venkatesh S&lt;/P&gt;</description>
      <pubDate>Thu, 26 May 2016 17:45:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168865#M29802</guid>
      <dc:creator>vsivalingam</dc:creator>
      <dc:date>2016-05-26T17:45:07Z</dc:date>
    </item>
    <item>
      <title>Re: Manage YARN  local log-dirs  space</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168866#M29803</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/10733/vsivalingam.html" nodeid="10733"&gt;@Venkadesh Sivalingam&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Can you check the logs for errors? Check both the YARN and NodeManager (NM) logs.&lt;/P&gt;&lt;P&gt;A useful search term is org.apache.hadoop.yarn.logaggregation.&lt;/P&gt;&lt;P&gt;There are a few known log aggregation bugs that were fixed in HDP 2.2 and later:&lt;/P&gt;&lt;P&gt;&lt;A href="https://hortonworks.jira.com/browse/BUG-12006"&gt;BUG-12006&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://issues.apache.org/jira/browse/YARN-2468" target="_blank"&gt;https://issues.apache.org/jira/browse/YARN-2468&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Which version of HDP are you using?&lt;/P&gt;&lt;P&gt;Also, can you make sure the following properties are in place and set correctly?&lt;/P&gt;&lt;H4&gt;PROPERTIES RESPECTED WHEN LOG-AGGREGATION IS ENABLED&lt;/H4&gt;&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;yarn.nodemanager.remote-app-log-dir&lt;/STRONG&gt;: This is on the default file system, usually HDFS, and indicates where the NMs should aggregate logs to. This &lt;EM&gt;should not&lt;/EM&gt; be a local file system; otherwise serving daemons such as the history server will not be able to serve the aggregated logs. Default is /tmp/logs.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;yarn.nodemanager.remote-app-log-dir-suffix&lt;/STRONG&gt;: The remote log dir will be created at {yarn.nodemanager.remote-app-log-dir}/${user}/{thisParam}. Default value is “logs”.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;yarn.log-aggregation.retain-seconds&lt;/STRONG&gt;: How long to wait before deleting aggregated logs; -1 or any other negative number disables deletion of aggregated logs. Be careful not to set this to too small a value, so as not to burden the distributed file system.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;yarn.log-aggregation.retain-check-interval-seconds&lt;/STRONG&gt;: Determines how long to wait between aggregated-log retention checks. If it is set to 0 or a negative value, the interval is computed as one-tenth of the aggregated-log retention time. As with the previous property, be careful not to set this to a low value. Defaults to -1.&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;yarn.log.server.url&lt;/STRONG&gt;: Once an application is done, NMs redirect web UI users to this URL, where aggregated logs are served. Today it points to the MapReduce-specific JobHistory server.&lt;/LI&gt;&lt;/UL&gt;</description>
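      <!--
      The properties listed above could be set in yarn-site.xml roughly as follows.
      This is a minimal sketch under stated assumptions, not the poster's actual
      configuration: /app-logs is taken from the warning quoted later in this thread,
      yarn.log-aggregation-enable is the standard switch that turns aggregation on,
      and the 7-day retention value and the history server host and port in
      yarn.log.server.url are placeholders.

      ```xml
      <configuration>
        <property>
          <name>yarn.log-aggregation-enable</name>
          <value>true</value>
          <description>Turns log aggregation on; not listed in the reply above.</description>
        </property>
        <property>
          <name>yarn.nodemanager.remote-app-log-dir</name>
          <value>/app-logs</value>
          <description>Path on the default file system (usually HDFS), as seen in this thread.</description>
        </property>
        <property>
          <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
          <value>logs</value>
          <description>Default suffix.</description>
        </property>
        <property>
          <name>yarn.log-aggregation.retain-seconds</name>
          <value>604800</value>
          <description>Assumed example: keep aggregated logs for 7 days.</description>
        </property>
        <property>
          <name>yarn.log-aggregation.retain-check-interval-seconds</name>
          <value>-1</value>
          <description>Default; the check interval becomes one-tenth of the retention time.</description>
        </property>
        <property>
          <name>yarn.log.server.url</name>
          <value>http://historyserver.example.com:19888/jobhistory/logs</value>
          <description>Hypothetical host; point this at your own JobHistory server.</description>
        </property>
      </configuration>
      ```

      Changes to these values typically take effect only after the NodeManagers are restarted.
      -->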
      <pubDate>Thu, 26 May 2016 18:26:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168866#M29803</guid>
      <dc:creator>sshimpi</dc:creator>
      <dc:date>2016-05-26T18:26:45Z</dc:date>
    </item>
    <item>
      <title>Re: Manage YARN  local log-dirs  space</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168867#M29804</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2648/sshimpi.html" nodeid="2648"&gt;@Sagar Shimpi&lt;/A&gt; Thanks a lot for your response.&lt;/P&gt;&lt;P&gt;Our Hadoop version is Hadoop 2.7.1.2.3.2.0-2950, and all the settings related to log configuration look fine.&lt;/P&gt;&lt;P&gt;I checked the YARN logs but found only the warnings below.&lt;/P&gt;&lt;P&gt;2016-05-25 15:39:55,813 WARN  logaggregation.LogAggregationService (LogAggregationService.java:verifyAndCreateRemoteLogDir(195)) - Remote Root Log Dir [/app-logs] already exist, but with incorrect permissions. Expected: [rwxrwxrwt], Found: [rwxrwxrwx]. The cluster may have problems with multiple users.
2016-05-25 15:39:55,813 WARN  logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:&amp;lt;init&amp;gt;(190)) - rollingMonitorInterval is set as -1. The log rolling mornitoring interval is disabled. The logs will be aggregated after this application is finished.&lt;/P&gt;&lt;P&gt;We see some output files in the appcache folder that are taking up a lot of space:&lt;/P&gt;&lt;P&gt;/u01/hadoop/yarn/local/usercache/hive/appcache&lt;/P&gt;</description>
      <pubDate>Thu, 26 May 2016 19:12:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168867#M29804</guid>
      <dc:creator>vsivalingam</dc:creator>
      <dc:date>2016-05-26T19:12:40Z</dc:date>
    </item>
    <item>
      <title>Re: Manage YARN  local log-dirs  space</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168868#M29805</link>
      <description>&lt;P style="margin-left: 20px;"&gt;&lt;A rel="user" href="https://community.cloudera.com/users/10733/vsivalingam.html" nodeid="10733"&gt;@Venkadesh Sivalingam&lt;/A&gt; &lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;If the YARN local directory is the one with the space issue, as indicated, then this is not related to YARN container logs but to YARN local data. This can be a valid case if the job is still running. If the job is not running, crashed jobs can sometimes leave YARN local data behind. If you want to clean this up, you can stop the NodeManager on that node (when no containers are running on it) and clean up all /yarn/local directories. &lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;On another note, there is a warning about permissions on /app-logs. Please correct the directory permissions to the expected rwxrwxrwt (though I believe this is not causing an issue right now).&lt;/P&gt;</description>
      <pubDate>Thu, 26 May 2016 19:38:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168868#M29805</guid>
      <dc:creator>ravi1</dc:creator>
      <dc:date>2016-05-26T19:38:28Z</dc:date>
    </item>
    <item>
      <title>Re: Manage YARN  local log-dirs  space</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168869#M29806</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/216/ravi.html" nodeid="216"&gt;@Ravi Mutyala&lt;/A&gt; There is no job running currently, so I believe the files can be removed manually.&lt;/P&gt;&lt;P&gt;But does this happen with all failed jobs? Will removing such leftover files be a manual process every time, or is there a process that removes them periodically?&lt;/P&gt;</description>
      <pubDate>Thu, 26 May 2016 19:46:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Manage-YARN-local-log-dirs-space/m-p/168869#M29806</guid>
      <dc:creator>vsivalingam</dc:creator>
      <dc:date>2016-05-26T19:46:50Z</dc:date>
    </item>
  </channel>
</rss>

