<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Yarn Resource Manager Halts with java.lang.OutOfMemoryError: unable to create new native thread in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65698#M76279</link>
    <description>YARN Resource Manager halts with java.lang.OutOfMemoryError: unable to create new native thread, and the job fails over to the standby Resource Manager. Question and replies archived from the Cloudera Community.</description>
    <pubDate>Fri, 16 Sep 2022 13:00:57 GMT</pubDate>
    <dc:creator>Johnny_Bach</dc:creator>
    <dc:date>2022-09-16T13:00:57Z</dc:date>
    <item>
      <title>Yarn Resource Manager Halts with java.lang.OutOfMemoryError: unable to create new native thread</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65698#M76279</link>
      <description>&lt;P&gt;The YARN Resource Manager halts with java.lang.OutOfMemoryError: unable to create new native thread, and the job fails over to the standby Resource Manager, which completes the task.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Could you please let us know the root cause of this issue?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;ERROR Message:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;2018-03-22 02:30:09,637 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e189_1521451854044_2288_01_000002 Container Transitioned from ALLOCATED to ACQUIRED&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;2018-03-22 02:30:10,413 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e189_1521451854044_2288_01_000002 Container Transitioned from ACQUIRED to RUNNING&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;2018-03-22 02:30:10,695 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: checking for deactivate...&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;2018-03-22 02:30:19,354 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: hue is accessing unchecked &lt;A href="http://labs-sca01-8-2-30.eng.fireeye.com:43045/ws/v1/mapreduce/jobs/job_1521451854044_2288" target="_blank"&gt;&lt;SPAN class="s2"&gt;http://server1:43045/ws/v1/mapreduce/jobs/job_1521451854044_2288&lt;/SPAN&gt;&lt;/A&gt; which is the app master GUI of application_1521451854044_2288 owned by edh_srv_prod&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;2018-03-22 02:30:30,212 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: hue is accessing unchecked &lt;A href="http://labs-sca01-8-2-30.eng.fireeye.com:43045/ws/v1/mapreduce/jobs/job_1521451854044_2288" target="_blank"&gt;&lt;SPAN 
class="s2"&gt;http://server1:43045/ws/v1/mapreduce/jobs/job_1521451854044_2288&lt;/SPAN&gt;&lt;/A&gt; which is the app master GUI of application_1521451854044_2288 owned by edh_srv_prod&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;2018-03-22 02:30:34,090 FATAL org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[2101925946@qtp-1878992188-14302,5,main] threw an Error.&amp;nbsp; Shutting down now...&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;java.lang.OutOfMemoryError: unable to create new native thread&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at java.lang.Thread.start0(Native Method)&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at java.lang.Thread.start(Thread.java:714)&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1095)&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN 
class="s1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.mortbay.jetty.security.SslSocketConnector$SslConnection.run(SslSocketConnector.java:723)&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;2018-03-22 02:30:34,093 INFO org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException&lt;/SPAN&gt;&lt;/P&gt;&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;yarn application -status application_1521451854044_2288&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;Application Report :&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Application-Id : application_1521451854044_2288&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Application-Name : oozie:launcher:T=shell:W=OS_Changes_incremental_workflow:A=shell-b8b2:ID=0006766-180222181315002-oozie-oozi-W&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Application-Type : 
MAPREDUCE&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; User : edh_srv_prod&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Queue : root.edh_srv_prod&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Start-Time : 1521710999557&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Finish-Time : 1521711593154&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Progress : 100%&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; State : FINISHED&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Final-State : SUCCEEDED&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Tracking-URL : &lt;A href="https://labs-sca01-8-2-14-a.eng.fireeye.com:19890/jobhistory/job/job_1521451854044_2288" target="_blank"&gt;https://server1:19890/jobhistory/job/job_1521451854044_2288&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; RPC Port : 40930&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AM Host : 
server3&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Aggregate Resource Allocation : 1809548 MB-seconds, 1181 vcore-seconds&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Log Aggregation Status : SUCCEEDED&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Diagnostics : &lt;STRONG&gt;Attempt recovered after RM restart&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 13:00:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65698#M76279</guid>
      <dc:creator>Johnny_Bach</dc:creator>
      <dc:date>2022-09-16T13:00:57Z</dc:date>
    </item>
    <item>
      <title>Re: Yarn Resource Manager Halts with java.lang.OutOfMemoryError: unable to create new native thread</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65718#M76280</link>
      <description>What CDH version are you using? If it is equal to or lower than 5.9.1 or 5.8.3, and you use a KMS service in the cluster (for HDFS Transparent Encryption Zone features), you may be hitting &lt;A href="https://issues.apache.org/jira/browse/HADOOP-13838" target="_blank"&gt;https://issues.apache.org/jira/browse/HADOOP-13838&lt;/A&gt;, which has been fixed in the bug-fix releases of CDH 5.8.4, 5.9.2, and 5.10.0 onwards.</description>
      <pubDate>Sat, 24 Mar 2018 05:48:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65718#M76280</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2018-03-24T05:48:22Z</dc:date>
    </item>
    <item>
      <title>Re: Yarn Resource Manager Halts with java.lang.OutOfMemoryError: unable to create new native thread</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65721#M76281</link>
      <description>&lt;P&gt;I'm currently on CDH 5.8.2 with the KMS service, but what's your thought on the OS running out of PIDs, as the error message seems to suggest?&lt;/P&gt;</description>
      <pubDate>Sat, 24 Mar 2018 06:27:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65721#M76281</guid>
      <dc:creator>Johnny_Bach</dc:creator>
      <dc:date>2018-03-24T06:27:20Z</dc:date>
    </item>
    <item>
      <title>Re: Yarn Resource Manager Halts with java.lang.OutOfMemoryError: unable to create new native thread</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65722#M76282</link>
      <description>Thank you for confirming the CDH version. Do you also have a KMS service in the cluster? If yes, you're definitely hitting the aforementioned bug.&lt;BR /&gt;&lt;BR /&gt;You're partially right about "OS running out of PID". More specifically, the YARN RM process runs into its 'number of processes' (nproc) ulimit, which should be set to a high default (32k processes) if you are running Cloudera Manager. There's no reason YARN should normally be using threads counting up to 32k.</description>
      <pubDate>Sat, 24 Mar 2018 06:30:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65722#M76282</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2018-03-24T06:30:11Z</dc:date>
    </item>
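The nproc explanation above can be checked directly on the ResourceManager host. A minimal diagnostic sketch (the pgrep pattern is an assumption; adjust it for your deployment) that compares the process's live thread count against the limit actually in effect:

```shell
#!/bin/sh
# Diagnostic sketch: compare a process's thread count against its nproc limit.
# The ResourceManager pgrep pattern is an assumption; adjust for your cluster.
PID=$(pgrep -f "org.apache.hadoop.yarn.server.resourcemanager.ResourceManager" | head -n1)
PID=${PID:-$$}   # fall back to this shell if no RM is running

# Current number of lightweight processes (threads) in the target process
ps -o nlwp= -p "$PID"

# The "max processes" (nproc) limit actually in effect for that process;
# on Linux this caps threads, not just full processes
grep "Max processes" "/proc/$PID/limits"
```

If the thread count is anywhere near the limit shown, something in the JVM is leaking threads rather than the limit being too low.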
    <item>
      <title>Re: Yarn Resource Manager Halts with java.lang.OutOfMemoryError: unable to create new native thread</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65723#M76283</link>
      <description>Yes, we do have a KMS service in the cluster. Thanks for providing clarity on "OS running out of PID".</description>
      <pubDate>Sat, 24 Mar 2018 07:08:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/65723#M76283</guid>
      <dc:creator>Johnny_Bach</dc:creator>
      <dc:date>2018-03-24T07:08:39Z</dc:date>
    </item>
    <item>
      <title>Re: Yarn Resource Manager Halts with java.lang.OutOfMemoryError: unable to create new native thread</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/69233#M76284</link>
      <description>&lt;P&gt;We are using CDH 5.14.0, and I found that our components (HDFS, YARN, HBase) would restart because of the same issue. The exception looks like this:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;java.io.IOException: Cannot run program "stat": error=11, Resource temporarily unavailable&lt;BR /&gt;at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)&lt;BR /&gt;at org.apache.hadoop.util.Shell.runCommand(Shell.java:551)&lt;BR /&gt;at org.apache.hadoop.util.Shell.run(Shell.java:507)&lt;BR /&gt;at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)&lt;BR /&gt;at org.apache.hadoop.fs.HardLink.getLinkCount(HardLink.java:218)&lt;BR /&gt;at org.apache.hadoop.hdfs.server.datanode.ReplicaInfo.breakHardLinksIfNeeded(ReplicaInfo.java:265)&lt;BR /&gt;at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:1177)&lt;BR /&gt;at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:1148)&lt;BR /&gt;at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.&amp;lt;init&amp;gt;(BlockReceiver.java:210)&lt;BR /&gt;at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:675)&lt;BR /&gt;at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)&lt;BR /&gt;at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)&lt;BR /&gt;at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:748)&lt;BR /&gt;Caused by: java.io.IOException: error=11, Resource temporarily unavailable&lt;BR /&gt;at java.lang.UNIXProcess.forkAndExec(Native Method)&lt;BR /&gt;at java.lang.UNIXProcess.&amp;lt;init&amp;gt;(UNIXProcess.java:247)&lt;BR /&gt;at java.lang.ProcessImpl.start(ProcessImpl.java:134)&lt;BR /&gt;at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)&lt;BR /&gt;... 13 more&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2018-06-20 02:05:54,797 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is out of memory. Will retry in 30 seconds.&lt;BR /&gt;java.lang.OutOfMemoryError: unable to create new native thread&lt;BR /&gt;at java.lang.Thread.start0(Native Method)&lt;BR /&gt;at java.lang.Thread.start(Thread.java:717)&lt;BR /&gt;at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:154)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:748)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Also, I noticed that Cloudera Manager set the ulimits for us. Here is our config:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;if [ $(id -u) -eq 0 ]; then&lt;BR /&gt;# Max number of open files&lt;BR /&gt;ulimit -n 32768&lt;/P&gt;&lt;P&gt;# Max number of child processes and threads&lt;BR /&gt;ulimit -u 65536&lt;/P&gt;&lt;P&gt;# Max locked memory&lt;BR /&gt;ulimit -l unlimited&lt;BR /&gt;fi&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;PS: our machines have 72 cores and 250 GB of RAM. Could you help me understand what causes native thread creation to fail?&lt;/P&gt;</description>
      <pubDate>Wed, 20 Jun 2018 07:30:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Yarn-Resource-Manager-Halts-with-java-lang-OutOfMemoryError/m-p/69233#M76284</guid>
      <dc:creator>puny_reborn</dc:creator>
      <dc:date>2018-06-20T07:30:40Z</dc:date>
    </item>
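Even with the generous per-process ulimits shown in the last post, native thread creation can still fail: on Linux every thread occupies a PID and counts against kernel-wide ceilings, and each Java thread also reserves stack memory (the JVM's -Xss size), so a large machine can still exhaust a system-wide limit. A hedged checklist of the kernel knobs worth inspecting alongside `ulimit -u`:

```shell
#!/bin/sh
# System-wide limits that can block thread creation even when
# per-process ulimits (ulimit -u / ulimit -n) look generous.

# Kernel-wide ceiling on the total number of threads across all processes
cat /proc/sys/kernel/threads-max

# Ceiling on PIDs; on Linux every thread consumes a PID
cat /proc/sys/kernel/pid_max

# Threads currently in use across the whole machine, to compare
# against the two ceilings above
ps -eo nlwp= | awk '{ total += $1 } END { print total }'
```

If the machine-wide thread total approaches either ceiling, one runaway process (or a cgroup `pids.max` limit on the service) can starve every other daemon, which matches several unrelated services failing with error=11 at the same time.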
  </channel>
</rss>

