<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Spark-Streaming and dynamic allocation in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155643#M36301</link>
    <description>&lt;P&gt;Please keep in mind (from the &lt;A target="_blank" href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/ch_using-spark-streaming.html"&gt;HDP 2.4 docs&lt;/A&gt;):&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Important:&lt;/STRONG&gt; Dynamic Resource Allocation does not work with Spark Streaming.&lt;/P&gt;</description>
    <pubDate>Thu, 28 Jul 2016 23:22:41 GMT</pubDate>
    <dc:creator>clukasik</dc:creator>
    <dc:date>2016-07-28T23:22:41Z</dc:date>
    <item>
      <title>Spark-Streaming and dynamic allocation</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155640#M36298</link>
      <description>&lt;P&gt;I want to set up a Spark Streaming cluster with dynamic allocation. I tested dynamic allocation by submitting the SparkPi application, and it works fine.&lt;/P&gt;&lt;P&gt;Then I tried my own application. It gets its input from another server, and I use &lt;EM&gt;socketTextStream&lt;/EM&gt; to receive the input data.&lt;/P&gt;&lt;P&gt;The application is very simple:&lt;/P&gt;&lt;PRE&gt;final JavaReceiverInputDStream&amp;lt;String&amp;gt; stream =
	ssc.socketTextStream(host, port, StorageLevels.MEMORY_AND_DISK_SER);
// Per-event computation: the core of the application.
final JavaDStream&amp;lt;MyObject&amp;gt; mapStream = stream.map(...);
// Collect each batch to the driver to force evaluation of the map step.
mapStream.foreachRDD(new VoidFunction&amp;lt;JavaRDD&amp;lt;MyObject&amp;gt;&amp;gt;() {
	@Override
	public void call(final JavaRDD&amp;lt;MyObject&amp;gt; stringJavaRDD) throws Exception {
		stringJavaRDD.collect();
	}
});
&lt;/PRE&gt;&lt;P&gt;The map function is the core of my application, and it needs a few milliseconds of computation time for each event.&lt;/P&gt;&lt;P&gt;When I increase the number of events, the cluster allocates new containers, and when the event rate slows down, the number of containers decreases, &lt;STRONG&gt;but the newly allocated containers are never used&lt;/STRONG&gt;. I checked the number of completed tasks per executor for these containers; it is always 0. Because the containers are never used, Spark deallocates them after &lt;EM&gt;executorIdleTimeout&lt;/EM&gt; and immediately allocates new ones because the workload is very high. Only the first 2 containers from the application start do the work.&lt;/P&gt;&lt;P&gt;I thought it might help to use Kafka to receive the events, distributed across all containers. (I don't know if that is the right way to solve my problem.)&lt;/P&gt;&lt;P&gt;With Kafka I ran into another problem: Spark did not allocate more containers than the number of partitions configured for my topic.&lt;/P&gt;&lt;P&gt;I use &lt;STRONG&gt;HDP 2.4&lt;/STRONG&gt; to set up my &lt;STRONG&gt;3-node&lt;/STRONG&gt; cluster with:&lt;/P&gt;&lt;UL&gt;
&lt;LI&gt;ZooKeeper: 3.4&lt;/LI&gt;&lt;LI&gt;HDFS, YARN, MapReduce2: 2.7&lt;/LI&gt;&lt;LI&gt;Spark: 1.6&lt;/LI&gt;&lt;LI&gt;Kafka: 0.9&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Each node has 4 cores and 8 GB RAM.&lt;/P&gt;&lt;P&gt;Can you tell me which option is the right one for solving my problem, and how I can fix it?&lt;/P&gt;&lt;P&gt;Thank you so much for helping me &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2016 22:39:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155640#M36298</guid>
      <dc:creator>retricia1</dc:creator>
      <dc:date>2016-07-28T22:39:10Z</dc:date>
    </item>
    <item>
      <title>Re: Spark-Streaming and dynamic allocation</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155641#M36299</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12001/retricia1.html" nodeid="12001"&gt;@Rene Rene&lt;/A&gt;, can you also share the values of the following properties, as well as how much memory and how many cores your nodes contribute to YARN? (You can see these values in Resource Manager UI -&amp;gt; Nodes.)&lt;/P&gt;&lt;P&gt;yarn.scheduler.minimum-allocation-mb&lt;/P&gt;&lt;P&gt;yarn.nodemanager.resource.memory-mb&lt;/P&gt;&lt;P&gt;mapreduce.map.memory.mb&lt;/P&gt;&lt;P&gt;mapreduce.map.java.opts.max.heap&lt;/P&gt;&lt;P&gt;mapreduce.map.cpu.vcores&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2016 22:43:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155641#M36299</guid>
      <dc:creator>vbonthu</dc:creator>
      <dc:date>2016-07-28T22:43:36Z</dc:date>
    </item>
    <item>
      <title>Re: Spark-Streaming and dynamic allocation</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155642#M36300</link>
      <description>&lt;P&gt;Each node contributes 4 GB RAM and 4 cores to YARN. I submit the application with:&lt;/P&gt;&lt;PRE&gt;--driver-memory 640mb
--executor-memory 640mb
&lt;/PRE&gt;&lt;P&gt;With the overhead, each container uses 1 GB of RAM, so 12 containers are possible altogether (3 nodes with 4 GB each).&lt;/P&gt;&lt;PRE&gt;yarn.scheduler.minimum-allocation-mb=512mb
yarn.nodemanager.resource.memory-mb=4096mb
mapreduce.map.memory.mb=1.5GB
&lt;/PRE&gt;&lt;P&gt;The following properties are not defined:&lt;/P&gt;&lt;PRE&gt;mapreduce.map.java.opts.max.heap
mapreduce.map.cpu.vcores
&lt;/PRE&gt;</description>
      <pubDate>Thu, 28 Jul 2016 23:10:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155642#M36300</guid>
      <dc:creator>retricia1</dc:creator>
      <dc:date>2016-07-28T23:10:00Z</dc:date>
    </item>
    <item>
      <title>Re: Spark-Streaming and dynamic allocation</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155643#M36301</link>
      <description>&lt;P&gt;Please keep in mind (from the &lt;A target="_blank" href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/ch_using-spark-streaming.html"&gt;HDP 2.4 docs&lt;/A&gt;):&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Important:&lt;/STRONG&gt; Dynamic Resource Allocation does not work with Spark Streaming.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2016 23:22:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155643#M36301</guid>
      <dc:creator>clukasik</dc:creator>
      <dc:date>2016-07-28T23:22:41Z</dc:date>
    </item>
    <item>
      <title>Re: Spark-Streaming and dynamic allocation</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155644#M36302</link>
      <description>&lt;P&gt;Oh no... I didn't see this before. Is this only an HDP 2.4 problem? According to the &lt;A href="http://www.cloudera.com/documentation/enterprise/5-5-x/topics/spark_streaming.html"&gt;CDH docs&lt;/A&gt;, it seems to be possible.&lt;/P&gt;</description>
      <pubDate>Fri, 29 Jul 2016 00:15:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155644#M36302</guid>
      <dc:creator>retricia1</dc:creator>
      <dc:date>2016-07-29T00:15:48Z</dc:date>
    </item>
    <item>
      <title>Re: Spark-Streaming and dynamic allocation</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155645#M36303</link>
      <description>&lt;P&gt;For background, see &lt;A href="https://mail-archives.apache.org/mod_mbox/spark-user/201511.mbox/%3CCA+AHuKksNyOxq8NazZA_mbHb+J1Ry-h9gMtXj=Zc5WkHU7Nnvg@mail.gmail.com%3E"&gt;https://mail-archives.apache.org/mod_mbox/spark-user/201511.mbox/%3CCA+AHuKksNyOxq8NazZA_mbHb+J1Ry-h9gMtXj=Zc5WkHU7Nnvg@mail.gmail.com%3E&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;Pinging &lt;A rel="user" href="https://community.cloudera.com/users/332/vshukla.html" nodeid="332"&gt;@vshukla&lt;/A&gt; to see if he has any updates.&lt;/P&gt;</description>
      <pubDate>Fri, 29 Jul 2016 01:17:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155645#M36303</guid>
      <dc:creator>lgeorge</dc:creator>
      <dc:date>2016-07-29T01:17:38Z</dc:date>
    </item>
    <item>
      <title>Re: Spark-Streaming and dynamic allocation</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155646#M36304</link>
      <description>&lt;P&gt;Dynamic Resource Allocation for Spark Streaming is a new feature in Spark 2.0 (https://issues.apache.org/jira/browse/SPARK-12133), so it is not yet available in either HDP or CDH.&lt;/P&gt;</description>
      <pubDate>Fri, 29 Jul 2016 07:44:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155646#M36304</guid>
      <dc:creator>vshukla</dc:creator>
      <dc:date>2016-07-29T07:44:23Z</dc:date>
    </item>
    <item>
      <title>Re: Spark-Streaming and dynamic allocation</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155647#M36305</link>
      <description>&lt;P&gt;Hello, HDP 2.6.1 has Spark 2. Is dynamic resource allocation for streaming jobs working now?&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jul 2017 23:12:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-Streaming-and-dynamic-allocation/m-p/155647#M36305</guid>
      <dc:creator>tataru_marian</dc:creator>
      <dc:date>2017-07-25T23:12:21Z</dc:date>
    </item>
  </channel>
</rss>

