<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Spark driver memory keeps growing in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Spark-driver-memory-keeps-growing/m-p/165006#M127373</link>
    <description>&lt;P&gt;Spark sends a broadcast variable in blocks of data rather than as a single object. See the spark.broadcast.blockSize property &lt;A target="_blank" href="https://spark.apache.org/docs/1.6.1/configuration.html"&gt;here&lt;/A&gt;. This explains why the value grows in the log output.&lt;/P&gt;&lt;P&gt;How big is the file you are broadcasting? You can use the &lt;A target="_blank" href="https://spark.apache.org/docs/1.6.1/api/java/index.html?org/apache/spark/util/SizeEstimator.html"&gt;SizeEstimator&lt;/A&gt; to get a sense of how much memory your object will really occupy. Then make sure your "--driver-memory" and "--executor-memory" settings leave enough breathing room. Guidance for tuning can be found here: &lt;A href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/ch_tuning-spark.html" target="_blank"&gt;https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/ch_tuning-spark.html&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 04 Aug 2016 21:13:44 GMT</pubDate>
    <dc:creator>clukasik</dc:creator>
    <dc:date>2016-08-04T21:13:44Z</dc:date>
  </channel>
</rss>

