<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Measuring Spark job performance in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Measuring-Spark-job-performance/m-p/308299#M223509</link>
    <description>&lt;P&gt;Hi &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/83689"&gt;@TimmehG&lt;/a&gt;, Spark has a configurable metrics system based on the &lt;A href="http://metrics.dropwizard.io/4.1.1" target="_blank" rel="noopener"&gt;Dropwizard Metrics Library&lt;/A&gt;. It lets users report Spark metrics to a variety of sinks, including HTTP, JMX, and CSV files. The metrics are generated by sources embedded in the Spark codebase, which instrument specific activities and Spark components. The metrics system is configured via a configuration file that Spark expects at $SPARK_HOME/conf/metrics.properties; a custom file location can be specified via the spark.metrics.conf &lt;A href="https://spark.apache.org/docs/latest/configuration.html#spark-properties" target="_blank" rel="noopener"&gt;configuration property&lt;/A&gt;. Instead of the configuration file, a set of configuration parameters with the prefix spark.metrics.conf. can be used.&lt;/P&gt;&lt;P&gt;For example, a minimal metrics.properties (sink and source class names as shipped in Spark's metrics.properties.template) that reports metrics from all instances to CSV files every 10 seconds and adds JVM metrics for the driver could look like this:&lt;/P&gt;&lt;PRE&gt;# Report metrics from all instances to the CSV sink every 10 seconds
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
*.sink.csv.period=10
*.sink.csv.unit=seconds
*.sink.csv.directory=/tmp/spark-metrics
# Attach the JVM source to the driver for heap and GC metrics
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource&lt;/PRE&gt;&lt;P&gt;I agree with you: running Spark applications continuously and reliably is a challenging task, and a good performance-monitoring system is needed.&lt;/P&gt;&lt;P&gt;Several external tools can help profile the performance of Spark jobs:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Cluster-wide monitoring tools, such as &lt;A href="http://ganglia.sourceforge.net/" target="_blank" rel="noopener"&gt;Ganglia&lt;/A&gt;, can provide insight into overall cluster utilization and resource bottlenecks.
For instance, a Ganglia dashboard can quickly reveal whether a particular workload is disk-bound, network-bound, or CPU-bound.&lt;/LI&gt;&lt;LI&gt;OS profiling tools such as &lt;A href="http://dag.wieers.com/home-made/dstat/" target="_blank" rel="noopener"&gt;dstat&lt;/A&gt;, &lt;A href="http://linux.die.net/man/1/iostat" target="_blank" rel="noopener"&gt;iostat&lt;/A&gt;, and &lt;A href="http://linux.die.net/man/1/iotop" target="_blank" rel="noopener"&gt;iotop&lt;/A&gt; can provide fine-grained profiling on individual nodes.&lt;/LI&gt;&lt;LI&gt;JVM utilities such as jstack for stack traces, jmap for heap dumps, jstat for time-series statistics, and jconsole for visually exploring JVM properties are useful for those comfortable with JVM internals.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;For more insight, refer to the links below:&lt;/P&gt;&lt;P&gt;&lt;A href="https://spark.apache.org/docs/latest/monitoring.html" target="_blank" rel="noopener"&gt;https://spark.apache.org/docs/latest/monitoring.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://blog.cloudera.com/demystifying-spark-jobs-to-optimize-for-cost-and-performance/" target="_blank" rel="noopener"&gt;https://blog.cloudera.com/demystifying-spark-jobs-to-optimize-for-cost-and-performance/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.infoq.com/articles/spark-application-monitoring-influxdb-grafana/" target="_blank" rel="noopener"&gt;https://www.infoq.com/articles/spark-application-monitoring-influxdb-grafana/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://db-blog.web.cern.ch/blog/luca-canali/2017-03-measuring-apache-spark-workload-metrics-performance-troubleshooting" target="_blank" rel="noopener"&gt;https://db-blog.web.cern.ch/blog/luca-canali/2017-03-measuring-apache-spark-workload-metrics-performance-troubleshooting&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Please accept the answer you found most useful.&lt;/P&gt;</description>
    <pubDate>Wed, 23 Dec 2020 06:15:48 GMT</pubDate>
    <dc:creator>jagadeesan</dc:creator>
    <dc:date>2020-12-23T06:15:48Z</dc:date>
  </channel>
</rss>

