Member since: 01-10-2018
Posts: 22
Kudos Received: 1
Solutions: 0
11-23-2018
10:13 AM
Can someone please have a look into it?
11-20-2018
02:01 PM
I have configured Graphite and Grafana for monitoring Spark applications, as per "https://community.hortonworks.com/articles/222813/monitoring-spark-2-performance-via-grafana-in-amba-1.html". Are the queries below correct?
Driver Memory
- Driver Heap Usage: aliasByNode($application.driver.jvm.heap.usage, 1)
- Driver JVM Memory Pools Usage: aliasByNode($application.driver.jvm.pools.*.used, 4)
- Executor & Driver Memory Used: aliasByNode($application.*.jvm.heap.used, 1)
- Executor Memory Used: aliasByNode(exclude($application.*.jvm.heap.used, '.driver.jvm.heap'), 1) and alias(sumSeries(exclude($application.*.jvm.heap.used, '.driver.jvm.heap')), 'total')

Task Executor
- Active Tasks Per Executor: aliasByNode(summarize($application.*.executor.threadpool.activeTasks, '10s', 'sum', false), 1)
- Completed Tasks per Executor: aliasByNode($application.*.executor.threadpool.completeTasks, 1)
- Completed Tasks/Minute per Executor: aliasByNode(nonNegativeDerivative(summarize($application.*.executor.threadpool.completeTasks, '1m', 'avg', false)), 1)

Read/Write IOPS
- Read IOPS: alias(perSecond(sumSeries($application.*.executor.filesystem.hdfs.read_ops)), 'total') and aliasByNode(perSecond($application.*.executor.filesystem.hdfs.read_ops), 1)
- Write IOPS: alias(perSecond(sumSeries($application.*.executor.filesystem.hdfs.write_ops)), 'total') and aliasByNode(perSecond($application.*.executor.filesystem.hdfs.write_ops), 1)

HDFS Bytes Read/Written Per Executor
- Executor HDFS Bytes Read: aliasByMetric($application.*.executor.filesystem.hdfs.read_bytes)
- Executor HDFS Bytes Written: aliasByMetric($application.*.executor.filesystem.hdfs.write_bytes)

Also, do Grafana and Graphite provide metrics for the use case below?

We have a number of hourly/daily batches on Airflow that use PySpark for data processing. We want to see the historical trend of Spark memory usage for a given batch, so we want to aggregate the Spark applications belonging to the same batch and visualize the trend over time, to check how memory usage grows with traffic.
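On aggregating across runs: by default each Spark run reports under a unique application ID, which makes cross-run history awkward. One hedged approach (assuming Spark 2.1+, where the spark.metrics.namespace property is available) is to pin a fixed metrics prefix per batch, so successive runs append to the same Graphite series; the batch name below is a placeholder:

```
# Hypothetical sketch: in spark-defaults.conf (or via --conf), fix the metrics prefix
spark.metrics.namespace=my_hourly_batch

# A Grafana panel over a long time range then shows the historical trend, e.g.:
aliasByNode(my_hourly_batch.*.jvm.heap.used, 1)
alias(maxSeries(my_hourly_batch.*.jvm.heap.used), 'peak heap used')
```

With a shared namespace, the same panel covers every run of the batch, so growth with traffic becomes visible over time.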
Labels: Apache Spark
11-07-2018
01:25 PM
@Jonathan Sneep Could you please check if the metrics queries below are correct:

Driver Memory
- Driver Heap Usage: aliasByNode($application.driver.jvm.heap.usage, 1)
- Driver JVM Memory Pools Usage: aliasByNode($application.driver.jvm.pools.*.used, 4)
- Executor & Driver Memory Used: aliasByNode($application.*.jvm.heap.used, 1)
- Executor Memory Used: aliasByNode(exclude($application.*.jvm.heap.used, '.driver.jvm.heap'), 1) and alias(sumSeries(exclude($application.*.jvm.heap.used, '.driver.jvm.heap')), 'total')

Task Executor
- Active Tasks Per Executor: aliasByNode(summarize($application.*.executor.threadpool.activeTasks, '10s', 'sum', false), 1)
- Completed Tasks per Executor: aliasByNode($application.*.executor.threadpool.completeTasks, 1)
- Completed Tasks/Minute per Executor: aliasByNode(nonNegativeDerivative(summarize($application.*.executor.threadpool.completeTasks, '1m', 'avg', false)), 1)

Read/Write IOPS
- Read IOPS: alias(perSecond(sumSeries($application.*.executor.filesystem.hdfs.read_ops)), 'total') and aliasByNode(perSecond($application.*.executor.filesystem.hdfs.read_ops), 1)
- Write IOPS: alias(perSecond(sumSeries($application.*.executor.filesystem.hdfs.write_ops)), 'total') and aliasByNode(perSecond($application.*.executor.filesystem.hdfs.write_ops), 1)

HDFS Bytes Read/Written Per Executor
- Executor HDFS Bytes Read: aliasByMetric($application.*.executor.filesystem.hdfs.read_bytes)
- Executor HDFS Bytes Written: aliasByMetric($application.*.executor.filesystem.hdfs.write_bytes)

Also, please let me know the queries for the following:
- HDFS Read/Write Byte Rate
- HDFS Read Rate/Sec
- HDFS Write Rate/Sec

Looking forward to your update.
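On the byte-rate panels: a plausible sketch, mirroring the IOPS pattern above, is to apply perSecond() to the byte counters instead of the op counters (unverified; assumes the same metric namespace as the queries above):

```
# HDFS Read Rate/Sec (per executor, plus a summed total)
aliasByNode(perSecond($application.*.executor.filesystem.hdfs.read_bytes), 1)
alias(perSecond(sumSeries($application.*.executor.filesystem.hdfs.read_bytes)), 'total')

# HDFS Write Rate/Sec
aliasByNode(perSecond($application.*.executor.filesystem.hdfs.write_bytes), 1)
alias(perSecond(sumSeries($application.*.executor.filesystem.hdfs.write_bytes)), 'total')
```

Setting the panel's left Y-axis unit to bytes/sec keeps the legend readable.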
11-07-2018
09:24 AM
@Jonathan Sneep Did you have a chance to look into it?
11-05-2018
07:38 PM
@Jonathan Sneep Could you please check if the metrics queries below are correct:

Driver Memory
- Driver Heap Usage: aliasByNode($application.driver.jvm.heap.usage, 1)
- Driver JVM Memory Pools Usage: aliasByNode($application.driver.jvm.pools.*.used, 4)
- Executor & Driver Memory Used: aliasByNode($application.*.jvm.heap.used, 1)
- Executor Memory Used: aliasByNode(exclude($application.*.jvm.heap.used, '.driver.jvm.heap'), 1) and alias(sumSeries(exclude($application.*.jvm.heap.used, '.driver.jvm.heap')), 'total')

Task Executor
- Active Tasks Per Executor: aliasByNode(summarize($application.*.executor.threadpool.activeTasks, '10s', 'sum', false), 1)
- Completed Tasks per Executor: aliasByNode($application.*.executor.threadpool.completeTasks, 1)
- Completed Tasks/Minute per Executor: aliasByNode(nonNegativeDerivative(summarize($application.*.executor.threadpool.completeTasks, '1m', 'avg', false)), 1)

Read/Write IOPS
- Read IOPS: alias(perSecond(sumSeries($application.*.executor.filesystem.hdfs.read_ops)), 'total') and aliasByNode(perSecond($application.*.executor.filesystem.hdfs.read_ops), 1)
- Write IOPS: alias(perSecond(sumSeries($application.*.executor.filesystem.hdfs.write_ops)), 'total') and aliasByNode(perSecond($application.*.executor.filesystem.hdfs.write_ops), 1)

HDFS Bytes Read/Written Per Executor
- Executor HDFS Bytes Read: aliasByMetric($application.*.executor.filesystem.hdfs.read_bytes)
- Executor HDFS Bytes Written: aliasByMetric($application.*.executor.filesystem.hdfs.write_bytes)

Also, please let me know the queries for the following:
- HDFS Read/Write Byte Rate
- HDFS Read Rate/Sec
- HDFS Write Rate/Sec

Looking forward to your update.
11-03-2018
07:49 PM
@Jonathan Sneep I am not able to reply to your comment; the Reply option does not seem to be available, so I am replying here.

Your comment:
HDFS Write bytes by executor should look something like this (be sure to set the left Y-axis unit type to bytes):
aliasByNode($application.*.executor.filesystem.*.write_bytes, 1)
Executor and Driver memory usage example (as above, set the left Y-axis unit to bytes):
aliasByNode($application.*.jvm.heap.used, 1)
I'll try to find time later to give you some more examples, but they are mostly slight variations on the examples above :-)
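For reference, the second argument of aliasByNode is the zero-based index of the dot-separated node to use as the legend label. A hedged illustration (the application ID shown is made up):

```
# Series name: app-20181103190001-0001.driver.jvm.heap.used
#   node 0 = app-20181103190001-0001, node 1 = driver, node 2 = jvm, ...
aliasByNode($application.*.jvm.heap.used, 1)
# With index 1, the legend shows the executor/driver name instead of the full path.
```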
Thanks for the comment. I will try the metrics queries and let you know. I am also looking forward to your updates on the following:
- HDFS: HDFS Bytes Read/Written Per Executor, HDFS Executor Read Bytes/Sec, Read IOPS
- Task Executor: Active Tasks per Executor, Completed Tasks per Executor, Completed Tasks/Minute per Executor
- Driver Memory: Driver Heap Usage, Driver JVM Memory Pools Usage
- Executor Memory: Executor Memory Usage, JVM Heap Usage Per Executor
11-02-2018
12:17 PM
@Jonathan Sneep I have managed to configure and integrate the Spark app, Grafana, and Graphite. Could you please let me know how to configure the metrics to get graphs for the following:
- HDFS Bytes Read/Written Per Executor
- HDFS Executor Read/Write Bytes/Sec
- Read IOPS
- Task Executor: Active Tasks per Executor, Completed Tasks per Executor, Completed Tasks/Minute per Executor
- Driver Memory: Driver Heap Usage, Driver JVM Memory Pools Usage
- Executor Memory Usage, JVM Heap Usage Per Executor
- Spark Executor and Driver memory used
11-01-2018
12:26 PM
@Jonathan Sneep I already followed that, but I get the error below while installing the packages:
Error: Package: python-django-tagging-0.3.1-7.el6.noarch (epel)
Requires: Django
You could try using --skip-broken to work around the problem
AMI: Amazon Linux AMI 2017.03.1.20170812 x86_64 HVM GP2
11-01-2018
12:10 PM
@Jonathan Sneep Is there proper documentation for installing and configuring Graphite on an Amazon AMI?
11-01-2018
08:36 AM
@Jonathan Sneep Thanks for the input; I will check this and let you know.