How to monitor the actual memory allocation of a Spark application

Contributor

Is there a proper way to monitor the memory usage of a Spark application?

By memory usage, I don't mean the executor memory, which can be set, but the actual memory usage of the application.

Note: We are running Spark on YARN.

14 REPLIES

Contributor

@Jonathan Sneep

I am not able to reply to your comment directly; the Reply option does not seem to be available, so I am replying here.

Comment

HDFS write bytes by executor should look something like this (be sure to set the left Y unit type to bytes):

aliasByNode($application.*.executor.filesystem.*.write_bytes, 1)

Executor and driver memory usage example (as above, set the left Y unit to bytes):

aliasByNode($application.*.jvm.heap.used, 1)

I'll try to find time later to give you some more examples, but they are mostly slight variations on the examples above :-)

Thanks for the comment. I will try the metrics queries and let you know.

I am also looking forward to your updates on the following:

  • HDFS Bytes Read/Written Per Executor
  • HDFS Executor Read Bytes/Sec
  • Read IOPS
  • Task Executor
    • Active Tasks per Executor
    • Completed Tasks per Executor
    • Completed Tasks/Minute per Executor
  • Driver Memory
    • Driver Heap Usage
    • Driver JVM Memory Pools Usage
    • Executor Memory Usage
    • JVM Heap Usage Per Executor

Contributor

@Jonathan Sneep
Could you please check whether the metrics queries below are correct:

  • Driver Memory
    • Driver Heap Usage
      aliasByNode($application.driver.jvm.heap.usage, 1)
    • Driver JVM Memory Pools Usage
      aliasByNode($application.driver.jvm.pools.*.used, 4)
  • Executor & Driver Memory Used
    aliasByNode($application.*.jvm.heap.used, 1)
  • Executor Memory Used
    aliasByNode(exclude($application.*.jvm.heap.used, '.driver.jvm.heap'), 1)
    alias(sumSeries(exclude($application.*.jvm.heap.used, '.driver.jvm.heap')), 'total')
  • Task Executor
    • Active Tasks Per Executor
      aliasByNode(summarize($application.*.executor.threadpool.activeTasks, '10s', 'sum', false), 1)
    • Completed Tasks per Executor
      aliasByNode($application.*.executor.threadpool.completeTasks, 1)
    • Completed Tasks/Minute per Executor
      aliasByNode(nonNegativeDerivative(summarize($application.*.executor.threadpool.completeTasks, '1m', 'avg', false)), 1)
  • Read/Write IOPS
    • Read IOPS
      alias(perSecond(sumSeries($application.*.executor.filesystem.hdfs.read_ops)), 'total')
      aliasByNode(perSecond($application.*.executor.filesystem.hdfs.read_ops), 1)
    • Write IOPS
      alias(perSecond(sumSeries($application.*.executor.filesystem.hdfs.write_ops)), 'total')
      aliasByNode(perSecond($application.*.executor.filesystem.hdfs.write_ops), 1)
  • HDFS Bytes Reads/Writes Per Executor
    • Executor HDFS Reads
      aliasByMetric($application.*.executor.filesystem.hdfs.read_bytes)
    • Executor HDFS Bytes Written
      aliasByMetric($application.*.executor.filesystem.hdfs.write_bytes)

Also, please let me know the queries for the following:

  • HDFS Read/Write Byte Rate
    • HDFS Read Rate/Sec
    • HDFS Write Rate/Sec

Looking forward to your update on these.
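
For the Completed Tasks/Minute panel listed above, one possible variation is to take the derivative of the counter first and then sum the per-interval increments into one-minute buckets, instead of averaging the raw counter and differencing the averages. This is only a sketch, assuming completeTasks is a cumulative counter and that node 1 of the metric path is the executor ID, as in the other queries:

```text
aliasByNode(summarize(nonNegativeDerivative($application.*.executor.threadpool.completeTasks), '1m', 'sum', false), 1)
```

With this form, each point should be the number of tasks that finished in that minute, and a counter reset (for example after an executor restart) is dropped rather than drawn as a negative spike.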

Contributor

@Jonathan Sneep

Did you get a chance to look into this?


@Nikhil

Nice work. HDFS write bytes by executor should look something like this (be sure to set the left Y unit type to bytes):

aliasByNode($application.*.executor.filesystem.*.write_bytes, 1)

Executor and driver memory usage example (as above, set the left Y unit to bytes):

aliasByNode($application.*.jvm.heap.used, 1)

I'll try to find time later to give you some more examples, but they are mostly slight variations on the examples above :-)
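
As one such variation, the HDFS read/write byte rate asked about in this thread could be sketched by wrapping the byte counters in perSecond, the same function used in the IOPS queries. This assumes read_bytes and write_bytes are cumulative counters and that node 1 of the metric path is the executor ID; the 'total read' and 'total write' labels are just illustrative names. Set the left Y unit to bytes/sec:

```text
aliasByNode(perSecond($application.*.executor.filesystem.hdfs.read_bytes), 1)
aliasByNode(perSecond($application.*.executor.filesystem.hdfs.write_bytes), 1)
alias(sumSeries(perSecond($application.*.executor.filesystem.hdfs.read_bytes)), 'total read')
alias(sumSeries(perSecond($application.*.executor.filesystem.hdfs.write_bytes)), 'total write')
```

Summing the per-executor rates, rather than taking perSecond of the summed counters, tends to behave better when executors come and go, because an executor that disappears no longer makes the summed counter drop.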
