Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

YARN Tuning - How is memory overhead estimate determined?

Expert Contributor

Looking at the "Tuning YARN" documentation page, I noticed that the estimate for "Task Overhead" seems very large: 51GB for the example given, and 24GB on the tuning spreadsheet. The spreadsheet comment says: "Allow additional overhead for task buffers, such as the HDFS Sort I/O buffer, JVM overheads etc." However, it does not explain how the number is derived. This seems steep for the small and medium clusters that probably account for the majority of users. A similar configuration guide at Hortonworks uses a rough ratio of 1/8 instead.

 

Please help - I need justification for requesting additional hardware $$s.

 

Thanks,

Miles

 

 

1 ACCEPTED SOLUTION

Mentor
The value on the doc page was picked as about 20% of the RAM, reserved for overhead, but you could set it lower. Our past overcommit testing showed that overhead usage can reach close to an extra 20% for some tested workloads, but that will not always be the case, and this may also have changed recently.

We will be reworking the docs for these recommendations soon, as developments happen. For now, please rely on the XLSX file for a closer guideline on the recommended calculated values.
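
As a rough illustration of the arithmetic, the sketch below compares the "about 20% of RAM" reservation with the 1/8 ratio mentioned in the question. The 256 GB node size is an assumption made only for this example (the thread does not state it), and the function name is hypothetical; the XLSX file remains the better guide for real values.

```python
def overhead_gb(node_ram_gb: float, fraction: float) -> float:
    """Return the memory set aside for task overhead as a fraction of node RAM."""
    return node_ram_gb * fraction


# Assumption: a 256 GB worker node. This size is not given in the thread.
NODE_RAM_GB = 256

# "About 20% of the RAM" as described in the reply.
reserve_20pct = overhead_gb(NODE_RAM_GB, 0.20)

# Rough 1/8 ratio mentioned in the question.
reserve_eighth = overhead_gb(NODE_RAM_GB, 1 / 8)

print(f"20% reserve: {reserve_20pct:.1f} GB")
print(f"1/8 reserve: {reserve_eighth:.1f} GB")
print(f"Difference:  {reserve_20pct - reserve_eighth:.1f} GB")
```

Under that assumed node size, the 20% rule reserves roughly 51 GB, while the 1/8 ratio reserves 32 GB. Since the reply notes the reservation can be set lower, the gap between the two is the amount worth validating against your own workloads.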

