What is the Hortonworks recommendation on Swap usage?
Created on ‎03-11-2016 08:07 PM - edited ‎09-16-2022 03:08 AM
Hello,
I am looking for a recommendation on swap usage in an HDP cluster.
We currently disable swap in our deployments so that Hadoop JVM performance is not degraded by swapping.
The reasoning is that it is better to kill a process that is out of memory and have it rescheduled than to accept the performance impact.
This is the current documentation available from Hortonworks regarding partitioning.
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_cluster-planning-guide/content/ch_partitioning_chapter.html
Can you please provide some clarity on this?
Thank you,
David
Created ‎03-12-2016 02:19 PM
As with many topics, "it depends".
For slave/worker/data hosts which only run distributed services, you can likely disable swap. With distributed services it's preferable to let the process/host be killed rather than swap. Killing that process or host shouldn't affect cluster availability.
Said another way: you want to "fail fast", not "slowly degrade". Just one bad process/host can greatly degrade the performance of the whole cluster. For example, in a 350-host cluster, removing 2 bad nodes improved throughput by ~2x:
- http://www.slideshare.net/t3rmin4t0r/tez8-ui-walkthrough/23
- http://pages.cs.wisc.edu/~thanhdo/pdf/talk-socc-limplock.pdf
For masters, swap is also often disabled though it's not a set rule from Hortonworks and I assume there will be some discussion/disagreement. Masters can be treated somewhat like you'd treat masters in other, non-Hadoop, environments.
The fear with disabling swap on masters is that an OOM (out of memory) event could affect cluster availability. But that will still happen with swap configured; it will just take slightly longer. Good administrator/operator practice is to monitor RAM availability and fix any issues before running out of memory. That maintains availability without affecting performance, so no swap is needed.
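To make "monitor RAM availability" concrete, here is a minimal sketch, not an official Hortonworks tool. It assumes a Linux host whose /proc/meminfo exposes the `MemAvailable` field (older kernels may not), and the 10% warning threshold and all function names are illustrative assumptions, not values from this thread.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Key: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def check_memory(text, warn_below=0.10):
    """Return (status, fraction_available); warn_below is an assumed threshold."""
    info = parse_meminfo(text)
    # Assumes MemAvailable is present; this raises KeyError on kernels without it.
    fraction = info["MemAvailable"] / info["MemTotal"]
    status = "WARN" if fraction < warn_below else "OK"
    return status, fraction


def check_local_host(path="/proc/meminfo"):
    """Run the check against this host's meminfo file."""
    with open(path) as f:
        return check_memory(f.read())
```

In practice you would call `check_local_host()` from whatever monitoring system you already run and alert on "WARN" well before the kernel's OOM killer gets involved.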
Scenarios where you might want swap:
- You are playing with or testing functionality, not performance, on hosts with so little RAM that they will likely need to swap.
- You need, or expect to need, more memory than the amount of RAM that has been purchased, and can accept severe degradation when memory runs out. In this case you would need a lot of swap configured. You're better off buying the right amount of memory.
Extra thoughts:
- If you want to disable swap but your organization requires there to be a swap partition, set swappiness=0.
- If you choose to have swap, set swappiness=1 to avoid swapping until all physical memory has been used.
- Most cloud/virtualization providers disable swap by default. Don't change that.
- Some advise avoiding swap on SSDs because it reduces their lifespan.
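As a rough way to audit hosts against the swappiness suggestions above, the following sketch reads the current setting from /proc/sys/vm/swappiness and the `SwapTotal` field from /proc/meminfo, both standard on Linux. The function names and the wording of the messages are my own assumptions for illustration.

```python
def swap_total_kb(meminfo_text):
    """Return the SwapTotal value (kB) from /proc/meminfo-style text, or 0 if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            return int(line.split()[1])
    return 0


def describe_swap_setup(swap_kb, swappiness):
    """Map a host's swap size and vm.swappiness to the cases discussed in this thread."""
    if swap_kb == 0:
        return "no swap configured"
    if swappiness == 0:
        return "swap present, swappiness=0"
    if swappiness == 1:
        return "swap present, swappiness=1"
    return f"swap present, swappiness={swappiness} (higher than the 0 or 1 suggested above)"


def describe_local_host():
    """Inspect this host; only works on Linux where these /proc files exist."""
    with open("/proc/meminfo") as f:
        swap_kb = swap_total_kb(f.read())
    with open("/proc/sys/vm/swappiness") as f:
        swappiness = int(f.read().strip())
    return describe_swap_setup(swap_kb, swappiness)
```

This only reports the running value. The corresponding sysctl key is `vm.swappiness`, and how you persist it across reboots depends on your distribution and configuration management.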
Created ‎03-11-2016 08:36 PM
David - Thanks for posting. As discussed separately, the 2xRAM recommendation is definitely out of date. I'm working on building consensus with my team on their recommendations, and look forward to others' comments coming in below.
Created ‎03-11-2016 08:38 PM
The questions will be:
1. Should there be a swap partition at all (i.e. swappiness=0)?
2. Do recommendations vary between masters, workers or certain components?
3. If swappiness>=1, what should the amount be?
