Reaching out to the community for guidance on sizing Spark worker nodes on IBM Power servers. Given that the chip architecture has 8 threads per core (SMT-8) compared to 2 threads per core for Intel x64, should I adjust memory accordingly? I don't want to over-configure, but I also don't want to constrain throughput by not having enough memory. Thanks in advance for any input or guidance.
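To make the tradeoff concrete, here is a rough rule-of-thumb sketch (my own assumption, not an official IBM or Spark sizing guide): size executor memory from usable cores rather than raw SMT threads, since SMT-8 threads share a core's cache and memory bandwidth. The function name, the 5-cores-per-executor heuristic, and the overhead numbers are all illustrative defaults, not recommendations from the Redbooks.

```python
# Hypothetical sizing sketch: more SMT threads => more executor slots per
# node, so per-executor heap shrinks unless you add RAM or raise
# cores-per-executor. All parameters below are illustrative assumptions.

def executor_sizing(physical_cores, smt, node_ram_gb, cores_per_executor=5,
                    os_overhead_gb=16, memory_overhead_frac=0.10):
    """Derive a starting executor count and heap size for one worker node."""
    logical_cpus = physical_cores * smt        # what the OS reports as CPUs
    usable_cpus = logical_cpus - smt           # reserve one core's threads for OS/daemons
    executors = usable_cpus // cores_per_executor
    usable_ram = node_ram_gb - os_overhead_gb
    per_executor_gb = usable_ram / executors
    heap_gb = per_executor_gb * (1 - memory_overhead_frac)  # rough spark.executor.memory
    return executors, round(heap_gb, 1)

# Same core count and RAM, different SMT levels:
print(executor_sizing(20, 8, 512))  # SMT-8: many smaller executors
print(executor_sizing(20, 2, 512))  # SMT-2: fewer, larger executors
```

The takeaway is that with the same RAM, an SMT-8 node spreads memory across roughly four times as many executor slots as an SMT-2 node, so you either add memory per node or run each executor across more threads.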
Thanks for the links to the Redbooks - I checked them (although admittedly did not read each one thoroughly) and wasn't able to find a definitive answer. On a separate note, this post by Raj Krishnamurthy and Randy Swanberg has some good tips for tuning Spark on POWER8.
I didn't search all the Redbooks, but "IBM Data Engine for Hadoop and Spark" has a "Solution reference architecture" chapter that contains some information on sizing.
Also, the book "Hortonworks Data Platform with IBM Spectrum Scale Reference Guide for Building an Integrated Solution" says the following: "For information and assistance about sizing and configuring the HDP on a Power Systems + IBM Spectrum Scale/IBM Elastic Storage Server solution, contact the Cognitive Systems Solution Center (firstname.lastname@example.org)." That book also includes some information about the reference architecture they used.