
Deciding spark job configuration

I'd like to know how to configure a Spark job running on a small cluster of 3 nodes.

Each of the 3 nodes has 4 cores and 16 GB of RAM. What would be the best configuration to process a 3.5 GB file? Basically, the task is to search for the contents of one RDD (200 keywords) in another RDD (40 million lines).
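Since one RDD holds only 200 keywords, a common Spark pattern for this kind of search is to collect the small RDD to the driver, broadcast it, and filter the large RDD against it. Below is a minimal sketch of the matching logic; the names (`contains_keyword`, `keyword_rdd`, `big_rdd`) are illustrative, not from the original post, and the example runs in plain Python with the Spark usage shown in comments:

```python
# Sketch of the matching step you would run inside rdd.filter() after
# broadcasting the small keyword set. Pure Python here so it runs
# stand-alone; the Spark wiring is shown in comments below.

def contains_keyword(line, keywords):
    """Return True if any of the keywords occurs in the line."""
    return any(kw in line for kw in keywords)

keywords = {"error", "timeout"}          # stands in for the 200-keyword RDD
lines = [                                # stands in for the 40M-line RDD
    "request ok",
    "connection timeout after 30s",
    "fatal error in worker",
]

matches = [line for line in lines if contains_keyword(line, keywords)]
print(matches)

# In Spark (illustrative, assuming a SparkContext named sc):
#   bc = sc.broadcast(set(keyword_rdd.collect()))  # 200 keywords fit in memory
#   hits = big_rdd.filter(lambda line: contains_keyword(line, bc.value))
```

Broadcasting avoids a shuffle-heavy join: the 200 keywords are shipped once to each executor instead of repartitioning the 40-million-line RDD.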

How can I configure the number of executors, cores, and memory to get the best performance?
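A common back-of-the-envelope heuristic is to reserve one core and about 1 GB of RAM per node for the OS and cluster daemons, run one executor per node with the remaining cores, and leave roughly 10% headroom for YARN's off-heap memory overhead. The arithmetic below is a sketch under those assumptions, not a guaranteed optimum for this workload:

```python
# Back-of-the-envelope executor sizing for 3 nodes x 4 cores x 16 GB
# (illustrative heuristic; tune against your own workload).

NODES = 3
CORES_PER_NODE = 4
RAM_PER_NODE_GB = 16

# Reserve 1 core and 1 GB per node for the OS and Hadoop/YARN daemons.
usable_cores_per_node = CORES_PER_NODE - 1    # 3
usable_ram_per_node_gb = RAM_PER_NODE_GB - 1  # 15

# One executor per node keeps all remaining cores in a single JVM,
# which is fine at 3 cores (the usual advice is <= 5 cores/executor).
executors = NODES                             # 3
executor_cores = usable_cores_per_node        # 3

# YARN requests spark.executor.memory plus an off-heap overhead of
# max(384 MB, 10% of executor memory), so leave ~10% headroom.
executor_memory_gb = int(usable_ram_per_node_gb / 1.1)

print(f"--num-executors {executors} "
      f"--executor-cores {executor_cores} "
      f"--executor-memory {executor_memory_gb}G")
```

That works out to roughly `--num-executors 3 --executor-cores 3 --executor-memory 13G`. Note that in yarn-cluster mode the driver occupies one container itself, so you may effectively get 2 executors plus the driver; a 3.5 GB input split across this many cores should still process comfortably.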