I am facing issues while running Spark 1.6 jobs in YARN cluster mode with the following configuration:
--master yarn --deploy-mode cluster --executor-cores 8 --num-executors 3 --executor-memory 25G --driver-memory 6g --conf spark.network.timeout=10000000 --conf spark.cores.max=35 --conf spark.memory.fraction=0.6 --conf spark.memory.storageFraction=0.5 --conf spark.shuffle.memoryFraction=1
I am also setting spark.sql.shuffle.partitions=30 in the Spark config.xml.
I am running the job with the above command on a three-node Hortonworks cluster, where each node has about 51 GB of memory available. The input is approximately 254 million records. The job crashes while inserting data into Hive, with an "Executor Lost" error and exit code 143. There is very heavy shuffling of data during processing.
Can you please suggest what can be done to resolve this issue?
Also, how can we determine, based on the input size, which memory parameters to use when running the job?
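
For reference, here is a rough sketch of the sizing arithmetic behind the numbers above. It is only an illustration, not a diagnosis. It assumes the Spark 1.6 default for spark.yarn.executor.memoryOverhead (10% of the heap, with a 384 MB minimum), that the 51 GB per node is what YARN can actually allocate, and that records spread evenly across shuffle partitions. The helper name yarn_container_mib is hypothetical.

```python
# Rough sizing arithmetic for the configuration in the question.
# Assumption: overhead = max(384 MiB, 10% of heap), the Spark 1.6 default
# for spark.yarn.executor.memoryOverhead when it is not set explicitly.

MIB_PER_GIB = 1024


def yarn_container_mib(heap_mib, overhead_factor=0.10, min_overhead_mib=384):
    """Memory YARN must grant for one JVM: heap plus off-heap overhead."""
    overhead = max(min_overhead_mib, int(heap_mib * overhead_factor))
    return heap_mib + overhead


# Values taken from the spark-submit flags in the question.
executor_container = yarn_container_mib(25 * MIB_PER_GIB)  # --executor-memory 25G
driver_container = yarn_container_mib(6 * MIB_PER_GIB)     # --driver-memory 6g

# Assumption: all 51 GB on a node is available to YARN containers.
node_mib = 51 * MIB_PER_GIB
executors_per_node = node_mib // executor_container
driver_node_fits = executor_container + driver_container <= node_mib

# Assumption: records are evenly distributed (no skew).
records_per_partition = 254_000_000 // 30  # spark.sql.shuffle.partitions=30

print("executor container (MiB):", executor_container)
print("driver container (MiB):", driver_container)
print("executors that fit per node:", executors_per_node)
print("driver + executor fit on one node:", driver_node_fits)
print("records per shuffle partition:", records_per_partition)
```

Under these assumptions each shuffle partition holds roughly 8.5 million records, which is the number I would like to relate to the executor memory settings.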