I'm working on a Cloudera Express Cluster with CDH 5.14, the architecture is the following :
Total : 6 nodes
3 nodes which are master node and data node
We are not really using HDFS as we have mounted NFS share on the 6 nodes to replace HDFS (we changed configuration to write in this mount point, it's very similar to Dell EMC Isilon, but it's another constructor. The cluster is healthy, so no issue here.
We are using mesos masters on 2 external nodes, on 5 of our cluster nodes we have installed mesos slaves.
Spark 1.6.1 is used on the 2 external nodes, when a job is submitted from those nodes, a new docker container is created on each spark executor to execute the different tasks of our job.
Yet, we need to replace an old cluster, the jobs are the same (except the write location replaced by the NFS share instead of HDFS), we have one node less compare to the old cluster but each cluster's node have more resources allowed (i think it's pretty good) :
vCPU : 16 RAM : 90 GB
The issue is :
For a similar job, on the old cluster the job finished after 7 minutes, on our new cluster it takes more than 50 minutes ... After spark UI investigation, we discovered that there is a lot of time taken by the "suffle write time", and i don't understand why.
Is it a tuning issue of spark ? An issue with mesos configuration ? May be the job's code which is not optimal ?
How can I reduce the shuffle write time, is there any parameter on Cloudera which can help us (for example increase serialization, etc ....)