01-31-2018 08:43 PM - edited 01-31-2018 09:51 PM
After upgrading CDH to 5.14.0-1.cdh5.14.0.p0.24 and SPARK2 to 2.2.0.cloudera2-1.cdh5.12.0.p0.232957, my Spark ETL script hits a strange pause between tasks. The Spark script itself was not changed; it is the same as before the upgrade.
I found that the pause occurs when Spark is reading thousands of files. However, the previous version also read thousands of files, and there was no performance issue. Did I miss something here?