Spark small-file read performance degrades in HDP 3.1

After upgrading to HDP 3.1, our Spark job hangs for about 2 hours while loading a large number of small files.

The log stalls at this line:

FileInputFormat: Total input paths to process : 480864

Before the upgrade, the same job took only 10+ minutes on HDP 2.6.4.

Is this a known issue? I searched online but found nothing.

Thanks in advance!
