
Writing to HDFS using the WebHDFS REST API pauses Spark application

New Contributor

Hi guys,

I am running a Spark application that writes some information to log files from inside every reduceByKey() operation. For this I am using the WebHDFS REST API to write to log files in HDFS. The problem is that when I write from each reduce operation, the job runs fine until a certain number of tasks have finished, and then it pauses. If I skip the HDFS write and just perform the reduction, everything runs fine. Each write is small, probably 50 bytes at most. Is there a limit on how many connections a WebHDFS server can have open simultaneously? If not, what could the problem be? I am stuck here, so any direction would be helpful.
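For reference, here is a minimal sketch of what one such write might look like in Python using only the standard library. It is not the code from my application, and it is not a confirmed fix. It only illustrates the two-step WebHDFS APPEND (the NameNode answers with a 307 redirect, then the data goes to the DataNode it points to) with explicit timeouts and with every connection closed, since unclosed connections are one possible thing to rule out. The host name, port 50070, the log path, the user name, and the helper names `webhdfs_url` and `append_bytes` are all assumptions made for illustration.

```python
import http.client
from urllib.parse import quote, urlencode, urlsplit


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path and operation."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


def append_bytes(host, port, path, data, user, timeout=10.0):
    """Append bytes to an existing HDFS file via the two-step WebHDFS APPEND.

    Both connections get a timeout and are always closed, so a slow or
    stuck server raises an error instead of blocking the task forever.
    """
    url = urlsplit(webhdfs_url(host, port, path, "APPEND", **{"user.name": user}))

    # Step 1: ask the NameNode where to send the data (expects a 307 redirect).
    namenode = http.client.HTTPConnection(url.hostname, url.port, timeout=timeout)
    try:
        namenode.request("POST", f"{url.path}?{url.query}")
        response = namenode.getresponse()
        response.read()  # drain the body before closing
        if response.status != 307:
            raise RuntimeError(f"expected 307 redirect, got {response.status}")
        location = response.getheader("Location")
    finally:
        namenode.close()

    # Step 2: send the payload to the DataNode named in the redirect.
    target = urlsplit(location)
    datanode = http.client.HTTPConnection(target.hostname, target.port, timeout=timeout)
    try:
        datanode.request("POST", f"{target.path}?{target.query}", body=data)
        response = datanode.getresponse()
        response.read()
        return response.status
    finally:
        datanode.close()
```

A call would look like `append_bytes("namenode", 50070, "/logs/app.log", b"key=42\n", "spark")`, again with placeholder values. Buffering the log lines and appending once per partition instead of once per reduce call would also cut the number of connections, though I have not verified that this is related to the pause.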


Which HDP version are you running?

New Contributor

HDP- is the version I am currently using.