Hi guys,
I am running a Spark application that writes some information to log files from inside every reduceByKey() operation. For this I am using the WebHDFS REST API to write to log files in HDFS. The problem is that when I write from each reduce operation, the job runs fine until a certain number of tasks have finished, and then it pauses. If I skip the HDFS write and just perform the reduction, everything runs fine. The data being written is small, probably 50 bytes at most. Is there a limit on how many connections a WebHDFS server can have open simultaneously? If not, what could the problem be? I am stuck here, so any direction would be helpful.