Slow ReadProcessor warnings leading to application failure

We are seeing Slow ReadProcessor warnings while pulling data from Kafka with Spark applications. After a few Slow ReadProcessor warnings, the applications fail. A partial log is attached. Please let us know if you need further information.


Please find the warning message below. I am seeing these logs frequently, and my application is also taking too long to complete.


2021-12-13 03:25:00 WARN DFSClient:854 - Slow ReadProcessor read fields took 117390ms (threshold=30000ms); ack: seqno: 353 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 778712 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[,DS-ec5cff3e-e958-416e-9ad8-de319cfbc28a,DISK], DatanodeInfoWithStorage[,DS-61163e3d-59ef-4dfc-b194-7385cff86a7c,DISK], DatanodeInfoWithStorage[,DS-af490217-ef46-4d92-bd6e-78bda82c82dc,DISK]]


A couple of possible causes for these WARN messages:

1) A GC issue on the datanode can produce this type of WARN message.

2) A disk issue.

3) Network latency/slowness between the application, the Kafka node, and the datanode.
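To narrow down which of these applies, it can help to first pull the numbers out of the client-side warnings. Below is a minimal Python sketch that parses a line shaped like the WARN quoted above and extracts the timestamp, the observed time, the threshold, and the storage IDs of the pipeline targets. The regular expressions are an assumption based only on that one sample line; other Hadoop versions or log4j layouts may need adjustments. The function name `parse_slow_read_processor` is just illustrative.

```python
import re
from datetime import datetime

# Assumed layout, taken from the single WARN line quoted in this thread.
SLOW_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*?"
    r"Slow ReadProcessor read fields took (?P<took>\d+)ms "
    r"\(threshold=(?P<threshold>\d+)ms\)"
)
# Captures the storage ID and storage type of each pipeline target.
STORAGE_RE = re.compile(r"DatanodeInfoWithStorage\[[^,\]]*,(DS-[0-9a-f-]+),(\w+)\]")


def parse_slow_read_processor(line):
    """Return a dict describing the warning, or None if the line does not match."""
    m = SLOW_RE.search(line)
    if m is None:
        return None
    return {
        "timestamp": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S"),
        "took_ms": int(m.group("took")),
        "threshold_ms": int(m.group("threshold")),
        # List of (storage_id, storage_type) tuples, in pipeline order.
        "storages": STORAGE_RE.findall(line),
    }
```

Running this over the whole application log and sorting by `took_ms` would show whether the same storage IDs keep appearing in the slowest acks, which would point at a specific datanode rather than general network slowness.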

I agree with @Nandinin's suggestion. Adding some thoughts on the HDFS side for your reference:

1. You now know which 3 DNs may be slow in the pipeline, and the timestamp. So you can go to each datanode log and check whether there are "JvmPauseMonitor" or "Lock held" messages, or other WARN / ERROR entries.

2. Refer to this KB, and check the Slow messages in the DN logs around the above timestamp to determine the main cause.
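The two checks above can be sketched as a small Python helper that scans a datanode log for "JvmPauseMonitor", "Lock held", and "Slow" messages, plus any other WARN / ERROR lines, within a window around the warning timestamp. This is only a rough sketch: it assumes each DN log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and that the client and datanode clocks are in the same time zone. The function name and the 10-minute default window are illustrative choices, not anything prescribed by HDFS.

```python
import re
from datetime import datetime, timedelta

# Assumption: log lines begin with "YYYY-MM-DD HH:MM:SS" (fractional part ignored).
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
# Substrings mentioned in this thread as worth looking for.
PATTERNS = ("JvmPauseMonitor", "Lock held", "Slow")


def scan_datanode_log(lines, center, window_minutes=10):
    """Return (timestamp, matched_patterns, line) for suspicious lines near `center`."""
    lo = center - timedelta(minutes=window_minutes)
    hi = center + timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        m = TS_RE.match(line)
        if m is None:
            continue  # e.g. stack-trace continuation lines without a timestamp
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if not (lo <= ts <= hi):
            continue
        matched = [p for p in PATTERNS if p in line]
        is_warn_or_error = " WARN " in line or " ERROR " in line
        if matched or is_warn_or_error:
            hits.append((ts, matched, line.rstrip("\n")))
    return hits
```

A possible way to use it, with a placeholder log path, is `scan_datanode_log(open("/path/to/datanode.log"), datetime(2021, 12, 13, 3, 25, 0))`, repeated for each of the 3 DNs in the pipeline. If one datanode shows JVM pauses or many Slow messages in that window and the others do not, that node is the likely bottleneck.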






