I am using HDP 22.214.171.124-187 and have enabled HBase replication. I have 10 RegionServers in total. Everything was working fine, but replication suddenly stopped on one RegionServer, while the other nine are still replicating.
In the logs of the RegionServer that stopped replicating, I found:
regionserver.ReplicationSourceWALReader: Failed to read stream of replication entries
java.io.EOFException: Cannot seek after EOF
    at org.apache.hadoop.hdfs.DFSInputStream.seek
If you have fixed the issue, kindly update the post with the solution.
The exception is an EOFException thrown from DFSInputStream. It is worth checking whether the affected RegionServer has any zero-length WAL file under its WAL directory. Alternatively, enable TRACE logging on that RegionServer to capture more detail about the exception, or check which WAL is currently being replicated and whether there are any filesystem issues with the blocks of that WAL file.
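The zero-length WAL check suggested above could be scripted roughly as follows. This is only a sketch, not an HBase tool: the helper names `find_zero_length_files` and `list_recursively` are made up for illustration, and it assumes the `hdfs` CLI is on the PATH and that you pass in your own WAL directory (the location depends on `hbase.rootdir` in your cluster). It simply parses the standard `hdfs dfs -ls -R` listing, where the fifth column is the file size in bytes.

```python
import subprocess


def find_zero_length_files(ls_output):
    """Return paths of regular files listed with a size of 0 bytes.

    Expects text in the `hdfs dfs -ls -R` format:
    permissions, replication, owner, group, size, date, time, path.
    """
    empty = []
    for line in ls_output.splitlines():
        fields = line.split(None, 7)
        # Skip headers such as "Found N items" and directory entries ("d...").
        if len(fields) < 8 or not fields[0].startswith("-"):
            continue
        if fields[4].isdigit() and int(fields[4]) == 0:
            empty.append(fields[7])
    return empty


def list_recursively(hdfs_dir):
    """Run `hdfs dfs -ls -R <hdfs_dir>` and return its stdout as text."""
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", "-R", hdfs_dir],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

You would call `find_zero_length_files(list_recursively(wal_dir))` with the WAL directory of the affected RegionServer. Treat any hit as a candidate rather than proof: a WAL that is still open for writing may also be listed with length 0. For the block-level check, `hdfs fsck <path> -files -blocks -locations` on the suspect WAL file reports the state of its blocks.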