I am running an 11-node HDP 2.4 cluster with 8 data nodes. I have recently had a lot of trouble with HDFS, and one data node in particular has been very unstable, going down at least once on most days over the past two weeks. Looking through the data node log, I have seen the following error a number of times in the ~5 minutes before shutdown:
java.io.IOException: cannot find BPOfferService for bpid=BP-1426797840-xx.xx.xx.xx-1461158403571
A few minutes before that occurs, I also noticed the following:
java.lang.OutOfMemoryError: Java heap space
I expect that increasing the Java maximum heap size will prevent the out-of-memory error. However, before I do that, I wanted to check whether there are any fixes for the first error. Is it possible that the BPOfferService exception is caused by running out of heap space?
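To confirm the order in which the two errors appear, a small script can scan the data node log and report the first occurrence of each. This is only a sketch: it assumes each log entry starts with a timestamp like "2016-05-10 12:34:56,789", and that stack trace lines without a timestamp belong to the most recent timestamped entry. The function names are made up for illustration.

```python
import re
from datetime import datetime

# Assumed timestamp prefix on each log entry, e.g. "2016-05-10 12:34:56,789".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}")

# Substrings taken from the two errors quoted in the question.
PATTERNS = {
    "oom": "java.lang.OutOfMemoryError",
    "bpoffer": "cannot find BPOfferService",
}


def find_events(lines):
    """Return (timestamp, kind) pairs for every line matching a pattern.

    Lines without their own timestamp (such as exception lines) inherit
    the timestamp of the most recent timestamped line.
    """
    events = []
    last_ts = None
    for line in lines:
        m = TS.match(line)
        if m:
            last_ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        for kind, needle in PATTERNS.items():
            if needle in line:
                events.append((last_ts, kind))
    return events


def first_of_each(events):
    """Map each event kind to the timestamp of its first occurrence."""
    first = {}
    for ts, kind in events:
        first.setdefault(kind, ts)
    return first
```

To run it against a real log, pass an open file object, for example find_events(open(path)), where path is wherever the data node log lives on your host.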
The BPOfferService exception is not related to heap space. Check whether restarting the cluster resolves the issue; the DataNode may be holding a stale reference to the NameNode (for example, if you formatted the NameNode earlier).
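One way to check for a stale reference after a NameNode format is to compare the clusterID recorded in the VERSION files on both sides. The sketch below assumes you have read the contents of current/VERSION from the NameNode's dfs.namenode.name.dir and from the DataNode's dfs.datanode.data.dir; the helper names are hypothetical, and the actual directories depend on your configuration.

```python
def read_version(text):
    """Parse the key=value lines of an HDFS VERSION file into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def same_cluster(namenode_version_text, datanode_version_text):
    """Return True only if both files carry the same, non-missing clusterID."""
    nn_id = read_version(namenode_version_text).get("clusterID")
    dn_id = read_version(datanode_version_text).get("clusterID")
    return nn_id is not None and nn_id == dn_id
```

If same_cluster returns False, the DataNode storage directory was initialized against a different NameNode namespace, which would be consistent with the stale-reference explanation above.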