Hello @Atradius, thank you for reaching out to the Community about both of your NameNodes being down.

Do you see JvmPauseMonitor entries in your NN log saying "Detected pause in JVM or host machine" with a value larger than 1000ms? That can indicate that the service is running out of heap. If the affected service is the NameNode, the short-term solution is to increase the heap and restart the service. The long-term solution is to identify why it ran out of heap, e.g. are you facing a small files issue? Please read this article on how to tackle it: https://blog.cloudera.com/small-files-big-foils-addressing-the-associated-metadata-and-application-challenges/

Losing quorum might be caused by a ZK service issue, when ZK itself is not in quorum. Please check the ZK logs as well.

Please let us know if you need more input to progress with your investigation.

Best regards: Ferenc
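
The log check described above can be sketched as a small Python helper. This is only an illustration: it assumes the JvmPauseMonitor message ends with "pause of approximately <N>ms", so adjust the pattern if your Hadoop version words it differently. The function names and the 1000 ms default are chosen for this example.

```python
import re

# Assumed message format: "... Detected pause in JVM or host machine (eg GC):
# pause of approximately 2345ms". Adjust the pattern if your logs differ.
PAUSE_RE = re.compile(
    r"Detected pause in JVM or host machine.*?approximately (\d+)\s*ms"
)


def find_long_pauses(lines, threshold_ms=1000):
    """Return (pause_ms, line) for each pause entry longer than threshold_ms."""
    hits = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match is None:
            continue
        pause_ms = int(match.group(1))
        if pause_ms > threshold_ms:
            hits.append((pause_ms, line.rstrip()))
    return hits


def scan_log(path, threshold_ms=1000):
    """Scan one log file; undecodable bytes are replaced rather than raising."""
    with open(path, errors="replace") as log:
        return find_long_pauses(log, threshold_ms)
```

Calling `scan_log` with the path to your NameNode log returns the matching entries. Many hits, or steadily growing pause values, would support the heap-pressure theory.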
@bgooley Thanks a bunch! This is good info. I now see the note below, which means /usr/lib/jvm is a valid location for OpenJDK. Note: Cloudera strongly recommends installing Oracle JDK at /usr/java/<jdk-version> and OpenJDK at /usr/lib/jvm (or /usr/lib64/jvm on SLES 12), which allows Cloudera Manager to auto-detect and use the correct JDK version. Unfortunately, the CDH 5.16 install guide doesn't clarify that /usr/lib/jvm is a valid path for OpenJDK; it makes a blanket statement that the JDK must be installed at /usr/java/jdk-version. Hopefully they will update the doc in the future. https://www.cloudera.com/documentation/enterprise/5-16-x/topics/cdh_ig_jdk_installation.html