I set up a single-node EC2 instance to run Cloudera Manager. The installation seemed to complete fine, except for this health warning on HDFS:
Bad : 718 under replicated blocks in the cluster. 720 total blocks in the cluster. Percentage under replicated blocks: 99.72%. Critical threshold: 40.00%.
This is the error message I get when I run pyspark:
ERROR spark.SparkContext: Error initializing SparkContext. java.lang.IllegalArgumentException: Required executor memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
I'm not sure where to start looking, but if anybody can point me in the right direction I would appreciate it.
This is my dashboard:
I based my settings on this tutorial by Hue.
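1. You can ignore the under-replicated blocks error. By default HDFS expects 3 replicas of each block, which is not possible on a single-node cluster. Alternatively, you can change the replication factor to 1 and restart the cluster; that may clear the warning. Note that files already written keep their original replication factor unless you change it explicitly (for example with hdfs dfs -setrep).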
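2. Go to YARN -> Configuration and increase "yarn.nodemanager.resource.memory-mb" (and, as the error message suggests, check "yarn.scheduler.maximum-allocation-mb") so that the limit is above the 1408 MB (1024+384) that Spark is requesting. Then restart YARN and try again; that may fix it.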
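
To make the arithmetic in the error message concrete, here is a rough sketch of the check, using the numbers from the question. The function name fits_in_yarn is made up for illustration, and treating the cluster limit as the smaller of the two YARN settings is a simplifying assumption, not Spark's exact logic.

```python
def fits_in_yarn(executor_mb, overhead_mb, max_allocation_mb, nodemanager_mb):
    """Return (required_mb, limit_mb, fits) for one executor container.

    Hypothetical helper: models the limit as the smaller of
    yarn.scheduler.maximum-allocation-mb and
    yarn.nodemanager.resource.memory-mb (a simplification).
    """
    required = executor_mb + overhead_mb
    limit = min(max_allocation_mb, nodemanager_mb)
    return required, limit, required <= limit


# Values from the error message: 1024 MB executor + 384 MB overhead,
# against a 1024 MB maximum.
print(fits_in_yarn(1024, 384, 1024, 1024))

# After raising both YARN settings to 2048 MB (an example value).
print(fits_in_yarn(1024, 384, 2048, 2048))
```

If you would rather not change the YARN settings, requesting a smaller executor may also work, for example pyspark --executor-memory 512m. Assuming the overhead stays at 384 MB, that asks for 896 MB, which is under the 1024 MB limit.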