I have a problem with Spark and Hive jobs failing because of missing blocks in the cluster. The Cloudera version is 5.12.0.
The error message is:
Caused by: java.io.IOException: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1229449011-10.65.184.84-1424518888833:blk_1116890537_43225117
We do actually have missing blocks in the cluster, and I have created a list of all missing blocks and their locations.
However, the blocks named in the error message are not in that list, so they are not missing.
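A minimal sketch of how such a cross-check could be scripted, assuming the list of missing blocks is available as plain text lines that each contain a `blk_<id>` token. The listing format and the function names are hypothetical, not anything produced by Cloudera or HDFS tooling; only the error message format is taken from the exception above.

```python
import re

# Matches the block pool ID and block ID in a BlockMissingException message,
# e.g. "BP-...:blk_1116890537_43225117". The trailing number after the second
# underscore is the generation stamp and is ignored for the comparison.
BLOCK_RE = re.compile(r"(BP-[\d.\-]+):(blk_\d+)(?:_\d+)?")


def blocks_in_error(message):
    """Return the set of (block_pool, block_id) pairs named in an error message."""
    return {(m.group(1), m.group(2)) for m in BLOCK_RE.finditer(message)}


def block_ids_in_listing(lines):
    """Collect every blk_<id> found in a missing-block listing (assumed format)."""
    ids = set()
    for line in lines:
        ids.update(re.findall(r"blk_\d+", line))
    return ids


def reported_but_not_listed(message, listing_lines):
    """Blocks named in the error that do NOT appear in the missing-block list."""
    listed = block_ids_in_listing(listing_lines)
    return sorted(b for b in blocks_in_error(message) if b[1] not in listed)
```

If `reported_but_not_listed` returns a non-empty list, the block from the exception is not one of the known missing blocks. That would suggest the client could not read it from any datanode, rather than the block actually being lost.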
The problem occurs when running Hive queries over SSH with "hive -e". If I run the same statement through Hue, it works. Furthermore, the replication factor of the block above is greater than 1.
What could be the cause of this issue? What can I do to debug the problem better?
Thank you in advance,
In my case the problem was a node with a different OS. After decommissioning the node, the error was fixed.
Maybe this information could help somebody.