Relationship between Hive query and missing blocks on cluster

I decommissioned and deleted 2 of my 3 HDFS data nodes. I expected the blocks to have been replicated before the nodes were removed, but they were not.

I started getting under-replication errors on my cluster. I have now started the HDFS balancer, but Hive queries are terribly slow.

Is there some relation between the two? Is it because Hive has to write to three nodes while files are under-replicated?
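
For anyone wanting to confirm the replication state described above, a minimal sketch follows. It assumes the `hdfs` CLI is on the PATH and that the `hdfs fsck` summary contains lines such as `Under-replicated blocks:` and `Missing replicas:`; the exact layout can differ between Hadoop versions, and the helper names `parse_fsck_summary` and `run_fsck` are made up for illustration.

```python
import re
import subprocess


def parse_fsck_summary(text):
    """Pull a few block-health counters out of `hdfs fsck` summary text.

    The labels below are assumed from typical fsck output; a counter is
    None when its label is not found in the text.
    """
    patterns = {
        "under_replicated": r"Under-replicated blocks:\s+(\d+)",
        "missing_replicas": r"Missing replicas:\s+(\d+)",
        "corrupt": r"Corrupt blocks:\s+(\d+)",
    }
    counts = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        counts[name] = int(match.group(1)) if match else None
    return counts


def run_fsck(path="/"):
    """Run `hdfs fsck` on a path and return the parsed counters."""
    # check=False: fsck may exit non-zero when the filesystem is unhealthy,
    # and we still want to read its report.
    result = subprocess.run(
        ["hdfs", "fsck", path],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_fsck_summary(result.stdout)
```

Calling `run_fsck("/")` on a cluster node returns a small dictionary of counters. A non-zero `under_replicated` value on a cluster with a single remaining data node would be expected whenever the replication factor is greater than 1, since there is nowhere to place the extra replicas.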

21 REPLIES

I believe I am already using Beeline. Yes, I tried switching back to the MapReduce execution engine but still get the same error. @Geoffrey Shelton Okot I do have HiveServer2 up and running on the cluster. Also, I would rather not switch to Hive on Spark because of this unidentified issue. Could you point out what else might interrupt a MapReduce job, or whether it is possible to escalate this?

Master Mentor

@Sim kaur

Can you share the latest versions of these 2 files: /var/log/hive/*.err and /var/log/hive/*.log?
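
If several files match those patterns, one way to pick the most recent one and grab its tail for posting is sketched below. This is only an illustration: the helper names `newest_file` and `tail` are hypothetical, and the 200-line default is an arbitrary assumption rather than anything requested in the thread.

```python
import glob
import os


def newest_file(pattern):
    """Return the most recently modified file matching a glob pattern, or None."""
    matches = glob.glob(pattern)
    if not matches:
        return None
    return max(matches, key=os.path.getmtime)


def tail(path, max_lines=200):
    """Return up to the last `max_lines` lines of a text file."""
    # errors="replace" avoids failing on stray non-UTF-8 bytes in logs.
    with open(path, "r", errors="replace") as handle:
        return handle.readlines()[-max_lines:]


if __name__ == "__main__":
    for pattern in ("/var/log/hive/*.err", "/var/log/hive/*.log"):
        latest = newest_file(pattern)
        if latest is None:
            print(f"No files match {pattern}")
            continue
        print(f"==> {latest} <==")
        print("".join(tail(latest)))
```

Reading the whole file before slicing keeps the sketch short; for very large logs, a streaming approach would be a better fit.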