Lately, we've seen intermittent behavior where Hive queries take a very long time to start.
For example, I submitted a query via Hive CLI on an edge node about 5 minutes ago, and it still isn't in the application manager. There is NOTHING else running. Zero use of our cluster happening right now.
Usually, the queries eventually start.
Occasionally, I get an error like: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Can anyone help me figure out why this is happening?
Try this before you execute your code.
Try running it in debug mode and then post the output here. For the Hive CLI, you can do this:
hive --hiveconf hive.root.logger=DEBUG,console
Once that's done, re-run the query and see where it fails. That should give you better insight into the failure.
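If the debug output is too long to copy from the terminal, one option is to capture it to a file while still watching it on screen. This is only a sketch: the function name run_hive_debug and the file names in the usage line are made up for illustration, and it assumes the query is saved in a file so it can be passed with -f.

```shell
# Hypothetical helper: run a query file with DEBUG logging and keep a copy of the output.
#   $1 = path to the query file, $2 = path of the log file to write
run_hive_debug() {
  # Log messages may go to stderr, so merge stderr into stdout before tee.
  hive --hiveconf hive.root.logger=DEBUG,console -f "$1" 2>&1 | tee "$2"
}
```

For example: run_hive_debug query.hql hive-debug.log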
For several minutes, I get messages like this:
DAGClientRPCImpl: GetDAGStatus via AM for app: <application ID> dag:<dag ID>
IPC Client (client ID) connection to <data node> from <edge node> sending X
Got ping response for sessionid: <session ID> after 0ms
... where X is a number that increases each time this gets printed.
This entire time, it shows "-1" as the number of "total" and "pending" mappers, and 0 for everything else.
I'm sorry, @shaleen somani - this was over a year ago and I don't remember the details any more.
My guess is that our primary and secondary name nodes had failed over for some reason.
I've found that when this happens, things continue to "work", but not quite right, and the cause can be hard to pin down.
You can use the hdfs haadmin utility to check the status.
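A rough sketch of that check, assuming NameNode HA is configured and the hdfs client is on the PATH of the node you run it from. The function name check_nn_ha_state is made up here, and it only looks at the first nameservice listed in dfs.nameservices; the NameNode IDs are read from the cluster configuration rather than guessed.

```shell
# Hypothetical helper: print the HA state of each NameNode in the first nameservice.
check_nn_ha_state() {
  # First entry of dfs.nameservices (assumption: that is the one you care about).
  ns=$(hdfs getconf -confKey dfs.nameservices | cut -d, -f1)
  # Comma-separated NameNode IDs configured for that nameservice.
  ids=$(hdfs getconf -confKey "dfs.ha.namenodes.${ns}")
  for id in $(echo "$ids" | tr ',' ' '); do
    # -getServiceState reports the state of a single NameNode (active or standby).
    echo "${id}: $(hdfs haadmin -getServiceState "$id")"
  done
}
```

Calling check_nn_ha_state should print one line per NameNode. In a healthy HA pair you would expect exactly one to be active; if the active one is not the node you expect, a failover has probably happened.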