
Hive queries taking LONG time to start

Super Collaborator

Lately, we've seen intermittent behavior where Hive queries take a very long time to start.

For example, I submitted a query via Hive CLI on an edge node about 5 minutes ago, and it still isn't in the application manager. There is NOTHING else running. Zero use of our cluster happening right now.

Usually, the queries eventually start.

Occasionally, I get an error like: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

MapReduce Jobs Launched:

Stage-Stage-1: FAIL

Can anyone help me figure out why this is happening?

6 REPLIES

Expert Contributor
This is just a suggestion, but have you tried running Hive on Tez? It's a much faster and more efficient execution engine.

Try this before you execute your code.

set hive.execution.engine=tez;
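If Tez helps, you can also make it the default for every session via hive-site.xml (the property below is the standard one; whether you want this cluster-wide is your call):

<!-- hive-site.xml: make Tez the default execution engine -->
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>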

Super Collaborator

Thanks Adnan,

We use MR and Tez for different queries, but both execution engines show this behavior.

Super Collaborator

Try running it in debug mode and then post the output here. For the Hive CLI, you can do the following:

hive --hiveconf hive.root.logger=DEBUG,console

Once done, re-run the query and see where it stalls. That should give you better insight into the failure here.
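If the console stream is too noisy to follow live, something like this (the query and file name are placeholders for your own) captures everything to a file you can search afterwards:

# Run one query with debug logging and capture all output to a file
hive --hiveconf hive.root.logger=DEBUG,console -e "SELECT count(*) FROM your_table" > hive-debug.log 2>&1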

Super Collaborator

Thanks @Sumesh

For several minutes, I get messages like this:

DAGClientRPCImpl: GetDAGStatus via AM for app: <application ID> dag: <dag ID>
IPC Client (client ID) connection to <data node> from <edge node> sending X
Got ping response for sessionid: <session ID> after 0ms

... where X is a number that increases each time this gets printed.

This entire time, it shows "-1" for both the "total" and "pending" mapper counts, and 0 for everything else.
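In case it helps others hitting the same hang: a DAG sitting at -1 total mappers often means no tasks have been scheduled yet, e.g. because YARN hasn't granted the AM its containers. A quick check with the standard YARN CLI (the queue name "default" is a placeholder for whatever your cluster uses):

# Show apps YARN has accepted but not yet started, plus running ones
yarn application -list -appStates ACCEPTED,RUNNING

# If the app is stuck in ACCEPTED, look at queue capacity and usage
yarn queue -status default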

New Contributor

Were you able to fix this issue, @Zack Riesland?

If yes, can you share the solution?

Super Collaborator

I'm sorry, @shaleen somani - this was over a year ago and I don't remember the details anymore.

My guess is that our active and standby NameNodes had failed over for some reason.

I've found that when this happens, things continue to "work," but not quite right, and the cause can be hard to pin down.

You can use the hdfs haadmin utility to check the status.
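For example (nn1 and nn2 are the NameNode service IDs from a typical hdfs-site.xml HA setup; yours may be named differently):

# Report which NameNode is active and which is standby
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2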
Good luck!