Hive queries taking LONG time to start

Super Collaborator

Lately, we've seen intermittent behavior where Hive queries take a very long time to start.

For example, I submitted a query via Hive CLI on an edge node about 5 minutes ago, and it still isn't in the application manager. There is NOTHING else running. Zero use of our cluster happening right now.

Usually, the queries eventually start.

Occasionally, I get an error like this:

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: FAIL

Can anyone help me figure out why this is happening?

6 REPLIES

Re: Hive queries taking LONG time to start

Expert Contributor
This is just a suggestion, but have you tried running Hive on Tez? It's generally a faster and more efficient execution engine than MapReduce.

Try this before you execute your code.

set hive.execution.engine=tez;

Re: Hive queries taking LONG time to start

Super Collaborator

Thanks Adnan,

We use MR and Tez for different queries. But both query engines show this behavior.


Re: Hive queries taking LONG time to start

Expert Contributor

Try running it in debug mode and then post the output here. For the Hive CLI, you can start it like this:

hive --hiveconf hive.root.logger=DEBUG,console

Once that's done, re-run the query and see where it fails. That should give you better insight into the failure.
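
If the debug output scrolls by too quickly to read, it can help to keep a copy on disk. Here is a minimal sketch, assuming the query is short enough to pass with the CLI's -e option. The function name run_hive_debug and the default log file name hive-debug.log are made up for illustration.

```shell
# Hypothetical helper: run one query with DEBUG logging and save the output.
# $1 = query text, $2 = log file (optional, defaults to hive-debug.log)
run_hive_debug() {
    query=$1
    log_file=${2:-hive-debug.log}
    # Merge stderr into stdout so log messages written to either stream
    # end up in the saved file as well as on the console.
    hive --hiveconf hive.root.logger=DEBUG,console -e "$query" 2>&1 | tee "$log_file"
}
```

You would call it as run_hive_debug "<your query>" and then search the saved file for the point where the output stalls or the first ERROR appears.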


Re: Hive queries taking LONG time to start

Super Collaborator

Thanks @Sumesh

For several minutes, I get messages like this:

DAGClientRPCImpl: GetDAGStatus via AM for app: <application ID> dag:<dag ID>
IPC Client (client ID) connection to <data node> from <edge node> sending X
Got ping response for sessionid: <session ID> after 0ms

... where X is a number that increases each time this gets printed.

This entire time, it shows "-1" as the number of "total" and "pending" mappers, and 0 for everything else.


Re: Hive queries taking LONG time to start

New Contributor

Were you able to fix this issue, @Zack Riesland?

If so, can you share the solution?


Re: Hive queries taking LONG time to start

Super Collaborator

I'm sorry, @shaleen somani - this was over a year ago and I don't remember the details any more.

My guess is that our primary and secondary name nodes had failed over for some reason.

I've found that when this happens, things continue to "work", but not quite right and it can be hard to pin down.

You can use the hdfs haadmin utility to check the status.
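
As a rough sketch of that check: hdfs haadmin -getServiceState prints the HA state (active or standby) of a single NameNode, given its service ID. The function name check_namenode_states and the IDs nn1 and nn2 in the usage note are placeholders; the real IDs are whatever dfs.ha.namenodes.<nameservice> lists in your hdfs-site.xml.

```shell
# Hypothetical helper: print the HA state of each NameNode service ID given.
# Arguments are NameNode service IDs from hdfs-site.xml (placeholders: nn1 nn2).
check_namenode_states() {
    for nn_id in "$@"; do
        # If the command fails (for example, the NameNode is down),
        # report that instead of an empty state.
        state=$(hdfs haadmin -getServiceState "$nn_id" 2>/dev/null) || state="unreachable"
        echo "$nn_id: $state"
    done
}
```

Running check_namenode_states nn1 nn2 should show exactly one NameNode as active. If the active one is not the node you expect, a failover has probably happened.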
Good luck!
