Hive queries taking LONG time to start
Labels: Apache YARN
Created ‎03-13-2017 05:30 PM
Lately, we've seen intermittent behavior where Hive queries take a very long time to start.
For example, I submitted a query via Hive CLI on an edge node about 5 minutes ago, and it still isn't in the application manager. There is NOTHING else running. Zero use of our cluster happening right now.
Usually, the queries eventually start.
Occasionally, I get an error like:
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: FAIL
Can anyone help me figure out why this is happening?
Created ‎03-13-2017 05:35 PM
Try setting this before you run your query:
set hive.execution.engine=tez;
Created ‎03-13-2017 05:47 PM
Thanks Adnan,
We use MR and Tez for different queries. But both query engines show this behavior.
Created ‎03-13-2017 06:19 PM
Try running it in debug mode and then post the output here. For the Hive CLI, you can start it like this:
hive --hiveconf hive.root.logger=DEBUG,console
Once that's done, re-run the query and see where it fails. That should give you better insight into the failure.
Created ‎03-15-2017 01:06 PM
Thanks @Sumesh
For several minutes, I get messages like this:
DAGClientRPCImpl: GetDAGStatus via AM for app: <application ID> dag:<dag ID>
IPC Client (client ID) connection to <data node> from <edge node> sending X
Got ping response for sessionid: <session ID> after 0ms
... where X is a number that increases each time this gets printed.
This entire time, it shows "-1" as the number of "total" and "pending" mappers, and 0 for everything else.
Created ‎06-14-2018 01:25 AM
Were you able to fix this issue, @Zack Riesland?
If so, can you share how you fixed it?
Created ‎06-14-2018 10:46 AM
I'm sorry, @shaleen somani - this was over a year ago and I don't remember the details any more.
My guess is that our primary and secondary name nodes had failed over for some reason.
I've found that when this happens, things continue to "work", but not quite right and it can be hard to pin down.
You can use the hdfs haadmin utility to check the state of each NameNode.
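For example, here is a rough sketch of that check, assuming an HA-enabled cluster. The service IDs nn1 and nn2 are hypothetical placeholders; yours are whatever is listed under dfs.ha.namenodes.<nameservice> in hdfs-site.xml. The helper name report_nn_states is just for illustration.

```shell
#!/usr/bin/env bash
# Print the HA state of each NameNode service ID passed as an argument.
# Assumption: the IDs match the entries in dfs.ha.namenodes.<nameservice>.
report_nn_states() {
  local id state
  for id in "$@"; do
    # -getServiceState prints "active" or "standby" for the given service ID.
    # If the command fails for any reason, report the state as unknown.
    state=$(hdfs haadmin -getServiceState "$id" 2>/dev/null) || state="unknown"
    echo "$id: $state"
  done
}

# Only query the cluster when the hdfs client is installed on this machine.
# nn1 and nn2 are hypothetical service IDs; replace them with your own.
if command -v hdfs >/dev/null 2>&1; then
  report_nn_states nn1 nn2
fi
```

If the node you expect to be active reports standby (or comes back unknown), a failover may have happened, and that would be worth ruling out before digging into Hive or YARN.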
Good luck!
