Number of MapReduce jobs for single Hive query
Labels: Apache Hive, MapReduce
Explorer
Created on ‎01-28-2016 06:27 AM - edited ‎09-16-2022 03:00 AM
Hi,
How can I find out the number of MapReduce jobs launched for a single Hive query?
If I execute the query below, how many MapReduce jobs will be launched, and in what sequence?
select col1, col2, sum(col3), count(col4), avg(col5) from sample_table where col1=condition;
Thanks in advance
1 ACCEPTED SOLUTION
Mentor
Created ‎02-28-2016 01:58 AM
You can run EXPLAIN on a query to see how Hive plans to execute it, including the stages the query is broken into. The stage count gives you a sense of how many jobs will run, or something close to it.
Your query as written is invalid in HiveQL, because col1 and col2 are selected alongside aggregate functions without being grouped. With a GROUP BY clause added for col1 and col2 to make it legal, it would take a single job.
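As a rough sketch of the above, the corrected query can be prefixed with EXPLAIN. The literal 'some_value' is a hypothetical stand-in for the unspecified condition in the question, and the GROUP BY clause is the addition described above, not part of the original query.

```sql
-- Ask Hive for its execution plan instead of running the query.
-- 'some_value' is a placeholder for the question's unspecified condition.
EXPLAIN
SELECT col1, col2, SUM(col3), COUNT(col4), AVG(col5)
FROM sample_table
WHERE col1 = 'some_value'
GROUP BY col1, col2;  -- added so the non-aggregated columns are legal
```

On the MapReduce engine, the plan output typically lists the stages and their dependencies. Stages marked as Map Reduce correspond to jobs, while a final fetch stage only returns results to the client and is not a job. The exact layout varies by Hive version and execution engine, so treat the count as an estimate.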
