Set maximum containers on a Hive query
I have a Hive insert statement which, by default, uses all available resources in YARN because it reads a large volume of data.
I am happy for the query to take longer and use fewer resources so that other users can also access compute resources.
I don't want to set up YARN queues, as this is an unusual query and I don't want to permanently restrict the cluster.
If I were using Spark, I could do this quite easily by setting the number of executors. Is there a Hive config that allows me to do this at the query level?
I have looked at various other posts, such as those below, but nothing seems to allow this.
I have also seen this: https://community.cloudera.com/t5/Support-Questions/How-are-number-of-mappers-determined-for-a-query... - but I am not sure if changing split sizes is a good idea. Would this then impact the structure of the data stored by my query?
Grateful for any suggestions.
Created ‎11-23-2021 12:53 PM
Hi @Andyjmoss
As you already pointed out: https://community.cloudera.com/t5/Support-Questions/How-are-number-of-mappers-determined-for-a-query...
There is no per-query limit on containers. You can only adjust the max and min grouping size to influence the number of mapper tasks.
"Would this then impact the structure of the data stored by my query?"
No, this only affects how much data each map task will get.
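For reference, the grouping sizes can be set for a single session before running the insert. This is only a sketch, assuming Hive is running on Tez. The byte values are illustrative, not recommendations. It does not cap the number of containers; it only tends to produce fewer, larger map tasks.

```sql
-- Assumption: Hive on Tez. Values are illustrative only.
-- Lower bound on the data grouped into one map task (1 GiB here).
SET tez.grouping.min-size=1073741824;
-- Upper bound on the data grouped into one map task (4 GiB here).
SET tez.grouping.max-size=4294967296;

-- Run the insert in the same session so the settings apply to it.
-- INSERT INTO TABLE ... SELECT ...;
```

These settings last only for the current session, so nothing is changed permanently on the cluster. In a map-only insert each mapper typically writes its own file, so fewer mappers will generally mean fewer, larger output files. The table's schema and contents are not affected.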
Created ‎11-26-2021 01:25 AM
Thanks @rpathak - having discussed this further within our team, we think we are going to try setting up elastic YARN queues to help with this situation.
