Memory limit exceeded: cannot perform hash join
Created on 07-25-2016 06:01 AM - edited 09-16-2022 03:31 AM
Hello Everyone,
I am running a cluster with 2 impalads, each with a memory limit of 60GB. I am using the CDH 5.7 package and managing the cluster with Cloudera Manager (CM). I came across a query that failed with the error below:
Memory limit exceeded Cannot perform hash join at node with id 2. Repartitioning did not reduce the size of a spilled partition. Repartitioning level 2. Number of rows 53969358.
Query:
SELECT `dim_experiment`.`experiment_name` AS `experiment_name`
FROM `gwynniebee_bi`.`fact_recommendation_events` `fact_recommendatio`
LEFT OUTER JOIN `gwynniebee_bi`.`dim_experiment` `dim_experiment`
  ON (`fact_recommendatio`.`experiment_key` = `dim_experiment`.`experiment_key`)
GROUP BY 1
Profile:
----------------
Estimated Per-Host Requirements: Memory=2.05GB VCores=3
WARNING: The following tables are missing relevant table and/or column statistics.
gwynniebee_bi.fact_recommendation_events

09:MERGING-EXCHANGE [UNPARTITIONED]
|  order by: dim_experiment.experiment_name ASC
|  hosts=1 per-host-mem=unavailable
|  tuple-ids=3 row-size=38B cardinality=41
|
04:SORT
|  order by: dim_experiment.experiment_name ASC
|  hosts=1 per-host-mem=16.00MB
|  tuple-ids=3 row-size=38B cardinality=41
|
08:AGGREGATE [FINALIZE]
|  group by: dim_experiment.experiment_name
|  hosts=1 per-host-mem=10.00MB
|  tuple-ids=2 row-size=38B cardinality=41
|
07:EXCHANGE [HASH(dim_experiment.experiment_name)]
|  hosts=1 per-host-mem=0B
|  tuple-ids=2 row-size=38B cardinality=41
|
03:AGGREGATE [STREAMING]
|  group by: dim_experiment.experiment_name
|  hosts=1 per-host-mem=10.00MB
|  tuple-ids=2 row-size=38B cardinality=41
|
02:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED]
|  hash predicates: dim_experiment.experiment_key = fact_recommendatio.experiment_key
|  runtime filters: RF000 <- fact_recommendatio.experiment_key
|  hosts=1 per-host-mem=2.00GB
|  tuple-ids=1N,0 row-size=46B cardinality=unavailable
|
|--06:EXCHANGE [HASH(fact_recommendatio.experiment_key)]
|  |  hosts=2 per-host-mem=0B
|  |  tuple-ids=0 row-size=4B cardinality=unavailable
|  |
|  00:SCAN HDFS [gwynniebee_bi.fact_recommendation_events fact_recommendatio, RANDOM]
|     partitions=1/1 files=90 size=3.96GB
|     table stats: unavailable
|     column stats: unavailable
|     hosts=2 per-host-mem=56.00MB
|     tuple-ids=0 row-size=4B cardinality=unavailable
|
05:EXCHANGE [HASH(dim_experiment.experiment_key)]
|  hosts=1 per-host-mem=0B
|  tuple-ids=1 row-size=42B cardinality=78
|
01:SCAN HDFS [gwynniebee_bi.dim_experiment dim_experiment, RANDOM]
   partitions=1/1 files=1 size=10.56KB
   runtime filters: RF000 -> dim_experiment.experiment_key
   table stats: 78 rows total
   column stats: all
   hosts=1 per-host-mem=32.00MB
   tuple-ids=1 row-size=42B cardinality=78
I was trying to correlate this issue with available and in-use memory using CM metrics such as tcmalloc_bytes_in_use and mem_rss. Around the same time, I see very low memory usage by the impalad process.
The graph below shows these metrics around the time the query hit the memory error.
Am I missing any other metric to look at? Please share your thoughts.
Thanks,
Mani
Created 09-10-2016 04:50 AM
I can see that the stats for table "gwynniebee_bi.fact_recommendation_events" are not available:
   table stats: unavailable
   column stats: unavailable
Without stats, Impala cannot accurately estimate the resources required for the query, which can lead to OOM errors because memory usage is misestimated up front.
If you check the SUMMARY of the query, you will get detailed information about estimated versus actual memory usage.
I suggest you run "COMPUTE STATS gwynniebee_bi.fact_recommendation_events" and try again.
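For example, a minimal sequence in impala-shell might look like this (the SHOW statements just verify that the row count and column stats are now populated):

-- gather table and column statistics (can take a while on a large table)
COMPUTE STATS gwynniebee_bi.fact_recommendation_events;

-- confirm the stats were written to the metastore
SHOW TABLE STATS gwynniebee_bi.fact_recommendation_events;
SHOW COLUMN STATS gwynniebee_bi.fact_recommendation_events;

After re-running the query in impala-shell, the SUMMARY; command prints the estimated versus actual memory per plan node.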
Created 09-13-2016 02:35 PM
Eric's suggestion is the general solution to this problem: without stats, Impala is choosing a bad join order, and there are a lot of duplicates on the right side of the join.
One workaround is to add a straight_join hint, which lets you control the order in which the tables are joined. I believe in your case just adding straight_join will flip the sides of the join, which will almost certainly help you:
SELECT STRAIGHT_JOIN `dim_experiment`.`experiment_name` AS `experiment_name`
FROM `gwynniebee_bi`.`fact_recommendation_events` `fact_recommendatio`
LEFT OUTER JOIN `gwynniebee_bi`.`dim_experiment` `dim_experiment`
  ON (`fact_recommendatio`.`experiment_key` = `dim_experiment`.`experiment_key`)
GROUP BY 1
Created 01-17-2018 04:02 AM
We are facing a similar issue and it is a blocker for us now. I have also tried the straight_join hint but still get the same error:
Query Status: Cannot perform hash join at node with id 20. Repartitioning did not reduce the size of a spilled partition. Repartitioning level 1. Number of rows 185442.
Created 01-17-2018 02:14 PM
@spurusothaman usually we go through a couple of steps to troubleshoot issues like this. The two most likely solutions are:
1. Give the query more memory by increasing mem_limit or reducing the number of concurrent queries (see the sketch after this list).
2. Adjust the SQL by rewriting the query or adding hints to get a different query plan that avoids having so many duplicate values on the right side of the join.
Depending on the exact scenario, the solution might be 1, 2, or both.
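As a rough sketch of option 1 in impala-shell (the 4gb value is only an illustration; pick a limit that fits within your admission control settings):

-- raise the per-query memory limit for this session only
SET MEM_LIMIT=4gb;

-- SET with no arguments lists the current query options, so you can confirm the change
SET;

A session-level mem_limit like this only affects the current connection; pool-level defaults are normally managed through admission control in Cloudera Manager.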
straight_join is only useful if you use it to force a plan with a different join order. If you want input on whether you have a bad plan and what a better join order might be, please provide a query profile.
