
Hadoop-Spark Job execution Issue

Contributor

Hi Team,

I am getting the error shown below. Could someone please suggest how to fix it?

Thanks in advance for your support, as always.

 

The job is still failing:
- ERROR - AnalysisException raised in the UPSS_PROMO_PROMOTIONS Spark JOB
Traceback (most recent call last):
  File "/hdp_bi_code/hdp_bi_code/upss/transformation/promo_promotions.py", line 203, in process_promotions
    scd2_history_data_latest.write.mode("overwrite").insertInto(targetDbMergeTableNm, overwrite=True)
  File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 716, in insertInto
    self._jwrite.mode("overwrite" if overwrite else "append").insertInto(tableName)
  File "/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
AnalysisException: u'java.lang.OutOfMemoryError: GC overhead limit exceeded;'

 

3 REPLIES

Master Collaborator

Hi @pankshiv1809 

 

Your application failed with java.lang.OutOfMemoryError: GC overhead limit exceeded. Based on the volume of data you are processing, you need to adjust the resources (executor memory, driver memory, and their memory overheads).

 

For example,

--conf spark.driver.memory=10g
--conf spark.driver.memoryOverhead=1g
--conf spark.executor.memory=10g
--conf spark.executor.memoryOverhead=1g
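
For reference, here is a minimal sketch of how these flags fit into a full spark-submit invocation. The script path is taken from the traceback above; the YARN master and cluster deploy mode are assumptions based on a typical HDP setup, and the memory values should be tuned to your data volume and cluster capacity:

# Sketch only: assumes YARN and cluster deploy mode; tune the sizes to your cluster
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.driver.memory=10g \
  --conf spark.driver.memoryOverhead=1g \
  --conf spark.executor.memory=10g \
  --conf spark.executor.memoryOverhead=1g \
  /hdp_bi_code/hdp_bi_code/upss/transformation/promo_promotions.py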

 

Master Collaborator

Hi @pankshiv1809 

 

Did the above parameter values help in your case? If yes, please click Accept as Solution; it will be helpful for other members.

Master Mentor

@pankshiv1809 
Can you share the spark-submit conf for the UPSS_PROMO_PROMOTIONS Spark job?
You can also attach JConsole, which helps to detect performance problems in the code, including java.lang.OutOfMemoryErrors.
Depending on the available memory on your cluster, you can then re-adjust as suggested by @RangaReddy.
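
As a hedged sketch of how the executor JVMs could be made visible to JConsole, the standard JVM JMX flags can be passed through spark.executor.extraJavaOptions (the port number and the Java 8 GC-logging flags are illustrative choices, not values from this thread):

# Enable GC logging plus remote, unauthenticated JMX so JConsole can attach.
# A fixed port can clash if several executors land on one node, so use this
# only for a single-executor debugging run (or enable auth/SSL in production).
spark-submit \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9010 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false" \
  /hdp_bi_code/hdp_bi_code/upss/transformation/promo_promotions.py

You can then run jconsole <executor-host>:9010 from a machine that can reach the worker node and watch heap usage and GC activity while the job runs.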