
Hadoop-Spark Job execution Issue

Contributor

Hi Team,

I am getting the error shown below. Could someone please suggest how to fix it?

Thanks in advance for your support, as always.

 

The job is still failing:
- ERROR - AnalysisException raised in the UPSS_PROMO_PROMOTIONS Spark JOB
Traceback (most recent call last):
  File "/hdp_bi_code/hdp_bi_code/upss/transformation/promo_promotions.py", line 203, in process_promotions
    scd2_history_data_latest.write.mode("overwrite").insertInto(targetDbMergeTableNm, overwrite=True)
  File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 716, in insertInto
    self._jwrite.mode("overwrite" if overwrite else "append").insertInto(tableName)
  File "/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
AnalysisException: u'java.lang.OutOfMemoryError: GC overhead limit exceeded;'

 

3 REPLIES

Master Collaborator

Hi @pankshiv1809 

 

Your application failed with java.lang.OutOfMemoryError: GC overhead limit exceeded. Based on the volume of data you are processing, you need to adjust the resources (executor memory, driver memory, and their memory overheads).

 

For example,

--conf spark.driver.memory=10g
--conf spark.driver.memoryOverhead=1g
--conf spark.executor.memory=10g
--conf spark.executor.memoryOverhead=1g
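
For reference, here is a minimal sketch of how these flags fit into a full spark-submit invocation. The script path is taken from the traceback above; the YARN master and cluster deploy mode are assumptions based on a typical HDP setup, and the memory values should be tuned to your data volume and cluster capacity:

# Sketch only: assumes YARN and cluster deploy mode; tune the sizes to your cluster
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.driver.memory=10g \
  --conf spark.driver.memoryOverhead=1g \
  --conf spark.executor.memory=10g \
  --conf spark.executor.memoryOverhead=1g \
  /hdp_bi_code/hdp_bi_code/upss/transformation/promo_promotions.py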

 

Master Collaborator

Hi @pankshiv1809 

 

Did the above parameter values help in your case? If yes, please click Accept as Solution; it will be helpful for other members.

Master Mentor

@pankshiv1809 
Can you share the spark-submit conf for the UPSS_PROMO_PROMOTIONS Spark job?
You can also attach JConsole, which helps to detect performance problems in the code, including java.lang.OutOfMemoryErrors.
Depending on the available memory on your cluster, you can then re-adjust as suggested by @RangaReddy.
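
As a hedged sketch of how the executor JVMs could be made visible to JConsole, the standard JVM JMX flags can be passed through spark.executor.extraJavaOptions (the port number and the Java 8 GC-logging flags are illustrative choices, not values from this thread):

# Enable GC logging plus remote, unauthenticated JMX so JConsole can attach.
# A fixed port can clash if several executors land on one node, so use this
# only for a single-executor debugging run (or enable auth/SSL in production).
spark-submit \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9010 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false" \
  /hdp_bi_code/hdp_bi_code/upss/transformation/promo_promotions.py

You can then run jconsole <executor-host>:9010 from a machine that can reach the worker node and watch heap usage and GC activity while the job runs.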