Memory Issues while accessing files in Spark

New Contributor

Hello All,

 

We are using the memory configuration below, and the Spark job is failing because it is running beyond physical memory limits: "Current usage: 1.6 GB of 1.5 GB physical memory used; 3.9 GB of 3.1 GB virtual memory used. Killing container."

 

We are using 8 GB of Spark executor memory, and we don't know where the 1.5 GB physical memory limit comes from. In the job's Spark environment we see the executor memory as 8 GB and the executor memory overhead as 8 GB. We also have yarn.scheduler.minimum-allocation-mb = 1 GB and yarn.scheduler.maximum-allocation-mb = 20 GB, and yarn.nodemanager.resource.memory-mb is 89 GB and 216 GB across different NodeManager role groups. No other jobs were running in the cluster at the time.

 

Please advise on this issue.

Job config:

from pyspark.sql import SparkSession

spark = (SparkSession.builder.appName("Spark ETL 3")
    .config("spark.driver.maxResultSize", "0")
    .config("spark.driver.memory", "8g")
    .config("spark.driver.cores", "2")
    .config("spark.executor.memory", "8g")
    .config("spark.executor.cores", "4")
    .config("spark.yarn.driver.memoryOverhead", "2g")
    .config("spark.yarn.executor.memoryOverhead", "8g")
    .getOrCreate())
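
For reference, here is our understanding of how YARN should size the containers under these settings (a rough sketch, assuming YARN rounds each request, heap plus overhead, up to a multiple of yarn.scheduler.minimum-allocation-mb; the numbers are the ones from the config above):

import math

# Rough container sizing: YARN grants heap + overhead, rounded up to a
# multiple of yarn.scheduler.minimum-allocation-mb (1024 MB in our cluster).
def container_mb(heap_mb, overhead_mb, min_alloc_mb=1024):
    requested = heap_mb + overhead_mb
    return math.ceil(requested / min_alloc_mb) * min_alloc_mb

# Executor: spark.executor.memory (8g) + spark.yarn.executor.memoryOverhead (8g)
print(container_mb(8192, 8192))  # 16384 MB = 16 GB per executor container

# Driver (cluster mode): spark.driver.memory (8g) + spark.yarn.driver.memoryOverhead (2g)
print(container_mb(8192, 2048))  # 10240 MB = 10 GB

Neither of these comes anywhere near the 1.5 GB container mentioned in the error below, which is why we cannot tell where that limit comes from.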

 

The error is:

 

application application_1499367756284_12564 failed 2 times due to AM Container for appattempt_1499367756284_12564_000002 exited with exitCode: -104
For more detailed output, check application tracking page:http://gaalplpapp0022b.linux.us.ups.com:8088/proxy/application_1499367756284_12564/Then, click on links to logs of each attempt.
Diagnostics: Container [pid=25270,containerID=container_e38_1499367756284_12564_02_000001] is running beyond physical memory limits. Current usage: 1.6 GB of 1.5 GB physical memory used; 3.9 GB of 3.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_e38_1499367756284_12564_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 25617 25274 25270 25270 (python) 239 33 547803136 17192 /opt/cloudera/parcels/Anaconda-4.1.1/bin/python staging_ETL_2209_to_9196.py
|- 25270 25268 25270 25270 (bash) 1 0 116011008 372 /bin/bash -c LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/../../../CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/lib/native: /usr/lib/jvm/java-1.8.0-oracle.x86_64/bin/java -server -Xmx1024m -Djava.io.tmpdir=/data/1/yarn/nm/usercache/flexapp/appcache/application_1499367756284_12564/container_e38_1499367756284_12564_02_000001/tmp -Dspark.yarn.app.container.log.dir=/data/1/yarn/container-logs/application_1499367756284_12564/container_e38_1499367756284_12564_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class 'org.apache.spark.deploy.PythonRunner' --primary-py-file staging_ETL_2209_to_9196.py --properties-file /data/1/yarn/nm/usercache/flexapp/appcache/application_1499367756284_12564/container_e38_1499367756284_12564_02_000001/__spark_conf__/__spark_conf__.properties 1> /data/1/yarn/container-logs/application_1499367756284_12564/container_e38_1499367756284_12564_02_000001/stdout 2> /data/1/yarn/container-logs/application_1499367756284_12564/container_e38_1499367756284_12564_02_000001/stderr
|- 25274 25270 25270 25270 (java) 12561 587 3476746240 389639 /usr/lib/jvm/java-1.8.0-oracle.x86_64/bin/java -server -Xmx1024m -Djava.io.tmpdir=/data/1/yarn/nm/usercache/flexapp/appcache/application_1499367756284_12564/container_e38_1499367756284_12564_02_000001/tmp -Dspark.yarn.app.container.log.dir=/data/1/yarn/container-logs/application_1499367756284_12564/container_e38_1499367756284_12564_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class org.apache.spark.deploy.PythonRunner --primary-py-file staging_ETL_2209_to_9196.py --properties-file /data/1/yarn/nm/usercache/flexapp/appcache/application_1499367756284_12564/container_e38_1499367756284_12564_02_000001/__spark_conf__/__spark_conf__.properties
 
Thanks
Narendar

Re: Memory Issues while accessing files in Spark

Master Guru
The failing container isn't a Spark executor but the Spark Application Master. To specify its memory in yarn-client mode, use "spark.yarn.am.memory" and "spark.yarn.am.memoryOverhead". In cluster mode, raise the equivalent 'driver' values instead (it's specified as 2 GiB currently, which is likely setting the actual heap to 1.5 GiB).
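
For example, assuming the job is submitted in yarn-client mode, the Application Master could be sized explicitly along these lines (a minimal sketch; the 2g / 512 values are illustrative, not tuned recommendations):

from pyspark.sql import SparkSession

# Sketch for yarn-client mode: give the Application Master its own memory settings.
spark = (SparkSession.builder.appName("Spark ETL 3")
    .config("spark.yarn.am.memory", "2g")           # AM heap (illustrative value)
    .config("spark.yarn.am.memoryOverhead", "512")  # off-heap headroom; older Spark releases expect plain MB here
    .getOrCreate())

If the job is submitted in yarn-cluster mode instead, the driver runs inside the AM container, which is already sized before any SparkSession.builder code executes; in that case pass the driver values at submit time, e.g. spark-submit --conf spark.driver.memory=8g --conf spark.yarn.driver.memoryOverhead=2g ... staging_ETL_2209_to_9196.py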