Created 11-29-2017 09:34 PM
I am getting desperate here! My Spark2 jobs take hours then get stuck!
I have a 4 node cluster each with 16GB RAM and 8 cores. I run HDP 2.6, Spark 2.1 and Zeppelin 0.7.
I have:
Via Zeppelin (same notebook) I do an INSERT into a Hive table:
for a 50-column table with about 12 million records.
This gets split into 3 stages of 75, 75 and 200 tasks. The two 75-task stages (stages 73 and 74) get stuck, and garbage collection then runs for hours. Any idea what I can try?
EDIT: I have not looked at tweaking partitions; can anyone give me pointers on how to do that, please?
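A minimal sketch of what tweaking partitions could look like for this INSERT, assuming a hypothetical staging table as the source (all table names below are placeholders, not from the original post):

    // Zeppelin %spark2 paragraph (Scala). Table names are placeholders.

    // Lower the shuffle partition count from the default 200 so the write stage
    // produces fewer, larger partitions; 4 nodes x 8 cores suggests something
    // in the range of 32-64 rather than 200.
    spark.conf.set("spark.sql.shuffle.partitions", "48")

    // Explicitly repartition the source data before the INSERT so no single
    // task ends up with an oversized slice of the 12 million rows.
    val src = spark.table("staging_table").repartition(48)
    src.createOrReplaceTempView("staging_repartitioned")

    spark.sql("INSERT INTO target_table SELECT * FROM staging_repartitioned")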
Created 11-30-2017 09:12 AM
Check whether SPARK_HOME in the Zeppelin interpreter settings points to the correct Spark 2 installation.
Is it set to the value below?
SPARK_HOME = /usr/hdp/current/spark2-client/
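For a quick check from a Zeppelin %spark2 (Scala) paragraph, assuming the spark session object that Zeppelin provides:

    // Confirm which Spark installation the interpreter process was launched with.
    println(sys.env.getOrElse("SPARK_HOME", "SPARK_HOME not set in the interpreter environment"))
    println(spark.version)  // should print 2.1.x for the spark2 interpreter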
Where are you setting Spark properties, in spark-env.sh or via the Zeppelin interpreter settings? Check this thread:
https://issues.apache.org/jira/browse/ZEPPELIN-295
Set spark.driver.memory=4g and spark.driver.cores=2.
Check spark.memory.fraction (if it is set to 0.75, reduce it to 0.6): https://issues.apache.org/jira/browse/SPARK-15796
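To see what is actually in effect, a quick check from the same %spark2 paragraph (a sketch assuming the Zeppelin-provided spark session; note that spark.driver.memory and spark.driver.cores can only be changed in the interpreter settings or spark-defaults.conf, not from a running session):

    // Print the settings the running interpreter ended up with.
    val conf = spark.sparkContext.getConf
    println(conf.get("spark.driver.memory", "not set (1g default)"))
    println(conf.get("spark.driver.cores", "not set (1 default)"))
    println(conf.get("spark.memory.fraction", "not set (0.6 default in Spark 2.x)"))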
Check the logs: run tail -f /var/log/zeppelin/zeppelin-interpreter-spark2-spark-zeppelin-{HOSTNAME}.log on the Zeppelin host.