
ExecutorLostFailure when using Phoenix-Spark Plugin in HDP 2.6.5


I wrote several Spark applications that save Dataset data to Phoenix tables. When I try to process huge datasets, some of my Spark jobs fail with an ExecutorLostFailure exception. The jobs are retried and seem to finish successfully on their second attempt.

Here is the code that saves the DataFrame to my Phoenix table:

dfToSave.write()
    .format("org.apache.phoenix.spark")
    .mode("overwrite")
    .option("table", "PHOENIX_TABLE_NAME")
    .option("zkUrl", "")
    .save();

Here is the output of one of the jobs in the Spark History UI:

ExecutorLostFailure (executor 3 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 3.0 GB of 3 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

Why do I get this error when using the Spark plugin for Apache Phoenix? Are there any configuration options to manage the memory consumption of the Phoenix-Spark job?


@Daniel Müller

The error means YARN killed the container because the executor's total memory footprint (heap plus off-heap overhead) exceeded its allocation. You might want to increase spark.yarn.executor.memoryOverhead, as the message suggests, or raise --executor-memory (and probably yarn.scheduler.maximum-allocation-mb as well) to a value that can hold your data size in memory. In some cases repartitioning the DataFrame into smaller tasks is a better option.
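For illustration, a spark-submit invocation along those lines might look like the sketch below. The memory values, the main class name, and the jar name are assumptions for this example, not settings from the original job; tune them so that executor-memory plus memoryOverhead stays within yarn.scheduler.maximum-allocation-mb on your cluster.

```shell
# Sketch: raise both the executor heap and the off-heap overhead
# allowance that YARN enforces. 4g / 1024 MB are illustrative values.
# com.example.PhoenixSaveJob and my-phoenix-job.jar are hypothetical.
spark-submit \
  --master yarn \
  --executor-memory 4g \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  --class com.example.PhoenixSaveJob \
  my-phoenix-job.jar
```

Alternatively, calling something like dfToSave.repartition(200) before the write spreads the same data over more, smaller tasks, which lowers the per-executor memory pressure at the cost of more output files and task-scheduling overhead.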
