
ExecutorLostFailure when using Phoenix-Spark Plugin in HDP 2.6.5

Expert Contributor

I wrote several Spark applications that save Dataset data to Phoenix tables. When I try to process huge datasets, some of my Spark jobs fail with an ExecutorLostFailure exception. The jobs are retried and seem to finish successfully on the second attempt.

Here is the code that saves the DataFrame to my Phoenix table:

dfToSave.write()
        .format("org.apache.phoenix.spark")
        .mode("overwrite")
        .option("table", "PHOENIX_TABLE_NAME")
        .option("zkUrl", "server.name:2181:/hbase-unsecure")
        .save();

Here is the output of one of the jobs in the Spark History UI:

ExecutorLostFailure (executor 3 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 3.0 GB of 3 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

Why do I get this error when I use the Spark plugin for Apache Phoenix? Are there any configurations to manage the memory consumption of the Phoenix-Spark job?

1 REPLY

Re: ExecutorLostFailure when using Phoenix-Spark Plugin in HDP 2.6.5

Contributor
@Daniel Müller

You might want to increase --executor-memory (and probably yarn.scheduler.maximum-allocation-mb as well) to a value that can hold your data in memory. In some cases, repartitioning the DataFrame is the better option; a rough sketch of both follows.
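
As a minimal sketch only, assuming the job is launched with spark-submit: the memory values (4g, 1024 MB) and the partition count (200) below are placeholder numbers to tune for your cluster and data size, not recommended settings.

// Illustrative submit-time settings (placeholder values):
//   spark-submit \
//     --executor-memory 4g \
//     --conf spark.yarn.executor.memoryOverhead=1024 \
//     ...
// Note: yarn.scheduler.maximum-allocation-mb must be at least executor
// memory plus overhead, otherwise YARN will not grant the containers.

// Repartitioning spreads the data across more, smaller tasks, so each
// executor holds less data at a time before writing to Phoenix.
dfToSave.repartition(200)   // 200 is a hypothetical partition count
        .write()
        .format("org.apache.phoenix.spark")
        .mode("overwrite")
        .option("table", "PHOENIX_TABLE_NAME")
        .option("zkUrl", "server.name:2181:/hbase-unsecure")
        .save();

Whether more memory or more partitions helps depends on where the overhead is going; with wide rows or large batches written to Phoenix, smaller partitions often avoid the container kill without raising executor memory at all.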
