
Oozie Spark2 Action throws "Attempt to add (${workflowAppUri}/lib/${dependencyJar}) multiple times to the distributed cache."

I have created a Spark action that uses a dependency jar, so I used the following configuration in workflow.xml to pass the dependency jar:

<spark-opts>--jars ${workflowAppUri}/lib/${dependencyJar}</spark-opts>

But it failed with the following error:

2019-06-12 07:00:35,140  WARN SparkActionExecutor:523 - SERVER[manager-0] USER[root] GROUP[-] TOKEN[] APP[spark-wf] JOB[0000068-190611183932696-oozie-root-W] ACTION[0000068-190611183932696-oozie-root-W@spark-node] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Attempt to add (hdfs://${nameNode}/${workflowAppUri}/lib/${dependencyJar}) multiple times to the distributed cache.

I have seen similar issues reported, such as duplicate jars between the Oozie and Spark2 sharelib directories, and have tried the solutions people suggest, but nothing resolves this.


If we add jars to the lib directory under the application root directory, Oozie automatically distributes them via its distributed cache. In my case, I had tried to add a jar that was already in the lib directory, so I just needed to remove the following line from my workflow definition:

<spark-opts>--jars ${workflowAppUri}/lib/${dependencyJar}</spark-opts>
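For context, here is a minimal sketch of a Spark2 action that relies only on the jar placed under the workflow's lib/ directory. The class name, jar name, and transition targets are illustrative assumptions, not from the original post:

```xml
<!-- Sketch: Oozie picks up everything in ${workflowAppUri}/lib/
     automatically, so no --jars entry is needed for those jars.
     Class/jar/node names below are hypothetical. -->
<action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn</master>
        <mode>cluster</mode>
        <name>spark-wf</name>
        <class>com.example.MyMainClass</class>
        <jar>${workflowAppUri}/lib/my-app.jar</jar>
        <!-- No <spark-opts>--jars ...</spark-opts> for jars already in lib/ -->
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Listing a lib/ jar again in `--jars` is what produces the "multiple times to the distributed cache" error, since the same file ends up registered twice.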

I have also tested that if you want to attach jars that are not in your lib directory, you can reference them like this in your workflow definition:

<spark-opts>--jars ${nameNode}/tmp/${someJar}</spark-opts>
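To summarize the rule as a fragment (the external path and jar name here are assumptions for illustration):

```xml
<!-- Jars already in ${workflowAppUri}/lib/ : do NOT list them in --jars.
     Jars elsewhere on HDFS: reference them with a full path, e.g. -->
<spark-opts>--jars ${nameNode}/tmp/extra-dependency.jar</spark-opts>
```

In short, `--jars` is only for jars outside the workflow's lib directory; anything inside lib/ is distributed automatically.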