
Oozie Spark2 Action throws "Attempt to add (${workflowAppUri}/lib/${dependencyJar}) multiple times to the distributed cache."

I have created a Spark action that uses a dependency jar, so I added the following configuration to workflow.xml to pass that jar:


<spark-opts>--jars ${workflowAppUri}/lib/${dependencyJar}</spark-opts>


But it fails with the following error:


2019-06-12 07:00:35,140  WARN SparkActionExecutor:523 - SERVER[manager-0] USER[root] GROUP[-] TOKEN[] APP[spark-wf] JOB[0000068-190611183932696-oozie-root-W] ACTION[0000068-190611183932696-oozie-root-W@spark-node] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Attempt to add (hdfs://${nameNode}/${workflowAppUri}/lib/${dependencyJar}) multiple times to the distributed cache.


I have seen similar issues reported, such as duplicate jars between the Oozie and Spark2 sharelib directories, and tried the fixes people suggest, but nothing resolves this.

1 REPLY 1

If we add jars to the lib directory under the application root directory, Oozie automatically distributes them to its distributed cache. In my case, I tried to add a jar that was already in the lib directory, so I just needed to remove the following line from my workflow definition:


<spark-opts>--jars ${workflowAppUri}/lib/${dependencyJar}</spark-opts>
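For reference, a spark action that relies on this automatic lib/ distribution might look like the sketch below. The action name, job name, class, and jar file names here are placeholders, not taken from the original workflow:

```xml
<!-- Minimal sketch: the dependency jar sits in ${workflowAppUri}/lib/,
     so Oozie ships it to the distributed cache automatically and no
     <spark-opts>--jars ...</spark-opts> entry is needed. -->
<action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn</master>
        <mode>cluster</mode>
        <name>My Spark Job</name>
        <class>com.example.MySparkApp</class>
        <jar>${workflowAppUri}/lib/my-app.jar</jar>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>
```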


I have also verified that if you want to attach jars that are not in your lib directory, you can reference them in your workflow definition like this:


<spark-opts>--jars ${nameNode}/tmp/${someJar}</spark-opts>
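The EL variables used in these snippets (${nameNode}, ${workflowAppUri}, ${someJar}) would typically be defined in the job.properties file submitted with the workflow. A hypothetical example, with host names, ports, and paths as placeholders:

```properties
# Hypothetical job.properties (host names, ports, and paths are placeholders)
nameNode=hdfs://manager-0:8020
jobTracker=manager-0:8032
workflowAppUri=${nameNode}/user/root/spark-wf
someJar=some-extra.jar
oozie.wf.application.path=${workflowAppUri}
oozie.use.system.libpath=true
```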