Spark job dependency conflict with Oozie/YARN jars: how to ensure my Spark jar has priority?

I am submitting a Spark job through Oozie. How do I ensure that my Spark jar takes priority over all the other jars that are loaded from the Oozie lib path?

Error message

[Loaded org.apache.poi.xssf.model.ThemesTable$ThemeElement from file:/usr/local/lib/radoop/rapidminer_libs-7.6.1.jar]

java.lang.NoSuchMethodError: org.apache.poi.ss.usermodel.Cell.getCellTypeEnum()Lorg/apache/poi/ss/usermodel/CellType;

I tried setting oozie.use.system.libpath=false, which did not work. I then tried modifying the HADOOP_CLASSPATH variable, which also did not work.

I also tried adding --conf "spark.driver.userClassPathFirst=true" --conf "spark.executor.userClassPathFirst=true" to the spark-submit options. That did not work either.

Then I tried all kinds of tricks with Maven to relocate the classes in my jar. None of them worked.

There is a similar question here: https://community.hortonworks.com/questions/6610/can-i-ensure-that-my-own-jars-have-classpath-prior.... Is there a way to do it without a Java action, or is it only possible with a Java action?

I am using Spark 2.3 and Scala 2.11.

Thank you

Re: Spark job dependency conflict with oozie yarn jars ? How to ensure my spark jar has priority

@na

So your findings lead you to think that a jar in the Oozie share lib directory is taking precedence over yours?

Did you find out exactly which jar in the Oozie share lib directory is causing the problem?

I think the options you have tried are all good attempts to avoid a jar conflict, if the conflict comes from Oozie. But the problem could also come from somewhere else, for example the MR configs. Hence I suggest you first find out which jar is causing the problem in this case.
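
One way to check which jar a class actually comes from at runtime is to ask the JVM for the class's code source. The following is a minimal sketch, not something taken from this thread: the class name `ClassOrigin` and the helper `locate` are made up for illustration, and the default class name is simply the POI interface mentioned in the error message. It would need to run with the same classpath as the failing job (for example, called from the driver) to be meaningful.

```java
import java.security.CodeSource;

// Hypothetical helper: reports where the JVM loaded a given class from.
class ClassOrigin {

    /** Returns the jar or directory a class was loaded from, or "<bootstrap>" if the JVM reports none. */
    static String locate(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Core JDK classes usually have no code source.
            return "<bootstrap>";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Assumption: the class to inspect is passed as the first argument,
        // defaulting to the POI class named in the NoSuchMethodError.
        String name = args.length > 0 ? args[0] : "org.apache.poi.ss.usermodel.Cell";
        try {
            Class<?> cls = Class.forName(name);
            System.out.println(name + " loaded from " + locate(cls));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

If the printed location is not your own Spark jar, the reported jar is the one winning the classpath race for that class.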

Re: Spark job dependency conflict with oozie yarn jars ? How to ensure my spark jar has priority

It is a RapidMiner jar. It contains POI version 3.14, but I am using the Java POI library version 3.17 in my Spark jar.
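
To confirm that two copies of the same POI class are visible at once, one option is to list every classpath entry that provides the class file. This is only a sketch under assumptions: `DuplicateClassFinder`, `toResourcePath` and `findCopies` are hypothetical names, and the result depends on the class loader you pass in, so it should be run inside the job (or with the job's classpath) rather than on a developer machine.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper: lists every location that provides a given class file.
class DuplicateClassFinder {

    /** Converts a binary class name such as "a.b.C" into the resource path "a/b/C.class". */
    static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns all URLs visible to the loader that contain the class file, in lookup order. */
    static List<URL> findCopies(String className, ClassLoader loader) {
        List<URL> copies = new ArrayList<>();
        try {
            Enumeration<URL> urls = loader.getResources(toResourcePath(className));
            while (urls.hasMoreElements()) {
                copies.add(urls.nextElement());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return copies;
    }

    public static void main(String[] args) {
        // Assumption: default to the POI class from the error message.
        String name = args.length > 0 ? args[0] : "org.apache.poi.ss.usermodel.Cell";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        for (URL url : findCopies(name, loader)) {
            System.out.println(url);
        }
    }
}
```

More than one printed URL means the class exists in several jars, and the order of the output gives a hint about which copy is found first.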
