In my Scala project at work, I use spark-submit to launch my application on a YARN cluster. I am quite new to Maven projects and pom.xml, but the problem seems to be that the Spark2 jars shipped with Hadoop use an older version of Google protobuf (2.5.0) than the internal dependencies I'm importing at work (2.6.1). Here is the error:
java.lang.NoSuchMethodError: com/google/protobuf/LazyStringList.getUnmodifiableView()Lcom/google/protobuf/LazyStringList; (loaded from file:/usr/hdp/126.96.36.199-91/spark2/jars/protobuf-java-2.5.0.jar by sun.misc.Launcher$AppClassLoader@8b6f2bf7) called from class protobuf.com.mycompany.group.otherproject.api.JobProto$Query
Since I'm not quite sure how to approach dependency issues like this, and I can't change the code of the internal dependency that uses 2.6.1, I added the required protobuf version as a dependency to my project, as well:
<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
    <version>2.6.1</version>
</dependency>
Unfortunately, this hasn't resolved the issue. When the internal dependency (which does import 2.6.1 on its own) tries to use its proto, the conflict occurs.
Any suggestions on how I could force the usage of the newer, correct version would be greatly appreciated.
@Robert Cornell, you need to shade the protobuf dependency used by your Spark application to avoid conflicts between the application's dependencies and Spark's own.
As a side note, you can check whether setting the properties below helps in the meantime.
spark.driver.userClassPathFirst true
spark.executor.userClassPathFirst true
I'll give these a shot!
I was reading about shading. Does it somehow change all the references in the dependency code that imports the newer protobuf? I can't change the code in the other APIs my company has, but if shading renames the package and rewrites all the underlying references to it (i.e. those outside my project), I guess that'll work.
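For reference, a relocation with the maven-shade-plugin might look like the sketch below. Relocation rewrites the bytecode of every class bundled into the shaded jar, including classes that come from dependencies, so references to com.google.protobuf inside the internal API should be renamed as well, provided that dependency is packaged into the shaded jar rather than supplied separately on the cluster. The plugin version and the shaded.com.google.protobuf prefix are assumptions chosen for illustration, not values from this thread.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <!-- Assumed version; use whichever one your build standardizes on -->
      <version>3.2.4</version>
      <executions>
        <execution>
          <!-- Build the shaded (uber) jar during the package phase -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <relocation>
                <!-- Package to move out of the way of Spark's protobuf 2.5.0 -->
                <pattern>com.google.protobuf</pattern>
                <!-- Hypothetical prefix; any package name not used elsewhere works -->
                <shadedPattern>shaded.com.google.protobuf</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With a configuration along these lines, the jar passed to spark-submit would be the shaded artifact produced by mvn package, so the relocated protobuf 2.6.1 classes travel with the application and no longer share a package name with the 2.5.0 jar on the cluster.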