Spark-submit is forcing an earlier protobuf version than required in my project's dependencies

In my Scala project at work, I use spark-submit to launch my application on a YARN cluster. I am quite new to Maven projects and pom.xml. The problem seems to be that the Spark2 jars shipped with our Hadoop distribution use an older version of Google protobuf (2.5.0) than the internal dependencies I'm importing at work (2.6.1). The error is:

	java.lang.NoSuchMethodError:
    com/google/protobuf/LazyStringList.getUnmodifiableView()Lcom/google/protobuf/LazyStringList;
    (loaded from file:/usr/hdp/2.6.4.0-91/spark2/jars/protobuf-java-2.5.0.jar 
    by sun.misc.Launcher$AppClassLoader@8b6f2bf7)
	called from class protobuf.com.mycompany.group.otherproject.api.JobProto$Query 

Since I'm not sure how to approach dependency issues like this, and I can't change the code of the internal dependency that uses 2.6.1, I also added the required protobuf version as a direct dependency of my project:

	    <dependency> 
	        <groupId>com.google.protobuf</groupId> 
	        <artifactId>protobuf-java</artifactId> 
	        <version>2.6.1</version> 
	    </dependency> 

Unfortunately, this hasn't resolved the issue. The conflict still occurs when the internal dependency (which pulls in 2.6.1 on its own) tries to use its proto classes.

Any suggestions on how I could force the usage of the newer, correct version would be greatly appreciated.

2 REPLIES

@Robert Cornell, you need to shade the protobuf dependency used by your Spark application, so that it no longer conflicts with the protobuf version that Spark itself ships with.
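
For illustration, here is a minimal sketch of what such shading could look like with the maven-shade-plugin. It assumes the application is packaged as a single fat jar and that the internal dependency is bundled into that jar. The plugin version and the `shaded.` package prefix are arbitrary choices for this example, not something taken from the project above.

```xml
<!-- Goes inside the <project> element of pom.xml. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <!-- Assumed version; use whatever your build standardizes on. -->
      <version>3.2.4</version>
      <executions>
        <execution>
          <!-- Build the shaded (fat) jar during "mvn package". -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <relocation>
                <!-- Move the bundled protobuf 2.6.1 classes to a new package... -->
                <pattern>com.google.protobuf</pattern>
                <!-- ...under a hypothetical prefix that Spark's protobuf 2.5.0 jar cannot clash with. -->
                <shadedPattern>shaded.com.google.protobuf</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Relocation works on compiled bytecode, not source: every class that ends up in the shaded jar, including classes from bundled third-party or internal jars, has its references to `com.google.protobuf` rewritten to the relocated package. Dependencies that are not packaged into the jar (for example, anything with `provided` scope) are left untouched.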

As a side note, you can check whether setting the properties below helps in the meantime (for example in spark-defaults.conf, or via --conf on spark-submit):

spark.driver.userClassPathFirst true
spark.executor.userClassPathFirst true

I'll give these a shot!

I was reading about shading. Does it somehow change all the references in the dependency code that imports the newer protobuf? I can't change the code in the other APIs my company has, but if shading renames the package and rewrites all the underlying references to it (i.e., the ones outside my project), I guess that'll work.