
Spark - separating dependencies of Spark and application

Rising Star

Hello,
I am struggling with a dependency issue in Spark at the moment. Being new to Spark, I hope there is a simple remedy.
The question is: is there any mechanism to separate the dependencies of the Spark engine from the dependencies of an application?

 

Example: The latest version of spark-core_2.12 (3.1.1, March 2021) depends on
hadoop-client (3.3.0, July 2020), which itself depends on hadoop-common (3.3.0, July 2020),
which finally depends on an ancient version of gson (2.2.4, May 2013).

You can easily find many other examples, e.g. commons-codec, protobuf-java ...

 

So, what if your application, essentially a library developed outside of Spark, depends on the latest (no longer compatible) version of gson, 2.8.6?

My obviously naive approach of simply starting the Spark application ends in runtime incompatibility clashes (e.g. with gson).
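
For reference, assuming a Maven build, the transitive path that pulls in the old gson can be listed with Maven's dependency plugin:

    # prints every dependency path leading to com.google.code.gson artifacts
    mvn dependency:tree -Dincludes=com.google.code.gson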

 

Best regards
Jaro

1 ACCEPTED SOLUTION

Contributor

Hello,

 

You can solve this by using the Maven shade plugin. Take a look at the Cloudera doc https://docs.cloudera.com/runtime/7.2.9/developing-spark-applications/topics/spark-packaging-differe... .
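
For illustration, here is a minimal sketch of what that can look like in the application's pom.xml. It relocates gson into a private package so your application's copy cannot clash with the gson that Spark/Hadoop ship on their classpath. The plugin version and the shaded package prefix (myapp.shaded) are placeholders to adapt to your project:

    <!-- Sketch only: relocate the application's gson classes so they do not
         collide with the older gson pulled in by Spark/Hadoop at runtime.
         The version and shadedPattern prefix below are assumptions. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.4</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <relocation>
                <pattern>com.google.gson</pattern>
                <shadedPattern>myapp.shaded.com.google.gson</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>

Running mvn package then produces an uber jar in which your code's references to gson are rewritten to the relocated package, so the old gson on Spark's classpath no longer interferes.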

 

Michael


2 REPLIES


Community Manager

Hi @Jarinek,

Did the reply from @mridley resolve your issue? If so, please mark the reply as the solution, as it will make it easier for others to find the answer in the future. 


 

 


Cy Jervis, Manager, Community Program