Created 05-13-2021 09:17 AM
Hello,
I am struggling with a dependency issue in Spark. Being new to Spark, I hope there is a simple remedy.
The question is: is there any mechanism to separate the dependencies of the Spark engine from the dependencies of an application?
Example: The latest version of spark-core_2.12 (3.1.1, March 2021) depends on
hadoop-client (3.3.0, July 2020), which in turn depends on hadoop-common (3.3.0, July 2020),
which finally depends on an ancient version of gson (2.2.4, May 2013).
You can easily find many other examples, e.g. commons-codec, protobuf-java ...
So, what if your application, basically a library developed outside Spark, depends on the latest (no longer compatible) version of gson, 2.8.6?
My admittedly naive approach of simply starting the Spark application ends in runtime incompatibility clashes (e.g. with gson).
Best regards
Jaro
Created 05-14-2021 07:41 AM
Hello,
You can solve this by shading (relocating) the conflicting dependency with the Maven Shade Plugin, so your application uses its own copy of the library under a renamed package while Spark keeps its version. Take a look at the Cloudera doc https://docs.cloudera.com/runtime/7.2.9/developing-spark-applications/topics/spark-packaging-differe... .
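For illustration, here is a minimal sketch of such a relocation in the application's pom.xml, assuming the conflict is with gson as in your example; the shaded package prefix (myapp.shaded) is just a placeholder you would replace with your own:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.4</version>
      <executions>
        <execution>
          <!-- run the shade goal when the jar is packaged -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <!-- rewrite gson classes (and your references to them)
                   into a private package so they cannot clash with
                   the gson 2.2.4 that Spark/Hadoop bring along -->
              <relocation>
                <pattern>com.google.gson</pattern>
                <shadedPattern>myapp.shaded.com.google.gson</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

Note that the Spark dependencies themselves (e.g. spark-core_2.12) should normally keep the "provided" scope so they are not bundled into your shaded jar; only your application's own conflicting libraries need to be relocated.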
Michael
Created 05-24-2021 05:47 AM
Hi @Jarinek,
Did the reply from @mridley resolve your issue? If so, please mark the reply as the solution, as it will make it easier for others to find the answer in the future.