Created on 09-22-2015 12:36 AM - edited 09-16-2022 02:41 AM
Hello,
I would like to use newer versions of some of the libraries listed in /etc/spark/conf/classpath.txt.
What is the recommended way to do that? I already add extra libraries with spark-submit's --jars (the jars are on HDFS), but
this does not work for newer versions of libraries that are already in classpath.txt.
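For concreteness, this is roughly how I submit today (the application class, jar names, and HDFS paths below are only placeholders):

spark-submit --master yarn --deploy-mode cluster \
  --class com.example.MyApp \
  --jars hdfs:///user/me/libs/some-lib-2.0.jar,hdfs:///user/me/libs/other-lib-1.3.jar \
  hdfs:///user/me/apps/my-app.jar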
Alternatively, is there a way to disable the construction of classpath.txt and rely solely on the libraries provided to spark-submit (except possibly Spark and Hadoop)?
I'm running Spark on YARN (cluster mode).
Thank you!
Created 09-22-2015 06:06 AM
We had a similar problem running Accumulo 1.7.2 (parcel-based) on CDH 5. Unfortunately, CDH 5 bundles the Accumulo 1.6.0 jars by default.
Our workaround was to modify SPARK_DIST_CLASSPATH via the
Spark Service Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh – Spark (Service-Wide):
SPARK_DIST_CLASSPATH=/opt/cloudera/parcels/ACCUMULO/lib/accumulo/lib/accumulo-core.jar:$SPARK_DIST_CLASSPATH
SPARK_DIST_CLASSPATH=/opt/cloudera/parcels/ACCUMULO/lib/accumulo/lib/accumulo-fate.jar:$SPARK_DIST_CLASSPATH
SPARK_DIST_CLASSPATH=/opt/cloudera/parcels/ACCUMULO/lib/accumulo/lib/accumulo-start.jar:$SPARK_DIST_CLASSPATH
SPARK_DIST_CLASSPATH=/opt/cloudera/parcels/ACCUMULO/lib/accumulo/lib/accumulo-trace.jar:$SPARK_DIST_CLASSPATH
export SPARK_DIST_CLASSPATH
This way you can prepend to, or completely redefine, SPARK_DIST_CLASSPATH.
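Since spark-env.sh is just a shell script, the same prepend can also be written as a loop in the safety valve. This is only a more compact sketch of the snippet above, assuming all four jars live in that lib directory:

# Prepend the parcel's Accumulo jars so they are found ahead of the bundled ones
for j in accumulo-core accumulo-fate accumulo-start accumulo-trace; do
  SPARK_DIST_CLASSPATH=/opt/cloudera/parcels/ACCUMULO/lib/accumulo/lib/$j.jar:$SPARK_DIST_CLASSPATH
done
export SPARK_DIST_CLASSPATH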