<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Spark distributed classpath in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31515#M34409</link>
    <description>&lt;P&gt;It should not pose a problem. If it does let us know but we have not seen an issue with this.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Wilfred&lt;/P&gt;</description>
    <pubDate>Thu, 03 Sep 2015 18:48:42 GMT</pubDate>
    <dc:creator>Wilfred</dc:creator>
    <dc:date>2015-09-03T18:48:42Z</dc:date>
    <item>
      <title>Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31320#M34404</link>
      <description>&lt;P&gt;We have Spark installed via Cloudera Manager on a YARN cluster. It appears there is a &lt;EM&gt;classpath.txt&lt;/EM&gt; file in &lt;EM&gt;/etc/spark/conf&lt;/EM&gt; that includes a list of jars that should be available on Spark's distributed classpath, and&amp;nbsp;&lt;EM&gt;spark-env.sh&amp;nbsp;&lt;/EM&gt;seems to be the one that exports this configuration.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It is my understanding that Cloudera Manager creates the&amp;nbsp;&lt;EM&gt;classpath.txt&lt;/EM&gt; file. I would like to know how Cloudera Manager determines the list of jars that go into this file, and whether it is something that can be controlled through Cloudera Manager.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:39:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31320#M34404</guid>
      <dc:creator>NT</dc:creator>
      <dc:date>2022-09-16T09:39:21Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31420#M34405</link>
      <description>&lt;P&gt;For adding custom classes to the classpath you should use one of the following two options:&lt;BR /&gt;- add them via the command line options&lt;BR /&gt;- add them via the config&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For the driver you have the option to use: --driver-class-path /path/to/file&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Or for the executor use&lt;/P&gt;&lt;P&gt;--conf "spark.executor.extraClassPath=/path/to/jar"&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;In spark-defaults.conf set the two values (or one if you only need it for one side):&lt;BR /&gt;&amp;nbsp; spark.driver.extraClassPath&lt;BR /&gt;&amp;nbsp; spark.executor.extraClassPath&lt;/P&gt;&lt;P&gt;This can be done through the CM UI.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Depending on exactly what you are doing you might see limitations on which option you can use.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Wilfred&lt;/P&gt;</description>
      <pubDate>Tue, 01 Sep 2015 12:55:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31420#M34405</guid>
      <dc:creator>Wilfred</dc:creator>
      <dc:date>2015-09-01T12:55:50Z</dc:date>
    </item>
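The two approaches above can be sketched in one place. This is a minimal illustration, not a tested recipe: `/path/to/extra.jar`, `com.example.MyApp`, and `myapp.jar` are placeholders you would replace with your own paths and classes.

```shell
# 1) Per-job, on the spark-submit command line:
spark-submit \
  --driver-class-path /path/to/extra.jar \
  --conf "spark.executor.extraClassPath=/path/to/extra.jar" \
  --class com.example.MyApp myapp.jar

# 2) Cluster-wide, via spark-defaults.conf (editable through the CM UI):
#      spark.driver.extraClassPath   /path/to/extra.jar
#      spark.executor.extraClassPath /path/to/extra.jar
```

Note that the command-line `--conf` and the spark-defaults.conf entries set the same properties; the conf file just makes them the default for every job.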
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31422#M34406</link>
      <description>&lt;P&gt;Thank you for your response Wilfred. It certainly helps me. However, my question was more about understanding how the &lt;EM&gt;classpath.txt&amp;nbsp;&lt;/EM&gt;file shown below is created. Does CM create this file on all nodes, and is it something we can configure through CM?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;08:42:43 $ ll /etc/spark/conf/&lt;BR /&gt;total 60&lt;BR /&gt;drwxr-xr-x 3 root root 4096 &amp;nbsp; Aug 25 12:28 ./&lt;BR /&gt;drwxr-xr-x 3 root root 4096 &amp;nbsp; Aug 25 12:28 ../&lt;BR /&gt;-rw-r--r-- 1 root root 29228 Aug 25 12:28 classpath.txt&lt;BR /&gt;-rw-r--r-- 1 root root 21 &amp;nbsp; &amp;nbsp; &amp;nbsp; Aug 25 12:28 __cloudera_generation__&lt;BR /&gt;-rw-r--r-- 1 root root 550 &amp;nbsp; &amp;nbsp; Aug 25 12:28 log4j.properties&lt;BR /&gt;-rw-r--r-- 1 root root 800 &amp;nbsp; &amp;nbsp; Aug 25 12:28 spark-defaults.conf&lt;BR /&gt;-rw-r--r-- 1 root root 1122 &amp;nbsp; Aug 25 12:28 spark-env.sh&lt;BR /&gt;drwxr-xr-x 2 root root 4096 &amp;nbsp; Aug 25 12:28 yarn-conf/&lt;/P&gt;</description>
      <pubDate>Tue, 01 Sep 2015 13:44:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31422#M34406</guid>
      <dc:creator>NT</dc:creator>
      <dc:date>2015-09-01T13:44:51Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31424#M34407</link>
      <description>&lt;P&gt;Yes, CM generates this as part of the gateway (client config). The classpath text file is generated by CM based on the dependencies that are defined in the deployment.&lt;/P&gt;&lt;P&gt;This is not something you can change.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As you can see in the &lt;A href="http://spark.apache.org/docs/latest/hadoop-provided.html" target="_blank"&gt;upstream docs&lt;/A&gt;, we use a form of the Hadoop-free distribution, but we still only test this with CDH and its specific dependencies.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does that explain what you are looking for?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Wilfred&lt;/P&gt;</description>
      <pubDate>Tue, 01 Sep 2015 14:35:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31424#M34407</guid>
      <dc:creator>Wilfred</dc:creator>
      <dc:date>2015-09-01T14:35:59Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31426#M34408</link>
      <description>&lt;P&gt;Thank you for the quick response; I really appreciate your help in clearing up my questions.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The answer was exactly what I was looking for. It is automated, and users cannot control the contents of the classpath.txt file.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Pardon my naive question, but can it pose a problem to have different versions of the same dependency on the classpath?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;09:39:34 $ cat /etc/spark/conf/classpath.txt | grep jersey-server&lt;BR /&gt;/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p886.563/jars/jersey-server-1.9.jar&lt;BR /&gt;/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p886.563/jars/jersey-server-1.14.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Sep 2015 14:43:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31426#M34408</guid>
      <dc:creator>NT</dc:creator>
      <dc:date>2015-09-01T14:43:11Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31515#M34409</link>
      <description>&lt;P&gt;It should not pose a problem. If it does let us know but we have not seen an issue with this.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Wilfred&lt;/P&gt;</description>
      <pubDate>Thu, 03 Sep 2015 18:48:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31515#M34409</guid>
      <dc:creator>Wilfred</dc:creator>
      <dc:date>2015-09-03T18:48:42Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31516#M34410</link>
      <description>Thank you! That definitely helps.</description>
      <pubDate>Thu, 03 Sep 2015 18:51:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31516#M34410</guid>
      <dc:creator>NT</dc:creator>
      <dc:date>2015-09-03T18:51:47Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31906#M34411</link>
      <description>&lt;P&gt;Actually, I think I hit an issue related to the fact that classpath.txt contains multiple versions of the same jar.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The issue is related to this JIRA:&amp;nbsp;&lt;A href="https://issues.apache.org/jira/browse/SPARK-8332" target="_blank"&gt;https://issues.apache.org/jira/browse/SPARK-8332&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And in /etc/spark/conf/classpath.txt:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;-----------------------------&lt;/P&gt;&lt;P&gt;cat /etc/spark/conf/classpath.txt | grep jackson&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/jars/jackson-annotations-2.2.3.jar&lt;BR /&gt;/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/jars/jackson-core-2.2.3.jar&lt;BR /&gt;/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/jars/jackson-databind-2.2.3.jar&lt;BR /&gt;/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/jars/jackson-annotations-2.3.0.jar&lt;BR /&gt;/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/jars/jackson-core-2.3.1.jar&lt;BR /&gt;/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/jars/jackson-databind-2.3.1.jar&lt;/P&gt;&lt;P&gt;-----------------------------&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Somehow the class loader is picking up version 2.2.3 of Jackson, in which the method handledType() of the class BigDecimalDeserializer does not exist.&lt;/P&gt;&lt;P&gt;Similar errors may appear for Jersey as well, since the API changed a bit between those versions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there a way to solve this kind of issue properly?&lt;/P&gt;</description>
      <pubDate>Wed, 16 Sep 2015 13:21:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/31906#M34411</guid>
      <dc:creator>andreF</dc:creator>
      <dc:date>2015-09-16T13:21:25Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37875#M34412</link>
      <description>&lt;P&gt;Hi andreF,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a similar issue. Did you manage to fix it?&lt;/P&gt;</description>
      <pubDate>Thu, 25 Feb 2016 17:22:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37875#M34412</guid>
      <dc:creator>zlpmichelle</dc:creator>
      <dc:date>2016-02-25T17:22:40Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37876#M34413</link>
      <description>&lt;P&gt;Hi Wilfred,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a similar issue to andreF's: we have several different Guava jars in /etc/spark/conf/classpath.txt. Do you know how to fix this?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/guava-11.0.2.jar&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/guava-11.0.jar&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/guava-14.0.1.jar&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/guava-16.0.1.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Our app needs guava-16.0.1.jar, so I added guava-16.0.1.jar into /opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/ and added "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/guava-16.0.1.jar" to /etc/spark/conf/classpath.txt.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, it doesn't work; the Spark action in Oozie still cannot find guava-16.0.1.jar. How does classpath.txt work? Do you know how to manage or modify classpath.txt manually? Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 25 Feb 2016 17:30:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37876#M34413</guid>
      <dc:creator>zlpmichelle</dc:creator>
      <dc:date>2016-02-25T17:30:26Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37884#M34414</link>
      <description>&lt;P&gt;Hi zlpmichelle,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In my case the problem came from the fact that I wasn't using a CDH artifact in my Maven dependencies. If you package Guava 16.0.1 into your jar, do you still get this problem?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It is still a bit obscure to me how exactly classpath.txt works and why it mixes several versions of the same API &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 25 Feb 2016 18:07:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37884#M34414</guid>
      <dc:creator>andreF</dc:creator>
      <dc:date>2016-02-25T18:07:06Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37887#M34415</link>
      <description>&lt;P&gt;Agreed, andreF. Thanks for your response.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I packaged Guava 16.0.1 into my jar and it still hits the same problem. I guess there is a Guava version conflict, since there are several different Guava jars in CDH's classpath.txt.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm really confused about this.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 25 Feb 2016 18:43:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37887#M34415</guid>
      <dc:creator>zlpmichelle</dc:creator>
      <dc:date>2016-02-25T18:43:11Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37942#M34416</link>
      <description>&lt;P&gt;If you need a specific version of Guava, you cannot just add it to the classpath. If you do, you rely entirely on the randomness of the class loaders; there is no guarantee that the proper version of Guava gets loaded.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The first thing you need to do is make sure that the proper version of Guava is loaded at all times. The proper way to do this is to shade (Maven) or shadow (Gradle) your Guava. Check the web on how to do this. It is really the only way to make sure you get the correct version without breaking the rest of Hadoop at the same time.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;After that is done, use the classpath additions discussed earlier and make sure that you add your shaded version.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is the only way to do this without being vulnerable to changes in the Hadoop dependencies.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Wilfred&lt;/P&gt;</description>
      <pubDate>Fri, 26 Feb 2016 04:56:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37942#M34416</guid>
      <dc:creator>Wilfred</dc:creator>
      <dc:date>2016-02-26T04:56:40Z</dc:date>
    </item>
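The shading Wilfred describes is typically done with the maven-shade-plugin's relocation feature. A minimal sketch of a pom.xml fragment follows; the `myapp.shaded` prefix is an arbitrary example namespace, and the plugin version is only illustrative of that era:

```xml
<!-- Hypothetical pom.xml fragment: relocate Guava classes into a private
     package so the application's Guava 16 cannot clash with the older
     copies on the cluster classpath. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.3</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>myapp.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Relocation rewrites both the bundled Guava classes and the application's bytecode references to them, so at runtime the app resolves `myapp.shaded.com.google.common.*` while Hadoop keeps loading its own `com.google.common.*` untouched.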
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37945#M34417</link>
      <description>&lt;P&gt;Thanks very much, Wilfred, for your helpful suggestion! We tried shadowing in Gradle before; at that time it still had the Guava NoSuchMethodError issue. We'll try shadowing again, adding guava-16.0.1.jar to the Oozie sharelib, to see whether it works.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Feb 2016 09:35:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/37945#M34417</guid>
      <dc:creator>zlpmichelle</dc:creator>
      <dc:date>2016-02-26T09:35:30Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/38122#M34418</link>
      <description>&lt;P&gt;We used the Gradle shadow plugin to shade guava-16.0.1 into sparktest.jar and put the shaded sparktest.jar into Oozie's sharelib classpath. When we run the shaded sparktest.jar in CDH Oozie, it throws the following exception:&lt;/P&gt;&lt;HR /&gt;&lt;P&gt;Failing Oozie Launcher, Main class &lt;SPAN class="error"&gt;[org.apache.oozie.action.hadoop.SparkMain]&lt;/SPAN&gt;, main() threw exception, org.jets3t.service.ServiceException:&lt;STRONG&gt; Request Error: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer&lt;/STRONG&gt;&lt;BR /&gt;org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.ServiceException: Request Error: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer&lt;BR /&gt;at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:478)&lt;BR /&gt;at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:427)&lt;BR /&gt;at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)&lt;BR /&gt;at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:181)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)&lt;BR /&gt;at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;BR /&gt;at java.lang.reflect.Method.invoke(Method.java:606)&lt;BR /&gt;at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)&lt;BR /&gt;at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)&lt;BR /&gt;at 
org.apache.hadoop.fs.s3native.$Proxy48.retrieveMetadata(Unknown Source)&lt;BR /&gt;at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:468)&lt;BR /&gt;at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:64)&lt;BR /&gt;at org.apache.hadoop.fs.Globber.doGlob(Globber.java:272)&lt;BR /&gt;at org.apache.hadoop.fs.Globber.glob(Globber.java:151)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1653)&lt;BR /&gt;at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:259)&lt;BR /&gt;at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)&lt;BR /&gt;at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)&lt;BR /&gt;at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)&lt;/P&gt;&lt;P&gt;at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)&lt;BR /&gt;at scala.Option.getOrElse(Option.scala:120)&lt;BR /&gt;at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)&lt;BR /&gt;at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)&lt;BR /&gt;at scala.Option.getOrElse(Option.scala:120)&lt;BR /&gt;at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)&lt;BR /&gt;at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)&lt;BR /&gt;at scala.Option.getOrElse(Option.scala:120)&lt;BR /&gt;at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)&lt;BR /&gt;at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)&lt;BR /&gt;at 
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)&lt;BR /&gt;at scala.Option.getOrElse(Option.scala:120)&lt;BR /&gt;at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)&lt;BR /&gt;at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)&lt;BR /&gt;at org.apache.spark.rdd.RDD.count(RDD.scala:1121)&lt;BR /&gt;at com.gridx.spark.MeterReadingLoader$.load(MeterReadingLoader.scala:120)&lt;BR /&gt;at com.gridx.spark.MeterReadingLoader$.main(MeterReadingLoader.scala:101)&lt;BR /&gt;at com.gridx.spark.MeterReadingLoader.main(MeterReadingLoader.scala)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;According to &lt;A href="http://quabr.com/20613854/exception-when-getting-a-list-of-buckets-on-s3-using-jets3t" target="_blank" rel="nofollow"&gt;http://quabr.com/20613854/exception-when-getting-a-list-of-buckets-on-s3-using-jets3t&lt;/A&gt;, after changing httpclient from the 4.2.5 jar to the 4.2 jar in the HDFS Oozie shared lib, it throws the following exception:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;STRONG&gt;JA008: File does not exist: hdfs://ip-10-0-4-248.us-west-1.compute.internal:8020/user/oozie/share/lib/lib_20151201085935/spark/httpclient-4.2.5.jar&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2016 05:07:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/38122#M34418</guid>
      <dc:creator>zlpmichelle</dc:creator>
      <dc:date>2016-03-01T05:07:37Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/38130#M34419</link>
      <description>&lt;P&gt;You cannot just replace a file in HDFS and expect it to be picked up. The files will be localised during the run, and there is a check to make sure that the files are where they should be. See the &lt;A href="http://blog.cloudera.com/blog/2014/05/how-to-use-the-sharelib-in-apache-oozie-cdh-5/" target="_blank"&gt;blog&lt;/A&gt; on how the sharelib works.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The OOTB version of Spark that we deliver with CDH does not throw the error that you show. It runs with the provided httpclient, so I doubt that replacing the jar is the proper solution. Most likely a mismatch in one of the other jars results in this error.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Wilfred&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2016 09:03:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/38130#M34419</guid>
      <dc:creator>Wilfred</dc:creator>
      <dc:date>2016-03-01T09:03:53Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/38238#M34420</link>
      <description>&lt;P&gt;Thanks Wilfred.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This issue is from &lt;A href="https://community.cloudera.com/t5/Batch-Processing-and-Workflow/how-to-add-external-guava-16-0-1-jar-in-CDH-oozie-classpath/m-p/37803#U37803" target="_self"&gt;http://community.cloudera.com/t5/Batch-Processing-and-Workflow/how-to-add-external-guava-16-0-1-jar-in-CDH-oozie-classpath/m-p/37803#U37803&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does that mean there is no way to have CDH 5.5.0 Hue/Oozie support a Spark action (Spark 1.5.0) that writes data into Cassandra 2.1.11?&lt;/P&gt;</description>
      <pubDate>Thu, 03 Mar 2016 03:29:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/38238#M34420</guid>
      <dc:creator>zlpmichelle</dc:creator>
      <dc:date>2016-03-03T03:29:01Z</dc:date>
    </item>
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/38371#M34421</link>
      <description>&lt;P&gt;You have most likely pulled in too many dependencies when building your application. The Gradle documentation shows that it behaves differently from Maven: when packaging an application, Gradle includes far more dependencies than Maven does. This could have pulled in dependencies you don't want or need.&lt;/P&gt;&lt;P&gt;Make sure that the application contains only what you really need and what is not already provided by Hadoop. Search for Gradle dependency management; you need some way to define a "provided" scope in Gradle.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Wilfred&lt;/P&gt;</description>
      <pubDate>Mon, 07 Mar 2016 00:12:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/38371#M34421</guid>
      <dc:creator>Wilfred</dc:creator>
      <dc:date>2016-03-07T00:12:28Z</dc:date>
    </item>
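Combining the shadowing and "provided" advice from this thread, a Gradle build might look roughly like the sketch below. This is a hypothetical build.gradle for the era discussed; the plugin version, Spark coordinates, and the `myapp.shaded` relocation prefix are illustrative, and on Gradle versions before 2.12 (which introduced `compileOnly`) people typically emulated a provided scope with a custom configuration instead:

```groovy
// Hypothetical build.gradle sketch using the Shadow plugin.
plugins {
    id 'java'
    id 'com.github.johnrengelman.shadow' version '1.2.3'
}

dependencies {
    // Bundled into the fat jar and relocated below:
    compile 'com.google.guava:guava:16.0.1'
    // Provided by the cluster at runtime -- keep it out of the fat jar:
    compileOnly 'org.apache.spark:spark-core_2.10:1.5.0'
}

shadowJar {
    // Relocate Guava so the shaded copy cannot clash with CDH's versions.
    relocate 'com.google.common', 'myapp.shaded.com.google.common'
}
```

The key point is that only the application's own dependencies end up in the shaded jar, while anything Hadoop or Spark already ships stays out of it.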
    <item>
      <title>Re: Spark distributed classpath</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/80077#M34422</link>
      <description>I understand this is an older post, but I am getting the same problem. Can you please share the solution if it was resolved for you?&lt;BR /&gt;&lt;BR /&gt;Thanks&lt;BR /&gt;</description>
      <pubDate>Thu, 20 Sep 2018 04:16:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Spark-distributed-classpath/m-p/80077#M34422</guid>
      <dc:creator>SandeepP</dc:creator>
      <dc:date>2018-09-20T04:16:41Z</dc:date>
    </item>
  </channel>
</rss>

