<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question ${SPARK_DIST_CLASSPATH} does not include jars in /usr/lib/hadoop/client/  ... in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/SPARK-DIST-CLASSPATH-does-not-include-jars-in-usr-lib-hadoop/m-p/52883#M58502</link>
    <description>&lt;P&gt;Hello Friends:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;A quick preamble, and then a question ...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I run CDH 5.10 on CentOS6 final for personal use (1-node for Master and CM; and 4-nodes for Workers/Slaves). They are all Linux LXC/Containers.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It's been a while since I spun the cluster up, so the first thing I did was a '&lt;STRONG&gt;yum update&lt;/STRONG&gt;'&amp;nbsp;of the nodes. No issues there. The cluster is up and running. All green statuses in CM.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, one thing that used to work but now does not is the&amp;nbsp;&lt;STRONG&gt;pyspark&lt;/STRONG&gt;&amp;nbsp;command. When I run it now, I get the following exception:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The jar file for that class is:&lt;/P&gt;&lt;PRE&gt;&lt;FONT color="#3366FF"&gt;/usr/lib/hadoop/client&lt;/FONT&gt;/&lt;FONT color="#FF6600"&gt;hadoop-common.jar&lt;/FONT&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;After troubleshooting -- &lt;EM&gt;again, it's been a while since I used the cluster, so some things may have changed&lt;/EM&gt; -- I determined that the&amp;nbsp;&lt;STRONG&gt;SPARK_DIST_CLASSPATH&lt;/STRONG&gt;&amp;nbsp;environment variable was getting set, but did not contain&amp;nbsp;&lt;EM&gt;&lt;STRONG&gt;any&lt;/STRONG&gt;&lt;/EM&gt; of the jars in that directory (including, of course, the one mentioned above).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The script &lt;EM&gt;ultimately&lt;/EM&gt; responsible for 
setting&amp;nbsp;&lt;STRONG&gt;SPARK_DIST_CLASSPATH&lt;/STRONG&gt;&amp;nbsp;is:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;/etc/spark/conf/spark-env.sh&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;and it consults the list of jars in the &lt;STRONG&gt;classpath.txt&lt;/STRONG&gt; file to do so.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sadly, that file does not have any of the jars in the aforementioned directory. I could, of course, manually add them, but I found it odd that it did not have them in the first place. It seems like an important directory of jars to have included in &lt;STRONG&gt;classpath.txt&lt;/STRONG&gt;.&amp;nbsp;&lt;SPAN&gt;(again, in /usr/lib/hadoop/client/)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;U&gt;So my questions (finally) &lt;/U&gt;... =:)&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Any idea why the jars in that directory weren't included in classpath.txt? Was it perhaps just an upgrade issue?&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;&lt;LI&gt;Has anyone had to manually add the jars in that directory? (again, in /usr/lib/hadoop/client/)&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;&lt;LI&gt;Is /etc/spark/conf/&lt;STRONG&gt;classpath.txt&lt;/STRONG&gt; meant to be edited?&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm curious. It seems odd that they were left out, and I don't want to just blindly add them in.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you in advance!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 11:22:11 GMT</pubDate>
    <dc:creator>prismalytics</dc:creator>
    <dc:date>2022-09-16T11:22:11Z</dc:date>
    <item>
      <title>${SPARK_DIST_CLASSPATH} does not include jars in /usr/lib/hadoop/client/  ...</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/SPARK-DIST-CLASSPATH-does-not-include-jars-in-usr-lib-hadoop/m-p/52883#M58502</link>
      <description>&lt;P&gt;Hello Friends:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;A quick preamble, and then a question ...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I run CDH 5.10 on CentOS6 final for personal use (1-node for Master and CM; and 4-nodes for Workers/Slaves). They are all Linux LXC/Containers.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It's been a while since I spun the cluster up, so the first thing I did was a '&lt;STRONG&gt;yum update&lt;/STRONG&gt;'&amp;nbsp;of the nodes. No issues there. The cluster is up and running. All green statuses in CM.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, one thing that used to work but now does not is the&amp;nbsp;&lt;STRONG&gt;pyspark&lt;/STRONG&gt;&amp;nbsp;command. When I run it now, I get the following exception:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The jar file for that class is:&lt;/P&gt;&lt;PRE&gt;&lt;FONT color="#3366FF"&gt;/usr/lib/hadoop/client&lt;/FONT&gt;/&lt;FONT color="#FF6600"&gt;hadoop-common.jar&lt;/FONT&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;After troubleshooting -- &lt;EM&gt;again, it's been a while since I used the cluster, so some things may have changed&lt;/EM&gt; -- I determined that the&amp;nbsp;&lt;STRONG&gt;SPARK_DIST_CLASSPATH&lt;/STRONG&gt;&amp;nbsp;environment variable was getting set, but did not contain&amp;nbsp;&lt;EM&gt;&lt;STRONG&gt;any&lt;/STRONG&gt;&lt;/EM&gt; of the jars in that directory (including, of course, the one mentioned above).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The script &lt;EM&gt;ultimately&lt;/EM&gt; responsible for 
setting&amp;nbsp;&lt;STRONG&gt;SPARK_DIST_CLASSPATH&lt;/STRONG&gt;&amp;nbsp;is:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;/etc/spark/conf/spark-env.sh&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;and it consults the list of jars in the &lt;STRONG&gt;classpath.txt&lt;/STRONG&gt; file to do so.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sadly, that file does not have any of the jars in the aforementioned directory. I could, of course, manually add them, but I found it odd that it did not have them in the first place. It seems like an important directory of jars to have included in &lt;STRONG&gt;classpath.txt&lt;/STRONG&gt;.&amp;nbsp;&lt;SPAN&gt;(again, in /usr/lib/hadoop/client/)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;U&gt;So my questions (finally) &lt;/U&gt;... =:)&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Any idea why the jars in that directory weren't included in classpath.txt? Was it perhaps just an upgrade issue?&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;&lt;LI&gt;Has anyone had to manually add the jars in that directory? (again, in /usr/lib/hadoop/client/)&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;&lt;LI&gt;Is /etc/spark/conf/&lt;STRONG&gt;classpath.txt&lt;/STRONG&gt; meant to be edited?&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm curious. It seems odd that they were left out, and I don't want to just blindly add them in.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you in advance!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 11:22:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/SPARK-DIST-CLASSPATH-does-not-include-jars-in-usr-lib-hadoop/m-p/52883#M58502</guid>
      <dc:creator>prismalytics</dc:creator>
      <dc:date>2022-09-16T11:22:11Z</dc:date>
    </item>
    <item>
      <title>SOLVED: Re: ${SPARK_DIST_CLASSPATH} does not include jars in /usr/lib/hadoop/client/  ...</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/SPARK-DIST-CLASSPATH-does-not-include-jars-in-usr-lib-hadoop/m-p/52887#M58503</link>
      <description>&lt;P&gt;I ended up repairing the issue after more work.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The UI eventually revealed to me that the versions of CDH (5.10) and CM (5.4) were not in sync. When I investigated why, I found&amp;nbsp;that the entry in&amp;nbsp;&lt;STRONG&gt;/etc/yum.repos.d/cloudera-manager.repo&lt;/STRONG&gt; was pegged at CM 5.4, so my 'yum update' runs did not update CM (though they updated everything else). So that made sense.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I updated the repo file, yum-updated CM, and restarted. Then I let the UI walk me through a few&amp;nbsp;upgrades and corrections of stale states. So I unfortunately don't know exactly where the fix came from. =:) But basically we can say that &lt;STRONG&gt;classpath.txt&lt;/STRONG&gt; hadn't been updated properly. Now it has the correct entries.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm glad I didn't brute-force things (not my style anyway). I doubt this one-off issue will help anyone else, but who knows. =:)&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2017 03:04:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/SPARK-DIST-CLASSPATH-does-not-include-jars-in-usr-lib-hadoop/m-p/52887#M58503</guid>
      <dc:creator>prismalytics</dc:creator>
      <dc:date>2017-03-30T03:04:53Z</dc:date>
    </item>
  </channel>
</rss>