<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question spark hdfs uri missing host on hdp 2.4 in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167412#M129744</link>
    <description>&lt;P&gt;I set up a small HDP 2.4 cluster with Cloudbreak. After startup the Spark history server fails. In the log I find:&lt;/P&gt;&lt;PRE&gt;Caused by: java.io.IOException: Incomplete HDFS URI, no host: hdfs:///spark-history&lt;/PRE&gt;&lt;P&gt;I found these entries in the spark-defaults.conf file:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;spark.history.fs.logDirectory hdfs:///spark-history&lt;/LI&gt;&lt;LI&gt;spark.eventLog.dir hdfs:///spark-history&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I'm using WASB storage for the cluster and have no idea what host to set these URIs to.&lt;/P&gt;</description>
    <pubDate>Tue, 21 Mar 2017 16:44:05 GMT</pubDate>
    <dc:creator>onem</dc:creator>
    <dc:date>2017-03-21T16:44:05Z</dc:date>
    <item>
      <title>spark hdfs uri missing host on hdp 2.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167412#M129744</link>
      <description>&lt;P&gt;I set up a small HDP 2.4 cluster with Cloudbreak. After startup the Spark history server fails. In the log I find:&lt;/P&gt;&lt;PRE&gt;Caused by: java.io.IOException: Incomplete HDFS URI, no host: hdfs:///spark-history&lt;/PRE&gt;&lt;P&gt;I found these entries in the spark-defaults.conf file:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;spark.history.fs.logDirectory hdfs:///spark-history&lt;/LI&gt;&lt;LI&gt;spark.eventLog.dir hdfs:///spark-history&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I'm using WASB storage for the cluster and have no idea what host to set these URIs to.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Mar 2017 16:44:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167412#M129744</guid>
      <dc:creator>onem</dc:creator>
      <dc:date>2017-03-21T16:44:05Z</dc:date>
    </item>
    <item>
      <title>Re: spark hdfs uri missing host on hdp 2.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167413#M129745</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Are you using WASB as the default filesystem? If so, check core-site.xml for fs.defaultFS. I assume it should look something like this: wasb://&amp;lt;container&amp;gt;@&amp;lt;storage_account&amp;gt;.blob.core.windows.net/spark-history&lt;/P&gt;</description>
      <pubDate>Tue, 21 Mar 2017 16:53:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167413#M129745</guid>
      <dc:creator>Krisz</dc:creator>
      <dc:date>2017-03-21T16:53:29Z</dc:date>
    </item>
    <item>
      <title>Re: spark hdfs uri missing host on hdp 2.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167414#M129746</link>
      <description>&lt;P&gt;I changed the directory to one based on the WASB defaultFS, but now Spark throws a ClassNotFoundException:&lt;/P&gt;&lt;PRE&gt;java.lang.ClassNotFoundException: com.microsoft.azure.storage.blob.BlobListingDetails
&lt;/PRE&gt;&lt;P&gt;You'd expect this class to be available on a system supporting WASB...&lt;/P&gt;&lt;P&gt;Could I add some Java options somewhere so it can find this class?&lt;/P&gt;&lt;P&gt;Also strange: when I look at the setup of an HDInsight cluster, it's set up just like the default setup by Cloudbreak, and there it simply works; the Spark history server doesn't complain about the default hdfs:///spark-history URI.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Mar 2017 19:35:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167414#M129746</guid>
      <dc:creator>onem</dc:creator>
      <dc:date>2017-03-23T19:35:59Z</dc:date>
    </item>
    <item>
      <title>Re: spark hdfs uri missing host on hdp 2.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167415#M129747</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I can reproduce your issue on HDP 2.5 as well. It seems the Spark assembly jar contains invalid Azure storage classes. To fix this quickly I did the following:&lt;/P&gt;&lt;PRE&gt;# work in a scratch directory
mkdir -p /tmp/jarupdate &amp;amp;&amp;amp; cd /tmp/jarupdate
# locate the Azure storage jar shipped with Hadoop
find /usr/hdp/ -name "azure-storage*.jar"
cp /usr/hdp/2.5.0.1-210/hadoop/lib/azure-storage-2.2.0.jar .
cp /usr/hdp/current/spark-historyserver/lib/spark-assembly-1.6.3.2.5.0.1-210-hadoop2.7.3.2.5.0.1-210.jar .
# extract the azure-storage classes and merge them into the assembly jar
unzip azure-storage-2.2.0.jar
jar uf spark-assembly-1.6.3.2.5.0.1-210-hadoop2.7.3.2.5.0.1-210.jar com/
# put the updated assembly back in place and clean up
mv -f spark-assembly-1.6.3.2.5.0.1-210-hadoop2.7.3.2.5.0.1-210.jar /usr/hdp/current/spark-historyserver/lib/spark-assembly-1.6.3.2.5.0.1-210-hadoop2.7.3.2.5.0.1-210.jar
cd .. &amp;amp;&amp;amp; rm -rf /tmp/jarupdate
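# Optional sanity check (editor's addition, not part of the original steps;
# the jar path is cluster-specific, so adjust it to your HDP version):
# listing the updated assembly should now show the previously missing class.
unzip -l /usr/hdp/current/spark-historyserver/lib/spark-assembly-1.6.3.2.5.0.1-210-hadoop2.7.3.2.5.0.1-210.jar | grep BlobListingDetails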

&lt;/PRE&gt;&lt;P&gt;Basically I put the desired class files into the assembly jar and updated the original jar file. Once that's done, just start the history server and it should be OK.&lt;/P&gt;&lt;P&gt;I changed two configurations in spark-defaults:&lt;/P&gt;&lt;PRE&gt;spark.eventLog.dir = wasb://cloudbreak492@kriszwasbnorth.blob.core.windows.net/spark-history
spark.history.fs.logDirectory = wasb://cloudbreak492@kriszwasbnorth.blob.core.windows.net/spark-history&lt;/PRE&gt;&lt;P&gt;Once the history server started, I was able to run a Spark TeraGen job:&lt;/P&gt;&lt;PRE&gt;spark-submit --class com.nexr.spark.terasort.TeraGen --deploy-mode cluster --master yarn-cluster --num-executors 1 spark-terasort-0.1.jar 1G wasb://cloudbreak492@kriszwasbnorth.blob.core.windows.net/teradata
&lt;/PRE&gt;</description>
      <pubDate>Thu, 23 Mar 2017 21:55:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167415#M129747</guid>
      <dc:creator>Krisz</dc:creator>
      <dc:date>2017-03-23T21:55:25Z</dc:date>
    </item>
    <item>
      <title>Re: spark hdfs uri missing host on hdp 2.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167416#M129748</link>
      <description>&lt;P&gt;Thanks, that did the trick. I guess this is a bug; where should I report it? A quick Google search for 'hortonworks bug report' didn't give me anything useful.&lt;/P&gt;</description>
      <pubDate>Fri, 24 Mar 2017 02:01:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167416#M129748</guid>
      <dc:creator>onem</dc:creator>
      <dc:date>2017-03-24T02:01:40Z</dc:date>
    </item>
    <item>
      <title>Re: spark hdfs uri missing host on hdp 2.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167417#M129749</link>
      <description>&lt;P&gt;I think this is an Ambari bug. You can report the issue here after you've registered: &lt;A href="http://issues.apache.org/jira/browse/AMBARI"&gt;http://issues.apache.org/jira/browse/AMBARI&lt;/A&gt;. Also, could you please accept the answer here so others can find the solution while it's being fixed? Thank you.&lt;/P&gt;&lt;P&gt;Br,&lt;/P&gt;&lt;P&gt;Krisz&lt;/P&gt;</description>
      <pubDate>Fri, 24 Mar 2017 05:08:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-hdfs-uri-missing-host-on-hdp-2-4/m-p/167417#M129749</guid>
      <dc:creator>Krisz</dc:creator>
      <dc:date>2017-03-24T05:08:20Z</dc:date>
    </item>
  </channel>
</rss>

