<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: HDP3.0: spark structured streaming jobs working in HDP2.6 fail in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/278370#M82955</link>
    <description>&lt;P&gt;What is the exact solution?&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;"I solved this issue by using spark 2.3.1 jars under /usr/hdp/current/spark2-client/ from the HDP3.0 cluster."&lt;BR /&gt;I get this error on spark-submit. Are you saying that I should pass --jars&amp;nbsp;/usr/hdp/current/spark2-client/*.jar?&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 26 Sep 2019 20:11:43 GMT</pubDate>
    <dc:creator>charles2588</dc:creator>
    <dc:date>2019-09-26T20:11:43Z</dc:date>
    <item>
      <title>HDP3.0: spark structured streaming jobs working in HDP2.6 fail</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/177616#M82949</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;My Spark structured streaming jobs that worked in HDP2.6 fail in HDP3.0 with this error:&lt;/P&gt;&lt;PRE&gt;java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem cannot access its superinterface org.apache.hadoop.hdfs.web.TokenAspect$TokenManagementDelegator
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:3268)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3313)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3352)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:85)
    at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.&amp;lt;init&amp;gt;(HadoopFileLinesReader.scala:46)
    at org.apache.spark.sql.execution.datasources.json.TextInputJsonDataSource$.readFile(JsonDataSource.scala:125)
    at org.apache.spark.sql.execution.datasources.json.JsonFileFormat$$anonfun$buildReader$2.apply(JsonFileFormat.scala:132)
    at org.apache.spark.sql.execution.datasources.json.JsonFileFormat$$anonfun$buildReader$2.apply(JsonFileFormat.scala:130)
    at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:148)
    at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:132)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:128)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:182)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:109)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
    at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:216)
    at org.apache.spark.sql.execution.SortExec$$anonfun$1.apply(SortExec.scala:108)
    at org.apache.spark.sql.execution.SortExec$$anonfun$1.apply(SortExec.scala:101)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)&lt;/PRE&gt;&lt;P&gt;I did not find useful info online. Any clue is appreciated.&lt;/P&gt;</description>
      <pubDate>Thu, 30 Aug 2018 22:42:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/177616#M82949</guid>
      <dc:creator>jiangok2006</dc:creator>
      <dc:date>2018-08-30T22:42:10Z</dc:date>
    </item>
    <item>
      <title>Re: HDP3.0: spark structured streaming jobs working in HDP2.6 fail</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/177617#M82950</link>
      <description>&lt;P&gt;Per &lt;A href="https://spark.apache.org/docs/latest/building-spark.html" target="_blank"&gt;https://spark.apache.org/docs/latest/building-spark.html&lt;/A&gt;, Spark 2.3.1 is built with Hadoop 2.6.X by default. This is why my fat jar includes Hadoop 2.6.5 jars instead of 3.1.0 jars. HftpFileSystem has been removed in Hadoop 3. I need Spark 2.3.1 jars that are built with Hadoop 3.1.&lt;/P&gt;&lt;P&gt;On &lt;A href="https://spark.apache.org/downloads.html"&gt;https://spark.apache.org/downloads.html&lt;/A&gt;, I only see Spark 2.3.1 built with Hadoop 2.7. Where can I get Spark 2.3.1 built with Hadoop 3? Does Spark 2.3.1 support Hadoop 3?
I appreciate your help.&lt;/P&gt;&lt;P&gt;[UPDATE] I solved this issue by using spark 2.3.1 jars under /usr/hdp/current/spark2-client/ from the HDP3.0 cluster. Thanks.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 30 Aug 2018 23:57:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/177617#M82950</guid>
      <dc:creator>jiangok2006</dc:creator>
      <dc:date>2018-08-30T23:57:41Z</dc:date>
    </item>
    <item>
      <title>Re: HDP3.0: spark structured streaming jobs working in HDP2.6 fail</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/177618#M82951</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/18745/jiangok2006.html" nodeid="18745"&gt;@Lian Jiang&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;I'm facing the exact same issue. I read your update, but I still can't solve it. Could you give more details about your solution?&lt;/P&gt;&lt;P&gt;I copied the spark*.jar files &lt;STRONG&gt;from&lt;/STRONG&gt; /usr/hdp/current/spark2-client/ &lt;STRONG&gt;to&lt;/STRONG&gt; /usr/hdp/3.0.0.0-1634/zeppelin/lib, then restarted ambari-server and Zeppelin. It did not work.&lt;/P&gt;</description>
      <pubDate>Mon, 24 Sep 2018 19:31:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/177618#M82951</guid>
      <dc:creator>hayati_ibis</dc:creator>
      <dc:date>2018-09-24T19:31:27Z</dc:date>
    </item>
    <item>
      <title>Re: HDP3.0: spark structured streaming jobs working in HDP2.6 fail</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/177619#M82952</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/87272/hayatiibis.html"&gt;Hayati İbiş&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Please copy all jar files instead of only spark*.jar. Hope this helps. Thanks.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Sep 2018 03:45:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/177619#M82952</guid>
      <dc:creator>jiangok2006</dc:creator>
      <dc:date>2018-09-26T03:45:54Z</dc:date>
    </item>
    <item>
      <title>Re: HDP3.0: spark structured streaming jobs working in HDP2.6 fail</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/177620#M82953</link>
      <description>&lt;P&gt;&lt;A rel="user" href="#"&gt;@Lian Jiang&lt;/A&gt;&lt;/P&gt;&lt;P&gt;We have the same issue. For sbt, I tried copying /usr/hdp/current/spark2-client/jars/* into /.ivy2/cache, but it still doesn't work. How exactly did you solve it?&lt;/P&gt;</description>
      <pubDate>Tue, 04 Jun 2019 00:17:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/177620#M82953</guid>
      <dc:creator>pierre_kieffer-</dc:creator>
      <dc:date>2019-06-04T00:17:09Z</dc:date>
    </item>
    <item>
      <title>Re: HDP3.0: spark structured streaming jobs working in HDP2.6 fail</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/278370#M82955</link>
      <description>&lt;P&gt;What is the exact solution?&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;"I solved this issue by using spark 2.3.1 jars under /usr/hdp/current/spark2-client/ from the HDP3.0 cluster."&lt;BR /&gt;I get this error on spark-submit. Are you saying that I should pass --jars&amp;nbsp;/usr/hdp/current/spark2-client/*.jar?&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 26 Sep 2019 20:11:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/278370#M82955</guid>
      <dc:creator>charles2588</dc:creator>
      <dc:date>2019-09-26T20:11:43Z</dc:date>
    </item>
    <item>
      <title>Re: HDP3.0: spark structured streaming jobs working in HDP2.6 fail</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/299456#M82956</link>
      <description>&lt;P&gt;I had the same problem with a fat Scala jar that no longer worked after an upgrade to Cloudera 6.3.3.&lt;/P&gt;&lt;P&gt;This did the trick for me:&lt;/P&gt;&lt;P&gt;Build a thin jar instead by marking the Spark dependencies as "provided" in build.sbt:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;libraryDependencies ++= Seq(
  // "provided": expected on the cluster at runtime, so excluded from the assembled jar
  "org.apache.spark" %% "spark-core" % "2.4.0" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.4.0" % "provided",
  "org.apache.spark" %% "spark-hive" % "2.4.0" % "provided",
  // not marked "provided", so this one is still bundled
  "commons-httpclient" % "commons-httpclient" % "3.1"
)&lt;/LI-CODE&gt;&lt;P&gt;The httpclient jar had to be added, and I also had to force UTF-8 encoding.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jul 2020 19:23:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/299456#M82956</guid>
      <dc:creator>RichardSmithONS</dc:creator>
      <dc:date>2020-07-09T19:23:38Z</dc:date>
    </item>
    <item>
      <title>Re: HDP3.0: spark structured streaming jobs working in HDP2.6 fail</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/299765#M82957</link>
      <description>&lt;P&gt;That is why my fat jar contains Hadoop 2.6.5 jars (rather than 3.1.0). HftpFileSystem has been removed from Hadoop 3. I need Spark 2.3.1 jars built with Hadoop 3.1, but I only see Spark 2.3.1 built with Hadoop 2.7. Where can I get Spark 2.3.1 built with Hadoop 3? Does Spark 2.3.1 support Hadoop 3? Thanks for the help. I solved this issue by using the Spark 2.3.1 jars under /usr/hdp/current/spark2-client/ from the HDP3.0 cluster.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jul 2020 13:47:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDP3-0-spark-structured-streaming-jobs-working-in-HDP2-6/m-p/299765#M82957</guid>
      <dc:creator>MarthaNeilson</dc:creator>
      <dc:date>2020-07-15T13:47:15Z</dc:date>
    </item>
  </channel>
</rss>

