<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: CDH 5.7.0 Spark Streaming S3 Error in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41142#M29164</link>
    <description>&lt;P&gt;It looks like the AWS jar files changed between CDH 5.4.8 and CDH 5.5.2.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In CDH 5.4.8:&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/aws-java-sdk-1.7.14.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In CDH 5.5.2+, these replaced&amp;nbsp;aws-java-sdk-1.7.14.jar:&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/aws-java-sdk-core-1.10.6.jar&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/aws-java-sdk-kms-1.10.6.jar&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/aws-java-sdk-s3-1.10.6.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But the jets3t files are the same:&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/jets3t-0.9.0.jar&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/jets3t-0.6.1.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I don't know if this has anything to do with it, but the hadoop-aws jar differs only in its CDH&amp;nbsp;version:&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/hadoop-aws-2.6.0-cdh5.x.x.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I wonder if any of these is the problem.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Ben&lt;/P&gt;</description>
    <pubDate>Sat, 21 May 2016 15:02:54 GMT</pubDate>
    <dc:creator>benassi</dc:creator>
    <dc:date>2016-05-21T15:02:54Z</dc:date>
    <item>
      <title>CDH 5.7.0 Spark Streaming S3 Error</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41138#M29162</link>
      <description>&lt;P&gt;I am testing CDH 5.7.0 and found that Spark Streaming no longer works with S3. I also found out that it doesn't work with SQS either. I tried using the AWS SDK 1.11.0 jar, but it only worked to fix the SQS problem. I also tried to get the latest jets3t 0.9.4 jar and use it. It didn't work.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Exception in thread "JobGenerator" java.lang.VerifyError: Bad type on operand stack&lt;BR /&gt;Exception Details:&lt;BR /&gt;Location:&lt;BR /&gt;org/apache/hadoop/fs/s3native/Jets3tNativeFileSystemStore.copy(Ljava/lang/String;Ljava/lang/String;)V @155: invokevirtual&lt;BR /&gt;Reason:&lt;BR /&gt;Type 'org/jets3t/service/model/S3Object' (current frame, stack[4]) is not assignable to 'org/jets3t/service/model/StorageObject'&lt;BR /&gt;Current Frame:&lt;BR /&gt;bci: @155&lt;BR /&gt;flags: { }&lt;BR /&gt;locals: { 'org/apache/hadoop/fs/s3native/Jets3tNativeFileSystemStore', 'java/lang/String', 'java/lang/String', 'org/jets3t/service/model/S3Object' }&lt;BR /&gt;stack: { 'org/jets3t/service/S3Service', 'java/lang/String', 'java/lang/String', 'java/lang/String', 'org/jets3t/service/model/S3Object', integer }&lt;BR /&gt;Bytecode:&lt;BR /&gt;0x0000000: b200 fcb9 0190 0100 9900 39b2 00fc bb01&lt;BR /&gt;0x0000010: 5659 b701 5713 0192 b601 5b2b b601 5b13&lt;BR /&gt;0x0000020: 0194 b601 5b2c b601 5b13 0196 b601 5b2a&lt;BR /&gt;0x0000030: b400 7db6 00e7 b601 5bb6 015e b901 9802&lt;BR /&gt;0x0000040: 002a b400 5799 0030 2ab4 0047 2ab4 007d&lt;BR /&gt;0x0000050: 2b01 0101 01b6 019b 4e2a b400 6b09 949e&lt;BR /&gt;0x0000060: 0016 2db6 019c 2ab4 006b 949e 000a 2a2d&lt;BR /&gt;0x0000070: 2cb6 01a0 b1bb 00a0 592c b700 a14e 2d2a&lt;BR /&gt;0x0000080: b400 73b6 00b0 2ab4 0047 2ab4 007d b600&lt;BR /&gt;0x0000090: e72b 2ab4 007d b600 e72d 03b6 01a4 57a7&lt;BR /&gt;0x00000a0: 000a 4e2a 2d2b b700 c7b1&lt;BR /&gt;Exception Handler Table:&lt;BR /&gt;bci [0, 116] =&amp;gt; handler: 162&lt;BR /&gt;bci [117, 159] =&amp;gt; handler: 
162&lt;BR /&gt;Stackmap Table:&lt;BR /&gt;same_frame_extended(@65)&lt;BR /&gt;same_frame(@117)&lt;BR /&gt;same_locals_1_stack_item_frame(@162,Object[#139])&lt;BR /&gt;same_frame(@169)&lt;/P&gt;&lt;P&gt;at org.apache.hadoop.fs.s3native.NativeS3FileSystem.createDefaultStore(NativeS3FileSystem.java:338)&lt;BR /&gt;at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:328)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2696)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2733)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2715)&lt;BR /&gt;at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:382)&lt;BR /&gt;at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)&lt;BR /&gt;at org.apache.spark.streaming.dstream.FileInputDStream.org$apache$spark$streaming$dstream$FileInputDStream$$fs(FileInputDStream.scala:297)&lt;BR /&gt;at org.apache.spark.streaming.dstream.FileInputDStream.findNewFiles(FileInputDStream.scala:198)&lt;BR /&gt;at org.apache.spark.streaming.dstream.FileInputDStream.compute(FileInputDStream.scala:149)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:352)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:352)&lt;BR /&gt;at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:351)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:351)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:426)&lt;BR /&gt;at 
org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:346)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:344)&lt;BR /&gt;at scala.Option.orElse(Option.scala:257)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:341)&lt;BR /&gt;at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:352)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:352)&lt;BR /&gt;at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:351)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:351)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:426)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:346)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:344)&lt;BR /&gt;at scala.Option.orElse(Option.scala:257)&lt;BR /&gt;at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:341)&lt;BR /&gt;at org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:47)&lt;BR /&gt;at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:115)&lt;BR /&gt;at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:114)&lt;BR /&gt;at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)&lt;BR /&gt;at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)&lt;BR /&gt;at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)&lt;BR /&gt;at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)&lt;BR /&gt;at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)&lt;BR /&gt;at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)&lt;BR /&gt;at org.apache.spark.streaming.DStreamGraph.generateJobs(DStreamGraph.scala:114)&lt;BR /&gt;at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:248)&lt;BR /&gt;at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:246)&lt;BR /&gt;at scala.util.Try$.apply(Try.scala:161)&lt;BR /&gt;at org.apache.spark.streaming.scheduler.JobGenerator.generateJobs(JobGenerator.scala:246)&lt;BR /&gt;at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:181)&lt;BR /&gt;at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:87)&lt;BR /&gt;at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:86)&lt;BR /&gt;at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I hope there is a fix for this coming up soon.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Ben&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:21:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41138#M29162</guid>
      <dc:creator>benassi</dc:creator>
      <dc:date>2022-09-16T10:21:02Z</dc:date>
    </item>
    <item>
      <title>Re: CDH 5.7.0 Spark Streaming S3 Error</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41139#M29163</link>
      <description>&lt;P&gt;That looks like an S3 library problem. The JVM says the bytecode itself is invalid. It has nothing to do with Spark per se.&lt;/P&gt;&lt;P&gt;CDH does not support S3, although there's no particular reason it wouldn't work if you had the right libraries in place.&lt;/P&gt;</description>
      <pubDate>Sat, 21 May 2016 14:11:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41139#M29163</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2016-05-21T14:11:10Z</dc:date>
    </item>
    <item>
      <title>Re: CDH 5.7.0 Spark Streaming S3 Error</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41142#M29164</link>
      <description>&lt;P&gt;It looks like the AWS jar files changed between CDH 5.4.8 and CDH 5.5.2.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In CDH 5.4.8:&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/aws-java-sdk-1.7.14.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In CDH 5.5.2+, these replaced&amp;nbsp;aws-java-sdk-1.7.14.jar:&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/aws-java-sdk-core-1.10.6.jar&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/aws-java-sdk-kms-1.10.6.jar&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/aws-java-sdk-s3-1.10.6.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But the jets3t files are the same:&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/jets3t-0.9.0.jar&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/jets3t-0.6.1.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I don't know if this has anything to do with it, but the hadoop-aws jar differs only in its CDH&amp;nbsp;version:&lt;/P&gt;&lt;P&gt;/opt/cloudera/parcels/CDH/jars/hadoop-aws-2.6.0-cdh5.x.x.jar&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I wonder if any of these is the problem.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Ben&lt;/P&gt;</description>
      <pubDate>Sat, 21 May 2016 15:02:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41142#M29164</guid>
      <dc:creator>benassi</dc:creator>
      <dc:date>2016-05-21T15:02:54Z</dc:date>
    </item>
    <item>
      <title>Re: CDH 5.7.0 Spark Streaming S3 Error</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41143#M29165</link>
      <description>&lt;P&gt;I got it to work. I found the solution in another thread: the way to access S3 has changed. Set the fs.s3a.* keys and use an s3a:// URI.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;val hadoopConf = sc.hadoopConfiguration&lt;BR /&gt;hadoopConf.set("fs.s3a.access.key", accessKey)&lt;BR /&gt;hadoopConf.set("fs.s3a.secret.key", secretKey)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;val lines = ssc.textFileStream("s3a://amg-events-out/")&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Cheers,&lt;/P&gt;&lt;P&gt;Ben&lt;/P&gt;</description>
      <pubDate>Sat, 21 May 2016 15:30:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41143#M29165</guid>
      <dc:creator>benassi</dc:creator>
      <dc:date>2016-05-21T15:30:47Z</dc:date>
    </item>
    <item>
      <title>Re: CDH 5.7.0 Spark Streaming S3 Error</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41148#M29166</link>
      <description>&lt;P&gt;Yes, you will certainly need to provide access keys for S3 access to work. I don't think (?) that would be a solution to a VerifyError, which is a much lower-level error indicating corrupted builds. And yes, it's expected that the AWS SDK dependencies were updated along with the new Spark version in CDH 5.7. I think the current version should depend on jets3t 0.9, which is the one you want.&lt;/P&gt;</description>
      <pubDate>Sat, 21 May 2016 17:42:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/CDH-5-7-0-Spark-Streaming-S3-Error/m-p/41148#M29166</guid>
      <dc:creator>srowen</dc:creator>
      <dc:date>2016-05-21T17:42:18Z</dc:date>
    </item>
  </channel>
</rss>

