<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: spark2 upgrade to 2.3.0 from 2.2.0 won't read or write snappy compressed parquet in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66947#M77676</link>
    <description>&lt;P&gt;Can you please confirm that you are using spark-shell as opposed to spark-submit?&lt;/P&gt;</description>
    <pubDate>Thu, 03 May 2018 21:48:10 GMT</pubDate>
    <dc:creator>xhadoop</dc:creator>
    <dc:date>2018-05-03T21:48:10Z</dc:date>
    <item>
      <title>spark2 upgrade to 2.3.0 from 2.2.0 won't read or write snappy compressed parquet</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66735#M77674</link>
      <description>&lt;P&gt;This seems like a really simple case of wrong classpaths and the wrong version of snappy being pulled in, but I can't work out how or why!&lt;/P&gt;&lt;P&gt;I have an upgraded CDH 5.14.2 cluster (upgraded from 5.13.2) running Spark 2.3.0 (upgraded from 2.2.0 cloudera2, including the CSD).&lt;/P&gt;&lt;P&gt;I do the following in a spark2-shell:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;val df = spark.read.parquet("/inputdata/parquet/parquet*snappy*")&lt;/P&gt;&lt;P&gt;scala&amp;gt; df.show&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And then I just get the following error. Any ideas on how to debug or fix this?&lt;/P&gt;&lt;P&gt;I did all this on an offline platform, but have now replicated it on a fresh online install.&lt;/P&gt;&lt;P&gt;Help!&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;WARN scheduler.TaskSetManager: Lost task 0.0 in stage 3.0 (TID 4, fozzy, executor 1): java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.uncompressedLength(Ljava/nio/ByteBuffer;II)I&lt;BR /&gt;at org.xerial.snappy.SnappyNative.uncompressedLength(Native Method)&lt;BR /&gt;at org.xerial.snappy.Snappy.uncompressedLength(Snappy.java:561)&lt;BR /&gt;at parquet.hadoop.codec.SnappyDecompressor.decompress(SnappyDecompressor.java:62)&lt;BR /&gt;at parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:51)&lt;BR /&gt;at java.io.DataInputStream.readFully(DataInputStream.java:195)&lt;BR /&gt;at java.io.DataInputStream.readFully(DataInputStream.java:169)&lt;BR /&gt;at parquet.bytes.BytesInput$StreamBytesInput.toByteArray(BytesInput.java:204)&lt;BR /&gt;at parquet.column.values.dictionary.PlainValuesDictionary$PlainIntegerDictionary.&amp;lt;init&amp;gt;(PlainValuesDictionary.java:237)&lt;BR /&gt;at parquet.column.Encoding$1.initDictionary(Encoding.java:100)&lt;BR /&gt;at parquet.column.Encoding$4.initDictionary(Encoding.java:149)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.&amp;lt;init&amp;gt;(VectorizedColumnReader.java:114)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:312)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:258)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:161)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:106)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:182)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:106)&lt;BR /&gt;at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.scan_nextBatch$(Unknown Source)&lt;BR /&gt;at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)&lt;BR /&gt;at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)&lt;BR /&gt;at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)&lt;BR /&gt;at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)&lt;BR /&gt;at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)&lt;BR /&gt;at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)&lt;BR /&gt;at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)&lt;BR /&gt;at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)&lt;BR /&gt;at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)&lt;BR /&gt;at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)&lt;BR /&gt;at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)&lt;BR /&gt;at org.apache.spark.scheduler.Task.run(Task.scala:109)&lt;BR /&gt;at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:748)&lt;/P&gt;&lt;P&gt;18/04/25 21:11:12 ERROR scheduler.TaskSetManager: Task 0 in stage 3.0 failed 4 times; aborting job&lt;BR /&gt;org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3.0 (TID 7, fozzy, executor 1): java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.uncompressedLength(Ljava/nio/ByteBuffer;II)I&lt;BR /&gt;at org.xerial.snappy.SnappyNative.uncompressedLength(Native Method)&lt;BR /&gt;at org.xerial.snappy.Snappy.uncompressedLength(Snappy.java:561)&lt;BR /&gt;at parquet.hadoop.codec.SnappyDecompressor.decompress(SnappyDecompressor.java:62)&lt;BR /&gt;at parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:51)&lt;BR /&gt;at java.io.DataInputStream.readFully(DataInputStream.java:195)&lt;BR /&gt;at java.io.DataInputStream.readFully(DataInputStream.java:169)&lt;BR /&gt;at parquet.bytes.BytesInput$StreamBytesInput.toByteArray(BytesInput.java:204)&lt;BR /&gt;at parquet.column.values.dictionary.PlainValuesDictionary$PlainIntegerDictionary.&amp;lt;init&amp;gt;(PlainValuesDictionary.java:237)&lt;BR /&gt;at parquet.column.Encoding$1.initDictionary(Encoding.java:100)&lt;BR /&gt;at parquet.column.Encoding$4.initDictionary(Encoding.java:149)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.&amp;lt;init&amp;gt;(VectorizedColumnReader.java:114)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:312)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:258)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:161)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:106)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:182)&lt;BR /&gt;at 
org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:106)&lt;BR /&gt;at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.scan_nextBatch$(Unknown Source)&lt;BR /&gt;at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)&lt;BR /&gt;at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)&lt;BR /&gt;at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)&lt;BR /&gt;at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)&lt;BR /&gt;at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)&lt;BR /&gt;at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)&lt;BR /&gt;at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)&lt;BR /&gt;at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)&lt;BR /&gt;at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)&lt;BR /&gt;at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)&lt;BR /&gt;at org.apache.spark.scheduler.Task.run(Task.scala:109)&lt;BR /&gt;at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:748)&lt;/P&gt;&lt;P&gt;Driver stacktrace:&lt;BR /&gt;at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1599)&lt;BR /&gt;at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1587)&lt;BR /&gt;at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1586)&lt;BR /&gt;at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)&lt;BR /&gt;at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)&lt;BR /&gt;at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1586)&lt;BR /&gt;at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)&lt;BR /&gt;at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)&lt;BR /&gt;at scala.Option.foreach(Option.scala:257)&lt;BR /&gt;at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)&lt;BR /&gt;at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1820)&lt;BR /&gt;at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1769)&lt;BR /&gt;at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1758)&lt;BR /&gt;at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)&lt;BR /&gt;at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)&lt;BR /&gt;at org.apache.spark.SparkContext.runJob(SparkContext.scala:2027)&lt;BR /&gt;at 
org.apache.spark.SparkContext.runJob(SparkContext.scala:2048)&lt;BR /&gt;at org.apache.spark.SparkContext.runJob(SparkContext.scala:2067)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:363)&lt;BR /&gt;at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)&lt;BR /&gt;at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3272)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2484)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2484)&lt;BR /&gt;at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3253)&lt;BR /&gt;at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)&lt;BR /&gt;at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252)&lt;BR /&gt;at org.apache.spark.sql.Dataset.head(Dataset.scala:2484)&lt;BR /&gt;at org.apache.spark.sql.Dataset.take(Dataset.scala:2698)&lt;BR /&gt;at org.apache.spark.sql.Dataset.showString(Dataset.scala:254)&lt;BR /&gt;at org.apache.spark.sql.Dataset.show(Dataset.scala:723)&lt;BR /&gt;at org.apache.spark.sql.Dataset.show(Dataset.scala:682)&lt;BR /&gt;at org.apache.spark.sql.Dataset.show(Dataset.scala:691)&lt;BR /&gt;... 49 elided&lt;BR /&gt;Caused by: java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.uncompressedLength(Ljava/nio/ByteBuffer;II)I&lt;BR /&gt;at org.xerial.snappy.SnappyNative.uncompressedLength(Native Method)&lt;BR /&gt;at org.xerial.snappy.Snappy.uncompressedLength(Snappy.java:561)&lt;BR /&gt;at parquet.hadoop.codec.SnappyDecompressor.decompress(SnappyDecompressor.java:62)&lt;BR /&gt;at parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:51)&lt;BR /&gt;at java.io.DataInputStream.readFully(DataInputStream.java:195)&lt;BR /&gt;at java.io.DataInputStream.readFully(DataInputStream.java:169)&lt;BR /&gt;at parquet.bytes.BytesInput$StreamBytesInput.toByteArray(BytesInput.java:204)&lt;BR /&gt;at parquet.column.values.dictionary.PlainValuesDictionary$PlainIntegerDictionary.&amp;lt;init&amp;gt;(PlainValuesDictionary.java:237)&lt;BR /&gt;at parquet.column.Encoding$1.initDictionary(Encoding.java:100)&lt;BR /&gt;at parquet.column.Encoding$4.initDictionary(Encoding.java:149)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedColumnReader.&amp;lt;init&amp;gt;(VectorizedColumnReader.java:114)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:312)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:258)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.nextKeyValue(VectorizedParquetRecordReader.java:161)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:106)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:182)&lt;BR /&gt;at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:106)&lt;BR /&gt;at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.scan_nextBatch$(Unknown Source)&lt;BR /&gt;at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)&lt;BR /&gt;at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)&lt;BR /&gt;at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)&lt;BR /&gt;at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)&lt;BR /&gt;at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)&lt;BR /&gt;at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)&lt;BR /&gt;at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)&lt;BR /&gt;at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)&lt;BR /&gt;at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)&lt;BR /&gt;at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)&lt;BR /&gt;at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)&lt;BR /&gt;at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)&lt;BR /&gt;at org.apache.spark.scheduler.Task.run(Task.scala:109)&lt;BR /&gt;at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)&lt;BR /&gt;at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)&lt;BR /&gt;at java.lang.Thread.run(Thread.java:748)&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 13:08:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66735#M77674</guid>
      <dc:creator>moopoo</dc:creator>
      <dc:date>2022-09-16T13:08:42Z</dc:date>
    </item>
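    <!--
      For the "how to debug" question above, a minimal sketch of checking which
      snappy-java actually wins on the classpath, assuming the default CDH
      parcel layout (paths may differ on your install):

        # list the snappy-java jar(s) the Spark2 parcel actually ships
        ls -l /opt/cloudera/parcels/SPARK2/lib/spark2/jars/ | grep -i snappy

        # inside spark2-shell, confirm which jar the JVM loaded the Snappy classes from
        scala> println(Class.forName("org.xerial.snappy.SnappyNative").getProtectionDomain.getCodeSource.getLocation)

      If the jar printed on the driver differs from the version in the parcel
      jars directory on the executor hosts, that mismatch would explain the
      UnsatisfiedLinkError above.
    -->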
    <item>
      <title>Re: spark2 upgrade to 2.3.0 from 2.2.0 won't read or write snappy compressed parquet</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66942#M77675</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Support fixed it for me:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;The workaround for this is the following:&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Copy snappy-java-1.1.4.jar to /opt/cloudera/parcels/SPARK2/lib/spark2/jars/ on each node where executors run. The jar can be downloaded from &lt;A href="http://repo1.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.4/snappy-java-1.1.4.jar" target="_blank"&gt;http://repo1.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.4/snappy-java-1.1.4.jar&lt;/A&gt;.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Tested and confirmed working.&lt;/DIV&gt;</description>
      <pubDate>Thu, 03 May 2018 15:01:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66942#M77675</guid>
      <dc:creator>ChrisV</dc:creator>
      <dc:date>2018-05-03T15:01:43Z</dc:date>
    </item>
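    <!--
      A minimal sketch of the copy step described above, assuming passwordless
      SSH and the default parcel path; the worker hostnames are placeholders:

        # fetch the newer snappy-java once, then push it to every executor node
        wget http://repo1.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.4/snappy-java-1.1.4.jar
        for host in worker1 worker2 worker3; do
          scp snappy-java-1.1.4.jar "$host:/opt/cloudera/parcels/SPARK2/lib/spark2/jars/"
        done
    -->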
    <item>
      <title>Re: spark2 upgrade to 2.3.0 from 2.2.0 won't read or write snappy compressed parquet</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66947#M77676</link>
      <description>&lt;P&gt;Can you please confirm that you are using spark-shell as opposed to spark-submit?&lt;/P&gt;</description>
      <pubDate>Thu, 03 May 2018 21:48:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66947#M77676</guid>
      <dc:creator>xhadoop</dc:creator>
      <dc:date>2018-05-03T21:48:10Z</dc:date>
    </item>
    <item>
      <title>Re: spark2 upgrade to 2.3.0 from 2.2.0 won't read or write snappy compressed parquet</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66948#M77677</link>
      <description>&lt;P&gt;Hiya, yes it was spark2-shell.&lt;/P&gt;&lt;P&gt;The answer ChrisV has given is the fix. I'm surprised the Spark 2.3.0 parcel didn't have this included!&lt;/P&gt;</description>
      <pubDate>Thu, 03 May 2018 22:06:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66948#M77677</guid>
      <dc:creator>moopoo</dc:creator>
      <dc:date>2018-05-03T22:06:43Z</dc:date>
    </item>
    <item>
      <title>Re: spark2 upgrade to 2.3.0 from 2.2.0 won't read or write snappy compressed parquet</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66949#M77678</link>
      <description>Perfect! Thank you Chris, that definitely solved it. I thought I had caused it, but given this fix, the Spark 2.3.0 parcel is evidently just incomplete.</description>
      <pubDate>Thu, 03 May 2018 22:05:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66949#M77678</guid>
      <dc:creator>moopoo</dc:creator>
      <dc:date>2018-05-03T22:05:02Z</dc:date>
    </item>
    <item>
      <title>Re: spark2 upgrade to 2.3.0 from 2.2.0 won't read or write snappy compressed parquet</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66967#M77679</link>
      <description>&lt;P&gt;Glad I could help.&lt;/P&gt;</description>
      <pubDate>Fri, 04 May 2018 10:49:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/66967#M77679</guid>
      <dc:creator>ChrisV</dc:creator>
      <dc:date>2018-05-04T10:49:39Z</dc:date>
    </item>
    <item>
      <title>Re: spark2 upgrade to 2.3.0 from 2.2.0 won't read or write snappy compressed parquet</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/67642#M77680</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This also works without copying the jar to all nodes:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;wget http://repo1.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.4/snappy-java-1.1.4.jar
spark2-shell --jars snappy-java-1.1.4.jar --conf spark.executor.userClassPathFirst=true --conf spark.executor.extraClassPath=snappy-java-1.1.4.jar&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 24 May 2018 13:23:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/67642#M77680</guid>
      <dc:creator>Hersh</dc:creator>
      <dc:date>2018-05-24T13:23:44Z</dc:date>
    </item>
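    <!--
      Why the relative path in the command above works: the jars option ships
      snappy-java-1.1.4.jar into each executor's working directory, and
      userClassPathFirst plus extraClassPath make that copy win over the older
      snappy on the cluster classpath. A sketch of the same workaround made
      permanent via spark-defaults.conf, assuming the jar has been copied to
      every node at the path below (extraClassPath entries are prepended to
      the executor classpath, so the newer jar takes precedence):

        spark.executor.userClassPathFirst true
        spark.executor.extraClassPath     /opt/cloudera/parcels/SPARK2/lib/spark2/jars/snappy-java-1.1.4.jar
    -->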
    <item>
      <title>Re: spark2 upgrade to 2.3.0 from 2.2.0 won't read or write snappy compressed parquet</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/78034#M77681</link>
      <description>&lt;P&gt;According to &lt;A href="https://www.cloudera.com/documentation/spark2/latest/topics/spark2_known_issues.html#KI_spark2_CDH-67889" target="_blank"&gt;https://www.cloudera.com/documentation/spark2/latest/topics/spark2_known_issues.html#KI_spark2_CDH-67889&lt;/A&gt;, the resolution is to upgrade to CDS 2.3 Release 3, which contains the fix.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Aug 2018 18:36:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/spark2-upgrade-to-2-3-0-from-2-2-0-wont-read-or-write-snappy/m-p/78034#M77681</guid>
      <dc:creator>cappaberra</dc:creator>
      <dc:date>2018-08-02T18:36:35Z</dc:date>
    </item>
  </channel>
</rss>