<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Spark 2.0 with GPLEXTRAS in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-2-0-with-GPLEXTRAS/m-p/47264#M45284</link>
    <description>&lt;P&gt;I am happy to see that the upgrade resolved your issue. Best of luck as you continue with the project. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 10 Nov 2016 13:14:36 GMT</pubDate>
    <dc:creator>cjervis</dc:creator>
    <dc:date>2016-11-10T13:14:36Z</dc:date>
    <item>
      <title>Spark 2.0 with GPLEXTRAS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-2-0-with-GPLEXTRAS/m-p/47043#M45282</link>
      <description>&lt;P&gt;Good Afternoon,&lt;/P&gt;
&lt;P&gt;We're trying the Spark 2.0 beta on a cluster running CDH 5.9 with GPLEXTRAS deployed. Under Spark 1.6 we haven't noticed any problems, but with 2.0, reading text files through the RDD interface fails, apparently because the LZO JARs and native libraries (from GPLEXTRAS) aren't on the classpath. For example:&lt;/P&gt;
&lt;PRE&gt;scala&amp;gt; sc.textFile("/any/path").count()
java.lang.RuntimeException: Error in configuring object
  at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
  at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
  at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
  at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:185)
  at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:198)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:1933)
  at org.apache.spark.rdd.RDD.count(RDD.scala:1128)
  ... 48 elided
Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
  ... 63 more
Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
  at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:135)
  at org.apache.hadoop.io.compress.CompressionCodecFactory.&amp;lt;init&amp;gt;(CompressionCodecFactory.java:175)
  at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
  ... 68 more
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
  at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2105)
  at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128)
  ... 70 more&lt;/PRE&gt;
&lt;P&gt;I'm a bit stumped on how to tackle this. The &lt;FONT face="courier new,courier"&gt;SPARK_EXTRA_LIB_PATH&lt;/FONT&gt; and &lt;FONT face="courier new,courier"&gt;SPARK_DIST_CLASSPATH&lt;/FONT&gt; environment variables look like the place to fix it, but they seem to be managed by Cloudera Manager (CM).&lt;/P&gt;
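&lt;P&gt;To narrow this down, a small check along the following lines could be pasted into the Spark shell. It is only a sketch: &lt;FONT face="courier new,courier"&gt;isLoadable&lt;/FONT&gt; is a hypothetical helper, not a Spark or Hadoop API, and it assumes that the driver's context class loader and environment reflect what the failing job sees, which may not hold on every deployment.&lt;/P&gt;

```scala
import scala.util.Try

// Class name copied from the stack trace above.
val codecClass = "com.hadoop.compression.lzo.LzoCodec"

// Hypothetical helper (not a Spark or Hadoop API): true when the class can be
// found through the current thread's context class loader. Passing `false`
// skips static initialisation, so this check does not load native libraries.
def isLoadable(name: String): Boolean =
  Try(Class.forName(name, false, Thread.currentThread().getContextClassLoader)).isSuccess

println(s"$codecClass loadable: ${isLoadable(codecClass)}")

// The two environment variables mentioned above, as seen by the driver JVM
// (printed as None when a variable is not set).
Seq("SPARK_DIST_CLASSPATH", "SPARK_EXTRA_LIB_PATH").foreach { name =>
  println(s"$name = ${sys.env.get(name)}")
}
```

&lt;P&gt;If the class is reported as not loadable and neither variable mentions the GPLEXTRAS directories, that would be consistent with the JARs missing from the classpath. It does not by itself show how CM builds those values.&lt;/P&gt;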
&lt;P&gt;Any ideas?&lt;/P&gt;
&lt;P&gt;Thanks in advance,&lt;/P&gt;
&lt;P&gt;- Andrew&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:46:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-2-0-with-GPLEXTRAS/m-p/47043#M45282</guid>
      <dc:creator>Former Member</dc:creator>
      <dc:date>2022-09-16T10:46:46Z</dc:date>
    </item>
    <item>
      <title>Re: Spark 2.0 with GPLEXTRAS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-2-0-with-GPLEXTRAS/m-p/47261#M45283</link>
      <description>&lt;P&gt;This problem occurred with &lt;FONT face="courier new,courier"&gt;beta1&lt;/FONT&gt;. After upgrading to &lt;FONT face="courier new,courier"&gt;beta2&lt;/FONT&gt;, we no longer see it.&lt;/P&gt;</description>
      <pubDate>Thu, 10 Nov 2016 12:11:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-2-0-with-GPLEXTRAS/m-p/47261#M45283</guid>
      <dc:creator>Former Member</dc:creator>
      <dc:date>2016-11-10T12:11:43Z</dc:date>
    </item>
    <item>
      <title>Re: Spark 2.0 with GPLEXTRAS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-2-0-with-GPLEXTRAS/m-p/47264#M45284</link>
      <description>&lt;P&gt;I am happy to see that the upgrade resolved your issue. Best of luck as you continue with the project. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 10 Nov 2016 13:14:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-2-0-with-GPLEXTRAS/m-p/47264#M45284</guid>
      <dc:creator>cjervis</dc:creator>
      <dc:date>2016-11-10T13:14:36Z</dc:date>
    </item>
  </channel>
</rss>

