
Please help ! -> ERROR SnappyCompressor: failed to load SnappyCompressor in sparkcontext

Explorer

I get the following error when I create a SparkContext in standalone mode from a Scala class:

import org.apache.spark.SparkConf
import org.apache.spark.ml.PipelineModel
import org.apache.spark.sql.SparkSession

// Spark configuration pointing at the sandbox Hive metastore
val sparkConfig = new SparkConf()
  .setAppName("test")
  .setMaster("local")
  .set("hive.metastore.uris", "thrift://sandbox.hortonworks.com:9083")

val spark = SparkSession.builder()
  .config(sparkConfig)
  .enableHiveSupport()
  .getOrCreate()

// Loading a saved ML PipelineModel from HDFS (path elided)
val model = PipelineModel.load("snappy model path from hdfs")

18/10/03 17:59:28 ERROR SnappyCompressor: failed to load SnappyCompressor
java.lang.NoSuchFieldError: clazz
	at org.apache.hadoop.io.compress.snappy.SnappyCompressor.initIDs(Native Method)
	at org.apache.hadoop.io.compress.snappy.SnappyCompressor.<clinit>(SnappyCompressor.java:57)
	at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:71)
	at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:195)
	at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:181)
	at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:111)
	at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
	at org.apache.spark.rdd.HadoopRDD$anon$1.liftedTree1$1(HadoopRDD.scala:252)
	at org.apache.spark.rdd.HadoopRDD$anon$1.<init>(HadoopRDD.scala:251)
	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:102)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:99)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
18/10/03 17:59:28 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
java.lang.RuntimeException: native snappy library not available: SnappyCompressor has not been loaded.
	at
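
For reference, a quick way to check from the driver whether the native snappy bindings actually loaded (a diagnostic sketch, assuming a Hadoop 2.x client on the classpath; it reproduces the same check that fails in the stack trace above):

import org.apache.hadoop.io.compress.SnappyCodec
import org.apache.hadoop.util.NativeCodeLoader

// true only if libhadoop was found and loaded by this JVM
val nativeLoaded = NativeCodeLoader.isNativeCodeLoaded
println(s"native hadoop loaded:  $nativeLoaded")
// buildSupportsSnappy is itself a native call, so only probe it when libhadoop is loaded
println(s"build supports snappy: ${nativeLoaded && NativeCodeLoader.buildSupportsSnappy}")
// throws the same RuntimeException seen above when native snappy is unusable
SnappyCodec.checkNativeCodeLoaded()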

3 Replies

Re: Please help ! -> ERROR SnappyCompressor: failed to load SnappyCompressor in sparkcontext

Mentor

@sparkhadoop

Yes, you are definitely missing the compression libraries. You will need to install Snappy and LZO on all nodes in the cluster.

For Snappy:

sudo yum install snappy snappy-devel

For LZO:

sudo yum install lzo lzo-devel hadooplzo hadooplzo-native

See the official HWX documentation on installing compression libraries.
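
After installing, you can confirm which native libraries Hadoop actually picks up with its built-in check (run on each node):

hadoop checknative -a

A healthy node should report snappy: true together with the path to the loaded libsnappy.so.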


Re: Please help ! -> ERROR SnappyCompressor: failed to load SnappyCompressor in sparkcontext

Explorer

Thanks @Geoffrey Shelton Okot for the reply!

These are already installed, though. I am using the HDP 2.5 Hortonworks sandbox.


Re: Please help ! -> ERROR SnappyCompressor: failed to load SnappyCompressor in sparkcontext

Mentor

@sparkhadoop

Can you look for this property under HDFS --> Configs --> Advanced and set its value to:

io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec

Then restart the services with stale configs and retry.
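
If editing the cluster config is not an option, the same codec list can also be forwarded to the Hadoop configuration from the application side via Spark's standard spark.hadoop.* prefix (a minimal sketch using the codec list above; whether it resolves the error still depends on the native libraries being present):

import org.apache.spark.SparkConf

// Spark strips the spark.hadoop. prefix and passes the rest into the Hadoop Configuration
val sparkConfig = new SparkConf()
  .setAppName("test")
  .setMaster("local")
  .set("spark.hadoop.io.compression.codecs",
       "org.apache.hadoop.io.compress.GzipCodec," +
       "org.apache.hadoop.io.compress.DefaultCodec," +
       "org.apache.hadoop.io.compress.SnappyCodec")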
