
Spark saveAsTextFile with SnappyCodec on YARN getting: "native snappy library not available: this version of libhadoop was built without snappy support."

Super Collaborator

Hello,

I have a Spark job that is run via spark-submit and, at the end, saves some output using RDD.saveAsTextFile with SnappyCodec. For testing I am using the HDP 2.3.4 sandbox.
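For reference, the save call looks roughly like this (a minimal sketch; the path and data are placeholders, not from my actual job):

  // Minimal sketch: write an RDD as Snappy-compressed text (placeholder path/data)
  import org.apache.hadoop.io.compress.SnappyCodec
  val data = sc.parallelize(Seq("line1", "line2"))
  data.saveAsTextFile("hdfs:///tmp/snappy-out", classOf[SnappyCodec])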

This runs fine with master = local[x] after following the suggestion in this thread. However, once I change to master = yarn-client, I get the same "native snappy library not available: this version of libhadoop was built without snappy support." error on YARN.

I thought I had done all the necessary setup, so any suggestions are welcome!

When I check the Spark History Server I can see the following in the environment:

Spark properties:

[screenshot: Spark properties (1 of 2)]

[screenshot: Spark properties (2 of 2)]

System properties:

[screenshot: system properties]

Furthermore, in Ambari -> YARN -> Configs -> Advanced yarn-env -> yarn-env template, LD_LIBRARY_PATH is set:

[screenshot: yarn-env template showing LD_LIBRARY_PATH]
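For reference, the relevant line in the template is along these lines (illustrative; the exact value may differ per install):

  # Illustrative yarn-env line putting the Hadoop native libs on the loader path
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/hdp/current/hadoop-client/lib/native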

Is there anything else I could do to make Snappy available as a compression codec on YARN?

Thank you.

1 ACCEPTED SOLUTION

Super Guru

Can you check whether the Snappy library is installed on the cluster nodes using the command 'hadoop checknative'?

View solution in original post

5 REPLIES

Super Guru

Can you check whether the Snappy library is installed on the cluster nodes using the command 'hadoop checknative'?
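i.e. run this on each node (sample output abbreviated and illustrative; paths vary by install):

  $ hadoop checknative -a
  Native library checking:
  hadoop:  true /usr/hdp/current/hadoop-client/lib/native/libhadoop.so
  snappy:  true /usr/hdp/current/hadoop-client/lib/native/libsnappy.so.1

If the snappy line reports 'false', the native library is missing on that node.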

Super Collaborator

Yes, it is. As I said, I was able to save in local mode.

[screenshot: 'hadoop checknative' output]

Super Guru

Can you add the following parameters to your Spark conf and see if it works?

  spark.executor.extraClassPath /usr/hdp/current/hadoop-client/lib/snappy-java-*.jar 
  spark.executor.extraLibraryPath /usr/hdp/current/hadoop-client/native 
  spark.executor.extraJavaOptions -Djava.library.path=/usr/hdp/current/hadoop-client/lib/native/lib
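
For example, passed at submit time (a sketch; the class and jar names are placeholders, and the values mirror the ones above):

  spark-submit --master yarn-client \
    --conf spark.executor.extraClassPath=/usr/hdp/current/hadoop-client/lib/snappy-java-*.jar \
    --conf spark.executor.extraLibraryPath=/usr/hdp/current/hadoop-client/native \
    --conf "spark.executor.extraJavaOptions=-Djava.library.path=/usr/hdp/current/hadoop-client/lib/native/lib" \
    --class com.example.MyJob myjob.jar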

Super Collaborator

Thanks @Rajkumar Singh. I tried mapred.child.java.opts and a few of the mapreduce settings suggested by @Jitendra Yadav, but in the end just adding spark.executor.extraLibraryPath did the job.

[screenshot: Spark config in Ambari]
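So the single line that fixed it, added to the Spark configuration in Ambari (a sketch; the value follows the suggestion above and may differ on your cluster):

  spark.executor.extraLibraryPath /usr/hdp/current/hadoop-client/native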

New Contributor

spark.executor.extraLibraryPath did the job. Thanks, @David Tam and @Jitendra Yadav!