
Spark saveAsTextFile with SnappyCodec on YARN getting: "native snappy library not available: this version of libhadoop was built without snappy support."

Expert Contributor

Hello,

I have a Spark job that is run via spark-submit, and at the end it saves some output using SparkContext saveAsTextFile with SnappyCodec. For testing I am using the HDP 2.3.4 sandbox.

This runs fine with master = local[x], after following the suggestion in this thread. However, once I change to master = yarn-client, I get the same error on YARN: "native snappy library not available: this version of libhadoop was built without snappy support."

I thought I had done all the necessary setup - any suggestions welcome!
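For context, the save call in question looks roughly like the sketch below. This is a hypothetical minimal example, not the original job: the RDD contents and output path are placeholders.

```scala
import org.apache.hadoop.io.compress.SnappyCodec

// Minimal sketch of saving an RDD as Snappy-compressed text.
// "sc" is an existing SparkContext; the data and path are placeholders.
val rdd = sc.parallelize(Seq("line1", "line2"))
rdd.saveAsTextFile("/tmp/output-snappy", classOf[SnappyCodec])
```

In local mode this only needs the native snappy library on the driver machine; in yarn-client mode the executors on the cluster nodes must be able to load it too, which is why the error can appear only after switching masters.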

When I check the Spark History server I can see the following in the environment:

Spark properties: (screenshots attached)

System properties: (screenshot attached)

Furthermore, in Ambari -> YARN -> Configs -> Advanced yarn-env -> yarn-env template, LD_LIBRARY_PATH is set: (screenshot attached)

Anything else I could do to make snappy available as a compression codec on YARN?

Thank you.

1 ACCEPTED SOLUTION

Can you check whether the snappy lib is installed on the cluster nodes, using the command 'hadoop checknative'?


5 REPLIES

Can you check whether the snappy lib is installed on the cluster nodes, using the command 'hadoop checknative'?
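For anyone unfamiliar with it, a typical invocation is sketched below. The output shown in the comment is illustrative of the command's format on an HDP install, not taken from this cluster; exact library paths vary by version and OS.

```shell
# List all native libraries this Hadoop build can load, with their paths.
# On a healthy node you want to see snappy reported as "true", e.g. a line like:
#   snappy:  true /usr/hdp/<version>/hadoop/lib/native/libsnappy.so.1
hadoop checknative -a
```

If snappy shows as "false" on any NodeManager host, executors scheduled there will hit this error regardless of the Spark configuration.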

Expert Contributor

Yes it is - as I said, I was able to save in local mode. (screenshot of 'hadoop checknative' output attached)

Can you add the following parameters to your Spark conf and see if it works?

  spark.executor.extraClassPath /usr/hdp/current/hadoop-client/lib/snappy-java-*.jar 
  spark.executor.extraLibraryPath /usr/hdp/current/hadoop-client/native 
  spark.executor.extraJavaOptions -Djava.library.path=/usr/hdp/current/hadoop-client/lib/native/lib
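These properties can go in spark-defaults.conf, or be passed on the command line. A hypothetical spark-submit invocation is sketched below; the jar, class name, and wildcard-expanded snappy jar path are placeholders, and the library paths are the ones suggested above.

```shell
# Illustrative spark-submit with the suggested executor settings passed via --conf.
# Class and jar names are placeholders, not from the original post.
spark-submit \
  --master yarn-client \
  --conf spark.executor.extraLibraryPath=/usr/hdp/current/hadoop-client/native \
  --conf spark.executor.extraJavaOptions=-Djava.library.path=/usr/hdp/current/hadoop-client/lib/native/lib \
  --class com.example.MyJob \
  my-job.jar
```

Note that spark.executor.* settings affect only the executors; in yarn-client mode the driver runs locally and picks up the native library path from the local environment, which is why local[x] can work while YARN executors fail.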

Expert Contributor

Thanks @Rajkumar Singh. I tried mapred.child.java.opts and a few of the mapreduce settings suggested by @Jitendra Yadav, but in the end just adding spark.executor.extraLibraryPath did the job.

(screenshot of the updated Spark configuration in Ambari attached)

New Contributor

spark.executor.extraLibraryPath did the job for me too. Thanks @David Tam and @Jitendra Yadav!
