this version of libhadoop was built without snappy support.

Contributor

Hi,

I hope this is the right place to ask the following question 🙂

I'm trying to put a file into HDFS with Snappy compression. I wrote some Java code for that, and when I run it on my cluster I get the following exception:

Exception in thread "main" java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
    at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
    at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:134)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:150)
    at org.apache.hadoop.io.compress.CompressionCodec$Util.createOutputStreamWithCodecPool(CompressionCodec.java:131)
    at org.apache.hadoop.io.compress.SnappyCodec.createOutputStream(SnappyCodec.java:99)

Apparently the snappy library is not available... I checked on the OS with the command "rpm -qa | less | grep snappy", and both snappy and snappy-devel are present.

In the HDFS configuration (core-site.xml), org.apache.hadoop.io.compress.SnappyCodec is listed in the io.compression.codecs property.
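To narrow it down, a check like the following can be run on the cluster to ask the native loader directly whether the libhadoop that actually gets loaded was built with Snappy support (a minimal diagnostic sketch; the class name is just for illustration, not part of my code):

import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.util.NativeCodeLoader;

// Prints whether libhadoop.so is loadable and whether that build includes Snappy.
public class SnappyCheck {

  public static void main(String[] args) {
    boolean nativeLoaded = NativeCodeLoader.isNativeCodeLoaded();
    System.out.println("libhadoop loaded: " + nativeLoaded);
    if (nativeLoaded) {
      // buildSupportsSnappy() is a native call, so only ask once libhadoop is loaded.
      System.out.println("built with snappy: " + NativeCodeLoader.buildSupportsSnappy());
    }
    System.out.println("SnappyCodec usable: " + SnappyCodec.isNativeCodeLoaded());
  }
}

Running it with the cluster's hadoop classpath makes it pick up the same native directory as the failing job.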

Does anyone have an idea why it's not working?

Thanks in advance

1 ACCEPTED SOLUTION

Contributor

The problem is solved by making the following change in the Spark config:

(attached screenshot: 2840-cparkconfig.jpg)

Thanks for the help guys!


18 REPLIES

Mentor
@Michel Sumbul

Please post your code.

Contributor

Here's the piece of code:

// Build the destination path and open an HDFS output stream.
Path outFile = new Path(destPathFolder.toString() + "/" + listFolder[i].getName() + "_" + listFiles[b].getName() + ".txt");
FSDataOutputStream fin = dfs.create(outFile);

// Ask for the Snappy codec and wrap the HDFS stream in a compressing stream.
Configuration conf = new Configuration();
conf.setBoolean("mapreduce.map.output.compress", true);
conf.set("mapreduce.map.output.compress.codec", "org.apache.hadoop.io.compress.SnappyCodec");
CompressionCodecFactory codecFactory = new CompressionCodecFactory(conf);
CompressionCodec codec = codecFactory.getCodecByName("SnappyCodec");
CompressionOutputStream compressedOutput = codec.createOutputStream(fin);

// Copy the local file line by line into the compressed HDFS stream.
FileReader input = new FileReader(listFiles[b]);
BufferedReader bufRead = new BufferedReader(input);
String myLine = null;
while ((myLine = bufRead.readLine()) != null) {
    if (!myLine.isEmpty()) {
        compressedOutput.write(myLine.getBytes());
        compressedOutput.write('\n');
    }
}
bufRead.close();
compressedOutput.flush();
compressedOutput.close();

Mentor
@Michel Sumbul

I don't remember setting CompressionCodecFactory explicitly. Just let the configuration do its magic, so remove the following:

CompressionCodecFactory codecFactory = new CompressionCodecFactory(conf);
CompressionCodec codec = codecFactory.getCodecByName("SnappyCodec");
CompressionOutputStream compressedOutput = codec.createOutputStream(fin);

and

compressedOutput.write(myLine.getBytes());
compressedOutput.write('\n');
compressedOutput.flush();
compressedOutput.close();
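
If it helps, here is a rough sketch of what that configuration-driven approach could look like, resolving the codec from the file suffix via CompressionCodecFactory.getCodec(Path) instead of hard-coding SnappyCodec (the class name, paths and file names are placeholders, not from the original code):

import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.CompressionOutputStream;

public class ConfigDrivenSnappyWrite {

  public static void main(String[] args) throws Exception {
    // Picks up core-site.xml, including the io.compression.codecs list.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path outFile = new Path("/tmp/example.txt.snappy");

    // The factory maps the ".snappy" suffix to SnappyCodec from the configured codec list.
    CompressionCodecFactory factory = new CompressionCodecFactory(conf);
    CompressionCodec codec = factory.getCodec(outFile);
    if (codec == null) {
      throw new IllegalStateException("no codec registered for " + outFile);
    }

    try (InputStream in = new FileInputStream("local-input.txt");
         CompressionOutputStream out = codec.createOutputStream(fs.create(outFile))) {
      IOUtils.copyBytes(in, out, 4096, false);
      out.finish(); // write the codec trailer before the stream is closed
    }
  }
}

Note that this still goes through SnappyCodec.createOutputStream(), so the native snappy library has to be loadable either way.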

Contributor

Hi Artem,

Thanks for the fast reply. I don't really understand how it will work without the following:

compressedOutput.write(myLine.getBytes());
compressedOutput.write('\n');
compressedOutput.flush();
compressedOutput.close();

How will it write to HDFS? And if I remove the first part, when will the configuration be used?

Can you give me an example? Because I don't see how it works without the part that you mention :s

Thanks in advance

Mentor
@Michel Sumbul

It's been a while; here's an example from the Hadoop: The Definitive Guide book:

// From "Hadoop: The Definitive Guide": compresses stdin to stdout with the codec
// class named on the command line (e.g. org.apache.hadoop.io.compress.SnappyCodec).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.util.ReflectionUtils;

public class StreamCompressor {

  public static void main(String[] args) throws Exception {
    String codecClassname = args[0];
    Class<?> codecClass = Class.forName(codecClassname);
    Configuration conf = new Configuration();
    CompressionCodec codec = (CompressionCodec)
      ReflectionUtils.newInstance(codecClass, conf);

    CompressionOutputStream out = codec.createOutputStream(System.out);
    IOUtils.copyBytes(System.in, out, 4096, false);
    out.finish();
  }
}
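
For reference, the book drives this class from the command line by piping data through it, with the codec class name as the argument and the class on the Hadoop classpath, along the lines of:

hadoop StreamCompressor org.apache.hadoop.io.compress.SnappyCodec < input.txt > input.txt.snappy

It goes through the same native-library check, so it makes a good minimal reproduction of the error.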

Contributor

@Artem Ervits I just ran the test with the example from the Definitive Guide and I still get exactly the same error:

Exception in thread "main" java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.

Any idea?

Mentor

Can you post your Maven pom?

Mentor

@Michel Sumbul See this thread. The same solution was used in the past by a client.

Contributor

@Neeraj Sabharwal

Thanks for the reply. In my case that's not the solution, because when I run

hadoop checknative -a

I see that the snappy lib reports true, located at /usr/hdp/2.3.4.0-3485/hadoop/lib/native/libsnappy.so.1.

New Contributor

We have the same problem.

> hadoop checknative -a

snappy: true /usr/hdp/2.3.4.0-3485/hadoop/lib/native/libsnappy.so.1

> rpm -qa snappy

snappy-1.1.0-3.el7.x86_64

What else can I check?

Mentor

Please confirm that you have the following property set correctly in hadoop-env.sh

Contributor

@Artem Ervits Which property?

Mentor

New Contributor

I have recompiled Hadoop with Snappy support:

svn checkout http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.5.0

mvn package -Drequire.snappy -Pdist,native,src -DskipTests -Dtar

but got the same exception again...

I have also checked the hadoop-env.sh:

export JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}

Contributor

The problem is solved by making the following change in the Spark config:

(attached screenshot: 2840-cparkconfig.jpg)

Thanks for the help guys!
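
(The attached screenshot isn't reproduced here; judging by the follow-up replies below, the change presumably points spark.driver.extraClassPath and/or spark.executor.extraLibraryPath at the Hadoop native library directory, e.g. /usr/hdp/current/hadoop-client/lib/native/, so the executors can load the Snappy-enabled libhadoop and libsnappy at runtime.)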

Expert Contributor

Just want to add that it seems spark.driver.extraClassPath is not necessary, at least in my case, when I write a Snappy-compressed file from Spark using:

rdd.saveAsTextFile(path, SnappyCodec.class)
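
For completeness, a minimal sketch of that approach using the Java API (app name, output path and sample data are placeholders):

import java.util.Arrays;

import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SnappySaveExample {

  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("SnappySaveExample");
    JavaSparkContext sc = new JavaSparkContext(conf);
    try {
      JavaRDD<String> rdd = sc.parallelize(Arrays.asList("line one", "line two"));
      // Spark hands the codec class to the underlying Hadoop output format,
      // so the executors still need a Snappy-enabled libhadoop on their library path.
      rdd.saveAsTextFile("/tmp/snappy-out", SnappyCodec.class);
    } finally {
      sc.stop();
    }
  }
}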

Explorer

For me, adding the line below to spark-defaults.conf helped, based on the packages installed on my test cluster.

spark.executor.extraLibraryPath /usr/hdp/current/hadoop-client/lib/native/:/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/