Unable to use lzo codec

I'm trying to get LZO compression to work on our HDP 2.3.2 cluster and getting nowhere. Here's what I've done:

- Installed the hadooplzo and hadoop-lzo-native RPMs

- Made the documented changes to add the codec and the lzo class spec to core-site.xml

When I try to run a job thusly:

yarn jar /usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-lzo-0.6.0.2.3.2.0-2950.jar com.hadoop.compression.lzo.LzoIndexer /path/to/lzofiles

It tells me:

[hirschs@sees24-lin ~]$ yarn jar /usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-lzo-0.6.0.2.3.2.0-2950.jar com.hadoop.compression.lzo.LzoIndexer /xxxx/yyy
16/10/28 16:44:56 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
        at java.lang.Runtime.loadLibrary0(Runtime.java:849)
        at java.lang.System.loadLibrary(System.java:1088)
        at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
        at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
        at com.hadoop.compression.lzo.LzoIndexer.<init>(LzoIndexer.java:36)
        at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:134)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
16/10/28 16:44:56 ERROR lzo.LzoCodec: Cannot load native-lzo without native-hadoop
16/10/28 16:44:57 INFO lzo.LzoIndexer: LZO Indexing directory /xxxxx/yyyyy...
16/10/28 16:44:57 INFO lzo.LzoIndexer:   [INDEX] LZO Indexing file hdfs://correct_path_to_file, size 1.08 GB...
16/10/28 16:44:57 INFO compress.LzoCodec: Bridging org.apache.hadoop.io.compress.LzoCodec to com.hadoop.compression.lzo.LzoCodec.
16/10/28 16:44:57 ERROR lzo.LzoIndexer: Error indexing hdfs://correct_path_to_file
java.io.IOException: Could not find codec for file hdfs://correct_path_to_file - you may need to add the LZO codec to your io.compression.codecs configuration in core-site.xml
        at com.hadoop.compression.lzo.LzoIndex.createIndex(LzoIndex.java:212)
        at com.hadoop.compression.lzo.LzoIndexer.indexSingleFile(LzoIndexer.java:117)
        at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:98)
        at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:86)
        at com.hadoop.compression.lzo.LzoIndexer.index(LzoIndexer.java:52)
        at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:137)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

I get the feeling I'm missing a step somewhere. The shared libraries appear to be in place:

[hirschs@sees24-lin native]$ rpm -ql hadoop-lzo-native
/usr/hdp/current/share/lzo/0.6.0/lib/native
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.a
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.la
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so.0
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so.0.0.0
/usr/hdp/current/share/lzo/0.6.0/lib/native/docs

In core-site.xml:

    <property>
      <name>io.compression.codecs</name>
      <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec,com.hadoop.compression.lzo.LzoCodec</value>
    </property>

In hdfs-site.xml:

    <property>
      <name>io.compression.codec.lzo.class</name>
      <value>com.hadoop.compression.lzo.LzoCodec</value>
    </property>

What more do I need to do in order for this to run?

Even a guess would be helpful at this point.

1 ACCEPTED SOLUTION

I was missing com.hadoop.compression.lzo.LzopCodec in the io.compression.codecs listing in core-site.xml. Only LzoCodec was listed. Grrr. The error message proved to be utterly misleading.
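
For anyone landing here later, this is a sketch of what the corrected property might look like. It assumes the codec list from the question is otherwise unchanged and that LzopCodec is simply appended to it; adjust to match your own list. As far as I understand it, the indexer looks up a codec by file extension, and in hadoop-lzo the .lzo extension belongs to LzopCodec while LzoCodec registers .lzo_deflate, which would explain why a list containing only LzoCodec finds no codec for .lzo files.

```xml
<!-- core-site.xml: same list as in the question, with LzopCodec appended (assumed placement) -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
```

With that in place, rerunning the LzoIndexer command from the question should get past the "Could not find codec" error.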
