Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

Unable to use lzo codec

avatar
Expert Contributor

I'm trying to get LZO compression to work on our HDP 2.3.2 cluster and getting nowhere. Here's what I've done:

- Installed the hadooplzo and hadoop-lzo-native RPMs

- Made the documented changes to add the codec and the lzo class spec to core-site.xml

When I try to run a job thusly:

yarn jar /usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-lzo-0.6.0.2.3.2.0-2950.jar com.hadoop.compression.lzo.LzoIndexer /path/to/lzofiles

It tells me:

[hirschs@sees24-lin ~]$ yarn jar /usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-lzo-0.6.0.2.3.2.0-2950.jar com.hadoop.compression.lzo.LzoIndexer /xxxx/yyy
16/10/28 16:44:56 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
        at java.lang.Runtime.loadLibrary0(Runtime.java:849)
        at java.lang.System.loadLibrary(System.java:1088)
        at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
        at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
        at com.hadoop.compression.lzo.LzoIndexer.<init>(LzoIndexer.java:36)
        at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:134)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
16/10/28 16:44:56 ERROR lzo.LzoCodec: Cannot load native-lzo without native-hadoop
16/10/28 16:44:57 INFO lzo.LzoIndexer: LZO Indexing directory /xxxxx/yyyyy...
16/10/28 16:44:57 INFO lzo.LzoIndexer:   [INDEX] LZO Indexing file hdfs://correct_path_to_file, size 1.08 GB...
16/10/28 16:44:57 INFO compress.LzoCodec: Bridging org.apache.hadoop.io.compress.LzoCodec to com.hadoop.compression.lzo.LzoCodec.
16/10/28 16:44:57 ERROR lzo.LzoIndexer: Error indexing hdfs://correct_path_to_file
java.io.IOException: Could not find codec for file hdfs://correct_path_to_file - you may need to add the LZO codec to your io.compression.codecs configuration in core-site.xml
        at com.hadoop.compression.lzo.LzoIndex.createIndex(LzoIndex.java:212)
        at com.hadoop.compression.lzo.LzoIndexer.indexSingleFile(LzoIndexer.java:117)
        at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:98)
        at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:86)
        at com.hadoop.compression.lzo.LzoIndexer.index(LzoIndexer.java:52)
        at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:137)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

I get the feeling I'm missing a step somewhere. The shared libraries appear to be in place:

[hirschs@sees24-lin native]$ rpm -ql hadoop-lzo-native
/usr/hdp/current/share/lzo/0.6.0/lib/native
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.a
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.la
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so.0
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so.0.0.0
/usr/hdp/current/share/lzo/0.6.0/lib/native/docs

In core-site.xml:

    <property>
      <name>io.compression.codecs</name>
      <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec,com.hadoop.compression.lzo.LzoCodec</value>
    </property>

In hdfs-site.xml:

   <property>
      <name>io.compression.codec.lzo.class</name>
      <value>com.hadoop.compression.lzo.LzoCodec</value>
    </property>

What more do I need to do in order for this to run?

Even a guess would be helpful at this point.

1 ACCEPTED SOLUTION

avatar
Expert Contributor

I was missing com.hadoop.compression.lzo.LzopCodec in the compression codecs listing... Grrr. The error message proved to be utterly misleading.

View solution in original post

1 REPLY 1

avatar
Expert Contributor

I was missing com.hadoop.compression.lzo.LzopCodec in the compression codecs listing... Grrr. The error message proved to be utterly misleading.