Created on 10-28-2016 09:01 PM - edited 09-16-2022 03:46 AM
I'm trying to get LZO compression to work on our HDP 2.3.2 cluster and getting nowhere. Here's what I've done:
- Installed the hadooplzo and hadoop-lzo-native RPMs
- Made the documented changes to add the codec and the lzo class spec to core-site.xml
When I try to run the indexer like this:
yarn jar /usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-lzo-0.6.0.2.3.2.0-2950.jar com.hadoop.compression.lzo.LzoIndexer /path/to/lzofiles
It tells me:
[hirschs@sees24-lin ~]$ yarn jar /usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-lzo-0.6.0.2.3.2.0-2950.jar com.hadoop.compression.lzo.LzoIndexer /xxxx/yyy
16/10/28 16:44:56 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
at java.lang.Runtime.loadLibrary0(Runtime.java:849)
at java.lang.System.loadLibrary(System.java:1088)
at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
at com.hadoop.compression.lzo.LzoIndexer.<init>(LzoIndexer.java:36)
at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:134)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
16/10/28 16:44:56 ERROR lzo.LzoCodec: Cannot load native-lzo without native-hadoop
16/10/28 16:44:57 INFO lzo.LzoIndexer: LZO Indexing directory /xxxxx/yyyyy...
16/10/28 16:44:57 INFO lzo.LzoIndexer: [INDEX] LZO Indexing file hdfs://correct_path_to_file, size 1.08 GB...
16/10/28 16:44:57 INFO compress.LzoCodec: Bridging org.apache.hadoop.io.compress.LzoCodec to com.hadoop.compression.lzo.LzoCodec.
16/10/28 16:44:57 ERROR lzo.LzoIndexer: Error indexing hdfs://correct_path_to_file
java.io.IOException: Could not find codec for file hdfs://correct_path_to_file - you may need to add the LZO codec to your io.compression.codecs configuration in core-site.xml
at com.hadoop.compression.lzo.LzoIndex.createIndex(LzoIndex.java:212)
at com.hadoop.compression.lzo.LzoIndexer.indexSingleFile(LzoIndexer.java:117)
at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:98)
at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:86)
at com.hadoop.compression.lzo.LzoIndexer.index(LzoIndexer.java:52)
at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:137)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
I get the feeling I'm missing a step somewhere. The shared libraries appear to be in place:
[hirschs@sees24-lin native]$ rpm -ql hadoop-lzo-native
/usr/hdp/current/share/lzo/0.6.0/lib/native
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.a
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.la
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so.0
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so.0.0.0
/usr/hdp/current/share/lzo/0.6.0/lib/native/docs
In core-site.xml:
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec,com.hadoop.compression.lzo.LzoCodec</value>
</property>
In hdfs-site.xml:
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
What more do I need to do in order for this to run?
Even a guess would be helpful at this point.
Created 10-31-2016 07:12 PM
I was missing com.hadoop.compression.lzo.LzopCodec in the io.compression.codecs list in core-site.xml... Grrr. The error message proved to be utterly misleading.
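For anyone who lands here with the same error, this is a sketch of what the corrected property might look like, assuming the rest of the list stays exactly as posted in the question. The only change is the appended com.hadoop.compression.lzo.LzopCodec entry. As far as I can tell, the indexer looks up the codec by file extension, and .lzo files map to LzopCodec rather than LzoCodec, which would explain the "Could not find codec" error even though LzoCodec was already listed.

```xml
<!-- core-site.xml: same codec list as in the question, with LzopCodec appended -->
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
```

The new value has to reach the client configuration on the node where the yarn jar command runs before the indexer will pick it up.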