Created on 10-28-2016 09:01 PM - edited 09-16-2022 03:46 AM
I'm trying to get LZO compression to work on our HDP 2.3.2 cluster and getting nowhere. Here's what I've done:
- Installed the hadooplzo and hadoop-lzo-native RPMs
- Made the documented changes to add the codec and the lzo class spec to core-site.xml
When I try to run a job thusly:
yarn jar /usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-lzo-0.6.0.2.3.2.0-2950.jar com.hadoop.compression.lzo.LzoIndexer /path/to/lzofiles
It tells me:
[hirschs@sees24-lin ~]$ yarn jar /usr/hdp/2.3.2.0-2950/hadoop/lib/hadoop-lzo-0.6.0.2.3.2.0-2950.jar com.hadoop.compression.lzo.LzoIndexer /xxxx/yyy
16/10/28 16:44:56 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
    at java.lang.Runtime.loadLibrary0(Runtime.java:849)
    at java.lang.System.loadLibrary(System.java:1088)
    at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
    at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
    at com.hadoop.compression.lzo.LzoIndexer.<init>(LzoIndexer.java:36)
    at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:134)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
16/10/28 16:44:56 ERROR lzo.LzoCodec: Cannot load native-lzo without native-hadoop
16/10/28 16:44:57 INFO lzo.LzoIndexer: LZO Indexing directory /xxxxx/yyyyy...
16/10/28 16:44:57 INFO lzo.LzoIndexer: [INDEX] LZO Indexing file hdfs://correct_path_to_file, size 1.08 GB...
16/10/28 16:44:57 INFO compress.LzoCodec: Bridging org.apache.hadoop.io.compress.LzoCodec to com.hadoop.compression.lzo.LzoCodec.
16/10/28 16:44:57 ERROR lzo.LzoIndexer: Error indexing hdfs://correct_path_to_file
java.io.IOException: Could not find codec for file hdfs://correct_path_to_file - you may need to add the LZO codec to your io.compression.codecs configuration in core-site.xml
    at com.hadoop.compression.lzo.LzoIndex.createIndex(LzoIndex.java:212)
    at com.hadoop.compression.lzo.LzoIndexer.indexSingleFile(LzoIndexer.java:117)
    at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:98)
    at com.hadoop.compression.lzo.LzoIndexer.indexInternal(LzoIndexer.java:86)
    at com.hadoop.compression.lzo.LzoIndexer.index(LzoIndexer.java:52)
    at com.hadoop.compression.lzo.LzoIndexer.main(LzoIndexer.java:137)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
I get the feeling I'm missing a step somewhere. The shared libraries appear to be in place:
[hirschs@sees24-lin native]$ rpm -ql hadoop-lzo-native
/usr/hdp/current/share/lzo/0.6.0/lib/native
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.a
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.la
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so.0
/usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/libgplcompression.so.0.0.0
/usr/hdp/current/share/lzo/0.6.0/lib/native/docs
In core-site.xml:
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec,com.hadoop.compression.lzo.LzoCodec</value>
</property>
In hdfs-site.xml:
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
What more do I need to do in order for this to run?
Even a guess would be helpful at this point.
Created 10-31-2016 07:12 PM
I was missing com.hadoop.compression.lzo.LzopCodec in the compression codecs listing... Grrr. The error message proved to be utterly misleading.
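For anyone hitting the same error, here is a sketch of what the corrected core-site.xml entry might look like. It assumes the codec list from the question is otherwise unchanged and simply appends LzopCodec at the end; the position in the list is my choice for illustration, not something the fix depends on as far as I know. My understanding is that hadoop-lzo registers LzopCodec for the .lzo extension (LzoCodec uses .lzo_deflate), which would explain why the indexer could not find a codec for the files even though LzoCodec was listed.

```xml
<!-- core-site.xml: same list as in the question, with LzopCodec appended (assumed placement) -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
```

Services that read core-site.xml will likely need a restart (or a config push, if the cluster is managed through Ambari) before the new list takes effect.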