PutHDFS unable to write Snappy compressed avro files

I've been trying to write Snappy-compressed Avro files to HDFS using PutHDFS, but I keep getting the error shown at the bottom of this post.

I know that Snappy compression isn't enabled in NiFi and I followed the instructions from this post:

https://community.hortonworks.com/articles/71719/using-snappy-and-other-compressions-with-nifi-hdfs....

I've done the following:

mkdir /usr/hdf/3.2.0.0-520/nifi/lib/compression

cp -r /usr/hdp/3.0.0.0-1634/hadoop/lib/* /usr/hdf/3.2.0.0-520/nifi/lib/compression/

Then edited bootstrap.conf in Ambari

# Add compression codecs

java.arg.19=-Djava.library.path=/usr/hdf/3.2.0.0-520/nifi/lib/compression/

java.arg.20=-Djava.library.path=/usr/hdf/3.2.0.0-520/nifi/lib/compression/native/

Restarted NiFi
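One thing worth checking after editing bootstrap.conf: if java.arg.19 and java.arg.20 are both active, the JVM is handed two -Djava.library.path definitions, and typically only the last one on the command line takes effect. It is also possible that a java.arg number is already used elsewhere in the file. A rough sketch of a check follows; the function names are made up for illustration, and the path to bootstrap.conf is whatever your install uses.

```shell
#!/bin/sh
# Hypothetical helper: print every java.arg.N line that sets java.library.path.
library_path_args() {
    grep -E '^[[:space:]]*java\.arg\.[0-9]+=-Djava\.library\.path=' "$1"
}

# Hypothetical helper: print java.arg.N keys that are defined more than once.
duplicate_arg_keys() {
    grep -E '^[[:space:]]*java\.arg\.[0-9]+=' "$1" \
        | sed -E 's/^[[:space:]]*([^=]+)=.*/\1/' \
        | sort \
        | uniq -d
}

# Usage (the path is an assumption; adjust to your install):
# library_path_args  /path/to/nifi/conf/bootstrap.conf
# duplicate_arg_keys /path/to/nifi/conf/bootstrap.conf
```

If more than one java.library.path line shows up, a single argument with the directories joined by a colon (the Linux path separator) avoids the ambiguity.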

I've tried both java.arg.19 and 20 separately and still get the error.
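To narrow down which of the two directories is the right one for java.library.path, it may help to confirm where the native libraries actually ended up after the copy. A minimal sketch, assuming the usual Linux file names libhadoop.so* and libsnappy.so* (the function name is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: list Hadoop/Snappy native libraries found directly
# inside the given directory (no recursion), one file name per line.
list_native_libs() {
    dir="$1"
    # libhadoop.so* and libsnappy.so* are assumed names for the native libs.
    find "$dir" -maxdepth 1 \
        \( -name 'libhadoop.so*' -o -name 'libsnappy.so*' \) \
        -exec basename {} \; | sort
}

# Usage with the two directories from this post:
# list_native_libs /usr/hdf/3.2.0.0-520/nifi/lib/compression
# list_native_libs /usr/hdf/3.2.0.0-520/nifi/lib/compression/native
```

Whichever directory prints both libraries is the one java.library.path needs to point at. On a node with the Hadoop client installed, running hadoop checknative -a should also report whether that Hadoop build sees Snappy.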

I am using the latest HDP and HDF. Can someone please provide instructions for enabling Snappy compression in NiFi? Also, do other components such as Hive and Spark require special steps to enable Snappy compression? This really should be enabled out of the box.

ERROR [Timer-Driven Process Thread-8] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=991432da-a2cd-1c5$ java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
    at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
    at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63)
    at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:136)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:150)
    at org.apache.hadoop.io.compress.CompressionCodec$Util.createOutputStreamWithCodecPool(CompressionCodec.java:131)
    at org.apache.hadoop.io.compress.SnappyCodec.createOutputStream(SnappyCodec.java:102)
    at org.apache.nifi.processors.hadoop.PutHDFS$1$1.process(PutHDFS.java:312)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2211)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2179)
    at org.apache.nifi.processors.hadoop.PutHDFS$1.run(PutHDFS.java:299)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1942)
    at org.apache.nifi.processors.hadoop.PutHDFS.onTrigger(PutHDFS.java:229)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)