Testing Compression ratio in HDFS.

Hi,

How can I compress data that is in HDFS?

I have data in CSV format. I want to convert it to JSON and then compress it, so that I can test different compression codecs, such as:

Default, Gzip, BZip2, Deflate, Snappy, LZ4

I am kind of lost. Can someone please help me?

Thanks

Jay

Re: Testing Compression ratio in HDFS.

You can compress output when using MapReduce. For instance, if you use Hive, these guides cover how to go about it: https://cwiki.apache.org/confluence/display/Hive/CompressedStorage and http://spryinc.com/blog/compression-hive
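
If you just want a rough feel for the ratios before setting up compressed output on the cluster, you could pull a sample file out of HDFS and measure it locally. The following is only a minimal sketch in Python, not something taken from the guides above: the function name `compression_ratios` and the sample data are made up for illustration, and it covers only the codecs that ship with the Python standard library (gzip, bzip2, and zlib's DEFLATE). Snappy and LZ4 are left out because they need third-party packages, and the numbers you get from Hadoop's own codecs may differ somewhat.

```python
import bz2
import gzip
import zlib


def compression_ratios(data: bytes) -> dict:
    """Return original_size / compressed_size for each stdlib codec."""
    codecs = {
        "gzip": gzip.compress,
        "bzip2": bz2.compress,
        "deflate": zlib.compress,  # zlib-wrapped DEFLATE stream
    }
    return {name: len(data) / len(fn(data)) for name, fn in codecs.items()}


if __name__ == "__main__":
    # Hypothetical sample: replace with the bytes of a file copied out of HDFS.
    sample = b'{"id": 1, "name": "example", "value": 3.14}\n' * 10000
    for name, ratio in sorted(compression_ratios(sample).items()):
        print(f"{name}: {ratio:.1f}x")
```

A higher number means better compression. Highly repetitive sample data like the above will exaggerate the ratios, so use a representative slice of your real data.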

Your JSON transformation is a separate objective, however, and perhaps http://ottomata.org/tech/too-many-hive-json-serdes/ will help you proceed on that end.
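
For a small sample, the CSV to JSON step can also be tried outside the cluster. This is a minimal sketch using only the Python standard library, not the SerDe approach from the link above. It assumes the first CSV row is a header, and the function name `csv_to_json_lines` is made up for illustration. It writes one JSON object per line, which is the layout that line-oriented JSON readers generally expect.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text: str) -> str:
    """Convert CSV text (first row = header) to one JSON object per line."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every value stays a string; cast columns yourself if you need numbers.
    return "".join(json.dumps(row) + "\n" for row in reader)


if __name__ == "__main__":
    sample = "id,name\n1,alice\n2,bob\n"
    print(csv_to_json_lines(sample), end="")
```

Keep in mind that JSON repeats the field names in every record, so the uncompressed JSON will usually be larger than the CSV it came from.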