Created 12-11-2015 11:41 AM
I sometimes insert data into a Hive table in two ways: with Hive and with Hive on Tez. The HDFS output files are twice as large when using Hive on Tez, so they take up more HDFS space. Is there any configuration that reduces the size?
Created 12-11-2015 02:50 PM
Have you looked into CompressedStorage features on Hive?
You should be able to use this (for Snappy at least):
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;
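As a rough sketch of how these settings might be applied in one session before an insert on Tez: the table names `target_table` and `source_table` are hypothetical, and whether this closes the size gap depends on the table's storage format, which the thread does not state.

```sql
-- Run the insert on Tez (the engine the question is about).
SET hive.execution.engine=tez;

-- Settings from the answer: compress the final job output with Snappy.
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;

-- Printing a property without a value shows what the session is actually using.
SET hive.exec.compress.output;

-- Hypothetical tables, for illustration only.
INSERT OVERWRITE TABLE target_table
SELECT * FROM source_table;
```

Running `SET hive.exec.compress.output;` in both a Hive session and a Hive on Tez session may show whether the two are picking up different values, which would be one possible explanation for the size difference.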
Created 02-03-2016 03:46 PM
@Jun Chen, are you still having issues with this? Can you accept the best answer or provide your own solution?