Our Hadoop environment continues to grow rapidly, and I'm constantly approving server purchases. Data older than about 6 months isn't useful for what we're doing, but I'm not ready to delete it yet. I've been researching 'hadoop archive' and was planning to buy a JBOD to dump the old data onto, but I'm curious how others are handling their older data in HDFS. Is there a better approach?
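For reference, a minimal sketch of the 'hadoop archive' approach mentioned above. The paths and archive name here are hypothetical; `hadoop archive` launches a MapReduce job, so it assumes a running cluster. Note that HAR files reduce NameNode metadata pressure by packing many small files together, but they do not compress the data.

```shell
# Sketch: pack cold subdirectories into a HAR file (paths are hypothetical)
hadoop archive -archiveName logs-old.har -p /data/logs 2013-01 2013-02 /archive

# The archived files stay readable through the har:// scheme:
hdfs dfs -ls har:///archive/logs-old.har

# The originals are NOT removed automatically; delete them only after verifying:
hdfs dfs -rm -r /data/logs/2013-01 /data/logs/2013-02
```

Since HAR doesn't compress, it mainly buys back NameNode heap, not disk; pairing it with compression (or compressing instead of archiving) is what saves actual space.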
Please go through the links below; they may be useful in your case (converting existing data to a compressed format):
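As a rough illustration of the recompression idea, one common pattern is an identity (map-only) Hadoop streaming job with output compression enabled. This is a sketch under assumptions: the paths are hypothetical, and the streaming jar location varies by distribution.

```shell
# Sketch: rewrite existing text data compressed with gzip (paths hypothetical)
# An identity map-only job: 'cat' passes records through unchanged,
# and the output format compresses what it writes.
hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
  -Dmapreduce.output.fileoutputformat.compress=true \
  -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec \
  -Dmapreduce.job.reduces=0 \
  -mapper cat \
  -input /data/logs/2013-01 \
  -output /data/logs-gz/2013-01
```

Once the compressed copy is verified, the uncompressed originals can be deleted. For data that will still be queried, a splittable format (e.g. SequenceFile or block-compressed files) is usually preferable to plain gzip, which is not splittable.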