06-30-2018 03:04 PM
07-01-2018 01:00 AM - edited 07-01-2018 01:03 AM
Is storage is more important that scan read in your requirement , Then yes you can perfom compression on it because LZ4 is the best in my opinion interms of compression / performance when compared with snappy or zlib.
whats the cardinality on the string column ? low or high ?
07-03-2018 07:31 AM
If the cardinality is not high I would vote for dictionary encoding without compression. The dict encoding will save you a lot of space. However I dont know the internals of Kudu, where is the limit, how much distinct value can be in the dictionary.