After converting a 10 TB table A into a 6 TB table B using ORC with compression, I expected reads to be faster as well. But after running numerous JOIN, SELECT, and COUNT queries, performance on the ORC table is either the same or worse. What could be the reason?
1. ORC is a columnar file format.
2. With ORC, selecting only a subset of columns gives a performance improvement. If your queries read the whole row (for example SELECT *), expect little or no benefit.
3. Is column pruning enabled in the tool you are using? If not, the tool fetches whole rows and drops the columns it does not need in memory, so disk I/O stays the same. With column pruning, only the required columns are fetched from disk.
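To make points 2 and 3 concrete, here is a minimal toy sketch in Python. It is not ORC itself: it ignores compression, stripes, and indexes, and the column names and helper functions are hypothetical. It only compares how many bytes must be read to answer a one-column query from a row-oriented layout versus a column-oriented one.

```python
import struct

# Hypothetical toy table: three 8-byte integer columns, 1000 rows.
COLUMNS = ["id", "amount", "region"]
ROWS = [(i, i * 10, i % 4) for i in range(1000)]


def encode_row_oriented(rows):
    # Values of one row are stored next to each other.
    return b"".join(struct.pack("<3q", *row) for row in rows)


def encode_column_oriented(rows):
    # Values of one column are stored next to each other, one chunk per column.
    return {
        name: struct.pack(f"<{len(rows)}q", *(row[i] for row in rows))
        for i, name in enumerate(COLUMNS)
    }


def read_row_oriented(blob, wanted):
    # No pruning possible at the I/O level: the whole blob is scanned and
    # unwanted columns are dropped in memory.
    idx = [COLUMNS.index(c) for c in wanted]
    values = [tuple(row[i] for i in idx) for row in struct.iter_unpack("<3q", blob)]
    return values, len(blob)


def read_column_oriented(chunks, wanted):
    # Column pruning: only the chunks of the requested columns are touched.
    bytes_read = sum(len(chunks[c]) for c in wanted)
    cols = [struct.unpack(f"<{len(chunks[c]) // 8}q", chunks[c]) for c in wanted]
    return list(zip(*cols)), bytes_read


row_blob = encode_row_oriented(ROWS)
col_chunks = encode_column_oriented(ROWS)

# One column requested: the columnar layout reads a third of the bytes.
row_vals, row_bytes = read_row_oriented(row_blob, ["amount"])
col_vals, col_bytes = read_column_oriented(col_chunks, ["amount"])

# All columns requested (like SELECT *): both layouts read the same amount.
_, row_bytes_all = read_row_oriented(row_blob, COLUMNS)
_, col_bytes_all = read_column_oriented(col_chunks, COLUMNS)

print(row_bytes, col_bytes, row_bytes_all, col_bytes_all)
```

In this sketch the two layouts return identical values, and the columnar one only saves I/O when the query names fewer columns than the table has. That matches the answer above: whole-row queries see no gain from the columnar layout.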