Why is there no significant change in read speeds with ORC tables?

Explorer

After converting a 10 TB table A into a 6 TB table B using ORC compression, I expected to see faster reads too. But even after running numerous JOIN, SELECT, and COUNT queries, the performance is either the same or worse with ORC. What could be the reason?

2 REPLIES

1. ORC is a columnar file format.
2. With ORC, queries that select only a subset of columns see a performance improvement; queries that read whole rows get no benefit.
3. Is column pruning enabled in the tool you are using? If not, the tool fetches whole rows and prunes the unneeded columns in memory, so disk I/O stays the same. With column pruning, only the selected columns are fetched from disk.
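
To make the points above concrete, here is a rough Python sketch of how many bytes a scan has to read under each layout. It is a toy model, not how Hive or ORC is implemented: the column names, per-value sizes, and row count are made up, and it ignores compression, stripe indexes, and everything else a real reader does.

```python
# Toy model: bytes a full scan must read under row vs. columnar layout.
# Column names, per-value sizes, and row count are hypothetical.
COLUMN_BYTES = {"id": 8, "name": 32, "address": 64, "amount": 8}
NUM_ROWS = 1_000_000


def row_layout_bytes(selected):
    # A row-oriented file stores whole rows together, so every column
    # is read from disk no matter which ones the query asks for.
    return NUM_ROWS * sum(COLUMN_BYTES.values())


def columnar_layout_bytes(selected):
    # A columnar file stores each column separately, so a reader that
    # prunes columns reads only the selected ones.
    return NUM_ROWS * sum(COLUMN_BYTES[c] for c in selected)


all_cols = list(COLUMN_BYTES)

# Whole-row query (like SELECT *): both layouts read the same amount.
print(row_layout_bytes(all_cols), columnar_layout_bytes(all_cols))

# Single-column query (like SELECT amount): columnar reads far less.
print(row_layout_bytes(["amount"]), columnar_layout_bytes(["amount"]))
```

In this model the whole-row query reads the same number of bytes in both layouts, while the single-column query reads only the bytes of that one column in the columnar layout. If the reader does not prune columns, the columnar case behaves like the row case.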

Explorer

Can you tell me more about column pruning and how to enable it? I'm using Hive 0.13.