Created 09-14-2017 06:24 AM
Fill in the details...
Created 09-15-2017 06:50 AM
Check this blog. You have a detailed comparison between ORC and Parquet.
Other than that, there are very few differences in terms of use case. However, I believe most future improvements are being developed around ORC.
1. Many of the performance improvements provided in the Stinger initiative depend on features of the ORC format, including a block-level index for each column. This can make I/O more efficient, because Hive can skip reading entire blocks of data if it determines that the predicate values are not present there. The Cost Based Optimizer can also consider the column-level metadata present in ORC files in order to generate the most efficient graph.
2. In Hive, ACID transactions are only possible when using ORC as the file format.
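To make point 1 a bit more concrete, here is a rough sketch of how a per-block min/max index lets a reader skip blocks for an equality predicate. This is only a toy model in plain Python, not ORC's actual file layout or any Hive API; the names `Block`, `build_block` and `scan_equals` are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Toy stand-in for one indexed block of a single integer column."""
    min_value: int      # statistics recorded once, when the block is written
    max_value: int
    values: List[int]   # the actual column data


def build_block(values: List[int]) -> Block:
    """Compute the min/max statistics at write time (hypothetical helper)."""
    return Block(min(values), max(values), list(values))


def scan_equals(blocks: List[Block], target: int) -> Tuple[List[int], int]:
    """Return the matching values and how many blocks had to be read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # Index check: a target outside [min, max] cannot be in this block,
        # so the block's data is never touched.
        if target < block.min_value or target > block.max_value:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if v == target)
    return matches, blocks_read


blocks = [build_block([1, 5, 9]), build_block([20, 25, 30]), build_block([41, 45, 50])]
matches, blocks_read = scan_equals(blocks, 25)
print(matches, blocks_read)  # [25] 1
```

Only one of the three blocks is read for `target = 25`, and none at all for a value such as 35 that falls between blocks. The real savings depend on how the data is sorted and distributed, so treat this as an intuition aid rather than a benchmark.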
Created 09-15-2017 06:51 AM
Hope it helps! If so, please accept it as the best answer!
Created 09-15-2017 09:28 AM
Thanks....