Since ACID transactions are not supported with the Parquet format in Hive, what restrictions does Parquet have that ORC doesn't?

1 ACCEPTED SOLUTION

@I1095

Check this blog. You have a detailed comparison between ORC and Parquet.

Other than that, there are very few differences in terms of use case. That said, I believe most future improvements are being developed around ORC.

1. Many of the performance improvements delivered by the Stinger initiative depend on features of the ORC format, including a block-level index for each column. This can make I/O more efficient, because Hive can skip reading entire blocks of data when it determines that the predicate values are not present in them. The Cost Based Optimizer can also use the column-level metadata in ORC files to generate the most efficient execution graph.

2. ACID transactions are only possible when using ORC as the file format.
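
To make the block-skipping idea in point 1 more concrete, here is a minimal Python sketch. It is not Hive or ORC code: `Block`, `make_block` and `scan_equals` are hypothetical names made up for illustration, and the only assumption is that the writer stores a min/max entry per block so the reader can rule a block out without touching its data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical block of one column: index entry (min/max) plus the data."""
    min_value: int
    max_value: int
    values: List[int]


def make_block(values: List[int]) -> Block:
    # The statistics are computed once, at write time, and stored with the block.
    return Block(min(values), max(values), values)


def scan_equals(blocks: List[Block], target: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of blocks actually read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # Index check only: a target outside [min, max] cannot be in this block,
        # so the block's data is never read.
        if target < block.min_value or target > block.max_value:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if v == target)
    return matches, blocks_read


blocks = [make_block([1, 5, 9]), make_block([20, 25, 30]), make_block([41, 42, 42, 50])]
matches, blocks_read = scan_equals(blocks, 42)
print(matches, blocks_read)  # [42, 42] 1
```

Only one of the three blocks is read for the predicate `= 42`. The other two are eliminated from their min/max entries alone, which is the kind of I/O saving described above.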

3 REPLIES

Hope it helps! If so, please accept it as the best answer.

Thanks....