Do continuous insert and read operations corrupt the ORC file format?

 

Re: Do continuous insert and read operations corrupt the ORC file format?

@sivakumar sudhakarannair girijakumari

Hive has a setting, hive.exec.orc.skip.corrupt.data, that determines whether the reader skips corrupt data or throws an exception. It is false by default, so the default behavior is to throw an exception. In other words, you will know when an ORC file is corrupt, because the query reading it fails instead of silently returning partial results.
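As a minimal sketch, this is how the setting could be inspected and pinned for a session from the Hive CLI or Beeline. It assumes the full property name is hive.exec.orc.skip.corrupt.data, as listed on the ORC Hive configuration page; the table name my_orc_table is hypothetical and only there for illustration.

```sql
-- Print the current value of the property for this session.
SET hive.exec.orc.skip.corrupt.data;

-- Keep the default explicitly: throw an exception on corrupt ORC data
-- rather than skipping it.
SET hive.exec.orc.skip.corrupt.data=false;

-- Any query that actually reads the table's ORC files goes through this check.
-- my_orc_table is a hypothetical table name.
SELECT * FROM my_orc_table LIMIT 10;
```

Setting the value to true would make reads skip corrupt data instead of failing, which can hide a problem, so leaving it at false is usually the safer choice while you investigate.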

Reads won't corrupt an ORC file. An INSERT, UPDATE, DELETE, or DDL statement could potentially do so, but that is speculation: we have no facts to confirm it until a bug is reported.

For a discussion related to your question, see:

https://community.hortonworks.com/questions/23762/help-understanding-corrupt-orc-file-in-hive.html

To learn more about ORC configuration in Hive, see: https://orc.apache.org/docs/hive-config.html