
Is it possible to filter files for split calculation when launching a Hive query?

New Contributor

I have an external Hive table, partitioned by date, into which new files keep being written. When I run a query on this table, I want to exclude certain files, i.e. keep them out of the InputSplits computed when the job is launched. I wrote my own InputFormat that skips files matching a certain pattern during split calculation, but it had no effect. Is there a way to achieve this?

1 REPLY

Re: Is it possible to filter files for split calculation when launching a Hive query?

New Contributor

My bad: the split calculation works as expected. The problem was a class-loading conflict at runtime, because I had named my InputFormat org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat, and a class with that name already exists in the hive-exec jar.
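
For anyone who runs into the same symptom, here is a rough sketch of how a name clash like this could be checked with plain JDK calls. The ClassOrigin helper and its method names are made up for illustration and are not part of Hive or Hadoop. It assumes it is run with the same classpath as the job (hive-exec plus the custom jar), and it only reports which jar a class resolves to and how many copies are visible.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic helper (not a Hive or Hadoop class).
final class ClassOrigin {

    /** Returns the jar or directory a loaded class came from. */
    static String locate(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // JDK classes loaded by the bootstrap loader have no code source.
            return "(bootstrap or unknown location)";
        }
        return source.getLocation().toString();
    }

    /** Resolves a class by name and reports where the winning copy lives. */
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running static initializers.
            ClassLoader loader = Thread.currentThread().getContextClassLoader();
            return locate(Class.forName(className, false, loader));
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    /** Lists every classpath entry that contains a .class file with this name. */
    static List<String> findAllCopies(String className) {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        List<String> found = new ArrayList<>();
        try {
            Enumeration<URL> urls = (loader != null)
                    ? loader.getResources(resource)
                    : ClassLoader.getSystemResources(resource);
            while (urls.hasMoreElements()) {
                found.add(urls.nextElement().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat";
        System.out.println("resolved from: " + locate(name));
        // More than one entry here means two jars ship the same class name.
        for (String copy : findAllCopies(name)) {
            System.out.println("copy found at: " + copy);
        }
    }
}
```

If more than one copy is listed, giving the custom InputFormat its own package and class name avoids the clash.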