Is it possible to filter files out of split calculation when launching a Hive query?


I have an external Hive table, partitioned by date, to which new files keep being added. When I run a query on this table, I want to exclude certain files so that they are not part of the InputSplits computed when the job is launched. I wrote my own InputFormat that skips files matching a certain pattern during split calculation, but it had no effect. Is there a way to achieve this?
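
For readers following along, the kind of name-based exclusion described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name `SplitFileFilter` and the `.tmp` pattern in the usage note are hypothetical and not taken from the post. In an actual Hadoop InputFormat, a check like this would typically sit inside an `org.apache.hadoop.fs.PathFilter` or an overridden `listStatus`, which this sketch leaves out so that it runs without Hadoop on the classpath.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: decides which file names take part in split calculation.
class SplitFileFilter {
    // Files whose whole name matches this pattern are skipped (assumed pattern).
    private final Pattern exclude;

    SplitFileFilter(String excludeRegex) {
        this.exclude = Pattern.compile(excludeRegex);
    }

    // True when the file should be kept as split input.
    boolean accept(String fileName) {
        return !exclude.matcher(fileName).matches();
    }

    // Returns only the accepted names, preserving the input order.
    List<String> filter(List<String> fileNames) {
        return fileNames.stream()
                .filter(this::accept)
                .collect(Collectors.toList());
    }
}
```

For example, `new SplitFileFilter(".*\\.tmp").filter(names)` would drop every name ending in `.tmp` and keep the rest.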


Re: Is it possible to filter files out of split calculation when launching a Hive query?

My mistake: the split calculation works as expected. The problem was a class-loading conflict at runtime, because I had named my InputFormat org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat, a class that already exists in the hive-exec jar.
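
A conflict like this can be hard to spot, so here is a small diagnostic sketch that reports which jar or directory a class was actually loaded from. It uses only standard JDK calls (`Class.forName`, `getProtectionDomain`, `getCodeSource`); the class name `ClassOrigin` is hypothetical, and whether it helps in a given Hive setup depends on running it with the same classpath as the job.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic: shows where a class on the classpath comes from.
class ClassOrigin {
    // Returns the load location, or null when the JVM does not record one
    // (as is the case for JDK bootstrap classes).
    static URL locate(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? null : src.getLocation();
    }

    // Builds a one-line report for a fully qualified class name.
    static String describe(String className) {
        try {
            URL where = locate(Class.forName(className));
            String origin = (where == null) ? "(bootstrap or unknown)" : where.toString();
            return className + " <- " + origin;
        } catch (ClassNotFoundException e) {
            return className + " <- (not on classpath)";
        }
    }
}
```

Calling `ClassOrigin.describe("org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat")` with the job's classpath should show whether the class resolves to the hive-exec jar or to a custom jar. Giving the custom InputFormat its own package name avoids the clash altogether.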