
Apache Hive: how to ignore the header and footer



I have data in HDFS that is output by Pig. The data is partitioned first by date and then by cust_segment. Each file under a segment has a header and a footer. I want to load this data into Hive, ignoring the header and footer. I got 'org.apache.hadoop.hive.serde2.OpenCSVSerde' to remove the header. Is there a similar SerDe that removes both the header and the footer?

Or could you suggest an approach to remove the footer? It is a single line. The header is also a single line.

Thank you.


@Revathy Mourouguessane

In your Hive table properties, you can specify skip.footer.line.count to skip the footer in your data. If you have just a one-line footer, set this value to 1. Specify it in the TBLPROPERTIES clause of your CREATE TABLE statement, alongside skip.header.line.count for the header:

tblproperties("skip.header.line.count"="1", "skip.footer.line.count"="1");
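
For context, here is a rough sketch of how those properties might fit into a full table definition over the Pig output. The table name, column names, partition column names, and HDFS paths are all made up for illustration, so adjust them to your actual layout. Note that OpenCSVSerde treats every column as STRING, and that the skip properties apply to text files.

```sql
-- Hypothetical table, columns, and HDFS location; replace with your own.
CREATE EXTERNAL TABLE cust_data (
  cust_id STRING,
  amount  STRING
)
-- Partition columns mirror the date / cust_segment directory layout.
-- load_date is an assumed name (DATE is a reserved word in Hive).
PARTITIONED BY (load_date STRING, cust_segment STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION '/data/pig_output'
TBLPROPERTIES (
  "skip.header.line.count" = "1",  -- drop the first line of each file
  "skip.footer.line.count" = "1"   -- drop the last line of each file
);

-- Register one existing partition directory (hypothetical values and path).
ALTER TABLE cust_data ADD PARTITION (load_date='2016-01-01', cust_segment='retail')
LOCATION '/data/pig_output/2016-01-01/retail';
```

After adding the partitions, a quick `SELECT * FROM cust_data LIMIT 10;` on one partition should show whether the header and footer rows are gone.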