Apache Hive: how to ignore the header and footer

Hi,

I have data in HDFS that is output by Pig. The data is partitioned first by date and then by cust_segment. Each file under a segment has a header and a footer. I want to load this data into Hive while ignoring the header and footer. I used 'org.apache.hadoop.hive.serde2.OpenCSVSerde' to remove the header. Is there a similar SerDe that removes both the header and the footer?

Or could you suggest an approach to remove the footer? Both the header and the footer are a single line.

Thank you.

1 REPLY

@Revathy Mourouguessane

In your Hive table properties, you can set skip.footer.line.count to skip the footer lines in your data. For a one-line footer, set this value to 1. Specify it in the TBLPROPERTIES clause of your CREATE TABLE statement, alongside skip.header.line.count for the header:

tblproperties("skip.header.line.count"="1", "skip.footer.line.count"="1");
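
For context, here is a rough sketch of where that clause sits in a full statement. The table name, column names, partition column names, and HDFS path are all made up for illustration, so adjust them to match your actual Pig output. Note that OpenCSVSerde reads every column as STRING.

```sql
-- Hypothetical table, columns, and location; only the SerDe class and
-- the two skip.*.line.count properties come from this thread.
CREATE EXTERNAL TABLE cust_data (
  cust_id STRING,
  amount  STRING
)
PARTITIONED BY (load_date STRING, cust_segment STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION '/data/pig_output'
TBLPROPERTIES (
  "skip.header.line.count" = "1",  -- drop the first line of each file
  "skip.footer.line.count" = "1"   -- drop the last line of each file
);
```

Since the table is partitioned, the partitions still have to be registered before a query returns any rows, for example with ALTER TABLE ... ADD PARTITION, or with MSCK REPAIR TABLE if the directories follow the key=value naming convention.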