Apache Hive: how to ignore the header and footer

Hi,

I have data in HDFS that was output by Pig. The data is partitioned first by date and then by cust_segment. Each file under a segment has a header and a footer. I want to load this data into Hive, ignoring the header and footer. I got 'org.apache.hadoop.hive.serde2.OpenCSVSerde' to remove the header. Is there a similar SerDe that removes both the header and the footer?

Or could you suggest an approach to remove the footer? It is a single line. The header is also a single line.

Thank you.

1 REPLY

@Revathy Mourouguessane

In your Hive table properties, you can set skip.footer.line.count to skip the footer lines in your data. If you have a one-line footer, set this value to 1. Specify it in the table properties of your CREATE TABLE statement, alongside skip.header.line.count for the header:

tblproperties("skip.header.line.count"="1", "skip.footer.line.count"="1");
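
For context, here is a rough sketch of where that clause could sit in a full table definition. The table name, column names, and HDFS paths are hypothetical placeholders, since the original post does not give them; only the SerDe class and the two table properties come from this thread. It assumes an external table over the existing Pig output, with each partition registered by hand because the directories may not follow Hive's key=value naming.

```sql
-- Hypothetical names and paths throughout; adjust to the real layout.
CREATE EXTERNAL TABLE cust_data (
  cust_id STRING,
  amount  STRING   -- OpenCSVSerde exposes every column as STRING
)
PARTITIONED BY (dt STRING, cust_segment STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION '/data/pig_output'
TBLPROPERTIES (
  "skip.header.line.count" = "1",  -- skip the single header line
  "skip.footer.line.count" = "1"   -- skip the single footer line
);

-- Register one partition per date/segment directory (path is an assumption).
ALTER TABLE cust_data ADD PARTITION (dt = '2017-01-01', cust_segment = 'retail')
LOCATION '/data/pig_output/2017-01-01/retail';
```

Because OpenCSVSerde reads every column as a string, you would probably need to cast numeric columns in your queries or in a view on top of this table.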