
Hi, I am planning to take the HDPCD certification exam this week. In the practice exam on Amazon Web Services, the flight_delays1.csv file contains data with a header. Do I need to remove the header manually in the exam?

New Contributor
@rich

Hi, I am planning to take the HDPCD certification exam this week. In the practice exam on Amazon Web Services, the flight_delays1.csv file contains data with a header. Do I need to remove the header manually in the exam?

1 ACCEPTED SOLUTION

@Ramesh Raja

In the exam you may or may not be required to remove the header.

It is better to know how to do it, so that you are comfortable either way.

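To skip the header in Hive, set the skip.header.line.count table property:

-- ROW FORMAT is added here because the input is a comma-separated file
CREATE TABLE test (
  name STRING,
  email STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
TBLPROPERTIES ("skip.header.line.count"="1");

-- Now load the data into the table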

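To remove the header in Pig:

-- This works when the first column is numeric with values greater than 1:
-- the header text cannot be cast to a number, so the comparison is null and the row is dropped.
A = LOAD 'data.csv' USING PigStorage(',');
B = FILTER A BY $0 > 1;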
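
The filter on $0 depends on what the first column contains. As a rough sketch, here are two alternatives that do not depend on a numeric first column. The header value 'Year' and the piggybank jar path are assumptions for illustration only, so substitute whatever your file and cluster actually use.

```pig
-- Option 1: drop the row whose first field equals the header's first column name.
-- 'Year' is an assumed header value, not taken from the exam data.
-- Note: rows whose first field is null are dropped as well.
raw = LOAD 'flight_delays1.csv' USING PigStorage(',');
no_header = FILTER raw BY (chararray)$0 != 'Year';

-- Option 2: let piggybank's CSVExcelStorage skip the header line while loading.
-- The jar location below is a placeholder.
REGISTER /path/to/piggybank.jar;
data = LOAD 'flight_delays1.csv'
       USING org.apache.pig.piggybank.storage.CSVExcelStorage(
           ',', 'NO_MULTILINE', 'UNIX', 'SKIP_INPUT_HEADER');
```

Option 1 needs no extra jar, which may matter if piggybank is not available in the exam environment.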


3 REPLIES

Super Collaborator

I did it the same way: load the data into a bag using Pig, then FILTER out the top row.

Good Luck

@Ramesh Raja -

Please consider accepting the answer if it has helped you.

Thank you.