Aggregate multiple text files into one table in Hive

Hi experts,

I have 100 text files in HDFS and I want to aggregate all of them into one big table in Hive, with the date as the key. How can I load these files into a single Hive table? Thanks!

1 ACCEPTED SOLUTION

@Pedro Rodgers

If the schema is the same across all 100 text files, it is better to create a Hive external table, since the files are already on HDFS.

Example: if all the files are under the "/user/test/dummy/data" directory, run the command below to create the external Hive table and point it at that HDFS location.

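-- `date` is declared only as the partition column: Hive rejects a column
-- that appears in both the column list and PARTITIONED BY. Its value comes
-- from the partition directory name (date=...).
-- Backticks are used because user and date are reserved words in recent Hive versions.
CREATE EXTERNAL TABLE `user`(
  userId BIGINT,
  type INT,
  level TINYINT
)
COMMENT 'User Information'
PARTITIONED BY (`date` STRING)
LOCATION '/user/test/dummy/data';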

Then create the folder date=2011-11-11 inside /user/test/dummy/data/

Put the data files for 2011-11-11 into that folder. Once that is done, you also need to add the partition to the Hive metastore.
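
The folder step can be sketched with the dfs command available inside the Hive CLI (the same operations work from a shell with hdfs dfs). The file name users_2011-11-11.txt is a hypothetical placeholder, since the actual file names are not given in the question.

```sql
-- Create the partition folder under the table location.
dfs -mkdir -p /user/test/dummy/data/date=2011-11-11;

-- Move one day's file into its partition folder.
-- users_2011-11-11.txt is an assumed name; substitute the real file.
dfs -mv /user/test/dummy/data/users_2011-11-11.txt /user/test/dummy/data/date=2011-11-11/;
```

Repeat both commands for each date that has data.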

ALTER TABLE `user` ADD PARTITION (`date`='2011-11-11');
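
With 100 files, adding partitions one at a time gets tedious. As a minimal sketch, assuming every date already has its own date=YYYY-MM-DD folder under the table location, MSCK REPAIR TABLE can register all of those folders in the metastore in one pass. The queries after it are only a suggested way to check the result.

```sql
-- Scan the table location and add any partition folders missing from the metastore.
MSCK REPAIR TABLE `user`;

-- List the partitions Hive now knows about.
SHOW PARTITIONS `user`;

-- Sanity check: row count per date across all loaded files.
SELECT `date`, COUNT(*) AS row_count
FROM `user`
GROUP BY `date`;
```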
