Aggregate multiple text files into one table in Hive
Labels: Apache Hive
Created 06-12-2016 08:53 PM
Hi experts,
I have 100 text files in HDFS and I want to aggregate all of them into one big table in Hive (with the date as the key). How can I load these multiple files into one table created in Hive? Thanks!
Created 06-12-2016 09:10 PM
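If all 100 text files share the same schema, it is better to create a Hive external table, since the files are already on HDFS.
Example: if all the files are under the "/user/test/dummy/data" directory, run the command below to create the external Hive table and point it to that HDFS location. The partition column is declared only in PARTITIONED BY, not in the column list, because Hive rejects a column that appears in both. The names user and date are quoted with backticks because they are reserved words in recent Hive versions. If your fields are separated by something other than Hive's default delimiter, add a matching ROW FORMAT DELIMITED FIELDS TERMINATED BY clause.
CREATE EXTERNAL TABLE `user`( userId BIGINT, type INT, level TINYINT ) COMMENT 'User Information' PARTITIONED BY (`date` STRING) LOCATION '/user/test/dummy/data';
Then create the folder date=2011-11-11 inside /user/test/dummy/data/ and put the data files for 2011-11-11 into that folder. Once that is done, you also need to add the partition to the Hive metastore:
ALTER TABLE `user` ADD PARTITION(`date`='2011-11-11');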
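With 100 files, repeating the folder and partition steps by hand gets tedious, so here is a rough dry-run sketch of how they could be scripted. It only prints the commands instead of running them. The base directory, table name, and partition column are taken from the answer above; the helper names (partition_dir, hdfs_commands, hive_statement) are hypothetical, and how you map each file to its date is up to you.

```shell
#!/bin/sh
# Dry-run sketch: these helpers only PRINT commands; nothing runs on the cluster.
# BASE_DIR, the table name and the partition column come from the answer above.
# The helper names (partition_dir, hdfs_commands, hive_statement) are hypothetical.
BASE_DIR='/user/test/dummy/data'
TABLE='user'

# Print the HDFS directory that holds the partition for one date value ($1).
partition_dir() {
  printf '%s/date=%s\n' "$BASE_DIR" "$1"
}

# Print the HDFS commands that create the partition directory for date $1
# and move one source file ($2) into it.
hdfs_commands() {
  dir=$(partition_dir "$1")
  printf 'hdfs dfs -mkdir -p %s\n' "$dir"
  printf 'hdfs dfs -mv %s %s/\n' "$2" "$dir"
}

# Print the HiveQL statement that registers the partition for date $1
# in the metastore. Backticks are escaped so the here-document emits
# them literally instead of treating them as command substitution.
hive_statement() {
  cat <<EOF
ALTER TABLE \`$TABLE\` ADD IF NOT EXISTS PARTITION (\`date\`='$1');
EOF
}
```

For example, calling hdfs_commands 2011-11-11 with the path of one of your files prints the mkdir and mv commands for that file, and collecting the output of hive_statement for each date into a file lets you run them all at once with hive -f. If the directories already follow the date=... layout, MSCK REPAIR TABLE may also be able to register the missing partitions in one pass.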
