- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
External Hive Partitioned table, is empty!!
- Labels:
-
Apache Hadoop
-
Apache Hive
Created ‎01-10-2017 09:15 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi all,
Im trying to create an external hive partitioned table which location points to an HDFS location.This HDFS location get appended every time i run my spark streaming application, so my hive table appends too.
Kafka >> Spark Streaming >> HDFS >> Hive External Table.
I could do the above flow smoothly with a non partitioned table, but when i want to add partitions to my external table i'm not able to get data in to my hive external table whatsoever.
I have tried the below after creating the external table, but still the problem persists.
<code>CREATE EXTERNAL TABLE user ( userId BIGINT, type INT, level TINYINT, date String ) PARTITIONED BY (date String) LOCATION '/external_table_path';
<code>ALTER TABLE user ADD PARTITION(date='2010-02-22');
I tried the above fix but the things haven't changed.
What is the best work around to get my appending HDFS data in to an external Partition table??
Created ‎01-10-2017 09:26 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Neeraj Sabharwal, your ALTER TABLE statement should work. One question: Do you place the data for partition in a subdirectory with the partition name?
Hive partitions exist as subdirectories. For example, your user table should have a structure similar to this:
/external_table_path/date=2010-02-22 /external_table_path/date=2010-02-23 /external_table_path/date=2010-02-24
And so on.
The ALTER TABLE statement will create the directories as well as adding the partition details to the Hive metastore. Once the partitions are created you can simply drop the right file/s in the right directory.
Cheers, Steven.
Created ‎01-11-2017 05:12 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
i'm not touching the data in the HDFS. it has created a seperate folder for the date column on using alter statement.
Created ‎02-08-2017 10:51 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Want to point out couple of things..
- date is a hive key word. To have column called date you need to say `date` string,.
- If you specify that a table is partitioned ( in your cased partitioned by (`date` string) ), that partition is implicitly accepted as the last column of the table. No need to specify it in the schema again.
CREATE EXTERNAL TABLE user ( userId BIGINT, type INT, level TINYINT )PARTITIONED BY (`date` String)LOCATION '/external_table_path';
As @Steven O'Neill mentioned, when ingesting data into HDFS, you should create sub-directories like date=2016-01-01 under your table location. Then your alter table statement will work.
