Loading data into a Hive table from another Hive table never finishes
Labels: Apache Hive
Created 01-16-2019 04:57 PM
Issue: Loading data into a Hive ORC table never finishes; I have to kill the load process manually.
I am trying to load data into an ORC Hive table from another Hive TEXTFILE table. Because the source files are TXT/JSON, I first load the data into a TEXTFILE table and then try to load it from there into the ORC table.
Cluster: HDP 2.6.5-292
Hive version: 1.2.1000.2.6.5.0-292
Here is the Hive TEXTFILE table schema:
Create external table if not exists TEXTTable(ID bigint, DOCUMENT_ID bigint, NUM varchar(20), SUBMITTER_ID bigint, FILING string, CODE varchar(10), RECEIPTNUM varchar(20))
row format delimited fields terminated by '|'
Location '/data/3rdPartyData/Hive/TEXTTable'
TBLPROPERTIES ('skip.header.line.count'='1');
Load Data into TEXTFILE table:
load data local inpath '/data/TextFile.txt' overwrite into table TEXTTable;
Here is the Hive ORC table schema:
Create external table if not exists ORCTable(ID bigint, DOCUMENT_ID bigint, NUM varchar(20), SUBMITTER_ID bigint, FILING TIMESTAMP, CODE varchar(10), RECEIPTNUM varchar(20))
row format delimited fields terminated by '|'
STORED as ORC
Location '/data/3rdPartyData/Hive/ORCTable'
TBLPROPERTIES ('orc.compress'='SNAPPY');
Load data into ORC table:
Insert overwrite table ORCTable select ID, DOCUMENT_ID, NUM, SUBMITTER_ID, from_unixtime(unix_timestamp(FILING, "yyyy-MM-dd'T'HH:mm:ss")) as FILING, CODE, RECEIPTNUM from TEXTTable;
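The timestamp conversion used in the INSERT above can be checked on its own before running the full load. The following is a minimal sketch and not part of the original post: the literal timestamp is a made-up sample value, the alias FILING_CONVERTED is hypothetical, and LIMIT 5 is an arbitrary choice.

```sql
-- Hypothetical sample value: confirm the pattern parses an ISO-style string.
-- from_unixtime with no format argument returns a string shaped like
-- 'yyyy-MM-dd HH:mm:ss'.
SELECT from_unixtime(unix_timestamp('2019-01-16T16:57:00', "yyyy-MM-dd'T'HH:mm:ss"));

-- Compare raw and converted values on a few real rows of the source table.
-- If the converted column is NULL or does not correspond to the raw value,
-- the pattern probably does not fit the data in FILING.
SELECT FILING,
       from_unixtime(unix_timestamp(FILING, "yyyy-MM-dd'T'HH:mm:ss")) AS FILING_CONVERTED
FROM TEXTTable
LIMIT 5;
```

If both queries return sensible values, the conversion expression itself is unlikely to be the reason the load does not finish.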
Created 01-16-2019 09:07 PM
Because this behavior was so odd, I rechecked all the memory usage and configuration and found the cause: the Tez memory was set to the YARN maximum memory. After I reduced the Tez memory, the problem was fixed.
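The reply does not say which Tez property was changed. As a hedged sketch, the settings below are ones commonly involved when sizing Tez containers for Hive. The numeric values are purely illustrative assumptions, not the poster's values, and would need to be chosen relative to the cluster's own YARN limits. One plausible reading of the fix is that a memory request as large as the YARN maximum leaves too little room for the Tez application master and its tasks to run side by side, so the job waits for resources instead of failing.

```sql
-- Print the current values (SET with a name and no value displays the setting,
-- or reports it as undefined if it is not visible to the Hive session).
SET hive.tez.container.size;
SET hive.tez.java.opts;
SET yarn.scheduler.maximum-allocation-mb;

-- Illustrative values only (assumption): request a task container well below
-- the YARN per-container maximum, with the JVM heap at roughly 80% of it.
SET hive.tez.container.size=4096;
SET hive.tez.java.opts=-Xmx3276m;
```

On an Ambari-managed HDP cluster these values are usually changed in the Hive and Tez configuration rather than per session. The size of the Tez application master is governed separately by tez.am.resource.memory.mb.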
Created 01-16-2019 05:42 PM
I have noticed that this is not specific to the ORC table; the same thing happens with a normal (non-ORC) table. The problem occurs when loading data from another table, whereas loading data from a source file into a table works fine.
