Member since
02-03-2019
06-05-2019
08:07 AM
I found the cause of all 3 issues. The Hue editor I was using was not reflecting the current data; I got the expected results when I ran the queries from the command line. After logging out of Hue and logging back in, I got the results there as well. Thanks.
06-05-2019
07:16 AM
Yes, I am not using Impala or Spark, only Hive tables and queries. I am using the query below to insert into the Parquet table. This Oozie job needs to run daily. So every day, when a new partition is added, do I also need to run msck repair table?
Insert into table final partition (date) select col1, col2,...dtCol from staging.
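For reference, the daily load can be sketched in HiveQL as below. This is only a sketch under assumptions: the two SET lines are the usual Hive dynamic-partition settings, which the post does not show; the backticks around date are there because DATE is a reserved word in newer Hive versions; and the column list stays elided as in the query above. A partition written by a Hive INSERT is registered in the metastore by Hive itself, whereas msck repair table is aimed at partition directories that were placed in HDFS outside of Hive.

```sql
-- Assumed session settings for a dynamic-partition insert (not shown in the post).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- With dynamic partitioning, the partition column must come last in the SELECT list.
INSERT INTO TABLE final PARTITION (`date`)
SELECT col1,
       col2,
       -- remaining columns are elided in the original post
       dtCol
FROM staging;
```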
06-04-2019
01:28 PM
I am facing a few issues with Hive. Please provide your inputs. Thanks.
I have a Hive external table (staging) in text format that points to the location where Sqoop imports the file. I have another Hive internal table (final) in Parquet format with date-level partitioning. I then load data into the final table using:
Insert into table final partition (date) select col1, col2,...dtCol from staging.
Oozie job:
step1 - Sqoop import to the Hive external table location
step2 - load to the final table using the insert SQL above
1. I noticed that sometimes, even though the Oozie job runs successfully and the files are available in the external table path, select * from staging returns 0 results. If I drop and create the table again, it returns data.
Any idea why? Do I need to run something to refresh the table every time files are placed there?
This happens randomly, not every time.
2. I am seeing the same issue with the final table. I can see the partition folder created with a file in it, but select * does not return any rows. If I run show partitions, it displays none. This also happens only sometimes. If I drop and create the table and run the job again, it works. But I can't keep doing this, because for an internal table all data is deleted when I drop the table.
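One way to check this symptom without dropping the table is sketched below. It assumes the partition directory exists under the table's location and is named in the date=value form that Hive expects, but the metastore has no entry for it. The partition value in the commented-out line is a made-up example, not one from the job.

```sql
-- List the partitions the metastore currently knows about.
SHOW PARTITIONS final;

-- Register partition directories that exist in HDFS but are missing from the metastore.
MSCK REPAIR TABLE final;

-- Alternative for one known partition (hypothetical value shown):
-- ALTER TABLE final ADD IF NOT EXISTS PARTITION (`date`='2019-06-04');

-- Confirm that the partition is now listed.
SHOW PARTITIONS final;
```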
3. If I rerun the job more than once, I see more than one file under the partition folder in HDFS. The file names are like the ones below. If I do select * from final, it only returns data from the first file. Won't Hive return duplicate records? How can I add more data/files to the existing partition?
00000_0
00000_0_copy_1
00000_0_copy_2
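A hedged sketch of the two rerun behaviours: INSERT INTO appends new files to the partition (which is where the _copy_N names come from), so rerunning the same load adds the same rows again, while INSERT OVERWRITE replaces the contents of the partitions it writes, which makes a rerun repeatable. The SET lines are assumed dynamic-partition settings, and the column list stays elided as in the original query.

```sql
-- Assumed session settings for a dynamic-partition insert (not shown in the post).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Rerun-safe variant: replaces the data in every partition produced by the SELECT.
INSERT OVERWRITE TABLE final PARTITION (`date`)
SELECT col1,
       col2,
       -- remaining columns are elided in the original post
       dtCol
FROM staging;

-- Row count per partition, to spot duplicates left by earlier INSERT INTO reruns.
SELECT `date`, COUNT(*) AS row_count
FROM final
GROUP BY `date`;
```

To append genuinely new rows to an existing partition, INSERT INTO remains the statement to use; Hive reads every data file in the partition folder, including the _copy_N files.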
Labels: Apache Hive, Apache Oozie, HDFS