Created on 06-04-2019 01:28 PM - edited 09-16-2022 07:25 AM
I am facing few issues with hive. Please provide your inputs. Thanks.
I have an hive external table (staging) in text format which points to the location where sqoop imports the file. I have another hive internal table (final) in parquet format with date level partitioning. then load data into final table using Insert into table final partition (date)
select col1, col2,...dtCol from staging.
oozie job:
step1 - sqoop to hive external table location
step2 - load to final table using insert SQL above
1. I noticed that sometimes though oozie job runs successfully and files are available in external table path, select * from staging returns 0 results. If I drop and create the table again, then it is returning data.
Any idea why? Do I need to run something to refresh the table after files are placed everytime?
This happens randomly, not every time.
2. I am seeing above issue with final table also. I can see partition folder created and file in it but select * is not returning any values. If I give show partitions, it is displaying none. This is also happening sometimes. If I drop, create and run the job again, it works. But I can do this as for internal table, all data will deleted if I drop the table.
3. if I run the rerun job more than once, I see one than one file under the partition folder in hdfs. Files names are like below. If I do select * from final, it it only returning data from first file. Wont hive return duplicate records? How I can add more data/file to the existing partition?
00000_0
00000_0_copy_1
00000_0_copy_2
Created 06-05-2019 03:03 AM
Hello @csegokul ,
With read on your environment, I am assuming that you aren't using Impala or Spark for query or inserting for Parquet tables. Your show partitions is showing none because info about the partition isn't been seen by metastore. You may choose to sync it with msck repair table. But please bear in-mind that msck repair table isn't a cheap operation. So, depending on the requirement you can choose to use ALTER TABLE. For e.g. for your regular procedure for partition creation, you can use ALTER TABLE with ADD PARTITION.
Hope that helps.
Created 06-05-2019 07:16 AM
Created 06-05-2019 08:07 AM
Created 06-12-2019 11:25 PM