Hive - issue inserting records to partitioned table in HDP 2.6.3
Created 01-30-2018 03:34 PM
Hi,
I am facing the error below in HDP 2.6.3 when trying to INSERT OVERWRITE data from an external table into a dynamically partitioned table. The same code worked fine in HDP 2.5.3; I noticed the issue after the upgrade.
org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
Code:
SET hive.stats.autogather = false;
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.exec.compress.output = true;
SET mapred.output.compression.codec = org.apache.hadoop.io.compress.SnappyCodec;
SET hive.mapjoin.smalltable.filesize = 4000000000;
SET hive.mapjoin.localtask.max.memory.usage = 0.99;
SET hive.auto.convert.join = true;
SET hive.optimize.skewjoin = true;
SET hive.exec.compress.intermediate = true;
SET mapred.reduce.tasks = -1;
SET hive.exec.reducers.bytes.per.reducer = 1000000000;

USE db_schema;

DROP TABLE IF EXISTS table_name_1_temp;

CREATE EXTERNAL TABLE table_name_1_temp (
  col_1 STRING,
  col_2 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION ${hdfs_location};

INSERT OVERWRITE TABLE table_name_1 PARTITION (part_col_1)
SELECT
   DST.col_1_parm[0] AS col_1
  ,DST.col_1_parm[1] AS col_2
  ,DST.col_2_parm[0] AS col_3
  ,DST.col_2_parm[1] AS col_4
  ,DST.col_1_parm[2] AS part_col_1
FROM (
  SELECT split(col_1, '\001') AS col_1_parm,
         split(col_2, '\001') AS col_2_parm
  FROM table_name_1_temp) DST;
Error:
18/01/30 01:44:07 [load-dynamic-partitions-0]: ERROR metadata.Hive: Exception when loading partition with parameters partPath=hdfs://..../.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000/part_col_1=1, table=table_name_1, partSpec={part_col_1=1}, loadFileType=REPLACE_ALL, listBucketingEnabled=false, isAcid=false, hasFollowingStatsTask=false
org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3380)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1518)
    at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1799)
    at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1790)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
18/01/30 01:44:07 [main]: DEBUG metadata.Hive: Cancelling 2 dynamic loading tasks
Failed with exception Exception when loading 2 in table table_name_1 with loadPath=hdfs://....../table_name_1/.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000
18/01/30 01:44:07 [main]: ERROR exec.Task: Failed with exception Exception when loading 2 in table table_name_1 with loadPath=hdfs://....../table_name_1/.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000
org.apache.hadoop.hive.ql.metadata.HiveException: Exception when loading 2 in table table_name_1 with loadPath=hdfs://....../table_name_1/.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000
    at org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:1841)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:449)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1756)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1151)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.util.concurrent.ExecutionException: org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:1829)
    ... 20 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3380)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1518)
    at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1799)
    at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1790)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Created 01-31-2018 04:19 AM
There is an already-existing partition (or possibly a sub-directory) at hdfs://....../table_name_1/part_col_1=1, and the cleanup of the destination directory is failing before the new partition is loaded.
Can you try deleting those files manually and then re-running the INSERT OVERWRITE? Also, please share the listing for this HDFS path, as sketched below.
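Something like the following, run from the Hive CLI (where dfs commands are available); this is a sketch, and the elided path stands in for your real table location:

-- List whatever is left under the partition directory.
dfs -ls hdfs://....../table_name_1/part_col_1=1;

-- If stale files or sub-directories appear, remove them before re-running the load.
dfs -rm -r hdfs://....../table_name_1/part_col_1=1;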
Created 02-01-2018 04:25 AM
I checked the permissions and the contents of the HDFS directory: it has all the required permissions and is empty. And even if the partition already exists, the query I'm running is meant to overwrite it.
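A quick way to cross-check what the metastore registers against what is actually on disk (a sketch; the elided path stands in for the real table location):

-- Partitions the metastore still registers for the table.
SHOW PARTITIONS table_name_1;

-- What is physically present under the table directory.
dfs -ls hdfs://....../table_name_1/;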
Created 04-18-2018 03:57 PM
Hi Sam,
We recently upgraded our cluster from HDP 2.5.6 to HDP 2.6.4 and I am getting a similar error. In the current version, Hive is stricter about INSERT OVERWRITE TABLE. What this means is that you might be deleting the data prior to loading the table without also dropping the partition before the INSERT OVERWRITE TABLE.
To get around it, either:
- Delete the data AND drop the partition prior to running the INSERT OVERWRITE TABLE (see the sketch below), or
- Don't delete the data or drop the partition for the external table at all, and let the INSERT OVERWRITE TABLE replace it.
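A minimal sketch of the first option, reusing the table and partition names from the original post (adjust the partition spec to your own values):

-- Drop the stale partition from the metastore; IF EXISTS keeps this idempotent.
-- Drop the quotes if part_col_1 is a numeric column rather than a string.
ALTER TABLE table_name_1 DROP IF EXISTS PARTITION (part_col_1 = '1');

-- Clear any leftover files so the destination directory is genuinely clean,
-- then re-run the INSERT OVERWRITE TABLE from the original post.
dfs -rm -r hdfs://....../table_name_1/part_col_1=1;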
Regards,
Khaja Hussain.
Similar Error:
Caused by: java.util.concurrent.ExecutionException: org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://data_dir/pk_business=bsc/pk_data_source=pos/pk_frequency=bnw/pk_data_state=c13251_ps2111_bre000_pfc00000_spr000_pfs00000/pk_reporttype=BNN/pk_ppweek=2487 has not be cleaned up.
Created 04-18-2018 07:11 PM
Yeah, you are right.
