Created 01-30-2018 03:34 PM
Hi
I'm facing the error below in HDP 2.6.3 when trying to INSERT OVERWRITE data from an external table into a dynamically partitioned table. The code worked fine in HDP 2.5.3; we noticed the issue after the upgrade.
org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
Code:
SET hive.stats.autogather = false;
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.exec.compress.output = true;
SET mapred.output.compression.codec = org.apache.hadoop.io.compress.SnappyCodec;
SET hive.mapjoin.smalltable.filesize = 4000000000;
SET hive.mapjoin.localtask.max.memory.usage = 0.99;
SET hive.auto.convert.join = true;
SET hive.optimize.skewjoin = true;
SET hive.exec.compress.intermediate = true;
SET mapred.reduce.tasks = -1;
SET hive.exec.reducers.bytes.per.reducer = 1000000000;

USE db_schema;

DROP TABLE IF EXISTS table_name_1_temp;

CREATE EXTERNAL TABLE table_name_1_temp (
    col_1 STRING,
    col_2 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION ${hdfs_location};

INSERT OVERWRITE TABLE table_name_1 PARTITION (part_col_1)
SELECT DST.col_1_parm[0] AS col_1,
       DST.col_1_parm[1] AS col_2,
       DST.col_2_parm[0] AS col_3,
       DST.col_2_parm[1] AS col_4,
       DST.col_1_parm[2] AS part_col_1
FROM (SELECT split(col_1, '\001') AS col_1_parm,
             split(col_2, '\001') AS col_2_parm
      FROM table_name_1_temp) DST;
Error:
18/01/30 01:44:07 [load-dynamic-partitions-0]: ERROR metadata.Hive: Exception when loading partition with parameters partPath=hdfs://..../.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000/part_col_1=1, table=table_name_1, partSpec={part_col_1=1}, loadFileType=REPLACE_ALL, listBucketingEnabled=false, isAcid=false, hasFollowingStatsTask=false
org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3380)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1518)
    at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1799)
    at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1790)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
18/01/30 01:44:07 [main]: DEBUG metadata.Hive: Cancelling 2 dynamic loading tasks
Failed with exception Exception when loading 2 in table table_name_1 with loadPath=hdfs://....../table_name_1/.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000
18/01/30 01:44:07 [main]: ERROR exec.Task: Failed with exception Exception when loading 2 in table table_name_1 with loadPath=hdfs://....../table_name_1/.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000
org.apache.hadoop.hive.ql.metadata.HiveException: Exception when loading 2 in table table_name_1 with loadPath=hdfs://....../table_name_1/.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000
    at org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:1841)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:449)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1756)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1151)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.util.concurrent.ExecutionException: org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:1829)
    ... 20 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3380)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1518)
    at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1799)
    at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1790)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Created 01-31-2018 04:19 AM
There is an already existing partition (or possibly a sub-directory) at hdfs://....../table_name_1/part_col_1=1, and the cleanup of that destination directory is failing before the new partition is loaded.
Can you try deleting those files manually and then retry the INSERT OVERWRITE? Also, please share the listing for this HDFS path; a sketch of the commands follows.
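A minimal sketch of what I mean, run from the Hive CLI (which accepts dfs commands directly); the "......" is the cluster-specific prefix elided in your error message, so substitute your full path:

-- List the partition directory recursively, including any leftover staging files.
dfs -ls -R hdfs://....../table_name_1/part_col_1=1;
-- If anything is left behind, remove the partition directory manually.
dfs -rm -r hdfs://....../table_name_1/part_col_1=1;

The same checks work outside Hive as hdfs dfs -ls -R and hdfs dfs -rm -r.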
Created 02-01-2018 04:25 AM
I checked the permissions and the contents of the HDFS directory: it has all the required permissions and is also empty. And even if the partition did exist, the query is an INSERT OVERWRITE, so it should simply be replaced.
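For reference, the checks were along these lines, from the Hive CLI (the "......" again stands for the elided path prefix):

-- Confirm the partition directory is empty and check its permissions.
dfs -ls -R hdfs://....../table_name_1/part_col_1=1;
-- Confirm whether the metastore still has the partition registered.
SHOW PARTITIONS table_name_1;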
Created 04-18-2018 03:57 PM
Hi Sam,
Recently we upgraded our cluster from HDP 2.5.6 to HDP 2.6.4 and I am getting a similar error. In the current version, Hive is stricter about INSERT OVERWRITE TABLE. What this means is that you might be deleting the data prior to loading the table, without also dropping the partition, when you do the INSERT OVERWRITE TABLE.
To get around it, either:
Delete the data AND drop the partition prior to running the INSERT OVERWRITE TABLE (a sketch follows below),
OR don't delete the data or drop the partition for the external table at all, and let the INSERT OVERWRITE TABLE replace it.
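A minimal sketch of the first option, run from the Hive CLI, using the table and partition names from the original post (the partition value 1 is just the one from the error message; adjust to your own spec):

-- For an external table, DROP PARTITION only removes the metastore entry,
-- so remove the old partition data from HDFS as well.
dfs -rm -r hdfs://....../table_name_1/part_col_1=1;
-- Drop the partition from the metastore so Hive and HDFS stay in sync.
ALTER TABLE table_name_1 DROP IF EXISTS PARTITION (part_col_1=1);

After that, re-run the original INSERT OVERWRITE TABLE statement unchanged.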
Regards
Khaja Hussain.
Similar Error:
Caused by: java.util.concurrent.ExecutionException: org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://data_dir/pk_business=bsc/pk_data_source=pos/pk_frequency=bnw/pk_data_state=c13251_ps2111_bre000_pfc00000_spr000_pfs00000/pk_reporttype=BNN/pk_ppweek=2487 has not be cleaned up.
Created 04-18-2018 07:11 PM
Yeah, you are right.