
Hive - issue inserting records to partitioned table in hdp 2.6.3

Explorer

Hi,

I am facing the error below in HDP 2.6.3 when trying to INSERT OVERWRITE data from an external table into a dynamically partitioned table. The code worked fine in HDP 2.5.3; we noticed the issue after the upgrade.

org.apache.hadoop.hive.ql.metadata.HiveException:Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.

Code:

SET hive.stats.autogather=false;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET hive.mapjoin.smalltable.filesize=4000000000;
SET hive.mapjoin.localtask.max.memory.usage=0.99;
SET hive.auto.convert.join=true;
SET hive.optimize.skewjoin=true;
SET hive.exec.compress.intermediate=true;
SET mapred.reduce.tasks=-1;
SET hive.exec.reducers.bytes.per.reducer=1000000000;

USE db_schema;

DROP TABLE IF EXISTS table_name_1_temp;

CREATE EXTERNAL TABLE table_name_1_temp (
  col_1 STRING,
  col_2 STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
LOCATION ${hdfs_location};

INSERT OVERWRITE TABLE table_name_1 PARTITION (part_col_1)
SELECT
    DST.col_1_parm[0] AS col_1
   ,DST.col_1_parm[1] AS col_2
   ,DST.col_2_parm[0] AS col_3
   ,DST.col_2_parm[1] AS col_4
   ,DST.col_1_parm[2] AS part_col_1
FROM (
  SELECT split(col_1, '\001') AS col_1_parm, split(col_2, '\001') AS col_2_parm
  FROM table_name_1_temp) DST;

Error:

18/01/30 01:44:07 [load-dynamic-partitions-0]: ERROR metadata.Hive: Exception when loading partition with parameters  partPath=hdfs://..../.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000/part_col_1=1,  table=table_name_1,  partSpec={part_col_1=1},  loadFileType=REPLACE_ALL,  listBucketingEnabled=false,  isAcid=false,  hasFollowingStatsTask=false
org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
        at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3380)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1518)
        at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1799)
        at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1790)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)		
18/01/30 01:44:07 [main]: DEBUG metadata.Hive: Cancelling 2 dynamic loading tasks
Failed with exception Exception when loading 2 in table table_name_1 with loadPath=hdfs://....../table_name_1/.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000
18/01/30 01:44:07 [main]: ERROR exec.Task: Failed with exception Exception when loading 2 in table table_name_1 with loadPath=hdfs://....../table_name_1/.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000
org.apache.hadoop.hive.ql.metadata.HiveException: Exception when loading 2 in table table_name_1 with loadPath=hdfs://....../table_name_1/.hive-staging_hive_2018-01-30_01-42-09_454_4846747145041946340-1/-ext-10000
        at org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:1841)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:449)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1756)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1151)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.util.concurrent.ExecutionException: org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:1829)
        ... 20 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://....../table_name_1/part_col_1=1 has not be cleaned up.
        at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3380)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1518)
        at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1799)
        at org.apache.hadoop.hive.ql.metadata.Hive$2.call(Hive.java:1790)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
4 Replies

Expert Contributor

@Sam Cse

There is an already existing partition (or possibly a sub-directory) at hdfs://....../table_name_1/part_col_1=1, and the cleanup of that destination directory is failing before the new partition is loaded.

Can you try deleting those files manually and then re-running the INSERT OVERWRITE? Also, please share the listing for this HDFS path.
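
For example, from the Hive CLI (a sketch only; the path below is a placeholder, substitute the actual partition location from your error message):

-- Inspect the partition directory for leftover files or sub-directories
dfs -ls hdfs://<namenode>/<table_location>/table_name_1/part_col_1=1;
-- If stale files or directories are present, remove them before re-running the load (use with care)
dfs -rm -r hdfs://<namenode>/<table_location>/table_name_1/part_col_1=1;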

Explorer

I checked the permissions and the contents of the HDFS directory; it has all the required permissions and is also empty. And even if the partition exists, the query I'm running is meant to overwrite it.

Explorer

Hi Sam,

We recently upgraded our cluster from HDP 2.5.6 to HDP 2.6.4 and I am getting a similar error. In the current version, Hive is stricter about INSERT OVERWRITE TABLE. What this means is that you might be deleting the data prior to loading the table but not dropping the partition, so the INSERT OVERWRITE TABLE fails.

To get around it, either:

Delete the data and drop the partition prior to running the INSERT OVERWRITE TABLE (see the sketch below),

OR don't delete the data or drop the partition for the external table at all; let the INSERT OVERWRITE TABLE replace it.
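
A minimal sketch of the first option (table and partition names follow the original post; the HDFS path is a placeholder, substitute your table's actual location):

USE db_schema;
-- Drop the partition metadata for the failing partition
-- (quote the value, e.g. part_col_1='1', if part_col_1 is a STRING column)
ALTER TABLE table_name_1 DROP IF EXISTS PARTITION (part_col_1=1);
-- Remove any leftover files under the partition directory (placeholder path)
dfs -rm -r hdfs://<namenode>/<table_location>/table_name_1/part_col_1=1;
-- Then re-run the INSERT OVERWRITE from the original post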

Regards

Khaja Hussain.

Similar Error:

Caused by: java.util.concurrent.ExecutionException: org.apache.hadoop.hive.ql.metadata.HiveException: Destination directory hdfs://data_dir/pk_business=bsc/pk_data_source=pos/pk_frequency=bnw/pk_data_state=c13251_ps2111_bre000_pfc00000_spr000_pfs00000/pk_reporttype=BNN/pk_ppweek=2487 has not be cleaned up.

Explorer

Yeah, you are right.