Created 11-20-2013 03:49 AM
Hi,
In CDH4.3.x and CDH3 I could load a file with the same name into the same partition multiple times. The following sequence worked fine:
1. LOAD DATA INPATH '/tmp/ht/sdp-fss-ccreporting.log' OVERWRITE INTO TABLE ccrlocal_sdp_fss_transaction_logs PARTITION (year='2013', month='10',day='11');
2. hadoop fs -put sdp-fss-ccreporting.log /tmp/ht
3. LOAD DATA INPATH '/tmp/ht/sdp-fss-ccreporting.log' OVERWRITE INTO TABLE ccrlocal_sdp_fss_transaction_logs PARTITION (year='2013', month='10',day='11');
In CDH4.4.0 this no longer works. The load fails with the following message:
hive -e "LOAD DATA INPATH '/tmp/ht/sdp-fss-ccreporting.log' OVERWRITE INTO TABLE ccrlocal_sdp_fss_transaction_logs PARTITION (year='2013', month='10',day='01')"
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
Hive history file=/tmp/tm-ccr/hive_job_log_9c596186-0eac-4471-aff9-7ff6e8d3b5d3_533521785.txt
Loading data to table default.ccrlocal_sdp_fss_transaction_logs partition (year=2013, month=10, day=01)
Failed with exception Error moving: hdfs://localhost:54310/tmp/ht/sdp-fss-ccreporting.log into: hdfs://localhost:54310/user/hive/warehouse/ccrlocal_sdp_fss_transaction_logs/year=2013/month=10/day=01/sdp-fss-ccreporting.log
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
Here is the hive.log
2013-11-20 11:44:41,785 WARN conf.Configuration (Configuration.java:loadProperty(2068)) - org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@2d8ca1e3:an attempt to override final parameter: mapred.tasktracker.reduce.tasks.maximum; Ignoring.
2013-11-20 11:44:41,795 WARN conf.Configuration (Configuration.java:loadProperty(2068)) - org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@2d8ca1e3:an attempt to override final parameter: mapred.tasktracker.map.tasks.maximum; Ignoring.
2013-11-20 11:44:42,675 WARN conf.Configuration (Configuration.java:loadProperty(2068)) - org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@61922138:an attempt to override final parameter: mapred.tasktracker.reduce.tasks.maximum; Ignoring.
2013-11-20 11:44:42,680 WARN conf.Configuration (Configuration.java:loadProperty(2068)) - org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@61922138:an attempt to override final parameter: mapred.tasktracker.map.tasks.maximum; Ignoring.
2013-11-20 11:44:42,720 INFO ql.Driver (PerfLogger.java:PerfLogBegin(88)) - <PERFLOG method=Driver.run>
2013-11-20 11:44:42,720 INFO ql.Driver (PerfLogger.java:PerfLogBegin(88)) - <PERFLOG method=TimeToSubmit>
2013-11-20 11:44:42,720 INFO ql.Driver (PerfLogger.java:PerfLogBegin(88)) - <PERFLOG method=compile>
2013-11-20 11:44:45,216 INFO ql.Driver (Driver.java:compile(468)) - Semantic Analysis Completed
2013-11-20 11:44:45,234 INFO ql.Driver (Driver.java:getSchema(265)) - Returning Hive schema: Schema(fieldSchemas:null, properties:null)
2013-11-20 11:44:45,235 INFO ql.Driver (PerfLogger.java:PerfLogEnd(115)) - </PERFLOG method=compile start=1384947882720 end=1384947885235 duration=2515>
2013-11-20 11:44:45,235 INFO ql.Driver (PerfLogger.java:PerfLogBegin(88)) - <PERFLOG method=Driver.execute>
2013-11-20 11:44:45,235 INFO ql.Driver (Driver.java:execute(1099)) - Starting command: LOAD DATA INPATH '/tmp/ht/sdp-fss-ccreporting.log' OVERWRITE INTO TABLE ccrlocal_sdp_fss_transaction_logs PARTITION (year='2013', month='10',day='01')
2013-11-20 11:44:45,265 INFO ql.Driver (PerfLogger.java:PerfLogEnd(115)) - </PERFLOG method=TimeToSubmit start=1384947882720 end=1384947885265 duration=2545>
2013-11-20 11:44:45,269 INFO exec.Task (SessionState.java:printInfo(418)) - Loading data to table default.ccrlocal_sdp_fss_transaction_logs partition (year=2013, month=10, day=01) from hdfs://localhost:54310/tmp/ht/sdp-fss-ccreporting.log
2013-11-20 11:44:45,561 DEBUG metadata.Hive (Hive.java:renameFile(2028)) - Replacing src:hdfs://localhost:54310/tmp/ht/sdp-fss-ccreporting.log;dest: hdfs://localhost:54310/user/hive/warehouse/ccrlocal_sdp_fss_transaction_logs/year=2013/month=10/day=01/sdp-fss-ccreporting.log;Status:false
2013-11-20 11:44:45,563 ERROR exec.Task (SessionState.java:printError(427)) - Failed with exception Error moving: hdfs://localhost:54310/tmp/ht/sdp-fss-ccreporting.log into: hdfs://localhost:54310/user/hive/warehouse/ccrlocal_sdp_fss_transaction_logs/year=2013/month=10/day=01/sdp-fss-ccreporting.log
org.apache.hadoop.hive.ql.metadata.HiveException: Error moving: hdfs://localhost:54310/tmp/ht/sdp-fss-ccreporting.log into: hdfs://localhost:54310/user/hive/warehouse/ccrlocal_sdp_fss_transaction_logs/year=2013/month=10/day=01/sdp-fss-ccreporting.log
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2182)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1189)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:304)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.io.IOException: Error moving: hdfs://localhost:54310/tmp/ht/sdp-fss-ccreporting.log into: hdfs://localhost:54310/user/hive/warehouse/ccrlocal_sdp_fss_transaction_logs/year=2013/month=10/day=01/sdp-fss-ccreporting.log
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2176)
    ... 19 more
2013-11-20 11:44:45,576 ERROR ql.Driver (SessionState.java:printError(427)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
2013-11-20 11:44:45,576 INFO ql.Driver (PerfLogger.java:PerfLogEnd(115)) - </PERFLOG method=Driver.execute start=1384947885235 end=1384947885576 duration=341>
2013-11-20 11:44:45,577 INFO ql.Driver (PerfLogger.java:PerfLogBegin(88)) - <PERFLOG method=releaseLocks>
2013-11-20 11:44:45,577 INFO ql.Driver (PerfLogger.java:PerfLogEnd(115)) - </PERFLOG method=releaseLocks start=1384947885577 end=1384947885577 duration=0>
2013-11-20 11:44:45,583 INFO ql.Driver (PerfLogger.java:PerfLogBegin(88)) - <PERFLOG method=releaseLocks>
2013-11-20 11:44:45,583 INFO ql.Driver (PerfLogger.java:PerfLogEnd(115)) - </PERFLOG method=releaseLocks start=1384947885583 end=1384947885583 duration=0>
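The DEBUG line from renameFile in the log reports Status:false for a destination that has the same file name as the source. One way to narrow this down is to check whether that destination file already exists in the partition directory before running the load. The sketch below is only an illustration: the helper names dest_path and check_dest are hypothetical, and the warehouse root is assumed to be /user/hive/warehouse as it appears in the error message.

```shell
#!/bin/sh
# Warehouse root as seen in the error message; an assumption, adjust for your cluster.
WAREHOUSE="${WAREHOUSE:-/user/hive/warehouse}"

# dest_path TABLE YEAR MONTH DAY SRC   (hypothetical helper)
# Prints the HDFS path that LOAD DATA tries to move SRC to, following the
# year=/month=/day= layout shown in the log.
dest_path() {
    printf '%s/%s/year=%s/month=%s/day=%s/%s\n' \
        "$WAREHOUSE" "$1" "$2" "$3" "$4" "$(basename "$5")"
}

# check_dest TABLE YEAR MONTH DAY SRC   (hypothetical helper)
# Reports whether the destination file is already present in the partition.
check_dest() {
    dest=$(dest_path "$@")
    if hadoop fs -test -e "$dest"; then
        echo "exists: $dest"
    else
        echo "missing: $dest"
    fi
}
```

For the failing command above, this would be called as: check_dest ccrlocal_sdp_fss_transaction_logs 2013 10 01 /tmp/ht/sdp-fss-ccreporting.log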
The only way I can overwrite data in the table is to omit the file name in step 3 and load the whole directory instead:
1. LOAD DATA INPATH '/tmp/ht/sdp-fss-ccreporting.log' OVERWRITE INTO TABLE ccrlocal_sdp_fss_transaction_logs PARTITION (year='2013', month='10',day='11');
2. hadoop fs -put sdp-fss-ccreporting.log /tmp/ht
3. LOAD DATA INPATH '/tmp/ht' OVERWRITE INTO TABLE ccrlocal_sdp_fss_transaction_logs PARTITION (year='2013', month='10',day='11');
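If loading a single named file is required, another possible workaround is to delete the same-named file from the partition directory and then run the load. This is an untested sketch, and it assumes the failure is caused by the destination file already existing, which the post does not confirm. The helper names run and reload_partition are hypothetical, and the warehouse root /user/hive/warehouse is taken from the error message.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD ARGS...   (hypothetical helper): print or execute a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# reload_partition SRC TABLE YEAR MONTH DAY   (hypothetical helper)
# Removes the same-named file from the partition, then loads SRC into it.
reload_partition() {
    src=$1; table=$2; year=$3; month=$4; day=$5
    # Destination layout as it appears in the error message (an assumption).
    dest="/user/hive/warehouse/$table/year=$year/month=$month/day=$day/$(basename "$src")"
    # This step reports an error if the file is absent; the load still runs.
    run hadoop fs -rm "$dest"
    run hive -e "LOAD DATA INPATH '$src' OVERWRITE INTO TABLE $table PARTITION (year='$year', month='$month', day='$day')"
}

reload_partition /tmp/ht/sdp-fss-ccreporting.log ccrlocal_sdp_fss_transaction_logs 2013 10 11
```

Note that deleting first makes the overwrite non-atomic: if the load then fails, the old file is already gone, so the dry-run output is worth checking before setting DRY_RUN=0.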
It could be a similar issue to https://issues.apache.org/jira/browse/HIVE-3300, but I am not sure.
Thanks,
Alexei