Created 10-24-2016 10:11 AM
The table was created with this statement:
create table syslog_staged (
  id string, facility string, sender string, severity string,
  tstamp string, service string, msg string
)
partitioned by (hostname string, year string, month string, day string)
clustered by (id) into 20 buckets
stored as orc
tblproperties("transactional"="true");
The table is populated by Apache NiFi's PutHiveStreaming processor. I then requested a major compaction on one partition:
alter table syslog_staged partition (hostname="cloudserver19", year="2016", month="10", day="24") compact 'major';
The compaction fails, and I don't know why. From the job history:
No of maps and reduces are 0 job_1476884195505_0031
Job commit failed: java.io.FileNotFoundException: File hdfs://hadoop1.openstacksetup.com:8020/apps/hive/warehouse/log.db/syslog_staged/hostname=cloudserver19/year=2016/month=10/day=24/_tmp_27c40005-658e-48c1-90f7-2acaa124e2fa does not exist.
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:904)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:113)
    at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:966)
    at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:962)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:962)
    at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorOutputCommitter.commitJob(CompactorMR.java:776)
    at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291)
    at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
    at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
From the Hive metastore log:
2016-10-24 16:33:35,503 WARN [Thread-14]: compactor.Initiator (Initiator.java:run(132)) - Will not initiate compaction for log.syslog_staged.hostname=cloudserver19/year=2016/month=10/day=24 since last hive.compactor.initiator.failed.compacts.threshold attempts to compact it failed.
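The Initiator warning above refers to earlier failed attempts on this partition. A minimal sketch for viewing those attempts from a Hive session, assuming the Hive version in use supports SHOW COMPACTIONS (the exact columns and state names vary between versions):

```sql
-- List queued, running, and recent compaction requests known to the metastore.
-- Look for the rows for log.syslog_staged and the partition
-- hostname=cloudserver19/year=2016/month=10/day=24, and check their state.
SHOW COMPACTIONS;
```

The output may also include an identifier for the MapReduce job behind each attempt, which can help match a failed row to the job history entry.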
Created 02-16-2017 07:14 PM
As Eugene suggested, could you paste the output of "dfs -lsr" here so that we can see which dirs are owned by whom?
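One way to produce that listing from the Hive CLI is sketched below. It assumes the partition path taken from the stack trace in the question; "-ls -R" is the newer spelling of the deprecated "-lsr".

```sql
-- Recursively list the partition directory with owner, group, and permissions.
-- The path is the one from the FileNotFoundException; adjust it if the
-- warehouse location differs.
dfs -ls -R /apps/hive/warehouse/log.db/syslog_staged/hostname=cloudserver19/year=2016/month=10/day=24;
```

The owner and permission columns of the delta directories are the parts of interest here.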
A few other things we need to confirm:
Created 02-16-2017 05:08 PM
This _tmp file should be created in the Mapper of the compaction job. Is there anything about it in the job logs?