Support Questions

ariful_hossain · ‎10-24-2016

Table created with this :

create table syslog_staged (id string, facility string, sender string, severity string, tstamp string, service string, msg string) partitioned by (hostname string,  year string, month string, day string) clustered by (id) into 20 buckets stored as orc tblproperties("transactional"="true");

the table is populated with Apache nifi's PutHiveStreaming...

alter table syslog_staged partition (hostname="cloudserver19", year="2016", month="10", day="24") compact 'major';

Now it turns out compaction fails for some reason.....(from job history)

No of maps and reduces are 0 job_1476884195505_0031
Job commit failed: java.io.FileNotFoundException: File hdfs://hadoop1.openstacksetup.com:8020/apps/hive/warehouse/log.db/syslog_staged/hostname=cloudserver19/year=2016/month=10/day=24/_tmp_27c40005-658e-48c1-90f7-2acaa124e2fa does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:904)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:113)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:966)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:962)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:962)
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorOutputCommitter.commitJob(CompactorMR.java:776)
at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

from hive metastore log :

2016-10-24 16:33:35,503 WARN  [Thread-14]: compactor.Initiator (Initiator.java:run(132)) - Will not initiate compaction for log.syslog_staged.hostname=cloudserver19/year=2016/month=10/day=24 since last hive.compactor.initiator.failed.compacts.threshold attempts to compact it failed.

wzheng · ‎02-16-2017

As Eugene suggested, could you paste the output of "dfs -lsr" here so that we can see which dirs are owned by whom?

A few other things we need to confirm:

Is streaming being used before and after the upgrade?
When you say compaction fails, what triggered the compaction? Is that triggered by the system automatically, or is it run by some user manually? If it's a manual compaction, then which user issued the command?
You mentioned the problematic table partition was created on 2.5.3. Which user created it? Do you have issue compacting pre-existing tables created on 2.3.6?

ekoifman · ‎02-16-2017

This _tmp file should be created in the Mapper of the compaction job. Is there anything about it in the job logs?

Cloudera Community

Support Questions

hive transactional table compaction fails