
Hive transactional table compaction fails

New Contributor

The table was created with:

create table syslog_staged (id string, facility string, sender string, severity string, tstamp string, service string, msg string) partitioned by (hostname string,  year string, month string, day string) clustered by (id) into 20 buckets stored as orc tblproperties("transactional"="true");
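For context, streaming ingest and compaction on a transactional table assume ACID support is enabled on the cluster. A typical set of properties (these are standard Hive settings; the values shown are illustrative, not taken from this cluster):

```sql
-- Required for ACID/transactional tables (hive-site.xml or per-session)
SET hive.support.concurrency = true;
SET hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-- Metastore side: enable the compaction initiator and at least one worker
SET hive.compactor.initiator.on = true;
SET hive.compactor.worker.threads = 1;
```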

The table is populated by Apache NiFi's PutHiveStreaming processor. I then triggered a major compaction manually:

alter table syslog_staged partition (hostname="cloudserver19", year="2016", month="10", day="24") compact 'major';
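After queuing a manual compaction like the one above, its progress can be watched from Hive itself (SHOW COMPACTIONS is a standard Hive statement; the partition spec below just mirrors this thread's example):

```sql
-- Queue the major compaction for the partition...
ALTER TABLE syslog_staged
  PARTITION (hostname='cloudserver19', year='2016', month='10', day='24')
  COMPACT 'major';

-- ...then list all compaction requests and their states:
-- 'initiated' -> 'working' -> 'ready for cleaning', or 'failed'/'attempted'.
SHOW COMPACTIONS;
```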

Now it turns out the compaction fails for some reason (from the job history):

No of maps and reduces are 0 job_1476884195505_0031
Job commit failed: File hdfs:// does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorOutputCommitter.commitJob(
at org.apache.hadoop.mapred.OutputCommitter.commitJob(
at java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.util.concurrent.ThreadPoolExecutor$

From the Hive metastore log:

2016-10-24 16:33:35,503 WARN  [Thread-14]: compactor.Initiator ( - Will not initiate compaction for log.syslog_staged.hostname=cloudserver19/year=2016/month=10/day=24 since last hive.compactor.initiator.failed.compacts.threshold attempts to compact it failed.

Rising Star

I had the same issue. The root cause in my case was that the hive user account (which the compactor was running under) did not have write permission on the directory containing the Hive database. After adjusting the permissions I was able to run the compactor successfully.
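To check the same thing on your own cluster, DESCRIBE FORMATTED shows the table's HDFS location and owner, which you can then compare against the directory permissions. A sketch (the warehouse path below is only an example; substitute the Location reported for your table):

```sql
-- Find the table's storage location and owner:
-- look at the "Location:" and "Owner:" rows in the output.
DESCRIBE FORMATTED syslog_staged;

-- Then, from the Hive CLI, verify that the user the compactor runs as
-- (typically 'hive') has write access to that directory tree:
dfs -ls -R /apps/hive/warehouse/syslog_staged;
```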

Expert Contributor

@Benjamin Hopp, did your case involve Streaming Ingest?

I am also having the same issue; I am getting an error like:

org.apache.hadoop.ipc.RemoteException( File does not exist: /apps/hive/warehouse/abc_db.db/abc_aca_present/delta_0001296_0001296/bucket_00000_flush_length
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
	at org.apache.hadoop.ipc.RPC$
	at org.apache.hadoop.ipc.Server$Handler$
	at org.apache.hadoop.ipc.Server$Handler$
	at Method)
	at org.apache.hadoop.ipc.Server$

Expert Contributor

This looks like a known issue, in which case it can be ignored.

@Arif Hossain

Did you manage to fix the compaction problem? How?

I have the same problem on a new partition after upgrading to HDP 2.5.3 from 2.3.6, and it's not a permission problem as in @Benjamin Hopp's case.


@Eugene Koifman @Wei Zheng

This seems related to a known issue.

Do you have any idea about our problem?

Expert Contributor

@Davide Ferrari, could you post the output of "dfs -lsr" on your table directory, please?

Are you able to see the _tmp... file?


No, @Eugene Koifman. Just the "_orc_acid_version" file and all the delta subdirs with the 8 bucket files + 8 _flush_length files. The _tmp file never gets created, and the compaction job actually fails in a matter of seconds.


This is a Hive Streaming installation upgraded directly from HDP 2.3.6 to HDP 2.5.3. The problematic table partition was created on 2.5.3.


As Eugene suggested, could you paste the output of "dfs -lsr" here so that we can see which dirs are owned by whom?

A few other things we need to confirm:

  1. Is streaming being used before and after the upgrade?
  2. When you say compaction fails, what triggered the compaction? Is that triggered by the system automatically, or is it run by some user manually? If it's a manual compaction, then which user issued the command?
  3. You mentioned the problematic table partition was created on 2.5.3. Which user created it? Do you have issue compacting pre-existing tables created on 2.3.6?
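For anyone following along, the listing Eugene and Wei are asking for can be produced from the Hive CLI or Beeline (the warehouse path below is an example; substitute your table's actual location and partition spec):

```sql
-- Recursively list the partition directory to see owners, permissions,
-- and which delta/_tmp files exist. "-lsr" is deprecated in newer
-- Hadoop releases; "-ls -R" is the equivalent.
dfs -ls -R /apps/hive/warehouse/syslog_staged/hostname=cloudserver19/year=2016/month=10/day=24;
```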

Expert Contributor

This _tmp file should be created in the Mapper of the compaction job. Is there anything about it in the job logs?
