hive transactional table compaction fails

New Contributor

The table was created with this statement:

create table syslog_staged (id string, facility string, sender string, severity string, tstamp string, service string, msg string) partitioned by (hostname string,  year string, month string, day string) clustered by (id) into 20 buckets stored as orc tblproperties("transactional"="true");

The table is populated with Apache NiFi's PutHiveStreaming processor. I then ran a major compaction on one partition:

alter table syslog_staged partition (hostname="cloudserver19", year="2016", month="10", day="24") compact 'major';

The compaction fails, however. From the job history:

No of maps and reduces are 0 job_1476884195505_0031
Job commit failed: java.io.FileNotFoundException: File hdfs://hadoop1.openstacksetup.com:8020/apps/hive/warehouse/log.db/syslog_staged/hostname=cloudserver19/year=2016/month=10/day=24/_tmp_27c40005-658e-48c1-90f7-2acaa124e2fa does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:904)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:113)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:966)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:962)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:962)
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorOutputCommitter.commitJob(CompactorMR.java:776)
at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

From the Hive metastore log:

2016-10-24 16:33:35,503 WARN  [Thread-14]: compactor.Initiator (Initiator.java:run(132)) - Will not initiate compaction for log.syslog_staged.hostname=cloudserver19/year=2016/month=10/day=24 since last hive.compactor.initiator.failed.compacts.threshold attempts to compact it failed.
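
For anyone who lands here with the same Initiator warning, a minimal sketch of how the compaction state could be inspected and the compaction requested again by hand. SHOW COMPACTIONS is standard HiveQL; the database name `log` is inferred from the `log.db` warehouse path above, and the partition values are the ones from this post. The assumption is that a manual request is still accepted once the automatic Initiator has stopped scheduling this partition.

```sql
-- List queued, running and recently finished compactions together with their state.
SHOW COMPACTIONS;

-- Assumption: the table lives in the database "log" (inferred from log.db above).
USE log;

-- Request the major compaction manually for the affected partition.
ALTER TABLE syslog_staged PARTITION (hostname="cloudserver19", year="2016", month="10", day="24") COMPACT 'major';
```

If the request fails again, the state column in the SHOW COMPACTIONS output should reflect that, and the matching MapReduce job can then be looked up in the job history.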

Re: hive transactional table compaction fails

Rising Star

I had the same issue. The root cause in my case was that the hive user account (which the compactor was running under) did not have write permissions on the directory containing the Hive database. After adjusting the permissions I was able to run the compactor successfully.
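
A rough sketch of how that permission check might look from the Hive CLI. The paths are taken from the original question, not from my cluster, and it assumes the compactor runs as the hive user, so adjust both to your setup.

```sql
-- 'dfs' in the Hive CLI passes its arguments to the Hadoop filesystem shell.
-- -d lists the directory entry itself (owner, group, mode) instead of its contents.
dfs -ls -d /apps/hive/warehouse/log.db;

-- Same check for the table directory that the compactor has to write into.
dfs -ls -d /apps/hive/warehouse/log.db/syslog_staged;
```

The owner and mode shown should allow writes by whichever account the compactor actually runs under.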

Re: hive transactional table compaction fails

Expert Contributor

@Benjamin Hopp, did your case involve Streaming Ingest?

Re: hive transactional table compaction fails

I am having the same issue. I am getting an error like this:

org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/hive/warehouse/abc_db.db/abc_aca_present/delta_0001296_0001296/bucket_00000_flush_length
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1860)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1831)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1744)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:693)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:373)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)

Re: hive transactional table compaction fails

Expert Contributor

This looks like https://issues.apache.org/jira/browse/HIVE-15309, in which case it can be ignored.

Re: hive transactional table compaction fails

Explorer
@Arif Hossain

Did you manage to fix the compaction problem? If so, how?

I have the same problem on a new partition after upgrading from HDP 2.3.6 to HDP 2.5.3, and it's not a permission problem as in @Benjamin Hopp's case.

Re: hive transactional table compaction fails

Explorer

@Eugene Koifman @Wei Zheng

This seems related to

https://issues.apache.org/jira/browse/HIVE-15142

Do you have any idea about our problem?

Re: hive transactional table compaction fails

Expert Contributor

@Davide Ferrari, could you post the "dfs -lsr" output for your table dir, please?

Are you able to see the _tmp... file?
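
Something along these lines from the Hive CLI should do. The path below is only a placeholder copied from the original question, so substitute your own table or partition directory. `-ls -R` is the current spelling of the deprecated `-lsr`.

```sql
-- Recursively list the partition directory: delta_* subdirs, bucket files,
-- *_flush_length side files, and any _tmp_* directory left by the compactor.
-- Assumption: placeholder path from the original post; use your own table dir.
dfs -ls -R /apps/hive/warehouse/log.db/syslog_staged/hostname=cloudserver19/year=2016/month=10/day=24;
```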

Re: hive transactional table compaction fails

Explorer

No, @Eugene Koifman. There is just the "_orc_acid_version" file and all the delta subdirs with the 8 bucket files + 8 _flush_length files. The _tmp file never gets created, and the compaction job actually fails in a matter of seconds.

Re: hive transactional table compaction fails

Explorer

This is a Hive Streaming installation upgraded directly from HDP 2.3.6 to HDP 2.5.3. This table partition was created on 2.5.3.