
Hive transactional table compaction fails

New Contributor

The table was created with this:

create table syslog_staged (
    id string, facility string, sender string, severity string,
    tstamp string, service string, msg string)
partitioned by (hostname string, year string, month string, day string)
clustered by (id) into 20 buckets
stored as orc
tblproperties ("transactional"="true");

The table is populated with Apache NiFi's PutHiveStreaming processor, and compaction is then requested with:

alter table syslog_staged partition (hostname="cloudserver19", year="2016", month="10", day="24") compact 'major';

It turns out the compaction fails for some reason (from the job history):

No of maps and reduces are 0 job_1476884195505_0031
Job commit failed: java.io.FileNotFoundException: File hdfs://hadoop1.openstacksetup.com:8020/apps/hive/warehouse/log.db/syslog_staged/hostname=cloudserver19/year=2016/month=10/day=24/_tmp_27c40005-658e-48c1-90f7-2acaa124e2fa does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:904)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:113)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:966)
at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:962)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:962)
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorOutputCommitter.commitJob(CompactorMR.java:776)
at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

From the Hive metastore log:

2016-10-24 16:33:35,503 WARN  [Thread-14]: compactor.Initiator (Initiator.java:run(132)) - Will not initiate compaction for log.syslog_staged.hostname=cloudserver19/year=2016/month=10/day=24 since last hive.compactor.initiator.failed.compacts.threshold attempts to compact it failed.
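For reference, the compactor state can be inspected from the Hive CLI. A minimal sketch, reusing the table and partition names from above; once hive.compactor.initiator.failed.compacts.threshold failed attempts accumulate, the automatic initiator skips the partition, but a manual request is still accepted:

-- list queued, running and failed compactions
SHOW COMPACTIONS;
-- print the current failure threshold (the default is 2)
SET hive.compactor.initiator.failed.compacts.threshold;
-- a manual major compaction can still be requested after the initiator gives up
alter table syslog_staged partition (hostname="cloudserver19", year="2016", month="10", day="24") compact 'major';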
11 REPLIES

Expert Contributor

I had the same issue. The root cause in my case was that the hive user account (which the compactor was running under) did not have write permission on the directory containing the Hive database. After adjusting the permissions I was able to run the compactor successfully.
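A sketch of the check and fix from the Hive CLI (the warehouse path is the one from the stack trace above; the hive:hadoop owner/group is only an assumption, adjust it for your cluster):

-- verify ownership and permissions on the database directory
dfs -ls /apps/hive/warehouse/log.db;
-- example fix: make the hive user the owner of the database tree
dfs -chown -R hive:hadoop /apps/hive/warehouse/log.db;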

Super Collaborator

@Benjamin Hopp, did your case involve Streaming Ingest?


I am also having the same issue. I am getting an error like:

org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/hive/warehouse/abc_db.db/abc_aca_present/delta_0001296_0001296/bucket_00000_flush_length
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1860)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1831)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1744)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:693)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:373)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)

Super Collaborator

This looks like https://issues.apache.org/jira/browse/HIVE-15309, in which case it can be ignored.

Contributor
@Arif Hossain

Did you manage to fix the compaction problem? How?

I have the same problem on a new partition after upgrading from HDP 2.3.6 to HDP 2.5.3, and it's not a permissions problem as in @Benjamin Hopp's case.

Contributor

@Eugene Koifman @Wei Zheng

This seems related to

https://issues.apache.org/jira/browse/HIVE-15142

Do you have any idea about our problem?

Super Collaborator

@Davide Ferrari, could you post a "dfs -lsr" of your table dir, please?

Are you able to see the _tmp... file?
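For example, from the Hive CLI (the path is taken from the stack trace above):

-- recursively list the table directory, including delta subdirs and any _tmp files
dfs -lsr /apps/hive/warehouse/log.db/syslog_staged;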

Contributor

No, @Eugene Koifman, just the "_orc_acid_version" file and all the delta subdirectories with the 8 bucket files + 8 _flush_length files. The _tmp file never gets created, and the compaction job actually fails in a matter of seconds.

Contributor

This is a Hive Streaming installation updated directly from HDP 2.3.6 to HDP 2.5.3. This table partition was created on 2.5.3.