Created 09-20-2017 01:51 PM
I have a Hive MERGE query that reads avro files and writes ORC files.
The avro files are input data, and the ORC files will be my main database.
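For context, the query is shaped roughly like the sketch below (an illustration only: the table names, columns and JDBC URL are made up, not my actual schema):

# Hypothetical sketch: MERGE from an external Avro staging table into a transactional ORC table
beeline -u "jdbc:hive2://localhost:10000/default" -e "
MERGE INTO contact AS t              -- transactional (ACID) ORC table, the main database
USING contact_avro_staging AS s      -- external table on top of the input avro files
ON t.contact_id = s.contact_id
WHEN MATCHED THEN UPDATE SET email = s.email
WHEN NOT MATCHED THEN INSERT VALUES (s.contact_id, s.email);
"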
The merge query almost completes, but always ends up failing. The relevant log lines (I think) are:
# Just before failing, still good
[Thread-9646]: monitoring.TezJobMonitor$UpdateFunction (TezJobMonitor.java:update(137)) - Map 1: 19/19 Map 5: 80/80 Reducer 2: 1009/1009 Reducer 3: 9(+0)/10 Reducer 4: 1(+8,-19)/10

# a few more log.PerfLogger lines...
Vertex failed, vertexName=Reducer 4, vertexId=vertex_1502360038800_0027_2_03, diagnostics=[Task failed, taskId=task_1502360038800_0027_2_03_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {}
...
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /dwh/vault/contact/.hive-staging_hive_2017-09-20_08-56-51_838_2864382824593930489-1/_task_tmp.-ext-10000/name=hsys/id=46/_tmp.000000_0/delta_0000076_0000076_0000/bucket_00000 could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
During this query run I could see that YARN memory usage was quite high (91%). I did not notice anything else, except this warning repeated in hadoop-hdfs-namenode.log:
WARN blockmanagement.BlockPlacementPolicy (BlockPlacementPolicyDefault.java:chooseTarget(385)) - Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable: unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
I could not find anything relevant about this error myself.
An fsck (run after the query failure) does not report any errors, and Ambari does not find any under-replicated blocks.
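For the record, the fsck checks I ran look roughly like this (a sketch; the first path is just an example taken from my staging directory):

hdfs fsck /dwh/vault/contact          # ends with "...is HEALTHY", no missing or corrupt blocks
hdfs fsck / -list-corruptfileblocks   # lists no corrupt block files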
Any idea if there is a usual culprit for this error, or where I could look?
Thanks.
Small HDP 2.6 cluster on AWS (1 Ambari node, 3 datanodes).
Created 09-25-2017 07:45 PM
@Guillaume Roger As seen from the log you provided, the file blocks were not written to HDFS.
"Failed to place enough replicas, still in need of 3 to reach 3" - indicates the block could not be written to any of the 3 DNs in the cluster.
"All required storage types are unavailable"
Can you please check whether HDFS is in a healthy state and whether you are able to write files to it?
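For example, something along these lines (a minimal sketch; the file name and size are arbitrary):

# Overall health: live datanodes, remaining capacity per node
hdfs dfsadmin -report
# Filesystem consistency
hdfs fsck /
# An actual write, which exercises the same block placement path as the failing query
dd if=/dev/zero of=/tmp/hdfs-write-test bs=1M count=512
hdfs dfs -put /tmp/hdfs-write-test /tmp/
hdfs dfs -rm -skipTrash /tmp/hdfs-write-test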
Created 09-26-2017 05:46 AM
I agree with your assessment (files cannot be written to HDFS), but my problem is that, as far as I can tell, HDFS is in a healthy state: all Ambari lights are green, there are no under-replicated blocks, fsck is happy, and I can indeed write even huge files to HDFS. If you are aware of other checks I could perform, I would love to know about them.
Thanks,