Tez "Could not get block locations" error on insert


I've been stuck for so long, I don't know what to do. Help❤️


I installed Hive using Ambari. When I try to insert a row into a table, I get a Tez error. Here is what I have checked so far:

  • The Hive and Tez services are running normally.
  • The user I use is root, and passwordless login has been configured on all nodes.
  • root has its own /user/root and /tmp/hive/root directories on HDFS.
  • root is a Hive administrator and has all permissions on the relevant databases and tables.
  • I ran yarn application -list to verify that no applications are running.
  • I ran yarn node -list to confirm that all nodes are in the RUNNING state.
  • I ran hdfs dfsadmin -report to confirm that all datanodes have status Normal.
  • I ran hadoop fsck to confirm that /warehouse/tablespace/managed/hive/root.db/test is healthy.
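
The checks in the list above can be rerun in one go with a small script. This is only a sketch: the `CHECKS` table and the `run_checks` helper are names made up for illustration, and it assumes the `yarn` and `hdfs` clients are on the PATH of the machine where it runs. `hdfs fsck` is used as the non-deprecated form of `hadoop fsck`.

```python
import shlex
import subprocess

# Label -> command. The commands mirror the checks listed above.
CHECKS = {
    "YARN application list": "yarn application -list",
    "YARN node list": "yarn node -list",
    "HDFS datanode report": "hdfs dfsadmin -report",
    "table directory fsck": "hdfs fsck /warehouse/tablespace/managed/hive/root.db/test",
}


def run_checks(runner=subprocess.run):
    """Run each command and record whether it exited with status 0.

    An exit status of 0 only shows that the command itself ran; the
    output still has to be read to judge the health of the cluster.
    """
    results = {}
    for label, command in CHECKS.items():
        try:
            proc = runner(shlex.split(command), capture_output=True, text=True)
            results[label] = proc.returncode == 0
        except FileNotFoundError:
            # The yarn/hdfs client is not installed on this machine.
            results[label] = False
    return results
```

Calling `run_checks()` returns a dict such as `{"YARN application list": True, ...}`. The `runner` parameter exists only so that the function can be exercised without a cluster.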

My question is detailed below:

I have a table like this:

0: jdbc:hive2://hdp0:2181,hdp1:2181,hdp2:2181> desc test;
INFO  : Compiling command(queryId=hive_20230614090425_2f598338-0603-4ba5-b4a2-8b77796b6d1b): desc test
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling command(queryId=hive_20230614090425_2f598338-0603-4ba5-b4a2-8b77796b6d1b); Time taken: 0.103 seconds
INFO  : Executing command(queryId=hive_20230614090425_2f598338-0603-4ba5-b4a2-8b77796b6d1b): desc test
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing command(queryId=hive_20230614090425_2f598338-0603-4ba5-b4a2-8b77796b6d1b); Time taken: 0.012 seconds
| col_name  | data_type  | comment  |
| id        | int        |          |
| score     | int        |          |

When I try to insert data, I get an error:

0: jdbc:hive2://hdp0:2181,hdp1:2181,hdp2:2181> insert into table test values (1,1);
INFO  : Compiling command(queryId=hive_20230614090629_4118dca3-cb4d-4ac4-a5e7-28b932278055): insert into table test values (1,1)
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col1, type:int, comment:null), FieldSchema(name:col2, type:int, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20230614090629_4118dca3-cb4d-4ac4-a5e7-28b932278055); Time taken: 0.263 seconds
INFO  : Executing command(queryId=hive_20230614090629_4118dca3-cb4d-4ac4-a5e7-28b932278055): insert into table test values (1,1)
INFO  : Query ID = hive_20230614090629_4118dca3-cb4d-4ac4-a5e7-28b932278055
INFO  : Total jobs = 1
INFO  : Launching Job 1 out of 1
INFO  : Starting task [Stage-1:MAPRED] in serial mode
WARN  : The session: sessionId=32ed8d7a-8acb-4e68-a4d3-20210f38c38b, queueName=null, user=root, doAs=true, isOpen=false, isDefault=false has not been opened
INFO  : Subscribed to counters: [] for queryId: hive_20230614090629_4118dca3-cb4d-4ac4-a5e7-28b932278055
INFO  : Tez session hasn't been created yet. Opening session
INFO  : Dag name: insert into table test values (1,1) (Stage-1)
INFO  : Dag submit failed due to Could not get block locations. Source file "/tmp/hive/root/_tez_session_dir/3816e222-7805-48e5-a726-9ee2d1d84c5d/.tez/application_1686711037964_0022/recovery/1/summary" - Aborting...block==null
        at org.apache.tez.dag.api.client.DAGClientHandler.submitDAG(
        at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.submitDAG(
        at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
        at org.apache.hadoop.ipc.RPC$
        at org.apache.hadoop.ipc.Server$
        at org.apache.hadoop.ipc.Server$
        at Method)
        at org.apache.hadoop.ipc.Server$
Caused by: Could not get block locations. Source file "/tmp/hive/root/_tez_session_dir/3816e222-7805-48e5-a726-9ee2d1d84c5d/.tez/application_1686711037964_0022/recovery/1/summary" - Aborting...block==null
        at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(
        at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(
 stack trace: [sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method), sun.reflect.NativeConstructorAccessorImpl.newInstance(, sun.reflect.DelegatingConstructorAccessorImpl.newInstance(, java.lang.reflect.Constructor.newInstance(, org.apache.tez.common.RPCUtil.instantiateException(, org.apache.tez.common.RPCUtil.instantiateRuntimeException(, org.apache.tez.common.RPCUtil.unwrapAndThrowException(, org.apache.tez.client.TezClient.submitDAGSession(, org.apache.tez.client.TezClient.submitDAG(, org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(, org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(, org.apache.hadoop.hive.ql.exec.Task.executeTask(, org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(, org.apache.hadoop.hive.ql.Driver.launchTask(, org.apache.hadoop.hive.ql.Driver.execute(, org.apache.hadoop.hive.ql.Driver.runInternal(,,,, org.apache.hive.service.cli.operation.SQLOperation.runQuery(, org.apache.hive.service.cli.operation.SQLOperation.access$700(, org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$, Method),,, org.apache.hive.service.cli.operation.SQLOperation$, java.util.concurrent.Executors$,, java.util.concurrent.Executors$,, java.util.concurrent.ThreadPoolExecutor.runWorker(, java.util.concurrent.ThreadPoolExecutor$,] retrying...
ERROR : Failed to execute tez graph. Could not get block locations. Source file "/tmp/hive/root/_tez_session_dir/3816e222-7805-48e5-a726-9ee2d1d84c5d/.tez/application_1686711037964_0023/tez-conf.pb" - Aborting...block==null
        at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery( ~[hadoop-hdfs-client-]
        at org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError( ~[hadoop-hdfs-client-]
        at ~[hadoop-hdfs-client-]
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
INFO  : Completed executing command(queryId=hive_20230614090629_4118dca3-cb4d-4ac4-a5e7-28b932278055); Time taken: 5.046 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=1)


Here's what I've done since the problem arose:

  • I set the permissions of the /tmp/root directory to 755 with chmod -R, but this did not solve the problem.


Hi everybody,

I think this is an issue with running the cluster in Docker containers. I redeployed it on VMware virtual machines today and there were no issues.

@xiamu This error can appear if the datanodes are not healthy. Does the job fail every time, or does it succeed at times? Have you tried running it as a different user?

This is where it is failing:


private void setupPipelineForAppendOrRecovery() throws IOException {
    // Check number of datanodes. Note that if there is no healthy datanode,
    // this must be internal error because we mark external error in striped
    // outputstream only when all the streamers are in the DATA_STREAMING stage
    if (nodes == null || nodes.length == 0) {
      String msg = "Could not get block locations. " + "Source file \""
          + src + "\" - Aborting..." + this;
      lastException.set(new IOException(msg));
      streamerClosed = true;
      return;
    }
    setupPipelineInternal(nodes, storageTypes, storageIDs);
}



My datanodes are all normal.


hdfs dfsadmin -report
Configured Capacity: 707412281856 (658.83 GB)
Present Capacity: 592360489158 (551.68 GB)
DFS Remaining: 585697374208 (545.47 GB)
DFS Used: 6663114950 (6.21 GB)
DFS Used%: 1.12%
Replicated Blocks:
        Under replicated blocks: 5
        Blocks with corrupt replicas: 1
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 2
Erasure Coded Block Groups:
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Low redundancy blocks with highest priority to recover: 0
        Pending deletion blocks: 0

Live datanodes (3):

Name: (bgs1)
Hostname: bgs1
Decommission Status : Normal
Configured Capacity: 235804093952 (219.61 GB)
DFS Used: 2220986368 (2.07 GB)
Non DFS Used: 24545226240 (22.86 GB)
DFS Remaining: 195142991872 (181.74 GB)
DFS Used%: 0.94%
DFS Remaining%: 82.76%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Sat Jun 17 04:27:19 UTC 2023
Last Block Report: Sat Jun 17 04:17:01 UTC 2023
Num of Blocks: 130

Name: (bgs2)
Hostname: bgs2
Decommission Status : Normal
Configured Capacity: 235804093952 (219.61 GB)
DFS Used: 2220888064 (2.07 GB)
Non DFS Used: 24545361408 (22.86 GB)
DFS Remaining: 195277172736 (181.87 GB)
DFS Used%: 0.94%
DFS Remaining%: 82.81%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Sat Jun 17 04:27:20 UTC 2023
Last Block Report: Sat Jun 17 04:16:28 UTC 2023
Num of Blocks: 130

Name: (bgs3)
Hostname: bgs3
Decommission Status : Normal
Configured Capacity: 235804093952 (219.61 GB)
DFS Used: 2221240518 (2.07 GB)
Non DFS Used: 24544972090 (22.86 GB)
DFS Remaining: 195277209600 (181.87 GB)
DFS Used%: 0.94%
DFS Remaining%: 82.81%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Sat Jun 17 04:27:19 UTC 2023
Last Block Report: Sat Jun 17 04:17:38 UTC 2023
Num of Blocks: 130
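
A report like the one above is long, so a short parser can help pull out the values that matter here. This is a rough sketch that assumes the plain-text layout shown above; `summarize_report` is a made-up helper, not part of any Hadoop tool. It also surfaces the block counters, which in this report are not all zero (5 under-replicated blocks and 1 block with corrupt replicas).

```python
import re

# Cluster-level counters worth looking at in the "Replicated Blocks" section.
COUNTERS = (
    "Under replicated blocks",
    "Blocks with corrupt replicas",
    "Missing blocks",
)


def summarize_report(report):
    """Extract live-node count, block counters and per-node status."""
    summary = {"live_datanodes": None, "counters": {}, "nodes": []}

    match = re.search(r"Live datanodes \((\d+)\)", report)
    if match:
        summary["live_datanodes"] = int(match.group(1))

    for name in COUNTERS:
        # The colon must directly follow the name, so "Missing blocks" does
        # not match "Missing blocks (with replication factor 1)".
        match = re.search(rf"^\s*{re.escape(name)}:\s*(\d+)", report, re.MULTILINE)
        if match:
            summary["counters"][name] = int(match.group(1))

    current = None
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            current = {"hostname": line.split(":", 1)[1].strip()}
            summary["nodes"].append(current)
        elif current is not None and line.startswith("Decommission Status"):
            current["status"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("DFS Remaining%:"):
            current["remaining_pct"] = float(line.split(":", 1)[1].strip().rstrip("%"))
    return summary
```

The function takes the text of `hdfs dfsadmin -report` as a string, for example the captured standard output of the command.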



The same error happens when I use different users.


It might be worth mentioning that I used Docker containers to simulate the cluster. When I continued testing, I found that the HDFS service check also reports an error, so I suspect the Tez error is caused by an underlying HDFS error.

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HDFS/package/scripts/", line 167, in <module>
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/", line 352, in execute
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HDFS/package/scripts/", line 88, in service_check
  File "/usr/lib/ambari-agent/lib/resource_management/core/", line 166, in __init__
  File "/usr/lib/ambari-agent/lib/resource_management/core/", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/", line 124, in run_action
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/", line 677, in action_create_on_execute
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/", line 674, in action_delayed
    self.get_hdfs_resource_executor().action_delayed(action_name, self)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/", line 373, in action_delayed
    self.action_delayed_for_nameservice(None, action_name, main_resource)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/", line 403, in action_delayed_for_nameservice
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/", line 419, in _create_resource
    self._create_file(, source=self.main_resource.resource.source, mode=self.mode)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/", line 534, in _create_file
    self.util.run_command(target, 'CREATE', method='PUT', overwrite=True, assertable_result=False, file_to_put=source, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/", line 214, in run_command
    return self._run_command(*args, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/", line 295, in _run_command
    raise WebHDFSCallException(err_msg, result_dict)
resource_management.libraries.providers.hdfs_resource.WebHDFSCallException: Execution of 'curl -sS -L -w '%{http_code}' -X PUT --data-binary @/var/lib/ambari-agent/tmp/hdfs-service-check -H 'Content-Type: application/octet-stream' 'http://bgm:50070/webhdfs/v1/tmp/id13ac0200_date521723?op=CREATE&'' returned status_code=403. 
  "RemoteException": {
    "exception": "IOException", 
    "javaClassName": "", 
    "message": "File /tmp/id13ac0200_date521723 could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.\n\tat org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$\n\tat org.apache.hadoop.ipc.RPC$\n\tat org.apache.hadoop.ipc.Server$\n\tat org.apache.hadoop.ipc.Server$\n\tat Method)\n\tat\n\tat\n\tat org.apache.hadoop.ipc.Server$\n"

There are 3 datanode(s) running and 3 node(s) are excluded in this operation. Why?
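
One possibility that would fit this message, though nothing in the logs here confirms it, is that the writer can reach the NameNode but cannot open a connection to any datanode's data-transfer port, so each datanode is tried and then excluded. A quick way to test that from the machine doing the write is a plain TCP connect. In this sketch, `can_connect` and `check_datanodes` are made-up helpers, and the port number is an assumption that has to be taken from `dfs.datanode.address` in your own configuration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_datanodes(hosts, port, timeout=3.0):
    """Map each datanode host to whether its data-transfer port is reachable."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

For the cluster above this could be called as `check_datanodes(["bgs1", "bgs2", "bgs3"], 50010)`, where 50010 is only a guess at the configured port. If any host comes back `False` from the client side, the network setup between the containers is worth a closer look.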

