Created on 08-12-2022 10:37 AM - edited 09-16-2022 07:46 AM
My cluster is CDH 5.16.2. Putting a file into HDFS fails with NotReplicatedYetException (log below).
Please help.
##log##
[xxx@hdpwxxx1.true.care]:/reserve1>hdfs dfs -put 30.img /tmp/41.img
22/08/12 17:49:55 INFO hdfs.DFSClient: Exception while adding a block
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException): Not replicated yet: /tmp/41.img._COPYING_
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3688)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3477)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:694)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:219)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:507)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)
at org.apache.hadoop.ipc.Client.call(Client.java:1504)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:425)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1875)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1671)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:790)
22/08/12 17:49:55 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /tmp/41.img._COPYING_ retries left 4
22/08/12 17:49:57 INFO hdfs.DFSClient: Exception while adding a block
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException): Not replicated yet: /tmp/41.img._COPYING_
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3688)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3477)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:694)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:219)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:507)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)
at org.apache.hadoop.ipc.Client.call(Client.java:1504)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:425)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1875)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1671)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:790)
22/08/12 17:49:57 INFO hdfs.DFSClient: Waiting for replication for 5 seconds
22/08/12 17:49:57 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /tmp/41.img._COPYING_ retries left 3
[superuser@hdpwdbpr1.true.care]:/reserve1>
################# I also tried distcp
[superuser@xxx1.true.care]:/reserve1>hadoop distcp -Ddfs.block.size=$[256*1024*1024] file:///reserve1/30.img /tmp
22/08/13 00:14:30 INFO tools.OptionsParser: parseChunkSize: blocksperchunk false
22/08/13 00:14:31 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=false, useRdiff=false, fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[file:/reserve1/30.img], targetPath=/tmp, targetPathExists=true, filtersFile='null', blocksPerChunk=0, copyBufferSize=8192}
22/08/13 00:14:32 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; dirCnt = 0
22/08/13 00:14:32 INFO tools.SimpleCopyListing: Build file listing completed.
22/08/13 00:14:32 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
22/08/13 00:14:32 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
22/08/13 00:14:32 INFO tools.DistCp: Number of paths in the copy list: 1
22/08/13 00:14:32 INFO tools.DistCp: Number of paths in the copy list: 1
22/08/13 00:14:32 INFO hdfs.DFSClient: Created token for hdppr2_cdretl1: HDFS_DELEGATION_TOKEN owner=hdppr2_cdretl1@TRUE.CARE, renewer=yarn, realUser=, issueDate=1660324472531, maxDate=1660929272531, sequenceNumber=191747308, masterKeyId=2346 on ha-hdfs:HDPPR2-NNHA
22/08/13 00:14:32 INFO security.TokenCache: Got dt for hdfs://HDPPR2-NNHA; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:HDPPR2-NNHA, Ident: (token for hdppr2_cdretl1: HDFS_DELEGATION_TOKEN owner=hdppr2_cdretl1@TRUE.CARE, renewer=yarn, realUser=, issueDate=1660324472531, maxDate=1660929272531, sequenceNumber=191747308, masterKeyId=2346)
22/08/13 00:14:32 INFO mapreduce.JobSubmitter: number of splits:1
22/08/13 00:14:32 INFO Configuration.deprecation: dfs.block.size is deprecated. Instead, use dfs.blocksize
22/08/13 00:14:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1660296669893_0347
22/08/13 00:14:32 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:HDPPR2-NNHA, Ident: (token for hdppr2_cdretl1: HDFS_DELEGATION_TOKEN owner=xxx_cdretl1@TRUE.CARE, renewer=yarn, realUser=, issueDate=1660324472531, maxDate=1660929272531, sequenceNumber=191747308, masterKeyId=2346)
22/08/13 00:14:33 INFO impl.YarnClientImpl: Submitted application application_1660296669893_0347
22/08/13 00:14:33 INFO mapreduce.Job: The url to track the job: http://xxx.true.care:8088/proxy/application_1660296669893_0347/
22/08/13 00:14:33 INFO tools.DistCp: DistCp job-id: job_1660296669893_0347
22/08/13 00:14:33 INFO mapreduce.Job: Running job: job_1660296669893_0347
22/08/13 00:14:40 INFO mapreduce.Job: Job job_1660296669893_0347 running in uber mode : false
22/08/13 00:14:40 INFO mapreduce.Job: map 0% reduce 0%
22/08/13 00:14:46 INFO mapreduce.Job: Task Id : attempt_1660296669893_0347_m_000000_0, Status : FAILED
Error: java.io.IOException: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.FileNotFoundException: File file:/reserve1/30.img does not exist
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:227)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:52)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.FileNotFoundException: File file:/reserve1/30.img does not exist
... 10 more
Caused by: java.io.FileNotFoundException: File file:/reserve1/30.img does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:598)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:811)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:588)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:432)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:220)
... 9 more
22/08/13 00:14:51 INFO mapreduce.Job: Task Id : attempt_1660296669893_0347_m_000000_1, Status : FAILED
Error: java.io.IOException: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.FileNotFoundException: File file:/reserve1/30.img does not exist
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:227)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:52)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.FileNotFoundException: File file:/reserve1/30.img does not exist
... 10 more
Caused by: java.io.FileNotFoundException: File file:/reserve1/30.img does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:598)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:811)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:588)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:432)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:220)
... 9 more
22/08/13 00:15:13 INFO mapreduce.Job: Task Id : attempt_1660296669893_0347_m_000000_2, Status : FAILED
Error: java.io.IOException: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.FileNotFoundException: File file:/reserve1/30.img does not exist
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:227)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:52)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.FileNotFoundException: File file:/reserve1/30.img does not exist
... 10 more
Caused by: java.io.FileNotFoundException: File file:/reserve1/30.img does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:598)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:811)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:588)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:432)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:220)
... 9 more
22/08/13 00:15:18 INFO mapreduce.Job: map 100% reduce 0%
22/08/13 00:15:18 INFO mapreduce.Job: Job job_1660296669893_0347 failed with state FAILED due to: Task failed task_1660296669893_0347_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
22/08/13 00:15:18 INFO mapreduce.Job: Counters: 8
Job Counters
Failed map tasks=4
Launched map tasks=4
Other local map tasks=4
Total time spent by all maps in occupied slots (ms)=248600
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=31075
Total vcore-milliseconds taken by all map tasks=31075
Total megabyte-milliseconds taken by all map tasks=127283200
22/08/13 00:15:18 ERROR tools.DistCp: Exception encountered
java.io.IOException: DistCp failure: Job job_1660296669893_0347 has failed: Task failed task_1660296669893_0347_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:195)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:493)
Created 08-12-2022 01:07 PM
Hi @BORDIN, are you able to copy any other files to HDFS using hdfs dfs -put?
Does this happen all the time?
This problem can be caused by a range of different issues that either delay the DataNode block reports from reaching the NameNode or delay the NameNode in processing them.
Can you check the NameNode log for any WARN/ERROR messages around the time the put was run?
If the NameNode is busy, we can allow more retries and thereby give the NameNode more time to complete the block write.
- Go to Cloudera Manager -> HDFS -> Configuration -> HDFS Client Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml and add an entry like the following:
<property>
  <name>dfs.client.block.write.locateFollowingBlock.retries</name>
  <value>10</value>
</property>
- Save changes, restart the stale services, and deploy the client configuration.
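As a quick way to verify whether the retry setting helps before changing the service configuration, the same property can also be passed per command from the client. The sketch below additionally shows one way to check cluster health around the failure; the log path is an assumption for a typical CM-managed CDH layout, so adjust it to your hosts.

# Per-command override of the client retry count (same property as the safety valve above)
hdfs dfs -D dfs.client.block.write.locateFollowingBlock.retries=10 -put 30.img /tmp/41.img

# Quick look at DataNode health and block reporting while the put is failing
hdfs dfsadmin -report | head -n 30

# WARN/ERROR in the NameNode log around the time of the failure (17:49 in the log above)
# Log location is an assumption; on CM-managed nodes it is usually under /var/log/hadoop-hdfs
grep -E "WARN|ERROR" /var/log/hadoop-hdfs/*NAMENODE*.log.out | grep "17:49"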
Created 08-19-2022 01:19 AM
This is not working. I tested with the following command, and it still fails:
hadoop fs -D dfs.client.block.write.locateFollowingBlock.retries=10 -copyFromLocal <file> <hdfs_path>
When I test hdfs dfs -put of a file to HDFS, it fails as follows:
--> error is: java.io.IOException: Unable to close file because the last block BP-1523801842-xxxxx-1491276361828:blk_6867946754_5796409008 does not have enough number of replicas.
Please help, this is a major incident.
BR
Bordin S.
Created 08-19-2022 09:46 AM
Hi @BORDIN,
The parameter "dfs.client.block.write.locateFollowingBlock.retries" is meant to tackle exactly this situation, where the file cannot be closed, by increasing the number of retries. More info on this is here. As it is still failing, I would suggest looking in the NameNode log for the reason behind the failure. You can grep for block "blk_6867946754_5796409008", or check the NameNode and DataNode logs for any WARN/ERROR messages from the time the put operation was run.
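For example, a rough sketch of what that grep could look like (the log locations are assumptions for a CM-managed cluster, and the commands need to be run on the relevant NameNode/DataNode hosts):

# On the active NameNode host: trace what happened to the reported block
grep "blk_6867946754_5796409008" /var/log/hadoop-hdfs/*NAMENODE*.log.out

# On the DataNodes in the write pipeline: look for errors while receiving the block
grep "blk_6867946754_5796409008" /var/log/hadoop-hdfs/*DATANODE*.log.out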
Created 08-22-2022 11:41 AM
@BORDIN Has the reply above helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks!
Regards,
Diana Torres
Created 08-16-2022 07:37 AM
@BORDIN Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. If you are still experiencing the issue, can you provide the information @rki_ has requested?
Regards,
Diana Torres