Member since 09-29-2014
224 Posts
11 Kudos Received
10 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1860 | 01-24-2024 10:45 PM |
| | 6132 | 03-30-2022 08:56 PM |
| | 4677 | 08-12-2021 10:40 AM |
| | 10830 | 04-28-2021 01:30 AM |
03-30-2022 08:56 PM
1 Kudo
It's done. After I set the storage policy to ALL_SSD and restarted all the services, this error disappeared.
03-30-2022 01:31 PM
I followed https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_sg_ldap_grp_mappings.html#ldap_group_mapping to set up the OpenLDAP integration: 1. Installed OpenLDAP. 2. Set the LDAP parameters according to the documentation. 3. Restarted all services.
03-30-2022 11:47 AM
As you know, this file is located in many paths: NameNode, DataNode, YARN, HBase. The file is created by CDH. Do you suggest that I change the permissions on these paths? If I restart one of these roles, I think the file would be created again, and its permission would still be 700.
03-30-2022 10:48 AM
Hi, after integrating CDH with OpenLDAP, I found a WARNING in the container log, shown below. The process tries to read the password file creds.localjceks and gets "Permission denied".
2022-03-31 00:53:13,420 WARN [main] org.apache.hadoop.security.LdapGroupsMapping: Exception while trying to get password for alias hadoop.security.group.mapping.ldap.ssl.keystore.password:
java.io.IOException: Configuration problem with provider path.
at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2118)
at org.apache.hadoop.conf.Configuration.getPassword(Configuration.java:2037)
at org.apache.hadoop.security.LdapGroupsMapping.getPassword(LdapGroupsMapping.java:528)
at org.apache.hadoop.security.LdapGroupsMapping.setConf(LdapGroupsMapping.java:473)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.security.Groups.<init>(Groups.java:104)
at org.apache.hadoop.security.Groups.<init>(Groups.java:100)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:435)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:341)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:308)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:895)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:861)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:728)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:387)
Caused by: java.io.FileNotFoundException: /run/cloudera-scm-agent/process/9392-yarn-NODEMANAGER/creds.localjceks (Permission denied)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.getInputStreamForFile(LocalJavaKeyStoreProvider.java:83)
at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.locateKeystore(AbstractJavaKeyStoreProvider.java:334)
at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.<init>(AbstractJavaKeyStoreProvider.java:88)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.<init>(LocalJavaKeyStoreProvider.java:58)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.<init>(LocalJavaKeyStoreProvider.java:50)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider$Factory.createProvider(LocalJavaKeyStoreProvider.java:177)
at org.apache.hadoop.security.alias.CredentialProviderFactory.getProviders(CredentialProviderFactory.java:73)
at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2098)
This warning doesn't affect the MapReduce job; I just want to know how to resolve it.
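To see who is allowed to read the credential file named in the stack trace, a small check of its mode bits may help. This is only a sketch: the helper name `describe_access` is hypothetical, it looks at plain permission bits only (no ACLs), and it has to be run by a user that can at least stat the file.

```python
import os
import stat


def describe_access(path):
    """Return the file's octal mode, owner uid, and group/other readability."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)  # keep only the permission bits
    return {
        "mode": oct(mode),
        "owner_uid": st.st_uid,
        "group_readable": bool(mode & stat.S_IRGRP),
        "other_readable": bool(mode & stat.S_IROTH),
    }


if __name__ == "__main__":
    import sys

    # Usage: python check_access.py <path to creds.localjceks>
    print(describe_access(sys.argv[1]))
```

If the container process runs as a different user than the file's owner (an assumption about this cluster, not something the log states), a file readable only by its owner would be consistent with the "Permission denied" above.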
Labels: Apache Hadoop
03-25-2022 03:06 PM
Recently I set up a new CDH cluster with all-SSD disks. After the cluster went live, I found that the NameNode log keeps printing WARNING messages like these:
2022-03-26 06:00:57,688 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
2022-03-26 06:00:57,688 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology.
I wanted to know what exactly was happening, so I enabled the DEBUG log:
2022-03-26 05:56:50,837 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}, newBlock=true)
2022-03-26 05:56:50,837 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.20.103:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:50,837 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to choose from local rack (location = /default); the second replica is not found, retry choosing randomly
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:827)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:715)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:622)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalStorage(BlockPlacementPolicyDefault.java:582)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:485)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:416)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:445)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:292)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:143)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:159)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2094)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2673)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
2022-03-26 05:56:50,837 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}, newBlock=true)
2022-03-26 05:56:50,837 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to choose remote rack (location = ~/default), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:827)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:689)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:494)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:416)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:465)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:445)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:292)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:143)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:159)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2094)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2673)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
2022-03-26 05:56:50,837 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to choose remote rack (location = ~/default), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:827)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:689)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:503)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:416)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:465)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:445)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:292)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:143)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:159)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2094)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2673)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
One message looks very strange to me: it says the nodes do not have enough space. But this is a new cluster, and every node still has 8 TB of free space.
2022-03-26 05:56:45,328 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.23.103:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:46,724 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.23.27:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:46,724 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.23.27:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:50,836 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.20.103:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:50,837 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.20.103:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:51,777 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.21.31:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:51,778 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.21.31:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:57,978 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.21.228:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:57,978 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.21.228:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
Does anyone know how to handle this kind of error?
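To get an overview of which nodes the NameNode considers out of space, the DEBUG lines can be summarized with a short script. This is a rough sketch: the function name `nodes_without_space` is hypothetical, and the regular expression is written against the message format quoted in this post, which may differ in other Hadoop versions.

```python
import re
from collections import Counter

# Matches messages like:
# "The node 10.228.23.103:9866 does not have enough SSD space
#  (required=268435456, scheduled=0, remaining=0)."
PATTERN = re.compile(
    r"The node (?P<node>\S+) does not have enough (?P<type>\w+) space "
    r"\(required=(?P<required>\d+), scheduled=(?P<scheduled>\d+), "
    r"remaining=(?P<remaining>\d+)\)"
)


def nodes_without_space(log_text):
    """Count, per (node, storage type), the messages that report remaining=0."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        if int(match.group("remaining")) == 0:
            counts[(match.group("node"), match.group("type"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < namenode-debug.log
    for (node, storage), n in sorted(nodes_without_space(sys.stdin.read()).items()):
        print(f"{node} {storage} remaining=0 reported {n} time(s)")
```

Here required=268435456 is 256 MiB, and remaining=0 is reported for SSD storage on every node listed, even though the disks are nearly empty. One thing that may be worth checking (an assumption, not something confirmed in this thread) is whether the DataNode data directories are tagged as `[SSD]` in `dfs.datanode.data.dir`, since untagged directories are treated as DISK.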
03-23-2022 01:21 AM
1 Kudo
Oh, this is an issue from a long time ago. The root cause was that the charset on the new machines was not UTF-8. Just make sure the charset on all machines is UTF-8, and then it's OK.
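A quick way to compare machines is to look at the locale environment variables on each one. The sketch below assumes that "charset" here refers to the locale settings (`LC_ALL`, `LC_CTYPE`, `LANG`) seen by the service processes; the post does not say exactly which setting was changed, and both helper names are hypothetical.

```python
import os


def is_utf8_locale(value):
    """True if a LANG/LC_ALL style value such as 'en_US.UTF-8' names UTF-8."""
    if not value or "." not in value:
        return False  # e.g. "C" or "POSIX" carry no codeset part
    codeset = value.split(".", 1)[1].split("@", 1)[0]  # drop any @modifier
    return codeset.replace("-", "").lower() == "utf8"


def effective_locale(env=None):
    """Return the first non-empty variable in POSIX precedence order."""
    env = os.environ if env is None else env
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        if env.get(name):
            return env[name]
    return ""


if __name__ == "__main__":
    value = effective_locale()
    print(value or "(unset)", "UTF-8" if is_utf8_locale(value) else "not UTF-8")
```

Running this on an old node and on a new node should show whether their settings differ. Note that it reports the environment of the shell it runs in, which is not necessarily the environment of the DataNode process.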
08-12-2021 10:40 AM
Here are more details about this CDH cluster. The original cluster runs CDH 5.14 on CentOS 6.5 with RHEL6 parcels. Recently I added new machines to the cluster; they run CentOS 7.6 with RHEL7 parcels. All of these errors happen only on the new RHEL7 machines. The old DataNodes don't have these errors.
08-02-2021 02:30 PM
I have found that one of my CDH clusters has a lot of errors on every DataNode; the error logs are below. Has anyone experienced this issue, and can you give me some advice?
2021-08-03 05:23:43,389 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061604_2849065475 src: /10.37.54.218:36088 dest: /10.37.54.218:1004
2021-08-03 05:23:43,700 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.218:36082, dest: /10.37.54.218:1004, bytes: 358, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-859199005_222, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061597_2849065468, duration: 59733778
2021-08-03 05:23:43,700 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061597_2849065468, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:43,833 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.218:36088, dest: /10.37.54.218:1004, bytes: 309, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-859199005_222, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061604_2849065475, duration: 200220559
2021-08-03 05:23:43,833 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061604_2849065475, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:44,044 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061619_2849065490 src: /10.37.54.15:59320 dest: /10.37.54.218:1004
2021-08-03 05:23:44,058 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.15:59320, dest: /10.37.54.218:1004, bytes: 112, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1165227557_139, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061619_2849065490, duration: 3752037
2021-08-03 05:23:44,058 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061619_2849065490, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:45,037 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061679_2849065550 src: /10.37.54.218:36108 dest: /10.37.54.218:1004
2021-08-03 05:23:45,185 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.218:36108, dest: /10.37.54.218:1004, bytes: 1415899, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1849481388_3452, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061679_2849065550, duration: 61038196
2021-08-03 05:23:45,185 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061679_2849065550, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:45,497 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Moved BP-2123011416-10.37.54.12-1457006347704:blk_3802213701_2741214333 from /10.37.54.13:44312, delHint=6a0ea409-35ad-42c5-956d-44a5b9bd58a6
2021-08-03 05:23:45,703 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061646_2849065517 src: /10.37.54.216:54728 dest: /10.37.54.218:1004
2021-08-03 05:23:45,714 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received BP-2123011416-10.37.54.12-1457006347704:blk_3910061646_2849065517 src: /10.37.54.216:54728 dest: /10.37.54.218:1004 of size 4786053
2021-08-03 05:23:45,998 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Moved BP-2123011416-10.37.54.12-1457006347704:blk_1842563008_775812314 from /10.37.54.13:50434, delHint=6a0ea409-35ad-42c5-956d-44a5b9bd58a6
2021-08-03 05:23:46,042 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: BlockSender.sendChunks() exception:
java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:605)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:789)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:736)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:551)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:148)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:103)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
at java.lang.Thread.run(Thread.java:745)
2021-08-03 05:23:46,043 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: BlockSender.sendChunks() exception:
java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:605)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:789)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:736)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:551)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:148)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:103)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
at java.lang.Thread.run(Thread.java:745)
2021-08-03 05:23:47,003 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061723_2849065594 src: /10.37.54.216:54770 dest: /10.37.54.218:1004
2021-08-03 05:23:47,018 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061724_2849065595 src: /10.37.54.216:54772 dest: /10.37.54.218:1004
2021-08-03 05:23:47,019 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.216:54772, dest: /10.37.54.218:1004, bytes: 4158, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1438538333_1, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061724_2849065595, duration: 1392081
2021-08-03 05:23:47,019 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061724_2849065595, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2021-08-03 05:23:47,048 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061725_2849065596 src: /10.37.54.216:54774 dest: /10.37.54.218:1004
2021-08-03 05:23:47,056 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.216:54774, dest: /10.37.54.218:1004, bytes: 69, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1452909160_189, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061725_2849065596, duration: 7712861
2021-08-03 05:23:47,056 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061725_2849065596, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2021-08-03 05:23:47,371 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061731_2849065602 src: /10.37.54.218:36198 dest: /10.37.54.218:1004
2021-08-03 05:23:47,407 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.218:36198, dest: /10.37.54.218:1004, bytes: 314, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_466653976_222, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061731_2849065602, duration: 11069615
2021-08-03 05:23:47,407 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061731_2849065602, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:47,422 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061732_2849065603 src: /10.37.54.218:36202 dest: /10.37.54.218:1004
2021-08-03 05:23:47,458 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.218:36202, dest: /10.37.54.218:1004, bytes: 17456, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_466653976_222, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061732_2849065603, duration: 9623611
2021-08-03 05:23:47,458 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061732_2849065603, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:47,497 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Moved BP-2123011416-10.37.54.12-1457006347704:blk_2529434549_1466543157 from /10.37.54.13:39396, delHint=6a0ea409-35ad-42c5-956d-44a5b9bd58a6
05-05-2021 07:52 PM
In my previous experience, I have never set the port range; the default port range is 32768-65536. So my only question is: why can't ports around 1000 be connected? Could you give me some information?
04-28-2021 01:30 AM
This issue has now been solved. The investigation went as follows.
When the development team reported the issue, they told me that some tasks were failing and asked me how to solve it. I opened the YARN web UI to check the exact error and found a connection timeout. That was my first clue. So I wondered why the port couldn't be reached. Maybe there was a firewall? Or maybe one machine had a problem, and the issue happened whenever a task was assigned to it? Those were my assumptions, and after two days of checking, the answer was no: there was no firewall, and the issue happened randomly on every machine.
Just last night, I noticed that if the connection port was near 1000, the job failed with a connection timeout, but if the port was around 30000 or higher, there was no problem. So I checked sysctl.conf and found that the port range was set to "net.ipv4.ip_local_port_range = 1024 65000". Finally I changed the port range to "32768 65000", and the issue was solved.
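For checking other machines for the same setting, the current range can be read from `/proc/sys/net/ipv4/ip_local_port_range` on Linux. The following is a minimal sketch: the function names are hypothetical, and the `floor` of 32768 is only the threshold suggested by this thread, not a general rule.

```python
PROC_PATH = "/proc/sys/net/ipv4/ip_local_port_range"


def parse_port_range(text):
    """Parse 'low<whitespace>high' as found in ip_local_port_range."""
    low, high = (int(part) for part in text.split())
    if not 0 < low <= high <= 65535:
        raise ValueError(f"invalid port range: {low}-{high}")
    return low, high


def starts_below(text, floor=32768):
    """True if the range allows local ports lower than `floor`."""
    low, _ = parse_port_range(text)
    return low < floor


def read_port_range(path=PROC_PATH):
    """Read the live setting (Linux only)."""
    with open(path) as handle:
        return parse_port_range(handle.read())


if __name__ == "__main__":
    low, high = read_port_range()
    print(f"ip_local_port_range = {low} {high}")
```

With the value from the post, `starts_below("1024 65000")` returns True, which flags the configuration that allowed low-numbered local ports.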