Created on 08-13-2019 02:03 AM - edited 08-17-2019 04:12 PM
Hi!
I'm trying to upload a CSV file to HDP 3.0 (HDFS 3.1.1) with Pentaho Spoon 8.3 (the Hadoop Copy Files step).
Got the same error as here: https://community.hortonworks.com/questions/108440/copying-files-to-hadoop-error.html
"Could not close the output stream for file"
An empty file appeared in HDFS as a result. The host OS is Windows 8; in Pentaho.bat I set the user to hdfs:
set HADOOP_USER_NAME=hdfs
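Side note: HADOOP_USER_NAME is only honored with simple (non-Kerberos) authentication, so it may also be worth confirming on the sandbox that the target directory really is writable by that user - a quick check, with paths assumed from above:
sudo -u hdfs hdfs dfs -ls -d /user/hdfs
sudo -u hdfs hdfs dfs -touchz /user/hdfs/probe.txt
The first command shows the owner and permissions of the target dir; the second should create an empty file if permissions are fine.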
I enabled debug logging for HDFS via the advanced log4j settings in Ambari (mostly by guesswork and reading the GitHub sources):
log4j.logger.org.apache.hadoop.ipc=DEBUG
log4j.logger.org.apache.hadoop.net=DEBUG
log4j.logger.org.apache.hadoop.net.NetworkTopology=DEBUG
log4j.logger.org.apache.hadoop.fs=DEBUG
log4j.logger.org.apache.hadoop.fs.FSNamesystem=DEBUG
log4j.logger.org.apache.hadoop.fs.FSDataOutputStream=DEBUG
log4j.logger.org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy=DEBUG
log4j.logger.org.apache.hadoop.hdfs.StateChange=DEBUG
log4j.logger.org.apache.hadoop.hdfs.DFSClient=DEBUG
log4j.logger.org.apache.hadoop.hdfs.protocol=DEBUG
Hadoop Root Logger = DEBUG,RFA
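These settings only cover the NameNode side; the DataStreamer exception that explains an abandoned block is logged by the client. If Spoon picks up a standard log4j configuration (an assumption - the exact file depends on the PDI install), the matching client-side entries would be:
log4j.logger.org.apache.hadoop.hdfs.DFSClient=DEBUG
log4j.logger.org.apache.hadoop.hdfs.DataStreamer=DEBUG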
Now I get a log like this:
2019-08-12 06:30:00,840 DEBUG org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer (NameNodeRpcServer.java:create(757)) - *DIR* NameNode.create: file /user/hdfs/riskfactor.csv for DFSClient_NONMAPREDUCE_1981358968_179 at 172.18.0.3
2019-08-12 06:30:00,841 DEBUG org.apache.hadoop.hdfs.server.namenode.FSNamesystem (FSNamesystem.java:startFileInt(2378)) - DIR* NameSystem.startFile: src=/user/hdfs/riskfactor.csv, holder=DFSClient_NONMAPREDUCE_1981358968_179, clientMachine=172.18.0.3, createParent=true, replication=1, createFlag=[CREATE, OVERWRITE], blockSize=134217728, supportedVersions=[CryptoProtocolVersion{description='Encryption zones', version=2, unknownValue=null}]
2019-08-12 06:30:00,842 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp (FSDirDeleteOp.java:delete(55)) - DIR* FSDirectory.delete: /user/hdfs/riskfactor.csv
2019-08-12 06:30:00,845 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp (FSDirDeleteOp.java:unprotectedDelete(269)) - DIR* FSDirectory.unprotectedDelete: /user/hdfs/riskfactor.csv is removed
2019-08-12 06:30:00,848 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:addFile(571)) - DIR* addFile: riskfactor.csv is added
2019-08-12 06:30:00,849 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:startFile(407)) - DIR* NameSystem.startFile: added /user/hdfs/riskfactor.csv inode 681603 DFSClient_NONMAPREDUCE_1981358968_179
2019-08-12 06:30:00,851 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog (FSEditLog.java:printStatistics(774)) - Number of transactions: 2 Total time for transactions(ms): 4 Number of transactions batched in Syncs: 1967516 Number of syncs: 2 SyncTimes(ms): 11
2019-08-12 06:30:00,852 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker (ProtobufRpcEngine.java:call(550)) - Served: create, queueTime= 0 procesingTime= 14
2019-08-12 06:30:00,853 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1468)) - IPC Server handler 12 on 8020: responding to Call#178 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 172.18.0.3:41538
2019-08-12 06:30:00,853 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1487)) - IPC Server handler 12 on 8020: responding to Call#178 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 172.18.0.3:41538 Wrote 91 bytes.
2019-08-12 06:30:01,016 DEBUG org.apache.hadoop.ipc.Server$Connection (Server.java:processOneRpc(2355)) - got #179
2019-08-12 06:30:01,017 DEBUG org.apache.hadoop.ipc.Server$Handler (Server.java:run(2670)) - IPC Server handler 20 on 8020: Call#179 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.18.0.3:41538 for RpcKind RPC_PROTOCOL_BUFFER
2019-08-12 06:30:01,018 DEBUG org.apache.hadoop.hdfs.server.namenode.FSNamesystem (FSNamesystem.java:getAdditionalBlock(2684)) - BLOCK* getAdditionalBlock: /user/hdfs/riskfactor.csv inodeId 681603 for DFSClient_NONMAPREDUCE_1981358968_179
2019-08-12 06:30:01,041 DEBUG org.apache.hadoop.net.NetworkTopology (NetworkTopology.java:chooseRandom(538)) - Choosing random from 1 available nodes on node /default-rack, scope=/default-rack, excludedScope=null, excludeNodes=[]. numOfDatanodes=1.
2019-08-12 06:30:01,041 DEBUG org.apache.hadoop.net.NetworkTopology (NetworkTopology.java:chooseRandom(547)) - chooseRandom returning 172.18.0.2:50010
2019-08-12 06:30:01,043 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:addBlock(516)) - DIR* FSDirectory.addBlock: /user/hdfs/riskfactor.csv with blk_1074001529_260815 block is added to the in-memory file system
2019-08-12 06:30:01,043 INFO org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:logAllocatedBlock(790)) - BLOCK* allocate blk_1074001529_260815, replicas=172.18.0.2:50010 for /user/hdfs/riskfactor.csv
2019-08-12 06:30:01,043 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:persistNewBlock(750)) - persistNewBlock: /user/hdfs/riskfactor.csv with new block blk_1074001529_260815, current total block count is 1
2019-08-12 06:30:01,059 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker (ProtobufRpcEngine.java:call(550)) - Served: addBlock, queueTime= 1 procesingTime= 42
2019-08-12 06:30:01,059 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1468)) - IPC Server handler 20 on 8020: responding to Call#179 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.18.0.3:41538
2019-08-12 06:30:01,059 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1487)) - IPC Server handler 20 on 8020: responding to Call#179 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.18.0.3:41538 Wrote 490 bytes.
2019-08-12 06:30:01,188 DEBUG org.apache.hadoop.ipc.Server$Connection (Server.java:processOneRpc(2355)) - got #180
2019-08-12 06:30:01,189 DEBUG org.apache.hadoop.ipc.Server$Handler (Server.java:run(2670)) - IPC Server handler 7 on 8020: Call#180 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getServerDefaults from 172.18.0.3:41538 for RpcKind RPC_PROTOCOL_BUFFER
2019-08-12 06:30:01,195 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker (ProtobufRpcEngine.java:call(550)) - Served: getServerDefaults, queueTime= 1 procesingTime= 6
2019-08-12 06:30:01,195 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1468)) - IPC Server handler 7 on 8020: responding to Call#180 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getServerDefaults from 172.18.0.3:41538
2019-08-12 06:30:01,195 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1487)) - IPC Server handler 7 on 8020: responding to Call#180 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getServerDefaults from 172.18.0.3:41538 Wrote 64 bytes.
2019-08-12 06:30:01,243 DEBUG org.apache.hadoop.ipc.Server$Connection (Server.java:processOneRpc(2355)) - got #181
2019-08-12 06:30:01,243 DEBUG org.apache.hadoop.ipc.Server$Handler (Server.java:run(2670)) - IPC Server handler 11 on 8020: Call#181 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.abandonBlock from 172.18.0.3:41538 for RpcKind RPC_PROTOCOL_BUFFER
2019-08-12 06:30:01,244 DEBUG org.apache.hadoop.hdfs.server.namenode.FSNamesystem (FSNamesystem.java:abandonBlock(2785)) - BLOCK* NameSystem.abandonBlock: BP-1419118625-172.17.0.2-1543512323726:blk_1074001529_260815 of file /user/hdfs/riskfactor.csv
2019-08-12 06:30:01,244 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:unprotectedRemoveBlock(91)) - DIR* FSDirectory.removeBlock: /user/hdfs/riskfactor.csv with blk_1074001529_260815 block is removed from the file system
2019-08-12 06:30:01,244 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:persistBlocks(111)) - persistBlocks: /user/hdfs/riskfactor.csv with 0 blocks is persisted to the file system
2019-08-12 06:30:01,244 DEBUG org.apache.hadoop.hdfs.server.namenode.FSNamesystem (FSNamesystem.java:abandonBlock(2794)) - BLOCK* NameSystem.abandonBlock: BP-1419118625-172.17.0.2-1543512323726:blk_1074001529_260815 is removed from pendingCreates
2019-08-12 06:30:01,244 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker (ProtobufRpcEngine.java:call(550)) - Served: abandonBlock, queueTime= 0 procesingTime= 1
2019-08-12 06:30:01,245 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1468)) - FSEditLogAsync: responding to Call#181 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.abandonBlock from 172.18.0.3:41538
2019-08-12 06:30:01,245 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1487)) - FSEditLogAsync: responding to Call#181 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.abandonBlock from 172.18.0.3:41538 Wrote 33 bytes.
2019-08-12 06:30:01,247 DEBUG org.apache.hadoop.ipc.Server$Connection (Server.java:processOneRpc(2355)) - got #182
2019-08-12 06:30:01,247 DEBUG org.apache.hadoop.ipc.Server$Handler (Server.java:run(2670)) - IPC Server handler 5 on 8020: Call#182 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.18.0.3:41538 for RpcKind RPC_PROTOCOL_BUFFER
2019-08-12 06:30:01,249 DEBUG org.apache.hadoop.hdfs.server.namenode.FSNamesystem (FSNamesystem.java:getAdditionalBlock(2684)) - BLOCK* getAdditionalBlock: /user/hdfs/riskfactor.csv inodeId 681603 for DFSClient_NONMAPREDUCE_1981358968_179
2019-08-12 06:30:01,250 DEBUG org.apache.hadoop.net.NetworkTopology (NetworkTopology.java:chooseRandom(538)) - Choosing random from 0 available nodes on node /default-rack, scope=/default-rack, excludedScope=null, excludeNodes=[172.18.0.2:50010]. numOfDatanodes=1.
2019-08-12 06:30:01,250 DEBUG org.apache.hadoop.net.NetworkTopology (NetworkTopology.java:chooseRandom(547)) - chooseRandom returning null
2019-08-12 06:30:01,250 DEBUG org.apache.hadoop.hdfs.net.DFSNetworkTopology (DFSNetworkTopology.java:chooseRandomWithStorageTypeTwoTrial(125)) - No node to choose.
2019-08-12 06:30:01,250 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault (BlockPlacementPolicyDefault.java:chooseLocalRack(647)) - Failed to choose from local rack (location = /default-rack); the second replica is not found, retry choosing randomly
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:834)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:722)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:629)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalStorage(BlockPlacementPolicyDefault.java:589)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:488)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:419)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:295)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:148)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:164)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2113)
    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:286)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2706)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:875)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:561)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
2019-08-12 06:30:01,250 DEBUG org.apache.hadoop.net.NetworkTopology (NetworkTopology.java:chooseRandom(538)) - Choosing random from 0 available nodes on node /, scope=, excludedScope=null, excludeNodes=[172.18.0.2:50010]. numOfDatanodes=1.
2019-08-12 06:30:01,250 DEBUG org.apache.hadoop.net.NetworkTopology (NetworkTopology.java:chooseRandom(547)) - chooseRandom returning null
2019-08-12 06:30:01,250 DEBUG org.apache.hadoop.hdfs.net.DFSNetworkTopology (DFSNetworkTopology.java:chooseRandomWithStorageTypeTwoTrial(125)) - No node to choose.
2019-08-12 06:30:01,251 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault (BlockPlacementPolicyDefault.java:chooseTarget(432)) - Failed to place enough replicas, still in need of 1 to reach 1 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true)
2019-08-12 06:30:01,251 DEBUG org.apache.hadoop.net.NetworkTopology (NetworkTopology.java:chooseRandom(538)) - Choosing random from 0 available nodes on node /default-rack, scope=/default-rack, excludedScope=null, excludeNodes=[172.18.0.2:50010]. numOfDatanodes=1.
2019-08-12 06:30:01,251 DEBUG org.apache.hadoop.net.NetworkTopology (NetworkTopology.java:chooseRandom(547)) - chooseRandom returning null
2019-08-12 06:30:01,251 DEBUG org.apache.hadoop.hdfs.net.DFSNetworkTopology (DFSNetworkTopology.java:chooseRandomWithStorageTypeTwoTrial(125)) - No node to choose.
2019-08-12 06:30:01,251 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault (BlockPlacementPolicyDefault.java:chooseLocalRack(647)) - Failed to choose from local rack (location = /default-rack); the second replica is not found, retry choosing randomly
You can see that at some point the abandonBlock procedure is called: abandonBlock from 172.18.0.3:41538
Why could this happen?
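My guess is that the DFSClient calls abandonBlock when it cannot stream data to the DataNode the NameNode allocated (172.18.0.2:50010); note that the retry then excludes exactly that node (excludeNodes=[172.18.0.2:50010]) and, with a single DataNode, nothing is left to choose. If that guess is right, a port check from the Windows host should fail too, e.g. in PowerShell:
Test-NetConnection sandbox-hdp.hortonworks.com -Port 50010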
I tried uploading the CSV through the web interface (both the Ambari Files View and Browse Directory at http://sandbox-hdp.hortonworks.com:50070/explorer.html#/) and it is stored successfully. The logs in that case show a different IP - 172.18.0.2:
2019-08-12 07:19:22,565 DEBUG org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer (NameNodeRpcServer.java:create(757)) - *DIR* NameNode.create: file /tmp/riskfactor.csv for DFSClient_NONMAPREDUCE_1960509895_42 at 172.18.0.2
2019-08-12 07:19:22,565 DEBUG org.apache.hadoop.hdfs.server.namenode.FSNamesystem (FSNamesystem.java:startFileInt(2378)) - DIR* NameSystem.startFile: src=/tmp/riskfactor.csv, holder=DFSClient_NONMAPREDUCE_1960509895_42, clientMachine=172.18.0.2, createParent=true, replication=1, createFlag=[CREATE], blockSize=134217728, supportedVersions=[CryptoProtocolVersion{description='Encryption zones', version=2, unknownValue=null}]
2019-08-12 07:19:22,566 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:addFile(571)) - DIR* addFile: riskfactor.csv is added
2019-08-12 07:19:22,567 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:startFile(407)) - DIR* NameSystem.startFile: added /tmp/riskfactor.csv inode 681608 DFSClient_NONMAPREDUCE_1960509895_42
2019-08-12 07:19:22,568 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker (ProtobufRpcEngine.java:call(550)) - Served: create, queueTime= 0 procesingTime= 3
2019-08-12 07:19:22,568 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1468)) - IPC Server handler 87 on 8020: responding to Call#1051 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 172.18.0.2:52358
2019-08-12 07:19:22,568 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1487)) - IPC Server handler 87 on 8020: responding to Call#1051 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 172.18.0.2:52358 Wrote 92 bytes.
2019-08-12 07:19:22,616 DEBUG org.apache.hadoop.ipc.Server$Connection (Server.java:processOneRpc(2355)) - got #1052
2019-08-12 07:19:22,616 DEBUG org.apache.hadoop.ipc.Server$Handler (Server.java:run(2670)) - IPC Server handler 92 on 8020: Call#1052 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.18.0.2:52358 for RpcKind RPC_PROTOCOL_BUFFER
2019-08-12 07:19:22,616 DEBUG org.apache.hadoop.hdfs.server.namenode.FSNamesystem (FSNamesystem.java:getAdditionalBlock(2684)) - BLOCK* getAdditionalBlock: /tmp/riskfactor.csv inodeId 681608 for DFSClient_NONMAPREDUCE_1960509895_42
2019-08-12 07:19:22,617 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:addBlock(516)) - DIR* FSDirectory.addBlock: /tmp/riskfactor.csv with blk_1074001532_260818 block is added to the in-memory file system
2019-08-12 07:19:22,617 INFO org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:logAllocatedBlock(790)) - BLOCK* allocate blk_1074001532_260818, replicas=172.18.0.2:50010 for /tmp/riskfactor.csv
2019-08-12 07:19:22,617 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:persistNewBlock(750)) - persistNewBlock: /tmp/riskfactor.csv with new block blk_1074001532_260818, current total block count is 1
2019-08-12 07:19:22,618 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker (ProtobufRpcEngine.java:call(550)) - Served: addBlock, queueTime= 0 procesingTime= 2
2019-08-12 07:19:22,619 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1468)) - FSEditLogAsync: responding to Call#1052 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.18.0.2:52358
2019-08-12 07:19:22,619 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1487)) - FSEditLogAsync: responding to Call#1052 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.18.0.2:52358 Wrote 492 bytes.
2019-08-12 07:19:22,627 DEBUG org.apache.hadoop.ipc.Server$Connection (Server.java:processOneRpc(2355)) - got #1053
2019-08-12 07:19:22,627 DEBUG org.apache.hadoop.ipc.Server$Handler (Server.java:run(2670)) - IPC Server handler 88 on 8020: Call#1053 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getServerDefaults from 172.18.0.2:52358 for RpcKind RPC_PROTOCOL_BUFFER
2019-08-12 07:19:22,627 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker (ProtobufRpcEngine.java:call(550)) - Served: getServerDefaults, queueTime= 0 procesingTime= 0
2019-08-12 07:19:22,628 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1468)) - IPC Server handler 88 on 8020: responding to Call#1053 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getServerDefaults from 172.18.0.2:52358
2019-08-12 07:19:22,628 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1487)) - IPC Server handler 88 on 8020: responding to Call#1053 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getServerDefaults from 172.18.0.2:52358 Wrote 64 bytes.
2019-08-12 07:19:22,739 DEBUG org.apache.hadoop.ipc.Server$Connection (Server.java:processOneRpc(2355)) - got #1054
2019-08-12 07:19:22,739 DEBUG org.apache.hadoop.ipc.Server$Handler (Server.java:run(2670)) - IPC Server handler 57 on 8020: Call#1054 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.complete from 172.18.0.2:52358 for RpcKind RPC_PROTOCOL_BUFFER
2019-08-12 07:19:22,739 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:completeFile(666)) - DIR* NameSystem.completeFile: /tmp/riskfactor.csv for DFSClient_NONMAPREDUCE_1960509895_42
2019-08-12 07:19:22,740 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2912)) - BLOCK* blk_1074001532_260818 is COMMITTED but not COMPLETE(numNodes= 0 < minimum = 1) in file /tmp/riskfactor.csv
2019-08-12 07:19:22,742 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker (ProtobufRpcEngine.java:call(550)) - Served: complete, queueTime= 0 procesingTime= 3
2019-08-12 07:19:22,742 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1468)) - IPC Server handler 57 on 8020: responding to Call#1054 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.complete from 172.18.0.2:52358
2019-08-12 07:19:22,742 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1487)) - IPC Server handler 57 on 8020: responding to Call#1054 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.complete from 172.18.0.2:52358 Wrote 35 bytes.
2019-08-12 07:19:22,749 DEBUG org.apache.hadoop.ipc.Server$Connection (Server.java:processOneRpc(2355)) - got #1055
2019-08-12 07:19:22,749 DEBUG org.apache.hadoop.ipc.Server$Handler (Server.java:run(2670)) - IPC Server handler 35 on 8020: Call#1055 Retry#0 org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.blockReceivedAndDeleted from 172.18.0.2:40772 for RpcKind RPC_PROTOCOL_BUFFER
2019-08-12 07:19:22,758 DEBUG org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer (NameNodeRpcServer.java:blockReceivedAndDeleted(1574)) - *BLOCK* NameNode.blockReceivedAndDeleted: from DatanodeRegistration(172.18.0.2:50010, datanodeUuid=9ba7d4f6-56d3-4c19-806d-7919c2444bb0, infoPort=50075, infoSecurePort=0, ipcPort=8010, storageInfo=lv=-57;cid=CID-489f8882-b69a-4b89-ad7a-bbca42038186;nsid=1278768352;c=1543512323726) 1 blocks.
2019-08-12 07:19:22,759 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker (ProtobufRpcEngine.java:call(550)) - Served: blockReceivedAndDeleted, queueTime= 8 procesingTime= 2
2019-08-12 07:19:22,759 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1468)) - IPC Server handler 35 on 8020: responding to Call#1055 Retry#0 org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.blockReceivedAndDeleted from 172.18.0.2:40772
2019-08-12 07:19:22,759 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockManager (BlockManager.java:addStoredBlock(3219)) - BLOCK* addStoredBlock: 172.18.0.2:50010 is added to blk_1074001532_260818 (size=2314)
2019-08-12 07:19:22,760 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockManager (BlockManager.java:processIncrementalBlockReport(4020)) - BLOCK* block RECEIVED_BLOCK: blk_1074001532_260818 is received from 172.18.0.2:50010
2019-08-12 07:19:22,760 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockManager (BlockManager.java:processIncrementalBlockReport(4023)) - *BLOCK* NameNode.processIncrementalBlockReport: from 172.18.0.2:50010 receiving: 0, received: 1, deleted: 0
2019-08-12 07:19:22,759 DEBUG org.apache.hadoop.ipc.Server$Responder (Server.java:processResponse(1487)) - IPC Server handler 35 on 8020: responding to Call#1055 Retry#0 org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.blockReceivedAndDeleted from 172.18.0.2:40772 Wrote 33 bytes.
2019-08-12 07:19:23,154 DEBUG org.apache.hadoop.ipc.Server$Connection (Server.java:processOneRpc(2355)) - got #1056
2019-08-12 07:19:23,154 DEBUG org.apache.hadoop.ipc.Server$Handler (Server.java:run(2670)) - IPC Server handler 85 on 8020: Call#1056 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.complete from 172.18.0.2:52358 for RpcKind RPC_PROTOCOL_BUFFER
2019-08-12 07:19:23,155 DEBUG org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp (FSDirWriteFileOp.java:completeFile(666)) - DIR* NameSystem.completeFile: /tmp/riskfactor.csv for DFSClient_NONMAPREDUCE_1960509895_42
2019-08-12 07:19:23,156 DEBUG org.apache.hadoop.hdfs.server.namenode.FSNamesystem (FSNamesystem.java:closeFile(3951)) - closeFile: /tmp/riskfactor.csv with 1 blocks is persisted to the file system
2019-08-12 07:19:23,156 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem (FSNamesystem.java:completeFile(2865)) - DIR* completeFile: /tmp/riskfactor.csv is closed by DFSClient_NONMAPREDUCE_1960509895_42
2019-08-12 07:19:23,156 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker (ProtobufRpcEngine.java:call(550)) - Served: complete, queueTime= 1 procesingTime= 1
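To take Spoon out of the picture entirely, the same write can also be tried from the Windows host over plain WebHDFS (a sketch of the standard two-step create; host and path assumed from above):
curl -i -X PUT "http://sandbox-hdp.hortonworks.com:50070/webhdfs/v1/tmp/riskfactor.csv?op=CREATE&user.name=hdfs&overwrite=true"
curl -i -X PUT -T riskfactor.csv "<Location URL from the 307 redirect above>"
The second request goes to the DataNode address returned in the Location header, which is exactly where a direct-connectivity problem would show up.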
Maybe this is a routing issue?
[root@sandbox-hdp hdfs]# ip route list
default via 172.18.0.1 dev eth0
172.18.0.0/16 dev eth0 proto kernel scope link
There was also a "... src 172.18.0.2" entry, which I deleted by recreating the route, but nothing changed.
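To double-check which address the DataNode actually registered under (i.e. what the NameNode hands back to clients in addBlock), on the sandbox:
hdfs dfsadmin -report | grep -E 'Name:|Hostname:'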
Some things I've already tried with no luck:
- uploading another file
- uploading as another user (admin, root)
- uploading to another folder (/tmp, /user/hdfs, /user/admin)
- using another version of PDI
- forwarding port 50010 in the VirtualBox machine's NAT settings
- setting dfs.client.use.datanode.hostname=true on both sides (Pentaho and the server; see the config sketch after this list)
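For reference, the client-side half of that last item is an hdfs-site.xml entry; in PDI it has to land in the site files of the shim the job actually uses (the path below is my assumption of the default big-data-plugin layout):
<!-- data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations\<shim>\hdfs-site.xml -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>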
Any idea what else to do?
Created 08-18-2019 03:44 AM
If this is just for study, I recommend setting dfs.permissions.enabled=false.
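(In Ambari that's an HDFS configuration; as a raw hdfs-site.xml entry it would look like:
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
Note this only helps if the failure is permission-related.)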
Also, in my experience Pentaho Spoon stores some info in its cache dir.
You can try like this (a command sketch follows the list)...
- exit Pentaho Spoon first
- then delete the cache dir
- and then restart your job.
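On Windows that would look roughly like this (cache paths assumed from a default PDI install):
rmdir /s /q data-integration\system\karaf\caches
rmdir /s /q %USERPROFILE%\.pentaho\caches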
Created 08-20-2019 11:20 PM
Thanks, youngick
I set dfs.permissions.enabled=false in Ambari and restarted HDFS, then closed Spoon, cleared the PDI caches in ...\data-integration\system\karaf\caches\ and in ~\.pentaho\caches, started Spoon again, and tried the job again with the same error. Nothing changed.
I also tried something as drastic as a NameNode format, as described here:
This method didn't help either.
As this is all just a study case, I'll come back to it later.