Created 07-06-2018 06:42 AM
Hello everyone,
We want to connect our SQL Server 2016 Enterprise via PolyBase to our Kerberized on-premises Hadoop cluster running Cloudera 5.14. I followed the Microsoft PolyBase guide to configure PolyBase and was successful with all four checkpoints. Unfortunately, we are not able to export tables from SQL Server to our Hadoop cluster.
Short information on the four checkpoints from the PolyBase guide:
Uploading a local file to our HDFS works fine, but exporting a small table from SQL Server to HDFS throws the following exception.
Exception from primary NameNode:
IPC Server handler 22 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.100.160.13:53900 Call#805 Retry#0
java.io.IOException: File /PolybaseTest/QID2585_20180706_150246_1.parq.gz could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3448)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:690)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:217)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)
On our SQL Server we get almost the same exception.
Exception from SQL Server:
Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11" for linked server "SQLNCLI11". 110802;An internal DMS error occurred that caused this operation to fail. Details: Exception: Microsoft.SqlServer.DataWarehouse.DataMovement.Common.ExternalAccess.HdfsAccessException, Message: Java exception raised on call to HdfsBridge_DestroyRecordWriter: Error [File /PolybaseTest/QID2585_20180706_150246_7.parq.gz could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3448)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:690)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:217)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)
] occurred while accessing external file.
I appreciate any help!
Created 07-06-2018 01:50 PM
Block placement is a very complex algorithm. I would suggest enabling debug logging for the classes org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology on the NameNode (or just setting the whole NameNode log level to DEBUG). The debug log should give an explanation as to why it couldn't choose the DataNodes to write to.
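For example, to turn on just those two loggers, the standard log4j syntax should work; a sketch (in CDH you would put these lines into the NameNode's logging advanced configuration snippet in Cloudera Manager and restart the role):

    log4j.logger.org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy=DEBUG
    log4j.logger.org.apache.hadoop.net.NetworkTopology=DEBUG

Alternatively, "hadoop daemonlog -setlevel <namenode-host>:50070 <classname> DEBUG" can toggle a logger at runtime without a restart (50070 is the default NameNode web port; adjust for your cluster), though on a Kerberized cluster the web endpoint it uses may require SPNEGO authentication.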
Created 07-06-2018 05:48 PM
Hello weichiu,
Thank you very much for your reply!
Below you can find the NameNode logs for block placement and network topology.
1:21:28.874 AM INFO  Server  Auth successful for hdfs@MYCOMPANY.REALM.COM (auth:KERBEROS)
1:21:28.876 AM INFO  ServiceAuthorizationManager  Authorization successful for hdfs@MYCOMPANY.REALM.COM (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
1:21:28.888 AM DEBUG NetworkTopology  Choosing random from 3 available nodes on node /default, scope=/default, excludedScope=null, excludeNodes=[]
1:21:28.888 AM DEBUG NetworkTopology  chooseRandom returning X.X.X.45:1004
1:21:28.888 AM DEBUG NetworkTopology  Choosing random from 3 available nodes on node /default, scope=/default, excludedScope=null, excludeNodes=[]
1:21:28.888 AM DEBUG NetworkTopology  Failed to find datanode (scope="" excludedScope="/default").
1:21:28.888 AM DEBUG NetworkTopology  chooseRandom returning X.X.X.43:1004
1:21:28.888 AM DEBUG BlockPlacementPolicy  Failed to choose remote rack (location = ~/default), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:746)
        . . .
1:21:28.889 AM DEBUG NetworkTopology  Failed to find datanode (scope="" excludedScope="/default").
1:21:28.889 AM DEBUG NetworkTopology  Choosing random from 2 available nodes on node /default, scope=/default, excludedScope=null, excludeNodes=[X.X.X.45:1004]
1:21:28.889 AM DEBUG BlockPlacementPolicy  Failed to choose remote rack (location = ~/default), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:746)
        . . .
1:21:28.890 AM DEBUG NetworkTopology  Choosing random from 2 available nodes on node /default, scope=/default, excludedScope=null, excludeNodes=[X.X.X.43:1004]
1:21:28.889 AM DEBUG NetworkTopology  chooseRandom returning X.X.X.44:1004
1:21:28.890 AM DEBUG NetworkTopology  Node X.X.X.43:1004 is excluded, continuing.
1:21:28.890 AM DEBUG NetworkTopology  Node X.X.X.43:1004 is excluded, continuing.
1:21:28.890 AM DEBUG NetworkTopology  Node X.X.X.43:1004 is excluded, continuing.
1:21:28.890 AM DEBUG NetworkTopology  chooseRandom returning X.X.X.44:1004
1:21:28.890 AM DEBUG NetworkTopology  Failed to find datanode (scope="" excludedScope="/default").
1:21:28.890 AM DEBUG BlockPlacementPolicy  Failed to choose remote rack (location = ~/default), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:746)
        . . .
1:21:28.890 AM DEBUG NetworkTopology  Failed to find datanode (scope="" excludedScope="/default").
1:21:28.890 AM DEBUG BlockPlacementPolicy  Failed to choose remote rack (location = ~/default), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:746)
        . . .
1:21:28.890 AM DEBUG NetworkTopology  Choosing random from 1 available nodes on node /default, scope=/default, excludedScope=null, excludeNodes=[X.X.X.45:1004, X.X.X.44:1004]
1:21:28.890 AM DEBUG NetworkTopology  Choosing random from 1 available nodes on node /default, scope=/default, excludedScope=null, excludeNodes=[X.X.X.43:1004, X.X.X.44:1004]
1:21:28.890 AM DEBUG NetworkTopology  Node X.X.X.44:1004 is excluded, continuing.
1:21:28.890 AM DEBUG NetworkTopology  Node X.X.X.43:1004 is excluded, continuing.
1:21:28.891 AM DEBUG NetworkTopology  Node X.X.X.44:1004 is excluded, continuing.
1:21:28.891 AM DEBUG NetworkTopology  Node X.X.X.43:1004 is excluded, continuing.
1:21:28.891 AM DEBUG NetworkTopology  Node X.X.X.45:1004 is excluded, continuing.
1:21:28.891 AM DEBUG NetworkTopology  chooseRandom returning X.X.X.45:1004
1:21:28.891 AM DEBUG NetworkTopology  Node X.X.X.45:1004 is excluded, continuing.
1:21:28.891 AM DEBUG NetworkTopology  chooseRandom returning X.X.X.43:1004
1:21:28.891 AM INFO  StateChange  BLOCK* allocateBlock: /PolybaseTest/QID2601_20180707_12128_3.parq.gz. BP-1767765873-X.X.X.41-1525850808562 blk_1073840961_100142{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-d069f8ba-9a4e-4b64-863d-9b818b27d298:NORMAL:X.X.X.43:1004|RBW], ReplicaUnderConstruction[[DISK]DS-e30cd499-5230-4e68-a6f7-4517e8f5b367:NORMAL:X.X.X.44:1004|RBW], ReplicaUnderConstruction[[DISK]DS-246055b9-1252-4d70-8b4a-6406346da99f:NORMAL:X.X.X.45:1004|RBW]]}
1:21:28.891 AM INFO  StateChange  BLOCK* allocateBlock: /PolybaseTest/QID2601_20180707_12128_4.parq.gz. BP-1767765873-X.X.X.41-1525850808562 blk_1073840962_100143{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-246055b9-1252-4d70-8b4a-6406346da99f:NORMAL:X.X.X.45:1004|RBW], ReplicaUnderConstruction[[DISK]DS-e30cd499-5230-4e68-a6f7-4517e8f5b367:NORMAL:X.X.X.44:1004|RBW], ReplicaUnderConstruction[[DISK]DS-d069f8ba-9a4e-4b64-863d-9b818b27d298:NORMAL:X.X.X.43:1004|RBW]]}
1:21:28.891 AM DEBUG NetworkTopology  Choosing random from 3 available nodes on node /default, scope=/default, excludedScope=null, excludeNodes=[]
1:21:28.892 AM DEBUG NetworkTopology  chooseRandom returning X.X.X.44:1004
1:21:28.892 AM DEBUG NetworkTopology  Failed to find datanode (scope="" excludedScope="/default").
1:21:28.892 AM DEBUG BlockPlacementPolicy  Failed to choose remote rack (location = ~/default), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:746)
        . . .
Two points in these logs stood out to me: the repeated "Failed to choose remote rack (location = ~/default), fallback to local rack" NotEnoughReplicasException entries, and the fact that every DataNode is addressed on port 1004.
Many thanks in advance.
Baris
Created 07-09-2018 03:31 AM
Our next step is to open ports 1004 and 1006, the DataNodes' secure data transfer and HTTP ports.
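To check beforehand whether those ports are actually reachable from the PolyBase head node, a plain socket test is enough. A minimal sketch (the hostnames below are placeholders for our three DataNodes):

import java.net.InetSocketAddress;
import java.net.Socket;

public class DataNodePortCheck {
    public static void main(String[] args) {
        // Placeholder hostnames -- substitute the real DataNode hosts.
        String[] hosts = {"datanode1", "datanode2", "datanode3"};
        int[] ports = {1004, 1006}; // secure DataNode transfer and HTTP ports
        for (String host : hosts) {
            for (int port : ports) {
                try (Socket socket = new Socket()) {
                    // Fail fast: three-second connect timeout.
                    socket.connect(new InetSocketAddress(host, port), 3000);
                    System.out.println(host + ":" + port + " reachable");
                } catch (Exception e) {
                    System.out.println(host + ":" + port + " NOT reachable (" + e.getMessage() + ")");
                }
            }
        }
    }
}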
Can someone please tell us whether the NameNode logs above look correct or suspicious? We would appreciate more help interpreting them.
Many thanks.
Baris
Created 07-18-2018 12:26 AM
After opening ports 1004 and 1006, we are now able to write data into our cluster.
Many thanks to weichiu; without his hint about enabling debug logging for org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology, I would not have been able to see the problem.
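For anyone who lands here later: which ports your DataNodes listen on can be read from the client configuration, for example (assuming the client config matches the cluster's):

    hdfs getconf -confKey dfs.datanode.address
    hdfs getconf -confKey dfs.datanode.http.address

On a Kerberized CDH cluster these default to the privileged ports 1004 and 1006, which is why those two had to be opened.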
Created 07-30-2018 09:32 AM
While reading from and writing to HDFS, I am getting the error below on the Java side:
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hdfs/test11/tutorials11.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
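In case it helps to narrow this down, below is a minimal sketch of the same write path in plain Java (the principal, keytab path, and output path are placeholders; it assumes the cluster's core-site.xml and hdfs-site.xml are on the classpath). If this also fails with "could only be replicated to 0 nodes", the client most likely cannot reach the DataNodes' transfer ports, which was the cause earlier in this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class HdfsWriteTest {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS etc. from core-site.xml/hdfs-site.xml on the classpath.
        Configuration conf = new Configuration();
        // Placeholder principal and keytab -- replace with your own.
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab(
                "hdfs@MYCOMPANY.REALM.COM", "/path/to/hdfs.keytab");

        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/user/hdfs/test11/tutorials11.txt"))) {
            // The first write triggers addBlock on the NameNode and then a direct
            // connection to a DataNode transfer port (1004 on a secured cluster).
            out.writeUTF("replication smoke test");
        }
        System.out.println("Write succeeded.");
    }
}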
Created 07-30-2018 07:48 PM