Member since 09-08-2017 | 15 Posts | 0 Kudos Received | 0 Solutions
09-09-2020
09:13 AM
Hi @Debangshu, it worked with 1.10.0 and 1.11.3. Thanks, mate, for the resolution. Thanks David
09-03-2020
06:14 AM
Does it work with NiFi 1.12.0, or is 1.11.3 better?
09-03-2020
04:31 AM
Hi, if "root" is trying to write the file to HDFS, then a privilege error should be thrown, as HDFS ACLs sync with Sentry privileges. Thanks David
09-03-2020
02:24 AM
Is this applicable to Hadoop 2.6.0 (CDH 5.16.2)?
08-13-2020
07:21 AM
Hi Timothy, here are some logs which might give you some insight. Today I eliminated networking and dual-NIC issues: both clusters are in the same subnet, there is no dual NIC on these VMs, and all traffic flows seamlessly back and forth.
Name Node Logs:
2020-08-13 16:01:26,120 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for mapred/principle@user.queue (auth:KERBEROS)
2020-08-13 16:01:26,126 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for mapred/principle@user.queue (auth:KERBEROS) for protocol=interface user.queue.user.queue.user.queue
2020-08-13 16:01:28,672 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for principle@user.queue (auth:KERBEROS)
2020-08-13 16:01:28,679 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for principle@user.queue (auth:KERBEROS) for protocol=interface user.queue.user.queue.user.queue
2020-08-13 16:01:28,704 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /user/abc/puthdfs_test/.user.queue. user.queue.64.55-1545405130172 blk_1075343453_1603222{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-53536364-33f4-40d6-85c2-508abf7ff023:NORMAL:00.00.64.58:50010|RBW], ReplicaUnderConstruction[[DISK]DS-abba7d97-925a-4299-af86-b58fef9aaa12:NORMAL:00.00.64.84:50010|RBW], ReplicaUnderConstruction[[DISK]DS-286b28e8-d035-4b8c-a2dd-aabb08666234:NORMAL:00.00.64.56:50010|RBW]]}
2020-08-13 16:01:28,727 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /user/abc/puthdfs_test/.user.queue. user.queue.64.55-1545405130172 blk_1075343454_1603223{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-abba7d97-925a-4299-af86-b58fef9aaa12:NORMAL:00.00.64.84:50010|RBW], ReplicaUnderConstruction[[DISK]DS-286b28e8-d035-4b8c-a2dd-aabb08666234:NORMAL:00.00.64.56:50010|RBW], ReplicaUnderConstruction[[DISK]DS-d6f56418-6e18-4317-a8ec-4a5b15757728:NORMAL:00.00.64.57:50010|RBW]]}
2020-08-13 16:01:28,734 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on user.queue.user.queue.user.queue.BlockPlacementPolicy and user.queue.user.queue.NetworkTopology
2020-08-13 16:01:28,735 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 1 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK], removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
DataNode Logs:
2020-08-13 16:00:41,154 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving user.queue.64.55-1545405130172:blk_1075343452_1603221 src: /00.00.64.58:55510 dest: /00.00.64.57:50010
2020-08-13 16:00:41,213 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /00.00.64.58:55510, dest: /00.00.64.57:50010, bytes: 56, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1029630366_107, offset: 0, srvID: cb4e7a77-f5d6-49a5-abab-58d060602ec7, blockid: user.queue.64.55-1545405130172:blk_1075343452_1603221, duration: 54439548
2020-08-13 16:00:41,214 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: user.queue.64.55-1545405130172:blk_1075343452_1603221, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2020-08-13 16:00:45,149 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1075343452_1603221 file /data/dfs/dn/current/user.queue.64.55-1545405130172/current/finalized/subdir24/subdir112/blk_1075343452 for deletion
2020-08-13 16:00:45,149 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted user.queue.64.55-1545405130172 blk_1075343452_1603221 file /data/dfs/dn/current/user.queue.64.55-1545405130172/current/finalized/subdir24/subdir112/blk_1075343452
2020-08-13 16:01:28,743 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: user.queue.user.queue:50010:DataXceiver error processing unknown operation src: /00.00.64.67:59988 dst: /00.00.64.57:50010
java.io.IOException:
at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessage(DataTransferSaslUtil.java:217)
at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:364)
at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
at java.lang.Thread.run(Thread.java:748)
2020-08-13 16:01:41,210 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving user.queue.64.55-1545405130172:blk_1075343458_1603227 src: /00.00.64.56:58556 dest: /00.00.64.57:50010
2020-08-13 16:01:41,223 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /00.00.64.56:58556, dest: /00.00.64.57:50010, bytes: 56, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1081910632_107, offset: 0, srvID: cb4e7a77-f5d6-49a5-abab-58d060602ec7, blockid: user.queue.64.55-1545405130172:blk_1075343458_1603227, duration: 8787325
2020-08-13 16:01:41,225 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: user.queue.64.55-1545405130172:blk_1075343458_1603227, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2020-08-13 16:01:45,151 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1075343458_1603227 file /data/dfs/dn/current/user.queue.64.55-1545405130172/current/finalized/subdir24/subdir112/blk_1075343458 for deletion
2020-08-13 16:01:45,151 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted user.queue.64.55-1545405130172 blk_1075343458_1603227 file /data/dfs/dn/current/user.queue.64.55-1545405130172/current/finalized/subdir24/subdir112/blk_1075343458
2020-08-13 16:02:13,278 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving user.queue.64.55-1545405130172:blk_1075343459_1603228 src: /00.00.64.55:43446 dest: /00.00.64.57:50010
Thanks David
08-13-2020
02:47 AM
Hi Timothy, I can reach Hive and insert data into tables as well; that works perfectly fine. Also, the NiFi cluster and the CDH cluster are in the same subnet. We already had a solutions architect from Cloudera assess our clusters from a security standpoint and certify everything as good. I am able to connect to HDFS from other applications seamlessly with the same security standards. Also, in bootstrap.conf I have "java.arg.16=-Djavax.security.auth.useSubjectCredsOnly=true". Thanks David
08-12-2020
01:50 AM
Hi Timothy, I have FreeIPA as the IAM, which handles all Kerberos-related transactions. Here is what I am trying to do:
1) Create a separate service account for NiFi, use that service account to start NiFi, and add it to the service account which has the required permissions on HDFS and Sentry, so that when the flow is triggered with the PutHDFS processor it has all required permissions on the CDH cluster.
2) Trigger the workflow in my local NiFi instance installed on my PC and capture the logs from the NiFi instance, the NameNode, and the DataNodes while the flow is running. Then trigger the flow in the NiFi cluster, running processors on the primary node alone, and capture the NiFi, NameNode, and DataNode logs again. Then compare all these logs and see if I can find anything.
I have already ruled out the security-related options: local security, firewalls, and file permissions are out of the equation. Thanks David
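The log comparison in step 2 could be sketched roughly as follows. This is only an illustration, not something from the thread: it assumes the NameNode and DataNode logs use the "timestamp LEVEL logger: message" layout shown in the log excerpts elsewhere in this thread, and the function names `problem_loggers` and `only_in_failing` are hypothetical. NiFi's own application log uses a different layout and would need a different pattern.

```python
import re
from collections import Counter

# Assumed layout: "2020-08-13 16:01:28,734 WARN some.logger.Name: message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(TRACE|DEBUG|INFO|WARN|ERROR|FATAL) (\S+): (.*)$"
)


def problem_loggers(lines, levels=("WARN", "ERROR", "FATAL")):
    """Count lines per (level, logger) for the given severity levels."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        # Stack-trace lines and other continuation lines do not match and are skipped.
        if m and m.group(2) in levels:
            counts[(m.group(2), m.group(3))] += 1
    return counts


def only_in_failing(working_lines, failing_lines):
    """Return (level, logger) pairs seen in the failing run but not in the working run."""
    working = problem_loggers(working_lines)
    failing = problem_loggers(failing_lines)
    return sorted(key for key in failing if key not in working)
```

Feeding it the lines of a DataNode log captured during the local (working) run and the cluster (failing) run would list the loggers that only complain in the failing case, which narrows down where to read in detail.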
08-11-2020
01:37 PM
It's actually the same; however, the NiFi service is run by root, and I use a service account principal and its keytab in the PutHDFS processor. So are you saying that "root" does not have sufficient privileges to write the file into HDFS?
08-11-2020
12:53 PM
Hi Timothy, here is the dfsadmin report:
[hdfssuperuser@abc ~]$ hadoop dfsadmin -report
Configured Capacity: 8748844187648 (7.96 TB)
Present Capacity: 8649609633023 (7.87 TB)
DFS Remaining: 6585870316863 (5.99 TB)
DFS Used: 2063739316160 (1.88 TB)
DFS Used%: 23.86%
Live datanodes (4):
Name: 0.1.0.1:50010
Hostname: abc
Rack: /default
Decommission Status : Normal
Configured Capacity: 2187211046912 (1.99 TB)
DFS Used: 673284505773 (627.05 GB)
Non DFS Used: 24917778259 (23.21 GB)
DFS Remaining: 1488606111329 (1.35 TB)
DFS Used%: 30.78%
DFS Remaining%: 68.06%
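For a report with several datanodes, a small script can pull out the per-node free space so that a node that is short on space stands out. This is a minimal sketch, assuming the report is saved as text with one "Key: value" field per line and that each datanode section starts with a "Name:" line; the function names `parse_datanodes` and `remaining_bytes` are hypothetical. Free space is only one of several things that can affect block placement, so this is a quick check rather than a diagnosis.

```python
import re


def parse_datanodes(report_text):
    """Split dfsadmin report text into one dict per datanode section.

    Assumes every datanode section begins with a "Name:" line and that
    fields are "Key: value" pairs, each on its own line.
    """
    nodes = []
    current = None
    for raw in report_text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Name":
            # A new datanode section starts here.
            current = {}
            nodes.append(current)
        if current is not None:
            current[key] = value
    return nodes


def remaining_bytes(node):
    """Return the byte count from a value such as '1488606111329 (1.35 TB)'."""
    m = re.match(r"(\d+)", node.get("DFS Remaining", ""))
    return int(m.group(1)) if m else None
```

Calling `parse_datanodes` on the saved report and then `remaining_bytes` on each entry gives one number per datanode to compare.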
08-11-2020
11:42 AM
Hi Timothy,
The NameNode may be overloaded. Check the logs for messages that say "discarding calls..."
- The NameNode is fine. When I run a NiFi instance on my laptop and use the same workflow with the same configuration, I am able to write the file successfully to the same cluster.
There may not be enough (any) DataNode nodes running for the data to be written. Again, check the logs.
- All DataNodes are up and running. I will check the logs and get back to you.
Every DataNode on which the blocks were stored might be down (or not connected to the NameNode; it is impossible to distinguish the two).
- I am able to put a file from the command line.