Member since: 08-08-2020
Posts: 68
Kudos Received: 1
Solutions: 1

My Accepted Solutions

| Title | Views | Posted |
| --- | --- | --- |
|  | 1955 | 11-14-2022 07:14 AM |
04-20-2023
12:33 PM
Hello @Eren Maybe this KB article can be useful: https://community.cloudera.com/t5/Customer/HDFS-write-occasionally-fails-with-error-message-quot-Unable/ta-p/81420
The HDFS block replication pipeline does not seem to be able to complete. Snapshot export:
2023-04-19 19:44:10,199 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* blk_-9223372036854751104_9943 is COMMITTED but not COMPLETE(numNodes= 5 >= minimum = 3) in file /hbase/archive/data/default/test/09748ed90d0d58c0fe7ac4b3c08f3cd4/cf/e35fadd88d244766800728318ccea508
2023-04-19 19:44:10,200 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 8020, call Call#31 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 172.31.0.146:52210
….
hadoop distcp:
java.io.IOException: Unable to close file because the last block does not have enough number of replicas.
Make sure that the destination cluster has enough datanodes available, and with space, based on the replication factor of the HBase file, in this case /hbase/archive/data/default/test/09748ed90d0d58c0fe7ac4b3c08f3cd4/cf/e35fadd88d244766800728318ccea508
You can reconfirm the replication factor (RF) of the file by running either of the following commands:
hdfs fsck /hbase/archive/data/default/test/09748ed90d0d58c0fe7ac4b3c08f3cd4/cf/e35fadd88d244766800728318ccea508
hdfs dfs -ls /hbase/archive/data/default/test/09748ed90d0d58c0fe7ac4b3c08f3cd4/cf/e35fadd88d244766800728318ccea508
In the output of the second command, the second column shows the RF applied to the file. You can also check in the destination cluster whether the datanodes are distributed in racks; if so, check that each rack has enough datanodes available and with space. Hope this helps.
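For reference, here is a minimal sketch pulling those checks together; the extra fsck flags and the awk filter are my own additions rather than part of the original commands, so adapt them to your environment:

```bash
# Sketch: confirm the replication factor and block health of the affected file
# (path copied from the log above; adjust for your cluster)
FILE=/hbase/archive/data/default/test/09748ed90d0d58c0fe7ac4b3c08f3cd4/cf/e35fadd88d244766800728318ccea508

# fsck reports expected vs. actual replication and any under-replicated blocks
hdfs fsck "$FILE" -files -blocks -locations

# the second column of the ls output is the replication factor, the last is the path
hdfs dfs -ls "$FILE" | awk '{print $2, $8}'
```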
03-14-2023
10:32 AM
Hello @Vako You may require further investigation from the AWS support team concerning this:
[ERROR] Deferred error: s3:c96 close("/data/test/datasync//test.txt"): 5 (Input/output error)
This error comes from the S3 protocol. There is also this message from the NameNode logs:
java.io.IOException: File /data/test/datasync/test.txt could only be written to 0 of the 1 minReplication nodes. There are 6 datanode(s) running and 6 node(s) are excluded in this operation.
This means that the client reached the NameNode and the file was created at the metadata level in HDFS, but when the application tried to write the data to the datanodes it failed for some reason. Check whether the DataNode logs show anything at the time of the issue; I have usually seen this caused by a problem or misconfiguration on the application side. Also validate that no firewall rules are blocking the datanode port from the server where you are running the AWS DataSync agent. Hope this helps.
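As a rough sketch of that last connectivity check, assuming the Hadoop 3 default data-transfer port 9866 (older releases often use 50010; confirm dfs.datanode.address in hdfs-site.xml) and placeholder hostnames:

```bash
# Run from the host where the AWS DataSync agent is deployed.
# Hostnames are placeholders; 9866 is the Hadoop 3 default DataNode
# data-transfer port (dfs.datanode.address), 50010 on older releases.
for dn in datanode1.example.com datanode2.example.com datanode3.example.com; do
  nc -zv -w 5 "$dn" 9866
done
```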
03-02-2023
07:37 AM
Hello, org.apache.hadoop.hbase.YouAreDeadException normally occurs when a RegionServer loses communication with ZooKeeper, or takes too long to report its availability on its znode, which can happen for different reasons [1]. You may want to check what is in the RegionServer logs, confirm that the ZooKeeper service is not crashing, and verify that the ZK timeouts are properly set [2]. Hope this helps.
[1] https://issues.apache.org/jira/browse/HBASE-25274
[2] https://community.cloudera.com/t5/Customer/What-is-the-formula-to-calculate-ZooKeeper-timeouts-for/ta-p/271310
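A minimal sketch of those timeout checks, assuming typical client configuration paths (/etc/hbase/conf and /etc/zookeeper/conf; on a Cloudera Manager managed cluster the effective paths may differ):

```bash
# HBase side: the ZK session timeout the RegionServers request
grep -A1 'zookeeper.session.timeout' /etc/hbase/conf/hbase-site.xml

# ZooKeeper side: sessions are clamped to maxSessionTimeout
# (which defaults to 20 * tickTime when not set explicitly)
grep -E 'tickTime|maxSessionTimeout' /etc/zookeeper/conf/zoo.cfg
```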
12-05-2022
11:11 AM
Hello @hanumanth , If the ZooKeeper services look up and running, you may need to compare the Spark job failure timestamp against the ZooKeeper logs from the Leader server. If there is no visible issue on the ZooKeeper side, check whether the HBase client configurations were applied properly in the Spark job configuration. Also, confirm that the HBase service is up and functional. If the above does not help, you may want to raise a support ticket for the Spark component.
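For reference, a small sketch to find which ZooKeeper server is the Leader and to confirm HBase is responding; the hostnames are placeholders, 2181 is the default ZK client port, and the stat four-letter command may need to be whitelisted (4lw.commands.whitelist) on newer ZooKeeper releases:

```bash
# Identify the ZooKeeper Leader so you know whose logs to compare
# against the Spark job failure timestamp
for zk in zk1.example.com zk2.example.com zk3.example.com; do
  echo -n "$zk: "
  echo stat | nc -w 5 "$zk" 2181 | grep Mode
done

# Quick non-interactive check that the HBase service is up and responding
echo "status 'simple'" | hbase shell -n
```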
11-14-2022
07:14 AM
@hanumanth you may check if the files you deleted from HDFS still exist somehow in HDFS and, if so, check the replication factor applied to them. This is shown in the second column of the hdfs dfs -ls output; you can collect a recursive listing by running:
hdfs dfs -ls -R / > hdfs_recursive
Afterwards you can filter this file to find the RF applied to your files:
hdfs dfs -ls -R / | awk '{print $2}' | sort | uniq -c
Also, ensure that there is no other content from other processes (something different than HDFS) filling up the mount points used to store the HDFS blocks. You can also collect the following outputs to compare whether the existing data is protected by snapshots or not:
#1 hdfs dfs -du -s -x
#2 hdfs dfs -du -s ---> du command usage [1]
If the results of the above commands differ, there are probably snapshots still present that are preventing blocks from being deleted from the datanodes.
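If the two du numbers do differ, a quick way to see where snapshots still exist (these are standard HDFS CLI commands, not from the original post):

```bash
# List all directories that have HDFS snapshots enabled
hdfs lsSnapshottableDir

# For a snapshottable directory, list its snapshots
# (DIR is a hypothetical example; substitute a path reported above)
DIR=/hbase
hdfs dfs -ls "$DIR/.snapshot"
```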
11-07-2022
08:04 AM
What is the current HDP, CDH, or CDP version of this cluster? Is this issue present on every browser? Did it work as expected before?