Member since: 08-16-2016
Posts: 48
Kudos Received: 9
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5120 | 12-28-2018 10:21 AM
 | 6089 | 08-28-2018 10:58 AM
 | 3360 | 10-18-2016 11:08 AM
 | 3984 | 10-16-2016 10:13 AM
05-26-2017
01:21 PM
It looks like the issue described in https://issues.apache.org/jira/browse/HDFS-11254 (Standby NameNode may crash during failover if loading edits takes too long):

2017-05-25 14:40:37,740 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 137515/140868 transactions completed. (98%)
2017-05-25 14:41:27,207 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Loaded 140868 edits starting from txid 20804872532

It took 50 seconds to load the edits. Edit log loading must acquire the NameNode lock, so the ZKFC may fail to establish a connection with the NameNode while loading is in progress.

at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1640)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1375)
at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

This stack shows it happened during the transition from standby to active. It may be fixed by HDFS-8865 (Improve quota initialization performance). I suspect the stack dump in the log is not complete; if the slowdown is the quota initialization that HDFS-8865 addresses, you would see a stack trace like:

Thread 188 (IPC Server handler 25 on 8022):
State: RUNNABLE
Blocked count: 278
Waited count: 17419
Stack:
org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:886)
org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuota(FSImage.java:875)
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:860)
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:827)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:232)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:188)
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:182)
java.security.AccessController.doPrivileged(Native Method)
javax.security.auth.Subject.doAs(Subject.java:415)
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:477)
org.apache.hadoop.security.SecurityUtil.doAsLoginUser(SecurityUtil.java:458)

A workaround is to increase the ZKFC connection timeout. The default is 45 seconds, IIRC; doubling that number should alleviate the problem.
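In case it helps, here is a rough sketch of that workaround. I'm assuming the relevant setting is the ZKFC health-monitor RPC timeout, ha.health-monitor.rpc-timeout.ms in core-site.xml (45000 ms by default in stock Hadoop); please verify the property name and default against your version's core-default.xml before changing anything.

# Check the value the ZKFC currently resolves from core-site.xml:
hdfs getconf -confKey ha.health-monitor.rpc-timeout.ms

# To double it, set ha.health-monitor.rpc-timeout.ms to 90000 in core-site.xml
# on the NameNode/ZKFC hosts and restart the ZKFC processes.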
05-09-2017
06:23 AM
1 Kudo
Based on the error message, it comes from org.apache.hadoop.ipc.Server#checkDataLength(). Fundamentally, this property changes the maximum length of a protobuf message (protobuf is a widely used data-exchange format), and there's a reason for the size limit. Excerpt from the protobuf documentation: https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/CodedInputStream#setSizeLimit-int-

public int setSizeLimit(int limit)
Set the maximum message size. In order to prevent malicious messages from exhausting memory or causing integer overflows, CodedInputStream limits how large a message may be. The default limit is 64MB. You should set this limit as small as you can without harming your app's functionality. Note that size limits only apply when reading from an InputStream, not when constructed around a raw byte array (nor with ByteString.newCodedInput()).

You could increase this limit, but there are other Hadoop limits you could also hit, for example the number of files in a directory. In summary, you should go back and check what went over the limit: it could be the number of files in a directory, the number of blocks on a DataNode, and so on. It is an indication that something went over the recommended range.
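If you want to see where you stand before deciding, here is a quick sketch. I'm assuming the property in question is ipc.maximum.data.length (67108864 bytes, i.e. 64 MB, by default), which backs the check in Server#checkDataLength(); the directory path below is only a placeholder.

# See what the NameNode currently allows for an IPC message:
hdfs getconf -confKey ipc.maximum.data.length

# Check whether a suspect directory has an enormous number of children
# (prints directory count, file count, and content size):
hdfs dfs -count /path/to/suspect/dir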
04-11-2017
02:34 AM
If restarting the NameNode doesn't help, see if you can bump the NameNode log level to DEBUG and post the NameNode log (or you can send it to me privately: weichiu at cloudera dot com).
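For reference, one way to bump the level without a restart is the daemonlog command (the change reverts when the NameNode restarts). The host:port is a placeholder for your NameNode's HTTP address, and the logger name is just an example; adjust both for your cluster.

# Raise the NameNode logger to DEBUG at runtime:
hadoop daemonlog -setlevel nn-host.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG

# Confirm the change:
hadoop daemonlog -getlevel nn-host.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode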
04-11-2017
02:33 AM
Can you try restarting the NameNode and see if it helps? The symptom matches HDFS-10788: https://issues.apache.org/jira/browse/HDFS-10788. I initially thought HDFS-10788 was resolved by HDFS-9958, but apparently that's not the case.
04-06-2017
07:40 AM
Use the lsof command, and you should be able to see all the open files.
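For example, assuming you want to see the files a particular Hadoop daemon is holding open (the process pattern below is a placeholder; adjust it to whichever daemon you care about):

# List open files for the DataNode process:
lsof -p "$(pgrep -d, -f org.apache.hadoop.hdfs.server.datanode.DataNode)"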
04-06-2017
07:31 AM
Got it. The warning message "Inconsistent number of corrupt replicas" suggests you may have encountered the bug described in HDFS-9958 (BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages). HDFS-9958 is fixed in a number of CDH versions: CDH5.5.6, CDH5.7.4, CDH5.7.5, CDH5.7.6, CDH5.8.2, CDH5.8.3, CDH5.8.4, CDH5.9.0, CDH5.9.1, CDH5.10.0, CDH5.10.1. Unfortunately, given that you're already on CDH5.10.0, this appears to be a new bug with the same symptom. I can file an Apache Hadoop JIRA on your behalf for this bug report. The Cloudera Community forum is meant for troubleshooting; bug reports should go to Apache Hadoop so that more people can look into them.
04-06-2017
07:10 AM
The disk balancer (diskbalancer) is a new feature in CDH5.8, and by definition a new feature will not be backported to an older minor version.
04-04-2017
06:21 AM
Hi, it appears to be a bug, and I am interested in understanding it further. I did a quick search and it doesn't seem to have been reported previously on the Apache Hadoop JIRA. Would you be able to look at the active NameNode log and search for ArrayIndexOutOfBoundsException? The client-side log doesn't print the stack trace, so it's impossible to know where this exception was thrown. The NameNode log should contain the full stack trace, which will help find where it originated.
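For example, something like the following on the active NameNode host should pull out the full stack trace if it is there (the log path is just a placeholder; use wherever your NameNode log actually lives):

# Print the exception line plus the 30 lines that follow it:
grep -A 30 ArrayIndexOutOfBoundsException /path/to/namenode.log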
10-18-2016
11:08 AM
1 Kudo
I feel what you described has its own inherent risk. Since CDH5.8.2, you can use a new HDFS feature, the intra-DataNode disk balancer, to do exactly what you asked for. We also have a blog post about this feature: http://blog.cloudera.com/blog/2016/10/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache-hadoop/
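Roughly, usage looks like the sketch below (the hostname is a placeholder, and the plan file path is whatever the -plan step prints; see the blog post for the exact workflow on your release). The feature must be enabled with dfs.disk.balancer.enabled=true in hdfs-site.xml.

# Generate a plan describing how to move blocks between this DataNode's disks:
hdfs diskbalancer -plan dn-host.example.com

# Execute the plan file produced by the previous step:
hdfs diskbalancer -execute <plan file printed by -plan>

# Check progress:
hdfs diskbalancer -query dn-host.example.com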
10-16-2016
10:13 AM
1 Kudo
Hi, I don't think that's possible, given that most applications depend on HDFS semantics (strong consistency, POSIX-like file system behavior), and S3 simply isn't designed as a file system (it's an eventually consistent blob store). Plus, you lose data locality. As far as I know, most cloud use cases still use HDFS as temporary, intermediate storage and S3 as permanent, eventual storage. There have been several studies on using HDFS as the metadata store with cloud storage as the data store, but that's a huge piece of work (see HDFS-9806) and probably lands in the Hadoop 4 / CDH 7 timeframe. Hope this helps.