
A lot of blocks missing in HDFS

Rising Star

Yesterday I added three more DataNodes to my HDFS cluster running HDP 2.6.4.

A few hours later, because of a Spark write error (No lease on...), I increased dfs.datanode.max.xcievers to 65536 and increased the NameNode and DataNode heap size from 5 GB to 12 GB, then restarted HDFS.

However, the HDFS restart stalled at the NameNode stage: the NameNode stayed in safe mode for more than 10 minutes. I forced it to leave safe mode manually, and then HDFS reported that a lot of blocks were missing (more than 90%).
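
For reference, I checked the safe mode status and forced the exit with the standard dfsadmin commands, roughly like this:

    hdfs dfsadmin -safemode get     # reports whether the NameNode is still in safe mode
    hdfs dfsadmin -safemode leave   # forces the NameNode out of safe mode (the manual exit mentioned above)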

I checked the DataNode and NameNode logs, and there are two kinds of errors:

1. In the NameNode: Requested data length ** is longer than maximum configured RPC length **

2. In the DataNode: End of File Exception between local host is "***", destination host is "**":8020

So how can I recover my missing files, and what is the actual cause of this problem?

5 REPLIES

Contributor

For your first question, look at the parameter ipc.maximum.data.length; that should help. On the other hand, a value of 65536 for dfs.datanode.max.xcievers seems way too high.
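
For example, something along these lines in core-site.xml on the NameNode (134217728, i.e. 128 MB, is just an illustrative value; size it to your block reports), followed by a NameNode restart:

    <property>
      <name>ipc.maximum.data.length</name>
      <!-- default is 67108864 (64 MB); raise it so large DataNode block reports fit in a single RPC -->
      <value>134217728</value>
    </property>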

Basically, I suspect your DataNode block reports are not reaching the NameNode because of this length limit, so the NameNode never sees enough blocks to leave safe mode. That also explains why it reports missing blocks when you force the safe mode exit.
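
Once the block reports get through, you can verify from the command line, for example:

    hdfs dfsadmin -report                 # shows how many DataNodes have reported in and how many blocks they hold
    hdfs fsck / -list-corruptfileblocks   # lists files that still have missing or corrupt blocks, if any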

For NameNode heap configuration, see https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_command-line-installation/content/config...

Rising Star

Thanks @KB

I have reset dfs.datanode.max.xcievers to 32768; is that still too high?

I increased it to avoid the "No lease on file (inode 5425306)" error. So what is the proper value for this property?

If I set it to a proper value, will the missing blocks be recovered automatically?
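
For reference, this is roughly what I have in hdfs-site.xml at the moment:

    <property>
      <!-- older property name; newer Hadoop releases call this dfs.datanode.max.transfer.threads (default 4096) -->
      <name>dfs.datanode.max.xcievers</name>
      <value>32768</value>
    </property>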


Rising Star

I increased the IPC maximum data length according to this thread: https://community.hortonworks.com/questions/101841/issue-requested-data-length-146629817-is-longer-t...

The HDFS service seems to be back to normal.
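
In case it helps anyone else, a quick way to confirm is something like:

    hdfs fsck /                    # the summary at the end shows missing, corrupt and under-replicated block counts
    hdfs dfsadmin -safemode get    # confirm the NameNode has left safe mode on its own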

Rising Star

Thanks @KB

And another question:

When my Spark application writes a massive amount of data to HDFS, it always throws an error message like the following:

No lease on /user/xx/sample_2016/_temporary/0/_temporary/attempt_201604141035_0058_m_019029_0/part-r-19029-1b93e1fa-9284-4f2c-821a-c83795ad27c1.gz.parquet: File does not exist. Holder DFSClient_NONMAPREDUCE_1239207978_115 does not have any open files.

How can I solve this problem? I searched online and others said it is related to dfs.datanode.max.xcievers.